
Development and structure of the VariaNTS corpus: A spoken Dutch corpus containing talker and linguistic variability

Bibliographic Details
Published in:	Speech communication 2021-03, Vol.127, p.64-72
Main Authors:	Arts, Floor; Başkent, Deniz; Tamati, Terrin N.
Format:	Article
Language:	English
Subjects:	Afrikaans language; Corpus linguistics; Dutch language; Linguistic variability; Linguistics; Orthography; Perception; Phonotactics; Speech corpus; Speech perception; Speech recognition; Spoken Dutch; Talker variability; Vocabulary development; Word frequency; Word recognition; Words (language)

Description
Summary:
• The VariaNTS (Variatie in Nederlandse Taal en Sprekers) corpus is a new corpus of spoken Dutch.
• The VariaNTS corpus was designed to maximize both linguistic and talker variability.
• It contains audio recordings of 1000 items from 11 linguistic subcategories, produced by 8 male and 8 female native speakers of standard Dutch.
• Materials were developed to be used for broad assessment of speech perception in Dutch clinical and academic settings.

Speech perception and spoken word recognition are not only affected by what is being said, but also by who is speaking. Currently, publicly available corpora of spoken Dutch do not offer a wide variety of linguistic materials produced by multiple talkers. The VariaNTS (Variatie in Nederlandse Taal en Sprekers) corpus is a Dutch spoken corpus that was developed to maximize both linguistic and talker variability. It contains 1000 items from 11 linguistic subcategories, recorded by 8 male and 8 female native speakers of standard Dutch. The corpus contains audio recordings, orthographic transcriptions, item-specific details such as word frequencies, neighborhood densities and phonotactic probabilities, and talker details. The VariaNTS corpus aims to provide new materials to be used for broad assessment of speech perception and word recognition in Dutch clinical and academic settings.
ISSN:	0167-6393, 1872-7182
DOI:	10.1016/j.specom.2020.12.006