Odyssey 2012

The Speaker and Language Recognition Workshop

New Resources for Recognition of Confusable Linguistic Varieties: The LRE11 Corpus

Presented by:
Stephanie Strassel
Stephanie Strassel, Kevin Walker, Karen Jones, Dave Graff and Christopher Cieri

The NIST 2011 Language Recognition Evaluation (LRE11) focused on language pair discrimination for 24 languages and dialects, some of which may be considered mutually intelligible or closely related. The evaluation required new data for all languages, comprising both conversational telephone speech and broadcast narrowband speech from multiple sources in each language. Given the potential for confusion among the varieties in the collection, manual language auditing required special care, including the assessment of inter-auditor consistency. We report on collection methods, auditing approaches, and results.