Odyssey 2016

The Speaker and Language Recognition Workshop

The need for fast, efficient, accurate, and robust means of recognizing people and languages is of growing importance for commercial, forensic, and government applications. Odyssey is a Research Workshop organized every two years by the ISCA Speaker and Language Characterization Special Interest Group (SpLC-SIG).

Odyssey 2016: The Speaker and Language Recognition Workshop was hosted by the University of the Basque Country (UPV/EHU) at its Bizkaia Aretoa venue in Bilbao, Spain, from June 21 to June 24, 2016. The local organizers were GTTS from the University of the Basque Country (UPV/EHU) and VivoLab from the University of Zaragoza. Odyssey 2016 aimed to continue fostering interactions among researchers in speaker and language recognition, as the successor to previous successful events held in Martigny (1994), Avignon (1998), Crete (2001), Toledo (2004), San Juan (2006), Stellenbosch (2008), Brno (2010), Singapore (2012) and Joensuu (2014).

Website: http://www.odyssey2016.org

Keynotes

1:01:25

Voice conversion and spoofing countermeasures for speaker verification

Haizhou Li

1:02:24

Understanding individual-level speech variability: From novel speech production data to robust speaker recognition

Shri Narayanan

0:59:27

I-Vector Representation Based on GMM and DNN for Audio Classification

Najim Dehak

Text Dependent Speaker Verification

0:23:41

A Low-Power Text-Dependent Speaker Verification System with Narrow-Band Feature Pre-Selection and Weighted Dynamic Time Warping

Qing He, Gregory Wornell and Wei Ma

0:29:05

Deep Neural Network based Text-Dependent Speaker Verification: Preliminary Results

Gautam Bhattacharya and Patrick Kenny

0:25:05

Uncertainty Modeling Without Subspace Methods For Text-Dependent Speaker Recognition

Patrick Kenny, Themos Stafylakis, Jahangir Alam, Vishwa Gupta and Marcel Kockmann

0:19:14

Deep Neural Networks and Hidden Markov Models in i-vector-based Text-Dependent Speaker Verification

Hossein Zeinali, Lukas Burget, Hossein Sameti, Ondrej Glembek and Oldrich Plchot

Speaker Recognition: i-vector approaches

0:25:27

Fast Scoring for PLDA with Uncertainty Propagation

Weiwei Lin, Man-Wai Mak

0:24:13

I-vector transformation and scaling for PLDA based speaker recognition

Sandro Cumani, Pietro Laface

0:23:30

Rapid Computation of I-vector

Longting Xu, Kong Aik Lee, Haizhou Li, Zhen Yang

0:26:01

Constrained discriminative speaker verification specific to normalized i-vectors

Pierre-Michel Bousquet, Jean-Francois Bonastre

0:16:07

Iterative Bayesian and MMSE-based noise compensation techniques for speaker recognition in the i-vector space

Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-Francois Bonastre

Poster Session 1: Language Recognition

0:01:38

Between-Class Covariance Correction For Linear Discriminant Analysis in Language Recognition

Abhinav Misra, Qian Zhang, Finnian Kelly and John H.L. Hansen

0:01:29

Incorporating uncertainty as a Quality Measure in I-Vector Based Language Recognition

Amir Hossein Poorjam, Rahim Saeidi, Tomi Kinnunen, Ville Hautamäki

0:02:33

Discriminating Languages in a Probabilistic Latent Subspace

Aleksandr Sizov, Kong Aik Lee, Tomi Kinnunen

0:02:24

Automatic Accent Recognition Systems and the Effects of Data on Performance

Georgina Brown

0:02:33

The ‘Sprekend Nederland’ project and its application to accent location

David van Leeuwen, Rosemary Orr

0:02:34

Deep Language: a comprehensive deep learning approach to end-to-end language recognition

Trung Ngo Trong, Ville Hautamäki, Kong Aik Lee

0:02:41

On the use of phone-gram units in recurrent neural networks for language identification

Christian Salamea, Luis Fernando D'Haro, Ricardo Cordoba, Rubén San-Segundo

0:01:54

Language Recognition for Dialects and Closely Related Languages

Gregory Gelly, Jean-Luc Gauvain, Lori Lamel, Antoine Laurent, Viet Bac Le, Abdel Messaoudi

0:01:32

Identification of British English regional accents using fusion of i-vector and multi-accent phonotactic systems

Maryam Najafian, Saeid Safavi, Phil Weber, Martin Russell

0:02:05

Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification

Yan Song, Ruilian Cui, Ian McLoughlin, Lirong Dai

Speaker Recognition in Multimedia Content

0:18:17

Deep complementary features for speaker identification in TV broadcast data

Mateusz Budnik, Ali Khodabakhsh, Laurent Besacier, Cenk Demiroglu

0:25:53

First investigations on self trained speaker diarization

Gaël Le Lan, Sylvain Meignier, Delphine Charlet, Anthony Larcher

0:24:45

Soft VAD in Factor Analysis Based Speaker Segmentation of Broadcast News

Brecht Desplanques, Kris Demuynck, Jean-Pierre Martens

Speaker & Language Recognition Systems

0:25:38

BAT System Description for NIST LRE 2015

Oldrich Plchot, Pavel Matejka, Ondrej Glembek, Radek Fer, Ondrej Novotny, Jan Pesan, Lukas Burget, Niko Brummer, Sandro Cumani

0:24:57

The IBM 2016 Speaker Recognition System

Seyed Omid Sadjadi, Sriram Ganapathy, Jason Pelecanos

0:21:31

The Sheffield language recognition system in NIST LRE 2015

Raymond W. M. Ng, Mauro Nicolao, Oscar Saz, Madina Hasan, Bhusan Chettri, Mortaza Doulaty, Tan Lee, Thomas Hain

0:17:35

Analyzing the Effect of Channel Mismatch on the SRI Language Recognition Evaluation 2015 System

Mitchell McLaren, Diego Castán, Luciana Ferrer

0:22:09

The MITLL NIST LRE 2015 Language Recognition System

Pedro Torres-Carrasquillo, Najim Dehak, Elizabeth Godoy, Douglas Reynolds, Fred Richardson, Stephen Shum, Elliot Singer, Douglas Sturim

Speaker & Language Recognition: Deep learning approaches

0:23:28

Augmented Data Training of Joint Acoustic/Phonotactic DNN i-vectors for NIST LRE15

Alan McCree, Greg Sell, Daniel Garcia-Romero

0:22:59

LID-senone Extraction via Deep Neural Networks for End-to-End Language Identification

Ma Jin, Yan Song, Ian McLoughlin, Lirong Dai, Zhongfu Ye

0:25:50

On autoencoders in the i-vector space for speaker recognition

Timur Pekhovsky, Sergey Novoselov, Aleksei Sholohov, Oleg Kudashev

0:19:49

Channel Compensation for Speaker Recognition using MAP Adapted PLDA and Denoising DNNs

Fred Richardson, Brian Nemsick, Douglas Reynolds

0:25:38

Evaluation of an LSTM-RNN System in Different NIST Language Recognition Frameworks

Ruben Zazo, Alicia Lozano-Diez, Joaquin Gonzalez-Rodriguez

Poster Session 2: Speaker Recognition I

0:02:31

Feature-based likelihood ratios for speaker recognition from linguistically-constrained formant-based i-vectors

Javier Franco-Pedroso, Joaquin Gonzalez-Rodriguez

0:01:23

Improving Robustness of Speaker Verification Against Mimicked Speech

Kuruvachan K George, Santhosh Kumar C, Ramachandran K I, Ashish Panda

0:03:03

Multi-channel i-vector combination for robust speaker verification in multi-room domestic environments

Alessio Brutti, Alberto Abad

0:02:00

Voice Liveness Detection for Speaker Verification Based on a Tandem Single/Double-Channel Pop Noise Detector

Sayaka Shiota, Fernando Villavicencio, Junichi Yamagishi, Nobutaka Ono, Isao Echizen, Tomoko Matsui

0:02:33

A PLDA Approach for Language and Text Independent Speaker Recognition

Abbas Khosravani, Mohammad Mehdi Homayounpour, Dijana Petrovska-Delacrétaz, Gérard Chollet

0:02:01

Spoofing Detection on the ASVspoof2015 Challenge Corpus Employing Deep Neural Networks

Md Jahangir Alam, Patrick Kenny, Vishwa Gupta, Themos Stafylakis

0:02:30

Age-Related Voice Disguise and its Impact on Speaker Verification Accuracy

Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen, Ville Hautamäki

0:03:05

A New Feature for Automatic Speaker Verification Anti-Spoofing: Constant Q Cepstral Coefficients

Massimiliano Todisco, Héctor Delgado, Nicholas Evans

0:02:19

Multi-Bit Allocation: Preparing Voice Biometrics for Template Protection

Marco Paulini, Christian Rathgeb, Andreas Nautsch, Hermine Reichau, Herbert Reininger, Christoph Busch

0:01:40

Analysis and Optimization of Bottleneck Features for Speaker Recognition

Alicia Lozano-Diez, Anna Silnova, Pavel Matejka, Ondrej Glembek, Oldrich Plchot, Jan Pesan, Lukas Burget, Joaquin Gonzalez-Rodriguez

Industry & Forensics Track (Short Talks + Panel Session)

1:28:43

Forensic and investigative speaker recognition

Daniel Ramos, Jonas Lindh, Michael Jessen, Anil Alexander, Geoffrey Stewart Morrison

0:34:10

Commercial applications of speaker and language recognition

Sergey Novoselov, Carlos Vaquero, Antonio Moreno

NIST 2015 Language Recognition i-Vector Machine Learning Challenge

0:12:50

Summary of the 2015 NIST Language Recognition i-Vector Machine Learning Challenge

Audrey Tong, Craig Greenberg, Alvin Martin, Desire Banse, John Howard, Hui Zhao, George Doddington, Daniel Garcia-Romero, Alan McCree, Douglas Reynolds, Elliot Singer, Jaime Hernandez-Cordero, Lisa Mason

0:16:43

Out-of-Set i-Vector Selection for Open-set Language Identification

Hamid Behravan, Tomi Kinnunen, Ville Hautamäki

0:26:28

I2R Submission to the 2015 NIST Language Recognition I-vector Challenge

Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Kong Aik Lee, Bin Ma, Haizhou Li

0:17:30

A Semisupervised Approach for Language Identification based on Ladder Networks

Ehud Ben-Reuven, Jacob Goldberger

Poster Session 3: Speaker Recognition II

0:02:53

Cantonese forensic voice comparison with higher-level features: likelihood ratio-based validation using F-pattern and tonal F0 trajectories over a disyllabic hexaphone

Phil Rose, Xiao Wang

0:01:46

I-Vectors for speech activity detection

Elie Khoury, Matt Garland

0:02:37

Compensation for phonetic nuisance variability in speaker recognition using DNNs

Themos Stafylakis, Patrick Kenny, Vishwa Gupta, Jahangir Alam, Marcel Kockmann

0:01:31

Local binary patterns as features for speaker recognition

Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-Francois Bonastre

0:02:25

Robustness of Quality-based Score Calibration of Speaker Recognition Systems with respect to low-SNR and short-duration conditions

Andreas Nautsch, Rahim Saeidi, Christian Rathgeb, Christoph Busch

0:01:06

From Features to Speaker Vectors by means of Restricted Boltzmann Machine Adaptation

Pooyan Safari, Omid Ghahabi, Javier Hernando

0:01:54

Reducing Noise Bias in the i-Vector Space for Speaker Recognition

Yosef Solewicz, Hagai Aronowitz, Timo Becker

Speaker Clustering and Diarization

0:23:19

Semi-supervised On-line Speaker Diarization for Meeting Data with Incremental Maximum A-posteriori Adaptation

Giovanni Soldi, Massimiliano Todisco, Héctor Delgado, Christophe Beaugeant, Nicholas Evans

0:22:58

Influence of transition cost in the segmentation stage of speaker diarization

Beatriz Martínez-González, José M. Pardo, Rubén San-Segundo, J.M. Montero

0:22:23

Analysis of the Impact of the Audio Database Characteristics in the Accuracy of a Speaker Clustering System

Jesús Jorrín Prieto, Carlos Vaquero, Paola García

0:21:54

Short- and Long-Term Speech Features for Hybrid HMM-i-Vector based Speaker Diarization System

Abraham Woubie Zewoudie, Jordi Luque, Javier Hernando

0:24:59

On the Use of PLDA i-vector Scoring for Clustering Short Segments

Itay Salmun, Irit Opher, Itshak Lapidot

Opening & Closing

0:31:46

Opening ceremony

0:09:21

Closing ceremony