
you Haizhou Li

and Ambikairajah Kong Aik Lee and


presented by

good afternoon everyone

the paper i would like to present is entitled

Bhattacharyya based gmm

SVM system with adaptive

relevance factor for pair language recognition

and the outline

for this presentation is shown here

in this pair language recognition system we mainly focus on

studying three

techniques including Bhattacharyya based gmm svm

adaptive relevance factor as well as strategies for pair language recognition

given a specified language pair the task of

recognition of

language pair is to decide which of these

two languages is in fact spoken in a given segment

so we develop pair language recognition systems by studying bhattacharyya based gmm svm

by introducing the mean supervector and the covariance supervector and we merge these two kinds of

sub kernels together to get better performance

for this

hybrid system

in order to compensate for the duration effect

we also introduce adaptive relevance factors

for MAP in gmm svm systems

and for the purpose of pair language recognition we introduce two sets of strategies

for this a big

condition purpose

and we report our system design

for each progress

for LRE twenty eleven submission


in speaker and language recognition systems

there are normally two typical kernels for gmm svm they are

kullback leibler kernel and bhattacharyya kernel


the conventional kl kernel only includes mean information

for recognition modeling


a symmetrized version of the kl kernel

can extend

it to include the covariance term


so why do we choose

the Bhattacharyya based kernel for language pair recognition


so based on many experiments

for speaker and language recognition systems

we observed the bhattacharyya based kernel has better performance than the kl kernel


the bhattacharyya kernel

can actually be split into three terms the first term

is contributed by the mean and covariance


and the second term

involves the covariance term only the third term

involves the weight

parameter of the gmm only

so actually these three terms can be independently used to give

the recognition decision score

with different degree of information contribution
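
For reference, the first two terms have the same shape as the textbook Bhattacharyya distance between two single Gaussians, written below. This is only the standard single Gaussian form, not the exact gmm kernel from the slides, and the symbols (\mu_a, \Sigma_a for one model, \mu_b, \Sigma_b for the other) are chosen here for illustration. The weight term only appears once the distance is extended to mixtures.

```latex
D_B = \frac{1}{8} (\mu_a - \mu_b)^{\top} \Sigma^{-1} (\mu_a - \mu_b)
    + \frac{1}{2} \ln \frac{|\Sigma|}{\sqrt{|\Sigma_a|\,|\Sigma_b|}},
\qquad \Sigma = \frac{\Sigma_a + \Sigma_b}{2}
```

The first summand depends on both means and covariances, the second on covariances only, which matches the split described in the talk.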

so by using the first term of the Bhattacharyya kernel

and keeping the covariance

not updated


we can get the mean supervector



so this kind of kernel could be used independently as a sub kernel


and then the

second term only includes the covariance term

so we can get the

covariance supervectors from this term

we only use

the first two terms of the bhattacharyya kernel

for our pair language recognition

system design

so the NAP for both

the mean supervector and the covariance supervector of Bhattacharyya

is trained by using different

databases with

a certain amount of overlap

the purpose of this is


to increase the compensation factors

so for the UBM database and the

relevance factor training database

we can

use data common to both

the mean and covariance supervectors

so in order to compensate duration variability we introduce adaptive relevance factor


and this adaptive relevance factor of MAP

in gmm svm

here we show the MAP position

in gmm svm system

so this equation is the mean update

of MAP

so here x_i is the first order sufficient

statistics

so you can see the relevance factor gamma_i can indirectly affect the degree of update

for the mean vectors of the gmm
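
The MAP mean update described here can be sketched as follows. This is a minimal sketch of the classical relevance MAP form (adaptation coefficient N_i / (N_i + gamma_i)), assumed here because the slide equation is not in the transcript. The function name `map_adapt_means` and its argument names are hypothetical.

```python
import numpy as np

def map_adapt_means(frames, ubm_means, posteriors, relevance_factor):
    """Classical relevance MAP adaptation of GMM means (assumed form).

    frames:           (T, D) feature vectors of one utterance
    ubm_means:        (C, D) UBM component means
    posteriors:       (T, C) component occupation probabilities per frame
    relevance_factor: scalar or (C,) array, gamma_i in the talk
    """
    # Occupation counts N_i, one per Gaussian component.
    counts = posteriors.sum(axis=0)
    # First order sufficient statistics, one D-dimensional vector per component.
    first_order = posteriors.T @ frames
    # Data means; the floor avoids division by zero for unused components.
    data_means = first_order / np.maximum(counts, 1e-10)[:, None]
    # Adaptation coefficients: a larger gamma_i keeps the mean closer to the UBM.
    alpha = counts / (counts + relevance_factor)
    return alpha[:, None] * data_means + (1.0 - alpha)[:, None] * ubm_means
```

With two frames `[2.0]` and `[4.0]`, a single component with UBM mean `0.0` and a relevance factor of `2`, the count is 2, the coefficient is 0.5 and the adapted mean lands halfway between the data mean 3.0 and the UBM mean.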


so we assume that

once

we make this relevance factor a function of duration it is possible to do

some compensation work

in this

mean update

so far there are two types of relevance factors

one is in the classical MAP

where usually we use a fixed value of the relevance factor

the relevance factor can also be data dependent by this equation

this equation is derived

from the factor analysis research

here the phi is a diagonal matrix that can be trained by using development database

so assume this relevance factor is a function of k which is related to the number of

features and so is connected to duration

so we can see the occupation

count N_i

if we take the expectation

of this occupation count we can see that

the expectation of the occupation count is directly

proportional to the duration

so if we choose this function as the duration function

for the relevance factor then the expectation of the adaptation coefficient

of MAP mean adaptation tends to a constant vector so we can get this

adaptive relevance factor by this equation

so this equation will result in

the gmm being independent of duration
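
The duration argument can be checked numerically with a small sketch. It assumes the expected occupation count is w_i times the number of frames T, and it uses a relevance factor of k times T as a stand-in, since the exact function on the slide is not in the transcript. The names `expected_alpha`, `weights` and `k` are hypothetical.

```python
import numpy as np

def expected_alpha(weights, num_frames, relevance_factor):
    """Expected MAP adaptation coefficient, assuming E[N_i] = w_i * T."""
    expected_counts = weights * num_frames
    return expected_counts / (expected_counts + relevance_factor)

weights = np.array([0.5, 0.3, 0.2])  # hypothetical component weights
k = 0.1                              # hypothetical proportionality constant

for num_frames in (300, 1000, 3000):
    fixed = expected_alpha(weights, num_frames, 8.0)                # fixed relevance factor
    adaptive = expected_alpha(weights, num_frames, k * num_frames)  # duration dependent
    print(num_frames, fixed.round(3), adaptive.round(3))
```

Under these assumptions the adaptive coefficient reduces to w_i / (w_i + k), so it stays the same for every duration, while the fixed setting drifts towards one as the utterance gets longer.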

now we go to the third point of our presentation

we propose two strategies for pair language recognition the first one is the one

to all strategy

also called core to pair modeling

this modeling means we train gmm svm models for a certain

target language against all other target languages

so we can have the score vectors here

with this score vector and by using our development database for all the target

languages we can have

the gaussian backend modeling

for these N languages



and language pair scores can be obtained

through the log likelihood ratios shown here
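
One way to read the step from score vectors to language pair scores is sketched below. It assumes a Gaussian backend with one mean per language and a single shared covariance, which the talk does not specify. The function names and the small ridge term are hypothetical choices for the example.

```python
import numpy as np

def fit_gaussian_backend(score_vectors, labels, num_languages):
    """Fit per-language means and one shared covariance on development scores."""
    means = np.stack([score_vectors[labels == lang].mean(axis=0)
                      for lang in range(num_languages)])
    centered = score_vectors - means[labels]
    cov = centered.T @ centered / len(score_vectors)
    cov += 1e-6 * np.eye(cov.shape[0])  # small ridge for stability (assumption)
    return means, cov

def pair_llr(score_vector, means, cov, lang_a, lang_b):
    """log p(s | lang_a) - log p(s | lang_b) under the shared covariance model."""
    precision = np.linalg.inv(cov)

    def log_lik(lang):
        diff = score_vector - means[lang]
        # The Gaussian normalizer is identical for all languages and cancels.
        return -0.5 * diff @ precision @ diff

    return log_lik(lang_a) - log_lik(lang_b)
```

A positive value favours the first language of the pair, a negative value the second, and swapping the two languages flips the sign.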

so the second

strategy is a pairwise strategy also called pair modeling

this modeling is very simple just use

two languages' database from the language pair

directly train the model of gmm svm and we get

this modeling

and we get

the scores

for the fusion of the two strategies

we only apply equal weights

for this

that means we assume

that the importance of the two strategies

is the same

so we get the final score by fusing the two strategies
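
Equal weight fusion amounts to a plain average of the two strategy scores. The sketch below assumes both scores are already on a comparable scale, and the function and argument names are hypothetical.

```python
import numpy as np

def fuse_equal_weights(core_to_pair_scores, pair_scores):
    """Average the scores of the two strategies with equal weights of 0.5."""
    core_to_pair_scores = np.asarray(core_to_pair_scores, dtype=float)
    pair_scores = np.asarray(pair_scores, dtype=float)
    return 0.5 * core_to_pair_scores + 0.5 * pair_scores
```

For example, scores of 1.0 from one strategy and 3.0 from the other fuse to 2.0.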

here we show a hybrid

pair language recognition system

given the test utterance we can have

the Bhattacharyya mean supervector and covariance supervector

together input to

the two strategies


and we get the merging of the two supervectors in each of the strategies


finally we fuse these two strategies together and we get the final score

we do the evaluation for our

pair language recognition design

by using

the NIST LRE 2011 platform

here there are twenty-four target languages so in total

there are

two hundred and seventy six language pairs
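
The pair count follows from N times N minus one divided by two with N equal to twenty-four. A short check, using integer indices as stand-ins for the language names:

```python
from itertools import combinations

num_languages = 24
# Every unordered pair of distinct target languages is one language pair.
language_pairs = list(combinations(range(num_languages), 2))

assert len(language_pairs) == num_languages * (num_languages - 1) // 2
print(len(language_pairs))  # 276
```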

so we choose

five hundred and twelve Gaussian components for the gmm

and ubm

and we

do these experiments

and show the results based on the thirty second task in this paper

but we also cover the other durations in our experiments

so here we use eighty dimensional MFCC SDC

features

with energy based vad

and the performance is computed

as average cost

for the N worst language pairs

here we list

the training database

for both CTS and BNBS


for our language pair recognition training

now we show the experiment results

firstly we compare the fixed relevance factor and adaptive relevance factor


table one shows


the core to pair

strategy we show


the fixed relevance factor set to three different

values zero point two five and eight and thirty two and we give

the eer and the minimum cost

here and compare with arf that is

the adaptive relevance factor

comparing these data we can say

the adaptive relevance factor performs

better than any of the

fixed relevance factor settings

so similar observations


can be made for the pair strategy

here we get twelve point

seven five percent

in terms of eer


and the others are higher

with the fixed relevance factor settings

the second experiment we are doing

is on

the effect of merging

the two sets of supervectors

the mean supervector and covariance supervector

the blue color means the mean supervector

the green color represents


the Bhattacharyya covariance

supervector with eighty dimensional

MFCC sdc features

and arf is adaptive relevance factor

so we

do this experiment with the


core to pair strategy and we show


this merging effect

in the red color and we can see

the performance is obviously

better than the previous ones that is mean and covariance alone

this figure is based on

the top N

language pairs that is those with

the worst

EER performance

with N times N minus one divided by two

language pairs

so similar results


can be found in the

pair strategy

also the red curve for


most of the language pairs is lower it gives

a lower minimum detection cost


we will show the fusion effect


of the two strategies

the first one

the blue one is core to pair and the green one is for the pair

strategy after merging these two strategies we can get the final results

with eer of

ten point

something percent

and the minimum cost is zero point zero nine

now we come to the conclusions of my presentation we have developed a hybrid

Bhattacharyya based gmm-svm system for pair language recognition

for the purpose of the LRE twenty eleven submission

the performance improvement after merging the

mean supervector and covariance supervector is obvious

we compared with the fixed relevance factor

and we observed the adaptive relevance factor is effective

for pair language recognition



we can say the fusion of the core to pair and pair strategies

is useful

here we show some reference papers especially the first one from patrick kenny he

proposed this

data dependent relevance factor

thank you

oh okay

firstly we choose these

mean and covariance supervectors

this means we don't want to merge

the mean and covariance information in one kernel

we want to separate them because we find that if we separate them

we may get better performance after merging these two

supervectors together

we have compared them

that is


when we build the kernel with the first term and the second term merged together

to produce only one kernel and compare it with the separate kernels that is the mean kernel

and covariance kernel fused together afterwards

the latter performs better




that is

i think at least

because it is based on different training and testing environment

and database

so overall the effect is obvious
