Hi everyone. I come from Northwestern Polytechnical University. This is the presentation of my paper for the Odyssey 2020 workshop. The title of the paper is "Partial AUC Metric Learning Based Speaker Verification Back-end". In other words, this paper proposes a shallow metric learning back-end algorithm for speaker verification.

Okay, I will present it from four aspects, including the metric learning background and the motivation, the proposed objective function, and some experimental results. At last, I will give some conclusions, and I will also introduce several follow-up works of this paper and our future plans.

First, the metric learning and the motivation. As illustrated in the title, I think if I can answer these two questions, the motivation of this paper will be clear. The first question is: what is metric learning, and why have we proposed a metric learning based back-end algorithm? Metric learning aims to learn a distance function to measure the similarity of samples, such as the Mahalanobis distance. For speaker verification, this works as displayed in the right figure of this slide.

We first extract the speaker identity features from utterances by a front-end speaker feature extractor, such as the i-vector or the x-vector extractor, and then we feed them to the metric learning based back-end to calculate their similarity scores. For the learning of the metric, we employ a loss function based on the optimisation of the AUC, as displayed in the left figure of this slide.

For this metric learning, I think the first advantage is that the training of the distance function is consistent with the evaluation procedure; therefore, the back-end can directly optimise the evaluation metrics of speaker verification, such as the equal error rate and the AUC. Second, it can be easily combined with existing front-ends, both the i-vector and the x-vector. Third, this shallow metric learning method can be easily extended to an end-to-end framework.

The second question I need to answer is: what is the partial AUC (pAUC), and why does the metric learning back-end aim at optimising the partial AUC? As shown in the left figure of this slide, the partial AUC is defined as a small part of the area under the ROC curve, like this grey area. Although metric learning can directly optimise the evaluation metrics, its implementation faces some difficulties.

As we all know, in metric learning we need to construct pairwise or triplet training trials with speaker-level labels to train the distance function. In this situation, the number of all possible training trials is very large. Besides, many easily distinguishable trials are unnecessary for the training of the distance function.

In terms of these difficulties, I think the optimisation of the pAUC has the following two advantages. First, it is easy to select the difficult samples by setting the FPR range [α, β] to a relatively small interval; in this way, we can also reduce the number of effective training trials. Second, we can optimise the interested partial AUC according to some specific applications. And obviously, the AUC is a special case of the partial AUC.

Next, in the second part, I will explain the details of the proposed algorithm. In this slide, I will introduce how to calculate the partial AUC.

As I have said, metric learning needs to construct pairwise trials. Here we first see how to construct them. In the definition, T is the set of the constructed trials, where x_n and y_n are the speaker features of two speech segments, and l_n is the ground-truth label: if they come from the same speaker, l_n equals one; otherwise, l_n equals zero. Besides, the function s is used to calculate the similarity of two speaker features; here we use the Mahalanobis distance function. The predicted label can be obtained by comparing the distance score with ρ, where ρ is the decision threshold.
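The scoring and thresholding step just described can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 4-dimensional features, the identity metric matrix M, and the threshold value 1.5 are all made-up assumptions.

```python
import numpy as np

def mahalanobis_score(x, y, M):
    """Mahalanobis distance between two speaker features x and y.

    M is the learnable matrix of the back-end; a smaller distance means
    the two segments are more likely to come from the same speaker.
    """
    d = x - y
    return float(d @ M @ d)

# Toy trial: two made-up 4-dimensional speaker features.
M = np.eye(4)                        # identity M = plain squared Euclidean distance
x = np.array([1.0, 0.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0, 0.0])
s_n = mahalanobis_score(x, y, M)     # 2.0 for this pair
l_hat = 1 if s_n < 1.5 else 0        # predicted label: same speaker iff s_n < rho
```

In the learned back-end, M is trained rather than fixed; the threshold ρ is what the ROC analysis below sweeps over.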

Given a fixed value of ρ, we are able to compute the true positive rate (TPR) and the false positive rate (FPR). By varying ρ, we can get a series of TPR and FPR values, which form an ROC curve, as shown in the figure.
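The threshold sweep just described can be sketched roughly like this; the distance values and labels here are toy assumptions, and real evaluations use many more trials.

```python
import numpy as np

def roc_points(dists, labels):
    """Sweep the threshold rho over all distance scores and collect (FPR, TPR).

    dists: Mahalanobis distances (smaller = more likely the same speaker).
    labels: 1 for target trials, 0 for non-target trials.
    """
    dists = np.asarray(dists, dtype=float)
    labels = np.asarray(labels)
    n_pos = (labels == 1).sum()
    n_neg = (labels == 0).sum()
    points = []
    for rho in np.sort(dists):        # each observed distance acts as a threshold
        accept = dists <= rho         # accept = decide "same speaker"
        tpr = (accept & (labels == 1)).sum() / n_pos
        fpr = (accept & (labels == 0)).sum() / n_neg
        points.append((fpr, tpr))
    return points

# Toy trial list: two target and two non-target trials.
pts = roc_points([0.1, 0.3, 0.7, 0.9], [1, 0, 1, 0])
```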

Optimising the entire ROC curve is not only costly but also unnecessary, because most practical systems work on only part of their ROC curves. For example, a banking security system usually requires a small false positive rate, while a terrorist detection system always hopes to work in a high recall region. So optimising the partial AUC at the interested working points is a better choice.

In this slide, T denotes the constructed pairwise training set, and T⁺ and T⁻ are the positive and negative subsets of T. Then we need to compute a new subset T⁻_pAUC from T⁻, constrained such that the value of the FPR is between α and β. In order to compute T⁻_pAUC, we first need to calculate n_α and n_β by this formula. Then the distance scores of the negative subset are sorted in ascending order, and T⁻_pAUC is selected as the subset of the samples whose positions in the sorted scores are between n_α and n_β.
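Under the assumption that the back-end scores are distances (hence the ascending sort just described), the subset selection and the normalised pAUC can be sketched as below; the trial scores and the [α, β] values are invented for illustration only.

```python
import numpy as np

def partial_auc(pos_dists, neg_dists, alpha, beta):
    """Empirical normalised pAUC over the FPR range [alpha, beta].

    Sort the non-target distances in ascending order, keep only the ranks
    between n_alpha and n_beta (these negatives fall in the FPR band),
    and count the fraction of (target, kept non-target) pairs ranked
    correctly, i.e. target distance < non-target distance.
    """
    pos = np.asarray(pos_dists, dtype=float)
    neg = np.sort(np.asarray(neg_dists, dtype=float))   # ascending order
    n_alpha = int(np.floor(alpha * len(neg)))
    n_beta = int(np.ceil(beta * len(neg)))
    neg_sub = neg[n_alpha:n_beta]                       # plays the role of T^-_pAUC
    return float((pos[:, None] < neg_sub[None, :]).mean())

# With alpha = 0 and beta = 1 this recovers the full AUC,
# the special case mentioned earlier in the talk.
auc = partial_auc([0.2, 0.6], [0.5, 1.0], alpha=0.0, beta=1.0)
```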

After obtaining T⁻_pAUC, the pAUC can be calculated and normalised by the bottom formula of this slide, where N⁺ and N⁻_pAUC are the sizes of T⁺ and T⁻_pAUC respectively. In this formula, I(·) is an indicator function, so directly optimising it is NP-hard; therefore, we need to relax it.

As shown in this slide, we relax the calculation by replacing the indicator function with a hinge loss function, where δ is an adjustable hyper-parameter that is larger than zero. The last formula gives the relaxed loss function.

To prevent overfitting to the training data, we also introduce a regularisation term, weighted by λ, into the minimisation problem.
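As a rough sketch only, the hinge relaxation plus a regulariser might look like the following. The Frobenius-norm regulariser and the values δ = 1.25 and λ = 0.01 are my assumptions for illustration, not necessarily the exact terms used in the paper.

```python
import numpy as np

def relaxed_pauc_loss(M, pos_pairs, neg_pairs, delta=1.25, lam=0.01):
    """Hinge-relaxed pAUC loss with a regularisation term (sketch).

    The indicator over pair rankings is replaced by the hinge
    max(0, delta - (d_neg - d_pos)) on Mahalanobis distances: non-target
    distances should exceed target distances by a margin delta > 0.
    lam weights a Frobenius-norm regulariser on M (an assumption here).
    pos_pairs / neg_pairs have shape (n_trials, 2, feature_dim).
    """
    def dist(pairs):
        d = pairs[:, 0, :] - pairs[:, 1, :]
        return np.einsum('ni,ij,nj->n', d, M, d)

    d_pos = dist(pos_pairs)        # target-trial distances (want small)
    d_neg = dist(neg_pairs)        # selected non-target distances (want large)
    margins = delta - (d_neg[None, :] - d_pos[:, None])
    hinge = np.maximum(0.0, margins).mean()
    return hinge + lam * np.linalg.norm(M, 'fro') ** 2

# Toy check with one easy target and one easy non-target trial.
M = np.eye(2)
pos_pairs = np.array([[[0.0, 0.0], [0.0, 0.0]]])   # identical features
neg_pairs = np.array([[[0.0, 0.0], [2.0, 0.0]]])   # well-separated features
loss = relaxed_pauc_loss(M, pos_pairs, neg_pairs)  # hinge term is zero here
```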

Finally, the green part enlarges the between-class distance, and the red part tries to minimise the within-class variance. In other words, our objective function aims at enlarging the discrimination between the positive and negative trials while minimising the within-class variances of the two classes of trials simultaneously.

In the third part, I will give some experimental results. This slide displays our experimental settings; more details can be found in the paper.

This table lists the comparison results on the Cantonese test set. It is seen that the proposed pAUC metric learning gets better performance than PLDA with both the i-vector and the x-vector front-ends. Specifically, the pAUC metric learning obtains nine percent and twenty percent relative improvements over PLDA in terms of the pAUC and the AUC, respectively. Moreover, it achieves more than eleven percent relative EER reduction and five percent relative minDCF reduction over PLDA.

Table 2 lists the results on the core tasks of the SITW dataset. It is seen that the proposed pAUC metric learning also gets better performance than PLDA. Specifically, when the x-vector front-end is used, the pAUC metric learning achieves more than eight percent relative pAUC improvement over PLDA, and it obtains more than twenty percent relative AUC improvements on the development and evaluation core tasks. Moreover, it achieves ten percent relative EER reduction and three percent relative DCF reduction. Although the performance improvement with the i-vector front-end is not as significant as that with the x-vector front-end, the trends with different front-ends are consistent.

This page displays some experimental results that we used to analyse the effect of the hyper-parameters. We adopted grid search to study the impact of the values of α and β on the performance with the x-vector front-end. From these two tables, we find that the good working region is quite large. This figure shows the relative performance improvements over PLDA with different values of δ in the objective function. From this figure, we find that the pAUC metric learning is robust to δ, which achieves the best performance around 1.25.

Finally, I will give some conclusions and introduce several follow-up works as well as our future plans. In this paper, a Mahalanobis distance based metric learning back-end is proposed to optimise the partial AUC for speaker verification. Because directly optimising the partial AUC is NP-hard, we relaxed it by a hinge loss function. Experimental results carried out on the NIST SRE16 and SITW datasets demonstrate the effectiveness of our proposed algorithm.

After this work, we also made a generalisation and a comprehensive analysis of the pAUC metric, and we have published the related work in this paper. Besides, we also extended the pAUC metric to an end-to-end framework; more information can be found in this paper.

In the future, we may research more general metric learning based speaker verification algorithms to optimise the evaluation metrics, in order to further improve the speaker verification performance. That is all for my presentation. Thank you for watching.