Okay. So this morning I will first explain what we mean by classifier fusion. Classifier fusion is applicable whenever we have some ensemble of experts and we need to come to some final decision.
Furthermore, in this example we assume that those experts are able to give us their decisions in the form of some confidence. So perhaps the simplest, and also usually well-working, method to fuse those scores would be just to average those confidence values.
But sometimes we have some prior information about the experts, for example how well they performed in the past, and we would like to exploit this information to make a better fusion.
So the task of classifier fusion is to take the outputs of N base classifiers and produce one output score which ideally gives better performance than any single base classifier.
Here we assume so-called linear fusion, which is a very simple method, but one that is also used in state-of-the-art tools like the FoCal toolkit.
Linear fusion is just a weighted sum of the input scores, where the weights are trained on previous trials with known ground truth.
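As a rough sketch of the idea, with made-up scores and weights (in practice the weights and bias would come from training on labeled trials, e.g. by logistic regression as in the FoCal toolkit), linear fusion of one trial's scores might look like:

```python
import numpy as np

def linear_fusion(scores, weights, bias=0.0):
    """Weighted sum of base-classifier scores for one trial.

    `weights` and `bias` are assumed to have been trained beforehand
    on past trials with known ground truth.
    """
    return float(np.dot(scores, weights) + bias)

# Hypothetical scores from three base classifiers for a single trial:
fused = linear_fusion(np.array([1.2, -0.3, 0.8]),
                      np.array([0.5, 0.2, 0.9]), bias=-0.1)
```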
What we mean by subset fusion is that we first select only certain classifiers from the full set, and only those are then fed to the fusion training and fusion.
What could be the motivation for such a thing? First, the traditional approach with the full set: it is the most commonly used method, it is straightforward, and it is computationally efficient, since you don't have to do any subset selection. But when we have a large number of classifiers, we could possibly simply be over-training the fusion. In the subset case, on the other hand, we might possibly do better, although of course this approach relies on a good subset selection. So the question is: can subset fusion give better performance than the full set?
Now for the system overview. On the input we have speech, typically two utterances. Those are classified by several classifiers that we selected from the full set of classifiers, and the scores of the selected classifiers are then fused.
In more detail, how we do it is that we first train the S-cal mapping for each of the base classifiers' scores. The S-cal mapping maps the scores into well-calibrated log-likelihood ratios. In the first formula you see the S-cal mapping, and the second one is the cost function Cllr, which we minimize for the mapped scores.
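A toy sketch of such a calibration step, assuming a simple affine mapping llr = a*score + b trained by gradient descent on Cllr (this is my illustration, not the exact mapping from the talk; real toolkits such as FoCal use a proper convex optimizer):

```python
import numpy as np

def cllr(tar_llrs, non_llrs):
    """Cllr cost of LLRs in bits (the quantity minimized in calibration)."""
    c_tar = np.mean(np.logaddexp(0.0, -tar_llrs)) / np.log(2)
    c_non = np.mean(np.logaddexp(0.0, non_llrs)) / np.log(2)
    return 0.5 * (c_tar + c_non)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_affine_calibration(tar, non, steps=2000, lr=0.1):
    """Fit llr = a*score + b by gradient descent on Cllr (toy stand-in)."""
    a, b = 1.0, 0.0
    for _ in range(steps):
        lt, ln = a * tar + b, a * non + b
        gt, gn = -sigmoid(-lt), sigmoid(ln)  # d(cost)/d(llr), up to a constant
        ga = 0.5 * (np.mean(gt * tar) + np.mean(gn * non))
        gb = 0.5 * (np.mean(gt) + np.mean(gn))
        a, b = a - lr * ga, b - lr * gb
    return a, b

# Demo on deliberately over-confident synthetic scores:
rng = np.random.default_rng(0)
tar = 10 * rng.normal(2.0, 1.0, 500)   # target-trial scores
non = 10 * rng.normal(-2.0, 1.0, 500)  # non-target-trial scores
a, b = train_affine_calibration(tar, non)
```

After training, the scaled scores should have a lower Cllr than the raw ones, which is exactly what "well calibrated" means here.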
Then, for each of the subsets in the power set, that is, two to the power of N minus one subsets, we train a linear fusion with the CWLR objective function, the same as the one in the FoCal toolkit; that is the one you see in the first formula. The prior with which the CWLR function is weighted comes from the cost function. For the cost function we use the NIST function, where the cost of a miss is one, the cost of a false alarm is one, and the probability of a target trial is 0.001.
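The subset enumeration and the effective target prior implied by those cost parameters can be sketched as follows (the helper names are mine, not from the talk):

```python
from itertools import chain, combinations

def nonempty_subsets(n):
    """All 2**n - 1 non-empty subsets of classifier indices 0..n-1."""
    return list(chain.from_iterable(combinations(range(n), k)
                                    for k in range(1, n + 1)))

def effective_prior(p_tar, c_miss, c_fa):
    """Effective target prior that weights the training objective.

    With the talk's parameters (C_miss = C_fa = 1, P_tar = 0.001)
    this is simply 0.001.
    """
    return p_tar * c_miss / (p_tar * c_miss + (1 - p_tar) * c_fa)
```

With twelve classifiers this yields 2**12 - 1 = 4095 candidate subsets, each of which gets its own trained fusion.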
Then, after we evaluate all the possible subsets, we select a subset based on the smallest minimum decision cost function. The decision cost function is a function of the threshold and of the cost function parameters, so we pick the subset with the minimum decision cost function over all possible thresholds.
And finally, we still evaluate the actual decision cost function, which is the cost function at the log-likelihood-ratio threshold obtained on the training set, and which therefore also includes the calibration error of our base classifiers.
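The minimum and actual decision cost functions described above might be computed like this, using the talk's cost parameters (C_miss = C_fa = 1, P_target = 0.001); the Bayes threshold used for the actual DCF is the theoretical one for calibrated log-likelihood ratios:

```python
import numpy as np

# Cost parameters stated in the talk:
C_MISS, C_FA, P_TAR = 1.0, 1.0, 0.001

def dcf(tar_llrs, non_llrs, threshold):
    """Decision cost at a given threshold."""
    p_miss = np.mean(tar_llrs < threshold)
    p_fa = np.mean(non_llrs >= threshold)
    return C_MISS * P_TAR * p_miss + C_FA * (1 - P_TAR) * p_fa

def min_dcf(tar_llrs, non_llrs):
    """Minimum DCF over all thresholds (used here to rank subsets)."""
    cand = np.concatenate([tar_llrs, non_llrs, [np.inf]])
    return min(dcf(tar_llrs, non_llrs, t) for t in cand)

def actual_dcf(tar_llrs, non_llrs):
    """Actual DCF at the theoretical Bayes threshold for calibrated LLRs.

    The gap between actual and minimum DCF reflects the calibration error.
    """
    bayes_t = np.log((C_FA * (1 - P_TAR)) / (C_MISS * P_TAR))
    return dcf(tar_llrs, non_llrs, bayes_t)
```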
We had twelve different classifiers, which were used in the system submitted to the NIST 2010 evaluation. We used three different sets of scores: the so-called train set and devel set one, which were taken from the extended NIST SRE 2008 trial set, so they have very similar score distributions; and then, as something different, we also have devel set two, which is the official NIST 2010 evaluation set.
Now for the results. We divided all the possible subsets by size, from one to twelve, since we had twelve classifiers, and studied the different gains we can get by selecting a good subset. The three most important points in this plot are the worst individual subsystem and the best individual subsystem, which are the subsets of size one, that is, only one system and no fusion, and the baseline, which is the full-ensemble fusion, where all twelve classifiers are fused.
First, the blue line: it shows the non-cheating, realistic use case where we predict the best subset from the training set and then evaluate it on devel set one. For this one, unfortunately, we cannot get a better result than the full-set fusion, but sometimes, for example around subset size seven, we can get a very similar result. The best-subset-selection curve shows the performance of the best subset, if we knew how to select it, and the worst-subset-selection curve shows the case where we select the worst possible subset from the power set. Those two are the upper and lower bounds.
This is the same case, only not for the actual DCF but for the minimum DCF and the equal error rate. You can see that we can still get a better minimum DCF or equal error rate by not doing the full-set fusion but selecting a subset.
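For reference, a minimal sketch of how an equal error rate could be computed from target and non-target scores (a naive threshold sweep over the observed scores, not an exact ROC interpolation):

```python
import numpy as np

def eer(tar, non):
    """Equal error rate: the operating point where P_miss == P_fa."""
    thresholds = np.sort(np.concatenate([tar, non]))
    p_miss = np.array([np.mean(tar < t) for t in thresholds])
    p_fa = np.array([np.mean(non >= t) for t in thresholds])
    i = np.argmin(np.abs(p_miss - p_fa))  # closest crossing point
    return (p_miss[i] + p_fa[i]) / 2
```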
And finally, this is the performance on devel set two, that is, on the NIST 2010 evaluation set. We can also see that for most of the conditions, interview-interview, interview-telephone, and telephone-telephone, the best subset gives better performance than the full ensemble. Only in the mic-mic condition is there something wrong: here even the full ensemble gives worse results than the best individual system.
The conclusion of this research is that subset fusion has the potential to outperform the full-set fusion, of course only if we knew how to select the best subset. Therefore, further study should focus on subset selection methods. And I think that's it.
Okay, we have time for questions. A question from the back, please.

Q: I'd like to ask if you use the same subset for all the trials, or different subsets for different trials.
A: You mean in one of the plots, or...?
Q: Generally. So this is the system, and you put a lot of trials into it.
A: Yes.
Q: Do you select a different subset for each trial?
A: No, no, no. Just one selection.
Q: Okay.
Q: Did you compare your solution with a random selection of the subset of classifiers?
A: What do we mean by random?
Q: Could you show that plot again? So you have the two bounds in this plot. Where would a random selection fall, somewhere in between?
A: When you pick randomly, you end up with a performance between those two bounds.
Q: Well, it could be interesting to look at where it lies, maybe.
A: Okay, so for the random selection you would probably like to see a distribution.
Q: Yes, okay.
Chair: Okay, let's thank the speaker.