Okay. So this morning I will talk about what we mean by classifier fusion.

Classifier fusion is applicable whenever we have some ensemble of experts and we need to come to some final decision. Furthermore, in this example we assume that those experts are able to give us their decisions in the form of some confidence score.

Perhaps the simplest, and also mostly working, method for fusing those scores would be just to average the confidence values. But sometimes we have some prior information about the experts, about how well they have performed in the past, and we would like to exploit this information to make a better fusion.

So the task of classifier fusion is to take the outputs of the N base classifiers and produce one output score which ideally gives better performance than any single base classifier.

Here we assume so-called linear fusion, which is a very simple method, but one that is also used in state-of-the-art tools like the FoCal toolkit. Linear fusion is just a weighted sum of the input scores, where the weights are trained from previous trials with known ground truth.
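
As a rough illustration of this weighted-sum fusion, here is a minimal sketch; the function and variable names are my own, not taken from the presented system. Each base classifier contributes one column of a score matrix, and the fused score is simply the product with the trained weight vector.

```python
import numpy as np

def linear_fusion(scores, weights, offset=0.0):
    """Weighted sum of base-classifier scores (plus an optional offset).

    scores  : (n_trials, n_classifiers) array, one column per base classifier
    weights : (n_classifiers,) vector trained on previous trials with known labels
    """
    return np.asarray(scores) @ np.asarray(weights) + offset

# Illustrative use: three base classifiers, two trials.
scores = np.array([[ 1.2,  0.4, -0.3],
                   [-0.8, -1.1,  0.2]])
fused = linear_fusion(scores, weights=[0.5, 0.3, 0.2])
```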

What we mean by subset fusion is that we first select only certain classifiers from the full set, and only those are then forwarded to the fusion training and to the fusion itself.

What could be the motivation for something like this? First, the traditional approach with the full set is the most commonly used method: it is straightforward and computationally efficient, since you do not have to do any subset selection. But when we have a large number of classifiers, we could possibly be over-training the fusion, whereas in the subset case we might be able to avoid that. Of course, this approach relies on a good subset selection. So the question is: can subset fusion give better performance than the full set?

Now for the system overview. On the input we have speech, typically two utterances. Those are classified by several classifiers that we selected from the full set of classifiers, and the scores of the selected classifiers are then fused.

In more detail, how we do it is this: we first train a score calibration mapping for each of the base classifiers' scores. The calibration mapping maps the scores into well-calibrated log-likelihood ratios. On the first line you can see the calibration mapping, and on the second line the cost function Cllr, which we minimize for the mapped scores.
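
As a rough sketch of what such a calibration step could look like, assuming an affine score-to-LLR mapping trained by minimizing Cllr, in the spirit of the FoCal toolkit; the function and names are illustrative, not the actual code of the system:

```python
import numpy as np
from scipy.optimize import minimize

def train_calibration(scores, labels):
    """Fit an affine mapping s -> a*s + b whose output behaves as a log-likelihood ratio.

    The parameters are found by minimizing Cllr, the calibration-sensitive cost of
    the log-likelihood ratios; labels are 1 (target trial) / 0 (non-target trial).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    tar, non = scores[labels == 1], scores[labels == 0]

    def cllr(params):
        a, b = params
        llr_tar, llr_non = a * tar + b, a * non + b
        return 0.5 * (np.mean(np.log1p(np.exp(-llr_tar))) +
                      np.mean(np.log1p(np.exp(llr_non)))) / np.log(2)

    a, b = minimize(cllr, x0=[1.0, 0.0], method="Nelder-Mead").x
    return lambda s: a * np.asarray(s) + b
```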

Then, for each of the subsets in the power set, 2^N - 1 of them, we train a linear fusion with the CWLR objective function (prior-weighted logistic regression), the same as in the FoCal toolkit.

As you can see in the first formula, the prior with which the CWLR function is weighted comes from the cost function. For the cost function we use the NIST cost function, with a cost of a miss of one, a cost of a false alarm of one, and a probability of a target trial of 0.001.
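
To make this concrete, here is a minimal sketch of such a prior-weighted fusion training step, assuming the effective prior is derived from the cost parameters above and the weights are found by minimizing a prior-weighted logistic loss. This is only an approximation of the FoCal-style objective; the names and the optimizer choice are mine.

```python
import numpy as np
from scipy.optimize import minimize

# Cost parameters mentioned above: C_miss = C_fa = 1, P_target = 0.001.
C_MISS, C_FA, P_TAR = 1.0, 1.0, 0.001
# Effective prior at which the weighted objective is evaluated.
P_EFF = P_TAR * C_MISS / (P_TAR * C_MISS + (1.0 - P_TAR) * C_FA)

def train_linear_fusion(score_matrix, labels, prior=P_EFF):
    """Train fusion weights w and offset b by minimizing a prior-weighted logistic loss.

    score_matrix : (n_trials, n_subset_classifiers) calibrated scores of one subset
    labels       : 1 for target trials, 0 for non-target trials
    """
    X = np.asarray(score_matrix, dtype=float)
    labels = np.asarray(labels)
    tar, non = X[labels == 1], X[labels == 0]
    logit_prior = np.log(prior / (1.0 - prior))

    def objective(params):
        w, b = params[:-1], params[-1]
        llr_tar = tar @ w + b
        llr_non = non @ w + b
        return (prior * np.mean(np.log1p(np.exp(-(llr_tar + logit_prior)))) +
                (1.0 - prior) * np.mean(np.log1p(np.exp(llr_non + logit_prior))))

    res = minimize(objective, x0=np.zeros(X.shape[1] + 1), method="Nelder-Mead")
    return res.x[:-1], res.x[-1]
```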

Then, after we have fused all the possible subsets, we select the subset based on the smallest minimum decision cost function. The decision cost function is a function of the threshold and of the cost function parameters, so we pick the subset with the minimum decision cost function over all possible thresholds.
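
A sketch of this selection step might look like the following: compute the minimum DCF of the fused scores for every non-empty subset and keep the best one. Here I use a simple equal-weight average as a stand-in for the trained per-subset fusion; in the described system the weights would come from the training step above.

```python
import numpy as np
from itertools import combinations

C_MISS, C_FA, P_TAR = 1.0, 1.0, 0.001

def min_dcf(scores, labels):
    """Minimum normalized decision cost function over all candidate thresholds."""
    scores, labels = np.asarray(scores, dtype=float), np.asarray(labels)
    tar, non = scores[labels == 1], scores[labels == 0]
    norm = min(C_MISS * P_TAR, C_FA * (1.0 - P_TAR))
    best = np.inf
    for t in np.concatenate((np.unique(scores), [np.inf])):
        p_miss = np.mean(tar < t)
        p_fa = np.mean(non >= t)
        best = min(best, (C_MISS * P_TAR * p_miss + C_FA * (1.0 - P_TAR) * p_fa) / norm)
    return best

def select_best_subset(score_matrix, labels):
    """Pick the subset of classifiers whose fused scores give the smallest minimum DCF."""
    n = score_matrix.shape[1]
    best_subset, best_cost = None, np.inf
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            fused = score_matrix[:, list(subset)].mean(axis=1)  # stand-in for the trained fusion
            cost = min_dcf(fused, labels)
            if cost < best_cost:
                best_subset, best_cost = subset, cost
    return best_subset, best_cost
```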

And finally, we evaluate the actual decision cost function, which is the cost function evaluated at the threshold and with the log-likelihood ratios that were trained on the training set; this therefore also includes the calibration error of our base classifiers.
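
In other words, the actual DCF is evaluated at a fixed threshold carried over from the training data rather than at the best possible threshold, along these lines (again only an illustrative sketch):

```python
import numpy as np

C_MISS, C_FA, P_TAR = 1.0, 1.0, 0.001

def actual_dcf(llrs, labels, threshold):
    """Normalized DCF at a fixed decision threshold (e.g. the one set on the training data).

    Unlike the minimum DCF, this also reflects any calibration error in the LLRs.
    """
    llrs, labels = np.asarray(llrs, dtype=float), np.asarray(labels)
    p_miss = np.mean(llrs[labels == 1] < threshold)
    p_fa = np.mean(llrs[labels == 0] >= threshold)
    norm = min(C_MISS * P_TAR, C_FA * (1.0 - P_TAR))
    return (C_MISS * P_TAR * p_miss + C_FA * (1.0 - P_TAR) * p_fa) / norm

# For well-calibrated LLRs, the Bayes-optimal threshold would be
# log((C_FA * (1 - P_TAR)) / (C_MISS * P_TAR)), roughly 6.9 at this operating point.
```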

We had twelve different classifiers, which were used in the I4U consortium submission for the NIST 2010 evaluation. We used three different sets of scores: the so-called train set and evaluation set 1 are from the extended NIST SRE 2008 trial set, and they have very similar score distributions; then, for something different, we also have evaluation set 2, which is the official NIST SRE 2010 evaluation.

So, for the results: we divided all the possible subsets by size, from one to twelve, since we had twelve classifiers, and studied how much we can gain by selecting a good subset. The three most important points in this plot are the worst individual subsystem, the best individual subsystem (those are the subsets of size one, only one system and no fusion), and the baseline, which is the full-ensemble fusion where all twelve classifiers are fused.

First, the blue line shows the non-cheating, realistic use case where we predict the best subset from the training set and then evaluate it on evaluation set 1. For this one, unfortunately, we cannot get a better result than the full-set fusion, but sometimes, for instance at subset size seven, we can get a very similar result.

The best subset selection curve shows the performance of the best subset, if we knew how to select it, and the worst subset selection curve shows the case when we select the worst possible subset from the power set. Those two are the lower and upper bounds.

This is the same case, only not for the actual DCF but for the minimum DCF and the equal error rate. You can see that we can still get a better minimum DCF or equal error rate by not doing the full-set fusion but selecting a subset.

And finally, this is the performance on evaluation set 2, the NIST 2010 evaluation set. We can also see that for most of the conditions, interview-interview, interview-telephone, and telephone-telephone, the best subset gives better performance than the full ensemble. Only in the mic-mic condition is there something wrong: there, even the full ensemble gives worse results than the best individual system.

The conclusion of this research is that subset fusion has the potential to outperform the full-set fusion, of course only if we knew how to select the best subset. Therefore, further study should focus on subset selection methods.

Okay, we have time for questions. Yes, the question in the back, please.

Question: I'd like to ask whether you used the same subset for all the trials, or different subsets for different trials.

Answer: You mean in one of the plots, or...?

Question: Generally. So this is the system; you put a number of trials into it.

Answer: Yes.

Question: Do you then select a different subset for each trial, or not?

Answer: No, no, we select just one subset.

Question: Okay.

Question: Did you compare your solution with a random selection of the subset of classifiers?

Answer: What do we mean by random, just...

Question: Can you show the plot again? So you have the two bounds plotted here. A random selection would be somewhere in between: when you pick randomly, you end up with a performance between those two bounds. It could be interesting to see where it lies.

Answer: Maybe. Okay, so for a random selection you would probably like to see a distribution.

Question: Oh, okay.

Okay, let's thank the speaker.