yeah

thank you

um

and at least there is some audience left for the last talk of the day

and uh

it is uh a bit different to the talks before

um for the outline of my talk

i was told to

give a very short introduction to what blind source separation is

and say

what's the problem in the convolutive case mainly the permutation ambiguity and how i'm solving it using

a sparsity based criterion

so um

and the usual case for blind source separation is when you have a cocktail party problem

um we have some sources

uh at this point let's say we have

speech sources two people talking

and you would like to get

the single components of that

uh but what you get are some recordings which are just mixtures

of these single components

and um

in the case

that i'm

looking at here uh we have the

added problem of the

uh mixture being a convolutive one

as we have delays of the speech we have reflections and so on

so uh

the problem becomes more complicated

and the mathematical formulation for this um we have

some sources

some mixing

matrix

at least for the instantaneous case

it gives our measurements and what we want to do is to

estimate a matrix uh a separating matrix so we get again the

uh original

signals

uh for this we have algorithms like ica

so nothing new at this point

uh what we have to um

take into account uh we never know the order of the sources and we never know which energy the sources

have
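
A compact way to write the instantaneous model just described. The symbols are not from the talk and are assumed here only for illustration: s for the sources, A for the mixing matrix, x for the measurements, W for the separating matrix, P for a permutation matrix and D for a diagonal scaling matrix.

```latex
\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t), \qquad
\mathbf{y}(t) = \mathbf{W}\,\mathbf{x}(t), \qquad
\mathbf{W}\mathbf{A} = \mathbf{P}\,\mathbf{D}
```

The last relation states the two ambiguities mentioned above: the outputs y can equal the sources only up to an unknown order (P) and an unknown energy (D).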

um

in my work i used the

well known update of the natural gradient

uh as i think some of you know

uh

for speech signals we need

uh

as always we need some

probability density

functions for the speech which we are considering here

we can safely assume uh we have a laplacian density
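
For reference, the natural gradient update is commonly written as below. This is the textbook form and only an assumption about the exact variant used in the talk; the step size mu and the score function phi are illustrative symbols. For a Laplacian density the score function reduces to the sign of a real-valued output, or to y/|y| for complex-valued outputs.

```latex
\mathbf{W} \leftarrow \mathbf{W}
  + \mu \left( \mathbf{I}
  - \mathrm{E}\!\left\{ \boldsymbol{\varphi}(\mathbf{y})\,\mathbf{y}^{H} \right\} \right) \mathbf{W},
\qquad
\varphi(y_i) = \frac{y_i}{|y_i|}
```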

so

as i already said we have

uh not the simple case we have the convolutive mixture we have

in this case

uh

different delays we have reflections and so on

so we model this

using uh a convolution

and uh for

real situations we have some long filters

two thousand four thousand taps or whatever

um

estimating these filters directly in time domain

is

hard

possible but very hard

so the usual way is to go to the

time-frequency domain using the short time fourier transform

and now what we have is

just again a multiplication in each frequency bin

so uh we can just use the

uh update equation i showed you in each frequency bin independently

which is again

uh

not a problem
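
The step from the convolution to a per-bin multiplication can be sketched as follows. The symbols are assumptions for illustration: l is the filter tap index, L the filter length, k the frequency bin and tau the frame index. The approximation holds when the analysis window is long compared to the mixing filters.

```latex
\mathbf{x}(t) = \sum_{l=0}^{L-1} \mathbf{A}(l)\,\mathbf{s}(t-l)
\;\;\xrightarrow{\;\text{STFT}\;}\;\;
\mathbf{X}(k,\tau) \approx \mathbf{A}(k)\,\mathbf{S}(k,\tau),
\qquad
\mathbf{Y}(k,\tau) = \mathbf{W}(k)\,\mathbf{X}(k,\tau)
```

Each bin k therefore looks like a separate instantaneous problem with its own separating matrix W(k).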

but

now

we have

the problem of

uh the different

permutations and scalings uh

in the previous example

you didn't need to think about that in this case we have to correct

um

the scaling

uh there are some standard ways how to solve it

uh

the typical case is the minimal distortion

principle

uh in which we

multiply the

unmixing matrix by

the diagonal of its inverse

uh what

this

does is that we

uh accept the filtering and scaling done by the mixing system

which we do not know

but at least we do not

add new distortion

at this point
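
A minimal sketch of the scaling correction, assuming the speaker means the minimal distortion principle in its usual form (each unmixing matrix is multiplied by the diagonal of its own inverse). The function names and the 2x2 restriction are illustrative choices, not taken from the talk; in practice this would be applied to the complex unmixing matrix of every frequency bin.

```python
def inverse_2x2(m):
    """Invert a 2x2 matrix given as nested lists (real or complex entries)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def minimal_distortion(w):
    """Rescale the rows of the unmixing matrix w by diag(inverse(w))."""
    inv = inverse_2x2(w)
    return [[inv[i][i] * w[i][j] for j in range(2)] for i in range(2)]


# An unmixing matrix and the same matrix with arbitrarily rescaled rows,
# as an ICA algorithm might deliver it in two different runs.
w = [[2.0, 1.0], [0.5, 3.0]]
w_rescaled = [[10.0 * v for v in w[0]], [-0.2 * v for v in w[1]]]

fixed = minimal_distortion(w)
fixed_rescaled = minimal_distortion(w_rescaled)
```

Both inputs lead to the same corrected matrix, which is the point of the correction: the arbitrary per-bin scaling chosen by the separation algorithm is removed, while the filtering of the mixing system itself is kept.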

um

some newer methods

uh were presented

in the last time uh filter shortening filter shaping

but for these methods you need

to

um

solve the permutation problem first

uh whereas this

uh you can solve in each frequency bin independently

so

we were talking about the permutation problem what is it how can it be

uh well

uh

described

in this case

we have two

short time

uh

spectrograms from the short time fourier transform

of two signals

where just

uh

these parts are swapped between the two

uh

spectrograms

when you transform these signals

back

to time domain of course

both signals appear in both channels

so again you didn't uh

separate anything so you have to correct

for this permutation and this can be

uh in every frequency band different

and it usually becomes quite complicated

uh usually there are two main approaches

uh

a lot of papers at this conference

concentrate on directivity patterns and directions of arrival

uh the idea is

when you have the unmixing matrices

uh we can just

uh

calculate

the directions where the signals come from and assume

uh that one direction is one source

this works

well

as long as we have low reverberation

but with high reverberation uh you can't

um

pinpoint the source to a single direction in all frequencies together

uh in this case here

i used the statistics of the separated signals

um one

trivial simple case is uh

you just

look

at such a line and the neighbouring line

uh

they have to look the same

so

here they are highly correlated

um

yeah this is true

at least

when you are looking at very near neighbouring bins so we have here two direct neighbouring bins

in blue and green and yeah okay they are highly correlated

if you just

go

a few bins away

yeah i wouldn't say

these bins are correlated

so the correlation method

is not

so robust

but uh there have been extensions to make it

uh a lot more robust

oh okay so

yeah

so with these um

correlation coefficients uh

you take the

um

envelope

calculate the correlation

and decide

the permutation

depending on all four possible combinations

and uh

uh you can just use it this way

as i already said this isn't very robust

uh because

yeah when comparing more distant bins

you just get it wrong
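
The correlation test between two bins can be sketched as follows for two sources. This is a simplified illustration, not the speaker's implementation: the function names are hypothetical, and each bin is assumed to be given as two magnitude envelopes over time, one per output channel.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant envelope carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def swap_needed(reference_bin, current_bin):
    """Decide from the four correlations whether current_bin is permuted."""
    straight = (pearson(reference_bin[0], current_bin[0])
                + pearson(reference_bin[1], current_bin[1]))
    crossed = (pearson(reference_bin[0], current_bin[1])
               + pearson(reference_bin[1], current_bin[0]))
    return crossed > straight


# Toy envelopes of two sources in a reference bin.
env_a = [1.0, 5.0, 2.0, 0.0, 4.0, 1.0]
env_b = [3.0, 0.0, 1.0, 6.0, 0.0, 2.0]
reference = [env_a, env_b]

# Neighbouring bin with similar envelopes, once permuted and once aligned.
permuted = [[1.1 * v for v in env_b], [0.9 * v for v in env_a]]
aligned = [[0.9 * v for v in env_a], [1.1 * v for v in env_b]]
```

With these toy envelopes `swap_needed(reference, permuted)` flags the swap and `swap_needed(reference, aligned)` does not. The weakness described in the talk appears when the envelopes of distant bins are no longer similar, so that all four correlations become small.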

so um

and

a few years ago uh it has been proposed uh the dyadic sorting scheme was proposed here

where you don't compare

single bins

uh yeah

but whole blocks of bins

so that looks like this

you compare

in the first stage you compare one bin with another

zero and one

and calculate the

correlation coefficient and you get

your permutation and take the next two bins and so on

so in this case you have neighbouring bins and you can assume okay the

assumption of highly correlated bins

is met

in the next step

you take

these two correctly aligned bins

take two and two and calculate now

uh these four correlations so actually what you get

are here four coefficients

and we have to decide

which one to take to decide which permutation do we take

the biggest one

the mean

the lowest one or whatever

four

is not a problem

here you go to already sixteen and in the next

yeah we get sixty four and so on

so it becomes even harder
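
The growth of the number of coefficients per block comparison can be checked with a few lines. The function name and the stage numbering (stage 1 compares single bins) are assumptions made for this illustration.

```python
def pairwise_coefficients(stage):
    """Number of bin-to-bin coefficients when two blocks meet at a stage.

    At stage 1 each block is a single bin; the block size doubles with
    every further stage, and every bin of one block is compared with
    every bin of the other block.
    """
    block_size = 2 ** (stage - 1)
    return block_size * block_size


counts = [pairwise_coefficients(stage) for stage in range(1, 5)]
# counts == [1, 4, 16, 64]
```

This reproduces the numbers from the talk (four, sixteen, sixty four): each stage quadruples the number of coefficients from which a single permutation decision has to be derived.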

um

a simple example for this

um

when we just plot

for this situation for all frequency

bins

um

the coefficients here

um

for all frequency bins so

in the first stage you would just take the correlation coefficients

directly

uh on the first off diagonal

uh

and a

uh okay when you look at this

it

looks like

uh

well

it's just wrong here and here

hardly

so when you are going

to the next steps

so let's say

you compare

the block

five hundred to eight hundred to the block eight hundred to one thousand

one hundred or whatever

you compare all the coefficients which are in a square

so we have a lot of coefficients which are correct

and a lot of coefficients which are

not correct

and so on in this case here

okay

it would work

here not

but in the next steps you compare these coefficients

okay this may still work it's quite stable

but in this case here

you have a lot of

wrong permutations

there are a lot of

indicators for permutations which

are right and

wrong decisions so

usually the dyadic sorting scheme

is better but still

fails

for

some signals

um

now i want to

um

present the new approach

uh the first

uh observation you can make is

when you just take

speech signals

speech signals are sparse

and um

a mixture of two signals which are independent

is less sparse

and

you can extend this

even if the signals are non sparse signals

as long as they are independent

the mixture is less sparse

and this is exactly what we have in the permutation problem we have two subband signals and want to

look which permutation do we have

so the wrong permutation will be

uh

less sparse

you have here an example of this

uh just

a plain speech signal

with nothing

added

and in this case

i just

moved the

higher half of the signal so the

upper

half of the signal

to the other so we have a permutation

like at the lower

level of the dyadic sorting scheme

and when we compare these

we have here a lot of

values of almost zero

and when you look here we have

clearly a signal which is less sparse

and uh

this is exactly what we need to uh

formulate

the new criterion

we want the signal to be as sparse as possible

uh the measurement of sparsity um

for this is now uh to take the

lp norm

uh

in my case i usually take something like zero point one

for p

but it's not that

important you can vary it
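
A small sketch of the lp norm as a sparsity measure with p = 0.1, the value mentioned in the talk. The function name and the toy signals are illustrative assumptions. The signals are scaled to unit energy so that only their sparsity differs; a smaller value means a sparser signal.

```python
import math


def lp_norm(signal, p=0.1):
    """(sum |x|^p)^(1/p); for p < 1 small values indicate sparse signals."""
    return sum(abs(v) ** p for v in signal) ** (1.0 / p)


# Two independent sparse toy sources with unit energy each.
source_1 = [1.0, 0.0, 0.0, 0.0]
source_2 = [0.0, 0.0, 1.0, 0.0]

# Their mixture, rescaled to unit energy again.
mixture = [(a + b) / math.sqrt(2.0) for a, b in zip(source_1, source_2)]

# A maximally non-sparse signal with unit energy for comparison.
dense = [0.5, 0.5, 0.5, 0.5]

sparsity_values = {
    "source": lp_norm(source_1),
    "mixture": lp_norm(mixture),
    "dense": lp_norm(dense),
}
```

The value for the single source is the smallest, the mixture of the two sources is clearly larger, and the dense signal is larger still, which matches the observation that a mixture of independent signals is less sparse than its components.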

um okay so

uh it is now

as with the correlation coefficient

we take

our signals

calculate

now not the correlation between two signals

but the sparsity of a sum of two signals

and

take again

the four coefficients

every one against each other

and you get one

um

coefficient

which can decide which permutation

the important thing about this

is now

we don't take the

coefficients in the time-frequency domain but we transform

these

coefficients

to a

time domain signal

where we can apply

the p-norm

uh

using this

even if we take

let's say a hundred frequency bins from K to S

still again we get the p-norm

just one coefficient

for the whole sorting scheme
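
The block decision can be sketched on a toy example that mirrors the demonstration in the talk, where the upper half of the spectrum is moved to the other signal. Everything here is a simplified assumption: a single DFT frame instead of a short time fourier transform, complex toy signals, and hypothetical function names. It is not the speaker's implementation.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a short sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def band_to_time(spectrum, bins):
    """Inverse DFT that uses only the given bins (all others set to zero)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in bins) / n
            for t in range(n)]


def lp_norm(signal, p=0.1):
    """(sum |x|^p)^(1/p); smaller means sparser."""
    return sum(abs(v) ** p for v in signal) ** (1.0 / p)


def swap_is_sparser(ch0, ch1, low_bins, high_bins, p=0.1):
    """One decision per block pair: is the swapped combination sparser?"""
    low0, high0 = band_to_time(ch0, low_bins), band_to_time(ch0, high_bins)
    low1, high1 = band_to_time(ch1, low_bins), band_to_time(ch1, high_bins)

    def cost(a, b):
        return lp_norm([u + v for u, v in zip(a, b)], p)

    keep = cost(low0, high0) + cost(low1, high1)
    swap = cost(low0, high1) + cost(low1, high0)
    return swap < keep


n = 16
source_a = [1.0 if t == 5 else 0.0 for t in range(n)]   # sparse toy source
source_b = [1.0 if t == 10 else 0.0 for t in range(n)]  # second sparse source
spec_a, spec_b = dft(source_a), dft(source_b)
low, high = range(0, n // 2), range(n // 2, n)

# Wrong permutation: the upper half of the bins is exchanged between channels.
perm_0 = spec_a[:n // 2] + spec_b[n // 2:]
perm_1 = spec_b[:n // 2] + spec_a[n // 2:]

detect_permuted = swap_is_sparser(perm_0, perm_1, low, high)
detect_aligned = swap_is_sparser(spec_a, spec_b, low, high)
```

However many bins a block contains, the comparison yields a single number per hypothesis, so no rule for combining many bin-to-bin coefficients is needed. In this toy case the swap is detected for the permuted channels and not for the aligned ones.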

so when we now do the

dyadic sorting

so we have again here

just one single frequency bin transformed to the time domain

here again one

we apply the p-norm

and here again

and

at this point it is uh

different

now we transform

two frequency bins to the time domain

and calculate again one p-norm and so on and so on

so at this point you don't

have the problem of

which coefficients of these

let's say thousands or whatever

do you take uh you have always just one coefficient

and

due to the

different

criterion

uh it's much more robust

mostly

um

i have

done some

simulations

um

so first set

uh uh data set this does a for the set up

use

go

T

um

so on so about last they can set from five years ago so

um

we have

to separate

this data set that uh is

uh somehow

a reverberant

recording of some speech but the reverberation is

quite low

you can when you hear of to is that has that you can see yeah it's

government art

derivations like

in this case

the direction of arrival approach

is

very good

um

it works because of the low reverberation

the proposed method is

not as good

almost

but uh when you look closely why

it is performing

not that good it's because

of the very low stage where we compare just one single frequency bin

where

yeah uh

there happened some permutations to be incorrect

and uh

so

perhaps

uh

i should investigate this a bit more

if

uh the

assumption of

sparsity and

sums

of non sparse signals at this stage is correct

and um

but

when you are going to a data set which uh has high reverberation

uh

overall you get

less

uh suppression performance

the doa approach

is worse

because with this setup you don't

have the uh

signal coming from one direction because

of the reverberation

but

with the new approach we again get almost the performance of the non-blind algorithm

uh because in this case um

it doesn't

matter which direction the signal comes from as long as we

are able to separate it

in every frequency bin

and um

so it's not always

matching the non-blind case

but it's

more robust

compared to the

results of the doa approach

so to conclude

um

the convolutive blind source separation

can be solved in the time-frequency domain

where you have to solve the scaling and permutation

and

now we presented a new algorithm based on sparsity

in the time domain

not as usual in the time-frequency domain

and with high reverberation we have usually better

separation performance than the

direction of arrival approach

uh

so

yeah let's a hard a set up it's like seven and a half set and for this i used five

seconds

i'd say

if

there is uh enough signal to make ica in each frequency band

then there would be enough signal to make the

p-norm