Hi. This is joint work with Francis Bach, from the INRIA Sierra team, and Cédric Févotte, from the statistics group at Télécom ParisTech.

The talk is about nonnegative matrix factorization with group sparsity. There have already been several talks about Itakura-Saito nonnegative matrix factorization, and this is the framework we have been working in.

I will quickly go over nonnegative matrix factorization in the next few slides. Here you can see a very simple example: a short piano signal composed of four notes. In the spectrogram you can see that first the four notes are played alone, and then combinations of two notes, one after the other.

This is an example of a very difficult source separation problem in the single-channel setting. Here you have the data, and nonnegative matrix factorization learns a dictionary of basis spectra together with their time activations. In the plots you can see the dictionary and the time activations, and you can see very clearly that the notes are separated: the activations show the four notes played first, and then the combinations of two notes. There are still two components left, one that explains the noise, and one that you can see here, which sounds like the hammer of the piano if you listen to it. So this is an example where Itakura-Saito nonnegative matrix factorization works really well.

Now here is an example of nonnegative matrix factorization using another loss, the Euclidean loss. These are the same type of plots, except you can see that the component for the first note, the top component here, gets split up across other components, so the separation is not as good as before. That is explained by the fact that the Euclidean cost is dominated by the high-energy bins, whereas the Itakura-Saito divergence measures fit on a relative scale, so it is much more sensitive to the low-energy, high-frequency content and to the transients. So the Itakura-Saito divergence seems better suited for this kind of separation.
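The key property behind this behavior is that the Itakura-Saito divergence is scale-invariant while the Euclidean cost is not, so low-energy bins count as much as loud ones under IS. A quick numerical check (toy values):

```python
import numpy as np

def d_is(x, y):
    """Itakura-Saito divergence between positive arrays (summed)."""
    r = x / y
    return float(np.sum(r - np.log(r) - 1.0))

def d_euc(x, y):
    """Squared Euclidean cost (summed)."""
    return float(np.sum((x - y) ** 2))

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.5, 1.0, 4.0])
lam = 1e-3  # pretend these are low-energy, high-frequency bins

# IS is scale-invariant: the fit on quiet bins counts as much as on loud
# ones.  The Euclidean cost shrinks by lam**2, so Euclidean NMF
# effectively ignores low-energy structure.
print(d_is(lam * x, lam * y), d_is(x, y))      # essentially equal
print(d_euc(lam * x, lam * y) / d_euc(x, y))   # shrinks by lam**2
```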

Now, if we move to more complicated audio signals, a problem appears. Even if you have only two sources, each source can emit several different spectra: for example, when I speak, there are many spectra you can associate with my voice, since I am not always saying the same thing. So there is the problem of grouping the components into sources, of assigning several components to each source.

For instance, you can simply run an NMF and look at the activation coefficients, the matrix H. In this very simple example, where you have a bass and another source whose active regions overlap, you can see that for some components it is already very clear that they should be assigned to the bass, and other components to the other source.

So one approach is to look at the dictionary, and, guided by listening or by a heuristic, an engineer can design the best grouping of components into sources.

But the problem is that as the tracks get longer, as you get more tracks, and also as the dictionary gets larger, this becomes more and more complicated for the engineer, because there is a lot more work to do. And if you use a heuristic instead, the heuristic will involve considering all permutations of the components: if you have five components you have factorial of five permutations, which is still small, but if you go to ten or twenty components in the dictionary this becomes way too long. You would run the NMF in thirty seconds and then spend one day considering all the permutations.
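To put numbers on that combinatorial blow-up (the one-millisecond cost per permutation below is made up purely for illustration):

```python
import math

# The post-hoc grouping heuristic scores every ordering of the dictionary
# components, and the count grows factorially with the dictionary size.
for k in (5, 10, 20):
    n_perm = math.factorial(k)
    hours = n_perm * 1e-3 / 3600.0  # hypothetical 1 ms per permutation
    print(f"{k:2d} components: {n_perm} permutations (~{hours:.1e} h at 1 ms each)")
```

With 5 components there are only 120 permutations; with 20 there are about 2.4e18, which is why a thirty-second NMF can be followed by days of grouping.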

So what we want to do is to include the grouping in the learning of the dictionary itself.

One way of thinking about how to group the components is to think about the sound level of each source at a given time. Here, for a given track, I have plotted the volume of each source: the bass, the drums, and the voice. You can see that there are cues you can use: for instance, at this time the bass is at a very low level compared to the other sources, so you could say that at some points one source is inactive while the others are active. Another idea is to exploit the fact that the shapes of these volume curves are very different from one source to another.
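Once a grouping of the components is fixed, these per-source volume curves are straightforward to compute from H; a small sketch (the group names and the toy values are mine):

```python
import numpy as np

def source_volumes(H, groups):
    """Per-source volume over time: the sum of each group's rows of H.
    Assumes each column of W sums to one, as in the talk, so this sum is
    the source's sound level at each frame."""
    return {name: H[rows].sum(axis=0) for name, rows in groups.items()}

# toy H: 4 components over 6 frames; components 0-1 belong to the bass,
# components 2-3 to the voice
H = np.array([[1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
              [0.5, 0.5, 0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 2.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 0.0, 1.0, 0.0]])
vol = source_volumes(H, {"bass": [0, 1], "voice": [2, 3]})
print(vol["bass"])   # bass is loud in the first two frames, then silent
print(vol["voice"])  # voice takes over in the middle frames
```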

Now, coming back to the formalism, let me set up the notation a little. What we observe is V, the power spectrogram.

At each time frame you can consider a composite Gaussian model: the observed complex spectrum is the sum of several components, and each component of the complex spectrum is a zero-mean Gaussian with diagonal covariance. Nonnegative matrix factorization then consists in computing a factorization of the parameters of this model. In the case of the Itakura-Saito divergence, this corresponds to maximum-likelihood estimation in this zero-mean Gaussian model, which means that we have a truly additive model for the power spectrogram: even if additivity does not hold exactly for the observed spectrogram, it holds for the parameters we want to estimate, and this is the only divergence for which this interpretation is exact.

Under this Gaussian assumption, if you look at the power spectrogram, it means that each bin of the power spectrogram is exponentially distributed, with parameter given by the corresponding entry of WH, where W is the basis dictionary and H contains the time activation coefficients.
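To make "additivity holds for the parameters" concrete, here is a small simulation of the composite Gaussian model (the dimensions and values are arbitrary): summing independent zero-mean circular complex Gaussian components with variances W[f,k]·H[k,n] gives a power that is exponentially distributed with mean [WH][f,n].

```python
import numpy as np

rng = np.random.default_rng(0)
F, K, N = 8, 3, 5
W = rng.random((F, K)) + 0.1
H = rng.random((K, N)) + 0.1

f, n = 2, 3
var = W[f] * H[:, n]                  # variance of each latent component
n_draws = 200_000

# circular complex Gaussians: real and imaginary parts each carry half
# the variance
c = rng.normal(size=(n_draws, K)) + 1j * rng.normal(size=(n_draws, K))
c *= np.sqrt(var / 2.0)
x = c.sum(axis=1)                     # observed complex coefficient at (f, n)
power = np.abs(x) ** 2

# additivity holds for the parameters: E[|x|^2] = [WH]_{f,n}
print(power.mean(), (W @ H)[f, n])
```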

In this notation, H has several rows, and what you want to find is a partition of the rows of H into, say, two groups; this generalizes to an arbitrary number of groups. Here, for instance, we would have two groups with the same number of rows each.

Now, coming back to the previous slide: what is the volume, the sound level, of each source in this model? Well, if you assume that each column of W sums to one, then the sound level of one source is the sum of the activation coefficients of the group that corresponds to that source. And these are the coefficients we want to model.

The inference we propose is to learn the grouping at the same time as the factorization. This corresponds to adding, on top of the Itakura-Saito NMF fit, a prior on the volumes of the groups, that is, of the different sources. Since the coefficients are nonnegative, this l1 norm is just the sum of the coefficients of H for one group at a given time. For the function psi, at this point we only assume that it is concave. What this optimization problem expresses is that you want a good fit to the data, but at the same time you have a prior that at a given time only a few sources are active.
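As a concrete, hypothetical instance of the penalized problem: taking psi(u) = log(u + eps) as the concave function (eps is an illustrative smoothing constant, lambda the prior weight), the objective can be sketched as:

```python
import numpy as np

def psi(u, eps=1.0):
    """A concave penalty on the group volume; log(u + eps) is one
    standard choice (eps here is an illustrative constant)."""
    return np.log(u + eps)

def objective(V, W, H, groups, lam):
    """IS-divergence fit plus the group-sparsity prior: the sum over
    groups and frames of psi applied to the group's volume (the sum of
    its rows of H)."""
    R = V / (W @ H)
    fit = float(np.sum(R - np.log(R) - 1.0))
    penalty = sum(float(np.sum(psi(H[g].sum(axis=0)))) for g in groups)
    return fit + lam * penalty
```

Because psi is concave, concentrating the energy in few active sources per frame costs less than spreading it across all of them.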

We do have a particular choice for psi: if you look at the paper, you will see that it comes from a graphical model with two layers. This penalized problem actually corresponds to maximum a posteriori inference in that model, that is, a generative model of the data together with a prior on H.

About the inference of the parameters: in the Itakura-Saito case the optimization is very hard, so, as in related methods, we resort to multiplicative update rules, because they go way faster. Here is an example, on an audio excerpt, of running the algorithm with a gradient method and with the multiplicative update method: you can see that the multiplicative algorithm goes much faster and actually converges to a better local optimum.

Our algorithm does not change significantly from the standard Itakura-Saito NMF; we just add terms which correspond to our prior. Since psi is a concave function, its derivative psi-prime is decreasing in the sound level of the source. So what the algorithm does is that at each step it updates H so as to get a better fit of the data, corresponding to the classical Itakura-Saito multiplicative update, while the extra term shrinks the coefficients: the lower the volume of a source at a given time, the more its coefficients are shrunk at that time. This means that the algorithm will push low-amplitude sources to zero and keep the high-amplitude sources. And I should stress that even with this prior, the speed of the algorithm does not change: it converges in approximately a thousand iterations, and the time per iteration is essentially that of the classical algorithm.
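The mechanism can be sketched as one multiplicative H-update; this is my reconstruction of the rule described in the talk (the derivative psi'(u) = 1/(u + eps) of the log penalty enters the denominator), not necessarily the exact update from the paper.

```python
import numpy as np

def grouped_is_nmf_H_update(V, W, H, groups, lam, eps=1.0):
    """One multiplicative H-update for IS-NMF with a group-sparsity prior.
    psi'(group volume) goes into the denominator, so groups that are
    already quiet at a frame are shrunk harder: low-amplitude sources get
    pushed toward zero while loud ones are kept."""
    Vhat = W @ H
    Psi_prime = np.zeros_like(H)
    for rows in groups:
        vol = H[rows].sum(axis=0)            # group volume per frame
        Psi_prime[rows] = 1.0 / (vol + eps)  # psi'(vol), shared by the group's rows
    return H * (W.T @ (V * Vhat**-2)) / (W.T @ Vhat**-1 + lam * Psi_prime)
```

With lam = 0 this reduces to the classical IS-NMF update, which is why the per-iteration cost is unchanged.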

Now, one complicated aspect of having this prior is that you must do model selection for the hyperparameters: the prior has a parameter lambda in our model, and there is also the choice of psi. Given that we actually have a graphical model that explains this prior, we could resort to Bayesian tools to estimate those parameters, but instead we devised a statistic that is much simpler to evaluate. The principle of this statistic is that if you knew all the right parameters, then V given those parameters would be exponentially distributed. So if you compute the ratio of V over the estimate WH and look at this random variable, it should be distributed as a unit exponential. And you have a lot of samples of it, because you have as many of them as there are time-frequency bins. Then computing a Kolmogorov-Smirnov statistic becomes very interesting, because it is very cheap: you can just run a whole grid of experiments and look at the parameter values for which you have the lowest value of the statistic.
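This goodness-of-fit check is easy to sketch; here the Kolmogorov-Smirnov distance to the unit-exponential law is implemented by hand to keep the sketch dependency-free, and the data are synthetic (dimensions and the mis-scaling factor are arbitrary).

```python
import numpy as np

def ks_exp1(samples):
    """Kolmogorov-Smirnov distance between the empirical CDF of the
    samples and the unit-exponential CDF 1 - exp(-x)."""
    x = np.sort(np.ravel(samples))
    n = x.size
    cdf = 1.0 - np.exp(-x)
    grid = np.arange(n + 1) / n
    return float(max(np.max(grid[1:] - cdf), np.max(cdf - grid[:-1])))

# Under the composite Gaussian model with the right parameters, every
# time-frequency bin of V / (WH) is a unit-exponential draw.
rng = np.random.default_rng(1)
W = rng.random((30, 4)) + 0.1
H = rng.random((4, 200)) + 0.1
V_good = rng.exponential(W @ H)          # data drawn from the model itself
V_bad = rng.exponential(3.0 * (W @ H))   # mis-scaled parameters
print(ks_exp1(V_good / (W @ H)))         # small: parameters fit
print(ks_exp1(V_bad / (W @ H)))          # large: parameters rejected
```

Sweeping lambda (and the choice of psi) on a grid and keeping the setting with the smallest statistic is the selection procedure described in the talk.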

So we did that on synthetic data generated from the model. We looked at different numbers of training samples, and we plotted the value of the statistic in blue and, in red, a measure of the distance to the ground truth: since in this setting we generated the synthetic data from a known model, we can actually compute the divergence between the estimated parameters and the true model. We also plotted the classification accuracy, which measures whether components generated by source one are recovered exactly as source one, and likewise for source two. When there are only a hundred observations, you can see that the classification accuracy is already good, but it is difficult to find a minimum of the statistic. As you increase the number of points, the estimates get better, you see the minimum of the statistic more clearly, and the divergence to the true model also decreases: the more data you get, the better. So this just means that, given enough data, selecting the prior parameters with this statistic gives estimates that are as good as possible.

Now on to experimental results. A first test is to try this on a simple segmentation task, where you know that at any given time only one source is active. A good thing is to compare our algorithm with the simple idea of first running an NMF and then finding the best permutation given a heuristic. So the heuristic is: compute an NMF, then find the grouping of components that minimizes this same quantity. This gives a fair comparison.
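The baseline can be sketched as a brute-force search over component-to-source assignments, scored with the same concave group-volume penalty; the exhaustive enumeration makes the combinatorial cost discussed earlier explicit. Details such as scoring with log(1 + volume) are my assumptions.

```python
import itertools
import numpy as np

def best_grouping(H, n_sources=2, eps=1.0):
    """Post-hoc grouping baseline: after a plain NMF, exhaustively search
    the assignment of components to sources that minimizes the concave
    group-volume penalty sum over frames of log(eps + volume).  Only
    viable for small dictionaries."""
    K = H.shape[0]
    best, best_score = None, np.inf
    for labels in itertools.product(range(n_sources), repeat=K):
        if len(set(labels)) < n_sources:
            continue  # every source must receive at least one component
        score = 0.0
        for s in range(n_sources):
            rows = [k for k in range(K) if labels[k] == s]
            score += float(np.sum(np.log(eps + H[rows].sum(axis=0))))
        if score < best_score:
            best, best_score = labels, score
    return best

# toy activations: components 0-1 active early, components 2-3 active late
H = np.array([[1.0, 1, 1, 0, 0, 0],
              [1.0, 1, 1, 0, 0, 0],
              [0.0, 0, 0, 1, 1, 1],
              [0.0, 0, 0, 1, 1, 1]])
print(best_grouping(H))  # recovers the block structure (up to label swap)
```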

Here is the result of an NMF with this heuristic grouping. You can hear the mixture first, then the speech, then the other sources. With the heuristic grouping there is still a lot of mixing left: the sources are not separated well. And here is the result with our algorithm, which learns the grouping at the same time as the NMF: you can hear that the separation gets a lot closer to the original sources.

The second experiment we ran was on real audio signals. We took excerpts from the SiSEC database and we evaluated the quality of the separation while varying the degree of overlap between the sources; the more the sources overlap in time, the more difficult the separation becomes. I should insist on the fact that we have no prior training on the sources, so you cannot hope for perfect separation, but you will see that the less overlap there is, the better the separation. For instance, with about thirty-three percent overlap: here is the mixture, here is a source, here is the voice. You get very good separation in terms of SDR. And as the overlap increases, the separation degrades more and more.

What the prior accounts for is that not all sources are active at the same time, and when that holds it helps the separation. Now we can listen to a couple of examples: one where it works and one where it does not.


Okay, so let's listen to this: this is the first mix.


This is the second source.

Let me skip directly to the results.


For each source, we have an estimate and the original.


I have ten seconds left, so let me conclude. We have proposed a simple sparsity prior to do grouping of the sources and solve the permutation problem in the single-channel source separation case. We showed that the algorithm performs better than doing the grouping as a post-processing step. In future work we will try to incorporate a smoothness prior, in order to capture the temporal dynamics of H. Thank you.

We have time for only one quick question.

The sources you played mostly have low-frequency components, and they mix very much alike because they are playing according to the same chords, so they sound close to each other. I am wondering how much the sampling rate and the frequency resolution affect the SDR. I mean, if you went to a higher frequency resolution, would it be different? Do you see a change in the separation?

From my experience, the result is not very sensitive to the sampling rate you choose. For this experiment I chose a sampling rate of twenty-two kilohertz, just because of computing-time concerns. I guess for the sources in this example, since they play approximately in the mid and high range of the spectrogram, this would not have too much effect, because in that range the frequencies are pretty well separated. But if you have a bass and another source, then having a good resolution will certainly help: since we have no model learned for the bass beforehand, a higher sampling rate helps, because you can then afford a longer time window and get better resolution in the frequency range that is particularly important here, the low-frequency range. Other than that, I would say the results are very robust as far as the sampling rate goes.

okay thank you