0:00:26 Hi. This is joint work with Francis Bach from the Sierra team at INRIA and with Cédric Févotte from the statistics team at Télécom ParisTech.
0:00:39 I am going to talk about Itakura-Saito non-negative matrix factorization with group sparsity. There have been several talks about Itakura-Saito non-negative matrix factorization already, and we have been working on adding priors within this framework.
0:00:59 I will quickly go over non-negative matrix factorization in the next few slides. Here you can see a simple example of a music signal: it is composed of piano notes. At each bar you can see that first four notes of the piano are played alone, and then combinations of two notes which are harmonically related, one an octave above the other.
0:01:32 This is an example of a difficult single-channel source separation problem. What you can see here is the data, and non-negative matrix factorization learns a basis dictionary, with basis spectra, together with time activations. Here you can see the dictionary and the time activations, and you can see very clearly that the notes are separated; you can read off the activations very easily: the four notes are played one by one and then in combinations of two. There are still two components left: one that explains the noise, and one that you can see here; if we could listen to it, it sounds like the hammer of the piano.
0:02:21 So this is an example where Itakura-Saito non-negative matrix factorization works really well. Here is the same example with non-negative matrix factorization using another loss, the Euclidean loss. These are the same type of plots, except you can see that for the first notes, which are the top components here, the top component gets split across other components, so the separation is not as good as before. That is explained by the fact that the Itakura-Saito divergence is more sensitive to high frequencies and to low-energy content, so it seems better suited for this task.
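For reference, a minimal statement of the two losses being compared here, in the usual NMF notation (V is the power spectrogram, W the dictionary, H the activations):

$$
d_{\mathrm{IS}}(x \mid y) = \frac{x}{y} - \log\frac{x}{y} - 1,
\qquad
D_{\mathrm{IS}}(\mathbf{V} \mid \mathbf{W}\mathbf{H}) = \sum_{f,t} d_{\mathrm{IS}}\!\left(v_{ft} \mid [\mathbf{W}\mathbf{H}]_{ft}\right),
$$

versus the squared Euclidean loss $\sum_{f,t}\left(v_{ft} - [\mathbf{W}\mathbf{H}]_{ft}\right)^2$. The Itakura-Saito divergence is scale-invariant, $d_{\mathrm{IS}}(\lambda x \mid \lambda y) = d_{\mathrm{IS}}(x \mid y)$, which is why it weights low-energy time-frequency bins as heavily as high-energy ones.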
0:03:14 Now, if we move to more complicated audio signals, a problem appears: even if you have only two sources, each source can emit several different spectra. For example, when I speak there are several spectra that you can associate with my voice; I am not always saying the same thing. So there is the problem of grouping the components into sources, of assigning several components to each source.
0:03:47 For instance, you can simply run an NMF and look at the activation coefficients, the matrix H. In this very simple example, where you have a bass and another instrument that overlap in this region, you can see that for some components it is already very clear that these components should be assigned to the bass, and other components to the other source.
0:04:16 One approach is to look at the dictionary and be guided by a statistic or a heuristic: an engineer can design the best grouping of components into sources by hand. The problem is that as the tracks get longer, as you get more tracks, and also as the dictionary gets larger, this becomes more complicated for the engineer, because there is a lot more work to do. And if you use a heuristic, this heuristic will involve considering all permutations of the components of your dictionary. If you have five components, you have factorial-five permutations, which is still small; but if you want ten or twenty components in the dictionary, this becomes way too long: you would run an NMF for thirty seconds and then spend a day considering all the permutations of the sources. So what we want to do is to include the grouping in the learning of the dictionary.
0:05:22 One way of thinking about how to group the components is to think about the sound levels of each source at a given time. Here, for a given track, I have plotted the volume for each source: the bass, the guitar, and the voice. You can see that there are some cues you can use. For instance, at this time the bass is at a very low level compared to the other sources, so you could say that at some points one source is inactive while the others are active. Another idea is to exploit the fact that the shapes of these volume activations are very different.
0:06:07 Now, coming back to the problem, let me set up the notations a little. What we are looking at is V, the power spectrogram. You can consider that, in an additive generative model, the observed complex spectrogram is the sum of several components, and each component of the complex spectrum is distributed as a Gaussian with diagonal covariance; non-negative matrix factorization then consists in computing a factorization of the parameters of the model. In the case of the Itakura-Saito divergence, this corresponds to a zero-mean Gaussian model, which means that we have a truly additive model for the power spectrogram: even if additivity does not hold exactly for the observed power spectrogram, it does hold for the parameters we want to estimate, and this is the only model for which this is true.
0:07:21 So the zero-mean Gaussian assumption, when you come down to looking at the power spectrogram, means that the power spectrogram is distributed as an exponential with mean [WH], where W is the basis dictionary and H the time activations.
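A compact statement of this composite model, in the notation standard for Itakura-Saito NMF ($x_{ft}$ is the complex STFT coefficient and $v_{ft} = |x_{ft}|^2$ the power spectrogram):

$$
x_{ft} = \sum_{k=1}^{K} c_{k,ft},
\qquad
c_{k,ft} \sim \mathcal{N}_c\!\left(0,\; w_{fk} h_{kt}\right)
\;\Longrightarrow\;
v_{ft} \sim \mathrm{Exponential}\!\left(\text{mean} = [\mathbf{W}\mathbf{H}]_{ft}\right),
$$

and maximum-likelihood estimation of W and H in this model is equivalent to minimizing $D_{\mathrm{IS}}(\mathbf{V} \mid \mathbf{W}\mathbf{H})$.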
0:07:39 In my notation H has several rows, and you want to find a partition of the rows of H into, say, two groups; this generalizes to an arbitrary number of groups. So here H would be split into two groups with the same number of rows.
0:08:07 Now, coming back to the previous slide: what is the volume, the sound level of each source, in such a model? Well, if you assume that the entries of each column of W sum to one, then the sound level of one source will be the sum of the activation coefficients of group one, which corresponds to source one. So what we want to model is these coefficients.
0:08:38 The inference we propose is to learn the grouping at the same time as the factorization. This corresponds to doing something close to an NMF, where we propose adding a prior that favors sparsity across the groups, that is, across the different sources. Since you have non-negative coefficients, this l1 norm is just the sum of the coefficients of H for one source, that is, one group, at a given time. As for psi, here we only assume that it is a concave function. So what this optimization problem trades off is that you want a good fit to the data, but at the same time you have a prior saying that at a given time only a few sources are active.
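Written out, the optimization problem described here has the form (with G groups, $\mathbf{h}_{g,t}$ the subvector of the t-th column of H indexed by group g, and $\lambda$ a regularization weight):

$$
\min_{\mathbf{W} \ge 0,\, \mathbf{H} \ge 0}\;
D_{\mathrm{IS}}(\mathbf{V} \mid \mathbf{W}\mathbf{H})
\;+\; \lambda \sum_{t=1}^{T} \sum_{g=1}^{G} \psi\!\left(\lVert \mathbf{h}_{g,t} \rVert_1\right),
\qquad
\lVert \mathbf{h}_{g,t} \rVert_1 = \sum_{k \in g} h_{kt},
$$

with $\psi$ concave. Because the columns of W sum to one, $\lVert \mathbf{h}_{g,t} \rVert_1$ is exactly the sound level of source g at time t.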
0:09:38 We have a particular choice for psi; if you look at the paper, you will see that it comes from a graphical model with two layers. I will not say too much about this, but it corresponds to maximum-likelihood inference of the parameters of a hierarchical model of the data with a prior on H.
0:10:14 About the inference of the parameters: in the Itakura-Saito case exact inference is very hard, so, as in related methods, we resort to multiplicative updates for the parameter inference, because they go much faster. Here is an example, in the right-hand window, of running the algorithm with either gradient methods or the multiplicative update method: the multiplicative algorithm goes much faster and actually converges to a better local optimum.
0:10:50 Our algorithm does not change significantly from the standard, classical Itakura-Saito NMF; we just add terms which correspond to our prior. Since psi is a concave function, the term psi-prime decreases with the sound level of source one. So what the algorithm does is that at each step you update H so as to get a better fit to the data, corresponding to the classical multiplicative update; and the lower the volume of source one, the more its coefficients will be shrunk at that time. This means that the algorithm will push low-amplitude sources to zero and keep high-amplitude sources.
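A minimal sketch of the kind of update scheme just described, assuming psi = log (so psi'(u) = 1/u) and folding the penalty gradient into the denominator of the standard Itakura-Saito multiplicative update; function and variable names are mine, not the paper's:

```python
import numpy as np

def is_nmf_group_sparse(V, groups, lam=0.1, n_iter=1000, eps=1e-12, seed=0):
    """Itakura-Saito NMF with a group-sparsity penalty on H (illustrative sketch).

    V      : (F, T) non-negative power spectrogram
    groups : list of component-index lists, one per source (a partition)
    lam    : penalty weight; psi is taken here to be log, a concave function.
    """
    F, T = V.shape
    K = sum(len(g) for g in groups)
    rng = np.random.default_rng(seed)
    W = rng.random((F, K)) + eps
    H = rng.random((K, T)) + eps
    for _ in range(n_iter):
        # standard IS-NMF multiplicative update for W (beta-divergence, beta = 0)
        Vhat = W @ H + eps
        W *= ((V / Vhat**2) @ H.T) / ((1.0 / Vhat) @ H.T + eps)
        # keep each column of W summing to one, rescaling H so W @ H is unchanged
        scale = W.sum(axis=0) + eps
        W /= scale
        H *= scale[:, None]
        # update for H: the penalty gradient lam * psi'(group volume) enters the denominator
        Vhat = W @ H + eps
        P = np.zeros_like(H)
        for g in groups:
            vol = H[g].sum(axis=0) + eps   # sound level of this source at each time
            P[g] = lam / vol               # psi' = 1/u: quiet sources are shrunk hardest
        H *= (W.T @ (V / Vhat**2)) / (W.T @ (1.0 / Vhat) + P + eps)
    return W, H
```

Usage would be, for example, `W, H = is_nmf_group_sparse(V, groups=[list(range(5)), list(range(5, 10))])` for two sources with five components each.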
0:11:54 One nice fact is that even though we add this prior, it does not change the speed of the algorithm: it converges in approximately a thousand iterations, and the time per iteration is roughly the same as for the classical algorithm.
0:12:13 Now, one complicated aspect of having this prior is that you must perform model selection for the hyperparameters: the parameter lambda in our model, and also the choice of psi. Given that we actually have a graphical model that explains the choice of this prior, we could resort to Bayesian tools to estimate those parameters, but instead we devised a statistic that is much simpler to evaluate. The principle of this statistic is that if you knew all the right parameters, then V given these parameters should be exponentially distributed. So if you compute this statistic, replacing the parameters by the estimates of W and H, and look at this random variable, it should be distributed as a unit exponential. And you have a lot of samples of it, because you have as many samples as there are time-frequency bins. Then computing a Kolmogorov-Smirnov statistic becomes very attractive, because it is very cheap: you can just run a whole grid of experiments and look at the parameter values for which you get the lowest Kolmogorov-Smirnov statistic.
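A sketch of how this check could look in practice, assuming the normalized residuals v_{ft} / [WH]_{ft} (which are unit exponential under the model) are compared against Exponential(1) with SciPy's Kolmogorov-Smirnov test; the function name is illustrative:

```python
import numpy as np
from scipy import stats

def ks_model_selection_stat(V, W, H, eps=1e-12):
    """Kolmogorov-Smirnov distance between the normalized residuals and the
    Exponential(1) distribution; smaller means a better-calibrated fit."""
    resid = (V / (W @ H + eps)).ravel()  # one sample per time-frequency bin
    return stats.kstest(resid, "expon").statistic
```

Model selection is then a grid search: run the factorization for each candidate (lambda, psi) and keep the setting with the smallest statistic.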
0:13:46 We did that on synthetic data generated from the model, with two sources. You can look at different numbers of training samples for the model. At the top, the value of our statistic is shown in blue, and in red a measure of the goodness of fit to the model: because in this setting we generated the synthetic data from a known model, we can actually compute the divergence between the estimated parameters and the true ones. You can also see classification scores: classification of source one counts as correct if source one is recovered exactly, and likewise for source two. When there are only a hundred observations, the classification accuracy is already good, but it is difficult to find a minimum of the statistic. As you increase the number of data points, the estimates get better, you can see the minimum of the statistic more clearly, and the divergence to the true model also decreases. This simply means that the more data you have, the better our prior will estimate the parameters.
0:15:17 That was on synthetic data; let us now turn to experimental results. A first idea is to try this on a simple segmentation task, where you know that at any given time only one source is active. A good baseline to compare our algorithm against is the simple idea of running an NMF and then finding the best permutation given a heuristic. So the heuristic is: compute an NMF, then find the grouping of the components that minimizes this same penalty; this gives a fair comparison.
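For comparison, a brute-force sketch of that post-hoc baseline, scoring every assignment of components to sources with the same concave penalty (here psi = log(eps + .)); names are illustrative:

```python
import itertools
import numpy as np

def best_post_hoc_grouping(H, n_sources, lam=1.0, eps=1e-12):
    """Assign each of the K components of H to one of n_sources groups so as to
    minimize the group-sparsity penalty sum_t sum_g log(eps + ||h_{g,t}||_1)."""
    K = H.shape[0]
    best_assign, best_cost = None, np.inf
    for assign in itertools.product(range(n_sources), repeat=K):  # n_sources**K cases
        cost = 0.0
        for g in range(n_sources):
            idx = [k for k in range(K) if assign[k] == g]
            if idx:
                cost += lam * np.log(eps + H[idx].sum(axis=0)).sum()
        if cost < best_cost:
            best_assign, best_cost = assign, cost
    return best_assign
```

With twenty components and two sources this already enumerates about a million assignments, which is the combinatorial blow-up mentioned earlier in the talk.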
0:15:57 This is the result of an NMF with the heuristic grouping. You can hear the mixture first, then the speech, then the other source. In the result of the NMF with heuristic grouping, you can hear that there is still a lot of mixing left; the sources are not separated well. And this is the result with our algorithm, which learns the grouping at the same time as the NMF: you can hear that the separation gets a lot closer to the original.
0:16:38 The second experiment we ran was on real audio signals. We took two songs from the SiSEC database, and we evaluated the quality of the separation as we vary the degree of overlap between the sources: the more they overlap, the more difficult the separation becomes. And I insist on the fact that we have no prior on the sources, so you cannot hope for perfect separation; but you will see that the less overlap you have in the mix, the better the separation. You can see that at around thirty-three percent overlap you get a very good separation in terms of SDR, and as the overlap increases it works less and less.
0:17:40 So what our prior is useful for is this: when not all sources are active at the same time, it improves the separation. We can now listen to a few examples, one where it works and one where it does not.
0:18:16 (audio example plays)
0:18:56 OK, let us skip this; that was the first mix.
0:19:11 (audio example plays)
0:19:33 This is the second source; let me skip directly to the results. First the source, and next you will hear the estimate we obtain of it.
0:19:46oh
0:19:54or
0:19:56a
0:19:57oh
0:19:57oh
0:20:05a
0:20:09yeah
0:20:10i
0:20:12 OK, so we have ten seconds left for the conclusion. We have proposed a simple sparsity prior to perform the grouping of the sources and solve the permutation problem in the single-channel source separation case. We showed that the algorithm performs better than carrying out the grouping as a post-processing step. In future work we will try to incorporate smoothness priors to take into account the temporal dynamics of H. Thank you.
0:20:50 We have time for only one quick question.
0:21:02 [Question] The pieces you played are mostly low-pitched components, and they mix very much alike because the instruments play according to the same chords, so they sound alike. I am wondering how much the sampling rate affects these results in terms of SDR. I mean, if you went to a higher FFT resolution, would it be different? Would the separation that you showed change much?
0:21:35 [Answer] OK, so we do not talk about this in the article, but from my experience the method is not sensitive to the sampling rate that you choose. For this experiment I chose a sampling rate of twenty-two kilohertz, simply because of computing-time concerns.
0:22:00 I guess for the example you heard, with the piano and the voice, since they play approximately in the mid and high range of the spectrogram, this would not have too much effect, because in that range the frequencies are pretty well separated. But if you have a bass and another source, then clearly having a good resolution will help: since we have no model for the sources, since we learn everything from the data, having a good sampling rate will help, because then you can afford a longer time window and a better resolution in the frequency range that is particularly important, the low-frequency range.
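As a rough sanity check of that trade-off (my numbers, not the speaker's): the frequency spacing of an N-point STFT at sampling rate $f_s$ is

$$
\Delta f = \frac{f_s}{N},
$$

so at 22.05 kHz a 1024-sample window gives $\Delta f \approx 21.5$ Hz, which is coarse relative to the spacing of bass fundamentals (a semitone near 60 Hz spans only about 3.5 Hz); resolving them requires a longer window.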
0:22:48 All the same, I would say that the results are very robust from this point of view.
0:22:54 OK, thank you.