0:00:16thank you
0:00:17and nice to see some audience left for the last talk of today
0:00:24it is a bit different to the talks before
0:00:28for the outline of my talk
0:00:33i was told to
0:00:36give a very short introduction to what blind source separation is
0:00:39then say
0:00:40what the problem in the convolutive case is, namely the permutation ambiguity, and how i'm solving it using
0:00:47a sparsity-based criterion
0:00:49so
0:00:51the use case for blind source separation is when you have a cocktail party problem
0:00:56we have some sources
0:00:59at this point let's say we have
0:01:01speech sources, two people talking
0:01:04and we would like to get
0:01:06the single components of that
0:01:08but what you get are some recordings which are just mixtures of
0:01:13these single components
0:01:18in this case
0:01:19that i'm
0:01:20looking at here we have the
0:01:22bigger problem of the
0:01:24mixture being a convolutive one
0:01:27as we have delays of the speech, we have reflections and so on
0:01:32so the problem becomes more complicated
0:01:34and the mathematical formulation for this, we have
0:01:38some sources
0:01:39some mixing
0:01:40matrix
0:01:41at least for the instantaneous case
0:01:44we get measurements and what we want to do is to
0:01:48estimate a separating matrix so we again get
0:01:53our original signals
0:01:54for this we have algorithms like ica
0:01:57so nothing new at this point
0:01:59what we have to
0:02:01take into account, we never know the order of the sources and we never know which energy the sources have
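The instantaneous model and its two ambiguities can be sketched in a few lines. This is only an illustration under assumed values: the mixing matrix `A`, the Laplacian test sources, and the particular swap and scaling matrices `P` and `D` are made up for the example and do not come from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 1000))            # two hypothetical source signals, one per row
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                 # assumed instantaneous mixing matrix
x = A @ s                                  # what the microphones record

W_ideal = np.linalg.inv(A)                 # one valid separating matrix
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])                 # swaps the order of the outputs
D = np.diag([2.0, -0.5])                   # rescales each output
W_other = D @ P @ W_ideal                  # equally valid from a blind point of view

y_ideal = W_ideal @ x                      # sources in original order and scale
y_other = W_other @ x                      # same sources, swapped and rescaled
```

Both `y_ideal` and `y_other` contain independent outputs, so a criterion based on independence alone cannot tell them apart; this is why order and energy remain unknown.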
0:02:11in my work i used
0:02:13an update rule based on the natural gradient
0:02:16as i think most of you know
0:02:19for speech signals we need
0:02:23as always we need some
0:02:25probability density
0:02:27functions for speech, which is what we are considering here
0:02:30we can safely assume we have a laplacian density
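A minimal sketch of one commonly used form of the natural gradient update, assuming real-valued signals and a Laplacian source density so that the score function becomes `sign(y)`. The step size `mu`, the batch averaging, the toy mixing matrix and the number of iterations are assumptions made for illustration; the talk does not specify them, and in the frequency domain a complex-valued variant would be needed.

```python
import numpy as np

def natural_gradient_step(W, x, mu=0.01):
    """One batch update W <- W + mu * (I - E[phi(y) y^T]) W with y = W x."""
    y = W @ x
    phi = np.sign(y)                       # score function for a Laplacian density (real case)
    n_sources, n_samples = y.shape
    correction = np.eye(n_sources) - (phi @ y.T) / n_samples
    return W + mu * (correction @ W)

rng = np.random.default_rng(1)
s = rng.laplace(size=(2, 5000))            # hypothetical Laplacian sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                 # assumed mixing matrix
x = A @ s
W = np.eye(2)
for _ in range(500):
    W = natural_gradient_step(W, x)
G = W @ A                                  # global system, ideally near a scaled permutation
```

Inspecting `G` after the loop shows how close the estimate is to a scaled permutation matrix, which is the best a blind method can achieve.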
0:02:36as i said, we do not have
0:02:39the simple case, we have the convolutive mixture, we have
0:02:42in this case
0:02:45different delays, you have reflections and so on
0:02:48so we model this
0:02:49using a convolution
0:02:52and for
0:02:54realistic situations we have quite long filters
0:02:57two thousand, four thousand taps or whatever
0:03:02estimating these filters directly in the time domain is
0:03:07possible but very hard
0:03:09so the usual way is to go to the
0:03:12time-frequency domain using the short-time fourier transform
0:03:15and now what we have is
0:03:18just again a multiplication in each frequency bin
0:03:21so we can just use the
0:03:25update rule i have shown in each frequency bin independently
0:03:29which is again
0:03:32not a problem
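The step from convolution to a per-bin multiplication can be checked numerically. The sketch below uses a single zero-padded FFT instead of a short-time Fourier transform, which is a simplification: with an STFT the relation holds only approximately, and only when the analysis window is long compared with the filter. The signal and filter are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
s = rng.standard_normal(256)               # hypothetical source segment
h = rng.standard_normal(32)                # hypothetical room filter (far shorter than real ones)

n_fft = len(s) + len(h) - 1                # long enough to avoid circular wrap-around
X = np.fft.fft(h, n_fft) * np.fft.fft(s, n_fft)   # one complex multiplication per bin
x_freq = np.fft.ifft(X).real               # back to the time domain
x_time = np.convolve(s, h)                 # direct time-domain convolution
```

`x_freq` and `x_time` agree up to rounding, which is why each frequency bin can be treated as its own instantaneous mixing problem.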
0:03:35but we have
0:03:36the problem of
0:03:37the different
0:03:39permutations and scalings
0:03:42in the previous example
0:03:44we did not have to think about that, in this case we have to correct
0:03:49the scaling
0:03:50there are some standard ways to solve it
0:03:54the typical case is the minimal
0:03:57distortion principle
0:03:58in which we
0:04:00multiply the
0:04:01unmixing matrix by
0:04:03the diagonal of its inverse
0:04:08and that means we
0:04:10accept the filtering and scaling done by the mixing system
0:04:14we do not know what it was
0:04:15but at least we do not
0:04:16add new distortion
0:04:17at this point
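A minimal sketch of this scaling correction for a single frequency bin, written as W <- diag(W^-1) W, which is the usual formulation of the minimal distortion principle. The matrices `W` and `D` are arbitrary example values, not data from the talk.

```python
import numpy as np

def minimal_distortion_scaling(W):
    """Rescale the unmixing matrix of one frequency bin: W <- diag(W^-1) W."""
    return np.diag(np.diag(np.linalg.inv(W))) @ W

W = np.array([[1.0 + 0.5j, 0.3 - 0.2j],
              [-0.4 + 0.1j, 0.8 + 0.7j]])  # hypothetical unmixing matrix of one bin
D = np.diag([3.0 - 1.0j, 0.2 + 0.4j])      # arbitrary scaling left over from ica
W_a = minimal_distortion_scaling(W)
W_b = minimal_distortion_scaling(D @ W)    # same result, the arbitrary scaling is removed
```

Whatever per-output scaling the ICA stage leaves behind, the corrected matrix is the same, and it needs no information from other bins.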
0:04:19some new methods
0:04:21were presented
0:04:22last time, filter shortening, filter shaping
0:04:26but for these methods you need to
0:04:29solve the permutation problem first
0:04:32whereas this one
0:04:33you can solve in each frequency bin independently
0:04:39we were talking about the permutation problem, what is it, how can it be seen
0:04:47in this case
0:04:48we have the
0:04:49short-time
0:04:51fourier transforms, the spectrograms,
0:04:54of two signals
0:04:55where just
0:04:56as you can see
0:04:58these parts are swapped between the two
0:05:04when you transform these signals back
0:05:06to the time domain, of course
0:05:08both signals appear in both channels
0:05:11so again you didn't
0:05:14separate anything, so you have to correct
0:05:17for this permutation, and it can be
0:05:19different in every frequency band
0:05:22and it usually becomes quite complicated
0:05:25usually there are two main approaches
0:05:31a lot of papers at this conference
0:05:34concentrate on directivity patterns and directions of arrival
0:05:38the idea is
0:05:40when you have the unmixing matrices
0:05:42we can just
0:05:45calculate the directions the signals come from and assume
0:05:49that one direction is one source
0:05:52this works
0:05:54as long as we have low reverberation
0:05:56but with high reverberation you can't
0:06:01pinpoint the source to a single direction in all frequencies together
0:06:05in this case here
0:06:07i use the statistics of the separated signals
0:06:11one
0:06:12trivial simple case is
0:06:15you just
0:06:17take such a line and the neighbouring line and say
0:06:20they have to look the same
0:06:23here they are highly correlated
0:06:28yeah this is true
0:06:31at least
0:06:32when you are looking at very near bins, so we have here two directly neighbouring bins
0:06:37in blue and green, and okay, they are highly correlated
0:06:41if you go just
0:06:43a few bins away
0:06:45i wouldn't say
0:06:47these are correlated
0:06:49so the correlation method
0:06:50is not
0:06:51so robust
0:06:53but there have been extensions to make it
0:06:56a lot more robust
0:06:58okay so
0:07:01for these
0:07:02correlation coefficients
0:07:05you take the
0:07:07envelope
0:07:08calculate the correlation
0:07:09and decide
0:07:10the permutation
0:07:12depending on all four possible combinations
0:07:18and you can just use it this way
0:07:21as i already said this isn't very robust, you have
0:07:24to make it robust
0:07:25because
0:07:27when comparing more distant bins
0:07:30you just get wrong
0:07:32permutations
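The correlation-based decision for two sources and two bins might look as follows. This is a sketch under assumptions: the envelope is taken to be the magnitude of the complex coefficients, and the decision compares the sum of the two straight correlations with the sum of the two crossed ones. The function name `swap_needed` and the synthetic test data are hypothetical.

```python
import numpy as np

def swap_needed(Y_ref, Y_cur):
    """Y_ref, Y_cur: complex arrays of shape (2, n_frames) holding the two
    separated outputs in a reference bin and in the bin to be aligned."""
    env = np.vstack([np.abs(Y_ref), np.abs(Y_cur)])   # magnitude envelopes
    c = np.corrcoef(env)[:2, 2:]                      # four cross-correlation coefficients
    return (c[0, 1] + c[1, 0]) > (c[0, 0] + c[1, 1])

rng = np.random.default_rng(3)
e = rng.random((2, 400))                   # two hypothetical, independent envelopes
phase = np.exp(2j * np.pi * rng.random((2, 400)))
Y_ref = e * phase
Y_same = 0.7 * e * np.conj(phase)          # same order, different scale and phase
Y_swapped = Y_same[::-1]                   # outputs exchanged in this bin
```

Here `swap_needed(Y_ref, Y_same)` is false and `swap_needed(Y_ref, Y_swapped)` is true. With real speech the envelopes of distant bins are far less similar than in this toy case, which is the robustness problem described above.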
0:07:33so
0:07:36some years ago the dyadic sorting scheme was proposed, as shown here
0:07:42where you don't compare
0:07:44single bins
0:07:46but whole blocks of bins
0:07:48so it looks like this
0:07:50you compare
0:07:51in the first stage you compare one bin with another
0:07:53zero and one
0:07:55and calculate the
0:07:56correlation coefficient and you get
0:07:59your permutation, and take the next two bins and so on
0:08:02so in this case you have neighbouring bins and you can assume, okay, the
0:08:07assumption of highly correlated bins
0:08:09is met
0:08:10in the next step
0:08:12you take
0:08:13these two correctly aligned bins
0:08:15take the next two and calculate now
0:08:18these correlations, so actually what you get
0:08:21here is four coefficients
0:08:23and we have to decide
0:08:24which one to take to decide which permutation we take
0:08:29the biggest one
0:08:30the mean
0:08:31the lowest one or whatever
0:08:34that is another problem
0:08:35here you already go to sixteen, and in the next
0:08:38we get sixty four and so on
0:08:41so it becomes even harder
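The growth in the number of coefficients can be made concrete with a small sketch of the dyadic block pairing. The function `dyadic_stages` is a hypothetical helper written for illustration; it only enumerates which blocks of bins are compared at each stage and how many bin-to-bin coefficients one such comparison involves.

```python
def dyadic_stages(n_bins):
    """Yield (stage, block_size, pairs), where pairs lists the (left, right)
    blocks of bin indices that are aligned against each other."""
    size = 1
    stage = 0
    while size < n_bins:
        pairs = []
        for start in range(0, n_bins, 2 * size):
            left = list(range(start, min(start + size, n_bins)))
            right = list(range(start + size, min(start + 2 * size, n_bins)))
            if right:
                pairs.append((left, right))
        yield stage, size, pairs
        size *= 2
        stage += 1

stages = list(dyadic_stages(16))
# bin-to-bin coefficients involved in one block comparison at each stage
coeffs_per_comparison = [size * size for _, size, _ in stages]
```

For 16 bins the list comes out as 1, 4, 16, 64, matching the numbers mentioned in the talk, and each stage needs some rule for condensing those coefficients into one decision.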
0:08:44a simple example for this
0:08:46when we just plot
0:08:48for the the situation but for a frequency
0:08:52the coefficients
0:08:56for all frequency bins
0:08:58in the first stage you would just take the correlation coefficients
0:09:03on the first off-diagonal
0:09:07okay, when you look at this
0:09:10looks like
0:09:11just go to uh
0:09:13it just one here and here
0:09:16so when you go on
0:09:17to the next steps
0:09:19let's say
0:09:20you compare
0:09:21the block
0:09:23from five hundred to eight hundred with the block from eight hundred to one thousand
0:09:28one hundred or whatever
0:09:29you compare all the coefficients which are in a square
0:09:34so we have a lot of coefficients which are correct
0:09:37and a lot of coefficients which are
0:09:38not
0:09:39and so on in this case here
0:09:44as we work
0:09:44here are not
0:09:46but in the next steps you compare these coefficients
0:09:49okay, this might still work, it might be stable
0:09:52but in this case here
0:09:54you have a lot of
0:09:55wrong permutations
0:09:56that is, a lot of
0:09:57indicators of permutations which
0:10:00are right and
0:10:01wrong, so
0:10:03usually the dyadic sorting scheme
0:10:06is better but still
0:10:10so and signal
0:10:15now i want to
0:10:18present a new approach
0:10:20the first
0:10:21observation you can make is
0:10:24when you just take
0:10:26speech signals
0:10:27speech signals are sparse
0:10:29and
0:10:32a mixture of two signals which are independent
0:10:35is less sparse
0:10:37and
0:10:38you can extend this
0:10:41even if the signals are non-sparse signals
0:10:44as long as they are independent
0:10:47the mixture is less sparse
0:10:50and this is exactly what we have in the permutation problem, we have two subband signals and want to
0:10:55look which permutation we have
0:10:57so the wrong permutation will be
0:11:01less sparse
0:11:03here you have an example of this
0:11:06just
0:11:07a plain speech signal
0:11:09with nothing
0:11:10added
0:11:12and in this case
0:11:13i just
0:11:14moved the
0:11:16higher half of the signal, so the
0:11:19upper
0:11:19half of the signal
0:11:21to the other, so we have the permutation
0:11:23at the lower
0:11:25level of the dyadic sorting scheme
0:11:28and when we compare these
0:11:29we have here a lot of
0:11:31values at or near zero
0:11:33and when you look here we have
0:11:36clearly a signal which is less sparse
0:11:39and
0:11:41this is exactly what we need to
0:11:45formulate
0:11:45a new criterion
0:11:48we want the signal to be as sparse as possible
0:11:51the measurement of sparsity
0:11:54for this, one way is to take
0:11:57the lp norm
0:12:01in my case i usually take something like zero point one
0:12:05for p
0:12:06but it's not that
0:12:08important, you can vary it
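A small sketch of such a sparsity measure with p = 0.1. One assumption to flag: the function below returns the sum of |x|^p without the outer 1/p root, which orders signals the same way as the lp norm but avoids very large numbers for small p. The two impulse-train signals are synthetic placeholders for sparse speech-like signals.

```python
import numpy as np

def lp_measure(x, p=0.1):
    """Sum of |x|^p; smaller means sparser for signals of comparable energy."""
    return np.sum(np.abs(x) ** p)

n = 1000
s1 = np.zeros(n)
s2 = np.zeros(n)
s1[::20] = 1.0                             # hypothetical sparse signal, 50 active samples
s2[7::20] = 1.0                            # second sparse signal, active at other times
mix = (s1 + s2) / np.sqrt(2.0)             # same energy as s1, twice as many active samples
```

`lp_measure(mix)` is clearly larger than `lp_measure(s1)` although both have the same energy, which is the behaviour the criterion relies on.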
0:12:10okay so
0:12:12the idea is now
0:12:14as with the correlation coefficient
0:12:17we take
0:12:18our signals
0:12:22now not the correlation between two signals
0:12:24but the sparsity of a sum of two signals
0:12:29take again
0:12:30the four coefficients
0:12:31every one against each other
0:12:34and you get one
0:12:38coefficient which can decide which permutation
0:12:41the point about this
0:12:43we don't take the
0:12:45coefficients in the time-frequency domain but transform them
0:12:50back
0:12:52to a
0:12:53time domain signal
0:12:54where we can apply
0:12:56the lp norm
0:13:00using this
0:13:02even if we take
0:13:04let's say a hundred frequency bins from k to s
0:13:07we still again get the lp norm
0:13:09just one coefficient
0:13:11for the whole sorting scheme
0:13:14so when we now do the
0:13:16dyadic sorting
0:13:17we have again here
0:13:20just one single frequency bin transformed to the time domain
0:13:24here again one
0:13:25we apply the lp norm
0:13:28and here again
0:13:29at this point
0:13:32now we transform
0:13:34two frequency bins to the time domain
0:13:38and calculate again one norm and so on
0:13:41so at this point you don't
0:13:43have the problem of
0:13:44which coefficients of these
0:13:46let's say thousands or whatever
0:13:49do you take, you always have just one coefficient
0:13:54due to this
0:13:58it is much more robust
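The decision between two blocks of bins could be sketched as below. Several things here are assumptions rather than the speaker's implementation: frames are non-overlapping with a rectangular window (a real STFT would need windowing and overlap-add), the measure is the sum of |x|^p over both outputs, and the helper names `to_time` and `swap_needed` are hypothetical. The test signals are impulse trains chosen so that the correct alignment is exactly sparse.

```python
import numpy as np

def lp_measure(x, p=0.1):
    return np.sum(np.abs(x) ** p)

def to_time(Y, parts, n_fft):
    """Y: (2, n_bins, n_frames) separated spectra. parts: list of
    (output_index, bin_indices). Fills a spectrum with the requested blocks
    (all other bins zero) and transforms every frame back to the time domain."""
    spec = np.zeros(Y.shape[1:], dtype=complex)
    for out, bins in parts:
        spec[bins, :] = Y[out][bins, :]
    return np.fft.irfft(spec, n=n_fft, axis=0)

def swap_needed(Y, bins_a, bins_b, n_fft, p=0.1):
    """One sparsity value per hypothesis: keep block b as it is, or swap it."""
    keep = sum(lp_measure(to_time(Y, [(i, bins_a), (i, bins_b)], n_fft), p)
               for i in (0, 1))
    swap = sum(lp_measure(to_time(Y, [(i, bins_a), (1 - i, bins_b)], n_fft), p)
               for i in (0, 1))
    return swap < keep

n_fft, n_frames = 64, 4
s = np.zeros((2, n_fft, n_frames))         # hypothetical sparse sources, framed
s[0, 10, :] = 1.0
s[1, 40, :] = 1.0
Y = np.fft.rfft(s, axis=1)                 # shape (2, 33, 4)
bins_a, bins_b = np.arange(0, 16), np.arange(16, 33)
Y_perm = Y.copy()
Y_perm[:, bins_b, :] = Y[::-1][:, bins_b, :]   # exchange the outputs in the upper block
```

However many bins each block contains, each hypothesis yields a single number, so no rule is needed for combining many per-bin coefficients.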
0:14:04i have
0:14:05done some experiments
0:14:09so first set
0:14:10uh uh data set this does a for the set up
0:14:16so on so about last they can set from five years ago so
0:14:21we have
0:14:22a separate
0:14:23this this state set that uh is
0:14:25the lot uh somehow
0:14:27it has reverberant
0:14:29recordings of some speech but the reverberation is
0:14:32quite low
0:14:33you can when you hear of to is that has that you can see yeah it's
0:14:36government art
0:14:38derivations like
0:14:39in this case
0:14:41the direction of arrival approach
0:14:44works very well
0:14:48it works because of the low reverberation
0:14:51the proposed method is
0:14:53not as good
0:14:56but when you look closely at why
0:14:59it is performing
0:15:00not that well, it's because
0:15:02at the very low stages where we compare just one single frequency bin
0:15:08some permutations happen to be incorrect
0:15:11and uh
0:15:15should it this to get so that a bit more
0:15:19uh is
0:15:19assumption of
0:15:21sparsity and
0:15:23a a one pass cygnus is of this is correct
0:15:27and um
0:15:29when you go to a data set which has high reverberation
0:15:35overall you get
0:15:38a lower separation performance
0:15:41with the doa approach
0:15:43because with this setup you don't
0:15:46have
0:15:48the signal coming from one direction because
0:15:50of the reverberation
0:15:53with the new approach we again get almost the performance of the non-blind algorithm
0:15:58because in this case
0:16:01it doesn't
0:16:02matter which direction the signal comes from as long as we
0:16:05are able to separate it
0:16:07in every frequency bin
0:16:11so it's not always
0:16:13matching the non-blind case
0:16:15but it's
0:16:15more robust
0:16:16compared to the
0:16:18doa approach
0:16:20so to conclude
0:16:23the convolutive blind source separation
0:16:25can be solved in the time-frequency domain
0:16:29but you have to solve the scaling and permutation
0:16:33now we presented a new algorithm based on sparsity
0:16:37in the time domain
0:16:38not as usual in the time-frequency domain
0:16:43and with high reverberation we usually have better
0:16:46separation performance than the
0:16:48direction of arrival approach
0:17:15yeah let's a hard a set up it's like seven and a half set and for this i used five
0:17:22i would say if
0:17:24there is enough signal to do ica in each frequency band
0:17:28then there would be enough signal to apply the
0:17:31lp norm