0:00:15oh uh i
0:00:17i mentioned uh this work is uh
0:00:19i one for and the initial a trained that for a more quickly
0:00:24on the the a project of do you and at least that's not C G
0:00:30or or i or one
0:00:32i know i had the opportunity to work for uh
0:00:35some uh is search for those
0:00:36sparse it
0:00:37i is is that uh
0:00:39results of out
0:00:42what I would like to
0:00:43present is the topic, an evaluation of noise power spectral density
0:00:47estimation algorithms
0:00:49in adverse environments
0:00:52this is the outline of my presentation
0:00:55first I give the motivation of the work
0:00:57then I give you an overview of the algorithms that we consider in the
0:01:02framework
0:01:04then the evaluation measures that we use
0:01:07and at the
0:01:08end of my talk
0:01:10there will be experimental results and conclusions
0:01:16as you know, noise power spectral density estimation, or noise power
0:01:21estimation, is a crucial part of speech enhancement in the frequency domain
0:01:27and recently many new algorithms have been introduced in this area
0:01:33but unfortunately there is no comprehensive evaluation
0:01:39framework for evaluating
0:01:42the performance of noise power estimators
0:01:45therefore
0:01:47the aims of the framework that we
0:01:50worked on were
0:01:51to present the
0:01:53performance of some older and recent noise power estimators
0:01:58and
0:02:02furthermore to introduce a new measure to
0:02:05give a more comprehensive evaluation of the performance
0:02:11the algorithms we consider in our framework are the minimum statistics noise power estimator
0:02:18proposed by Rainer Martin in two thousand one, which is one of the state-of-the-art algorithms
0:02:25the second algorithm is minima
0:02:28controlled recursive averaging, or
0:02:31MCRA
0:02:32which is also a state-of-the-art algorithm in this area
0:02:36by Cohen, two thousand two
0:02:38the algorithms belonging to the MCRA category are
0:02:42the improved version of this algorithm
0:02:45IMCRA, and EMCRA, which is the enhanced
0:02:49minima controlled recursive averaging
0:02:52and MCRA-MAP
0:02:54we also consider the subspace noise tracking approach, or SNT, which was proposed by Hendriks in two thousand eight
0:03:02and two algorithms based on minimum mean square error estimation
0:03:07which are available in two
0:03:09different approaches
0:03:11MMSE-Yu and MMSE-Hendriks, two thousand ten
0:03:17for the evaluation, two key issues were taken into account
0:03:23the first
0:03:24issue is that
0:03:26we want the evaluation to be independent of the speech enhancement system
0:03:32the reason is that we want to separate the effects of any speech enhancement system
0:03:38on the performance of the noise power estimator
0:03:41we just focus on the
0:03:43estimation error of the noise
0:03:45tracking
0:03:47the second issue is that we need
0:03:50to have a
0:03:51suitable reference noise for our evaluation
0:03:55and we have
0:03:57three
0:03:59reasons for our consideration
0:04:01first, during speech activity
0:04:04the instantaneous noise power is not available
0:04:08second,
0:04:10most
0:04:10noise power estimation approaches
0:04:14or, sorry, noise reduction approaches require a smooth
0:04:17version of the noise
0:04:20and the last one is that we want to
0:04:23reduce the impact of random fluctuations in the
0:04:28original noise periodogram
0:04:30therefore we
0:04:32consider as the
0:04:35reference noise not the original noise
0:04:37the reference noise is a smoothed one, not
0:04:40the instantaneous one
0:04:45the first evaluation measure that we consider is
0:04:48the mean estimation error, which is the most commonly
0:04:51used
0:04:53estimation
0:04:55evaluation measure
0:04:56and is defined as the average difference between the
0:05:01reference noise power
0:05:03and the
0:05:04estimated one
0:05:05you can see the log of the noise powers that I have shown here
0:05:12the reference noise power is denoted by
0:05:16the symbol to the power of two, and the estimate by the one with the hat
0:05:20you see that the operation is the
0:05:23absolute value of the log ratio, averaged over all frequency bins and frame indices
0:05:28where capital K is the number of frequency bins and capital I
0:05:33is the number of frames
0:05:35this is the commonly used
0:05:38evaluation measure
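A minimal sketch of the mean estimation error described above, assuming the log ratio is taken as 10·log10 (dB). The function name, the list-of-frames layout, and the dB scaling are assumptions for illustration, not details confirmed by the talk.

```python
import math

def mean_log_error(ref, est):
    """Average absolute log ratio (assumed in dB) between the reference
    and the estimated noise power, over all frames and frequency bins.

    ref, est: lists of frames, each frame a list of positive per-bin powers.
    """
    total, count = 0.0, 0
    for ref_frame, est_frame in zip(ref, est):
        for r, e in zip(ref_frame, est_frame):
            total += abs(10.0 * math.log10(r / e))
            count += 1
    return total / count

# Toy data: two frames, two frequency bins (values are made up).
ref = [[1.0, 2.0], [4.0, 8.0]]
est = [[1.0, 2.0], [40.0, 0.8]]
print(mean_log_error(ref, est))
```

Because the absolute value is taken, this measure treats overestimation and underestimation alike and says nothing about how much the estimate fluctuates.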
0:05:40and the proposed one is the estimation error variance
0:05:43which is
0:05:45proposed in this framework
0:05:47in fact, if you imagine the
0:05:51ratio between
0:05:52the reference noise power
0:05:55and the estimated noise power
0:05:58over all frequency bins and frame indices as a matrix
0:06:03we divide this matrix into some subunits and
0:06:07we estimate the variance of each
0:06:10subunit, then the computed variances are
0:06:13averaged over the number of subunits
0:06:18capital N
0:06:20is the number of subunits
0:06:22in a row of the matrix and capital M is the number of subunits in a column
0:06:28this variance operator
0:06:30is the operator which computes the variance of the subunit in the n-th row and m-th column
0:06:36as you see on the next slide
0:06:39here we show, for example, the
0:06:42(n, m)-th subunit
0:06:44the number of frequency bins of this subunit is
0:06:48K sub n
0:06:49and the number of frames in this subunit is
0:06:51I sub m
0:06:53so for this subunit we
0:06:56compute the variance, which is the usual
0:07:00equation to estimate the variance
0:07:03and here
0:07:04for example mu
0:07:06is the
0:07:08mean of the
0:07:11values
0:07:13for the subunit in the n-th row and m-th column
0:07:16in our experiments we set the number of
0:07:20frames in each subunit to fifteen
0:07:24and the number of frequency bins to ten
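The subunit procedure above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes the error matrix holds the log ratio in dB, uses the population variance inside each subunit, and drops incomplete subunits at the edges. All names are hypothetical; only the subunit size (ten bins by fifteen frames) comes from the talk.

```python
import math
from statistics import pvariance

def error_variance(ref, est, bins_per_unit=10, frames_per_unit=15):
    """Average, over all complete subunits, of the variance of the log error.

    ref, est: matrices indexed [frequency bin][frame] of positive powers.
    """
    n_bins, n_frames = len(ref), len(ref[0])
    # Log-error matrix (assumed in dB).
    err = [[10.0 * math.log10(ref[k][i] / est[k][i]) for i in range(n_frames)]
           for k in range(n_bins)]
    variances = []
    for k0 in range(0, n_bins - bins_per_unit + 1, bins_per_unit):
        for i0 in range(0, n_frames - frames_per_unit + 1, frames_per_unit):
            block = [err[k][i]
                     for k in range(k0, k0 + bins_per_unit)
                     for i in range(i0, i0 + frames_per_unit)]
            variances.append(pvariance(block))
    if not variances:
        raise ValueError("matrix is smaller than one subunit")
    return sum(variances) / len(variances)

# Toy data: one subunit (10 bins x 15 frames), values are made up.
flat_ref = [[1.0] * 15 for _ in range(10)]
biased_est = [[2.0] * 15 for _ in range(10)]            # constant bias
fluct_est = [[10.0 if i % 2 == 0 else 0.1 for i in range(15)]
             for _ in range(10)]                         # fluctuating estimate
print(error_variance(flat_ref, biased_est), error_variance(flat_ref, fluct_est))
```

A constant bias gives a nonzero mean estimation error but zero variance, while a fluctuating estimate gives a large variance, which is the distinction the second measure is meant to expose.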
0:07:30here I present to you the
0:07:32experimental settings of the algorithms
0:07:35as I mentioned before
0:07:37eight algorithms
0:07:39were implemented
0:07:42the sampling frequency of all signals is eight kHz
0:07:46the window length, as well as the FFT length,
0:07:50is two hundred fifty-six samples, or thirty-two milliseconds
0:07:56we consider
0:07:58two sources of clean speech signal, taken from the TIMIT database
0:08:03one male speech and one female speech
0:08:06each of with the duration of two you
0:08:10obtained by concatenating
0:08:11short segments
0:08:13of, for example,
0:08:15six different speakers
0:08:18and for simulating the
0:08:22adverse
0:08:23acoustic environments we consider seven different types of noise
0:08:28taken from the Sound Ideas database
0:08:31this simple less and the
0:08:37the first one is white Gaussian noise, then the non-stationary white Gaussian noise that
0:08:43we consider
0:08:45bank i of noise
0:08:47and sinusoidally modulated white Gaussian noise
0:08:50the a noise, car noise, and traffic one and traffic two noise
0:08:54the traffic two noise mainly contains
0:08:58horn
0:08:59sounds
0:09:01and the difference between traffic one and traffic two is that the traffic two noise
0:09:05is more structured and harmonic
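As an illustration of the non-stationary test noise mentioned above, here is a hedged sketch of white Gaussian noise with a sinusoidally modulated amplitude. The talk does not give the modulation frequency or depth, so `mod_freq_hz` and `depth` are assumed placeholder values, and the function name is hypothetical.

```python
import math
import random

def modulated_wgn(n_samples, fs=8000, mod_freq_hz=0.5, depth=0.9, seed=0):
    """White Gaussian noise whose amplitude follows 1 + depth*sin(2*pi*f*t).

    mod_freq_hz and depth are assumptions; only fs = 8 kHz is from the talk.
    """
    rng = random.Random(seed)
    samples = []
    for n in range(n_samples):
        envelope = 1.0 + depth * math.sin(2.0 * math.pi * mod_freq_hz * n / fs)
        samples.append(envelope * rng.gauss(0.0, 1.0))
    return samples

# One full modulation period (2 s at 8 kHz with the assumed 0.5 Hz rate).
noise = modulated_wgn(16000)
loud_half = sum(x * x for x in noise[:8000]) / 8000
quiet_half = sum(x * x for x in noise[8000:]) / 8000
print(loud_half, quiet_half)
```

The noise power rises and falls over time, so an estimator has to track it continuously rather than settle on one level.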
0:09:10the range of input SNRs is from minus five dB to twenty dB with a step size of
0:09:15five dB
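A small sketch of how noise could be scaled to reach each of these input SNRs. It assumes the SNR is defined on the average power of the whole signals; the talk does not say how the input SNR was measured, and the helper names are hypothetical.

```python
import math

def average_power(signal):
    return sum(x * x for x in signal) / len(signal)

def scale_noise_for_snr(speech, noise, snr_db):
    """Return the noise scaled so that speech power / noise power
    equals the requested SNR (global average power, an assumption)."""
    gain = math.sqrt(average_power(speech)
                     / (average_power(noise) * 10.0 ** (snr_db / 10.0)))
    return [gain * x for x in noise]

# The six input SNRs from the talk: -5, 0, 5, 10, 15, 20 dB.
snr_levels_db = list(range(-5, 25, 5))

# Toy signals (made up) to check the scaling.
speech = [math.sin(0.1 * n) for n in range(1000)]
noise = [0.5 if n % 2 == 0 else -0.5 for n in range(1000)]
scaled = scale_noise_for_snr(speech, noise, 5)
measured_db = 10.0 * math.log10(average_power(speech) / average_power(scaled))
print(snr_levels_db, measured_db)
```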
0:09:18as I said
0:09:20before, the reference noise was
0:09:23important for our evaluation
0:09:25finally we decided to
0:09:28consider the recursively temporally smoothed version of the noise periodogram
0:09:33with a
0:09:35smoothing factor of zero point nine
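The reference noise described above can be sketched as first-order recursive smoothing of the noise periodogram over time, with the smoothing factor of 0.9 from the talk. Starting the recursion from the first frame is an assumption, and the function name is hypothetical.

```python
def smooth_periodogram(periodograms, alpha=0.9):
    """Recursive temporal smoothing, per frequency bin:
    smoothed[i] = alpha * smoothed[i-1] + (1 - alpha) * periodogram[i].

    periodograms: list of frames, each a list of per-bin noise powers.
    The first frame initialises the recursion (an assumption).
    """
    smoothed = []
    previous = None
    for frame in periodograms:
        if previous is None:
            current = list(frame)
        else:
            current = [alpha * p + (1.0 - alpha) * x
                       for p, x in zip(previous, frame)]
        smoothed.append(current)
        previous = current
    return smoothed

# Toy one-bin example: the power steps from 0 to 1 (values are made up).
step = smooth_periodogram([[0.0], [1.0], [1.0]])
print(step)
```

With a factor of 0.9 the smoothed value moves only a tenth of the way towards each new periodogram value, which suppresses the random frame-to-frame fluctuations of the raw periodogram.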
0:09:40here I show the periodogram of the noise power
0:09:44which is averaged over
0:09:47frequency bins
0:09:49and is
0:09:50plotted over frame indices
0:09:53here you see the
0:09:57the noise power of white question noise uh bank is to while question noise
0:10:02on to the
0:10:04a noise
0:10:05and you see the
0:10:06stationarity of the noise
0:10:09and also the non-stationarity
0:10:17here are the results of our evaluation
0:10:21in terms of
0:10:23the mean estimation error
0:10:26and the estimation error variance for white Gaussian noise
0:10:31they are depicted
0:10:33for the eight algorithms
0:10:35including subspace noise tracking, MMSE-Hendriks
0:10:38minimum statistics, and
0:10:41MMSE-Yu
0:10:43and this
0:10:45is depicted for
0:10:48six levels of signal-to-noise ratio
0:10:52by different colours
0:10:54and you see that for
0:10:57low signal-to-noise ratios most of the algorithms perform more or less the same
0:11:02but by increasing the signal-to-noise ratio
0:11:06some of the algorithms
0:11:08seem to be more susceptible
0:11:11and here
0:11:14we show the results for
0:11:15the non-stationary white Gaussian noise, the sinusoidally modulated one
0:11:19and you see some of the algorithms are not robust in tracking the noise power
0:11:25but the others
0:11:27like subspace noise tracking
0:11:30MMSE-Hendriks and MMSE-Yu
0:11:33at low levels of signal-to-noise ratio, seem to be robust
0:11:38and fast in tracking the noise power
0:11:44in terms of the
0:11:46estimation error variance you also see
0:11:48that the same result is obtained in the ranking of the algorithms
0:11:55this is the
0:11:56result for babble noise
0:12:01presented here
0:12:05and here is the plot for traffic two noise
0:12:08we actually selected this noise for the subspace noise tracking algorithm
0:12:14in this algorithm one of the essential assumptions is that
0:12:21the noise
0:12:22should not be
0:12:25structured or harmonic, because
0:12:30these
0:12:31harmonic signals
0:12:33can be modelled by
0:12:34a low-rank model
0:12:37and can be confused with the speech signal in the signal subspace
0:12:41of course there are some modifications of this
0:12:44algorithm that are introduced in the paper
0:12:47but for the
0:12:49algorithm without modification you see
0:12:53the amount of the mean estimation error
0:12:56which is
0:12:57worse than for the
0:12:59MMSE approaches
0:13:05here
0:13:07I have actually shown only some of the results of our evaluation
0:13:11because of the
0:13:13limited time
0:13:16and one of the important points that we can conclude from the evaluation is that
0:13:21the estimation error variance, as an additional criterion, gives insight for the evaluation
0:13:28of the performance
0:13:30because using the estimation error variance
0:13:33we can measure the amount of fluctuations in the noise power
0:13:38in the estimated noise power
0:13:41for example, the performance of two methods may be very close to each other in terms of
0:13:47the mean estimation error
0:13:50while having the estimation error variance we can
0:13:53get a more comprehensive view of the performance, and I illustrate my claim by
0:14:00showing an example
0:14:02for example, consider a female speech signal degraded by sinusoidally modulated white Gaussian noise
0:14:10at twenty dB signal-to-noise ratio
0:14:13in this figure
0:14:15the blue curve is the noisy speech power
0:14:19the green curve is the noise estimated by the IMCRA algorithm
0:14:25and the red curve is the noise estimated by
0:14:29the MCRA-MAP algorithm, and
0:14:32the reference noise is
0:14:35the black curve
0:14:37you see the different behaviour of the algorithms in tracking the noise power
0:14:42for example
0:14:45the MCRA-MAP algorithm
0:14:46denoted by the red curve
0:14:48has an
0:14:50underestimation of the noise power
0:14:53while not
0:14:54a fast
0:14:55reaction in its tracking of the noise
0:14:58uh a in the
0:15:01the IMCRA algorithm gives
0:15:04some over-
0:15:05estimation of the noise power with some
0:15:08fluctuations
0:15:10and these fluctuations are actually
0:15:15following or tracking the
0:15:17speech component
0:15:20so it is important for us to predict
0:15:23whether the error is related to underestimation, overestimation, or the fluctuations
0:15:29here, by using the
0:15:32mean estimation error, you see
0:15:34that the
0:15:41value of the mean estimation error is
0:15:44the same, very close
0:15:49and also the performance of the IMCRA algorithm is somewhat better than
0:15:56that of MCRA-MAP
0:15:59but in terms of the estimation error variance you see that the
0:16:04MCRA-MAP algorithm gives a lower
0:16:08variance
0:16:10this shows the
0:16:11preferability of
0:16:14MCRA-MAP, because it
0:16:16gives less fluctuations, or
0:16:21a smoother
0:16:22tracking of the noise power
0:16:27I
0:16:29end my presentation by giving some conclusions of the framework
0:16:33the first
0:16:35conclusion is that some noise power estimators are
0:16:39sensitive to the increase
0:16:42of the
0:16:42signal-to-noise ratio
0:16:44and for some of them you see robustness with respect to the signal-to-noise ratio, but for
0:16:51others not
0:16:52and another conclusion is that, by having
0:16:57the estimation error variance, we
0:17:00get a better view
0:17:03towards
0:17:04comparing the noise power estimators
0:17:07and in fact
0:17:10these fluctuations may
0:17:12produce
0:17:13some musical noise at the end of the speech enhancement, in the enhanced speech
0:17:19so it is important to
0:17:20predict the amount of
0:17:22the fluctuations
0:17:23and the next conclusion is that for non-stationary noise types
0:17:30few algorithms can give us
0:17:33fast tracking of the noise power
0:17:35and according to our experiments we found that the
0:17:39MMSE-Hendriks algorithm is the
0:17:43most robust one
0:17:44and
0:17:45it can
0:17:46guarantee that
0:17:48the enhanced speech at the end of the speech enhancement shows
0:17:52more improvement in signal-to-noise ratio
0:17:55but we cannot claim that this
0:17:58noise power estimator will give us better intelligibility
0:18:02it should be tested
0:18:03and another the uh
0:18:07i Q and
0:18:08yeah i
0:18:13so at the end we actually know
0:18:15which algorithms
0:18:17we should use
0:18:20but
0:18:20that might be
0:18:22one question
0:18:23what about complexity, could you comment on that
0:18:27you know
0:18:28if we
0:18:30consider the algorithms which track
0:18:33better in terms of mean estimation error
0:18:38MMSE-Hendriks is not complex
0:18:40I mean, in comparison to subspace noise tracking it
0:18:45performs very fast
0:18:47okay, thank you
0:18:49the same question
0:18:50for a question
0:18:57to to this just to your just time tracking or inside one of asked this question your
0:19:02in the babble noise it looked as if the power was not changing as much in one of your
0:19:09plots there
0:19:10it looked a little bit more constant than I thought it might be
0:19:13are you actually using babble, or using a large crowd noise
0:19:18uh large scale oh parts are you know you sure back but we normally you think and you can actually
0:19:23hear individual speakers or individual words, so is it distinguishable, or
0:19:29is it more broadband, or a bit more
0:19:32pinkish maybe, right
0:19:34right in this figure for example yeah like the
0:19:38that that's what i'm looking at "'cause" you have a couple of spots were kind of shot you know
0:19:44okay, no further questions, thank you once more