0:00:14and only one
0:00:15i am whether a student formula can
0:00:18one and one causing a lot of a banana split in previous to deal was
0:00:22interested recognition is performed with a probabilistic verification
0:00:28so yes ones everybody's causing you was having we present the motivation
0:00:33there have been introduced in this is hypothesis
0:00:36after that i will talk about constant q cepstral coefficients
0:00:39then i describe what is one percent
0:00:43and the discipline analysis of the i'm about this program or resolution and baseline completely
0:00:52and i don't think will addition model and i'm in every and example
0:01:00so next time maybe more relation between the thing dysphonia to be found okay
0:01:05there is a really useful was not i'll find out what i also be included
0:01:09by different from confirmation it means
0:01:12what i had to use
0:01:15for the one of my speech
0:01:17so we don't
0:01:19and is the more
0:01:21i per cent signal or this might help us to discriminate
0:01:26the more not quite useful for speech
0:01:28and those kind of understanding
0:01:31i was to design a better and more reliable numbers of detection
0:01:37we get it was taken motivation and he went down with
0:01:41a visual within need to different
0:01:43a front end performance on is rather than continuing this so it can see everything
0:01:51mean speaker-specific which is especially more effective in detecting as an
0:01:55when we listen to it in a less effective in detecting the one okay just
0:02:00is eight
0:02:01you money or
0:02:04sequences as well collect is equal error
0:02:07there are very seriously
0:02:11similar kind of observations regarding both my differences people associated with it is in is
0:02:17greater than i can challenge
0:02:19an external data sequences in front end of all right
0:02:23and the case
0:02:25for the finals
0:02:27so can be okay
0:02:29why sequence is different from the positive and six
0:02:35no less
0:02:36but whatever this is how this is so
0:02:38is we know is finally this can utilise in the spectrum for example in high
0:02:43hiding behind
0:02:45the mailman or indian or whatever
0:02:47so the use of investments analysis that would be information across different manner
0:02:53and i will be localised information
0:02:56and there is no degraded performance
0:02:59so we can see
0:03:01more reliable detection with the features that precise information is available
0:03:09no discuss the differences they have anything can be so as you know in the
0:03:15mean and they were available in the early nativized the and then window is for
0:03:20this gaussian means
0:03:21they're exactly once
0:03:24and temporal resolution
0:03:26and again singing
0:03:27the inference is quite well
0:03:30in contrast a security which we develop a spectral and temporal resolution
0:03:37so i think in seen once and you press continues better really know what that
0:03:43was then
0:03:45high resolution in the lower frequency and the higher than within the temporal resolution with
0:03:54that means
0:03:56the synchrony with late fusion one solution more realistic than fifty
0:04:05in this i in this line we will it's pretty
0:04:09considering use the solution using the cost of possible
0:04:13given a within the by imposing is as a feature extraction the we live the
0:04:19constraint on the human audio file
0:04:22to illustrate the audio file is basically is you in one you for the spectral
0:04:28resolution of this paper
0:04:30so do not need to adapt and in the frequency domain
0:04:34we live in form with something clusters or endorsement power spectral density
0:04:39which can be no performance is good of you can be
0:04:43to it you know what is good
0:04:46like giving infinitely information across the voice vector
0:04:50and finally we will explain the cepstral recursion
0:04:54we apply the discrete cosine transform after uniform resampling
0:04:58this is what is it was to use cepstral coefficient feature
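A minimal sketch of the cepstral step mentioned above, assuming the usual chain: a log power spectrum on geometrically spaced (constant Q) frequency bins, uniform resampling onto a linear frequency grid, then a discrete cosine transform. The names cqcc_like and dct_ii, the grid size and the number of coefficients are hypothetical choices for illustration. This is not the speaker's implementation, and the constant Q transform itself is assumed to be computed elsewhere.

```python
import numpy as np

def dct_ii(x):
    """Unnormalised DCT-II of a 1-D array, written out explicitly."""
    n = len(x)
    k = np.arange(n)[:, None]          # output coefficient index
    m = np.arange(n)[None, :]          # input sample index
    basis = np.cos(np.pi / n * (m + 0.5) * k)
    return basis @ x

def cqcc_like(log_power, freqs_geometric, n_linear=64, n_coeffs=20):
    """Resample a log power spectrum from a geometric to a linear
    frequency grid, then keep the first n_coeffs DCT coefficients.

    log_power       : log power values, one per constant Q bin (assumed input)
    freqs_geometric : increasing centre frequencies of those bins, in Hz
    """
    linear_grid = np.linspace(freqs_geometric[0], freqs_geometric[-1], n_linear)
    # Linear interpolation stands in for the uniform resampling step.
    resampled = np.interp(linear_grid, freqs_geometric, log_power)
    return dct_ii(resampled)[:n_coeffs]

# Example: 48 bins, 12 bins per octave, starting at 20 Hz (illustrative values).
freqs = 20.0 * 2.0 ** (np.arange(48) / 12.0)
coeffs = cqcc_like(np.full(48, 3.0), freqs)
```

A flat log spectrum puts all of its energy in the zeroth coefficient, which is a quick sanity check on the transform.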
0:05:06no i don't want it is mainly focus on those of police is the result
0:05:10is visible nineteen change
0:05:12and we use the standard problem or
0:05:15for a policeman he was relatively is applied mimicry really implement
0:05:22the difference of automation
0:05:25in the following experiment
0:05:27we used
0:05:27standard asvspoof 2019 baseline system
0:05:31so this is a gmm based system and b l is a gmm based system
0:05:37for one point is exactly the database and the baseline system description you can therefore
0:05:42before and references
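A minimal sketch of how a gmm based countermeasure of this kind typically scores an utterance, assuming two diagonal-covariance mixture models (one for bona fide speech, one for spoofed speech) and a log-likelihood ratio as the score. The function names and the tuple layout (weights, means, variances) are hypothetical. The talk does not spell out the scoring rule, so this is an assumption and not the baseline code itself.

```python
import numpy as np

def gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames    : (T, D) feature matrix
    weights   : (K,) mixture weights
    means     : (K, D) component means
    variances : (K, D) component variances
    """
    x = frames[:, None, :]                                            # (T, 1, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_exp = -0.5 * np.sum((x - means) ** 2 / variances, axis=2)      # (T, K)
    log_comp = np.log(weights) + log_norm + log_exp                    # (T, K)
    # Log-sum-exp over components, shifted by the max for stability.
    peak = log_comp.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return log_frame.mean()

def llr_score(frames, bona_fide_gmm, spoof_gmm):
    """Higher scores favour the bona fide model (assumed convention)."""
    return gmm_loglik(frames, *bona_fide_gmm) - gmm_loglik(frames, *spoof_gmm)

# Toy models with a single component each (illustrative values only).
bona = (np.array([1.0]), np.zeros((1, 2)), np.ones((1, 2)))
spoof = (np.array([1.0]), np.full((1, 2), 5.0), np.ones((1, 2)))
score = llr_score(np.zeros((10, 2)), bona, spoof)
```

Swapping the front end only changes what goes into frames. The scoring stays the same, which is what makes a comparison between two front ends straightforward.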
0:05:47no there is no knowledge of the baseline results on is feasible doesn't i database
0:05:54is the most substantial variation in the performance of is baseline system
0:06:00yes we can see in the human eye as an additional the for so no
0:06:05they for example is the same thing is sixteen and eighty nine
0:06:10this is a gmm based system
0:06:12give them better performance and bubble
0:06:15where there is a gmm based system
0:06:18where s
0:06:19for either incorrectly for in estimating the l s is a gmm based system used
0:06:25to better performance
0:06:27so while it is in difference in performance
0:06:30using more differently
0:06:33the difference in this paper or solution
0:06:37insecurity which might suggest that be i think that it is to use this one
0:06:44my representing the specifics right and then people
0:06:49so that a nine
0:06:50where the difference it is something you would basically the difference in the performance using
0:06:55c and the mfcc representation
0:06:59so we use so that analysis
0:07:03so in this little someone analysis we propose in will be emailing representation then nutritional
0:07:10i five tokenizer present in the spectral this domain representation you realise
0:07:17what i think it'd implement the information they represent different are scored
0:07:23so in this time
0:07:24the thing i don't be many presentation of a specific is something i
0:07:30okay you didn't seem to me
0:07:33genetics within got frequency mean and the lexus there was because can see it makes
0:07:41and in the and the leftmost autonomy human that it was in the market is
0:07:46a localising the low frequency of this is what was in there was a localiser
0:07:53five was spectrum and
0:07:55that i in the weighted within the i-vectors are presented for the signal
0:08:00and in my eyes in the email that imposing that are compared to a single
0:08:05band-pass filter
0:08:08for some time analysis
0:08:09the remaining where can you denies gaussian with the ones
0:08:15i by integrating
0:08:18existing using the specific content of imposing
0:08:24definitely representation signifying the performance of a different is performed combination in the damsel lately
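A minimal sketch of a sub-band analysis of the kind discussed here, assuming each condition band-limits the signal before feature extraction and that the conditions are all (low, high) cut-off pairs on a regular grid. The FFT masking, the function names and the grid step are assumptions for illustration. The talk mentions band-pass filtering but the transcript does not preserve the filter design.

```python
import numpy as np

def band_limit(signal, fs, f_low, f_high):
    """Keep only the frequency content between f_low and f_high (Hz)
    by zeroing FFT bins outside the band. A crude stand-in for a
    proper band-pass filter."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs >= f_low) & (freqs <= f_high)
    return np.fft.irfft(spectrum * mask, n=len(signal))

def sub_band_grid(f_max, step):
    """All (low, high) cut-off pairs with low < high on a regular grid."""
    edges = np.arange(0, f_max + step, step)
    return [(lo, hi) for i, lo in enumerate(edges) for hi in edges[i + 1:]]

# Example: one second at 16 kHz containing a 100 Hz and a 3000 Hz tone.
fs = 16000
t = np.arange(fs) / fs
low_tone = np.sin(2 * np.pi * 100 * t)
high_tone = np.sin(2 * np.pi * 3000 * t)
low_only = band_limit(low_tone + high_tone, fs, 0, 1000)
bands = sub_band_grid(8000, 2000)
```

Running a detector once per pair in bands and recording its error rate gives one value per sub-band, which can then be laid out as a two-dimensional map over the low and high cut-offs.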
0:08:37something so i in this line mean within a do you might representation all
0:08:43of six different specific is performed okay well i roll single within the represent the
0:08:52and using the secrecy sequences in gmm based system
0:08:57leaving i think is a c and d processing be a more general representing
0:09:03he may representation using the ellipses in gmms listed in gmm based system
0:09:09so you can see that
0:09:11you for specifically for example is the same thing is sixteen in may nineteen
0:09:18this is in gmm based system
0:09:19the estimated performance and then
0:09:22well i yes for identity in country and fourteen nist nineteen
0:09:27when extending this is not use the better performance and on the gmm based
0:09:34so probably you when he was addition we can see that for those or three
0:09:39is the main sixty and may nineteen
0:09:42i think that legalising d i i think of this better where is the thing
0:09:49'cause it is important to the data
0:09:52but details are really and you the better performance
0:09:55where is it is still
0:09:57it could mean and importantly where i think so localising the
0:10:02i don't with the presidential elections seems to have that the and you the better
0:10:10no i guess of is defined in a day where is the performance work because
0:10:15the i for initial immunity the i-vectors are not explicitly localised spectrum so for example
0:10:22in the need a sequence is he or no ellipses in front end
0:10:31no that don't temporal resolution and maybe of wasn't feasible fishing
0:10:38so in this light we will explain why i think is the same front-end format
0:10:44so i x
0:10:49in this data is shown on the classical be split off and highly speech frame
0:10:54which represent the new nine
0:10:56and use it in this city
0:10:58taking this is obviously that was a good lately
0:11:01please remember that the other thing in a possible solution is represented by the area
0:11:08defined by
0:11:09what we are looking like
0:11:12probably one finger against and now we can see that
0:11:15if they are compressed using d i i was in part of the spectral then
0:11:22how this particular
0:11:24it means that only invading is also this area is contaminated
0:11:30two additional cepstral coefficient
0:11:33that is only bring reading the
0:11:36and then it is okay in these diversity in the women
0:11:41s a single in the windows
0:11:44we aim to the
0:11:45investigating more contribution to the computational distribution which means
0:11:50they're eating i is to deal with one second only
0:11:54and you have one single
0:11:58this control which is if it is forcing frame when using the uniform recently all
0:12:03be uniform resembling ones not seem to be
0:12:05it is normally using the
0:12:07sequences of feature extraction
0:12:09so no hannah
0:12:12we don't know why the within the how
0:12:17and unionise
0:12:19needed something in the frequency domain
0:12:21so in this in this problem can see that
0:12:27it was it is before
0:12:28no there is no what you contribution we got stuck a traditional cepstral coefficient daisy
0:12:36usually motivation the cepstral a
0:12:39a computational cepstral coefficient which means
0:12:42i don't information in government and giving more if the size of one second
0:12:48is known to be consistently for women is different for the first low frequency scale
0:12:54is with me
0:12:55and for the signal treatment is gonna union
0:12:59and lastly we show that
0:13:01us spend a of them you need i spent on the
0:13:06this shows because motion is sixteen cepstral coefficient is uniform
0:13:11i don't is better which means
0:13:13when i first was in any way to spend on
0:13:16then it would be better to use i
0:13:19localised there are a total successfully
0:13:25that you can use the cost in order to a constant is a solution was
0:13:30different spectral
0:13:31no only one thing that is k
0:13:34when i based on the polite the women
0:13:37then he was also can be
0:13:39using the challenge is good
0:13:41given the one of the size of the lower bound and able to capture the
0:13:45a difference when the other realising over right
0:13:47where s
0:13:48when i based on lies in the way the use of sequences using that is
0:13:53the engine
0:13:54the okay to those are persuaded and you the better performance and when i was
0:13:59i wasn't anything the spectrum that and it is different then
0:14:04it will look at those i think
0:14:06then and you get better performance
0:14:08then the secrecy using a recently
0:14:19it's just an no mandatory have in the i-vector nonetheless global warning behind
0:14:26use of sequences the using the dramatically scale and he news good if the performance
0:14:32based on maybe i'll fix the log spectrum
0:14:35so in this thing here in this role
0:14:39all singing voices to represent the
0:14:41there will be made within twenty minutes ambition using the or something custody
0:14:46that means because he's using that exactly
0:14:49where is it wouldn't the closing we didn't show that it was in there
0:14:55they do not even representation using the gmm based system
0:15:01that's true within their
0:15:03the only male presentation using the efficiency with the german task which is a anything
0:15:11so we you remain
0:15:13s goes we are using the original signal being systems are statistically or we will
0:15:19go in it was it would be nice to
0:15:23again in there
0:15:24if we can say they are specific what they contain in and fourteen
0:15:28where we can see that you are used a specially localising be
0:15:33and now you know there
0:15:36in our previous presenting overdemand use of sequences the user dramatically scale
0:15:42hindi better performance and table two shows are explained that she
0:15:47think this is in here
0:15:49this is because as in business
0:15:51it's a question you know right
0:15:57no only in thinking one representation
0:16:00this can afford it is from the decision a fourteen thirteen and fourteen and be
0:16:05used to think is residual material based front end
0:16:09a the idea that multiplying being by giving substantially nobody they were using it
0:16:18it is really is
0:16:22so no i
0:16:24i that imposing a when the i-vectors analysing the woodbury then you also decreases the
0:16:29original article is good this one day they're having the size of those of us
0:16:36and those are frequently but the woman
0:16:42the conditional condition
0:16:46if you already
0:16:47seen a linguistically and presentation you might hear i you might the idea would be
0:16:54originally proposed in this
0:16:57for something analysis to identify localiser representing this problem
0:17:03we define the also find that the different exactly the i think within the different
0:17:09something and
0:17:11but it was activated in front end which imprecise information relayed consuming
0:17:19it was also they're using the front end and vocal qualities of the database
0:17:25so this finding explain why
0:17:29that is simply a back to estimate the solution is
0:17:33so what i in this thing
0:17:40and if you have any portion a have little as follows