Hello everyone. I am [inaudible] from [inaudible] University, and I am presenting a joint work with my supervisors.

The topic is analysis-synthesis based speech enhancement with improved spectral envelope estimation by tracking speech dynamics.

First, let's have a look at the outline of my presentation.

At the very beginning I will introduce some background, such as the spectral effects of noise corruption and of conventional filtering.

Then I will introduce the model-based speech enhancement that we proposed previously.

I will then introduce a speech dynamics tracking scheme that is used in conjunction with the model-based speech enhancement.

After that come the performance evaluation and the conclusion.

So let's first have a look at the effect of noise corruption from a spectral perspective.

Here we use white noise as an example.

We can observe that the harmonic structure of speech is severely damaged and the spectral envelope is lost, which results in a lot of spectral distortion.

There are some conventional statistical-model-based speech enhancement methods.

The upper figure shows the classical log-spectral amplitude estimator.

From the spectrogram we can see that the lower portion of the spectral envelope has been restored, but the overall noise level cannot be well suppressed.

As a result, there will be many musical tones and residual noise in the processed speech.

The lower figure shows the optimally modified log-spectral amplitude estimator.

This method generally has a very good capability of noise suppression.

However, the formants and the harmonic structures are further distorted.

So the upper one often gives a better PESQ score, while the lower one gives a better segmental SNR score.

So there is always a tradeoff between the noise suppression and the harmonic distortion, or, we can say, the naturalness of speech.

It can also be observed from the spectral envelope that noise will first severely distort the spectral envelope.

The conventional statistical methods will partially restore the spectral envelope, but they also partially further distort the spectrum.

So this will potentially account for some common features, such as musical tones and low intelligibility problems, in speech enhancement.

So in our previous work we have proposed an analysis-synthesis approach based on the harmonic model.

The basic idea is to extract clues from the noisy speech, and then we reconstruct the noise-free speech by resynthesis using this speech-only information.

You can see here: you have the pitch information, so you can track the location of the harmonics.

You have the spectral gain, so you can have the overall average spectral level.

And you have the spectral envelope, so you can track the magnitude spectrum.
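
The resynthesis idea described here can be sketched roughly as follows. This is only an illustration under stated assumptions, not the speaker's implementation: the function name `synthesize_voiced_frame`, the callable `envelope` interface, the sampling rate, the frame length, and the zero-phase cosines are all hypothetical choices made for the example.

```python
import numpy as np

def synthesize_voiced_frame(f0_hz, gain, envelope, fs=8000, n_samples=160):
    """Rebuild one voiced frame from pitch, gain, and a spectral envelope.

    f0_hz    : pitch in Hz; it fixes where the harmonics are placed.
    gain     : target RMS level of the frame (overall spectral level).
    envelope : hypothetical callable, frequency in Hz -> linear amplitude.
    """
    t = np.arange(n_samples) / fs
    # Number of harmonics strictly below the Nyquist frequency.
    n_harmonics = int((fs / 2) // f0_hz)
    if n_harmonics * f0_hz >= fs / 2:
        n_harmonics -= 1
    frame = np.zeros(n_samples)
    for k in range(1, n_harmonics + 1):
        fk = k * f0_hz
        # Harmonic amplitude is sampled from the envelope (zero phase assumed).
        frame += envelope(fk) * np.cos(2 * np.pi * fk * t)
    # Scale the frame so that its RMS matches the requested gain.
    rms = np.sqrt(np.mean(frame ** 2))
    if rms > 0:
        frame *= gain / rms
    return frame

# Example: 200 Hz pitch with a gently decaying, made-up envelope.
frame = synthesize_voiced_frame(200.0, 0.1, lambda f: 1.0 / (1.0 + f / 1000.0))
```

Because only the three speech parameters enter the synthesis, nothing in the output depends directly on the background noise; the quality then rests on how well those parameters are estimated.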

So why do we choose this approach to do speech enhancement?

First, this model is capable of generating clean harmonics, and only speech-related information is synthesized, so background noise is automatically removed.

This model also retrieves the missing harmonic structure, and it has a smooth spectral envelope, so there are no isolated spectral peaks and hence no musical tone problem.

Also, this model allows independent adjustment of the different model parameters, so it is possible for us to focus on enhancing the spectral envelope within this framework.

So by using this method, we can avoid suffering from the tradeoff between noise suppression and harmonic distortion.

So, from some of our previous work: after applying some pre-cleaning procedures using conventional methods, we can apply frequency-domain pitch searching and spectral gain estimation.

Some preliminary results show that the pitch and the spectral gain estimation already give very good performance when applied to the pre-cleaned spectrum.

However, the spectral envelope estimation is somewhat weaker.

We can see here that some preliminary results show that the PESQ score for 0 dB white noise would be around 1.5, and with some conventional approach it would be around 1.9.

Our previous approach, which takes the envelope from this pre-cleaning, can give about a 0.2 improvement.

However, if we replace the envelope with the true clean envelope, it can achieve 3.17.

So there is a huge gap here.

So we would expect some improvement in the PESQ score if we can further improve the spectral envelope estimation.

So the problem can be stated as follows.

For each frame of noisy observation, we want to find a mapping between the noisy and clean spectral envelopes.

And for consecutive frames, we want to find the temporal trajectories of the clean spectral envelope.

In other words, we want to estimate the clean speech envelope by looking at the long-term speech evolution.

So by assuming, over a certain period of time, a linear relationship between the consecutive clean spectral envelopes and a linear relationship between the noisy and clean spectral envelopes, we can use a linear dynamic system model to model this state tracking.
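
The two linear relationships mentioned here correspond to a standard linear state-space model, written out below as a sketch. The symbols are assumptions introduced for illustration and do not come from the talk: $\mathbf{x}_t$ is the clean envelope feature of frame $t$, $\mathbf{y}_t$ the noisy one, $\mathbf{A}$ and $\mathbf{C}$ the transition and observation matrices, and $\mathbf{Q}$ and $\mathbf{R}$ the covariances of the (assumed Gaussian) noise terms.

```latex
\begin{aligned}
\mathbf{x}_{t+1} &= \mathbf{A}\,\mathbf{x}_t + \mathbf{w}_t, &\quad \mathbf{w}_t &\sim \mathcal{N}(\mathbf{0}, \mathbf{Q}),\\
\mathbf{y}_t &= \mathbf{C}\,\mathbf{x}_t + \mathbf{v}_t, &\quad \mathbf{v}_t &\sim \mathcal{N}(\mathbf{0}, \mathbf{R}).
\end{aligned}
```

The first line carries the frame-to-frame dynamics of the clean envelope, and the second line carries the mapping between clean and noisy envelopes.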

So the feature we use here is the line spectrum frequency of the LPC coefficients.

For each period of observations, we have a series of LPC coefficients.

So, given the Kalman system parameters, we can run the Kalman filter and obtain the clean LPC coefficients.
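
Running the filter over one block of noisy feature vectors, given a set of system parameters, might look like the sketch below. It is a textbook Kalman filter and not necessarily the exact variant used in the talk; the names `A`, `C`, `Q`, `R`, `x0`, and `P0` are assumptions for the example.

```python
import numpy as np

def kalman_filter(Y, A, C, Q, R, x0, P0):
    """Filter a block of noisy feature vectors.

    Y      : (T, m) noisy observations, one row per frame.
    A, C   : state transition and observation matrices.
    Q, R   : process and observation noise covariances.
    x0, P0 : state mean and covariance one step before the first frame.
    Returns a (T, n) array of filtered (clean) state estimates.
    """
    x = np.asarray(x0, dtype=float)
    P = np.asarray(P0, dtype=float)
    n = A.shape[0]
    estimates = np.zeros((len(Y), n))
    for t, y in enumerate(Y):
        # Predict the next state from the linear dynamics.
        x = A @ x
        P = A @ P @ A.T + Q
        # Correct the prediction with the noisy observation.
        S = C @ P @ C.T + R
        K = np.linalg.solve(S, C @ P).T  # Kalman gain P C^T S^-1 (P, S symmetric)
        x = x + K @ (y - C @ x)
        P = (np.eye(n) - K @ C) @ P
        estimates[t] = x
    return estimates

# Example: a one-dimensional state observed as a constant value of 2.0.
Y = np.full((50, 1), 2.0)
est = kalman_filter(Y, np.eye(1), np.eye(1), np.array([[1e-4]]),
                    np.array([[0.25]]), np.zeros(1), np.eye(1))
```

In this toy example the estimate moves from the prior mean of 0 towards the observed value as more frames arrive, which is the smoothing behaviour the tracking relies on.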

So the next problem is how to obtain the Kalman system parameters for each period of observations.

The idea is that, for each block of noisy observations, we use a pre-trained codebook that models the clustered parallel LPC coefficients.

Through some optimization criterion we can obtain the corresponding Kalman system parameters.

So in the offline training stage we have both noisy and clean LPC coefficients.

We use split VQ to obtain the global codebook entries, in the sense that blocks with similar features are grouped into the same cluster.

By saying "similar" we need to define a distortion measure here.

It could be some simple measure such as Euclidean, or you can use some other measure such as a modified IS measure.

You also need to define a feature for each block of observations.

You can use the average spectrum, or you can use the whole series of vectors, so it would actually be a matrix quantization in this case.
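
Two candidate distortion measures of the kind mentioned here can be sketched as follows. This assumes that "IS" refers to the Itakura-Saito distortion between power spectra and shows only its standard form; the "modified" variant used in the talk is not specified, and both function names are hypothetical.

```python
import numpy as np

def euclidean_distortion(a, b):
    """Squared Euclidean distance between two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))

def itakura_saito_distortion(p_ref, p_test):
    """Standard Itakura-Saito distortion between two positive power spectra.

    It is zero only when the spectra are identical, and it is not symmetric
    in its two arguments.
    """
    p_ref = np.asarray(p_ref, dtype=float)
    p_test = np.asarray(p_test, dtype=float)
    ratio = p_ref / p_test
    return float(np.mean(ratio - np.log(ratio) - 1.0))
```

The Euclidean measure treats every coefficient equally, whereas the Itakura-Saito form depends on spectral ratios rather than on differences.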

For each cluster we have both noisy and clean features.

So we can minimize the total negative likelihood function for each cluster, and we will obtain the desired Kalman system parameters in this case.
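
One way to picture this per-cluster estimation is the sketch below. It rests on assumptions not stated in the talk: the noise terms are Gaussian and the clean features are available during training, in which case minimizing the negative log-likelihood reduces to ordinary least squares. The function name `fit_lds_params` is hypothetical, and a single training sequence stands in for all the blocks of a cluster.

```python
import numpy as np

def fit_lds_params(X_clean, Y_noisy):
    """Estimate linear dynamic system parameters from paired training data.

    X_clean : (T, n) clean feature vectors of one cluster, in frame order.
    Y_noisy : (T, m) noisy feature vectors for the same frames.
    Returns (A, C, Q, R) for x[t+1] = A x[t] + w and y[t] = C x[t] + v.
    """
    X_prev, X_next = X_clean[:-1], X_clean[1:]
    # Least squares for the dynamics: X_prev @ A.T approximates X_next.
    A = np.linalg.lstsq(X_prev, X_next, rcond=None)[0].T
    # Least squares for the clean-to-noisy mapping: X_clean @ C.T approximates Y_noisy.
    C = np.linalg.lstsq(X_clean, Y_noisy, rcond=None)[0].T
    # Residual covariances give the noise parameters.
    E_w = X_next - X_prev @ A.T
    E_v = Y_noisy - X_clean @ C.T
    Q = E_w.T @ E_w / len(E_w)
    R = E_v.T @ E_v / len(E_v)
    return A, C, Q, R
```

With several blocks in a cluster, the frame pairs of each block would be stacked before solving, so that no transition is formed across a block boundary.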

So in the online adaptation stage we also have the noisy observations for a block.

We use the same distortion measure to find the codebook entry, which has the corresponding Kalman system parameters.

Then we run the Kalman filter with this set of parameters, and we will get the desired clean envelope.
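
The online lookup step can be sketched as a nearest-centroid search. This is a simplified illustration that assumes the block-average feature and a Euclidean distortion, which are two of the options mentioned in the talk; the function name `select_cluster` and the array layout are hypothetical.

```python
import numpy as np

def select_cluster(noisy_block, codebook):
    """Return the index of the codebook entry closest to a noisy block.

    noisy_block : (T, d) noisy feature vectors of one block.
    codebook    : (K, d) centroids, one per cluster, trained offline.
    """
    # Summarize the block by its average feature vector.
    feature = noisy_block.mean(axis=0)
    # Squared Euclidean distortion to every centroid.
    distortions = np.sum((codebook - feature) ** 2, axis=1)
    return int(np.argmin(distortions))

# Example with a made-up three-entry codebook.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
block = np.array([[0.9, 1.2], [1.1, 0.8]])
index = select_cluster(block, codebook)
```

The returned index would then pick out the set of system parameters stored for that cluster, which is what the filter runs with for this block.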

So you can see from the spectral envelope here that the tracking actually gives very good performance.

We can also see from the 3-D view that the noisy envelope trajectories are quite messy and fluctuating.

This conventional method will restore some harmonics, but it results in some musical tone problem.

But the dynamics tracking scheme used here gives a very smooth and accurate trajectory.

It can also be observed from this figure that there is formant bandwidth expansion as compared to the conventional method.

So the tracking gives a result very close to the original spectral envelope trajectory.

So we restore the magnitude spectrum and the harmonic structures also.

And from the final synthesized speech we can see that there are no musical tones and the harmonic structures are retrieved.

Actually, we can achieve a PESQ score of around 2.41 for speaker-dependent training.

The noise we use is from 0 to 10 dB, using white noise, car noise, and babble noise.

Both speaker-dependent and speaker-independent testing are used.

And finally, I can conclude the presentation here.

In this presentation we looked at the effect of noise corruption, and conventional speech enhancement has been discussed.

An analysis-synthesis approach is presented, and a speech dynamics tracking algorithm that incorporates VQ training and Kalman filtering is proposed.

The improved spectral envelope estimation is illustrated objectively in terms of spectral distortion and PESQ score.

That's all for my presentation.

Thank you.

Yeah, thanks so much.

Yeah, this is the first question.

Do you have audio samples? Did you bring any with you?

Yeah, yeah, I have some.

Good.

Could you comment on the CPU cost issues?

Actually, for training it will be time consuming.

But you can control that through the selection of the codebook size, so there is always a tradeoff.

Okay, fine, so it is still acceptable.

That's quite a lot.

Yeah, next question please.

From your presentation I realise that the analysis-synthesis according to the clean signal would be the upper bound, right?

Yeah.

The analysis-synthesis results obtained with the clean envelope would be the upper bound, so the optimal case would be the one that uses the clean envelope.

So my question is: have you investigated the effect of the noisy phase information that you are using in your synthesis?

So what would be the effect of the noisy envelope and the noisy phase information that you are using in this case?

In this work we just use the magnitude spectrum.

We have not looked at the phase.

Actually, in conventional speech enhancement, phase is not something that is improved, but research shows it could affect the intelligibility.

Maybe in future work we will come to that.

But for your information, there are some papers also talking about the importance of phase information, as you may know.

I mean that the gap between the upper bound and your proposed method can also be because of the noisy phase.

So this scheme mostly works well for voiced speech.

For unvoiced speech we can just use the pre-cleaned data.

So this would be some explanation for the gap between the optimal case and the proposed method.

I would be interested to know whether you need to use a voice activity detector.

Actually, we have tried to use the VAD to train voiced and unvoiced speech with different data.

But the results show that training one cluster with the pooled data is better for the whole tracking.

Your synthesis model is very adequate for voiced sounds; it's a sinusoidal model approach. How do you synthesize the unvoiced sounds?

So for the unvoiced sound there is basically no pitch, and we just use a time-domain approach to synthesize it.

We have the gain information and the LPC information, and we just compute it in the time domain.

Are there any further questions?

That is not the case.

Thank you once more.