My name is [name unclear], and I will present the work we do at our lab [transcribed as "L D A"], entitled "Intra-speaker variability effects on speaker verification".

Over the last decade, the performance of speaker verification systems has reached a good level, and this permits a large set of practical applications, for example in industry or in forensics. But this performance is always reported as an average error rate, and we do not have a lot of studies on the explanation of performance variability. There is one important piece of work [reference inaudible] that explains performance variability according to the speaker. It is also well known that performance varies a great deal with the length of the training and testing excerpts [partly inaudible]. And it has been proposed to look at differences in phonemic content [partly inaudible], showing that performance differs according to it.

Our question is this. We work only on the training data. For one speaker we have several excerpts, so what is the variability due to the signal sample used to train the speaker model? The other question is what kind of information may explain this difference in performance. We propose to study the number of selected frames, the phonemic distribution, and the intra-phonemic acoustic differences.

We use the [system name unclear] system, which is a UBM-GMM approach with [inaudible]. We use the [inaudible] that was used for the NIST evaluation campaigns, but we do not do score normalisation. The overall idea is to run a lot of
trials with the different training samples we have for a speaker, and to select the best training excerpt and the worst training excerpt for each speaker. The best training excerpt is the one that minimises the percentage of false acceptances and false rejections. The worst is found the same way: we maximise the percentage of false acceptances and false rejections. With this selection we have two sets, one named min and the other named max, plus a random set. We run the different experiments with exactly the same speakers and exactly the same testing excerpts, but we change the training excerpt for each set.

We run these experiments on two corpora. The first is based on NIST 2008, with telephone conversational speech and a length of about two minutes for each sample. To maximise the number of training excerpts for each speaker, we use a leave-one-out procedure. With this process we have 171 speakers [number as transcribed], each with between three and twenty models.

The other corpus we used is BREF 120, which is a studio-recorded database, recorded with exactly the same microphone. It is read speech from newspapers, and all the speakers are native French speakers. We have more females than males. For each speaker we have training and testing excerpts, and we concatenate sentences so as to have more than twenty seconds of selected frames per excerpt.

When we analyse the variability due to the training excerpt, we see that the equal error rate ranges from 4.1 percent to 21.9 percent for NIST [as best heard], and for BREF from 1 percent to 33 percent. We have also built a random set, and the mean here
for BREF is the mean over different random draws. So there is a very large gap according to the training excerpt.

Now the important thing is to explain this variability, and the question is with what kind of information. For the number of selected frames, we can do that with both NIST and BREF. For the phonemic distribution and for the intra-phonemic acoustic differences, we use only BREF, because it is easier to obtain this type of information.

For NIST, we have a significant effect of the number of frames. But that is something that is controlled in BREF 120. So it is a relevant factor for explaining the difference in performance, but other factors must be important, because the gap is even larger in BREF 120, where it cannot be explained by the number of frames.

For the phonemic content, we do a forced alignment [tool name inaudible], and we correct this alignment manually. To analyse the phonemic content, as a first step we simply count the number of selected frames for each phoneme. We run a MANOVA with a between-subjects factor, which is the set, and the dependent variables are the numbers of selected frames. We see that there is practically no difference in phonemic content, shown here for female speakers, between the min, the max and the random sets: only one phoneme is relevant [partly inaudible]. For the males it is the same thing. So it is not sufficient to explain the gap in performance.

For the intra-phonemic information, we use the acoustic features for each phoneme. It is exactly the same setting: a MANOVA with the set as the between-subjects factor, and the dependent variables are the LFCC, the delta and the delta-delta. We have a
significant difference for the LFCC for all the phonemes. For the deltas there is a significant difference for a majority of phonemes, mainly stops and several vowels, but we do not find a difference for the delta-deltas [as best heard]. This type of analysis shows that the intra-phonemic acoustic differences have to be accounted for.

So, when the training excerpt changes, we have large performance differences. They cannot be explained by the number of selected frames alone: it is a possible factor, but not a sufficient one. And the phonemic distribution cannot explain this gap either. Further investigation is needed to determine the influence of the intra-phonemic acoustics. The question is how to make the link between the intra-phonemic acoustic differences and higher-level phonetic information.

We have new results since the submission of the paper. We see a difference in intensity between min and max [direction inaudible]. It is significant, but if you take the mean of the intensity it is a very small difference. There is no difference for the fundamental frequency, the pitch. You can see the formants here, and we don't have a difference. We also looked at the dispersion of the vowels [as best heard], and there is no difference for this type of information. It is the same thing for the spectrum [remainder inaudible].

For future work, the question is that the variability may not be only the result of the signal samples; maybe the system itself is part of the problem. We are now working on the link between the per-frame LLR and the phonemic description, to understand what exactly are the good frames, and whether there is a link with phonemic information. Thank you.

Audience: [partly inaudible] You said that
there was no significant difference in SNR [partly inaudible] between the training excerpts of the good trials and of the bad trials?

Speaker: Yes. There is a difference in the acoustics: for the LFCC, we have a significant difference for all the phonemes. But if we want to find the link with higher-level features, we don't find anything. With the descriptions, the features usually used in phonetic science to describe speech, we have not found the link between the LFCC, the recognition, and the phonetic information. We don't know why we have this type of gap, and at present we don't have an explanation from the acoustic and phonetic analysis.

Audience: So if you just take your NIST trials and select training excerpts with high SNR and with low SNR, you don't see a difference in performance?

Speaker: Sorry?

Audience: [Partly inaudible] We did something like that, and there is a difference in performance, which is what you would expect: noisier training data should give worse performance.

[Several turns inaudible. The exchange appears to concern whether SNR varies much in BREF compared with NIST.]

Audience: And the variability in, for example, [inaudible] position: there is no such variability, right?

Speaker: No. It is exactly the same microphone, and the people are recorded on the same day. There is no session variability. The only variability is in the speaker. And when the only variation comes from the speaker, we can still have a variation like this, between one and thirty-three percent. I think [inaudible].
So it is very [inaudible], and then the question is how to explain that. Because if we can have an explanation, we can build a confidence score or something like that, which can say: I know the training sample and I know the testing sample, and for this trial I have a good score but I don't have confidence in this data, whereas with other data the score is well computed. That is the objective of this kind of study.

[Audience question inaudible.]

Speaker: [Partly inaudible] We just use the LFCC, the delta and the delta-delta, and that was to check that there is a difference, because at the beginning we did not understand [inaudible]. The question now is the link between the phonetic information and the LFCC that are used. We know that in the LFCC and delta we have information, but we have not found a link between the LFCC and delta and the higher-level phonemic information.

At the moment I am working on coarticulation information. The first experiments I did were only with triphones, analysing the distribution of the triphones, and I don't find a difference. So now I am measuring the locus, to see whether with the locus [inaudible]. For the locus, you take the value of the second formant at the beginning of the vowel and at fifty percent of the vowel, and you analyse the relation between the two values. Normally, if there is a lot of coarticulation, you have a regression of one value on the other [partly inaudible], but if there is no
coarticulation, you have something that is very [inaudible].

[Audience question inaudible.]

Speaker: It's a good question. [Partly inaudible] There is a difference according to the normalisation, but it is not comparable with the difference we have, without normalisation, between the sets we select [as best heard]. Normalisation is something that we have to do. But the problem is that with a database like BREF it is very difficult, because we don't have a lot of data to build a good [inaudible] alongside all the different training and testing subsets. So it is very difficult to do the normalisation if we want to keep a lot of different training excerpts.

[Audience question largely inaudible; it appears to concern combining excerpts.]

Speaker: The concatenation is a randomised concatenation. We make sure that the same samples are never used for both testing and training. But we don't combine them. For example, if your question is whether we have tried to take the best excerpts and concatenate them with the worst to get a better model, we have not tried this type of combination.

Audience: [Partly inaudible] You have several recordings of each speaker, between three and twenty recordings, and each recording is made at some point in time. Have you tried combining multiple recordings into one model?

Speaker: Yes, we have done that, to have samples of about two minutes of selected frames [duration as best heard], and we do the same thing, selecting the best and the worst
with a longer signal. The results are these. [Partly inaudible] That is why the curve is not so good, but we still have a gap which is important: here the equal error rate is less than one percent, and here it is five percent, even though we have a lot of selected frames.

Audience: [Inaudible.]

Speaker: No. It is exactly the same testing data for this curve and this curve, so it is possible to compare them.

Audience: [Inaudible; the question appears to ask which sessions the excerpts come from.]

Speaker: I have no information about that, because the samples are recorded with the same microphone and on exactly the same day. So there is no inter-session variability. There is only intra-speaker variability.

[Remainder inaudible.]
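
[Editor's illustration.] The selection step described in the talk (for each speaker, keep the training excerpt that minimises, or maximises, the percentage of false acceptances plus false rejections) can be sketched roughly as follows. This is a minimal sketch, not the authors' code: the function names, the fixed decision threshold, the excerpt identifiers and the toy scores are all assumptions made for illustration, and the talk does not say how the decision threshold was set.

```python
def error_rate(target_scores, impostor_scores, threshold):
    """Percentage of false rejections plus false acceptances at a threshold.

    A target trial scoring below the threshold counts as a false rejection;
    an impostor trial scoring at or above it counts as a false acceptance.
    """
    false_rejections = sum(1 for s in target_scores if s < threshold)
    false_acceptances = sum(1 for s in impostor_scores if s >= threshold)
    total = len(target_scores) + len(impostor_scores)
    return 100.0 * (false_rejections + false_acceptances) / total


def select_min_max(excerpts, threshold):
    """Return (best, worst) training-excerpt ids for one speaker.

    `excerpts` maps a hypothetical excerpt id to a pair
    (target_scores, impostor_scores) obtained with the model trained
    on that excerpt, always against the same test excerpts.
    """
    rates = {
        excerpt_id: error_rate(targets, impostors, threshold)
        for excerpt_id, (targets, impostors) in excerpts.items()
    }
    best = min(rates, key=rates.get)   # goes into the "min" set
    worst = max(rates, key=rates.get)  # goes into the "max" set
    return best, worst


# Toy scores (invented for this sketch): three training excerpts, one speaker.
toy = {
    "excerpt_a": ([2.1, 1.7, 0.9], [-1.2, -0.4, 0.3]),
    "excerpt_b": ([0.2, -0.5, 0.8], [0.6, 0.9, -0.1]),
    "excerpt_c": ([1.5, 0.1, 1.1], [-0.8, 0.7, -0.3]),
}
best, worst = select_min_max(toy, threshold=0.5)
```

With these toy scores, `best` is `excerpt_a` and `worst` is `excerpt_b`. Repeating the selection for every speaker would give the min and max sets; a random set would draw one excerpt per speaker instead.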