| 0:00:13 | "'kay" | 
|---|
| 0:00:13 | so um my name is Juliette Kahn and uh i will present to you | 
|---|
| 0:00:18 | uh what we have done for the HASR submission | 
|---|
| 0:00:22 | and uh | 
|---|
| 0:00:23 | and also what we have done after the submission | 
|---|
| 0:00:29 | because this proposition raised a lot of uh | 
|---|
| 0:00:31 | questions | 
|---|
| 0:00:32 | so this work was done with | 
|---|
| 0:00:34 | and nicholas as of where | 
|---|
| 0:00:36 | uh us not of that all is a seven arsed | 
|---|
| 0:00:38 | and uh we come from France | 
|---|
| 0:00:42 | okay | 
|---|
| 0:00:43 | so | 
|---|
| 0:00:43 | uh the goal of HASR uh | 
|---|
| 0:00:46 | is to | 
|---|
| 0:00:47 | analyse how human experts can contribute to the | 
|---|
| 0:00:51 | uh use of automatic speaker recognition technologies | 
|---|
| 0:00:54 | and uh how we can have a mix of both communities so | 
|---|
| 0:01:00 | uh it was a very great experience the first time and so | 
|---|
| 0:01:04 | uh for us it was | 
|---|
| 0:01:07 | just uh a first experience to try to do something | 
|---|
| 0:01:10 | for this submission | 
|---|
| 0:01:12 | and uh | 
|---|
| 0:01:14 | the task was | 
|---|
| 0:01:15 | uh a very classical verification task, we have two point five minutes | 
|---|
| 0:01:21 | uh of uh samples for each speaker, they come from SRE 10 and they were very difficult trials | 
|---|
| 0:01:28 | and these trials were | 
|---|
| 0:01:30 | uh chosen by uh NIST | 
|---|
| 0:01:33 | and the choice of the trials | 
|---|
| 0:01:37 | was done from the results of a particular system | 
|---|
| 0:01:40 | and uh there were two sets | 
|---|
| 0:01:43 | HASR1 | 
|---|
| 0:01:44 | and HASR2 and so we participated in HASR2 because the the samples | 
|---|
| 0:01:51 | the the trials | 
|---|
| 0:01:53 | uh the trials of HASR2 include the HASR1 ones so | 
|---|
| 0:01:58 | we did the whole task | 
|---|
| 0:02:00 | so what | 
|---|
| 0:02:02 | we proposed was very simple because uh LIA is more a computer science uh | 
|---|
| 0:02:09 | uh laboratory and uh LIG too so | 
|---|
| 0:02:13 | uh we just took three native French listeners | 
|---|
| 0:02:17 | and | 
|---|
| 0:02:18 | uh two females and one male | 
|---|
| 0:02:22 | and we allowed them to examine spectrograms and uh to band-pass | 
|---|
| 0:02:28 | uh filter the signals so | 
|---|
| 0:02:31 | they could do what they wanted | 
|---|
| 0:02:33 | and uh they had to decide if it was | 
|---|
| 0:02:37 | the same speaker who was speaking or | 
|---|
| 0:02:39 | if it was | 
|---|
| 0:02:40 | two different speakers | 
|---|
| 0:02:42 | and give uh a confidence score and uh if they gave a zero that means that they | 
|---|
| 0:02:48 | are not confident in their decision and if they gave | 
|---|
| 0:02:51 | five | 
|---|
| 0:02:52 | that means that they are very confident and uh | 
|---|
| 0:02:55 | after that for the submission | 
|---|
| 0:02:57 | uh to um | 
|---|
| 0:02:58 | NIST | 
|---|
| 0:02:59 | uh we chose to use majority voting, i mean because | 
|---|
| 0:03:03 | there are three people so it's easy to have a majority so | 
|---|
| 0:03:07 | uh we chose to use this indication for the choice of the decision | 
|---|
| 0:03:15 | and uh for the score uh we chose | 
|---|
| 0:03:20 | to do a mapping | 
|---|
| 0:03:22 | between the | 
|---|
| 0:03:23 | human decision and uh the score we have with our uh GMM system | 
|---|
| 0:03:29 | uh so as to compare | 
|---|
| 0:03:30 | uh the in try to find version | 
|---|
| 0:03:34 | score uh after | 
|---|
| 0:03:35 | and to compare | 
|---|
| 0:03:38 | uh if you have questions on the mapping i can answer but i think that it's | 
|---|
| 0:03:42 | more interesting to to look at uh the other results | 
|---|
| 0:03:46 | and see other things | 
|---|
| 0:03:47 | so | 
|---|
| 0:03:48 | um a | 
|---|
| 0:03:49 | the fact is that NIST provided us very long um samples so | 
|---|
| 0:03:56 | because we decided to do a perceptive test and | 
|---|
| 0:03:59 | to uh listen to lots of | 
|---|
| 0:04:02 | the the samples | 
|---|
| 0:04:04 | uh we decided to | 
|---|
| 0:04:06 | change | 
|---|
| 0:04:06 | a little bit the things and not to give to the listeners | 
|---|
| 0:04:10 | all the two minutes | 
|---|
| 0:04:12 | of uh | 
|---|
| 0:04:13 | speech for each speaker | 
|---|
| 0:04:15 | so um | 
|---|
| 0:04:17 | we decided to cut the signal and to select | 
|---|
| 0:04:22 | the part with the most energy, uh where we are sure that there | 
|---|
| 0:04:27 | is a lot of speech | 
|---|
| 0:04:28 | and | 
|---|
| 0:04:29 | maybe a lot of information | 
|---|
| 0:04:32 | so uh | 
|---|
| 0:04:33 | so we selected um short segments of around six seconds for each sample | 
|---|
| 0:04:40 | and because uh in a lot of uh perceptive tests | 
|---|
| 0:04:45 | um | 
|---|
| 0:04:47 | which are done in psychology they use this kind of duration so that's why we chose | 
|---|
| 0:04:53 | this kind | 
|---|
| 0:04:55 | and uh so i have some examples so | 
|---|
| 0:04:58 | that you can see | 
|---|
| 0:05:01 | and hear what | 
|---|
| 0:05:02 | i am talking about | 
|---|
| 0:05:04 | because the idea was to have a beep between each sample to know that we are changing | 
|---|
| 0:05:11 | uh sample, that there is a change of sample | 
|---|
| 0:05:15 | so that's my first example | 
|---|
| 0:05:17 | [audio examples played] | 
|---|
| 0:05:42 | so | 
|---|
| 0:05:43 | same speaker or different speakers | 
|---|
| 0:05:46 | what do you think | 
|---|
| 0:05:50 | okay | 
|---|
| 0:05:51 | and it's not the same | 
|---|
| 0:05:54 | it's always is the sensing i mean we we choose the | 
|---|
| 0:05:57 | or difficulty so | 
|---|
| 0:05:59 | yeah it's not the same but um yeah you have different sample in you can | 
|---|
| 0:06:04 | uh have | 
|---|
| 0:06:05 | you you | 
|---|
| 0:06:06 | not memorise because you don't have enough to memorise but with two minutes it's exactly the same thing we can | 
|---|
| 0:06:11 | have a i mean your voice leaf two minutes so | 
|---|
| 0:06:14 | it's something you can compare | 
|---|
| 0:06:16 | very quickly and try to take a decision and | 
|---|
| 0:06:20 | the consequence of this kind of | 
|---|
| 0:06:22 | the | 
|---|
| 0:06:23 | stimulus is that uh our listeners take a decision very quickly | 
|---|
| 0:06:28 | um | 
|---|
| 0:06:29 | in | 
|---|
| 0:06:30 | around thirty seconds they take their decision so | 
|---|
| 0:06:34 | that's why we | 
|---|
| 0:06:36 | chose this | 
|---|
| 0:06:37 | okay so i come back here | 
|---|
| 0:06:45 | so yeah and so they | 
|---|
| 0:06:47 | they can uh use uh L | 
|---|
| 0:06:49 | and they can yeah or or try to listen but | 
|---|
| 0:06:53 | usually they just take the decision very quickly | 
|---|
| 0:06:58 | these are the results, yeah you will have the other sites and | 
|---|
| 0:07:01 | i think that | 
|---|
| 0:07:02 | what | 
|---|
| 0:07:03 | is very interesting is that | 
|---|
| 0:07:05 | uh our | 
|---|
| 0:07:06 | automatic system is | 
|---|
| 0:07:08 | better than the decisions taken | 
|---|
| 0:07:11 | by humans | 
|---|
| 0:07:14 | and | 
|---|
| 0:07:15 | um our first question was | 
|---|
| 0:07:17 | uh because | 
|---|
| 0:07:19 | the question of human performance is a very important thing so | 
|---|
| 0:07:23 | uh a first thing that could be good information to know if we can | 
|---|
| 0:07:29 | have confidence in the decision of humans | 
|---|
| 0:07:32 | is to see if they agree and uh what happens when they agree, do they | 
|---|
| 0:07:38 | take | 
|---|
| 0:07:39 | the | 
|---|
| 0:07:40 | good decision or uh are they wrong | 
|---|
| 0:07:43 | so | 
|---|
| 0:07:44 | you can see that | 
|---|
| 0:07:46 | here | 
|---|
| 0:07:48 | yeah | 
|---|
| 0:07:48 | we can't | 
|---|
| 0:07:49 | know, if they agree it's not | 
|---|
| 0:07:53 | uh a good uh indication of the fact that | 
|---|
| 0:07:57 | uh we can have confidence in their decision because | 
|---|
| 0:08:01 | they | 
|---|
| 0:08:02 | do um | 
|---|
| 0:08:03 | yeah here you have | 
|---|
| 0:08:05 | this is the good | 
|---|
| 0:08:07 | um the good answer okay | 
|---|
| 0:08:09 | the correct answer and here are the trials and so you can see that | 
|---|
| 0:08:15 | here | 
|---|
| 0:08:16 | it seems that they take the correct decision but here | 
|---|
| 0:08:20 | on | 
|---|
| 0:08:20 | uh when they are working on target trials | 
|---|
| 0:08:24 | uh | 
|---|
| 0:08:26 | we can't know if it's good or not because here you have exactly the same proportion | 
|---|
| 0:08:31 | and | 
|---|
| 0:08:33 | the confidence score they gave | 
|---|
| 0:08:35 | uh is not a good indication either so you can't trust the people when they | 
|---|
| 0:08:41 | say okay i'm sure that it is the same, that's not a good indication to say okay we can | 
|---|
| 0:08:46 | trust them | 
|---|
| 0:08:47 | so | 
|---|
| 0:08:48 | that's a problem | 
|---|
| 0:08:52 | and | 
|---|
| 0:08:53 | for the um | 
|---|
| 0:08:55 | the | 
|---|
| 0:08:56 | um yeah we have some discussion on the protocol of HASR | 
|---|
| 0:09:02 | because the first thing is that our listeners | 
|---|
| 0:09:05 | uh had the feeling that it was more an evaluation | 
|---|
| 0:09:09 | to know if they can compensate for the channels | 
|---|
| 0:09:13 | than to evaluate the proximity of the voices of two speakers so, but it's because it's just | 
|---|
| 0:09:19 | not what they | 
|---|
| 0:09:22 | uh usually do so | 
|---|
| 0:09:24 | yeah it was | 
|---|
| 0:09:25 | difficult | 
|---|
| 0:09:26 | um | 
|---|
| 0:09:27 | yeah we it's thin only it's not it's | 
|---|
| 0:09:31 | for our submission it is more a perceptive test than an acoustic analysis because they | 
|---|
| 0:09:37 | don't use, they just filter when they have very different channels and something like this but they don't use | 
|---|
| 0:09:44 | some | 
|---|
| 0:09:45 | part of the signal to know uh where is the end to take their decision, it is just perceptive | 
|---|
| 0:09:51 | things | 
|---|
| 0:09:52 | and for the limitation of the protocol the question is | 
|---|
| 0:09:55 | do we have enough to to do statistics and know exactly what happens | 
|---|
| 0:10:01 | and | 
|---|
| 0:10:01 | what is very important is that we can't | 
|---|
| 0:10:04 | randomise | 
|---|
| 0:10:05 | the trials, that means that all the listeners | 
|---|
| 0:10:09 | um hear at the same time | 
|---|
| 0:10:12 | the same trials and it's | 
|---|
| 0:10:14 | clear that | 
|---|
| 0:10:15 | you don't have the same attention when you start | 
|---|
| 0:10:18 | and when it is the | 
|---|
| 0:10:20 | hundredth trial you are listening to so that's important and in psychology we always do that | 
|---|
| 0:10:27 | that's to | 
|---|
| 0:10:28 | to have the | 
|---|
| 0:10:30 | to randomise | 
|---|
| 0:10:31 | so | 
|---|
| 0:10:33 | that's | 
|---|
| 0:10:33 | after that we have a lot of questions from this submission, our first question was okay | 
|---|
| 0:10:39 | um what is the influence of the number of listeners because we have only three listeners so what happens | 
|---|
| 0:10:45 | if we increase the number of listeners | 
|---|
| 0:10:47 | um and uh what is the difference between experienced and inexperienced listeners, if we have | 
|---|
| 0:10:55 | sort of experts | 
|---|
| 0:10:57 | and what is the complementarity between the human and the system decision | 
|---|
| 0:11:01 | because | 
|---|
| 0:11:02 | uh we just submitted the decision of humans | 
|---|
| 0:11:05 | so we changed a little bit the protocol from HASR | 
|---|
| 0:11:08 | uh we have more listeners | 
|---|
| 0:11:11 | search non experiments and ten experience listener | 
|---|
| 0:11:14 | we randomised because we have all the trials so we randomised the trials | 
|---|
| 0:11:19 | and we balanced the number of non-target and target | 
|---|
| 0:11:22 | uh because uh the first time the idea of this these an are are there okay i have to | 
|---|
| 0:11:28 | to it's | 
|---|
| 0:11:29 | so yeah it's balanced so i will take | 
|---|
| 0:11:31 | uh | 
|---|
| 0:11:32 | the | 
|---|
| 0:11:33 | zero point five as | 
|---|
| 0:11:35 | um my natural prior | 
|---|
| 0:11:37 | and so and | 
|---|
| 0:11:38 | uh we only allowed them to listen | 
|---|
| 0:11:43 | once | 
|---|
| 0:11:44 | to the trials and not to repeat the trials again | 
|---|
| 0:11:47 | and so | 
|---|
| 0:11:48 | what are the results of it | 
|---|
| 0:11:51 | we have only four inexperienced listeners that perform | 
|---|
| 0:11:56 | uh above chance level so if you take | 
|---|
| 0:11:59 | uh | 
|---|
| 0:11:59 | a coin and toss it it's exactly the same thing for the majority of the listeners | 
|---|
| 0:12:04 | but | 
|---|
| 0:12:05 | what is | 
|---|
| 0:12:06 | very interesting for us is that you have a very large gap of performance according to the trials | 
|---|
| 0:12:13 | you have some trials where | 
|---|
| 0:12:15 | ninety percent of the listeners are | 
|---|
| 0:12:18 | correct, give the correct answer | 
|---|
| 0:12:21 | and for all other trials you are only strip or or and of the listeners that gave uh the good | 
|---|
| 0:12:27 | answer so | 
|---|
| 0:12:29 | we don't find uh a difference between the male and the female trials it's | 
|---|
| 0:12:33 | exactly the same thing | 
|---|
| 0:12:35 | and | 
|---|
| 0:12:35 | we have different behaviours, we have | 
|---|
| 0:12:39 | some listeners who always say yes yes yes it's the same it's the same and others that | 
|---|
| 0:12:44 | are always saying no it's not the same it's not the same | 
|---|
| 0:12:47 | so we observe that for the listeners | 
|---|
| 0:12:50 | and we find a correlation between the performance | 
|---|
| 0:12:54 | and the in level of the the listeners because here it's for | 
|---|
| 0:12:58 | and not not of uh in people so | 
|---|
| 0:13:02 | yeah we find that | 
|---|
| 0:13:03 | so | 
|---|
| 0:13:04 | the last question was the complementarity between the human and the system and | 
|---|
| 0:13:09 | that's | 
|---|
| 0:13:10 | what we find is that um for non-target trials | 
|---|
| 0:13:14 | the system has | 
|---|
| 0:13:16 | uh | 
|---|
| 0:13:17 | a lot of correct answers and they are correct answers only for the system and not for | 
|---|
| 0:13:22 | the human | 
|---|
| 0:13:23 | but | 
|---|
| 0:13:23 | it's the contrary for the human, we have a lot of uh a big proportion oh | 
|---|
| 0:13:28 | sorry | 
|---|
| 0:13:30 | we have a big proportion here | 
|---|
| 0:13:32 | uh of | 
|---|
| 0:13:34 | correct answers only for the human so maybe we can find a complementarity | 
|---|
| 0:13:40 | and uh | 
|---|
| 0:13:43 | yeah um and the not yeah | 
|---|
| 0:13:45 | that's so | 
|---|
| 0:13:46 | and | 
|---|
| 0:13:47 | after that we have done the experiment with ten | 
|---|
| 0:13:50 | experienced listeners | 
|---|
| 0:13:52 | and we don't find a difference in performance between the inexperienced and the experienced listeners yeah | 
|---|
| 0:14:00 | it's exactly the same thing you have | 
|---|
| 0:14:02 | the th | 
|---|
| 0:14:04 | so for the suggestions and the further work it is | 
|---|
| 0:14:08 | more questions than other things | 
|---|
| 0:14:11 | yes | 
|---|
| 0:14:12 | because the first | 
|---|
| 0:14:13 | question is how the human can help the system so | 
|---|
| 0:14:16 | um maybe uh we have to examine the trials | 
|---|
| 0:14:20 | with the scores that are near the threshold of the system because we observed in the complementarity that | 
|---|
| 0:14:26 | that | 
|---|
| 0:14:27 | it is in | 
|---|
| 0:14:30 | the trials | 
|---|
| 0:14:31 | where uh the human is right and uh the system is wrong so | 
|---|
| 0:14:36 | maybe it's | 
|---|
| 0:14:37 | something we can do | 
|---|
| 0:14:38 | and | 
|---|
| 0:14:39 | yeah the second question is okay | 
|---|
| 0:14:42 | i have some trials which are very easy and others very difficult for humans, what are the differences | 
|---|
| 0:14:48 | between | 
|---|
| 0:14:49 | those trials | 
|---|
| 0:14:50 | and | 
|---|
| 0:14:51 | it's clear that it's important to replicate this kind of experiment | 
|---|
| 0:14:55 | we have not to have listen at that i'm sure that joe or | 
|---|
| 0:14:58 | the next paper will answer this question | 
|---|
| 0:15:01 | thank you | 
|---|
| 0:15:27 | you describe performance with experienced versus inexperienced listeners | 
|---|
| 0:15:31 | uh how how someone of experience to image bruce was but the question | 
|---|
| 0:15:35 | an experienced listener is a phonetician but | 
|---|
| 0:15:40 | who doesn't | 
|---|
| 0:15:41 | the fire and they | 
|---|
| 0:15:43 | don't work on forensics, they are not uh interested in the speaker, they | 
|---|
| 0:15:50 | work on the language and uh so they | 
|---|
| 0:15:53 | they are not eight uh everyday day novels | 
|---|
| 0:15:56 | they are very yeah they they | 
|---|
| 0:16:00 | but they are not experts in forensics or something because | 
|---|
| 0:16:03 | in France we don't have | 
|---|
| 0:16:04 | this kind of people | 
|---|
| 0:16:05 | i | 
|---|
| 0:16:06 | yeah | 
|---|
| 0:16:16 | so this a lot of people saying just always a all makes me fine | 
|---|
| 0:16:21 | where it would be possible to how real human judge | 
|---|
| 0:16:24 | just give the mumbled string as might come from from list model | 
|---|
| 0:16:28 | and do post facto dollar bleep each room thing to of the pressure your bit of the best to perform | 
|---|
| 0:16:35 | to do that but uh the problem is that for the first study | 
|---|
| 0:16:39 | we have done with the three people | 
|---|
| 0:16:42 | um we don't have a correlation between uh correct answer and | 
|---|
| 0:16:47 | a confidence score | 
|---|
| 0:16:48 | which is good so we can't use | 
|---|
| 0:16:50 | the | 
|---|
| 0:16:52 | you see we can't | 
|---|
| 0:16:53 | trust the listeners, it's not because they say i'm sure | 
|---|
| 0:16:58 | of my decision that it means that it | 
|---|
| 0:17:03 | signifies that | 
|---|
| 0:17:04 | they are right | 
|---|
| 0:17:06 | so | 
|---|
| 0:17:07 | it's difficult to have a calibration | 
|---|
| 0:17:09 | and to use this confidence score | 
|---|
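
Editorial sketch: the talk says the submitted decision for each trial came from a majority vote over three listeners. The Python below is a minimal illustration of that rule, not the authors' code. The function name `majority_vote` and the labels `"same"` and `"different"` are assumptions made for this example, and it assumes exactly two possible labels, so an odd number of listeners can never tie.

```python
from collections import Counter


def majority_vote(decisions):
    """Return the label given by most listeners for one trial.

    `decisions` is a list of hypothetical labels ("same" / "different").
    With two labels and an odd number of listeners a tie cannot occur.
    """
    if len(decisions) % 2 == 0:
        raise ValueError("use an odd number of listeners to avoid ties")
    # most_common(1) returns a list holding the (label, count) pair with the top count.
    label, _count = Counter(decisions).most_common(1)[0]
    return label


# Hypothetical answers from three listeners for a single trial.
trial = ["same", "different", "same"]
print(majority_vote(trial))
```

The sketch leaves out the confidence scores on purpose: the talk reports that they did not track correctness, so they are not used to weight the vote here.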