| 0:00:06 | well |
|---|
| 0:00:07 | after a great discussion uh about uh the |
|---|
| 0:00:10 | last |
|---|
| 0:00:11 | so i will continue with another topic related to speaker diarisation |
|---|
| 0:00:16 | uh |
|---|
| 0:00:17 | my name is bob automatic and |
|---|
| 0:00:19 | uh i was working uh previous semester or uh |
|---|
| 0:00:23 | as an erasmus student in that |
|---|
| 0:00:26 | uh you at the university of i mean you wanna |
|---|
| 0:00:29 | at all about |
|---|
| 0:00:30 | about the last |
|---|
| 0:00:31 | in formatting that venue |
|---|
| 0:00:32 | uh were my supervisors where |
|---|
| 0:00:35 | coding the video and there is not true |
|---|
| 0:00:38 | uh |
|---|
| 0:00:39 | it was about uh preliminary study |
|---|
| 0:00:42 | of factor analysis based approaches applied to the speaker diarization task |
|---|
| 0:00:48 | of meetings |
|---|
| 0:00:50 | well |
|---|
| 0:00:51 | what it would be about |
|---|
| 0:00:53 | uh i will briefly describe the speaker diarisation |
|---|
| 0:00:58 | also factor analysis |
|---|
| 0:01:00 | i will tell you something about the objectives of this study uh some experiments |
|---|
| 0:01:05 | and the |
|---|
| 0:01:06 | perspective |
|---|
| 0:01:10 | uh shortly about diarisation i suppose uh almost all of you know what speaker diarization means |
|---|
| 0:01:20 | what |
|---|
| 0:01:20 | is its purpose |
|---|
| 0:01:22 | uh speaker diarization tries to find the answer to the question |
|---|
| 0:01:27 | who spoke when |
|---|
| 0:01:29 | uh we don't have |
|---|
| 0:01:31 | uh any a priori knowledge |
|---|
| 0:01:32 | about the speakers, their number |
|---|
| 0:01:35 | and their identity |
|---|
| 0:01:38 | uh as you can see here is a small |
|---|
| 0:01:41 | small |
|---|
| 0:01:42 | you have uh |
|---|
| 0:01:44 | if uh |
|---|
| 0:01:45 | and how |
|---|
| 0:01:46 | would of uh such a such a system |
|---|
| 0:01:48 | uh where we can see the |
|---|
| 0:01:50 | speech segments are labelled by the by the speakers |
|---|
| 0:01:55 | uh the diarisation system you uh tries to find the same segments of |
|---|
| 0:01:59 | speakers |
|---|
| 0:02:00 | and label them |
|---|
| 0:02:01 | uh for for my experiments i used uh diarisation system uh developed in |
|---|
| 0:02:08 | in the yeah |
|---|
| 0:02:09 | uh the system participated in the nist rich transcription |
|---|
| 0:02:15 | campaigns since two thousand three |
|---|
| 0:02:18 | uh the system uses topdown strategy |
|---|
| 0:02:21 | uh what is the top down strategy i will |
|---|
| 0:02:23 | i will uh |
|---|
| 0:02:24 | sounds |
|---|
| 0:02:25 | now |
|---|
| 0:02:26 | uh the top down strategy consists of uh |
|---|
| 0:02:29 | four main steps |
|---|
| 0:02:31 | the first |
|---|
| 0:02:32 | is speech activity detection |
|---|
| 0:02:36 | uh |
|---|
| 0:02:37 | where two pre-trained gmm models |
|---|
| 0:02:41 | uh |
|---|
| 0:02:42 | are used |
|---|
| 0:02:44 | as models of speech and nonspeech |
|---|
| 0:02:47 | uh |
|---|
| 0:02:49 | then |
|---|
| 0:02:50 | viterbi decoding and map adaptation are used |
|---|
| 0:02:54 | another step is uh segmentation |
|---|
| 0:02:56 | uh where is |
|---|
| 0:02:57 | used the evolutive |
|---|
| 0:03:00 | uh hidden markov model |
|---|
| 0:03:02 | uh |
|---|
| 0:03:03 | also viterbi decoding and uh |
|---|
| 0:03:07 | uh the third and the fourth |
|---|
| 0:03:09 | steps |
|---|
| 0:03:10 | are almost the same uh it's resegmentation but |
|---|
| 0:03:13 | using different |
|---|
| 0:03:15 | parameterisation |
|---|
| 0:03:19 | uh factor analysis is uh |
|---|
| 0:03:22 | is |
|---|
| 0:03:22 | is so well known in in fields like uh speaker verification |
|---|
| 0:03:27 | language identification uh and video genre classification |
|---|
| 0:03:32 | uh |
|---|
| 0:03:34 | and |
|---|
| 0:03:35 | the the uh |
|---|
| 0:03:38 | the big difference uh |
|---|
| 0:03:41 | you can say it's |
|---|
| 0:03:42 | uh |
|---|
| 0:03:43 | can be mathematically |
|---|
| 0:03:44 | described |
|---|
| 0:03:45 | uh in these two equations |
|---|
| 0:03:47 | where the first equation |
|---|
| 0:03:50 | is standard gmm ubm modelling |
|---|
| 0:03:52 | and |
|---|
| 0:03:53 | the second equation |
|---|
| 0:03:55 | uh |
|---|
| 0:03:57 | contains |
|---|
| 0:03:58 | uh |
|---|
| 0:03:59 | um |
|---|
| 0:04:00 | contains the U term |
|---|
| 0:04:02 | which uh |
|---|
| 0:04:04 | is modelling the session variability |
|---|
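
The slide equations are not captured in the transcript. As a hedged reconstruction, the two models the speaker contrasts are usually written as follows in the session-variability literature. The symbols are assumptions, not taken from the talk: $\mathbf{m}$ is the UBM mean supervector, $\mathbf{D}\,\mathbf{y}_{s}$ is the speaker-dependent offset for speaker $s$, and $\mathbf{U}\,\mathbf{x}_{(h,s)}$ is the session (or segment) component for recording $h$ of speaker $s$, with $\mathbf{U}$ a low-rank matrix.

$$
\mathbf{m}_{s} = \mathbf{m} + \mathbf{D}\,\mathbf{y}_{s}
$$

$$
\mathbf{m}_{(h,s)} = \mathbf{m} + \mathbf{D}\,\mathbf{y}_{s} + \mathbf{U}\,\mathbf{x}_{(h,s)}
$$

The only difference between the two is the added $\mathbf{U}\,\mathbf{x}_{(h,s)}$ term, which is the part the talk refers to as modelling the session variability.
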
| 0:04:10 | so what about |
|---|
| 0:04:11 | trying factor analysis uh |
|---|
| 0:04:13 | the link uh uh the |
|---|
| 0:04:15 | the single audio files |
|---|
| 0:04:17 | uh |
|---|
| 0:04:19 | uh we have situation for example |
|---|
| 0:04:22 | speaker is |
|---|
| 0:04:23 | speaking and |
|---|
| 0:04:25 | environment of the recording is changing like |
|---|
| 0:04:28 | the speaker is going |
|---|
| 0:04:30 | around the microphone and the distance |
|---|
| 0:04:33 | between the speaker and uh |
|---|
| 0:04:35 | and the microphone is changing |
|---|
| 0:04:37 | uh the factor analysis can be |
|---|
| 0:04:39 | helpful in this case |
|---|
| 0:04:41 | um |
|---|
| 0:04:44 | uh we tried |
|---|
| 0:04:47 | two approaches in this work and |
|---|
| 0:04:50 | the first is localising the subspace U containing the inter segment variability |
|---|
| 0:04:56 | and the second uh is |
|---|
| 0:04:59 | localising the interspeaker variability |
|---|
| 0:05:05 | about the experimental protocol the details uh are the following as a development set i used twenty three audio files |
|---|
| 0:05:13 | from the nist uh rich transcriptions |
|---|
| 0:05:16 | from two thousand four |
|---|
| 0:05:18 | to two thousand six |
|---|
| 0:05:20 | uh |
|---|
| 0:05:22 | it took place in seven different meeting rooms and |
|---|
| 0:05:25 | uh from |
|---|
| 0:05:26 | some statistical data |
|---|
| 0:05:28 | uh the recordings |
|---|
| 0:05:29 | uh have from ten to eighteen minutes |
|---|
| 0:05:32 | containing from four to nine participants |
|---|
| 0:05:36 | and |
|---|
| 0:05:36 | as evaluation set i use the |
|---|
| 0:05:40 | seven audio files from nist uh from the previous year |
|---|
| 0:05:45 | they have from seventeen to twenty seven minutes |
|---|
| 0:05:48 | and from four to seven speakers |
|---|
| 0:05:52 | uh |
|---|
| 0:05:54 | the multiple distant microphones were used here and as a performance uh |
|---|
| 0:05:59 | measure |
|---|
| 0:06:00 | uh i used uh diarisation error rate |
|---|
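
To make the performance measure concrete, here is a minimal frame-level sketch of the diarization error rate: missed speech, false-alarm speech and speaker confusion, divided by the total reference speech. It is a simplification under stated assumptions, not the NIST scoring tool: one label per frame, no overlapping speech, no forgiveness collar, and a brute-force search over speaker mappings that is only practical for toy examples. The function name and the label format are hypothetical.

```python
from itertools import permutations


def diarization_error_rate(reference, hypothesis):
    """Frame-level DER; each frame holds one speaker label or None for non-speech."""
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")

    ref_speakers = sorted({r for r in reference if r is not None})
    hyp_speakers = sorted({h for h in hypothesis if h is not None})
    total_speech = sum(r is not None for r in reference)
    if total_speech == 0:
        raise ValueError("reference contains no speech")

    frames = list(zip(reference, hypothesis))
    missed = sum(r is not None and h is None for r, h in frames)
    false_alarm = sum(r is None and h is not None for r, h in frames)

    # Hypothesis labels are arbitrary, so search for the one-to-one mapping
    # onto reference speakers that minimises confusion. Padding with None
    # lets a hypothesis speaker stay unmapped.
    targets = ref_speakers + [None] * len(hyp_speakers)
    best_confusion = None
    for assignment in permutations(targets, len(hyp_speakers)):
        mapping = dict(zip(hyp_speakers, assignment))
        confusion = sum(
            r is not None and h is not None and mapping[h] != r
            for r, h in frames
        )
        if best_confusion is None or confusion < best_confusion:
            best_confusion = confusion

    return (missed + false_alarm + best_confusion) / total_speech


reference = ["A", "A", "B", "B", None, None]
hypothesis = ["x", "x", "y", "x", None, "y"]
der = diarization_error_rate(reference, hypothesis)
```

In the toy example one frame is a false alarm and one frame is attributed to the wrong speaker, out of four reference speech frames.
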
| 0:06:05 | the factor analysis modelling was applied |
|---|
| 0:06:08 | only |
|---|
| 0:06:09 | in the third step of the speaker diarisation system |
|---|
| 0:06:14 | now the first approach |
|---|
| 0:06:16 | the modelling of |
|---|
| 0:06:18 | interspeaker variability |
|---|
| 0:06:23 | uh the U matrix uh here |
|---|
| 0:06:27 | in |
|---|
| 0:06:27 | in this equation |
|---|
| 0:06:29 | is common to all speakers |
|---|
| 0:06:31 | and the assumptions are uh |
|---|
| 0:06:34 | the main relevant speaker information is located in a low |
|---|
| 0:06:37 | dimensional subspace and the rest |
|---|
| 0:06:40 | uh |
|---|
| 0:06:41 | all the speaker information in the full space |
|---|
| 0:06:45 | and the results are on the next |
|---|
| 0:06:47 | page |
|---|
| 0:06:48 | uh there is uh |
|---|
| 0:06:51 | nothing interesting |
|---|
| 0:06:52 | except |
|---|
| 0:06:53 | one thing |
|---|
| 0:06:54 | it's the difference |
|---|
| 0:06:56 | between |
|---|
| 0:06:57 | these two columns |
|---|
| 0:06:59 | uh |
|---|
| 0:07:00 | what does it mean and the first column |
|---|
| 0:07:03 | uh contains the baseline diarization error rate |
|---|
| 0:07:07 | of |
|---|
| 0:07:07 | this file |
|---|
| 0:07:08 | without application of factor analysis |
|---|
| 0:07:11 | uh the next |
|---|
| 0:07:12 | column contains uh |
|---|
| 0:07:14 | results |
|---|
| 0:07:15 | after |
|---|
| 0:07:16 | application uh factor analysis for segmentation |
|---|
| 0:07:20 | containing |
|---|
| 0:07:21 | the U term |
|---|
| 0:07:23 | and the last without |
|---|
| 0:07:24 | the U term |
|---|
| 0:07:26 | and the difference is |
|---|
| 0:07:28 | big |
|---|
| 0:07:28 | uh in average about ten percent |
|---|
| 0:07:30 | what does it mean it means that the U term |
|---|
| 0:07:34 | can |
|---|
| 0:07:35 | contain some information |
|---|
| 0:07:37 | useful |
|---|
| 0:07:38 | for |
|---|
| 0:07:39 | modelling the |
|---|
| 0:07:40 | speaker |
|---|
| 0:07:41 | uh |
|---|
| 0:07:42 | in this case uh the only only thing |
|---|
| 0:07:45 | uh which is important |
|---|
| 0:07:46 | all the all the results |
|---|
| 0:07:48 | are uh |
|---|
| 0:07:49 | on average worse |
|---|
| 0:07:52 | the second approach is uh the inter segment variability |
|---|
| 0:07:58 | um |
|---|
| 0:08:00 | it's almost the same except uh |
|---|
| 0:08:04 | the the base |
|---|
| 0:08:04 | think that the right but the is |
|---|
| 0:08:07 | uh |
|---|
| 0:08:08 | modelling |
|---|
| 0:08:08 | inter segment |
|---|
| 0:08:10 | so the results uh are |
|---|
| 0:08:13 | this page |
|---|
| 0:08:17 | yeah the baseline |
|---|
| 0:08:19 | diarisation error rate |
|---|
| 0:08:22 | there is uh |
|---|
| 0:08:24 | after |
|---|
| 0:08:24 | application of factor analysis |
|---|
| 0:08:28 | with ordering with you you |
|---|
| 0:08:30 | and here without |
|---|
| 0:08:31 | you you |
|---|
| 0:08:34 | uh what is |
|---|
| 0:08:35 | what is uh interesting here |
|---|
| 0:08:38 | only the fact that uh |
|---|
| 0:08:41 | so speaker information uh |
|---|
| 0:08:44 | present |
|---|
| 0:08:45 | is present in the inter segment component but |
|---|
| 0:08:47 | not significant |
|---|
| 0:08:50 | uh i tried another experiment |
|---|
| 0:08:53 | and it was based uh on filtering |
|---|
| 0:08:57 | um uh of a speech segment |
|---|
| 0:09:00 | in |
|---|
| 0:09:00 | mm kay |
|---|
| 0:09:01 | development set |
|---|
| 0:09:03 | in the first column you can uh see |
|---|
| 0:09:06 | there are |
|---|
| 0:09:07 | results of system uh |
|---|
| 0:09:09 | which uses |
|---|
| 0:09:10 | the U matrix |
|---|
| 0:09:12 | uh estimated on all speech segments of from the the from the development set |
|---|
| 0:09:18 | in the next next column you can see uh |
|---|
| 0:09:20 | results |
|---|
| 0:09:21 | system |
|---|
| 0:09:22 | using uh |
|---|
| 0:09:24 | the U matrix estimated on uh segments |
|---|
| 0:09:27 | longer or equal to |
|---|
| 0:09:29 | one second |
|---|
| 0:09:30 | and so on |
|---|
| 0:09:32 | so seconds five second consequence |
|---|
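
The filtering described here amounts to a minimum-duration threshold on the development segments before the U matrix is estimated. A minimal sketch follows; the `(start, end, speaker)` tuple layout and the function name are hypothetical and chosen only for illustration, and the U matrix estimation itself is not shown.

```python
def filter_segments(segments, min_duration=1.0):
    """Keep only segments lasting at least min_duration seconds.

    segments: iterable of (start_seconds, end_seconds, speaker_label) tuples.
    """
    return [
        (start, end, speaker)
        for start, end, speaker in segments
        if end - start >= min_duration
    ]


# Toy development-set segments (values invented for the example).
segments = [
    (0.0, 0.5, "spk1"),   # 0.5 s, dropped
    (0.5, 2.0, "spk2"),   # 1.5 s, kept
    (2.0, 3.0, "spk1"),   # exactly 1.0 s, kept ("longer or equal to")
    (3.0, 3.25, "spk2"),  # 0.25 s, dropped
]
kept = filter_segments(segments, min_duration=1.0)
```

Raising `min_duration` gives the other columns mentioned in the talk, each using a stricter threshold.
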
| 0:09:33 | uh the most uh interesting i think uh |
|---|
| 0:09:36 | uh |
|---|
| 0:09:37 | this |
|---|
| 0:09:38 | this in this paper is |
|---|
| 0:09:40 | is the uh |
|---|
| 0:09:41 | the big difference |
|---|
| 0:09:43 | in these values |
|---|
| 0:09:44 | uh for this file |
|---|
| 0:09:46 | uh |
|---|
| 0:09:49 | it's uh |
|---|
| 0:09:49 | the original |
|---|
| 0:09:51 | uh diarization error rate |
|---|
| 0:09:53 | for this file was about twenty percent |
|---|
| 0:09:57 | after application uh |
|---|
| 0:09:58 | this modelling and this filtration of |
|---|
| 0:10:01 | uh segments shorter than one second |
|---|
| 0:10:03 | we improved the segmentation |
|---|
| 0:10:05 | by about fifteen point five |
|---|
| 0:10:09 | percent |
|---|
| 0:10:10 | uh |
|---|
| 0:10:13 | well |
|---|
| 0:10:13 | it's interesting |
|---|
| 0:10:15 | and uh |
|---|
| 0:10:16 | we improved |
|---|
| 0:10:17 | this segmentation |
|---|
| 0:10:20 | uh so much |
|---|
| 0:10:21 | we |
|---|
| 0:10:22 | we got from |
|---|
| 0:10:23 | twenty percent error rate to five percent error rate |
|---|
| 0:10:26 | uh |
|---|
| 0:10:27 | what about next |
|---|
| 0:10:28 | uh a resegmentation step using the |
|---|
| 0:10:32 | normal uh standard resegmentation step |
|---|
| 0:10:34 | uh the hypothesis is that uh we can gain |
|---|
| 0:10:38 | another improvement |
|---|
| 0:10:40 | uh with viterbi and map adaptation |
|---|
| 0:10:43 | and |
|---|
| 0:10:44 | we can see here that |
|---|
| 0:10:46 | the hypothesis is |
|---|
| 0:10:47 | it's confirmed because from |
|---|
| 0:10:49 | uh from the well change |
|---|
| 0:10:51 | the segmentation |
|---|
| 0:10:53 | we improve it so but |
|---|
| 0:10:55 | by another one point four percent |
|---|
| 0:10:58 | but this is uh |
|---|
| 0:11:00 | this is important uh |
|---|
| 0:11:02 | and significant only for |
|---|
| 0:11:04 | for this file |
|---|
| 0:11:06 | uh |
|---|
| 0:11:07 | where the segmentation |
|---|
| 0:11:08 | changed a lot |
|---|
| 0:11:14 | oh |
|---|
| 0:11:15 | in general |
|---|
| 0:11:16 | it's not significant |
|---|
| 0:11:19 | these changes |
|---|
| 0:11:21 | uh |
|---|
| 0:11:22 | and the signal segmentation uh |
|---|
| 0:11:26 | was uh just |
|---|
| 0:11:27 | about classical viterbi and |
|---|
| 0:11:29 | map adaptation |
|---|
| 0:11:36 | i would like to summarise |
|---|
| 0:11:37 | this work |
|---|
| 0:11:39 | uh i tested two strategies |
|---|
| 0:11:43 | the |
|---|
| 0:11:44 | interspeaker variability modelling and inter segment |
|---|
| 0:11:48 | variability modelling |
|---|
| 0:11:50 | and |
|---|
| 0:11:50 | uh |
|---|
| 0:11:51 | only the second |
|---|
| 0:11:53 | has uh an improvement |
|---|
| 0:11:56 | uh of of the segmentation |
|---|
| 0:11:58 | but |
|---|
| 0:11:59 | very |
|---|
| 0:12:00 | or |
|---|
| 0:12:02 | uh it can be useful |
|---|
| 0:12:05 | to |
|---|
| 0:12:06 | filter some |
|---|
| 0:12:08 | short |
|---|
| 0:12:09 | uh |
|---|
| 0:12:10 | speech segments |
|---|
| 0:12:11 | in the |
|---|
| 0:12:12 | estimation of the U matrix |
|---|
| 0:12:15 | and it's |
|---|
| 0:12:17 | also useful to use uh another |
|---|
| 0:12:20 | resegmentation step |
|---|
| 0:12:27 | next work uh can be done with uh |
|---|
| 0:12:30 | more training data |
|---|
| 0:12:32 | uh and |
|---|
| 0:12:35 | uh |
|---|
| 0:12:36 | a larger number of speakers when dealing with the |
|---|
| 0:12:39 | interspeaker variability |
|---|
| 0:12:41 | uh |
|---|
| 0:12:43 | regarding the inter segment variability |
|---|
| 0:12:46 | uh it can be interesting |
|---|
| 0:12:49 | when dealing with the multiple distant microphones |
|---|
| 0:12:53 | uh and uh |
|---|
| 0:12:55 | also another |
|---|
| 0:12:57 | test |
|---|
| 0:12:57 | can be done uh |
|---|
| 0:12:59 | with uh |
|---|
| 0:13:01 | the application of factor analysis based uh speaker modelling in the first step |
|---|
| 0:13:06 | of the |
|---|
| 0:13:07 | the speaker diarization system |
|---|
| 0:13:14 | well thank you very much for your attention |
|---|
| 0:13:16 | and |
|---|
| 0:13:16 | if you have any questions |
|---|
| 0:13:26 | question |
|---|
| 0:13:32 | only reported an improvement when actually you selected only the |
|---|
| 0:13:36 | speech segments longer than one second |
|---|
| 0:13:39 | right |
|---|
| 0:13:40 | it means that actually in your segmentation of most of most lots of research |
|---|
| 0:13:44 | and this is your variable files |
|---|
| 0:13:46 | so how was your vad configured are there any |
|---|
| 0:13:50 | limits for the minimum duration of a segment |
|---|
| 0:13:53 | uh sorry i cannot tell uh anything about the vad because i just worked |
|---|
| 0:13:58 | uh with the diarization system as it was |
|---|
| 0:14:01 | uh maybe uh korean if uh not serious |
|---|
| 0:14:22 | uh but uh maybe uh i i didn't understand well uh this uh this uh filtration is made on the |
|---|
| 0:14:28 | development |
|---|
| 0:14:29 | so |
|---|
| 0:14:37 | uh_huh |
|---|
| 0:14:51 | yeah in fact that the united |
|---|
| 0:14:53 | yeah train on the |
|---|
| 0:14:55 | and development it so we have to wait for instance the development set |
|---|
| 0:14:59 | so we can choose |
|---|
| 0:15:00 | and the length of the segment |
|---|
| 0:15:02 | and you try to train |
|---|
| 0:15:06 | yeah but the united estimation yeah |
|---|
| 0:15:11 | yeah |
|---|
| 0:15:11 | oh i have a question |
|---|
| 0:15:14 | so |
|---|
| 0:15:14 | i see there is interspeaker variability and inter segment |
|---|
| 0:15:18 | variability |
|---|
| 0:15:19 | and uh |
|---|
| 0:15:21 | do you |
|---|
| 0:15:22 | so |
|---|
| 0:15:23 | i guess inter segment variability uh reflects the changes |
|---|
| 0:15:27 | speaker |
|---|
| 0:15:28 | is it useful information for |
|---|
| 0:15:30 | or |
|---|
| 0:15:31 | detecting the speaker |
|---|
| 0:15:32 | change |
|---|
| 0:15:35 | so |
|---|
| 0:15:36 | and we expect |
|---|
| 0:15:38 | a two |
|---|
| 0:15:39 | speaker |
|---|
| 0:15:39 | i think they should okay |
|---|
| 0:15:41 | can you do some information but |
|---|
| 0:15:43 | we should keep |
|---|
| 0:15:44 | okay segment and applications compensated and |
|---|
| 0:15:48 | nation |
|---|
| 0:15:49 | well you can line |
|---|
| 0:15:50 | why not |
|---|
| 0:15:51 | in uh in the estimation of the U matrix |
|---|
| 0:15:54 | the uh the vocal development set |
|---|
| 0:15:56 | uh we had the reference |
|---|
| 0:15:58 | and uh the U matrix was estimated um |
|---|
| 0:16:02 | in this case |
|---|
| 0:16:03 | uh |
|---|
| 0:16:06 | for for each speaker |
|---|
| 0:16:09 | uh |
|---|
| 0:16:09 | between uh the segments |
|---|
| 0:16:11 | of |
|---|
| 0:16:12 | of one speaker |
|---|
| 0:16:14 | so it was it was not uh |
|---|
| 0:16:16 | inter segment variability |
|---|
| 0:16:19 | in the way of |
|---|
| 0:16:20 | for uh |
|---|
| 0:16:21 | intel |
|---|
| 0:16:21 | all segments right but the only |
|---|
| 0:16:24 | uh it was inter segment variability of |
|---|
| 0:16:26 | of a certain speaker |
|---|
| 0:16:31 | all speakers soprano testing |
|---|
| 0:16:36 | and then you do the presegmentation |
|---|
| 0:16:38 | using a generative model |
|---|
| 0:16:41 | you can see you mentioned B B segmentation |
|---|
| 0:16:44 | i always |
|---|
| 0:16:45 | process so you have one |
|---|
| 0:16:47 | one night lately |
|---|
| 0:16:50 | and |
|---|
| 0:16:52 | how many rounds |
|---|
| 0:16:53 | right |
|---|
| 0:16:55 | uh how many how many or segmentation |
|---|
| 0:16:58 | uh |
|---|
| 0:16:59 | uh well |
|---|
| 0:17:00 | there is normally there is uh |
|---|
| 0:17:02 | one one uh |
|---|
| 0:17:03 | segmentation and then uh take place and the story segmentation |
|---|
| 0:17:07 | in this case it was a resegmentation using uh factor analysis |
|---|
| 0:17:12 | modelling |
|---|
| 0:17:13 | uh |
|---|
| 0:17:16 | and uh the resegmentation uh was iterated until |
|---|
| 0:17:20 | uh the number of |
|---|
| 0:17:22 | five |
|---|
| 0:17:22 | changes of |
|---|
| 0:17:23 | in the in the segmentation |
|---|
| 0:17:26 | uh was uh |
|---|
| 0:17:28 | less than a certain |
|---|
| 0:17:30 | value |
|---|
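
The stopping rule described in this answer (repeat the resegmentation until the number of changes in the segmentation falls below a certain value) can be sketched as a simple loop. This is an illustration only: `resegment` stands in for one pass of Viterbi decoding plus model adaptation, which is not implemented here, and the names `iterate_resegmentation`, `min_changes` and `max_iterations` are hypothetical.

```python
def iterate_resegmentation(labels, resegment, min_changes=1, max_iterations=50):
    """Repeat resegment() until fewer than min_changes frame labels change.

    labels: list with one speaker label per frame.
    resegment: callable taking a label list and returning a new one
               (placeholder for one decoding and adaptation pass).
    Returns the final labels and the number of passes that were run.
    """
    iterations = 0
    for _ in range(max_iterations):
        new_labels = resegment(labels)
        iterations += 1
        # Count the frames whose label moved during this pass.
        changes = sum(old != new for old, new in zip(labels, new_labels))
        labels = new_labels
        if changes < min_changes:
            break
    return labels, iterations
```

The `max_iterations` cap is an added safeguard, an assumption of this sketch rather than something stated in the talk, so the loop ends even if the segmentation keeps oscillating.
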
| 0:17:33 | one one |
|---|
| 0:17:38 | one per segmentation process |
|---|
| 0:17:40 | with many iterations |
|---|
| 0:17:41 | right |
|---|
| 0:17:42 | which |
|---|
| 0:17:46 | oh |
|---|
| 0:17:47 | slide |
|---|
| 0:17:49 | a class |
|---|
| 0:17:52 | uh |
|---|
| 0:17:54 | i don't know which slide you mean |
|---|
| 0:17:56 | in this uh |
|---|
| 0:17:57 | there are parts of the uh |
|---|
| 0:18:01 | right |
|---|
| 0:18:01 | segmentation |
|---|
| 0:18:02 | yes |
|---|
| 0:18:03 | uh yeah |
|---|
| 0:18:04 | this is the original baseline system |
|---|
| 0:18:07 | and there are two resegmentation uh steps and uh the factor analysis |
|---|
| 0:18:12 | took place after this |
|---|
| 0:18:14 | presegmentation step |
|---|
| 0:18:16 | as the last |
|---|
| 0:18:17 | part of the of the diarization system |
|---|
| 0:18:20 | okay |
|---|
| 0:18:24 | you can can anything |
|---|
| 0:18:25 | in fact that the number education is not speak |
|---|
| 0:18:28 | it depends that understands it changes |
|---|
| 0:18:31 | and giving them a sense |
|---|
| 0:18:32 | so when we an estimated ten |
|---|
| 0:18:35 | no more changes |
|---|
| 0:18:37 | it went a segmentation that a given state we stop |
|---|
| 0:18:42 | thank you |
|---|
| 0:18:44 | no |
|---|
| 0:18:46 | i actually |
|---|
| 0:18:50 | yes |
|---|
| 0:18:56 | uh but you tested so |
|---|
| 0:18:58 | you you only scored the sections of the meetings that did not have overlapping speakers correct |
|---|
| 0:19:05 | uh |
|---|
| 0:19:06 | we tested only the evaluation set uh from the nist |
|---|
| 0:19:10 | so i but there were different ways to score that there was a parameter which determines how much overlapping speech |
|---|
| 0:19:16 | was included |
|---|
| 0:19:18 | uh |
|---|
| 0:19:19 | and your uh error rates |
|---|
| 0:19:22 | are quite low so i assume you |
|---|
| 0:19:24 | did not score the overlapping |
|---|
| 0:19:26 | speakers |
|---|
| 0:19:27 | but that's just an assumption i want to |
|---|
| 0:19:29 | from you |
|---|
| 0:19:32 | well there are rights uh |
|---|
| 0:19:36 | maybe maybe you don't know because you just drama |
|---|
| 0:19:38 | for example the yeah right |
|---|
| 0:19:40 | here are |
|---|
| 0:19:41 | are the global arrays that although the total |
|---|
| 0:19:44 | all rights including curve force are um with |
|---|
| 0:19:46 | speech and the speaker |
|---|
| 0:19:48 | you could change |
|---|
| 0:19:50 | okay i i don't know uh |
|---|
| 0:19:52 | if i |
|---|
| 0:19:53 | and just |
|---|
| 0:19:54 | oh okay |
|---|
| 0:19:55 | and then about this one meeting |
|---|
| 0:19:57 | where you |
|---|
| 0:19:58 | had a significant improvement |
|---|
| 0:20:00 | um |
|---|
| 0:20:02 | i i |
|---|
| 0:20:03 | i remember that on one of the nist meetings |
|---|
| 0:20:06 | there was a |
|---|
| 0:20:07 | much larger number of speakers |
|---|
| 0:20:09 | than |
|---|
| 0:20:10 | in the other meetings |
|---|
| 0:20:12 | and i wonder if that was the one meeting where you saw a gain |
|---|
| 0:20:16 | um so there were many more |
|---|
| 0:20:19 | speaker changes because the number of speakers were actually |
|---|
| 0:20:22 | that's like double the other meetings |
|---|
| 0:20:25 | uh so i wondered if you had actually looked at some statistics of your meetings |
|---|
| 0:20:29 | to see uh if |
|---|
| 0:20:31 | there is some variable like the number |
|---|
| 0:20:33 | of speakers that |
|---|
| 0:20:34 | uh could predict when your method works |
|---|
| 0:20:37 | uh well and when that might make a difference |
|---|
| 0:20:40 | oh well uh i i don't have |
|---|
| 0:20:42 | anyhow |
|---|
| 0:20:44 | oh |
|---|
| 0:20:52 | no information |
|---|
| 0:20:55 | and we we did not |
|---|
| 0:20:57 | it |
|---|
| 0:20:57 | and |
|---|
| 0:20:58 | and is about to the |
|---|
| 0:21:00 | this was it |
|---|
| 0:21:01 | we |
|---|
| 0:21:02 | we know that |
|---|
| 0:21:03 | you say that again |
|---|
| 0:21:05 | yeah sometimes |
|---|
| 0:21:07 | and is not |
|---|
| 0:21:08 | necessary you to to the fact and he's in good |
|---|
| 0:21:11 | if we change |
|---|
| 0:21:13 | and finally implies sense |
|---|
| 0:21:14 | and |
|---|
| 0:21:16 | insinuation of that and uh |
|---|
| 0:21:18 | and this is and we know that we can |
|---|
| 0:21:20 | and this improvement |
|---|
| 0:21:22 | an infected E es work |
|---|
| 0:21:24 | the good ones too |
|---|
| 0:21:26 | and |
|---|
| 0:21:27 | exp tool |
|---|
| 0:21:28 | and don't we all |
|---|
| 0:21:29 | applying thank john and easy C speaker deviation |
|---|
| 0:21:33 | on meetings we had these aladdin |
|---|
| 0:21:35 | speakers most because then |
|---|
| 0:21:36 | implementation |
|---|
| 0:21:38 | and a different connotation |
|---|
| 0:21:41 | it is |
|---|
| 0:21:45 | and that that's the overlap |
|---|
| 0:21:47 | and we didn't |
|---|
| 0:21:49 | scroll and we thought about that |
|---|
| 0:21:51 | and |
|---|
| 0:21:52 | because we we |
|---|
| 0:21:54 | do something |
|---|
| 0:21:55 | oh |
|---|
| 0:21:56 | and to delete |
|---|
| 0:21:57 | overlap |
|---|
| 0:21:58 | and the ones to law school |
|---|