| 0:00:15 | first i will give a quick overview of i-vectors |
|---|
| 0:00:19 | after that i will outline some of the methods for handling the uncertainty of the i-vector estimate caused by the limited duration of recordings |
|---|
| 0:00:34 | then i will describe a simple preprocessing weighting scheme which uses duration information as a measure of i-vector reliability |
|---|
| 0:00:49 | then i will describe some experiments and the results |
|---|
| 0:00:54 | followed by concluding remarks |
|---|
| 0:01:00 | in theory each decision should be made dependent on the amount of data available |
|---|
| 0:01:07 | and the same should hold also in the case of speaker recognition since |
|---|
| 0:01:13 | we usually have recordings of different lengths |
|---|
| 0:01:19 | in practice this is usually not the case mainly due to practical reasons since handling the uncertainty increases the modeling and computational complexity |
|---|
| 0:01:34 | and also the gain in performance can be not so significant especially if the recordings are sufficiently long |
|---|
| 0:01:49 | in the case of the i-vector challenge the i-vectors were extracted from recordings of different lengths |
|---|
| 0:02:00 | and the duration follows a log-normal distribution this suggests that we should see some improvement if the duration information is taken into account |
|---|
| 0:02:18 | an i-vector is defined as a map point estimate of the hidden variable of a factor analysis model and it serves as a compact representation of a speech utterance |
|---|
| 0:02:31 | the posterior covariance encodes the uncertainty of the i-vector estimate which is caused by the limited duration of the recordings |
|---|
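For reference, a sketch of the standard total-variability formulation behind this point (the symbols below are the usual ones from the i-vector literature, not taken from the talk): with zeroth-order Baum-Welch statistics N_c, centered first-order statistics f_c, total-variability matrix T (with per-component blocks T_c), and UBM covariances Σ_c, the posterior of the latent variable w is Gaussian:

```latex
% posterior precision of the latent variable w:
L = I + \sum_c N_c \, T_c^{\top} \Sigma_c^{-1} T_c
% the i-vector is the MAP (posterior mean) point estimate:
\phi = L^{-1} \sum_c T_c^{\top} \Sigma_c^{-1} f_c
% the posterior covariance:
\mathrm{Cov}(w \mid \text{stats}) = L^{-1}
```

Since L grows with the frame counts N_c, longer recordings give a smaller posterior covariance, i.e. a more reliable i-vector point estimate, which is the connection to duration the talk relies on.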
| 0:02:46 | usually the uncertainty is discarded when comparing i-vectors for example in the plda model |
|---|
| 0:02:59 | nevertheless some solutions have been proposed on how to take the uncertainty into account for example plda with uncertainty propagation |
|---|
| 0:03:11 | where an additional noise term is added to the model which explicitly models the duration variability |
|---|
| 0:03:22 | another one is score calibration using duration as a quality measure |
|---|
| 0:03:28 | and yet another is called i-vector scaling where the length normalisation is modified to account for the uncertainty of i-vectors |
|---|
| 0:03:43 | those solutions are not directly applicable or at least not easily applicable in the context of the i-vector challenge |
|---|
| 0:03:53 | since the data needed for reconstructing the posterior covariance is not available |
|---|
| 0:04:01 | and also there is no development data that could be used for optimising the calibration parameters |
|---|
| 0:04:12 | so is there another possibility to use duration information as a measure of i-vector reliability |
|---|
| 0:04:27 | prior to comparison the i-vectors are usually preprocessed |
|---|
| 0:04:37 | among the more common preprocessing methods are pca lda and within class covariance normalization |
|---|
| 0:04:46 | in which the basic step is to calculate the mean and the covariance matrix |
|---|
| 0:04:55 | we implicitly assume in those calculations that all i-vectors are equally reliable |
|---|
| 0:05:07 | so to account for the difference in reliability of i-vectors we proposed a simple weighting scheme |
|---|
| 0:05:21 | in which the contribution of each i-vector is multiplied by its corresponding duration |
|---|
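A minimal numpy sketch of what such a duration-weighted preprocessing step could look like (the function name, the normalization of the weights, and the use of an eigendecomposition are my own assumptions for illustration, not the authors' code):

```python
import numpy as np

def weighted_pca(ivectors, durations, n_components):
    """Duration-weighted PCA: each i-vector's contribution to the
    mean and covariance is multiplied by its recording duration.
    `ivectors` is (N, D), `durations` is (N,) e.g. in seconds."""
    w = durations / durations.sum()             # normalized weights
    mean = w @ ivectors                          # weighted mean, shape (D,)
    centered = ivectors - mean
    cov = (centered * w[:, None]).T @ centered   # weighted covariance
    # project onto the leading eigenvectors of the weighted covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    proj = eigvecs[:, ::-1][:, :n_components]    # largest-eigenvalue first
    return mean, proj

# toy usage with random data standing in for i-vectors
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                   # 100 "i-vectors" of dim 10
d = rng.uniform(5.0, 300.0, size=100)            # hypothetical durations
mean, proj = weighted_pca(X, d, 3)
Y = (X - mean) @ proj                            # projected i-vectors
```

With all durations equal, this reduces exactly to the standard (unweighted) mean and covariance, so the scheme only changes behavior when durations actually differ.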
| 0:05:29 | so to verify the soundness of the proposed idea we implemented a baseline system |
|---|
| 0:05:36 | in which we compared the standard pca with the weighted version of the pca |
|---|
| 0:05:49 | the results showed that the weighted version of pca produced slightly better results than the standard one |
|---|
| 0:06:01 | we also wanted to try within class covariance normalisation |
|---|
| 0:06:08 | but in order to apply within class covariance normalization we need to have labeled data which was not the case in the challenge |
|---|
| 0:06:21 | so we needed to perform unsupervised clustering |
|---|
| 0:06:28 | we experimented with different clustering algorithms but in the end we selected k-means with cosine distance and four thousand clusters |
|---|
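A minimal sketch of k-means with cosine distance, using the standard observation that cosine similarity on length-normalized vectors plays the role of (negative) distance on the unit sphere; the implementation details below are my own illustration, not the authors' setup (their four thousand clusters would be used the same way, the toy demo uses k = 2):

```python
import numpy as np

def cosine_kmeans(X, k, n_iter=20, seed=0):
    """Spherical k-means: normalize data and centroids to unit
    length and assign each point to the centroid with maximal
    cosine similarity."""
    rng = np.random.default_rng(seed)
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    centroids = Xn[rng.choice(len(Xn), size=k, replace=False)]
    labels = np.zeros(len(Xn), dtype=int)
    for _ in range(n_iter):
        labels = np.argmax(Xn @ centroids.T, axis=1)  # max cosine similarity
        for j in range(k):
            members = Xn[labels == j]
            if len(members):                           # keep stale centroid if empty
                c = members.sum(axis=0)
                centroids[j] = c / np.linalg.norm(c)   # renormalize centroid
    return labels

# toy demo: two tight clusters along orthogonal directions
rng = np.random.default_rng(1)
A = rng.normal([5.0, 0.0, 0.0], 0.1, size=(50, 3))
B = rng.normal([0.0, 5.0, 0.0], 0.1, size=(50, 3))
labels = cosine_kmeans(np.vstack([A, B]), k=2)
```

The normalization step is what makes this "cosine" k-means: vector magnitude (which for i-vectors correlates with nuisance factors) no longer influences the assignment.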
| 0:06:42 | unfortunately the results were worse for within class covariance normalization than for pca |
|---|
| 0:06:49 | but at least the weighted version was slightly ahead of the standard one |
|---|
| 0:07:00 | we also tried several different classifiers and the best results were achieved with logistic regression |
|---|
| 0:07:07 | but only after removing the length normalisation from the processing pipeline |
|---|
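For reference, the length normalisation that was removed here is just a projection of each i-vector onto the unit sphere; a minimal sketch (the eps guard against zero vectors is my own addition):

```python
import numpy as np

def length_normalize(X, eps=1e-12):
    """Scale each row to unit Euclidean norm; after this step,
    inner-product scoring is equivalent to cosine scoring."""
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)

X = np.array([[3.0, 4.0], [0.0, 2.0]])
Xn = length_normalize(X)
```

Dropping this step keeps the i-vector magnitudes, which evidently carried information the logistic regression could exploit.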
| 0:07:17 | in that case within class covariance normalisation gave better results than pca |
|---|
| 0:07:27 | and again the weighted version scored better than the standard one |
|---|
| 0:07:35 | we tried to further improve the results by additional fine-tuning |
|---|
| 0:07:42 | so we added the duration as an additional feature of the i-vectors we excluded clusters with a small score |
|---|
| 0:07:52 | we reversed the roles of target and test i-vectors |
|---|
| 0:07:59 | and we fine-tuned the hyper parameters of logistic regression |
|---|
| 0:08:04 | with this fine tuning we were able to improve the previous result a little bit more |
|---|
| 0:08:13 | so this was also our best submitted result |
|---|
| 0:08:20 | to conclude we presented a simple preprocessing weighting scheme which uses duration information as a measure of i-vector reliability |
|---|
| 0:08:33 | we achieved quite reasonable success with clustering in the case of within class covariance normalization |
|---|
| 0:08:42 | but nearly no success with clustering in the case of lda |
|---|
| 0:08:50 | which suggests that lda is more susceptible to labeling errors |
|---|
| 0:08:56 | and as a last remark we found out that length normalization does not help logistic regression |
|---|
| 0:09:03 | thank you |
|---|
| 0:09:21 | okay |
|---|
| 0:09:31 | just empirical results but maybe somebody can comment on that i don't know |
|---|
| 0:09:40 | nicole |
|---|
| 0:09:46 | with on the side |
|---|
| 0:09:47 | but we got about the same results as with logistic regression |
|---|
| 0:10:06 | did you iterate the clustering or did you use just one clustering stage |
|---|
| 0:10:10 | we tried different things also to iterate the clustering but it didn't succeed |
|---|
| 0:10:26 | the cluster size of four thousand was also set experimentally because we didn't get any improvements by changing it |
|---|