| 0:00:13 | Our next talk is going to be given by Daniel Povey of Microsoft Corporation. |
|---|
| 0:00:17 | hello | 
|---|
| 0:00:19 | so | 
|---|
| 0:00:20 | um | 
|---|
| 0:00:23 | Some of the material from this |
|---|
| 0:00:26 | talk is a little bit redundant with the next speaker's, |
|---|
| 0:00:29 | because, uh, |
|---|
| 0:00:31 | the next speaker and I sort of ended up |
|---|
| 0:00:34 | giving talks on similar topics. |
|---|
| 0:00:36 | But I am going to go through the |
|---|
| 0:00:38 | introductory material anyway, because, uh, |
|---|
| 0:00:42 | it's necessary to understand my talk. |
|---|
| 0:00:44 | yeah | 
|---|
| 0:00:45 | I'm kind of assuming that people in this audience may or may not have heard of |
|---|
| 0:00:50 | SGMMs, |
|---|
| 0:00:52 | and will probably benefit from me going through the basics again. |
|---|
| 0:00:56 | so | 
|---|
| 0:00:58 | This is a technique that |
|---|
| 0:00:59 | we introduced fairly recently. |
|---|
| 0:01:01 | It's kind of |
|---|
| 0:01:03 | a factored form of a Gaussian mixture model based system. |
|---|
| 0:01:08 | now | 
|---|
| 0:01:10 | I'm going to get to it in stages, starting from something that everyone knows. |
|---|
| 0:01:14 | Now, first imagine you have |
|---|
| 0:01:16 | a full-covariance |
|---|
| 0:01:18 | system, |
|---|
| 0:01:19 | uh, |
|---|
| 0:01:20 | and I've just written down the equations for that. |
|---|
| 0:01:24 | and | 
|---|
| 0:01:25 | This is just a full-covariance mixture of Gaussians in each state. |
|---|
| 0:01:28 | At the bottom I've just enumerated what the parameters are: |
|---|
| 0:01:32 | the weights, the means, the variances. |
|---|
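To make the baseline concrete, this is the equation I believe was on the slide (notation reconstructed from the talk): each state $j$ has its own weights, means and full covariances,

$$p(\mathbf{x} \mid j) = \sum_{i} w_{ji}\, \mathcal{N}(\mathbf{x};\, \boldsymbol{\mu}_{ji},\, \boldsymbol{\Sigma}_{ji}).$$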
| 0:01:35 | Now, |
|---|
| 0:01:36 | next we just make a very trivial change: we stipulate that |
|---|
| 0:01:40 | the number of Gaussians in each |
|---|
| 0:01:41 | state is the same, |
|---|
| 0:01:44 | and it would be a large number, let's say two thousand. |
|---|
| 0:01:47 | This is obviously an impractical system at this point, |
|---|
| 0:01:50 | but, uh, |
|---|
| 0:01:51 | I'm just making as small a change as possible each time. |
|---|
| 0:01:54 | So there's the same number of Gaussians in each state, and |
|---|
| 0:01:57 | when we list the parameters I'm really just listing the continuous ones, |
|---|
| 0:02:01 | so those are unchanged from before. |
|---|
| 0:02:04 | The next thing we do |
|---|
| 0:02:06 | is we say that the covariances are shared across states |
|---|
| 0:02:10 | but not shared across Gaussians. |
|---|
| 0:02:12 | So the equations don't change much; all that happens is we drop one index from the sigma. I'll just go |
|---|
| 0:02:17 | back; |
|---|
| 0:02:18 | you can see it was Sigma_{ji}, and now we just have Sigma_i. |
|---|
| 0:02:22 | So i is, like, the Gaussian index; it goes from, let's say, one to |
|---|
| 0:02:26 | two thousand, or one to a thousand, or something. |
|---|
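In the same reconstructed notation, the only change from the previous stage is the tied covariance index:

$$p(\mathbf{x} \mid j) = \sum_{i=1}^{I} w_{ji}\, \mathcal{N}(\mathbf{x};\, \boldsymbol{\mu}_{ji},\, \boldsymbol{\Sigma}_{i}), \qquad I \approx 1000 \text{ to } 2000.$$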
| 0:02:29 | now | 
|---|
| 0:02:31 | The next thing we do, |
|---|
| 0:02:32 | or the next stage, is slightly more complicated, and it's the kind of key stage: |
|---|
| 0:02:37 | we restrict the means to a subspace. |
|---|
| 0:02:40 | So |
|---|
| 0:02:40 | the means are now no longer parameters. |
|---|
| 0:02:44 | Yeah, |
|---|
| 0:02:45 | the mean mu_{ji}, so mu_{ji} is a vector, |
|---|
| 0:02:47 | and j is the state, i is the Gaussian index. |
|---|
| 0:02:50 | So I'm saying mu_{ji} is M_i v_j. |
|---|
| 0:02:55 | You can interpret these quantities in various ways, but I'll just say |
|---|
| 0:02:59 | M is a matrix and v is a vector; I don't really give them much interpretation. |
|---|
| 0:03:03 | But each state, |
|---|
| 0:03:05 | each state j, now has a vector v_j |
|---|
| 0:03:08 | of dimension, let's say, forty or fifty, |
|---|
| 0:03:10 | and each Gaussian index i has this matrix M_i. |
|---|
| 0:03:14 | Let's say it might be thirty-nine by forty, or thirty-nine by fifty: |
|---|
| 0:03:18 | a matrix that says |
|---|
| 0:03:19 | how |
|---|
| 0:03:21 | the mean of that state varies when the vector of the... |
|---|
| 0:03:25 | sorry, |
|---|
| 0:03:26 | how the mean of that Gaussian index varies when the vector of that state changes. |
|---|
| 0:03:30 | So |
|---|
| 0:03:32 | what changed here is, we used to have... I'll go back one: |
|---|
| 0:03:37 | we used to have the mu_{ji} down there in the parameter list; now it's |
|---|
| 0:03:41 | v_j and M_i, |
|---|
| 0:03:43 | and of course then mu_{ji} is |
|---|
| 0:03:45 | the product of the two. |
|---|
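As a sketch of this key stage, with the dimensions stated in the talk (feature dimension 39, subspace dimension 40 or 50):

$$\boldsymbol{\mu}_{ji} = \mathbf{M}_i \mathbf{v}_j, \qquad \mathbf{M}_i \in \mathbb{R}^{39 \times 40}, \quad \mathbf{v}_j \in \mathbb{R}^{40}.$$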
| 0:03:48 | Now, that's the most important change, uh, |
|---|
| 0:03:52 | from a regular system, |
|---|
| 0:03:53 | and |
|---|
| 0:03:54 | there are a few more changes. |
|---|
| 0:03:57 | The next thing is that the weights |
|---|
| 0:03:58 | are no longer parameters. |
|---|
| 0:04:00 | The thing is, that would be a lot of weights: |
|---|
| 0:04:02 | I mean, |
|---|
| 0:04:04 | suppose there are a thousand or two thousand Gaussians per state; that's a |
|---|
| 0:04:07 | lot of parameters, and we don't want most of the parameters just to be in the weights, because |
|---|
| 0:04:12 | we've gotten accustomed to the weights being a rather small subset of the parameters. |
|---|
| 0:04:16 | So we say now that the weights, |
|---|
| 0:04:18 | the weights depend on these vectors v. |
|---|
| 0:04:21 | And what we do is make the weights, |
|---|
| 0:04:24 | well, |
|---|
| 0:04:25 | we make the unnormalized log weights a linear function of these v's. |
|---|
| 0:04:29 | So you see on the top the exp of w_i transpose v_j; w_i transpose v_j |
|---|
| 0:04:35 | is a scalar that we can interpret as an unnormalized log weight. |
|---|
| 0:04:39 | All this equation is doing is just normalizing it. |
|---|
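The weight equation, as I've reconstructed it from the slide, is the softmax-style normalization

$$w_{ji} = \frac{\exp(\mathbf{w}_i^{\top} \mathbf{v}_j)}{\sum_{i'=1}^{I} \exp(\mathbf{w}_{i'}^{\top} \mathbf{v}_j)}.$$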
| 0:04:42 | People ask me, so why the log weights, why not just the weights? Well, |
|---|
| 0:04:47 | you can't make the weights themselves depend linearly on the vector, because then |
|---|
| 0:04:52 | it would be hard to force the numbers to be positive. |
|---|
| 0:04:56 | Also, uh, |
|---|
| 0:04:58 | I think the whole optimization problem becomes non-convex if you choose any other formula apart from this, |
|---|
| 0:05:04 | you know, up to scaling and stuff. |
|---|
| 0:05:06 | Okay, so I'll just show you what changed here; |
|---|
| 0:05:09 | I'll go back. |
|---|
| 0:05:11 | The parameters were w_{ji}, v_j, et cetera; |
|---|
| 0:05:14 | now it's |
|---|
| 0:05:15 | w_i, v_j, and so on. |
|---|
| 0:05:18 | Instead of having the weights as parameters, we have these vectors: |
|---|
| 0:05:23 | the vectors w_i, one for each Gaussian index, so there are two thousand of these vectors, or one thousand |
|---|
| 0:05:28 | of these vectors. |
|---|
| 0:05:30 | yeah | 
|---|
| 0:05:31 | The next |
|---|
| 0:05:31 | thing... yeah, the next thing is speaker adaptation... |
|---|
| 0:05:37 | ah, no, |
|---|
| 0:05:37 | not the next thing; the next thing is substates. |
|---|
| 0:05:40 | What |
|---|
| 0:05:42 | we do is we just add another layer of mixture. |
|---|
| 0:05:44 | Now, you know, you can always add another layer of mixture, right? |
|---|
| 0:05:47 | It just happens to help in this particular |
|---|
| 0:05:50 | circumstance, and my intuition is that |
|---|
| 0:05:53 | there might be a particular |
|---|
| 0:05:56 | kind of phonetic state that can be realized in two very distinct ways: |
|---|
| 0:06:00 | like, you might pronounce the 't', or you might not pronounce it, |
|---|
| 0:06:04 | and |
|---|
| 0:06:05 | it just seems more natural to have, like, a mixture of two |
|---|
| 0:06:09 | of these vectors v, one to represent each of those two realizations; |
|---|
| 0:06:14 | otherwise you force the subspace to learn things that it really shouldn't have to learn. |
|---|
| 0:06:19 | So, okay, we've introduced these substates, and I'll just go back |
|---|
| 0:06:24 | and look at the parameters at the bottom. |
|---|
| 0:06:27 | It was w_i, v_j, etc.; now we have |
|---|
| 0:06:29 | c_{jm}, w_i, v_{jm}, and so on. |
|---|
| 0:06:32 | So |
|---|
| 0:06:33 | the new parameters here are the substate mixture weights c_{jm}, |
|---|
| 0:06:37 | and also we've added a new subscript on the v's, so now it's v_{jm}. |
|---|
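Putting the stages together, the model at this point, as I recall it from the published SGMM papers (notation reconstructed), is

$$p(\mathbf{x} \mid j) = \sum_{m=1}^{M_j} c_{jm} \sum_{i=1}^{I} w_{jmi}\, \mathcal{N}(\mathbf{x};\, \mathbf{M}_i \mathbf{v}_{jm},\, \boldsymbol{\Sigma}_i), \qquad w_{jmi} = \frac{\exp(\mathbf{w}_i^{\top} \mathbf{v}_{jm})}{\sum_{i'} \exp(\mathbf{w}_{i'}^{\top} \mathbf{v}_{jm})}.$$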
| 0:06:42 | okay | 
|---|
| 0:06:43 | The next, |
|---|
| 0:06:44 | the next |
|---|
| 0:06:45 | stage is |
|---|
| 0:06:46 | speaker adaptation. |
|---|
| 0:06:48 | Yeah, we can do the normal things like fMLLR, MLLR and so on, |
|---|
| 0:06:52 | but there's a kind of special speaker adaptation that's specific to this model. |
|---|
| 0:06:57 | You see there's this v superscript s, and if I go back one you can see the change; |
|---|
| 0:07:02 | that was... |
|---|
| 0:07:03 | this is the new thing. |
|---|
| 0:07:05 | So |
|---|
| 0:07:06 | what it is, is we introduce a speaker-specific vector v superscript s. |
|---|
| 0:07:11 | We just put the s on top because sometimes we have both kinds of index on certain quantities, and |
|---|
| 0:07:15 | it becomes a mess otherwise. |
|---|
| 0:07:17 | So |
|---|
| 0:07:20 | that v superscript s is the speaker-specific vector that, |
|---|
| 0:07:24 | you know, |
|---|
| 0:07:25 | captures the information about that speaker. |
|---|
| 0:07:28 | So what we do here is we train |
|---|
| 0:07:30 | a kind of speaker subspace, and these N_i quantities tell you how each mean |
|---|
| 0:07:36 | varies with the speaker. |
|---|
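The adapted mean, which is the equation the speaker reads out again later in the talk, is, in the reconstructed notation,

$$\boldsymbol{\mu}_{jmi}^{(s)} = \mathbf{M}_i \mathbf{v}_{jm} + \mathbf{N}_i \mathbf{v}^{(s)}.$$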
| 0:07:38 | Typically the speaker subspace is of a dimension |
|---|
| 0:07:41 | of about forty, |
|---|
| 0:07:42 | the same dimension as the, uh, phonetic one. |
|---|
| 0:07:45 | So you have quite a few parameters to describe the speaker subspace. |
|---|
| 0:07:49 | And |
|---|
| 0:07:50 | to decode, you'd have to |
|---|
| 0:07:52 | do a first-pass decoding, |
|---|
| 0:07:54 | estimate these v superscript s, |
|---|
| 0:07:57 | and, uh, |
|---|
| 0:07:59 | decode again. |
|---|
| 0:08:01 | So we add the parameters N_i, |
|---|
| 0:08:04 | and there's also these v superscript s, but those are speaker-specific, so they're not really part of the model; |
|---|
| 0:08:09 | they're a little bit like |
|---|
| 0:08:10 | an fMLLR transform or something like that. |
|---|
| 0:08:14 | so | 
|---|
| 0:08:16 | I think we've come to the end of describing the SGMM, okay. |
|---|
| 0:08:20 | But, uh, |
|---|
| 0:08:22 | what I've described up to now is stuff that we've already published, |
|---|
| 0:08:25 | and I'll just give the punch line of what we already described, in case you haven't seen it. |
|---|
| 0:08:30 | It does better than a regular GMM-based system; |
|---|
| 0:08:34 | uh, |
|---|
| 0:08:37 | it's better at the ML level, and it's especially better for small datasets; you get |
|---|
| 0:08:42 | about a twenty percent relative improvement |
|---|
| 0:08:45 | if you have a few hours of data, and maybe |
|---|
| 0:08:47 | ten percent |
|---|
| 0:08:48 | when you have tons of data, |
|---|
| 0:08:51 | like a thousand hours. |
|---|
| 0:08:53 | And, uh, |
|---|
| 0:08:54 | the improvement is somewhat less after discriminative training, |
|---|
| 0:08:57 | mainly due to bad interaction with the feature-space discriminative training. |
|---|
| 0:09:03 | I'm just summarizing previous work here. |
|---|
| 0:09:06 | So what this talk is about |
|---|
| 0:09:08 | is kind of fixing an asymmetry in the SGMM. |
|---|
| 0:09:12 | So, |
|---|
| 0:09:14 | let's go back one slide. |
|---|
| 0:09:16 | With the speaker adaptation stuff, you have this |
|---|
| 0:09:20 | M_i v_{jm} plus N_i v^(s); that's a kind of symmetrical equation, because |
|---|
| 0:09:25 | you have these vectors describing the phonetic space |
|---|
| 0:09:29 | and other vectors describing the speaker space, and we add them together. |
|---|
| 0:09:35 | That's nice and symmetric. But if you go down to the |
|---|
| 0:09:38 | equation for the weights, w_{jmi} equals et cetera, |
|---|
| 0:09:41 | we don't do the same thing with the speaker stuff in there. |
|---|
| 0:09:45 | That appears as an asymmetry in the model, because we're saying the weights depend on the |
|---|
| 0:09:49 | phonetic state but not the |
|---|
| 0:09:52 | speaker, and, you know, why shouldn't they depend on the speaker? |
|---|
| 0:09:55 | oh | 
|---|
| 0:09:56 | So what this paper is about is fixing that asymmetry, |
|---|
| 0:10:00 | and, uh, I'll go forward one slide and you'll see how we fixed it. |
|---|
| 0:10:06 | Look at that equation for the weights, the, uh, |
|---|
| 0:10:08 | last-but-one equation. |
|---|
| 0:10:10 | We've added that term for the, uh, |
|---|
| 0:10:13 | speaker, yeah. |
|---|
| 0:10:15 | That... |
|---|
| 0:10:16 | for that fraction, just look at the top, look at the numerator; |
|---|
| 0:10:19 | that's the, uh, unnormalized log weight, |
|---|
| 0:10:22 | well, the inside of the brackets is the unnormalized log weight. |
|---|
| 0:10:25 | So what this is saying is it's a linear function of the |
|---|
| 0:10:29 | phonetic, |
|---|
| 0:10:30 | uh, |
|---|
| 0:10:31 | state, and a linear function of the speaker vector, so it's almost the simplest thing you could do: |
|---|
| 0:10:37 | we just fix the asymmetry. The parameters we've added are these |
|---|
| 0:10:41 | u subscript i, |
|---|
| 0:10:43 | which are a kind of, |
|---|
| 0:10:44 | speaker, uh, |
|---|
| 0:10:45 | the, |
|---|
| 0:10:48 | the thing that tells you how the weights vary with the speaker. |
|---|
| 0:10:51 | It's just the speaker-space analogue of w subscript i. |
|---|
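As best I can reconstruct it, matching the symmetric-SGMM formulation described here, the speaker-dependent weights become

$$w_{jmi}^{(s)} = \frac{\exp\!\left(\mathbf{w}_i^{\top} \mathbf{v}_{jm} + \mathbf{u}_i^{\top} \mathbf{v}^{(s)}\right)}{\sum_{i'=1}^{I} \exp\!\left(\mathbf{w}_{i'}^{\top} \mathbf{v}_{jm} + \mathbf{u}_{i'}^{\top} \mathbf{v}^{(s)}\right)}.$$

A minimal numerical sketch of this computation, with illustrative names and shapes (not the toolkit's actual code):

```python
import numpy as np

def symmetric_sgmm_weights(W, v_jm, U, v_spk):
    """Speaker-dependent weights w_{jmi}^{(s)} for one substate.

    W:     (I, S) array of phonetic weight-projection vectors w_i
    v_jm:  (S,)   substate vector v_{jm}
    U:     (I, T) array of speaker weight-projection vectors u_i
    v_spk: (T,)   speaker vector v^{(s)}
    """
    # Unnormalized log weights: w_i'v_jm + u_i'v^(s), one per Gaussian index i.
    log_w = W @ v_jm + U @ v_spk
    # Normalize with a numerically stable softmax.
    log_w -= log_w.max()
    w = np.exp(log_w)
    return w / w.sum()
```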
| 0:10:56 | So now, |
|---|
| 0:10:57 | it wasn't hard to write down this equation, |
|---|
| 0:11:00 | so, you know, why didn't we do it before? |
|---|
| 0:11:03 | Well, the thing is, |
|---|
| 0:11:05 | uh, |
|---|
| 0:11:07 | you can't just write down an equation for something; you also have to be |
|---|
| 0:11:10 | able to efficiently, uh, evaluate it and, uh, |
|---|
| 0:11:13 | decode with it. |
|---|
| 0:11:15 | Now, |
|---|
| 0:11:16 | if you were to just |
|---|
| 0:11:18 | expand these SGMMs out into big Gaussian mixtures, that would be completely impractical, |
|---|
| 0:11:23 | because, |
|---|
| 0:11:24 | think about it: each state now has two thousand Gaussians or something. |
|---|
| 0:11:29 | So... |
|---|
| 0:11:30 | and they're full covariance; |
|---|
| 0:11:31 | I don't know if I mentioned that, but they are full covariance, |
|---|
| 0:11:35 | so you can't fit that in memory |
|---|
| 0:11:37 | on a normal machine. |
|---|
| 0:11:39 | so uh | 
|---|
| 0:11:42 | but | 
|---|
| 0:11:43 | We previously described ways that you can, uh, |
|---|
| 0:11:46 | efficiently evaluate the likelihoods, but it just wasn't one hundred percent obvious how to extend those methods |
|---|
| 0:11:52 | to the case where the weights depend on the speaker. |
|---|
| 0:11:55 | So that's what this paper is about, |
|---|
| 0:11:57 | and there's a separate tech report that describes the details; |
|---|
| 0:12:00 | it's about how you, |
|---|
| 0:12:03 | how you add in this, uh... |
|---|
| 0:12:05 | it's about how to efficiently evaluate the likelihoods |
|---|
| 0:12:08 | when you symmetrize it. |
|---|
| 0:12:10 | And, uh, |
|---|
| 0:12:12 | I'm not going into the details of that. |
|---|
| 0:12:15 | It's reasonable, though you need a bit more memory. |
|---|
| 0:12:18 | Just because it's necessary for understanding the results, I'll mention |
|---|
| 0:12:23 | that we describe two updates for the u's, |
|---|
| 0:12:27 | sorry, for the u |
|---|
| 0:12:29 | subscript i quantities: |
|---|
| 0:12:31 | an inexact one and an exact one, |
|---|
| 0:12:34 | but the difference really isn't that important; I'm just going to skip over that. |
|---|
| 0:12:39 | uh | 
|---|
| 0:12:40 | So, we have results on CallHome, and, uh, |
|---|
| 0:12:43 | how long do I have, by the way? |
|---|
| 0:12:46 | we hope | 
|---|
| 0:12:47 | Okay. |
|---|
| 0:12:47 | On CallHome and Switchboard. |
|---|
| 0:12:51 | yeah | 
|---|
| 0:12:51 | These are the CallHome results. |
|---|
| 0:12:54 | So, the second line, er... |
|---|
| 0:12:56 | the top line, the result is unadapted; |
|---|
| 0:12:59 | the second line... |
|---|
| 0:13:01 | and the word error rates there... |
|---|
| 0:13:03 | it's a really difficult task; |
|---|
| 0:13:05 | CallHome English doesn't have much training data, and it's messy speech. |
|---|
| 0:13:08 | The second line, |
|---|
| 0:13:10 | with the speaker vectors, that's just the kind of standard SGMM adaptation. |
|---|
| 0:13:15 | The bottom two lines are the new stuff. |
|---|
| 0:13:18 | The difference between the bottom two lines, |
|---|
| 0:13:20 | that difference is not important, so |
|---|
| 0:13:22 | let's focus on the difference between the second and third lines. |
|---|
| 0:13:25 | It's about |
|---|
| 0:13:26 | a one and a half percent absolute improvement, |
|---|
| 0:13:29 | going from forty-five point nine to forty-four point four. |
|---|
| 0:13:32 | So that seems like a very worthwhile improvement from |
|---|
| 0:13:35 | this, uh, symmetrization. |
|---|
| 0:13:38 | uh | 
|---|
| 0:13:39 | So we were pretty pleased about that. |
|---|
| 0:13:41 | Uh, |
|---|
| 0:13:41 | oh yeah, here's the, uh, |
|---|
| 0:13:44 | the same with constrained MLLR. |
|---|
| 0:13:46 | Just like you'd get the best result this way, you can combine the, |
|---|
| 0:13:50 | uh, special form of adaptation with the standard methods. |
|---|
| 0:13:53 | So again we get an improvement. |
|---|
| 0:13:55 | How much is it now... |
|---|
| 0:13:57 | the most improvement we get is about |
|---|
| 0:14:00 | two percent absolute, |
|---|
| 0:14:01 | pretty clear. |
|---|
| 0:14:03 | Unfortunately, it didn't seem to work on Switchboard. |
|---|
| 0:14:06 | So this, |
|---|
| 0:14:08 | this |
|---|
| 0:14:09 | table is a bit busy, but the key lines are the bottom two. |
|---|
| 0:14:12 | The, |
|---|
| 0:14:13 | the second-to-last line is the standard, |
|---|
| 0:14:16 | the standard setup; |
|---|
| 0:14:17 | the bottom one is the symmetrization. |
|---|
| 0:14:19 | We're seeing |
|---|
| 0:14:21 | between zero and zero point two percent |
|---|
| 0:14:24 | improvement, absolute, |
|---|
| 0:14:26 | which was a bit disappointing. |
|---|
| 0:14:28 | We thought maybe it was some interaction with VTLN, so |
|---|
| 0:14:32 | we did the experiment without VTLN, |
|---|
| 0:14:35 | and again we're seeing... |
|---|
| 0:14:37 | we see point one, point five, and point two in different, uh, |
|---|
| 0:14:42 | different configurations, |
|---|
| 0:14:44 | and it's a rather disappointing improvement. |
|---|
| 0:14:47 | uh | 
|---|
| 0:14:49 | So we tried to figure out why it wasn't working; we looked at the likelihoods at various |
|---|
| 0:14:53 | stages of decoding and stuff, and nothing was amiss, |
|---|
| 0:14:56 | nothing was different from the other setup. So |
|---|
| 0:14:59 | at this point we just really don't know why it worked on one setup and not the |
|---|
| 0:15:02 | other, |
|---|
| 0:15:03 | and we suspect the truth is probably somewhere in between, |
|---|
| 0:15:06 | so we're going to do further experiments. |
|---|
| 0:15:10 | uh | 
|---|
| 0:15:11 | Something we should do in future is to see whether... |
|---|
| 0:15:15 | I didn't mention it, but there's this thing called a universal background model involved; it's only used for |
|---|
| 0:15:20 | pre-pruning, |
|---|
| 0:15:21 | and one possibility is that you should train that in a matched way, |
|---|
| 0:15:25 | and that would help, uh, |
|---|
| 0:15:27 | get this stuff to work; it could be that the pre-pruning is stopping this from being effective. |
|---|
| 0:15:31 | That's just one idea. |
|---|
| 0:15:33 | Anyway, |
|---|
| 0:15:34 | the next thing is just a |
|---|
| 0:15:35 | plug for something: |
|---|
| 0:15:37 | we have a toolkit |
|---|
| 0:15:38 | that implements these SGMMs. |
|---|
| 0:15:41 | It's actually a complete speech toolkit, |
|---|
| 0:15:44 | uh, |
|---|
| 0:15:45 | and it's useful independently of the SGMM aspect. |
|---|
| 0:15:49 | You can run these systems with it; we have scripts for that, and, uh, |
|---|
| 0:15:53 | we have a presentation on Friday |
|---|
| 0:15:56 | about that. |
|---|
| 0:15:57 | It's not part of the official program; it'll be in the room here, |
|---|
| 0:16:00 | so if anyone's interested they can come along. |
|---|
| 0:16:04 | So I believe |
|---|
| 0:16:05 | I'm out of time. Thank you very much. |
|---|
| 0:16:12 | We have time for |
|---|
| 0:16:14 | three or four questions. |
|---|
| 0:16:15 | uh | 
|---|
| 0:16:16 | we | 
|---|
| 0:16:16 | yeah | 
|---|
| 0:16:19 | Uh, I also have, uh, a piece of a question: |
|---|
| 0:16:22 | you've changed the GMM into, uh, an SGMM. |
|---|
| 0:16:25 | Right, yeah, well, as we know, the GMM is a general tool to model any |
|---|
| 0:16:30 | uh, |
|---|
| 0:16:31 | distribution, as closely as you wish. |
|---|
| 0:16:34 | Uh, well, when you change to the |
|---|
| 0:16:35 | SGMM, um, |
|---|
| 0:16:37 | how do you know that you can still, uh, model things as generally, that your model is as general? |
|---|
| 0:16:41 | Can it model arbitrary, uh, distributions? |
|---|
| 0:16:46 | I mean, you could increase the number of |
|---|
| 0:16:48 | Gaussians in the UBM |
|---|
| 0:16:50 | and it would be general, but it's really about compressing the number of parameters you have to learn. |
|---|
| 0:16:56 | I mean, with infinite training data it wouldn't be any |
|---|
| 0:17:01 | better than a GMM, |
|---|
| 0:17:03 | but with finite training data it seems to be better. |
|---|
| 0:17:07 | oh yeah yeah yeah | 
|---|
| 0:17:12 | three | 
|---|
| 0:17:14 | yeah | 
|---|
| 0:17:14 | So I'm a little confused, because we... |
|---|
| 0:17:17 | so basically, the... |
|---|
| 0:17:19 | uh, you tie the variances |
|---|
| 0:17:22 | in some funny way, and, hmm, so |
|---|
| 0:17:24 | I can't work out in my mind how many more parameters or fewer parameters you wind up with. |
|---|
| 0:17:30 | You mean in a typical setup? Typically it's a little bit less, |
|---|
| 0:17:32 | but, |
|---|
| 0:17:33 | don't quote me on that, because I haven't checked in our distributed setup, but I feel, I |
|---|
| 0:17:37 | have a feeling it might be a little bit more; but when you have a lot of data it's |
|---|
| 0:17:41 | usually less, once you tune it. |
|---|
| 0:17:45 | Uh-huh. |
|---|
| 0:17:46 | Right. |
|---|
| 0:17:54 | Could the difference between the CallHome and the Switchboard |
|---|
| 0:17:57 | uh, |
|---|
| 0:17:58 | results for the speaker modeling have to do with the amount of data per speaker, and |
|---|
| 0:18:02 | so on? |
|---|
| 0:18:04 | Um, |
|---|
| 0:18:05 | no, I'm not one of these database gurus; I really don't know |
|---|
| 0:18:10 | how, |
|---|
| 0:18:11 | whether that differs. |
|---|
| 0:18:13 | so | 
|---|
| 0:18:14 | Yeah, I'd have to look into that. ... But also, the likelihood |
|---|
| 0:18:19 | computation, for the, uh... |
|---|
| 0:18:21 | when you symmetrize, when you bring in the, uh, |
|---|
| 0:18:25 | the speaker |
|---|
| 0:18:28 | subspace in the weights, |
|---|
| 0:18:30 | does that change a lot? Is it more complicated? |
|---|
| 0:18:33 | Well, it's very slightly more complicated, |
|---|
| 0:18:35 | uh, but |
|---|
| 0:18:36 | it's not significantly harder. |
|---|
| 0:18:38 | There's, like, an extra quantity that you have to precompute, and then, hmm, |
|---|
| 0:18:43 | then at the time when you |
|---|
| 0:18:45 | compute the speaker vector, there's a bunch of inner products that you have to compute, one for each |
|---|
| 0:18:50 | state or something, |
|---|
| 0:18:51 | or for each substate, but no, |
|---|
| 0:18:53 | that doesn't add significantly to the compute; it's just bookkeeping. ... Yeah, and I see |
|---|
| 0:18:58 | that it increases the memory, nearly doubles the memory required |
|---|
| 0:19:01 | for storing the model? |
|---|
| 0:19:03 | You mean in doing the likelihood computation, or in training as well? |
|---|
| 0:19:08 | Oh, it was in storing the model; does the model have that many more weights? |
|---|
| 0:19:12 | Oh, it's not like there are more weights, but |
|---|
| 0:19:15 | there's some quantity that's the same size as the expanded weights that you have to |
|---|
| 0:19:19 | store. Well... |
|---|
| 0:19:21 | yeah | 
|---|
| 0:19:24 | Let's thank the speaker again. |
|---|