0:00:15so i mean i don't have this is all one solely basically let's go back
0:00:19to the they to the first day all the way and what is that you
0:00:24didn't ask what is it whatever it is that you didn't say
0:00:28you sort of each that maybe you said
0:00:31that now is a good opportunity because it happens very often and their questions
0:00:36but of comes out of don't know how fast but then you come home and
0:00:41say all i wish that be so this so alex definitely yes it is one
0:00:46thing to say
0:00:49sorry for saying something but
0:00:55well i'm close the circle of the
0:00:58basically at the beginning of the meeting we
0:01:01learned a lot
0:01:03well at least i learned a lot about what happens in people's brains and so forth
0:01:11i think that something systems prove that we can understand language
0:01:19well i thought that's my
0:01:23but it here the and you're basically tell us all that stuff but have read
0:01:29out from between right after that talk i'm right now is basically you know probably
0:01:34the wrong idea
0:01:36are you buying well whatever you why are we do we need to learn something
0:01:41from those are
0:01:43and how can we do that
0:01:45that there are so real sounds so you have two choices a one choice you
0:01:52have it is about the models used all or something probably existence proof
0:01:58the other is to think about computation
0:02:01and build a model that way
0:02:02models don't have to be the say so there's certainly used to pass
0:02:07and i think learning for hence is a very bright
0:02:12i pay some attention to that
0:02:15we have learned a lot from the auditory system
0:02:18to the extent we understand it and we understand part of it
0:02:23so i think there's two avenues of information to mine one's
0:02:30physiology the other's computation
0:02:33but still there's also the model suggests also so what kind of the evidence we
0:02:39can take advantage of
0:02:41so i have a couple of cents for this question too so you know regarding
0:02:47monday and tuesday the low resource day and we had some zero-resource talks and that
0:02:51sort of work and it turns out when you actually start removing supervision
0:02:57from the system
0:02:58the things that actually allow you to discover units of speech automatically are not
0:03:04the same features that we use for supervised processing it's not the same models that
0:03:08we use for supervised processing
0:03:10and so somehow
0:03:12it is the case that i think well a lot of people might not be
0:03:15interested in sort of that extreme of research because it might not always be practical
0:03:19from one you don't insist that you can sell and so forth i think that
0:03:23style of work where you're forced to sort of connect yourself to something like for
0:03:28example of the and many was talking about with human language acquisition and make something
0:03:32between those things can send you to new classes of models and new representations that
0:03:37you're forced into that i think could eventually be that can be fed back into
0:03:42the supervised case for forgetting
0:03:49i'm glad that you go also back to the early days like monday and tuesday
0:03:53be more skillful of optimism and not for the two for thursday where we all
0:03:59i like of the coach our
0:04:01i like to remind you that indeed i think that a community is diving into
0:04:07new types of models
0:04:09well for better or for worse because of course always when you start some new paradigm everybody
0:04:14chime suddenly turned quickly you may get also discouraged but additionally these nonlinear systems each
0:04:21of these scores neural networks or something they are very good in being able to
0:04:26construct all kinds of
0:04:29architectures highly parallel architectures
0:04:32we have to think about the new a select up think the models the maximum
0:04:36likelihood is gone right and this ordinance along i think there is a plenty of
0:04:41work to do that if i may speak for myself i'm a big believer
0:04:47in highly parallel systems
0:04:50there are many views of speech being provided
0:04:54and then the big issue is how do you pick up the most appropriate view
0:05:00which might be appropriate for the situation so adaptation not by adapting
0:05:06the parameters of the model but by adapting like picking up the right processing stream very
0:05:12much along the lines i was quite impressed
0:05:14what chris while it was telling us that when he added a lot of noise
0:05:18of course very few neurons were good but the ones that were were still
0:05:24very good so essentially my personal i'm speaking for myself now my view is
0:05:29like that system should be highly parallel
0:05:33trained on whatever data are available but not like one global model but
0:05:39many parallel models
0:05:41and as much as possible different and independent models and then the big issue is that
0:05:45you pick up a good one so this is that one direction i'm thinking about
0:05:49i don't know what other people think about it
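The speaker does not say how the "good one" among parallel models would be picked. As a minimal sketch only, one possible criterion is to prefer the stream whose phone posterior is most peaked, that is, has the lowest entropy. The entropy criterion, the stream names and the posterior values below are all assumptions made for illustration; they are not taken from the discussion.

```python
import math


def entropy(posterior):
    """Shannon entropy (in nats) of a discrete posterior distribution."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def pick_stream(stream_posteriors):
    """Return the name of the stream with the lowest-entropy posterior.

    stream_posteriors maps a hypothetical stream name to the posterior
    over classes that the stream produced for the current frame.
    """
    return min(stream_posteriors, key=lambda name: entropy(stream_posteriors[name]))


# Hypothetical posteriors over three phone classes from two streams.
streams = {
    "low_band": [0.40, 0.30, 0.30],   # nearly flat: the stream is unsure
    "high_band": [0.90, 0.05, 0.05],  # peaked: the stream is confident
}
print(pick_stream(streams))
```

A peaked posterior is not the same as a correct one, so a stream that is confidently wrong would fool this rule; it only illustrates selecting a processing stream instead of adapting model parameters.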
0:05:52but i think that there is a whole i think that whole new a whole
0:05:56new area of research is and whole possibility for new paradigms is coming
0:06:02i mean that's what we see all the past few years with the re you
0:06:06re invention or rediscovery of alternatives to gmm models
0:06:14i didn't mean to speak i mean i just wanted to give you some
0:06:18space for thinking what you want to say what you want to ask
0:06:33so i would just like to
0:06:35ask a question about the possible eventual death of the field in the future
0:06:42so it happened i mean i'm not old enough to have seen this but for example
0:06:46for coding it happened that after a strong technology transfer and standards were established
0:06:53the research field
0:06:54it didn't die
0:06:56but it shrank terribly right
0:06:58and this will happen one day with automatic speech recognition
0:07:00we will have some standardized methods and then
0:07:03there won't be that many things to research this is going to happen some
0:07:07day
0:07:08and i was wondering how much time do we have
0:07:12we are already seeing a very strong technology transfer and there's a lot of
0:07:16investment by
0:07:17all the major
0:07:19technology players in the market
0:07:21so are we close to really solving it i don't mean solving semantic context
0:07:26that's not the question
0:07:28but are we close to
0:07:31setting some standards
0:07:32and then it is done
0:07:33because what will we do research on
0:07:35how close are we
0:07:37ten years twenty years because it is about my career right maybe it's a side effect
0:07:48i've life yes for the
0:07:54i think i people that
0:07:57that's good
0:08:00this is average spectral for your funding sources
0:08:06it's a can be all close hope that there is going to do and we
0:08:10will that
0:08:11stick i think i tell my students i still tell them that if they are
0:08:15getting into speech recognition they are safe for life that this was my experience
0:08:24somehow i think
0:08:26comparing speech coding to
0:08:29speech recognition just doesn't fly at all
0:08:32i mean speech coding
0:08:35unless you're going to try for the
0:08:39utopia of three hundred bits per second which then requires synthesis coding
0:08:45there's just no comparison
0:08:48very straightforward and eventually yes
0:08:52standards were set
0:08:54the same
0:08:58could be
0:08:58said about coding of pictures
0:09:01very trivial to code pictures
0:09:04we have mpeg three mpeg four
0:09:06it's all done
0:09:10picture understanding which is very much like
0:09:14is the thing
0:09:16sort of book
0:09:19i do think
0:09:20that
0:09:21the field is very far from that
0:09:26but i think the field
0:09:28will kill itself
0:09:30if it assumes that it has the solutions
0:09:33and then continue
0:09:35to plough through just working the solutions that we have right now
0:09:41all done so one other thing that i would probably
0:09:47like to see happen is
0:09:51rather than sitting around and talking about what's wrong with the field
0:09:56is possibly construct certain experiments
0:10:01that could point
0:10:03to what's going on
0:10:09for example when steve was storing before
0:10:13i was thinking
0:10:15so you have a mismatch in acoustics and you have a mismatch in language
0:10:19try to fix one without the other
0:10:22and see
0:10:23what is the result where it falls
0:10:29so i think it's wonderful i want to remind people john pierce was advising us
0:10:35to design clear experiments with the answers
0:10:40so that a science of speech can grow steadily step by step
0:10:46rather than the rapture for computers and unproven theories
0:10:52i have
0:10:54maybe a couple of observations
0:10:56we talk about neural nets
0:10:58right now as an improvement and i'm sure it's obviously an improvement
0:11:03it actually goes in the opposite direction
0:11:08what we're all advising ourselves to do that is it does nothing about any independence
0:11:13assumption it's just building a better gmm which is the place where you said that
0:11:17wasn't a problem
0:11:18it's not modeling dependence
0:11:19except to the extent that we model longer feature sequences which we tried to do
0:11:25with the gmms also
0:11:28in terms of
0:11:30you know when will we solve it obviously not in five years but
0:11:35that doesn't mean never
0:11:38so it would be nice if we could come up with the right model obviously
0:11:40that would be the best answer
0:11:42i'm not sure that
0:11:45speech coding and image coding i don't believe they were solved by coming up with
0:11:51the right answer i think they were solved by coming up with good enough
0:11:56answers that
0:11:59wouldn't have been practical
0:12:02twenty five years ago because the computing was not enough to
0:12:06implement those solutions but they are now
0:12:09and so those
0:12:11fairly simple fairly brute force
0:12:15expensive methods now are practical and work just well enough
0:12:19so i think speech recognition could go the same way it doesn't you know it
0:12:23could i if we if someone is very smart pick the right answer that's great
0:12:27but if you
0:12:30look at how much we've improved over say the last twenty five to fifty years
0:12:36there's been a big improvement
0:12:40say in twenty five years
0:12:43and if you imagine the improvement from twenty five years ago to now
0:12:49maybe two more times
0:12:51and so this you know grows exponentially so fifty years from now
0:12:55i think we could say with almost absolute certainty
0:13:00speech recognition will be completely solved to all intents and purposes that is it'll work
0:13:06for all the things you want to do it'll work very well it'll be fast
0:13:09it'll be cheap there will be no more research in it
0:13:13because you will have
0:13:16computers with
0:13:18i don't know what the right term is but ten to the ninth
0:13:21memory and computation or you know ten to the fifteenth computation and you'll have modeled
0:13:29all those differences
0:13:31by brute force it still would never work to train on one thing
0:13:38and then test on another but you won't have to you will have trained on everything
0:13:43you know you will have trained on samples of everything so that it just works
0:13:49the doom and gloom doesn't have to work that way it would just be nicer
0:13:52to find a more elegant solution sooner
0:13:55bcmvn this is also positive value there is a just for fast
0:14:00i don't know nine is probably this probably few more data people in this room
0:14:04this is actually a good point there's ten to the nine or so neurons in auditory
0:14:08cortex so that must be ten to the nine
0:14:12ten to the nine ways of solving the problem and maybe it is the
0:14:16right way to go
0:14:19i think there is another aspect that's missing which is a
0:14:23looking at is speech recognition this is a little
0:14:28no acoustic signal and you're model
0:14:31model for
0:14:32i think we need to bring in the context and
0:14:35we are moving towards that
0:14:39feature where the palestinians about the context about your personality
0:14:44but the personalisation all these things should be
0:14:49incorporated into whatever model
0:14:51and that will reduce some of these ambiguities that you have if you are just looking at
0:14:55the acoustics
0:14:56that's another you know feature you know it
0:15:02actually i would also like to continue on what she was telling us that there
0:15:08is not one solution to speech recognition there are many right i mean there are
0:15:12some just like there are many cars and many bicycles and many what side i
0:15:16mean is something solutions we need solution to a problem
0:15:21and of course what we keep thinking about all the time is that we will
0:15:24so you can find peace i think it's okay to find many other so many
0:15:29smaller solutions there is no question in my mind that recognition made enormous progress i mean
0:15:36actually even i use it here and there i mean google voice
0:15:40and this is already quite something so google voice is a good
0:15:44example since we have a over here i mean where the solution came to
0:15:50the point where it's becoming useful just like a car is useful do we
0:15:55all agree that this is not ideal way of
0:15:58moving people from one place to another it works to some extent so i maybe
0:16:03we should also think not only about this solution but about many
0:16:08solutions to
0:16:10i wasn't those say that
0:16:15and this relates to
0:16:18about data
0:16:19one thing we see anything this is that
0:16:23given our models language acoustic models
0:16:27young a particular size
0:16:29with a C V
0:16:32and in that sense what you say about what was also somewhat
0:16:39you were kind of suggesting ensembles of classifiers and rocky suggesting a personalisation their
0:16:44estimate well because
0:16:47we also know that if i build the model just for you
0:16:50an acoustic model just for you a language model just for you it really works
0:16:56maybe it is not the most elegant solution but
0:17:00given enough data and enough context
0:17:02and enough computational resources that works really well
0:17:06and i think we are going to see a lot of work in that direction the
0:17:10price we'll have to pay is that
0:17:12you have to let a whoever's building the recognizer for you what there is no
0:17:16one's or microsoft whatever
0:17:19you have to let them access your data
0:17:22and without that you will have to live with a speaker independent and
0:17:26context independent system which might be good but not as good as it can be
0:17:30or you may also provide the means for the user to modify the technology
0:17:35in such a way that it works best for a given user and a given
0:17:38task right you don't have to rely necessarily on the big brother or whatever
0:17:43for me thanks but if you provided technology
0:17:46which is adaptive just like actually most of the technology which we are
0:17:50using think about the car i mean you know you can drive it fast you
0:17:53can drive it slow you can drive it crazy you can drive it safely and
0:17:57it's a little bit up to you technology basically was provided in such a way
0:18:01that the user can adapt
0:18:03it to his needs i think that so this is
0:18:08one way the other way is we are trying to build one big
0:18:12huge model which will encompass everything i'm more like a
0:18:18believer in many parallel models very much along the lines of human perception in general
0:18:23because wherever you're looking in sensory perception you typically always find many channels each
0:18:31of them looking at the problem in a different way
0:18:34and of course what we have available to us is to pick up the best
0:18:38way at any given time and this is something which we have to and perhaps
0:18:42you know but i don't want to push this direction which i'm thinking about i'd
0:18:45like to
0:18:48my belief is that it just building one solution for everything is maybe not also
0:18:53the best the best way of
0:18:59so i just wanted to say that
0:19:01that the world is a dramatically different place
0:19:05now than it was in nineteen so
0:19:10and that
0:19:11that the constraints
0:19:14that row
0:19:16of the current sort of formalism they don't exist anymore and i think chip you're
0:19:21in shell but says that and i agree that you know if somebody didn't know
0:19:26anything about what the way we do this and they started
0:19:30afresh
0:19:31and thought about it in the current context it would be remarkable
0:19:37that person came up with the formalism that we do have now
0:19:42i think that
0:19:44we should spend more time i don't know we should do i certainly will thinking
0:19:50you know about how to do this in a different way given what we have
0:19:54and what we know about the brain i mean it's remarkable how much
0:19:59more we know about humans
0:20:15just comment concerning the speaker-dependent stuff that you put gets it seems year
0:20:22but it's not really solving the problem i mean you can make a really very good
0:20:26speaker dependent model but then the person i don't know switches the microphone and you
0:20:30are again lost or his call i don't know uses some obscure digital coding which
0:20:34is completely clear for human beings but because of some strange digital artifacts your
0:20:40whole algorithms break again
0:20:41so this is i think this is somehow for the people each i'm i mean
0:20:46to help get business in the i completely speaker-dependent environment
0:20:49and i assume that for the people reach are in the i don't know in
0:20:53the environment which is completely speaker independent it must be kind of the power of
0:20:57these you know because you have a huge amount of the data which a speaker
0:20:59dependent so
0:21:01but it's not really sort of the problem is making the problem we came out
0:21:05of our error rate and everything obviously because you can train to the speaker but
0:21:08it's not really the solution
0:21:10that you're looking for
0:21:12these are just comments and then also somehow my
0:21:15intuition or feeling is that the
0:21:18i just i just know that if i understand what the people are talking about
0:21:22it is easier for me to perform speech recognition
0:21:26so it has to do something with semantics and it has to do
0:21:30something with semantics and with the intelligence and the and
0:21:35i don't know on so we use but this is the C just the kind
0:21:39of intuition
0:21:43i have a comment about the semantics
0:21:46my perception is that
0:21:49in many many groups
0:21:51i mean many companies not so low resource
0:21:55they tend to treat the recognition as a black box
0:21:58and semantic models are built on top of it
0:22:01maybe they do a little bit of accounting like or maybe let's go phonetic matches
0:22:07just in case the recognizer makes a mistake
0:22:09and i
0:22:11and it that's okay to get something up and running but i think that's a
0:22:15stupid mistake
0:22:17that the semantics and the recognition should be closer together
0:22:24i have to say it's difficult to convince some of the people doing
0:22:29semantics that don't have any speech background
0:22:33that things should be done differently but i believe
0:22:36there should be influence
0:22:37back and forth
0:22:51was mentioned that is
0:22:54someone starting fresh
0:22:57start with the approach we do
0:22:59and it probably really true
0:23:01one of you hear it
0:23:04the someone E mailed out so gone into that once is
0:23:08now we apply all the in that station the speaker adaptation or all the compensation
0:23:14development features now neural networks someone have that right
0:23:19it's just not gonna work right out by
0:23:22and you can i
0:23:24compensate for thousands of hours that on in its current a broken
0:23:37the renaissance neural networks so morgan
0:23:47using neural networks in the hybrid formalism because nobody
0:23:55you know
0:23:56was that interested because of all the other things that we're working so well and
0:24:01why would anyone in their right mind want it right
0:24:04but then all of a sudden we're back to you know we're back in this
0:24:08zone where people are doing it so all i'm saying is that the lesson
0:24:11i take from that is
0:24:13you know if you can if you can work in if you can get something
0:24:16that is that is that makes sense and is and that is demonstrated really good
0:24:23on a small problem
0:24:25well then maybe that would be pretty compelling
0:24:28i mean i agree with you though it's a it's the success is pretty are
0:24:33you know if i have it is something that i am i gonna say what
0:24:36we think about this for forty years know exactly
0:24:42we all know thirty six
0:24:44and maybe they are like to do something that we should do dishes designing experiments
0:24:48where we say
0:24:50i will show you on the state-of-the-art systems that my method works a little bit
0:24:55because that is not really very scientific is it i
0:25:00mean a scientific experiment is that you isolate one problem and you sort of
0:25:03try to change the conditions and see if things go up or things go down and in
0:25:09a good well designed experiment if you get worse and you predicted it to be worse
0:25:14given your hypothesis i think you are winning right we almost never
0:25:20report results like that because our belief is that the only way to convince our
0:25:25peers that what you are doing is useful is that
0:25:30you get as low a word error rate as possible on the state-of-the-art systems with the
0:25:35optimal accepted task whatever it is at the moment
0:25:38so designing good experiments again going back it seems seriously to john pierce
0:25:44design clear definite experiments so that science can grow step by step by step
0:25:50i think that we have to learn how to do that and since you mentioned
0:25:54neural networks i want to share with you my personal experience
0:25:58it's different houses here is going to be and he may not even remember
0:26:02but a long time ago when he was a postdoc at icsi he ran an experiment
0:26:07where he had a context independent hmm model context independent phonemes and a
0:26:14neural net model and the neural net model was doing twice as good as the hmm and
0:26:20that convinced me i mean you know that we stuck to neural nets
0:26:23throughout the dark ages on you of neural nets N I partially because we invent
0:26:28have a so but in hmms an lvcsr as but as a partially because i
0:26:32truly believe that because that was an experiment which was very convincing to me if
0:26:36i have a simple a gmm model
0:26:39without any context-dependency to try easy to of course building to do system and context
0:26:44the i mean context independent hmm model which was the only way which we between
0:26:49you have to be noted at a time
0:26:51and a neural net is doing twice as good as the hmm why wouldn't i
0:26:56stick to this neural net like model i'm glad that we did
0:27:00i don't know steve if you remember this experiment i say good but i think
0:27:03it actually got a piece even in transactions eventually right
0:27:10you know one other way you can get yourself out of a local
0:27:13optimum is to change the evaluation criteria right and i think that's i
0:27:19mean in part what mary's than what the babel program you know has keyword search as
0:27:22the task and atwv well extracted word error rate it's not always perfect and i
0:27:26think another thing that
0:27:28people it seems to me really ought to be reporting when you
0:27:32report a word error rate is not just the mean word error rate but the
0:27:35variance across the utterances because you can have a five percent word error rate but
0:27:39if a quarter of your utterances are essentially you know eighty percent word error rate
0:27:43which can happen then you know that's a good way to start figuring out how
0:27:47to get your
0:27:48technology a little more reliable
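The suggestion to report the spread of word error rate across utterances, and not only the mean, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the 0.5 threshold for a "bad" utterance and the toy utterances are all made up here, and the mean of per-utterance WER is not the same quantity as the usual corpus-level WER (total errors divided by total reference words).

```python
from statistics import mean, pvariance


def word_errors(ref, hyp):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def utterance_wers(pairs):
    """Per-utterance WER for (reference, hypothesis) string pairs."""
    wers = []
    for ref, hyp in pairs:
        ref_words, hyp_words = ref.split(), hyp.split()
        wers.append(word_errors(ref_words, hyp_words) / max(len(ref_words), 1))
    return wers


def wer_report(pairs, bad_threshold=0.5):
    """Mean, variance, and fraction of utterances at or above bad_threshold."""
    wers = utterance_wers(pairs)
    return {
        "mean": mean(wers),
        "variance": pvariance(wers),
        "fraction_bad": sum(w >= bad_threshold for w in wers) / len(wers),
    }


# Toy data: three perfect utterances and one that mostly fails.
pairs = [
    ("a b c d", "a b c d"),
    ("a b c d", "a b c d"),
    ("a b c d", "a b c d"),
    ("a b c d", "x y z d"),
]
print(wer_report(pairs))
```

On the toy data the mean looks moderate while one utterance in four is mostly wrong, which is the kind of behaviour the variance and the fraction of bad utterances are meant to expose.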
0:27:51i was hoping you would have a comment
0:27:54i feel
0:27:56i feel obligated to
0:28:00talk about ancient history since i'm getting a little older now
0:28:05i remember when hmms started and we were certainly not the first to use them
0:28:11we were sort of in the middle of that
0:28:13of that previous
0:28:15a revolution
0:28:17the big criticism there were two big criticisms of hmms
0:28:22relative to the previous method the previous method was just write the rules because we
0:28:26all know about speech and say how it works and those systems which i wrote
0:28:31systems like that back in the early seventies because i was a late adopter of
0:28:37those systems were very simple easy to understand extremely fast
0:28:44needed no training data
0:28:46that sounds nice right
0:28:49and they could do very well on simple problems without training data and
0:28:54the hmms the government argued and other people argued and sometimes we argued hmms
0:28:59were too complicated require too much storage too much training too much memory and would
0:29:06never be practical
0:29:09well obviously things changed and it wasn't only computing power that was a big factor
0:29:15but it was also learning how to make it more efficient and we did a
0:29:22combination of all of those not being
0:29:25so rigid as to say we have to do it with zero data and
0:29:29just what i learned in my acoustic phonetics class
0:29:33we could use data
0:29:34more data always helped
0:29:36learning to do speaker adaptation rather than speaker dependent models
0:29:42okay neural nets
0:29:44neural nets worked on simple problems but not on more complicated problems
0:29:50and what was needed i'd say the reason it works now is because we can
0:29:55now do you know two three years ago the things that were working were
0:29:59requiring two months of computation which is just you know unacceptable completely unacceptable some bold
0:30:06people did that that's great and then they figured out how to get better computers
0:30:12all of this argues that each revolution which happens every twenty five years
0:30:20is the realisation that all of the intelligent things that we thought we knew
0:30:27ken stevens would tell us what happens with formant frequencies and i learned all those
0:30:31things all of those were not the way to go the real understanding was not
0:30:36the way to go which bothered us because we'd like to think about
0:30:42we like to think about you know phonemes and things like that
0:30:48but we know that phonemes are abstractions
0:30:51we know that formants are an oversimplification
0:30:54everything that we learn is an oversimplification and computers are just simply more powerful than
0:31:00we are
0:31:02than anything we can write not more powerful than the brain but
0:31:06they're more powerful than anything that we can write in a program
0:31:10so i think
0:31:12that would argue against
0:31:16the i i'm not i'm not saying that you shouldn't keep trying to find the
0:31:21right answer but i think history has told us that the right answer is think
0:31:26about more efficient ways
0:31:29you know computing will increase it has increased by a factor of a thousand in the
0:31:33last twenty five years both in memory and storage and it will increase by a
0:31:37factor of a thousand every twenty five years forever
0:31:41and that's a big number in fifty years
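The arithmetic behind that claim can be checked with a short sketch. It assumes a steady exponential rate, which is the speaker's premise and not an established fact; the function names are illustrative only. The eighteen-month doubling period used for comparison is the figure quoted later in this discussion.

```python
import math


def growth_factor(factor_per_period, period_years, horizon_years):
    """Total growth over horizon_years at a steady exponential rate."""
    return factor_per_period ** (horizon_years / period_years)


def doubling_time(factor_per_period, period_years):
    """Years needed to double, given the growth factor per period."""
    return period_years * math.log(2) / math.log(factor_per_period)


# A factor of a thousand every twenty five years, compounded over fifty years.
fifty_year_factor = growth_factor(1000, 25, 50)

# The doubling time implied by a thousandfold increase in twenty five years.
implied_doubling_years = doubling_time(1000, 25)

# Doubling every eighteen months, compounded over twenty five years.
eighteen_month_factor = growth_factor(2, 1.5, 25)

print(fifty_year_factor, implied_doubling_years, eighteen_month_factor)
```

Under these assumptions a thousandfold increase per twenty five years compounds to a factor of a million in fifty years and corresponds to doubling roughly every two and a half years, which is slower than doubling every eighteen months.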
0:31:46but at the same time we can think about algorithms that are a thousand times
0:31:51more efficient
0:31:52that has happened and it will happen
0:31:57google you know collects lots of data other people can collect lots of data
0:32:01data i think it will happen that we will have corpora that include the speech
0:32:06of millions of people from
0:32:09hundreds of languages in hundreds of environments
0:32:15and if you just imagine that it was let's just posit that it was
0:32:21simple and easy to collect millions of hours from all these environments and memorise all
0:32:26of it and learn what to do with it and compute it store it all
0:32:29in something that fits in your you know in the chip that's embedded in your
0:32:34in your hand or something or in your you in your head
0:32:39well then it just works you don't know why or how it works but it
0:32:44so i
0:32:46while i have the same desire to understand
0:32:53intellectually what's going on i would bet almost anything that that will be the solution
0:32:58that eventually works
0:33:04so i'd like to take the other side
0:33:07and the other side is if you look at the history of science
0:33:10what's happened is
0:33:11our truly
0:33:13stupendous advances have come from understanding where our
0:33:18current models don't work
0:33:20it's not
0:33:21that we shouldn't try to push models
0:33:23but the thing that you're describing
0:33:27is a kind of engineering while true understanding comes from looking at the places where our
0:33:33current models fail
0:33:35and all of the things that we've been doing for the past twenty years are
0:33:40for the next
0:33:42and we should be paying attention to where we fail
0:33:45and that's where we're gonna find the success
0:33:49so
0:33:51i want to add to it a little bit
0:33:54it seems like this i think that i like which we always think
0:34:01the old story is if you take
0:34:04an infinite number of monkeys and give them
0:34:07an infinite number of typewriters eventually they will write shakespeare
0:34:11and i think that's what you're suggesting
0:34:14you have a few problems number one
0:34:18moore's law
0:34:20pretty much came to an end
0:34:23and that industry is facing the same problem unless there is a dramatic
0:34:28technological shift
0:34:30you're not going to get
0:34:33the kind of doubling that we've seen every eighteen months
0:34:37in the future
0:34:38basically quantum mechanics eventually gets in your way
0:34:43the line widths are so narrow now that there are not too many atoms or
0:34:48to allow for them to continue to be
0:34:53somebody else said something about
0:34:56well what would happen if people started
0:35:00doing this research all over again would they find the same solution
0:35:05i'm reading now a marvellous book called design in nature which tries to explain evolution
0:35:12not just of humans but rivers and everything else in terms of
0:35:18physical laws
0:35:19i highly suggest reading it it's very entertaining but basically
0:35:24and then going back to the coding i think when the coding was done
0:35:28it really was fundamental in the sense that we understood
0:35:33that pitch and spectrum were the essence so for example the coding that works on
0:35:40your cell phone which is really meant to code speech if this is like in the
0:35:45background it totally distorts it because it really is adapted to the speech signal
0:35:51so it wasn't just a random brute force process it really depended on first lpc then
0:35:59coding the residual and all of that and that's why we have
0:36:03such good coders and i think
0:36:07the theory behind that was of course much more trivial than it is in
0:36:13so i do think that
0:36:15we need to continue the work that we're doing but on the other hand
0:36:21look for some paradigm shifts that would be more than just increasing
0:36:27the stochastic ability by introducing neural nets and
0:36:34from where i sit a thousand miles away neural nets essentially are a generalization
0:36:40of hmm they're both stochastic models it's just that in hmm you have essentially a
0:36:47single layer
0:37:00so i think the point about how much data we need to solve the
0:37:03problem by brute force comes down also to the question of
0:37:07artificial intelligence right
0:37:08so continuing with these doomsday scenarios one even scarier is that one day we're
0:37:14going to get to a singularity right
0:37:17and so when this happens at that moment
0:37:21we're going to lose control of abstraction right machines are going to be better than
0:37:25us at creating their own abstractions so all this prior knowledge we want to
0:37:30put into our models
0:37:31is going to be our way of seeing things but machines are going to have
0:37:35their way of seeing things
0:37:36and when there are discussions saying
0:37:39we have to look at the problem and think like humans
0:37:42i think well
0:37:43it is already happening that machines create their own abstractions and they are
0:37:47not intuitive to us but since they are going to do better than
0:37:50us in the long term we might be better off just thinking you
0:37:54know how does the machine think about it not how i like to think
0:37:57about this
0:37:58how i can express the problem okay a generative model that is intuitive
0:38:02to me
0:38:03maybe it should be intuitive to the machine
0:38:05or to the hardware right and deep neural networks
0:38:08to some extent
0:38:11are doing this we are very far away from that singularity right but when we
0:38:16reach that so maybe we'll be better off thinking
0:38:20and i
0:38:33that they are really always looking in the light and basically after fifty years of
0:38:38artificial intelligence we have essentially developed
0:38:42tremendous methods for optimization and classification there is very little work on inference and logic
0:38:50so i'm very glad the field is alive and well as i can see
0:38:55from this discussion it really reminds me of which it reminded us that for one
0:39:02of the first asru workshops and i will also remember that even
0:39:07in my introduction
0:39:08where people were discussing fighting and it was always the desire to move the field further
0:39:15and i'm very happy that i think that we succeeded to a large extent in
0:39:19this asru too so let's just keep it going i think otherwise i will
0:39:24i will pass the microphone to one zap who has a
0:39:28a sound
0:39:29since to say about is it is it the time for post the room or
0:39:33basically i have just one comment on this discussion i think
0:39:38what we were discussing with the data that models the adequacy of models monitored by
0:39:43i think well it turned a little bit speech centric
0:39:47so a little bit too selfish i find so i think we forgot about the
0:39:52users of our technologies because i have the impression
0:39:55that well rarely people would just ultimately use the output of asr and
0:40:00say this is the output and here it finishes most of the time it is
0:40:04just some intermediate product that would be further used by someone so actually
0:40:08i like the way that the better what so speaking about that the well for
0:40:13you would be the wer is not the ultimate metric but it is the click through
0:40:16rate for a call center it might be the customer satisfaction so
0:40:21they have measures for it
0:40:23for a government agency it might be the number of caught
0:40:27bad guys
0:40:28and so on and so on so i think actually there is still quite some
0:40:31work to do in propagating these target metrics
0:40:34back to our field i don't know if there was like sufficient work
0:40:38on this maybe they are not that interested
0:40:41in atwv or wer and stuff like this they just need to get
0:40:46their work done
0:40:51okay so we cook is sorry i didn't i didn't mean that the
0:40:55find technical common and in the i did so no
0:41:01no comments on this