0:00:15that's right tree full column and weakness migrated ones are introduced
0:00:22we use word from a distance from time spectrum modeling one recognition
0:00:31and she's also of interest can
0:00:33that's what they should trust
0:00:36a huge
0:00:40you see over you know trying to you is rover bachelor's and master's
0:00:47operations research and industrial engineering
0:00:51no you can do not one which passes spoken by what sort of quite a
0:00:58long time and the
0:01:00i'm happy to be able to introduce are also your colleague of solution is to
0:01:07open laboratories with really and to mention risk
0:01:11so much closer to speak about interpreting spoken referring expressions empirical studies
0:01:18right
0:01:19and have thank you
0:01:22good morning
0:01:25and thanks for having me here
0:01:29i will be don't know how down there
0:01:32challenges that arise in interpreting spoken referring expressions in physical settings
0:01:39i will be grabbing the presentation in my own icsi the system but they don't
0:01:47and yesterday to where some challenges mentioned already so why are we all of the
0:01:54end of some of my
0:01:59so
0:02:00this is the dream
0:02:03well, it was the dream in nineteen sixty two
0:02:07and for those of you who remember the jetsons
0:02:12and the dream was okay may example there
0:02:18there we have to be these days
0:02:20he's actually better than the green
0:02:25actually because the woman in presence of
0:02:30and i don't know if adding more actually achieve the conversational capabilities that we want
0:02:36to but i
0:02:38if move
0:02:39like every are it will be achieved
0:02:44so
0:02:45one of the challenges is
0:02:47so and that's a little
0:02:50and i do anything but for their share that computers the robot or something think
0:02:56be reasoned say that on the code rate of the but they have like resampling
0:03:04and the message result may still day
0:03:08it because if you are in there is a reasonable in there is anything k
0:03:13engine their appropriate for us
0:03:16and what exactly trust probably just
0:03:19you know when to what we need and you know not
0:03:26in each okay
0:03:28so you have different one interaction is in it that
0:03:34so how this is a fixed
0:03:36that i got challenges of first of all evaluation
0:03:42we might be able to provide policies and sorted they actually
0:03:46we thank you challenge
0:03:49i read a novel
0:03:51we don't trust
0:03:55in addition from a game theoretic point of view
0:04:00these are my
0:04:02five favourite challenges
0:04:05in addition of questions yesterday
0:04:08so we need to be able to deal with perceptual complexity
0:04:13and i will illustrate these challenges shortly
0:04:19we need to be able to deal with linguistic phenomena such as signal addressee and
0:04:24you would be
0:04:26then there is inaccuracy: it is not just asr error
0:04:30but also vision error
0:04:35several papers yesterday discussed adaptation
0:04:40and finally we need to integrate directly probably the i-th knowing about something may help
0:04:47you figure
0:04:48i
0:04:50so noticeable for perceptual complexity
0:04:54so well i
0:04:58so
0:04:59i see
0:05:02by the way this is that one and one female prime minister
0:05:06we have ueller
0:05:08from the by
0:05:10handy
0:05:11that is the difference in the training right flowers and the right
0:05:17but the lexical when you talk about three vowels is actually more security
0:05:23so that has to be that we that's where
0:05:28what are talking about i
0:05:32we can talk about a large a the small a
0:05:37there are a factor in smaller than this more bass
0:05:41so sizes because either in context
0:05:48no
0:05:49in addition we have topological relations, which are spatial relations
0:05:56well carolina
0:05:59so in this example the oranges
0:06:03and the ball
0:06:05and or infeasible
0:06:08no even day
0:06:11okay on the left the position
0:06:14the one
0:06:16or just one
0:06:20no okay
0:06:21in the
0:06:22in the okay
0:06:24the orange the scale in the bowl
0:06:28but in the okay
0:06:31on this i
0:06:32the orders is null
0:06:35thank
0:06:38a
0:06:40you want to say the orange is in the bowl even though it's not
0:06:45and the explanation the psychological explanation is that is related to one for
0:06:52if you move the bowl you would move the orange
0:06:56but you know in a geometric calculation the orange is not in the bowl
0:07:05on
0:07:06so in this wow
0:07:09i is very clear global or and here the plan on the wall but
0:07:15horizontally on the war ok
0:07:17a picture
0:07:22now we also have projective relations
0:07:25which indicate a particular direction from a landmark
0:07:30so we have a dc you're still far from being too
0:07:34and the last but back to the right of the day
0:07:39we try to see you also directly
0:07:44so it's another
0:07:46i tend to congregate
0:07:49okay that can referring expressions
0:07:53now from the point of view of linguistic phenomena
0:07:56we have enough data c
0:07:59i mean i
0:08:01well they want a thread and the reward is more
0:08:05it was to do it sort of teen
0:08:08we have on you know anybody will be in a
0:08:13additional with
0:08:15in the to the problem that prepositional phrases
0:08:21so we have
0:08:24the
0:08:24a few e
0:08:27because we don't know if the back to the lack of the side of you
0:08:32know the plan the lamb
0:08:33but not as shown in our case we have
0:08:38which more
0:08:41i do you get it will
0:08:43even if you identify all the possible and you need at the end of the
0:08:47day it doesn't matter because there is only one flower however
0:08:52this is not the case in this example
0:08:56well
0:08:57in the case
0:09:02it's the table that's near the lack what is your the flat or near that
0:09:06this is to be
0:09:09and yes people do that
0:09:14asr error or out-of-vocabulary words
0:09:18so all of these are
0:09:20somewhat manufactured examples
0:09:24it is not entirely vol all the flower on the table
0:09:30it is that
0:09:30that would be maxent
0:09:34you can
0:09:35something that we on the table and this happens when people who are usually and
0:09:41the main
0:09:42one worked out of it can even make one or are often and all before
0:09:48the user can be added there is a get because a status but no will
0:09:53not come up before right
0:09:57but this is just to illustrate the sort of affection from
0:10:01at this time ever saw can result in our vocabulary word
0:10:07and of course again if fusion errors
0:10:10the
0:10:11make the situation even
0:10:14so what we want to do
0:10:18we have a framework for spoken language understanding for these phenomena
0:10:26hey
0:10:27this is the store in we aim to handle the picture will or
0:10:33g is the average since upon this is due to the left of the table
0:10:38then we have that are also we have side scott are an example of what
0:10:43little
0:10:44and then it very precise description prepositional phrase
0:10:52so what we want to talk about
0:10:55and a few slides and one of about this interpretation process each of you know
0:11:02and then i believe that our approach
0:11:07then we describe
0:11:09the results were right now response generation can have a chart
0:11:17so this is the set of problems small
0:11:20you to anybody of the speech recognizer
0:11:24then some syntactic analyses in
0:11:27then you may going to show my or my
0:11:32so
0:11:34the speech way speech recognizers such as we will now
0:11:38in my o of such errors
0:11:42these ones
0:11:44you can always speech recognizers are really bad mode
0:11:49it
0:11:51after the syntactic and i is the
0:11:54but also lengthening and live apart
0:11:59to produce
0:12:00but
0:12:02and then you one semantics and but i
0:12:06so if you do we in two stages of semantic interpretation for the robot
0:12:16what i e
0:12:19again every that about on the table again
0:12:23doors the mappings are here the relation my and or
0:12:29and that's prepended is wider rc
0:12:32and we have label a cop not they're not in the table shows for this
0:12:38particular scene
0:12:40there are not be
0:12:43i didn't you all table one
0:12:47so this is an interpretation that is grounded e how we have
0:12:53so what if we
0:12:58so
0:12:59the first we consider this model that i just described
0:13:08well
0:13:11okay
0:13:12like the standard role in
0:13:15we found it was insufficient
0:13:18so we consider alternative interpretations
0:13:23why everyone provide a system for five in a just one used to be the
0:13:28base
0:13:30so the little amount stage process where stages of my has not the patient
0:13:37the addressee we don't want to start local maxima might not be what appears to
0:13:43be a based interface
0:13:45so we have a stochastic optimization process where we provide security different stages
0:13:53okay we want to right
0:13:56the different interpretations so we need somebody ways to make their problem
0:14:01at me about being used only the recognition is speakers the
0:14:08so this is illustrated our approach
0:14:13the first thing we do you and you like this waterfall roles what we call
0:14:18we so we have some of the presentation
0:14:23and then we
0:14:26products i we i
0:14:29we don't they should also try
0:14:32we different stages probabilistically in we can continue and you see
0:14:42it's not null and of my there
0:14:45that is one and one
0:14:49so i don't completion officer and i
0:14:53we assert that looks like
0:14:59now we one o is estimated probably these
0:15:04all their relations
0:15:10and
0:15:12we just apply bayes rule
0:15:15sure if you basically with a given set my impression that this implies that all
0:15:23day
0:15:24no context can be anything i story
0:15:28and i don't history i mean at the moment is the rule more data
0:15:35and
0:15:36we need like to ask for my i don't know so i want to make
0:15:40more complicated but
0:15:42imagine that are
0:15:44think that problem is formulated from i know
0:15:48so all then it is worth this problem
0:15:54the first one directly from the speech recognizer scores we use probabilities lose your number
0:16:01between zero and one
0:16:05parser generates parsers are real users probably e
0:16:10here
0:16:11we favour or simple interpretation sell the urinal the better
0:16:17and
0:16:19this is the more there are what we get the problem
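A minimal sketch of how per-stage probabilities like the ones described here (speech recognizer, parser, semantic match) could be combined to rank candidate interpretations. The stage names, the example numbers, and the assumption that the stages can simply be multiplied are illustrative only and are not taken from the talk.

```python
from math import prod

def interpretation_score(stage_probs):
    """Multiply the per-stage probabilities of one candidate interpretation.

    Assumption: stages are treated as conditionally independent, so the
    joint score is a plain product. The speaker's actual model may differ.
    """
    return prod(stage_probs.values())

def rank_interpretations(candidates):
    """Return candidates sorted from most to least probable."""
    return sorted(
        candidates,
        key=lambda c: interpretation_score(c["probs"]),
        reverse=True,
    )

# Hypothetical candidates for one spoken description.
candidates = [
    {"label": "plate on table1",
     "probs": {"asr": 0.7, "parse": 0.5, "semantic": 0.2}},
    {"label": "plant on table1",
     "probs": {"asr": 0.6, "parse": 0.5, "semantic": 0.9}},
]

ranked = rank_interpretations(candidates)
for c in ranked:
    print(c["label"], round(interpretation_score(c["probs"]), 3))
```

In this toy case the second candidate wins even though its recognizer score is lower, because the semantic stage fits the scene much better.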
0:16:26so let's illustrate this so what we have this argument of j o
0:16:33this is a crime and what we want you know that
0:16:37is how well each of the prime i really am i and my
0:16:41the corresponding to my
0:16:44so in the first one
0:16:46we have a problem
0:16:49that it
0:16:50you will designate got three by that are not by the colour blue
0:16:56then it is
0:16:58well that's that relation location or could be designated by
0:17:02the provisional
0:17:04and whether or not goal table one
0:17:06that
0:17:09in addition
0:17:12one who assigned a probably be i mean on the well
0:17:18wow so we can see the models can you on the world and everybody these
0:17:24buttons them on kind of work
0:17:28over the table to be than the problem is
0:17:34shell
0:17:35just a continuation of the problem, but we make some simplifying assumptions
0:17:41so
0:17:42the remote will eat corpus to the user and able to refer to
0:17:47it does and of are more or fess okay why this or something that all
0:17:52and it really ambitious
0:17:55and he thought would have a robot and the mobile
0:17:58be able to walk around the room and both
0:18:02and we won one whole the role of all you see a actions that the
0:18:07we you of the time i want to get a better
0:18:12so that's why we make this assumption
0:18:16in addition each object is
0:18:19in a more label
0:18:22and then his sound
0:18:24the next life and deletions will assume that each object region
0:18:30so it may be circumscribed by a block each object is a single and
0:18:36but we have another and that's no way to from the speakers in y because
0:18:43if an object is able to it
0:18:47the problem the speaker is referred to we explore the and or not
0:18:54so we calculate is probably e
0:18:58so this is all technology channel we got a doing the learning
0:19:05will improve
0:19:07so you
0:19:10the lexical similarity was calculated using a wordnet similarity function
0:19:18that are similar to what is calculated using a particular function you one i
0:19:26and
0:19:27exactly about ten percent you system or changing current system origins in
0:19:34similar
0:19:37so long as you probably you know what it was reported e
0:19:43how similar to you
0:19:45the
0:19:47but we are
0:19:50in this i
0:19:52we probably you got me
0:19:57dean you know
0:20:00and this was only by comparing the exercise for the bottom row
0:20:06we
0:20:07this is all
0:20:09a be consider the
0:20:13and if you're curious we used at a constant
0:20:18so
0:20:22we have topological relations
0:20:25so the most interest while he's
0:20:30where we have a function that what the is nice
0:20:35represent we should for large
0:20:40i hope to continue for another way that's order to be in near each other
0:20:49so we have right
0:20:53i'm not sure
0:20:56that is done anything that they lack the thing like that and between the flower
0:21:01the baseline
0:21:03but what i say that these were in here
0:21:05these two are not
0:21:09so our function reflects this intuition
0:21:13and finally relations between your sentence frame of reference
0:21:19which means that
0:21:21you know there may be also
0:21:24we adopted it will be adopted the point of view that we are able he
0:21:29where interview speaker
0:21:32so this is the plan that means the right okay or speak
0:21:40so
0:21:41these where
0:21:43this is a short overview of what i
0:21:46so what can i don't think so far what we know
0:21:50so this is the case where we have audience participation
0:21:54so i'll
0:21:56therefore it play a little the microwave
0:21:58which one
0:22:06the
0:22:07you can sample
0:22:10the time course
0:22:12need only my yes can second guess we here
0:22:18but none of the missile
0:22:23okay about the case
0:22:33but
0:22:35the one okay again
0:22:38i mean in do you have three factors system
0:22:42that is
0:22:45now i really
0:22:47the label y is what we are some participants describe
0:22:51in this every the screen so what the intended it is actually one it is
0:22:57easy well i
0:23:02okay
0:23:03i want to find humour
0:23:06so well this is
0:23:09so the okay a
0:23:12this project is a few years old
0:23:14so i
0:23:15our speech recognizer was really giving us a lot of all
0:23:21we were using the microsoft speech api, before deep learning
0:23:27so what we decided we have some e
0:23:31about it and e so all error correction for the speech recognizer
0:23:38so what we need
0:23:40each we had some steps
0:23:43it is more like of course incorporated into are lower
0:23:49so we had to record speech recognition errors one but i think error correction
0:23:56it was a preprocessing step and robot error correction the possible across the things
0:24:01and yes
0:24:03now that you have been speech recognizer the impact of this it is floor
0:24:09but especially what
0:24:11marian discussed yesterday maybe kind of thing hand
0:24:16so that the semantic error correction
0:24:20in this was like every year
0:24:23we propose gently words ripley's or words that have expect i'm expect the boxes
0:24:32so you are described in all you get the bar in
0:24:37that can expect
0:24:38so we use a generic word replacement
0:24:41however more than we replace the
0:24:45all of the problem you the new word i in a remote location so probably
0:24:53be a really planet
0:24:57the probability of those on a five of the problem you do not ever so
0:25:02we don't around just replacing work we don't lie you have to read to make
0:25:07a replace
0:25:09so this is the right for example here
0:25:12this is really a
0:25:14we will light on the back wall
0:25:18then we guess what the person actually
0:25:25but
0:25:28that's what they meant
0:25:30but that's what we're to build played the bus stop right interpretation
0:25:36so well
0:25:37if we
0:25:39me
0:25:41i five times in the end of that side of their own set
0:25:45so all
0:25:46we replace you that i don't think that this is really okay
0:25:52but you only have a few scenes on the cable
0:25:55it's better
0:25:57then
0:26:00okay so no
0:26:03this is what we start right now we have all these i
0:26:08in america okay i and say
0:26:12from one can i
0:26:14which one that models like late
0:26:18but only from this guy gonna different places
0:26:22so no okay
0:26:24it's play invented for their instead of everything that
0:26:30so
0:26:31i
0:26:35so that's what we've done
0:26:39and because one of my favourite sergeant's and she's performance me
0:26:44so first i will describe the corpus
0:26:50twenty six point six r d c back
0:26:53a native english speakers counter and it is but i will resonate adopted for images
0:27:01in we had a hundred and forty one descriptions
0:27:06now this is the asr performance
0:27:09and you would be split into a similar experiment we will a
0:27:14so you see they difference in what we head
0:27:20there but it hears signal
0:27:21and we will now
0:27:23so the word error rate was thirty percent, okay
0:27:29bear in mind that this is an older version of the microsoft speech api
0:27:34and the only fourteen percent for the asr interpretations of the top around one for
0:27:40all right
0:27:41what is what will now where the rate of the top ranked interpretation thirteen and
0:27:46a
0:27:48but
0:27:49still a real
0:27:54so the resulting images that we shall i participants
0:28:00and some location for designed for example in this one
0:28:05each requires that all here it should i don't know that have anything it is
0:28:09there
0:28:10so we believe it uses and parts of speech
0:28:15in this work but we have seen as
0:28:19so okay
0:28:21we got the image and call it and we want and
0:28:26car
0:28:27as well as positions
0:28:31this one particular
0:28:33because they can use color size is it or
0:28:37basically you before loading a project you've relations
0:28:44and then just like real is i
0:28:47what
0:28:50where they had to describe the
0:28:55so no
0:28:57just some characterization of what people the
0:29:00in terms of known it
0:29:02there you know that were somewhere out of vocabulary
0:29:07so not just speech recognition error but words like that you words like model with
0:29:12the
0:29:14and they're gonna do not and then you will see
0:29:18we may
0:29:20is there
0:29:21we distinguish two types of one
0:29:25why are descriptions
0:29:27max at least one interpretation in every respect
0:29:32any perfect descriptions means max k
0:29:36so for a in prior description they come from multiple interpret it
0:29:46so these tasks i for our core well about three or four or eight
0:29:52and then apply that wordperfect in there was only one possible right side and that
0:29:57makes sense
0:29:58then sixty percent without which means that we're several reference mask perfectly
0:30:06and then we had to kind of thing accuracy
0:30:09and where only one object matches ending perfect remote one will do not depend
0:30:20now, performance metrics
0:30:24again i'm going back to the ideal result how we wanna make explore the interpretation
0:30:32he's reasonable so yes but gold standard annotation
0:30:38by we my
0:30:39a perfect match
0:30:41like
0:30:43contrary to what
0:30:46this is a popular nowadays the screen
0:30:51not address yesterday you say okay i is all words in the list x and
0:30:57y
0:30:58sorry the object but the wall
0:31:01i don't care much percent of the request just retrieve the roles e
0:31:07so a perfect match not present such as
0:31:10it's a severe heart because at the end of the day
0:31:14you want all you know
0:31:17if you wanted and role
0:31:20so little or no but anyways for everything you want to understand perfect what
0:31:26well
0:31:29in addition
0:31:31we want to know if we probably their projects like you will see what problem
0:31:39if you use a live recording okay
0:31:43right of the roll can be a really no particular range
0:31:51she'll
0:31:52the roundness constantly as one unit profile of our systems that well
0:31:58so what you
0:32:01we have the right
0:32:02a two
0:32:03and
0:32:05and we have the probably the deceased in this kind of the this at the
0:32:09top right of the replace your
0:32:12matches
0:32:13the user's intention so this would be
0:32:16all day however the bottom right meaning it's wrong
0:32:21so it in this killer graph
0:32:24they refer the reader is referred by the system
0:32:28if at all
0:32:30and then we have a second one is the green one and then you have
0:32:33more probable one
0:32:35which one
0:32:36is small
0:32:38so for this for probable one
0:32:42you mean one and everybody three quarters of the brown
0:32:48not give a great
0:32:53so our main metrics
0:32:57are fractional recall, which is actually recall
0:33:00where we is not always fractional round balls location
0:33:05to do it would probably interpretations
0:33:08and ndcg, which was defined by järvelin and kekäläinen
0:33:13a in the
0:33:16why does what side of the fraction that are reward
0:33:21you'd also or a discount lower right
0:33:27it right stand recognition does not have lower right but dct a
0:33:33the normalization component that i
0:33:38you divide whatever this is thinking about we'd like here
0:33:42by this score of an option
0:33:45where you're based on the beam was not the goal i think the situation where
0:33:50you are more advanced up right one
0:33:54so
0:33:55you by like the score of the option and then you
0:34:02so how do
0:34:05and we did okay that's the short version but i
0:34:12syllable is not actually
0:34:16it's not like that or
0:34:18that in our money left labelled c is
0:34:23that's
0:34:23better than that will allow okay
0:34:28if we use their predictions that's not very interesting about that for all i k
0:34:34so we might better now there is a reasonable that we have more than three
0:34:41and e c g is not into one
0:34:45by one or two but with a prayer
0:34:49but this surprising is a use rc replacement but in a war
0:34:57that
0:34:58but it would be why the problem replacement pretty or does not
0:35:05that's certainly not second guessing
0:35:09so that a surprise
0:35:16okay
0:35:18let's go on to response generation
0:35:22this is more control
0:35:27a popular problem yees select part in particular that features such as a as a
0:35:35side so that okay
0:35:38for the current approach is used on the fact
0:35:42there is only one acceptable
0:35:46but the main more than one
0:35:47maybe we will and stuff
0:35:51so the goal of this last part of the result was first of all learn
0:35:56what context of a response to
0:35:59the weather instead we rely different schools this
0:36:05and whether we
0:36:07distinguish between what did you in but like our two
0:36:13i think we all on the reason like a microwave
0:36:17but you want your what you agree in that my there but we
0:36:23not sure maybe you want you're able to be more sources than you
0:36:29so the design of y
0:36:32we compare the refer to convert a relations in two ways
0:36:37so
0:36:38you just added over from
0:36:42we assume the ones that are based on the i it
0:36:48we have all been we want you did they are able but that's the robot
0:36:53can find at the end of that
0:36:57we consider for response i
0:37:00which means just what to do so on
0:37:05a tool which means a
0:37:08it is eager wire between v two or three k entries phrase by phrase level
0:37:16of a whole
0:37:18don't be a different way
0:37:21and we can see what we have conducted one experiment anywhere in the process of
0:37:26combat and the second experiment
0:37:30so far in the first experiment we got artist incorrect responses
0:37:36a silence of what's
0:37:40so well i guess i want to solve a
0:37:44because there are the asr
0:37:48we one relay
0:37:51well of the asr be
0:37:54people can guess really where would you are
0:37:59and i known
0:38:03we train the classifier to produce acceptable responses
0:38:10and okay you use a score
0:38:13you're the first experiment using a
0:38:17so all we
0:38:18thirty five participants some of which were still from the one experiment
0:38:23describe the same okay
0:38:26we got
0:38:28that and seventy five descriptions in to draw a little right
0:38:34so you see when it is likely by not nsu and well
0:38:43asr performance is all the previous slide
0:38:48word error rate was only thirteen percent and by
0:38:52jointly of the requested object at least are also asr errors in indulging section driver
0:39:00the landmark search
0:39:04so you have something that will enhance the back the
0:39:10the correct ones and also interesting
0:39:15and you can guess can you guess what people say
0:39:28yes
0:39:31and
0:39:36like
0:39:38larger
0:39:46okay then we got it
0:39:49a simple or false
0:39:51where p c where a
0:39:54how this all for a so i
0:39:59for someone else's lazily or max a
0:40:05based solely on l two
0:40:08the dialogue policy and the results
0:40:14and for this experiment with four participants again
0:40:18both with
0:40:20so this is still in the participants were show
0:40:25and or something but not all the objects on
0:40:30and that was all again mentioned about five
0:40:35for us
0:40:37and
0:40:41you can see that they're talking about
0:40:45yes
0:40:47yes
0:40:49it
0:40:51in this and then that would be used in
0:40:54four options to a value that is that for the purposes of this presentation participants
0:41:01were not so that it is that there were a total of four intraframe
0:41:07but what is it but it's a huge rooms one score
0:41:11and then
0:41:12for the first response is a number
0:41:17from
0:41:19so if you are going to fix
0:41:21the request at all
0:41:23in which all
0:41:30so
0:41:31we don't sell all and
0:41:34we train some classifiers
0:41:36we the trained and you are able just database and two side guy
0:41:43it side
0:41:45indeed it is not bad because there wasn't enough
0:41:51so
0:41:54influential features where they can see that the third problem efficiently
0:42:02if you know that the performance you have one about nine percent of your updated
0:42:09is okay
0:42:12so the eventual users use percent of
0:42:16wrong words in the asr how do we know words are we have a classifier
0:42:22that
0:42:23that's which works well
0:42:28and you will be sold disease
0:42:30not all right their predictions that this you are scored
0:42:35so it would someday
0:42:37i se
0:42:39score all
0:42:41locations already i meaning the task force between requires in all day
0:42:48and in the u number of out-of-vocabulary words
0:42:54so this is also
0:42:57what we consider all the board but
0:43:01to is dangerous here january english native new this
0:43:07where
0:43:08useful and recall and f-score of seventy four
0:43:12so we were coming from
0:43:14what the participants were common
0:43:16this is
0:43:17then
0:43:19see here i all the data
0:43:23and
0:43:24we got
0:43:26and the score of nine two
0:43:28so that can be something here with the system so this is not fair
0:43:35but i is the
0:43:38or what you from his this is user rate and preferences are the big
0:43:46so what is the main inside yes people based on the differently in fact
0:43:53this is an extreme example because if you know more participants in this experiment in
0:43:59the previous call also we had used very able to work on the exact same
0:44:05so
0:44:07again
0:44:12any other
0:44:17yes i placed on the right of the right
0:44:21and
0:44:25this is
0:44:26what are participants
0:44:29we saw what parts and say okay the ones that come from
0:44:35one possible scores phrase
0:44:37no you have the sack part
0:44:40this is what the user was described
0:44:46so
0:44:47being courses is not about
0:44:51okay so i o k v c r challenge is
0:44:57is the bottom right
0:45:01so we need to do
0:45:03first of all we need to deal with real c
0:45:06our case where there were constructed using all three tool
0:45:14it sounds great but their sin
0:45:17and eighty somewhere so at least i hereby are
0:45:23i can be this work but we re scenes but that were causing some problem
0:45:28got its own problems
0:45:30because it can be very frustrating that kind of all
0:45:36car is being
0:45:40so that are and so that an
0:45:42have a paper addresses some of the other problem
0:45:47then that i
0:45:49and i like one of the texture
0:45:54she
0:45:59that's
0:45:59about okay so
0:46:02frames of reference
0:46:05there are lots of frames of reference: speaker oriented, hearer oriented, absolute
0:46:11in c
0:46:12but in the basic frame of reference in the fate
0:46:16the front of your lips easily the front of your data doesn't matter course there
0:46:23so
0:46:24also you can be all frames of reference s b one seen that and incorporated
0:46:30into interpretation
0:46:34and context positional relation
0:46:37the left of the front of the table doesn't something that somebody is
0:46:44linguistic phenomena hold it is the white or by nicole all the weak lexical stimuli
0:46:52yesterday there was a presentation about out-of-vocabulary words
0:46:59and more work has to be done about inaccuracy in u e
0:47:05perceptual i a busy
0:47:09yes asr grammar scale a problem in something better problems in
0:47:14v error or
0:47:18i don't know in this is but you know that
0:47:22is still not there's no
0:47:27user adaptation
0:47:30which all the different people to use right reference s
0:47:34and this adaptation is to be
0:47:38but what are trying to understand what people say
0:47:42in this case and the way people the or there are so a sign a
0:47:46nation
0:47:47it also response generation
0:47:50before this is why in different ways
0:47:54some people prefer the system should be able just seeing something record
0:48:04we need to integrate all i and
0:48:08i the overall view all the interpretation rules to not
0:48:14if you while seeing
0:48:18we know how are preferred interpretation right context of other of c e
0:48:26evaluation we need a system is reasonable
0:48:32and
0:48:33what i
0:48:36because lack of trust
0:48:38these you
0:48:41we perform human evaluations yes we don't like a mass
0:48:48and
0:48:50we must do not based once the result here
0:48:55so we need to be quite different interpretations are closing the need to you swatting
0:49:01italians in different interpretations on can ask
0:49:05appropriate questions
0:49:07and is used in this i will tell when it does not know wow
0:49:13in a just
0:49:14you see response
0:49:17so
0:49:18that's about it, i'm thanking all the people
0:49:23who ever worked on this problem
0:49:28and then you
0:50:13with
0:50:18i'm going to disappoint either
0:50:21just looking around
0:50:25there was no okay so you just look around
0:50:29what meanwhile
0:50:30but it is very minimal
0:50:33we want it then all singing all bands and rowboat that
0:50:37so we also there are
0:50:40and we had these make them where we would match access to say for example
0:50:46where
0:50:47you can extra exam seldom in a
0:50:52right of the ball or i might i heard correctly
0:50:57so what but that one by the board because
0:51:01reality check and we start the referring expressions are mainly
0:51:05looking for things around a
0:51:09i
0:51:10okay a
0:51:12the standard names of the
0:51:14what you are and you would the
0:51:17goal for
0:51:19rock and category is if you one but we were very low just to name
0:51:25and then one side of the wordnet for see now i might
0:51:30but that was the idea of done
0:51:35there's a turn
0:52:08right one
0:52:11i'm like i'm not in the kitchen
0:52:15why don't like okay
0:52:18and if i didn't or anything like a and by or not
0:52:23like
0:52:24there i think my house and i one and then use them
0:52:29so yes i mean it's
0:52:31you would contextual i
0:52:34but
0:52:36what if i want the flow and identically
0:52:44exactly but i would be one i mean one of the sound while we are
0:52:48appropriate
0:52:50what we're not appropriate
0:52:53so
0:52:54where context and i mean exactly what
0:52:57we will now that
0:52:59however in this case it you are
0:53:02model
0:53:03on work like
0:53:05i was actually haven't all possible problem i was saying is that star flower like
0:53:10flower
0:53:12or
0:53:14our car phone or things like i a lot of normally i want to anything
0:53:21other than flower
0:53:23so there is
0:53:25in my contextual i think that kind of like second guessing the person towards the
0:53:29call
0:53:32the commentary
0:53:36i mentioned it is something that training with how much context relative scale
0:53:55for i mean we can prove our
0:54:00a slider direction problem h
0:54:03hasn't been the used by lee
0:54:06at the moment thing to get this unit but that are instantly
0:55:04well at some point of that is
0:55:07long or you have phone
0:55:12and
0:55:14i mean that we know why people thinking
0:55:18only when they were not restrict that would just a
0:55:22whatever why the point that about twenty percent of the time or
0:55:27there are
0:55:29so that we are going to be point
0:55:33they tend to become more me now we can get that
0:55:41but definitely i mean whatever right okay
0:55:48that's
0:55:49why didn't yourself
0:55:51and that goes to the definition part in fact there was a paper yesterday
0:55:57but
0:55:58an hour
0:56:02the ones
0:56:04kind of limited in the interpretation for by already spoken the colour
0:56:10and then we using your also
0:56:15is that
0:56:18five around one
0:56:48the that if you need for every
0:57:02about that a but there are a
0:57:05so i have to do this in the problem doesn't performance for
0:57:11so that doesn't surprise me some point i mean
0:57:15but maybe we should
0:57:17the fees we are now whether we have several problems a minus right and i'll
0:57:22and their be assigned to me
0:57:26so how much for down within therefore it is
0:57:30exactly
0:57:34it's all could have an and
0:57:38that there probably but when we saw those with a in can see that the
0:57:44the main aim at ever or is in your in great deal in you don't
0:57:50get much mileage out three
0:57:55they
0:57:56i think
0:57:57you are looking at the fourth basically
0:58:02it was somebody
0:59:24okay like the first five better
0:59:26because we try the that the dean at the beginning very ambitious constraints on the
0:59:35object of the accent so we had the and
0:59:40well we had a
0:59:43actually or the actions for a particular case the what i think each other
0:59:50and all that went by the board when we had every and six of the
0:59:54i-th class
0:59:55in some but once
0:59:58yes definitely the four
1:00:00one of ten
1:00:02vol
1:00:04and
1:00:04and likewise if you have particular we are not sure whether they're the syllable or
1:00:11goal
1:00:13then
1:00:14you will go back and constraint of our off
1:00:19but as i said we had to know where r
1:00:24and okay what the user is embedded in the very large one the
1:00:46i don't know what to say to make a
1:01:06what
1:01:10well but the way we can design cation that we listened my only there
1:01:16so estimate only relative the thing mean segmentation for
1:01:20and it was incorrect hundred percent of the anybody problems with the problem and better
1:01:26than that of
1:01:27right
1:01:30so the only thing a lot of there is if you live semantic role labeling
1:01:35and you and that the thing that only or did
1:01:37you really don't they can be more
1:01:41this is what the you know
1:01:44if there is still are a bit not like war
1:01:50band
1:01:52you know that c
1:01:53if you
1:01:54at some point get to know that you don't know
1:01:59the things that the
1:02:58well in our case the semantic role labeler was trained on
1:03:03a referring expression with the various don't expect even when it's all of our paper
1:03:10segment mostly in the right place but you have a
1:03:15very briefly that saying it's and the expectations would be much better
1:03:21i cannot
1:03:22i denote better success there but for referring expressions it was quite good
1:03:45you mean just for the five or
1:03:53well for the parse tree we got indicted from what they were trying
1:03:57three
1:04:04it wasn't from portals like to thank you but if one of them somebody sitting
1:04:09or whatever
1:04:11it is reached their maxima this work the lexical my
1:04:16at all of the sixteen year and by
1:04:19no like can go like
1:04:22i plan
1:04:24and that it is are then you get the pay to get like the score
1:04:29of a second we don't like little recall
1:04:33it's time for mapping but you get the very low score for that matter
1:04:43that that's why we don't think that environment and that's why at home or two
1:04:49two we review fire and
1:05:00the slogan of efficiency
1:05:02so
1:05:04you know that a framework
1:05:07okay let's call it could have a coffee breaks into