This talk is about reference resolution in situated dialogue with learned semantics.

Let's first look at situated dialogue, that is, dialogue situated in an environment, for example in human-robot interaction. In this image, the human is trying to teach the robot to learn a map of the physical environment in the room. The next example is an intelligent tutoring system, where the tutor is trying to teach college students to use computers to solve complex problems. As we can see, the natural language dialogue in these environments is highly related to the environment: it frequently refers to the objects or events in the environment.
Here is an example from the tutorial dialogue, which is about Java programming. In each tutoring session there is a human tutor and a human student, and the tutor is trying to teach the student Java programming. As we can see, everything in the dialogue is related to the content of the Java code; for example, they talk about the objects in the Java code. So to build an intelligent tutoring system that understands the user's dialogue, we have to understand the dialogue within this environment, including interpreting the referring expressions.
The problem is defined as follows: given a referring expression, which is a sequence of words or tokens, and an environment, which in this case we simplify as a set of objects, the goal is to find the most compatible object for this referring expression.
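(In other words, writing the referring expression as r and the set of objects in the environment as E, we pick o* = argmax over o in E of compat(r, o), where compat is some compatibility score; later in the talk this score is instantiated as the probability output of a learned classifier.)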
For the rest of this talk I'll introduce the corpus we used, the challenges, related work, our solution, the experiments, and finally future work.
The corpus we used is from a tutorial study: it is a set of tutorial dialogues for Java programming, and the dialogues are between a human tutor and a human student. Here is the interface we used to collect the data, which is an Eclipse plug-in. This plug-in lets the tutor and the student work remotely, in different rooms, much like using Google Docs: whenever the student edits the code, the tutor sees it, and they can also send text messages to each other within the tool. We captured the dialogue between them and all of the editing behaviors.
The tutorial dialogue is mostly on introductory Java programming, which involves creating, traversing, and modifying parallel arrays. The data was collected in 2007 and includes forty-five tutoring sessions with almost five thousand utterances in total. Each session lasts about one hour and has on average one hundred and eight utterances.
There are some challenges in doing reference resolution in such a setting. The easy case is when the user refers to something in the Java code using only its proper name: intuitively, we can just compare the string from the object and the string from the referring expression to see whether they match. But this accounts for only about a third of all the cases. It can be harder than that, when the user refers to something in the Java code using only its attributes rather than its name, like "the two-dimensional array". And it can be even harder, when they refer to something that is not properly defined, such as a general concept, or just a piece of code. For example, here, "you could copy and use that code if you wanted": here "that code" is just an arbitrary stretch of code defined at this point. So those are the challenges so far, or actually two of them. The last one is that the number of objects in the Java code can be very large, since it includes the methods, the variables, the objects, or any piece of code, and it is dynamic, because as the programming goes on, objects can be removed from the code or newly introduced.
Next I'll talk about some closely related prior work on how people have dealt with this problem before. The first is Iida et al., who worked on reference resolution for dialogues from a collaborative game called Tangram. In this game there are seven objects and two players: one is the instructor, and the other one applies the instructions from the instructor to manipulate those objects. They used dialogue history and task history, where the dialogue history is any object that was mentioned, recently or from the beginning of the dialogue, and the task history is any object that was manipulated from the beginning of the task. That's how they did it.
The next one is Kennington and Schlangen's 2015 paper. They used words as classifiers to learn the relationship between referring expression tokens and physical attributes. In their setting the environment is also a set of objects, and they use co-occurrence information: for a token, they find all of the co-occurring attributes. They manually matched the referring expressions with their referents, so they can find the co-occurrence information between tokens and attributes, and then they use what is learned to predict the referent for a new given referring expression. In this paper we follow Iida et al. in that we use similar dialogue history and task history features.
Here's an example from the corpus. Here the student has just typed a line of code creating a new int array, and then another line of code. The tutor then says that the array "looks like it is set up correctly now", so we can see there is a relationship between the editing behavior and the referring behavior. After that the tutor asks, "in the for loop, what should you be storing in that array?" These utterances come very close together and keep referring to the same thing, repeatedly and locally. That's why we think the dialogue history and the task history are very important, so we use them.
The third kind of information we use is semantic information. A referring expression is a noun phrase, and this noun phrase has different segments; each segment can indicate some attribute of the referent of the referring expression. So we used a conditional random field to segment and label the referring expression, in order to find out the attribute information it gives. After this segmentation and labeling we can find the attribute segments: for example, in the referring expression "the two-dimensional array", the segment "array" gives its category, and "two-dimensional" gives, in this case, the dimension of the array. After that we extract the attribute value from each segment, and we use these to build the attribute vector. The attribute vector is the set of attributes that the referent of this referring expression should have, if we do everything correctly.
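As a rough illustration (not the system's actual code; the label names and the example tokens are assumptions for the sketch), here is how per-token labels from a CRF-style tagger could be turned into such an attribute vector:

    # Sketch: turn per-token attribute labels (e.g. CRF output) into an attribute vector.
    # The label set and the example below are illustrative only.
    def attribute_vector(tokens, labels):
        attributes = {}
        for token, label in zip(tokens, labels):
            if label == "O":                      # token carries no attribute information
                continue
            attributes.setdefault(label, []).append(token)
        # join multi-token segments into a single attribute value
        return {attr: " ".join(words) for attr, words in attributes.items()}

    # attribute_vector(["the", "two", "dimensional", "array"],
    #                  ["O",   "DIM", "DIM",         "CATEGORY"])
    # -> {"DIM": "two dimensional", "CATEGORY": "array"}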
Before starting the reference resolution task itself, we want to build a candidate list, because the number of objects in the Java code can be very large; it contains more or less everything. We take a very intuitive approach. First, we include all of the objects mentioned so far, from the beginning of the session. Then we include all of the objects manipulated from the beginning of the session into the candidate list. And finally, we include all of the objects that match any attribute mentioned in the referring expression. The reason we require a match on only one attribute is that we don't want to miss the real referent just because of a mistake in predicting the semantics. That's how we create the candidate list.
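A minimal sketch of this candidate-list construction (the object fields and helper names are assumptions for illustration, not the paper's code):

    # Union of (1) objects mentioned so far, (2) objects manipulated so far,
    # and (3) any code object matching at least one predicted attribute.
    def build_candidates(mentioned_so_far, manipulated_so_far, all_code_objects, attribute_vec):
        candidates = set(mentioned_so_far) | set(manipulated_so_far)
        for obj in all_code_objects:
            # requiring only one matching attribute keeps the true referent in the
            # list even if part of the semantic labeling is wrong
            if any(obj.attributes.get(attr) == value for attr, value in attribute_vec.items()):
                candidates.add(obj)
        return candidates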
The reference resolution task is then defined as finding the most compatible object from the candidate list for this referring expression, where the compatibility probability is defined as the output of a classification function. For the classification function we tried four different kinds of classifiers to see how they work in this setting: logistic regression, decision trees, naive Bayes, and neural networks. Once we have the probability of the given referring expression referring to each candidate in the candidate list, we can rank all of the candidates by this probability and pick the candidate with the highest probability as the referent. That's how we did it.
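A minimal sketch of this ranking step, assuming scikit-learn's logistic regression for concreteness (the paper also tried decision trees, naive Bayes, and neural networks); the featurize helper corresponds to the feature groups described next, and the training-data layout is an assumption:

    from sklearn.linear_model import LogisticRegression

    def train_resolver(pair_features, pair_labels):
        # one training row per (referring expression, candidate) pair;
        # label 1 if the candidate is the true referent, else 0
        clf = LogisticRegression(max_iter=1000)
        clf.fit(pair_features, pair_labels)
        return clf

    def resolve(clf, candidates, featurize):
        # featurize(candidate) builds the feature vector for the current referring
        # expression; score every candidate and return the highest-probability one
        probs = clf.predict_proba([featurize(c) for c in candidates])[:, 1]
        return max(zip(probs, candidates), key=lambda pair: pair[0])[1]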
Here are the features we used. The first group is the dialogue history features, for example whether this object has been mentioned, and how long ago it was mentioned. The second group of features is the task history features, for example how long ago this object was manipulated, such as typed or selected. The third group of features is the semantic features, which measure how well the semantics of the referring expression match a given candidate.
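A rough illustration of the three feature groups for one (referring expression, candidate) pair; the candidate fields such as last_mention_turn are hypothetical names for this sketch, not the paper's actual feature set:

    def make_features(candidate, current_turn, attribute_vec):
        return [
            # dialogue-history features: mentioned at all, and how long ago
            1.0 if candidate.last_mention_turn is not None else 0.0,
            current_turn - (candidate.last_mention_turn or 0),   # falls back to session start
            # task-history features: how long ago it was manipulated (typed / selected)
            current_turn - (candidate.last_manipulation_turn or 0),
            # semantic features: how many predicted attributes the candidate matches
            sum(candidate.attributes.get(attr) == value
                for attr, value in attribute_vec.items()),
        ]

    # for resolve() above this would be wrapped per expression, e.g.
    # featurize = lambda c: make_features(c, current_turn, attribute_vec)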
For the experiments we used six sessions of the tutorial dialogue, which contain three hundred and sixty-four referring expressions, and we manually labeled their referents in the Java code. We had two annotators, and we got a kappa of 0.65. We used six-fold cross-validation, which basically means taking one session out, training on the other five sessions, and testing on the held-out one.
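A sketch of this leave-one-session-out evaluation (scikit-learn's LeaveOneGroupOut; it assumes pair_features, pair_labels, and session_ids arrays built as above, and reuses the train_resolver sketch from earlier):

    from sklearn.model_selection import LeaveOneGroupOut

    # session_ids marks which of the six sessions each (expression, candidate)
    # row came from, so each fold holds one whole session out
    for train_idx, test_idx in LeaveOneGroupOut().split(pair_features, pair_labels,
                                                        groups=session_ids):
        clf = train_resolver(pair_features[train_idx], pair_labels[train_idx])
        # ... run resolve() on the referring expressions of the held-out session
        #     and measure reference-resolution accuracy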
To evaluate our approach we compared it with two baseline models. The first one is the Iida baseline model; they used the dialogue history and the task history in their approach, so to make it fair we added a handcrafted lexicon to provide some semantic information for this model. The second baseline is the Kennington baseline model; because it was a weakly supervised approach and did not perform reference resolution in a dialogue setting, to make it fair we added the dialogue history and task history features to this approach.
Here are the results we got. As we can see, our approach got a higher accuracy on reference resolution, and the reason it is higher is the semantics we learn using the conditional random field, which has a higher accuracy on the semantics. Actually there are two groups of referring expressions in the reference resolution task, because some of the referring expressions contain semantic information, as their segments indicate, and some of them are just pronouns, which do not carry semantic information. The contribution of our work here is mostly on reference resolution for those referring expressions that contain semantic information.
To see how well our approach could work given better semantic information, we also tested it using gold-standard semantic labels, which were produced manually. Using the gold semantics to run the same approach again, we got an even higher accuracy. This means the semantic information is very important for this reference resolution task, but there is still room to improve, because the human agreement is about eighty-five percent, which is a lot higher than the accuracy we got from the approach.
As for future work, I think it will be promising to consider the structure of the dialogue, and an unsupervised or weakly supervised approach would also be very interesting, since it doesn't require much annotation. That's it. I want to thank our colleagues for their input and thank our sponsors. Thank you.
Let me repeat the question: you were asking whether we have different approaches for different kinds of referring expressions, like pronouns and non-pronouns, right? Yes. The difference here is only in the semantic information, because the main contribution of this work is employing the semantic information from the referring expression, and pronouns are pretty simple; we don't get much information from them. So we could run this model, this approach, by splitting the set of referring expressions, and the results would look fairly similar. But yes, I think we will consider this when we actually build the entire tutoring system. Thank you.
Yes, eye gaze would definitely give us more information when we do the reference resolution. It comes back to an assumption here, that people would look at the object when they refer to it. So that could be another feature added directly to this approach, or maybe there would be some more sophisticated way to use that kind of information. Thank you.
As for the mouse cursor: actually, we do use the selection, which is when the student might highlight a part of the code to show what it looks like or to ask a question about it, so that is one case of using the mouse or the cursor. But yes, that is definitely also very interesting information to consider in this setting.
Well, actually I haven't given this very deep consideration. I just feel that the different relations in the discourse structure could give some interesting information for determining the referent, but I haven't worked out the details.