0:00:16so i'm presenting this on behalf of a whole team of people from
0:00:21ihmc some of whom are here james is not
0:00:28and this is gonna be a little bit different 'cause we're gonna have no
0:00:32neural networks knock or run with of the pause and no f scores
0:00:38no numbers
0:00:39so it's gonna be a little different
0:00:41so here's the problem that
0:00:45we are i
0:00:46so the state-of-the-art in dialogue systems actually a couple of you please and
0:00:52the you know and others have had a similar slide
0:00:57what we're doing mostly is very simple parsing based on keywords phrases and so on
0:01:04regular expressions and so on
0:01:08very simple dialogue models based on either finite state automata or frame systems with slot
0:01:18filling and so on
0:01:20engineered for a specific application
0:01:24and there's
0:01:24thousands of applications for these
0:01:28but
0:01:29every single dialogue system is developed for that specific application in some cases
0:01:33you hear these called
0:01:37multi-domain but essentially they're sort of separate dialogue systems that kind of work together
0:01:41with a single interface
0:01:44but importantly there is no transfer between these domains there is no generic
0:01:52capability in these systems that transfers from
0:01:56one domain to another
0:01:59and as far as the kind of interactions that these other systems allow
0:02:04there's
0:02:05no effective verification or corrections the kind of dialogue they allow is actually very
0:02:12limited
0:02:15so here's our position
0:02:17dialogue is an activity that can be and should be modeled independently of the
0:02:23application domain
0:02:26we need deep understanding of language to effectively and robustly handle the broad range of
0:02:32user utterances the same
0:02:36intention can be expressed in so many different ways
0:02:41and
0:02:42most of these
0:02:45finite state based systems with simple parsing heuristics
0:02:50just can't do that
0:02:53unless somebody's willing to spend years just
0:02:57encoding lots of regular expressions i suppose
0:03:00and we also think that the community needs frameworks to facilitate the development
0:03:05of these complex mixed-initiative systems with very sophisticated back-end reasoning and i think there's
0:03:12a dearth of such tools
0:03:15we see for example in parsing with the stanford tools or nltk or
0:03:21other various tools
0:03:23people adopted them and they started using them and they got better outcomes out of that
0:03:29but in dialogue maybe we don't have sophisticated enough tools
0:03:34tools that allow people to develop such systems
0:03:39so
0:03:41as you saw in the title our model is
0:03:45based on collaborative problem solving so what is collaborative problem solving
0:03:50well when they collaborate what do they do they deliberate they jointly develop solutions they
0:03:57identify and resolve errors and problems they keep track of the progress as the
0:04:03task is going on
0:04:07they jointly perform actions and of course they can negotiate roles
0:04:13and they learn from one another
0:04:14and all these things are done through communication right it's not necessarily by language communication
0:04:21could be gestures it could be other kinds of communication but it is by communication
0:04:26so we need to
0:04:29so
0:04:30our central thesis is that essentially all or at least most of the human machine
0:04:36language based communication can be modeled effectively
0:04:40as collaborative problem solving
0:04:44so
0:04:45what does the collaborative problem solving model entail
0:04:50so what we mean by this is that we need to model the
0:04:54shared intentional space between the two agents some people actually asked
0:05:01earlier something about
0:05:04multiple agents sort of
0:05:07multi
0:05:09agent dialogue here we just limit ourselves to two but
0:05:13even with multiple the same principles apply
0:05:16so what is this shared intentional space what kind of objects are we dealing with
0:05:22these are goals solutions
0:05:24an understanding of the common ground and the situation
0:05:29and all this shared understanding
0:05:32arises from communication we need to communicate and agree on things and so on
0:05:39one agent can't
0:05:40create a collaborative goal or a solution alone they have to pursue something together
0:05:48obviously a single agent can't commit to a joint goal without
0:05:51the other person
0:05:53so this is a picture taken from a paper by james allen and a couple
0:05:59of my other colleagues from
0:06:01two thousand two
0:06:03the model places the kinds of acts in this model in
0:06:10four different areas communicative interaction collaborative problem solving problem solving individual problem
0:06:19solving actually
0:06:20and of course my interest in this talk is just about the
0:06:24collaborative problem solving here
0:06:26where really if you look at the objects in there they really reflect the problem solving actually
0:06:32the same kind of thing
0:06:33except that they're collaborative
0:06:35so
0:06:37the central thesis that we had back in the two thousands and
0:06:43it wasn't just us other people had the same idea is that at that level
0:06:48you can reason in a domain independent way and represent things in a domain independent
0:06:54way
0:06:55but this has never really been done properly and we also didn't we promised it
0:07:02we had a prototype we never really did it so here today i'm announcing
0:07:08that we now did
0:07:10and
0:07:12this
0:07:13this architecture would be familiar to all of you it doesn't look very different from
0:07:17other things that we've seen so far
0:07:19so we have natural language understanding there's a lexicon and an ontology
0:07:23the dialogue management which is really the collaborative problem solving agent that we have
0:07:28is in the centre
0:07:29there's the backend problem solving or as we call it here
0:07:33a behavioral agent there's generation so this doesn't look very different from other systems
0:07:41the parts that are in colour are the components of cogent
0:07:46which is a domain-independent shell right so by itself if you look at that you're not gonna
0:07:51have a dialogue system just by having that but you can have a
0:07:56dialogue system by adding to that
0:08:01the behavioral agent which is domain specific and not to mention language generation and of course
0:08:07for generation you could also have some higher level domain-independent generation components but
0:08:14we don't have it
0:08:16so a lot of people can do sort of in-domain generation
0:08:22and
0:08:23so
0:08:25so i'm just gonna talk a little bit about the components there
0:08:29so the natural language understanding the workhorse of everything that we did for the last
0:08:35twenty some years is the trips parser
0:08:39it's a d
0:08:45the parser produces a very rich representation of the meaning of every
0:08:50sentence it has a very rich principled ontology it has a very large lexicon
0:08:57some of it ten thousand maybe more
0:09:00are hand-built lexical entries we extend it by learning from wordnet but the
0:09:07extension we derive automatically so for verbs for example we derive automatically the roles that
0:09:13they have from definitions
0:09:16and so on
0:09:19and
0:09:22i'm not gonna go into too many details but it is available online
0:09:26and you can actually check it there's a web service for the basic
0:09:30parser and a number of variations of the parser as well
0:09:34the output
0:09:37positions
0:09:51i don't see that
0:09:53data
0:09:54so i don't think this is actually visible but
0:09:58so this is the
0:10:00web interface i just put in a couple of sentences something that came up earlier i
0:10:06need a hotel in the centre of calibration
0:10:09and that's
0:10:10what the parser outputs and you can see that
0:10:15everything is there so there's a speech act and
0:10:18every single word represented here has a type in the ontology
0:10:24so for hotel accommodation for need it is want
0:10:28centre is a geographic region
0:10:31even with the british spelling in there it got that right
0:10:37and if you look for example at the next one i prefer very nice hotels
0:10:42then you can see that prefer is also want just like need which is something
0:10:47that you probably want to use
0:10:51and you can see how adjectives have
0:10:55very interesting types here the space here is basically a value on a scale of
0:11:00expressiveness as it for and so on
0:11:03so you get very rich representation
0:11:14well
0:11:15there's additional things here dealing with reference resolution ellipsis processing ontology mapping
0:11:21i'm not gonna talk too much about this
0:11:25the one i want to mention here is that there's conventional speech act identification so sometimes
0:11:29you can ask a question by making essentially an assertion or you can
0:11:36make an assertion by asking a question or make a request by asking a question
0:11:41so there's a conventional mapping between the surface speech act and the user speech act but
0:11:49you just really
0:11:51so now to the cps agent
0:11:54so a
0:11:56essentially the output of all this natural language analysis is fed into the
0:12:01collaborative problem solving agent and what it does is it provides a domain independent model of
0:12:06communication adaptable to new domains
0:12:10on
0:12:11one side it does
0:12:13what really could be called just intention recognition
0:12:16so there's a communicative act coming in from a user utterance you want to understand what the
0:12:22intention of the user is and we call that a collaborative problem solving act
0:12:28and obviously on the other side and i'm not going to spend much time on
0:12:32that
0:12:34if the system itself wants to communicate to the user it will do that by
0:12:39actually creating a collaborative problem solving act which then gets sent to the generation component
0:12:45and eventually will get turned into language
0:12:50so this agent does that and essentially maintains the collaborative state
0:12:57and
0:12:58all these acts together essentially drive the conversational structure so that's why it is
0:13:03a dialogue model
0:13:06and again i'm going to repeat myself here but this is premised on the idea that there
0:13:09is a domain independent semantics of language that supports
0:13:14reasoning about intentions
0:13:18so there but
0:13:20there is a tension here between the desire for domain independent processing and the need for
0:13:26very domain specific processing so
0:13:29understanding the intention of the user is almost always impossible to do in just a domain
0:13:35independent way so the way we deal with this problem is that essentially the collaborative
0:13:41problem solving agent treats its understanding of the user intention as a hypothesis
0:13:48and then hands this over to the behavioral agent which does sort of grounding of
0:13:52all objects and is actually trying to figure out does this make any sense in
0:13:57this particular state of the task does this make sense and if so then that
0:14:04gets
0:14:05committed as shared if it's a goal then the system commits to it as a
0:14:12shared goal but if not there can be clarification and so on going on
0:14:19so actually the way this is done is based on this evaluate commit
0:14:24loop
0:14:25so the collaborative problem solving agent will figure out a collaborative problem solving act which
0:14:32explains the user utterance
0:14:35it will send an evaluate to the behavioral agent
0:14:40and if the behavioral agent agrees it will send back an acceptable and only then
0:14:45we have a commit to the goal as shared
0:14:51and this is the same way that we're dealing with requests proposals and
0:14:57questions as well
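
The evaluate-then-commit exchange described here can be sketched roughly as follows. This is only an illustration under stated assumptions, not Cogent's actual code: the names CpsAct, CollaborativeState, evaluate_commit and toy_behavioral_agent, and the message strings, are hypothetical and chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CpsAct:
    """A hypothesised collaborative problem solving act (illustrative shape)."""
    kind: str     # e.g. "ADOPT"
    content: str  # e.g. a goal description


@dataclass
class CollaborativeState:
    """Only what both parties have agreed on ends up here."""
    shared_goals: List[str] = field(default_factory=list)


def evaluate_commit(act: CpsAct,
                    evaluate: Callable[[CpsAct], str],
                    state: CollaborativeState) -> str:
    """Treat `act` as a hypothesis: ask the behavioral agent to evaluate it,
    and commit it to the shared state only if the verdict is ACCEPTABLE."""
    verdict = evaluate(act)
    if verdict == "ACCEPTABLE":
        state.shared_goals.append(act.content)
        return "COMMIT"
    # Anything else leaves the shared state untouched; the dialogue side
    # would then clarify, explain, or propose an alternative.
    return "CLARIFY"


def toy_behavioral_agent(act: CpsAct) -> str:
    """Stand-in domain agent: it only knows how to pursue one goal."""
    known_goals = {"find hotel"}
    return "ACCEPTABLE" if act.content in known_goals else "UNACCEPTABLE"


state = CollaborativeState()
first = evaluate_commit(CpsAct("ADOPT", "find hotel"), toy_behavioral_agent, state)
second = evaluate_commit(CpsAct("ADOPT", "book flight"), toy_behavioral_agent, state)
```

The point of the split is that the dialogue side never needs to know why a goal is acceptable; it only reacts to the verdict that comes back.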
0:15:00if the ba
0:15:03doesn't
0:15:04like
0:15:05the evaluation there are several different ways it can deal
0:15:10with this one is to just send a rejection actually i think this should be unacceptable
0:15:15but anyway
0:15:16but
0:15:17we usually like to do this and it can actually give a reason
0:15:23it's a horizontal we don't have enough box for corporate law
0:15:28it is also possible to propose alternative way and together that for a to the
0:15:34resulting
0:15:36i'm gonna skip on aspect is just models
0:15:39so in the paper there is a very detailed description of the various collaborative
0:15:44problem solving acts
0:15:46so i'm not gonna go into the details so there's a number of them that have
0:15:50to deal with goals so we can adopt select or defer a goal if
0:15:55you don't wanna deal with it right now you can completely abandon the goal or
0:15:59you can release it which means that it's completed
0:16:02satisfactorily or not
0:16:06and there's a bunch of acts about knowledge you can make an assertion that is
0:16:10once it is committed that means the agent now believes whatever
0:16:18the human user intended to convey as a belief
0:16:25for questions there is ask-if and ask-wh just two types of
0:16:30questions
0:16:31you can see a number of examples here
0:16:33quite complicated examples these are actual examples from systems
0:16:38including something like doesn't amount of sorely
0:16:41at the conditional you
0:16:43at a one that if we increase the amount of whatever the some other proteins
0:16:47all
0:16:48or i wh with choices of the gt wagner propose which are regulated by a
0:16:53reinstall
0:16:55so that's that and there's a number of acts related to the
0:16:58problem solving status so again acceptable and unacceptable are essentially interpretation acts where
0:17:06the ba says i like that or i don't like it then goals can be
0:17:11rejected
0:17:13there can be failures of execution answers to questions and execution status which can
0:17:20be either
0:17:21done at the very end or it can also be used to
0:17:25just report progress i'm still working on this
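
One way to picture the status replies just listed is as a small enumeration plus a check for whether the dialogue side still has something to do. This is a hedged sketch: the enum Reply, its member names, the "done" marker and the function needs_followup are assumptions made for illustration, not the message format of the actual system.

```python
from enum import Enum


class Reply(Enum):
    """Replies a behavioral agent might send back (names are illustrative)."""
    ACCEPTABLE = "acceptable"
    UNACCEPTABLE = "unacceptable"
    REJECTED = "rejected"
    FAILURE = "failure"
    ANSWER = "answer"
    EXECUTION_STATUS = "execution-status"


def needs_followup(reply: Reply, detail: str = "") -> bool:
    """Return True when the exchange is not finished after this reply."""
    if reply in (Reply.UNACCEPTABLE, Reply.REJECTED, Reply.FAILURE):
        # Something went wrong: explain, clarify, or try an alternative.
        return True
    if reply is Reply.EXECUTION_STATUS:
        # A status report is final only when it says the task is done;
        # anything else is a progress report ("still working on this").
        return detail != "done"
    # ACCEPTABLE and ANSWER close the exchange in this simplified picture.
    return False
```

In this picture a progress report keeps the exchange open while a final "done" closes it, which mirrors the two uses of execution status mentioned in the talk.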
0:17:30okay well as you one is the u
0:17:35so
0:17:36what does it mean to add a behavioral agent to actually have a dialogue system based
0:17:42on cogent
0:17:43so
0:17:45you can think of the cps acts as establishing a sort of protocol once you implement the
0:17:49protocol and ensure that the obligations that these things create
0:17:55are satisfied
0:17:57then after that there's nothing else to do essentially there's no requirement for how the
0:18:02behavioral agent represents things internally
0:18:05whether it's a planning system or a very simple database lookup
0:18:09what kind of complexity it has
0:18:12how many sub-agents are out there as long as there's a single
0:18:17interface a single overarching ba everything should be fine
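
The idea that the behavioral agent only has to honour a single interface, whatever sits behind it, can be illustrated with a minimal sketch. Everything named here (BehavioralAgent, LookupAgent, the evaluate and execute methods, the verdict strings) is hypothetical and stands in for whatever the real protocol requires; the example uses the simplest backend mentioned in the talk, a database lookup.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class BehavioralAgent(ABC):
    """The single interface the dialogue side talks to (illustrative)."""

    @abstractmethod
    def evaluate(self, act_kind: str, content: str) -> str:
        """Say whether a hypothesised act makes sense in the current task state."""

    @abstractmethod
    def execute(self, content: str) -> Optional[str]:
        """Carry out a committed act and return a result, if there is one."""


class LookupAgent(BehavioralAgent):
    """A very simple backend: a dictionary standing in for a database."""

    def __init__(self, table: Dict[str, str]) -> None:
        self.table = table

    def evaluate(self, act_kind: str, content: str) -> str:
        # A question is acceptable only if the table can answer it.
        return "ACCEPTABLE" if content in self.table else "UNACCEPTABLE"

    def execute(self, content: str) -> Optional[str]:
        return self.table.get(content)


agent = LookupAgent({"hotels in the centre": "three hotels found"})
```

A planner or a collection of sub-agents could be substituted for LookupAgent without the dialogue side changing, as long as the same two methods are provided.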
0:18:21with it has a models alone
0:18:24there are some related ways of affecting how the natural language understanding works
0:18:30but is somewhat so you really want to use this and actually
0:18:36change how the natural language understanding work because it's not good enough you ask the
0:18:42did you never i'm not reliable
0:18:45so we have a number of implemented cogent-based systems in very different domains
0:18:51with very different interaction styles
0:18:54so biocuration
0:18:57that is an assistant a biologist's assistant and a bunch of systems that have
0:19:03to do with the blocks world
0:19:05more or less
0:19:07and some others that are sort of music composition visual storytelling that's creating
0:19:13scenarios for making movies essentially with animated characters so with very different domains very different
0:19:20vocabulary very different interaction style
0:19:25so i'm not gonna go too much into these systems
0:19:29but one of the reviewers wanted to see the biocuration system
0:19:34and i couldn't put too much into the paper because it wasn't published and it
0:19:38still isn't really
0:19:40but i'm gonna show you a little video of the system and
0:19:45so of all these systems except for the one that was presented the other
0:19:50day all these systems were not developed by us people got cogent and they developed them on
0:19:56their own
0:19:57so let's look at a bit of a dialogue
0:20:09providing you understand looks like logical systems like
0:20:15was there
0:20:19one is going to be sensor
0:20:21but the trees are a little bit
0:20:23the rule machine i don't want the one here
0:20:27sorry but
0:20:31alright so here we have sort of the dialogue history this is a
0:20:36system for biologists
0:20:39what you from an implementation and what you what the goal here i want to
0:20:45find out how you be shown in the
0:20:50b equal to these two genes
0:20:54and there's just outline i think it's probably best work
0:20:58so what is the goal here i want to find an explanation so
0:21:03it's a very interesting type of goal of how this happens
0:21:09and the way the system knows how to provide an answer to that is to
0:21:13build a model of the molecular interactions
0:21:18and then try to find out
0:21:20one that you are maybe we which is kind of the source
0:21:25useless is g the joan i in this particular cells
0:21:30so
0:21:31i'm gonna you go your
0:21:39so the user then asks how does your maybe if we regulate pi okay now
0:21:43why did they know you can see here about the p eight we hate you
0:21:47'cause they're biologists obviously this is not a system for novices
0:21:51and what the system does it actually looks there's a huge array of
0:21:57biology specific agents
0:21:59including ones that go look up pathways in a bunch of databases
0:22:05there's one that actually reads papers and can extract information from there
0:22:11so it finds a path between these two
0:22:16genes and it creates a network that the user can use as a source
0:22:22of information
0:22:24so i'm gonna speed up because i know my time is already running out
0:22:29okay
0:22:31so a and creates a so
0:22:35i'm just gonna lexical and only because it is below
0:22:39so now the user creates with the system a very specific model
0:22:44of this
0:22:46the system actually based on what it sees it can suggest additional information based on
0:22:52what it knows
0:22:54and the user can look at it and say well okay that's
0:22:56good enough or actually i know something even more specific than that
0:23:00and the system comes back you can see here
0:23:03but
0:23:05to actually explain
0:23:07the original question that the user asked
0:23:11and there's more it can actually take this and create a dynamic model out of it you
0:23:17can ask questions for example is the amount of whatever protein high and you can
0:23:23see all kinds of useful information about it so i'll stop here
0:23:44for
0:23:46for plan recognition we actually don't
0:23:49in the cps agent
0:23:53we don't actually use plan recognition right now i know that
0:23:58more me
0:23:59running when you
0:24:01understanding dialog
0:24:03we
0:24:04for now we don't at high
0:24:11so
0:24:12essentially one way of answering this question
0:24:17is why were we successful with this where we weren't before and we had done
0:24:23more work before it's because of the way we split
0:24:27what can be done in a domain independent way from what can be done in
0:24:30a domain dependent way
0:24:32so a lot of the time as i said in this evaluate commit loop
0:24:36we basically just throw things over the fence and say well you figure it out
0:24:39so most of the situational context and there is not a model of the user
0:24:43no user modelling in this thing but all of this would actually reside right now
0:24:48in the ba obviously you want at the cps level to
0:24:52have some of it
0:24:53to be able to do some more reasoning but right now we don't
0:25:10we don't offer generation all the teams that have worked on this have
0:25:15essentially created template-based generation on their own and so we don't
0:25:25provide
0:25:34no
0:25:35shortcuts
0:25:37would be very difficult
0:25:51well we started with similar goals right as with collagen there are
0:26:00actually some older papers dealing more with that question about the differences
0:26:07i
0:26:08there are some limitations in the collagen model there are some really good features in the
0:26:14collagen model
0:26:15so i think we go in the same direction but kind of
0:26:20tackle things a little bit differently but actually i just learned recently that
0:26:29charles rich and others have put together a toolkit
0:26:35moving in the same direction
0:26:37although as far as i understand i haven't seen it in practice theirs
0:26:42is more task oriented kind of like reading floor
0:26:47so you know what you know way they can move their expectations as the kind
0:26:52of reduce their expectations
0:26:57so i don't know if this was on the slides i think there was a
0:27:03link you can actually download it i recommend you use
0:27:07at least the parser you can actually do much better than what most people do
0:27:10and if you want to use the whole system we'll be