0:00:18so we should move on i think
0:00:25to the next speaker
0:00:31the next paper
0:00:33is Towards End-to-End Learning for Efficient Dialogue Agent by Modeling Looking-ahead Ability
0:00:41so unfortunately the authors of the paper had visa problems
0:00:45so we have a stand-in
0:00:48presenting here
0:00:49that is the main issue
0:00:52so i think for that reason
0:00:57it is not possible to answer questions, unfortunately
0:01:02but please
0:01:03go ahead
0:01:35hello everyone, i am glad to present this paper entitled Towards End-to-End
0:01:39Learning
0:01:41for Efficient Dialogue Agent by Modeling Looking-ahead Ability
0:01:46it is co-authored by my colleagues and me
0:01:52from IBM Research China and the Beijing Institute of Technology
0:02:09first let me introduce the background
0:02:11dialogue systems have attracted a lot of attention recently
0:02:15due to their huge value in reducing human work in many commercial domains
0:02:22such as restaurant reservation and travel planning
0:02:25unlike their chitchat counterparts
0:02:28the majority of dialogue agents
0:02:30with goals
0:02:31are expected to be efficient
0:02:34that is, to complete tasks with as few dialogue turns as possible
0:02:40the top-right example
0:02:42shows that a chitchat bot
0:02:46prefers
0:02:48the dialogue to be as long as possible
0:02:51however
0:02:52the example below shows that
0:02:54in goal-oriented dialogues
0:02:57the agent should be efficient, with as few dialogue turns as possible
0:03:06here we give an example dialogue for expressing the same idea
0:03:12that we want to book a table for twelve o'clock
0:03:15the inefficient example takes four turns
0:03:18while the efficient one only needs two turns
0:03:22now looking at the inefficient example
0:03:25the human
0:03:26says we don't have empty tables at eleven o'clock tomorrow
0:03:30i am sorry
0:03:31the agent replies
0:03:33what time is available
0:03:36the human says twelve o'clock is okay
0:03:39the agent replies alright, we want that
0:03:42so it took four turns
0:03:44but in the efficient example
0:03:46when the human
0:03:47says we don't have empty tables at eleven o'clock tomorrow, i am sorry, the agent
0:03:52can reply
0:03:54how about twelve o'clock
0:03:56is it also okay
0:03:58in this way it only takes two turns
0:04:06as shown in the right figure
0:04:08the dialogue manager module is mainly considered to be responsible for efficiency
0:04:14so our problem is
0:04:16how to learn an efficient dialogue model
0:04:19that is, a dialogue manager, from the data
0:04:23existing works have two folds of limitations
0:04:27they either need too many manual efforts, such as for reinforcement learning
0:04:32where we have to design the strategy, the reward function, and the expert for training and
0:04:38testing
0:04:40or, for sequence-to-sequence methods
0:04:42they tend to generate
0:04:44generic responses
0:04:45for example like
0:04:47i don't know
0:04:48yes, okay
0:04:50and they cannot distinguish different contexts
0:04:59in this paper
0:05:04we address the problem
0:05:06from the perspective of end-to-end dialogue modeling, in order to reduce the human
0:05:11intervention in system design
0:05:14and propose a new sequence-to-sequence model by modeling the looking-ahead ability
0:05:22our intuition is that
0:05:23by predicting several future turns
0:05:27the agent can make a better decision of what to say
0:05:32at the current turn
0:05:33for achieving the dialogue goal as soon as possible
0:05:39our method has several advantages
0:05:42it is end-to-end and does not require too much manual work
0:05:46and our experiments show it is more efficient than naive sequence-to-sequence methods
0:05:59this is the architecture
0:06:01of the overall model
0:06:03from the bottom
0:06:05to the top
0:06:06there are mainly three components
0:06:09the bottom is the encoding module
0:06:12the intermediate is the looking-ahead module, and the top one is the decoding module
0:06:19in the encoding module
0:06:21we encode three kinds of information with bidirectional GRUs
0:06:26they are the historical utterances
0:06:29the current utterance
0:06:31and the goals
0:06:35the goals are represented by one-hot vectors
0:06:38similar to bag-of-words
0:06:41after getting the three kinds of representations
0:06:44that is, the representation for the historical utterances
0:06:48the bidirectional representation for the current utterance, and the goal representation
0:06:58the three kinds of representations are concatenated together
0:07:01and this concatenation
0:07:05will be the input
0:07:10of the looking-ahead module
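As a rough illustration of the encoding step just described, here is a toy sketch (not the authors' code): mean-pooled random embeddings stand in for the bidirectional GRUs, and the vocabulary, goal inventory, and dimensions are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical vocabulary and goal inventory for illustration only.
VOCAB = {w: i for i, w in enumerate(
    ["we", "want", "a", "table", "at", "twelve", "eleven", "o'clock", "sorry"])}
GOALS = ["book_table_11", "book_table_12"]
EMB_DIM = 8
EMB = rng.normal(size=(len(VOCAB), EMB_DIM))

def encode_utterance(tokens):
    """Stand-in for a bidirectional GRU: mean of word embeddings."""
    ids = [VOCAB[t] for t in tokens if t in VOCAB]
    return EMB[ids].mean(axis=0) if ids else np.zeros(EMB_DIM)

def encode_goal(goal):
    """Goals are one-hot vectors, similar to bag-of-words."""
    v = np.zeros(len(GOALS))
    v[GOALS.index(goal)] = 1.0
    return v

history = encode_utterance(["we", "want", "a", "table", "at", "eleven", "o'clock"])
current = encode_utterance(["sorry", "we", "want", "twelve", "o'clock"])
goal_vec = encode_goal("book_table_12")

# The three representations are concatenated into one vector,
# which becomes the input of the looking-ahead module.
encoder_input = np.concatenate([history, current, goal_vec])
print(encoder_input.shape)  # (18,)
```

The concatenated vector's size is simply the sum of the three representation sizes; in the real model each piece would be a learned GRU state rather than a pooled embedding.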
0:07:17the looking-ahead module is a little different from the bidirectional
0:07:21GRU over utterances
0:07:24it has only one direction
0:07:27it predicts the future turns, starting from the first hidden state
0:07:32because that state will be used to predict the appropriate system utterance for the current turn
0:07:38the information is propagated forward and backward
0:07:42then, combining the information from the two directions
0:07:47the future turns are predicted by the chained models
0:07:51that is, the hidden states h one, h two, through
0:07:55h k
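The looking-ahead rollout can be pictured with a toy recurrence; this is a stand-in for the paper's GRU, with hypothetical sizes and a made-up transition matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, K = 18, 3                       # hypothetical state size and look-ahead steps
W = rng.normal(scale=0.1, size=(DIM, DIM))

def lookahead(h0, steps=K):
    """Roll out K future hidden states h_1 ... h_K from the encoder output h0."""
    states, h = [], h0
    for _ in range(steps):
        h = np.tanh(W @ h)           # toy transition; the paper uses a GRU cell
        states.append(h)
    return states                    # h_1 also drives the current system utterance

future = lookahead(rng.normal(size=DIM))
print(len(future), future[0].shape)  # 3 (18,)
```

In the real model each predicted state would feed the decoder; here the point is only the chained, one-direction-at-a-time structure of the rollout.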
0:07:58this modeling looks like a seesaw
0:08:01so we design a new algorithm to learn the model
0:08:06in each round of training
0:08:08part of the parameters are fixed and the others are updated, in turn
0:08:14so it seems like the
0:08:17expectation-maximization algorithm
0:08:19but it is not really EM
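The alternating scheme just described might be sketched like this (toy scalars and a made-up quadratic loss, not the authors' algorithm): in each round one parameter group is frozen while the other takes a gradient step, and the roles swap.

```python
def toy_loss(a, b):
    """Hypothetical loss, minimized at a = 2, b = -1."""
    return (a - 2.0) ** 2 + (b + 1.0) ** 2

def alternate_train(a, b, rounds=50, lr=0.1):
    """EM-like alternation: freeze one parameter group, update the other."""
    for r in range(rounds):
        if r % 2 == 0:
            a -= lr * 2.0 * (a - 2.0)   # update a while b is frozen
        else:
            b -= lr * 2.0 * (b + 1.0)   # update b while a is frozen
    return a, b

a, b = alternate_train(0.0, 0.0)
print(toy_loss(a, b) < 0.01)  # the alternation still converges
```

Unlike true EM there is no explicit expectation step here, which matches the speaker's remark that the scheme merely resembles EM.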
0:08:23in the decoding module
0:08:25we first predict the future dialogues
0:08:28and then generate the real system utterance
0:08:31by an attention model
0:08:34the loss function contains three terms
0:08:37the first is
0:08:38for modeling the language model, the second is for modeling the looking-ahead
0:08:44ability, and the last is for predicting the final state of the conversation
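A minimal sketch of combining the three loss terms; the weighting scheme and the weights themselves are assumptions for illustration, not values from the paper.

```python
def total_loss(lm_loss, lookahead_loss, state_loss,
               w_lookahead=0.5, w_state=0.5):
    """Weighted sum of the three objectives:
    language modeling, looking-ahead, and final-state prediction.
    The weights are hypothetical."""
    return lm_loss + w_lookahead * lookahead_loss + w_state * state_loss

# Example: per-term values of 1.0, 0.4, 0.2 give a total of 1.3
combined = total_loss(1.0, 0.4, 0.2)
print(combined)
```

In practice each term would be an averaged cross-entropy over the batch; the point is only that one scalar objective trains all three abilities jointly.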
0:08:50now for the
0:08:56experiments
0:08:57the datasets should have goals
0:09:00we use two kinds of datasets
0:09:02one is a negotiation dataset from an object division task
0:09:06the other dataset
0:09:08is generated by our method with the goals
0:09:12you can refer to the details in the paper
0:09:15for preparing the training and testing samples
0:09:18let's see the examples
0:09:20at every turn
0:09:21we get a sample with the
0:09:24historical utterances and the current utterance; in total we have about
0:09:29thirty K samples, with about ten K for the test set
0:09:40we use the goal achievement ratio and the average dialogue turns as the two metrics
0:09:46to evaluate our method and the baselines
0:09:48we use a user simulator; it is a sequence-to-sequence model with goals
0:09:54and also two human evaluators are invited to talk with the agents
0:09:59we implemented the network using PyTorch, and the parameter settings' details
0:10:05can be referred to in the paper
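The two metrics just named can be computed from dialogue logs like this; the log format, a list of (goal achieved, number of turns) pairs, is a hypothetical simplification.

```python
def evaluate(dialogues):
    """Goal-achievement ratio and average dialogue turns
    over logs of (achieved: bool, num_turns: int) pairs."""
    achieved = sum(1 for ok, _ in dialogues if ok)
    ratio = achieved / len(dialogues)
    avg_turns = sum(t for _, t in dialogues) / len(dialogues)
    return ratio, avg_turns

# Four hypothetical dialogues: three reached the goal, turn counts 2, 4, 6, 4
logs = [(True, 2), (True, 4), (False, 6), (True, 4)]
ratio, avg_turns = evaluate(logs)
print(ratio, avg_turns)  # 0.75 4.0
```

An efficient agent should push the first number up while pushing the second down, which is why both are reported together.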
0:10:12these are the experimental results
0:10:15in the table there are four models
0:10:18seq2seq
0:10:21with the goal means
0:10:23encoding the historical utterances
0:10:25and the goals together
0:10:27then outputting the system utterance
0:10:30seq2seq plus goal plus class means
0:10:33it additionally predicts the final conversation state, agree or disagree
0:10:39seq2seq plus goal plus look
0:10:41means it can look ahead
0:10:44but does not predict the final state
0:10:47the last, our model, means it can do everything
0:10:51we can find
0:10:52that the model with the looking-ahead ability shows
0:10:57the best efficiency
0:11:00under both the user simulator and the human evaluators
0:11:04below are the parameter tuning results
0:11:07the left
0:11:07four figures
0:11:09show the performance with different looking-ahead steps
0:11:14we find that
0:11:15an appropriate setting of the step
0:11:17works best
0:11:19we think this parameter is task-related and depends on the datasets
0:11:25the right four figures
0:11:26show the performance with different hidden state dimensions
0:11:30from one hundred twenty-eight
0:11:32to one thousand twenty-four
0:11:34we find that two hundred fifty-six can be better
0:11:46here are example dialogues with the agents
0:11:49the left one
0:11:50shows that sometimes, if the agent tends to agree
0:11:55it will save dialogue turns
0:11:59the right one shows that
0:12:01although all the dialogues
0:12:02reach agreements
0:12:04our model
0:12:06spends the fewest dialogue turns
0:12:08of course
0:12:09here we removed the unknown words, because the language generation is not so perfect
0:12:19to summarize, this paper proposed an end-to-end model towards the problem of
0:12:24how to learn an efficient dialogue manager without taking too much manual work
0:12:29experiments on two datasets illustrate that our model is more efficient
0:12:36the contributions include a new problem from the perspective of deep learning
0:12:41a novel method to model the looking-ahead ability
0:12:45and the effective experiments
0:12:48in the future we will investigate other methods for the problem
0:12:53and the language generation quality should be paid more attention
0:13:00that's all for this paper, thank you