So we should move on, I think. So, to the next speaker. The next paper is "Towards End-to-End Learning for Efficient Dialogue Agent by Modeling Looking-ahead Ability". Unfortunately, the author of the paper had some problems attending, so we have a stand-in presenting here. So I think, for that reason, it is unfortunately not possible to ask questions. But please, go ahead.

Hello everyone. This is the work of the paper entitled "Towards End-to-End Learning for Efficient Dialogue Agent by Modeling Looking-ahead Ability". It is authored by Zhuoxuan Jiang and co-authors, from IBM Research China and Beijing Institute of Technology.

First, let me introduce the background. Dialogue systems have attracted a lot of attention recently, due to their huge value in reducing human work in many commercial domains, like restaurant reservation and travel planning.

Unlike their chit-chat counterparts, the majority of dialogue agents with goals are expected to be efficient, that is, to complete tasks with as few dialogue turns as possible.

On the right, the first example shows that a chit-chat bot prefers the dialogue to be as long as possible. However, the example below shows that, in goal-oriented dialogues, the agent should be efficient, with as few dialogue turns as possible.

Here both dialogues have the same goal, which expresses that we want to book a table at eleven or twelve o'clock. The inefficient example takes four turns, while the efficient one only needs two turns. Looking at the inefficient example: the human says, "We don't have empty tables at eleven o'clock tomorrow, I'm sorry." The agent replies, "What time is available?" The human says, "Twelve o'clock is okay." The agent replies, "All right, we want that." So it took four turns. In the efficient example, the human says, "We don't have empty tables at eleven o'clock tomorrow," and the agent can reply, "How about twelve o'clock? That is also okay." In this way, it only takes two turns.

As shown in the right figure, the dialogue manager is widely considered to be responsible for efficiency. So our problem is: how to learn an efficient dialogue model, or a dialogue manager, from the data?

Existing works are two-fold. Either they need too many manual efforts; for example, for reinforcement learning, we have to design the action strategy, the reward function, and the expert data for training and testing. Or, for sequence-to-sequence methods, they tend to generate generic responses, for example "I don't know" or "Yes, okay," and they cannot distinguish different contexts.

In this paper, we address the problem from the perspective of end-to-end dialogue modeling, in order to reduce the human intervention in system design, and we propose a new sequence-to-sequence model that models the looking-ahead ability. Our intuition is that, by predicting several future turns, the agent can make a better decision about what to say at the current turn, so as to achieve the dialogue goal as soon as possible.

Our method has several advantages: it is end-to-end, so it does not require too much manual work, and our experiments show it is more efficient than naive sequence-to-sequence methods.

This is the architecture of our overall model. From bottom to top, there are mainly three components: the bottom one is the encoding module, the intermediate one is the looking-ahead module, and the top one is the decoding module.

In the encoding module, we encode three kinds of information with bidirectional GRUs: the historical utterances, the current utterance, and the goals. The goals are represented by one-hot vectors, similar to a bag of words. After getting these representations, namely the bidirectional representations of the historical utterances, the bidirectional representations of the current utterance, and the representation of the goals, the five vectors are concatenated together, and the result is the input of the looking-ahead module.
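As a rough illustration, a minimal sketch of such an encoding step might look as follows in PyTorch; the class name, layer sizes, and the projection of the goal vector are illustrative assumptions, not the paper's exact implementation:

```python
# A minimal sketch of the encoding module as described, assuming PyTorch.
# Dimensions and the goal projection are hypothetical choices.
import torch
import torch.nn as nn

class EncodingModule(nn.Module):
    def __init__(self, vocab_size, goal_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRUs for the dialogue history and the current utterance.
        self.history_gru = nn.GRU(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.current_gru = nn.GRU(emb_dim, hidden_dim, bidirectional=True, batch_first=True)
        # The goal is a one-hot / bag-of-words vector projected to the same size.
        self.goal_proj = nn.Linear(goal_size, hidden_dim)

    def forward(self, history_ids, current_ids, goal_onehot):
        # Final hidden states have shape (num_directions, batch, hidden_dim).
        _, h_hist = self.history_gru(self.embed(history_ids))
        _, h_curr = self.current_gru(self.embed(current_ids))
        g = self.goal_proj(goal_onehot)
        # Concatenate the five vectors: forward/backward history,
        # forward/backward current utterance, and the goal representation.
        return torch.cat([h_hist[0], h_hist[1], h_curr[0], h_curr[1], g], dim=-1)
```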

The looking-ahead module is a little different from the bidirectional GRUs over utterances: it has only one direction, but the predicted future turns are anchored at the first hidden state, because that state is used to predict the actual system utterance for the current turn. So the information is propagated forward and then backward, and by combining the information from the two directions, each future turn is predicted by the GRU models. The outputs are the combined hidden states h̄_1, h̄_2, through h̄_K.
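One way to realize this forward-then-backward roll-out is sketched below, assuming a GRU cell is unrolled K steps and the states are then combined; this is a guess at the mechanism from the description, not the paper's exact formulation:

```python
# A rough sketch of the looking-ahead roll-out: unroll K future-turn states
# forward, propagate information backward, and combine into h-bar states.
import torch
import torch.nn as nn

class LookAheadModule(nn.Module):
    def __init__(self, input_dim, hidden_dim=256, k_steps=4):
        super().__init__()
        self.k_steps = k_steps
        self.init_proj = nn.Linear(input_dim, hidden_dim)
        self.fwd_cell = nn.GRUCell(hidden_dim, hidden_dim)  # forward over future turns
        self.bwd_cell = nn.GRUCell(hidden_dim, hidden_dim)  # backward pass
        self.combine = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, enc_vec):
        h = self.init_proj(enc_vec)
        fwd = []
        for _ in range(self.k_steps):        # unroll K predicted future turns
            h = self.fwd_cell(h, h)
            fwd.append(h)
        bwd, b = [], torch.zeros_like(h)
        for f in reversed(fwd):              # send information back toward turn 1
            b = self.bwd_cell(f, b)
            bwd.append(b)
        bwd.reverse()
        # Combined states h-bar_1 ... h-bar_K; h-bar_1 drives the current turn.
        return [self.combine(torch.cat([f, b], dim=-1)) for f, b in zip(fwd, bwd)]
```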

This modeling looks like a shoelace, so we design a new algorithm to learn the model: at each training step, part of the parameters are fixed and the others are updated, and then they take turns. It looks like an expectation-maximization algorithm, but it is not really EM.
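A minimal sketch of such an alternating update is given below, assuming two parameter groups that are trained in turn; the attribute names (model.lookahead, model.loss) are hypothetical, and the paper's actual schedule may differ:

```python
# Alternating ("shoelace") training step: freeze one parameter group,
# update the other, and swap on the next step.
def alternating_step(model, batch, opt_lookahead, opt_rest, step):
    lookahead_params = set(model.lookahead.parameters())  # hypothetical attribute
    train_lookahead = (step % 2 == 0)        # alternate which group learns
    for p in model.parameters():
        in_group = p in lookahead_params
        p.requires_grad = in_group if train_lookahead else not in_group
    loss = model.loss(batch)                 # combined three-term loss (hypothetical)
    opt = opt_lookahead if train_lookahead else opt_rest
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```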

In the decoding module, we summarize the predicted future dialogues and generate the real system utterance with an attention model.
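For concreteness, a decoder step attending over the predicted future-turn states could look like the following; this uses standard dot-product attention as a stand-in, since the talk does not specify the exact attention variant:

```python
# A minimal sketch of decoding with attention over the stacked looking-ahead
# states h_bar (batch, K, hidden_dim); the paper's variant may differ.
import torch
import torch.nn as nn

class AttnDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.GRUCell(emb_dim + hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, prev_token, h_dec, h_bar):
        scores = torch.bmm(h_bar, h_dec.unsqueeze(-1)).squeeze(-1)  # (batch, K)
        attn = torch.softmax(scores, dim=-1)
        context = torch.bmm(attn.unsqueeze(1), h_bar).squeeze(1)    # (batch, hidden)
        h_dec = self.cell(torch.cat([self.embed(prev_token), context], dim=-1), h_dec)
        return self.out(h_dec), h_dec        # token logits, new decoder state
```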

The loss function contains three terms: the first is for language modeling, the second is for modeling the looking-ahead ability, and the last is for predicting the final state of the conversation.
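Written out, the objective could take the following form; the term weights and the binary agree/disagree formulation of the final state are assumptions for illustration:

```python
# Three-term objective as described: language modeling on the current response,
# a looking-ahead term on predicted future turns, and final-state prediction.
import torch.nn.functional as F

def total_loss(lm_logits, target_ids,        # current system utterance
               future_logits, future_ids,    # predicted vs. actual future turns
               state_logit, state_label,     # final conversation state (0/1)
               alpha=1.0, beta=1.0, gamma=1.0):  # hypothetical weights
    l_lm = F.cross_entropy(lm_logits.transpose(1, 2), target_ids)
    l_look = F.cross_entropy(future_logits.transpose(1, 2), future_ids)
    l_state = F.binary_cross_entropy_with_logits(state_logit, state_label)
    return alpha * l_lm + beta * l_look + gamma * l_state
```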

Now for the experiments. The datasets should have goals. We use two kinds of datasets: one is a negotiation dataset from an object-division task, and the other is generated with goals; you can refer to the paper for the details.

For preparing the training and testing samples, consider the example: at every turn, we build a sample with the historical utterances and the current utterance. In total we have about 30K samples for dataset one and 10K for dataset two. We use the goal achievement ratio and the average dialogue turns as the two metrics.
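These two metrics are straightforward to compute; a small sketch follows, with hypothetical record fields for the goal-achieved flag and the turn count:

```python
# Goal achievement ratio and average dialogue turns over a set of dialogues.
def evaluate(dialogues):
    achieved = sum(1 for d in dialogues if d["goal_achieved"])
    goal_ratio = achieved / len(dialogues)                            # goal achievement ratio
    avg_turns = sum(d["turns"] for d in dialogues) / len(dialogues)   # average dialogue turns
    return goal_ratio, avg_turns
```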

To evaluate our method, we use a user simulator, which is a sequence-to-sequence model with goals, and two human evaluators are also invited to talk with the agents. We implemented the network as described, and the parameter-setting details can be found in the paper.

Here are the experimental results. In the table there are four models. "Seq2seq with goal" means encoding the historical utterances and the goals together, and then outputting the system utterance. "Seq2seq + goal + state" means it also predicts the final conversation state, agree or disagree. "Seq2seq + goal + look" means it can look ahead but does not predict the final state. The last one, our full model, can do everything. We can find that the model with the looking-ahead ability shows the best efficiency performance, under both the user simulator and the human evaluators.

Below are the parameter-tuning results. The left four figures show the performance with different looking-ahead steps; we find that setting the step to four is best. We think this parameter is empirical and depends on the datasets. The right four figures show the performance with different hidden-state dimensions, from 128 to 1024; we find that 256 can be better.

Here are example dialogues with the agents. The left one shows that sometimes, if the agent tends to agree, it wastes more dialogue turns. The right one shows that, although both dialogues reach agreement, our model spends fewer dialogue turns. Of course, here we removed the unknown words, because the language generation is not so perfect.

To sum up, this paper proposed an end-to-end model for the problem of how to learn an efficient dialogue manager without taking too much manual work. Experiments on two datasets illustrate that our model can be more efficient. The contributions include: a new problem from the perspective of deep learning, a novel method to model the looking-ahead ability, and the effective experiments. In the future, we will investigate other methods for the problem; also, the language generation quality should be paid more attention. That is all for this paper. Thank you.