0:00:15so my name is guillaume dubuisson duplessis and i'm currently a
0:00:19postdoctoral researcher
0:00:22and i'm going to present this work with chloé clavel and frédéric landragin
0:00:30first let me say a few words about the context of this work
0:00:34so this work is part of the european project
0:00:39aria-valuspa
0:00:40which aims at designing artificial retrieval of information assistants
0:00:46and these assistants take the form of virtual agents
0:00:51that are able to engage in a multimodal interaction
0:00:57involving verbal and nonverbal behavior
0:01:01these agents also aim at adapting to the user
0:01:07and adapting for instance to unexpected situations such as interruptions
0:01:12as well as to the social and emotional state of the user
0:01:18in this project we are interested in convergence and verbal alignment
0:01:25as shown by communication accommodation theory
0:01:29convergence of behaviour is a very important feature of human-human interaction
0:01:36that occurs both at a low level such as posture accent and speech rate and at a
0:01:43high level such as the mental emotional and cognitive level
0:01:48and in particular
0:01:53human dialogue participants
0:01:56align at many linguistic levels such as the lexical syntactic and
0:02:01semantic ones
0:02:05and one consequence of successful alignment in dialogue is a certain repetitiveness
0:02:16as a consequence there is going to be a
0:02:20set of dialogue routines that are going to emerge between the dialogue participants
0:02:27in the form of lexical items for instance
0:02:31so on the slide you can see two examples of dialogue which represent the
0:02:35same phase the introduction phase of a negotiation
0:02:40and in this
0:02:43in these examples
0:02:45the dialogue patterns
0:02:47are coloured and these patterns are the main focus of this work
0:02:52so on the left you can see that there are very few patterns
0:02:56in this case we say that the verbal alignment is very low on the contrary
0:03:01on the right example you can see that
0:03:03the participants align on many dialogue routines
0:03:09such as nice to meet you how are you good
0:03:13in this case we are going to say that the verbal alignment is higher
0:03:18so the main focus of this work is to propose measures of verbal alignment
0:03:22based on these dialogue routines
0:03:27so why think about alignment for human-machine interaction so first
0:03:32we can see from human-human interaction that this is a subconscious phenomenon that naturally
0:03:38appears and it has been shown by previous work
0:03:42that speakers reuse lexical as well as syntactic structures from previous utterances
0:03:50on top of that
0:03:53verbal and temporal alignment may facilitate successful task-oriented conversations
0:03:59however in human-machine interaction
0:04:02it has been shown that linguistic alignment occurs
0:04:06and in particular users adopt lexical items and syntactic structures from the system
0:04:12but this is only one way
0:04:15in most systems the user aligns with the system but the system is not able to
0:04:23so in this work our goal is to provide a virtual agent with the ability
0:04:27to detect the alignment behaviour of its human participant of its human interlocutor
0:04:32and to align or not depending on the strategy with the user
0:04:37so the main motivation
0:04:39for using verbal alignment for an agent
0:04:45is that it provides a natural source of variation in dialogue and in particular for
0:04:51natural language generation
0:04:53it also makes it possible to take into account the socio-emotional behaviour of the
0:04:58user and works
0:05:00as a social glue
0:05:03it's also a way of adapting without the need of an extensive user profile
0:05:10and what we expect from
0:05:13providing an agent with the ability to verbally align is to enhance the
0:05:19agent's believability likability and friendliness to improve
0:05:24interaction naturalness as well as to maintain and foster user engagement
0:05:30finally we aim at improving collaboration in task-oriented dialogue
0:05:37in this work our approach is to provide measures characterizing verbal alignment
0:05:45that are going to be based on the transcript of the dialogue and on the shared
0:05:49expressions at the lexical level
0:05:52and our proposition stands on
0:05:55three main parts
0:05:57the first one is to extract
0:06:00the dialogue routines that is to say the shared expressions from the dialogue transcripts
0:06:05the second part is to build an expression lexicon from these shared expressions
0:06:11that keeps track of the expressions and some features of these expressions
0:06:17and then deriving measures of verbal alignment from the dialogue transcript and the
0:06:23expression lexicon
0:06:25let me say a few words about the automatic building of the expression lexicon
0:06:29so in this work we provide a model where we define
0:06:33a shared expression as a surface text
0:06:37pattern at the utterance level that has been produced by both speakers in dialogue
0:06:42so for instance you can see
0:06:45an example of dialogue
0:06:47on the left of the slide and in the middle
0:06:50there is a shared expression that's not gonna work for me
0:06:55which is used to reject a proposition in a negotiation dialogue and that is used
0:06:59by interlocutor a
0:07:02in the first turn and by interlocutor b in a later turn
0:07:09so this shared expression is part of the expression lexicon
0:07:14and has been initiated by a
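The definition above (a surface text pattern produced by both speakers, with the first producer as initiator) can be sketched in Python. This is a brute-force illustration, not the extraction method of the talk: it enumerates every contiguous token sequence, keeps all shared sub-sequences as well, and the function name `build_expression_lexicon` and the middle utterance of the sample dialogue are assumptions made for the example.

```python
from collections import defaultdict


def build_expression_lexicon(dialogue):
    """Map each shared expression to the speaker who produced it first.

    dialogue: list of (speaker, utterance) pairs in chronological order.
    Here an expression is any contiguous token sequence inside one utterance.
    """
    first_speaker = {}           # expression -> speaker of its first occurrence
    speakers = defaultdict(set)  # expression -> speakers who produced it
    for speaker, utterance in dialogue:
        tokens = utterance.lower().split()
        for start in range(len(tokens)):
            for end in range(start + 1, len(tokens) + 1):
                expr = " ".join(tokens[start:end])
                first_speaker.setdefault(expr, speaker)
                speakers[expr].add(speaker)
    # Keep only the patterns that both speakers produced.
    return {e: first_speaker[e] for e, who in speakers.items() if len(who) >= 2}


dialogue = [
    ("A", "that's not gonna work for me"),
    ("B", "how about two apples"),  # hypothetical turn, not from the talk
    ("B", "that's not gonna work for me either"),
]
lexicon = build_expression_lexicon(dialogue)
print(lexicon["that's not gonna work for me"])  # initiator of the expression
```

The quadratic enumeration is only meant to make the definition concrete; it says nothing about how sub-expressions of a longer shared expression should be treated.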
0:07:18and so in this paper we present a framework of expressions that may be embedded
0:07:24or not
0:07:26and we also provide
0:07:28a way of automatically extracting these shared expressions to build the expression lexicon
0:07:34automatically
0:07:36so this is an instance of sequential pattern mining
0:07:42and it involves the use of bioinformatics algorithms that are usually used to
0:07:49mine dna sequences
0:07:52so in short
0:07:54it involves the solving of the multiple common subsequence problem with the
0:08:00generalized suffix tree data structure
0:08:03and through this
0:08:05kind of sequential pattern mining we can build from the transcript of the dialogue
0:08:10an expression lexicon
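To give an idea of the data structure, here is a minimal Python sketch that uses an uncompressed token-level suffix trie as a stand-in for the generalized suffix tree mentioned in the talk. It is quadratic in utterance length, so it does not reflect the efficient construction; the class and function names are assumptions, and the traversal only returns right-maximal shared paths (the longest shared continuation on each branch).

```python
class Node:
    def __init__(self):
        self.children = {}     # token -> Node
        self.speakers = set()  # speakers whose utterances contain this path


def build_suffix_trie(dialogue):
    """Insert every token-level suffix of every utterance into one trie."""
    root = Node()
    for speaker, utterance in dialogue:
        tokens = utterance.lower().split()
        for start in range(len(tokens)):
            node = root
            for token in tokens[start:]:
                node = node.children.setdefault(token, Node())
                node.speakers.add(speaker)
    return root


def shared_paths(node, prefix=()):
    """Yield, on each branch, the longest path that both speakers produced."""
    extended = False
    for token, child in node.children.items():
        if len(child.speakers) >= 2:
            extended = True
            yield from shared_paths(child, prefix + (token,))
    if prefix and not extended:
        yield " ".join(prefix)


dialogue = [
    ("A", "that's not gonna work for me"),
    ("B", "that's not gonna work for me either"),
]
root = build_suffix_trie(dialogue)
result = sorted(shared_paths(root), key=len, reverse=True)
print(result[0])
```

A real generalized suffix tree compresses unary paths and can be built in linear time, which is what makes the approach practical on whole corpora.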
0:08:13then from the dialogue transcript and the expression lexicon we derive some measures
0:08:21to characterize verbal alignment
0:08:23so the first measures are global to a single dialogue
0:08:29namely the expression lexicon size that is to say the number of unique
0:08:33shared expressions established between the dialogue participants
0:08:37and the expression variety which is the expression lexicon size normalized by
0:08:43the length of the dialogue given as the total
0:08:47number of tokens in the dialogue
0:08:50we also derive
0:08:53measures that are specific to the speakers
0:08:57first the expression repetition
0:09:04measure which gives the amount of tokens that is dedicated
0:09:09to the repetition of an expression by the speaker
0:09:13over the total amount of tokens
0:09:15and the initiated expression ratio which determines for a given speaker the number
0:09:23of expressions that have been initiated by him
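The four measures described above can be sketched as follows in Python. Several choices here are assumptions rather than the definitions of the paper: the repetition measure counts every token covered by an occurrence of a lexicon expression (including the initiating occurrence), the initiated expression ratio is normalized by the lexicon size, and the names `alignment_measures`, `ELS`, `EV`, `ER` and `IE` are introduced only for this example.

```python
def alignment_measures(dialogue, lexicon):
    """dialogue: list of (speaker, utterance); lexicon: {expression: initiator}."""
    expressions = [e.split() for e in lexicon]
    total_tokens = 0
    tokens_by_speaker = {}
    covered_by_speaker = {}
    for speaker, utterance in dialogue:
        tokens = utterance.lower().split()
        total_tokens += len(tokens)
        covered = [False] * len(tokens)
        # Mark every token that belongs to an occurrence of a shared expression.
        for expr in expressions:
            n = len(expr)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] == expr:
                    covered[i:i + n] = [True] * n
        tokens_by_speaker[speaker] = tokens_by_speaker.get(speaker, 0) + len(tokens)
        covered_by_speaker[speaker] = covered_by_speaker.get(speaker, 0) + sum(covered)

    els = len(lexicon)  # expression lexicon size
    er, ie = {}, {}
    for speaker, n_tokens in tokens_by_speaker.items():
        er[speaker] = covered_by_speaker[speaker] / n_tokens if n_tokens else 0.0
        initiated = sum(1 for who in lexicon.values() if who == speaker)
        ie[speaker] = initiated / els if els else 0.0
    return {
        "ELS": els,
        "EV": els / total_tokens if total_tokens else 0.0,  # expression variety
        "ER": er,  # expression repetition, per speaker
        "IE": ie,  # initiated expression ratio, per speaker
    }


dialogue = [
    ("A", "that's not gonna work for me"),
    ("B", "that's not gonna work for me either"),
]
lexicon = {"that's not gonna work for me": "A"}  # hand-written for the example
measures = alignment_measures(dialogue, lexicon)
print(measures)
```

With this toy input, speaker A initiates the only expression, while speaker B spends six of seven tokens repeating it.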
0:09:31so to study the proposed framework we present in this paper a corpus-based contrastive study
0:09:40that stands on real interaction corpora involving human-human and human-agent
0:09:46dialogues
0:09:47as well as artificial corpora which
0:09:52are used as a baseline
0:09:54and in this work we provide several studies comparing
0:09:59the real interaction corpora to our baseline
0:10:02comparing verbal alignment in human-human corpora and human-agent corpora and also studying
0:10:07some conditions on the human-agent corpus such as the negotiation type
0:10:13so let me say a few words about
0:10:15the negotiation corpora that we are using in this work
0:10:21so these negotiation corpora
0:10:26involve two participants that are required to find an agreement
0:10:32over the amount of
0:10:36items they have to share
0:10:39and this negotiation task can be integrative that is to say that it can
0:10:45turn out to be a win-win for both participants
0:10:48or competitive
0:10:54these corpora are available in a human-human setting and in a human-agent setting
0:11:01you can see on the slide an image from the human-agent corpus
0:11:08in the human-agent setting
0:11:11the agent is controlled by a wizard of oz system
0:11:15that has been designed to be as natural as possible
0:11:19and this wizard of oz system involves more than eleven thousand possible utterances so
0:11:26the agent has a wide variety of utterances to express itself
0:11:35the human-human corpus involves eighty-four dialogues while the human-agent corpus
0:11:41involves one hundred and fifty-four dialogues
0:11:46from these corpora we constructed our baseline the surrogate corpora
0:11:53which have been designed to break the dynamics of the interactive alignment process
0:12:00and to do that we decided to break the coupling between utterances
0:12:04so starting from a real interaction dialogue
0:12:08what we have done is that we have kept
0:12:12all the utterances from one speaker
0:12:15while substituting all the utterances from the other speaker
0:12:20by utterances which were drawn from one pool
0:12:26sorry from several pools
0:12:30the utterances are chosen randomly
0:12:32and the pools are specific
0:12:37to the human participant
0:12:39to the human participant facing an agent and to the agent
0:12:45so on the slide you can see an example of a real dialogue from the corpus
0:12:50on the left
0:12:52and one randomized version where all the utterances from the human participant have been
0:13:00substituted by randomly chosen ones
0:13:03so the main idea of these corpora is to break the dynamics of interactive alignment
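The construction of one randomized dialogue can be sketched in Python as below. This is a minimal illustration under assumptions: the function name `make_surrogate`, the speaker labels, the sample utterances and the content of the pool are hypothetical, and the talk does not say how the pools were assembled beyond being specific to each speaker role.

```python
import random


def make_surrogate(dialogue, replaced_speaker, pool, seed=None):
    """Keep one speaker's utterances and replace the other's with random ones.

    dialogue: list of (speaker, utterance) pairs.
    pool: utterances available for the replaced speaker's role.
    """
    rng = random.Random(seed)  # seeded generator for reproducible surrogates
    return [
        (speaker, rng.choice(pool) if speaker == replaced_speaker else utterance)
        for speaker, utterance in dialogue
    ]


dialogue = [
    ("agent", "nice to meet you"),
    ("human", "nice to meet you too"),
    ("agent", "how are you"),
    ("human", "good how are you"),
]
pool = ["i would like the lamp", "that sounds fair", "what do you want"]
surrogate = make_surrogate(dialogue, "human", pool, seed=0)
for speaker, utterance in surrogate:
    print(speaker, utterance)
```

Because the replaced utterances no longer respond to the kept ones, any shared expression found in the surrogate dialogue arises incidentally rather than through interactive alignment.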
0:13:14so the first hypothesis that we are investigating in this work
0:13:21is that the dialogue participants should constitute a richer expression lexicon
0:13:27in the real interaction corpora than what would happen incidentally in the surrogate corpora
0:13:35in the artificial ones
0:13:37and so to investigate this hypothesis we looked at the expression variety
0:13:44measure from our model
0:13:48what we found
0:13:50is that there is a significant difference between the human-human corpus
0:13:56and its artificial counterpart as well as between the human-agent
0:14:00corpus and its
0:14:03artificial counterpart
0:14:06in the sense that the expression variety is higher in the real interaction
0:14:11corpora than in the surrogate ones
0:14:15so what we have observed is that we have provided
0:14:21some arguments to confirm this hypothesis in the sense that
0:14:27we have observed a richer expression lexicon in the real interaction corpora than
0:14:32in the artificial ones
0:14:34which have been designed to avoid
0:14:39the interactive alignment process and thus the constitution of an expression lexicon
0:14:47then we have been interested in the comparison of verbal alignment via the
0:14:54measures that we propose
0:14:57between the human-human corpus and the human-agent corpus
0:15:03so here what we expected is more verbal alignment from the human
0:15:10in the human-agent interaction
0:15:15than from the agent the main reason is that
0:15:18the agent even if it is a wizard of oz has not been designed
0:15:23to be able to align
0:15:24and the second reason is that
0:15:27the human participant may be influenced by beliefs about the limitations of the communicative
0:15:32abilities of the agent
0:15:35so to test this hypothesis we looked at the initiated expression ratio
0:15:42that we propose in our model as well as the expression repetition ratio
0:15:49and in the human-human interaction
0:15:53it turns out that there are no differences between the two speakers that
0:15:57is there is a symmetrical verbal alignment
0:16:01regarding these two measures
0:16:04both dialogue participants initiate
0:16:06approximately the same amount of expressions
0:16:10and they repeat also the same amount of
0:16:14of expressions
0:16:17this is not the case in the human-agent setting
0:16:21and we observe here
0:16:25an asymmetry
0:16:29this asymmetry
0:16:35this asymmetry
0:16:38can be summarized by the fact that
0:16:42the human participants adopt more wizard-initiated expressions
0:16:48which is not surprising because the wizard cannot
0:16:51easily adopt the human participant's expressions the human participants also dedicate
0:16:57more tokens to the repetition of expressions
0:17:01so here
0:17:03this gives some
0:17:07arguments to say that the human participant
0:17:10is influenced by its belief about the limitations
0:17:14of the communicative capabilities of the agent
0:17:17and it should be stressed that this asymmetry does not appear
0:17:23when considering the number of tokens produced by each speaker or when considering the
0:17:27shared proportion
0:17:29that is the proportion of shared vocabulary
0:17:35finally we looked at some conditions on the human-agent corpus and
0:17:42we have mainly focused on the negotiation type
0:17:47we wanted to see if there was an impact
0:17:50on the verbal alignment indicators
0:17:53given the type of negotiation so integrative negotiation which can turn out to be a
0:17:59win-win and distributive
0:18:04which is a competitive one
0:18:06and what we found is that
0:18:08both negotiation types have similar amounts that is a similar value for
0:18:16the expression variety
0:18:18that is to say that around
0:18:20the same amount of expressions
0:18:22are created in both types of dialogue but there is a clear difference in the
0:18:28expression repetition ratio
0:18:30which shows that
0:18:32in the competitive negotiation
0:18:36dialogue participants
0:18:40are verbally aligning more
0:18:42than in win-win negotiation
0:18:51in fact what we provide here is arguments about the fact that
0:18:59competitive negotiation
0:19:01leads to more verbal alignment and one hypothesis is that
0:19:07the participants need to verbally align more on counter-propositions
0:19:14so to conclude in this work we have proposed automatic and generic measures
0:19:20of verbal alignment based on sequential pattern mining at the level of the surface
0:19:25text of utterances
0:19:26that make it possible to characterize
0:19:30interesting aspects of verbal alignment such as the routinization process
0:19:35the degree of repetition between dialogue participants and the orientation of the verbal alignment
0:19:41we have contrasted human-human and human-agent
0:19:47verbal alignment showing that there is a symmetry in verbal alignment
0:19:53regarding our indicators
0:19:57in human-human interaction while there is an asymmetry in human-agent interaction
0:20:02and with this study we wanted to empirically confirm some hypotheses from the
0:20:08literature
0:20:10and the perspective that we want to explore is to use the measures that
0:20:16we propose in a dialogue system and it should be stressed that the measures are based on
0:20:21very efficient algorithms that is to say
0:20:26linear complexity algorithms
0:20:31we would like also to investigate this
0:20:36more deeply and to do a qualitative analysis of verbal alignment between
0:20:40human-human interaction and human-agent interaction
0:20:43such as a functional analysis of the repetitions
0:20:47and finally we would like to investigate
0:20:51other comparable human-human and human-agent corpora
0:20:55to confirm our results
0:20:57thank you for your attention and i'm now ready to answer your questions
0:21:14thanks for the talk i was wondering several things about adaptation actually one
0:21:20of them is i'm not quite sure you said something that on
0:21:26the way of the machine adapting to
0:21:30the user there's nothing out there
0:21:34do you have any idea why there is nothing out there because when i looked into it
0:21:40it was a slot filling kind of dialogue and that was difficult because you don't
0:21:44have a lot of data about the user
0:21:46to make the system adapt to but in this kind of data it might be
0:21:50different and also the second question is whether
0:21:52the measures that you come up with would work at the turn level
0:21:57so if you have the decision to change a lexical expression
0:22:00would those work to make changes at the turn level rather than
0:22:04several turns
0:22:06but like rather than taking into consideration example for
0:22:11so for the first question there are systems that are able to align
0:22:17there is some interesting work and it is pointed out in our paper
0:22:22the main disadvantage is that most of these systems are rule-based
0:22:27and specific to some domain
0:22:30or to some tasks
0:22:32and the idea in providing measures is to go towards a more data-driven way
0:22:38an automatic way of aligning
0:22:42but there are some systems that align
0:22:45and the second question
0:22:49so if i understand you well it is that we change the granularity of where
0:22:55well where we look for expressions
0:22:59so it should not be a problem i
0:23:03don't see a problem in using the same method and just changing the
0:23:07granularity
0:23:10of the units
0:23:13which we
0:23:14do you think you would keep the same accuracy
0:23:20i don't know we have one check because here we go to variable for a
0:23:26couple always very when the limited you challenge is
0:23:33if we look at
0:23:36i'm not sure i understand your point in fact
0:23:39we can talk afterwards yes
0:23:49hello i enjoyed the talk about how you are looking at the degree of
0:23:57repetition and what items you are looking at as repeated
0:24:01i think perhaps not counting
0:24:04probably so you get things like
0:24:08i'm interested in shares or whatever was and in the next one you're getting a
0:24:14in content items as being the repetition
0:24:18in terms of being you know sort of
0:24:22alignment which i think in this case where the participants don't really have so much
0:24:27choice in what they say like the first-person pronoun there is only one
0:24:35you have similar ones for me i think that if you were doing alignment
0:24:40on that it might also be the same sort of problem
0:24:45i think that it's just one of the difficulties when
0:24:51we work with alignment it is that it can be very
0:24:56it can be very specific
0:24:59words such as the difference between what time and at what time
0:25:05and it is going to be very important in that case
0:25:10and in this work we have chosen to
0:25:15select all the expressions
0:25:17and to count everything even though we are probably counting some
0:25:25expressions that are random and that are still going to happen even without alignment
0:25:32but what we show by comparing to the surrogate corpora
0:25:37i think is that
0:25:41when people align
0:25:44they will create more expressions
0:25:52i just if you were telling
0:25:57information for
0:26:01i think you would want to understand some of these things are alignment in some
0:26:07so that you would be producing delays
0:26:11and regarding that
0:26:14the expression lexicon keeps track of expressions and their features such as the frequency
0:26:20such as the recency of an expression we can use
0:26:24these features to filter out
0:26:28the interesting expressions
0:26:30but can you because i could be extremely frequent
0:26:34and it could be very recent as well
0:26:43we can choose
0:26:45to copy this behaviour
0:26:50we can choose to start my sentences with the same expression that you use for
0:26:54instance if i want to align
0:26:59thank you very much let's thank the speaker again