0:00:09 Good morning everyone. Our speaker is a senior researcher at the French National Centre for Scientific Research (CNRS), working in computer science. Her interests include databases, statistical models for natural language, grammar acquisition, lexical resources for NLP, syntactic and semantic parsing, and technology for language learning. She works on natural language generation from syntactic input and has led a shared task on generating text from RDF. Her talk this morning will be about planning-based and neural models for natural language generation across a variety of different NLG tasks. Please welcome her.
0:01:13 Can you hear me? Can you hear at the back? Okay, so good morning, and thank you for being here after last night.
0:01:24 So when I was invited to this workshop I was thrilled, of course, coming here and giving a talk. But then I was a little worried about the title of this workshop: synthesis. I think the introduction showed that I don't do synthesis; I don't do speech. In fact, I just work on text.
0:01:45 But of course there is a link between text-to-speech synthesis and generation, which is what I've been working on in recent years: natural language generation is the task of producing text, so you can see natural language generation as a preliminary step to text-to-speech synthesis. What I'm going to talk about today are different types of natural language generation tasks.
0:02:14 I will start with an introduction to how generation was done before deep learning, then I will show how the deep learning paradigm completely changed the approach to natural language generation, and I will talk about some issues with current neural approaches to text generation.
0:02:45 The work I present here is joint work with PhD students and colleagues whom I want to name here: a PhD student at LORIA, where I am based; a colleague from Bar-Ilan University; a PhD student jointly supervised between FAIR in Paris and our group; colleagues at Darmstadt University, including two PhD students under joint supervision; and finally another PhD student I co-supervise.
0:03:21 Okay, so first, what is natural language generation? Well, it's the task of producing text, but it's very different from natural language understanding because of the input. In natural language understanding the input is text; it is well defined, everybody agrees on that, and large quantities of text are available. Natural language generation is very different in that the input can be many different things. This is actually one of the reasons why the natural language generation community was very small for a very long time: compared to NLU, the number of papers on NLG was very small.
0:04:00 So what are the types of input? The three basic types of input are data, meaning representations, and text. Data would be, for example, data from databases or knowledge bases: structured data. Then you have meaning representations, which are devised by linguists and can be produced by computational linguistics tools; they are basically devised to represent the meaning of a text, or more generally of a sentence or a dialogue turn. Sometimes you want to generate from such a meaning representation, for example in the context of a dialogue system: the dialogue manager produces a meaning representation encoding a dialogue turn, and the generation task is to generate the text of the system turn in response to this meaning representation. And finally you can generate from text, which covers applications such as text summarization, text simplification, and sentence compression.
0:05:03 Those are the main types of input. Another complicating factor is that what we call the communicative goal can be very different. Sometimes you want to verbalise: for instance, if you have a knowledge base, you might want the system to just verbalise the content of the knowledge base so it is readable by human users. But other goals would be to respond to a dialogue turn, to summarize a text, to simplify a text, or even to summarize the content of a knowledge base. So those two factors meant that, before the neural era, natural language generation was divided into many different subfields, which didn't help given that the community was already pretty small, and there wasn't much communication between those subfields.
0:05:48 So why did we have these different subfields? Because essentially the problems are very different. When you are generating from data, there is a big gap between the input and the output text: the input is a data structure that does not look like text at all. It can even be the result of signal processing; it can be numbers from, say, a sensor. The input data is very different from text, and to bridge this gap you have to do many things. Essentially, you have to decide what to say and how to say it. What to say is more an AI problem: deciding what part of the data you want to select to actually verbalise, because if you verbalised all the numbers given by a sensor, the resulting text would make no sense at all.
0:06:44 So you have a content selection problem. Then you usually have to structure the content you have selected into a structure that resembles a text; this is more like an AI planning problem, and it was in fact often handled with planning techniques. And then comes the more linguistic part: once you have the text structure, how do you convert it into well-formed text? There you have to make many choices, so generation is really a choice problem, because there are many different ways of realizing things. You have problems such as: how to lexicalise a knowledge-base symbol; which referring expression to use to describe an entity (are you going to use a pronoun or a proper name?); and aggregation, that is, how to avoid repetitions, basically deciding when to use devices such as ellipsis or coordination to avoid redundancy in the output text even when there is redundancy in the input, say repeated values in a knowledge base. So basically, for generating from data, the consensus was that there was this big NLG pipeline where you had to model all of these steps.
0:08:00 If you generate from a meaning representation, the task was seen as completely different, mainly because the gap between the meaning representation and the sentence is much smaller; in fact, these meaning representations were devised by linguists. The consensus here was that you could have a grammar that describes the mapping between text and meaning. People developed such grammars, and because it is a grammar it also includes the notion of syntax, so it ensures that the text will be well formed. The idea was: you have a grammar that defines this association between text and meaning, and you can use it in both directions. Either you have a text and you use the grammar to derive its meaning, or you use it for generation: you start from the meaning and then you use the grammar to decide what the corresponding sentence is. Of course, as soon as these grammars have large coverage they become very ambiguous, so there is a huge ambiguity problem; it is basically not tractable: you get thousands of intermediate results, and the search space is huge. So usually you combine the grammar with statistical modules basically designed to reduce the search space and to limit the output to one or a few outputs.
0:09:30 And finally, generating from text: here again, a very different approach. The consensus was that when you generate from text there are basically four main operations you want to model (or some of them, depending on the application): split, rewrite, reorder, and delete. Splitting is about learning when to split a long sentence into several sentences, for example in simplification, where you want to simplify a text. Reordering is just moving constituents or words around, again because maybe you want to simplify or paraphrase; paraphrasing is another text-to-text generation application. You want to rewrite, again maybe to simplify or paraphrase, so rewriting a word or a phrase. And you want to decide what you can delete, in particular if you are doing simplification. So in general, three very different approaches to those three tasks, depending on what the input is.
0:10:32 And this completely changed with the neural approach. The neural approach really completely changed the field. Before, generation was a very small field, and now at ACL, the main computational linguistics conference, generation is one of the areas that gets the most submissions; I think it ranks second in number of submissions. So it has really changed completely. And why did it change? Because the encoder-decoder framework allows you to model all three tasks in the same way. The techniques and methods you develop to improve the encoder-decoder framework may be novel, but there is a common framework, which makes it much easier to take ideas from one subfield to another.
0:11:23 The encoder-decoder framework is very simple. You have your input, which can be data, text, or a meaning representation. It is encoded into a vector representation, and then you use the power of a neural language model to decode: the decoder produces the text one word at a time, using a recurrent network. And we know that neural language models are much more powerful than previous language models because they can take an unlimited amount of context into account.
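The encode-then-decode-one-word-at-a-time loop described above can be sketched in a few lines. This is a toy illustration, not the speaker's system: the "encoder" is just an embedding average, and the "language model" is a hand-written continuation table standing in for a learned network; the vocabulary and scores are invented for the example.

```python
# Toy encoder-decoder sketch: encode the input into one vector, then
# decode greedily, one token at a time, until an end-of-sequence token.

VOCAB = ["john", "is", "a", "pilot", "<eos>"]

# Hand-written stand-in for a neural decoder: a canned continuation
# table mapping a prefix to its most likely next token.
NEXT = {(): "john", ("john",): "is", ("john", "is"): "a",
        ("john", "is", "a"): "pilot", ("john", "is", "a", "pilot"): "<eos>"}

def encode(tokens, embeddings):
    """Encode a token sequence as the average of its embeddings."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for t in tokens:
        for i, x in enumerate(embeddings.get(t, [0.0] * dim)):
            vec[i] += x
    return [x / max(len(tokens), 1) for x in vec]

def score_step(context, prefix, token):
    """Score a candidate next token; a real model conditions on context."""
    return 1.0 if NEXT.get(tuple(prefix)) == token else 0.0

def decode(context, step, max_len=10):
    """Greedy decoding: pick the best-scoring next token at each step."""
    out = []
    while len(out) < max_len:
        tok = max(VOCAB, key=lambda w: step(context, out, w))
        if tok == "<eos>":
            break
        out.append(tok)
    return out
```

In a real system the scoring function is an RNN (or today a transformer) whose probabilities depend on both the encoded input and the decoded prefix; the greedy loop itself looks exactly like this.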
0:11:56 Okay, so we have this unifying framework, but of course the problems remain: the tasks are different and you still have to handle them somehow. So what I will do in this talk, based on some work we have been doing, is focus on two main points: how to improve encoding, or how to adapt encoding to various NLG tasks; and, if I have time, I will talk a little bit about training data. NLG has this problem of finding training data: it is usually a supervised approach, and supervised means you need training data, which in this case has to pair texts with inputs. But these inputs can be hard to get. These meaning representations, where do you get them from? Even getting an alignment, a parallel corpus between database fragments and the corresponding text, is very difficult. So often you don't have much data, and of course neural networks want a lot of data, so you have to be clever about what you do with the training data.
0:13:09 Okay, so on to encoding. I will talk about three different points. First, modeling graph-structured input. In the encoder-decoder framework, initially at least, the encoder was usually a recurrent network. No matter whether the input was text, a meaning representation, a graph, or a knowledge base, people were using a recurrent network, I think partly because the encoder-decoder framework was very successful in machine translation and people were building on that. But of course, after a while, people realized that some of this input is graphs, so maybe it's not such a good idea to model it as a sequence; let's do something else. So we will talk about how to model graph-structured input.
0:13:59 Then I will talk about generating from text, where I will focus on an application where the input is a very large quantity of text. The problem is that neural networks are only so good at encoding large quantities of text; it is a known problem in machine translation that the longer the input, the worse the performance. And here we are not talking about long sentences; we are talking about really long inputs, with hundreds of thousands of tokens. So what do you do in that case if you still want to do text-to-text generation? And I will talk a little bit about normalization, devices that can be used in some applications, again because the data is not so big, to improve it so you can generalise better.
0:14:46 Okay, so first, encoding graphs. As I said, the inputs are graphs. They occur, for example, if you have an AMR as the input meaning representation. Here you have an example from the AMR 2017 challenge, where the task was to generate from meaning representations. AMR means Abstract Meaning Representation; it is a meaning representation which can be written as shown on the right, but basically you can see it as a graph where the nodes are the concepts and the edges are the relations between the concepts. So here this meaning representation would correspond to the sentence "US officials held an expert group meeting in January 2002 in New York". You see that at the top of the tree you have the "hold" concept, then the ARG0 is the person, and you can even read that the country is basically the United States; and then there are some further concepts.
0:15:53 So the task was to generate from this AMR, and the AMR can be seen as a graph. There was another challenge in 2017, which was about generating from sets of RDF triples. Here, what we did is we extracted the sets of RDF triples from DBpedia; we had a method to ensure that these sets of RDF triples could be matched to meaningful short texts, and then we had crowdsourcers associate the sets of triples with the corresponding text. So the dataset in this case was a parallel dataset where the input was a set of triples and the output was a text verbalising the content of these triples.
0:16:35 You probably cannot see it here, but in the example I show here you have three triples; a triple is subject, property, object. The first triple is John Blaha, dateOfBirth, and the date; then John Blaha, birthPlace, and the place; and then John Blaha, occupation, fighter pilot. So you have these three triples, and the task would be to generate something like "John Blaha, born in San Antonio on 1942-08-26, worked as a fighter pilot." So this was the task.
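The three triples in this example, and the graph view of them described next (entities as nodes, properties as labeled edges), can be written down directly. The entity and property spellings follow the example; a real RDF store would use full URIs.

```python
# The example's (subject, property, object) triples.
triples = [
    ("John_Blaha", "dateOfBirth", "1942-08-26"),
    ("John_Blaha", "birthPlace", "San_Antonio"),
    ("John_Blaha", "occupation", "Fighter_pilot"),
]

def as_graph(triples):
    """View a triple set as a graph: entities are nodes, each property
    becomes a labeled edge from its subject to its object."""
    nodes = set()
    edges = {}  # (subject, object) -> property label
    for s, p, o in triples:
        nodes.update((s, o))
        edges[(s, o)] = p
    return nodes, edges
```

This graph view is exactly what makes graph encoders applicable to the RDF-to-text task: the shared subject node connects the three facts.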
0:17:06 And the point again here is that when you are generating from data, as we are doing here, this data can be seen as a graph, where the nodes are the subjects and objects (the entities in your triples) and the edges are the relations between them.
0:17:27 Okay, so as I said, initially, for these two tasks, people were simply using recurrent networks. They were linearising the graph, that is, turning the graph into a sequence using some kind of traversal method; they then had a sequence of tokens, and they just encoded it using a recurrent network. Here you see an example where the tokens input to the RNN are basically the concepts and the relations present in the meaning representation, and then you decode from that. Of course there are problems. Intuitively it's not very nice: you are modeling a graph as a sequence. And technically there are problems that occur as well: dependencies that are local in the graph can become long-range in the sequence. These two edges here are at the same distance from the root in the initial graph, but once the graph is linearised, you see that the first edge is much closer to its node than the second one. So the linearization is creating long-range dependencies, and we know that LSTMs are not very good at dealing with long-range dependencies. So technically, too, you think: well, maybe it's not such a great idea.
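The effect described here, two equally local graph edges ending up far apart in the linearised token sequence, is easy to reproduce. This is a made-up toy graph, not the slide's AMR; the traversal is a plain depth-first walk like the early sequence encoders used.

```python
# Linearise a graph by depth-first traversal, emitting node and edge
# labels as a flat token sequence (as early RNN-based encoders did).
def linearize(graph, node):
    tokens = [node]
    for label, child in graph.get(node, []):
        tokens.append(label)
        tokens.extend(linearize(graph, child))
    return tokens

# Toy graph: the root has two edges with the same label, but the first
# child has a sizeable subtree, which pushes the second edge far away
# from "root" in the linearised sequence.
graph = {
    "root": [("edge", "a"), ("edge", "b")],
    "a": [("rel", f"a{i}") for i in range(5)],
}
seq = linearize(graph, "root")
# Both "edge" tokens are one hop from "root" in the graph, yet in the
# sequence one sits right next to "root" and the other 13 tokens away.
```

Scaling the subtree under "a" makes the gap arbitrarily large, which is exactly the long-range dependency problem for the sequence encoder.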
0:19:00 So people have been looking at this, and they proposed various graph encoders. The idea is: instead of using an LSTM to encode your linearised graph, you use a graph encoder which directly models the relations between the nodes inside the graph, and then you decode from the output of the graph encoder. There were several proposals, which I won't go into in detail, but basically one proposed a graph convolutional network, and two other approaches used graph recurrent networks.
0:19:37 Okay, so we built on this idea. I started with this introduction to pre-neural NLG because I think it is important to know about the history of NLG in order to have ideas about how to improve the neural approaches, and here our proposal was really based on previous work on grammar-based generation: this idea that you have a grammar that you can use to produce a text. In this pre-neural work, what people showed is: you have a grammar and you have a meaning representation, and you use the grammar to tell you which sentences are associated with this meaning representation. You see, it is like a reversed parsing problem. If you have a sentence and a grammar, and you want to decide which meaning representations or syntactic trees the grammar associates with the sentence, that is a parsing problem. What I am describing just reverses it: instead of starting from the text, you start from the meaning representation and ask what the grammar tells you, that is, which sentences the grammar associates with this meaning representation.
0:20:54 It was a reversed parsing problem, and people started working on this reversed parsing problem to generate sentences. They found it was a very hard problem because of all this ambiguity, and they had two types of algorithms, bottom-up and top-down. Either you start from the meaning representation and try to build the syntactic tree that is allowed by the grammar, and out of that you get the sentence; or you go top-down: you use the grammar and try to build the derivation that maps to your input meaning representation. So there were these two approaches, and they both had problems, and what people did in the end is they combined both: they used hybrid algorithms exploiting both top-down and bottom-up information.
0:21:39 So here is what we did. Those graph encoders have a unique representation, a single graph encoding, of the input graph, the input meaning representation. What we wanted to do is build on this idea that both bottom-up and top-down information are important. So we encode each node in the graph using two encoders: one that goes top-down through the graph, and another that goes bottom-up. What this gives us is that each node in the graph gets two encodings, reflecting the top-down view and the bottom-up view of the graph.
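The two-encoders-per-node idea can be sketched with a trivial stand-in: here the "top-down encoding" of a node is just its depth (information propagated from the root) and the "bottom-up encoding" is its height (information propagated from the leaves). The tree, inspired by the AMR example, is invented; the real encoders are learned networks, not counters.

```python
# Dual encodings: every node gets one value computed top-down from the
# root and one computed bottom-up from the leaves.
def top_down(tree, node, depth=0, enc=None):
    enc = {} if enc is None else enc
    enc[node] = depth                         # information flowing down
    for child in tree.get(node, []):
        top_down(tree, child, depth + 1, enc)
    return enc

def bottom_up(tree, node, enc=None):
    enc = {} if enc is None else enc
    children = tree.get(node, [])
    heights = [bottom_up(tree, c, enc)[c] for c in children]
    enc[node] = 1 + max(heights) if heights else 0  # info flowing up
    return enc

tree = {"hold": ["person", "meet"], "person": ["country"], "meet": []}
td, bu = top_down(tree, "hold"), bottom_up(tree, "hold")
# Each node is now represented by the pair (td[node], bu[node]).
```

The point of the construction survives the simplification: the pair distinguishes, say, a leaf near the root from a leaf deep in the tree, which a single encoding direction cannot.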
0:22:21 And in terms of numbers, we could show that we could outperform the state of the art. Those were the state-of-the-art systems at the time; this is a more recent one, so of course we are no longer state of the art, but at the time we were, and we were improving a little bit over the previous approaches.
0:22:46 More importantly than all those numbers, there is always a caveat: evaluation is very difficult for generated text, because you cannot look at the outputs one by one; you would have to do a human evaluation. But if you have large quantities and you want to compare many systems, you have to have an automatic metric. What people use is BLEU, borrowed from machine translation, and it has well-known problems: you can generate a perfectly correct sentence that matches the input, but if it does not look like the reference sentence, which is what you compute your BLEU against, then it will get a very low score. So you have to have some other evaluation, or treat the scores carefully.
0:23:30 So here is what we did. One known problem with neural networks is semantic adequacy: they generate very nice-looking texts, because these language models are very powerful, but often the output does not match the input. That is problematic, because in a generation application the output has to match the input; otherwise it is very dangerous generation, in a way. What we tried to do here is measure the semantic adequacy of a generator: how much the generated text matches the input.
0:24:08 What we did is we used a textual entailment system: a system that, given two sentences, tells you whether the first one entails the other, that is, whether the second sentence is implied, entailed, by the first. And if you check both directions on two sentences T and Q, so that T entails Q and Q entails T, then you know that T and Q are semantically equivalent; logically, that would be the conclusion.
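The bidirectional check is simple to state in code. The `entails` function below is a deliberately crude stub (word containment) standing in for the trained textual-entailment system the talk refers to; only the two-direction logic is the point.

```python
def entails(premise, hypothesis):
    """Stub entailment test: treat entailment as word containment.
    A real system here is a trained textual-entailment classifier."""
    return set(hypothesis.split()) <= set(premise.split())

def semantically_equivalent(t, q):
    """T and Q are equivalent iff T entails Q and Q entails T."""
    return entails(t, q) and entails(q, t)
```

Applied to generation evaluation, T is the generated sentence and Q the reference: entailment in one direction only signals missing or added content, while equivalence signals semantic adequacy.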
0:24:36 So we did something similar: we wanted to check semantic equivalence on text. We used these tools developed in computational linguistics to determine whether two sentences are in an entailment relation, and we looked at both directions, comparing the reference and the generated sentence. What you see here is that the graph approach is much better at producing sentences that are entailed by the reference, and also much better at producing sentences that entail the reference.
0:25:14 We also did a human evaluation, where basically we gave two questions to the human evaluators: is it semantically adequate, that is, does the output text match the input, and is it readable? Again, in orange is our system and the others are sequence systems, and you see that there is a large improvement. So this all points in the direction that using a graph encoder, when you have a graph, at least a meaning representation that is a graph, is a good idea.
0:25:47 Okay, another thing we found is that it is often valuable to combine local and global information: local information meaning local to the node in the graph, and global information about the structure of the surrounding graph.
0:26:06 So this is still the same dual top-down/bottom-up setup; this is a picture of the system. We have this graph encoder that encodes the top-down view of the graph and the bottom-up view, and these give the encodings of the nodes. So you end up with three embeddings for each node: one embedding is basically the embedding of the label, the concept, so it is a word embedding; and the other two are the bottom-up and top-down embeddings of the node. What we then do is pass them through an LSTM, so we have a notion of context for each node, given by the preceding nodes in the graph, and we found that this also improved our results.
0:27:05 We also applied this local-plus-global idea to another task: surface realisation. There is another challenge on generating from dependency trees. The input meaning representation in this case is an unordered dependency tree, where the nodes are decorated with lemmas, something like this, and what you have to do is generate a sentence from it. Basically this task has two subtasks: one is how to reorder those lemmas into a correct sentence, and then, once you have the correct order, how to inflect the words; for example, you want "apple" to become "apples" in this case.
0:27:57 So we worked on that; this was also joint work with the PhD students I mentioned. The way we handled it was as follows. I am just focusing here on the word ordering problem, how to map this unordered tree to a sequence of elements; I am not talking about the word inflection problem.
0:28:31 What we do is we basically binarize the tree first, so everything becomes binary, and then we use a multilayer perceptron to decide on the order of each child with respect to its head. We build a training corpus where we say: I know from the reference that "I" precedes "likes", et cetera. This gives the training corpus, and the task is basically: given the child and the head, the parent, how do you order them? Does the parent come first or second?
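The ordering step can be sketched as a recursion over the tree plus a pairwise decision. The `head_first` rule below is a hand-written stand-in for the trained multilayer perceptron, wired to the talk's running example ("I" precedes "likes"); everything else about the example tree is invented for illustration.

```python
def head_first(head, child):
    """Stub for the MLP's decision: does the head precede this child?
    Hard-coded rule for the toy example; a real MLP learns this from
    (head, child) embeddings and reference word orders."""
    return child != "I"   # the subject "I" goes before its verb

def order(tree, head):
    """Recursively linearise a head and its dependents into words,
    placing each child before or after the head per the classifier."""
    before, after = [], []
    for child in tree.get(head, []):
        seq = order(tree, child)
        (after if head_first(head, child) else before).append(seq)
    return [w for s in before for w in s] + [head] + \
           [w for s in after for w in s]

tree = {"likes": ["I", "apples"]}
# order(tree, "likes") -> ["I", "likes", "apples"]
```

With a binarized tree, each call makes exactly one head-versus-child decision, which is what makes a simple pairwise classifier sufficient.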
0:29:11 And again we found that combining local with global information helps. This is a picture of the model: you have the embeddings for your two nodes and you concatenate them, so you build a new representation; and then you add the embedding of the nodes below the parent node, that is, the subtree dominated by the parent node. Again we found that if you combine these two kinds of information, you get much better results on the word ordering task.
0:29:45 So what this shows is that taking into account, in this case, the subtree, the top-down view of that node, really helps. Here you again have the BLEU scores: this is the basic version, this is the one where you do data expansion (I will talk about it later), and this is the one with the new encoder, where you do take this additional global information into account. So you see there is quite a big improvement.
0:30:16 Okay, so this was about encoding graphs. What I want to talk about now is what you can do when the input text that you have to handle is very large. In particular, we looked at two different tasks: one of them is question answering on free-form web text, and the other is multi-document summarization. In the first task, you have a query, and basically you are going to retrieve a lot of information from the web, a lot meaning something like 200,000 tokens, so basically the first one hundred or so hits from the web. And then you are going to use all this text, plus the question, as input to generation, and the task is to generate a summary that answers the question. So it is quite difficult; this is a text-to-text generation task.
0:31:26 The other one is multi-document summarization. We used the WikiSum dataset: you have the title of a Wikipedia article, you retrieve information from the web using this title, and the goal is to generate the Wikipedia paragraph about this title, basically the first paragraph of that Wikipedia page.
0:32:03So here's an example. You have the question — this is the ELI5 dataset;
0:32:07ELI5 is "Explain Like I'm Five", a subreddit where
0:32:11you ask a question and the answer is supposed to be in simple language.
0:32:14So the question would be "why are consumers still terrified of genetically modified organisms,
0:32:21when there is little debate in the scientific community over whether they are safe or not?"
0:32:25Then you retrieve some documents with a web search,
0:32:28and then this would be, in this case,
0:32:33the target answer.
0:32:35So not only is the input text very long, the output text is also
0:32:39longer than usual; it's not a single sentence, it's really a paragraph.
0:32:44So the question is how to encode two hundred thousand tokens and then generate from them.
0:32:49Previous work took a way out, in a way: they basically used
0:32:54TF-IDF to select the most relevant
0:32:58web hits,
0:33:01or even sentences, so they were not taking the whole result of the web search,
0:33:06they limited the input to a few thousand words using
0:33:10basically a TF-IDF score.
0:33:13Okay, so what we wanted to do was to see whether there was a
0:33:18way we could encode all
0:33:20these two hundred thousand words that were retrieved from the web,
0:33:25encode them and use them for generation, and the idea was
0:33:29to convert the text into a graph —
0:33:31in this case not
0:33:34a graph of a meaning representation, but a knowledge graph like
0:33:38the ones used
0:33:40in information extraction — and see whether that helps.
0:33:45So let's see how we do this.
0:33:49Here is an example:
0:33:51the query is "explain the theory of relativity",
0:33:54and here's a toy example, right, with just those two documents. The idea is
0:33:59that building this graph allows us to reduce
0:34:04the size of the input drastically;
0:34:07we'll see why later.
0:34:09So the idea is we use
0:34:12two existing tools from computational linguistics: coreference resolution and information extraction.
0:34:20What coreference resolution gives us is that it tells us which
0:34:24mentions in the text refer to the same entity,
0:34:27and once we know this, we group them into a single node in
0:34:30the knowledge graph.
0:34:31And the information extraction tool transforms the text into sets of triples,
0:34:36basically binary relations between entities, and so the relations
0:34:41end up as the edges of the graph and the entities as the nodes.
0:34:45So here's an example with those two documents: in blue you
0:34:51have those four mentions of Albert Einstein, and they will all be combined into one node,
0:34:56and then the information extraction
0:34:58tool tells us that there is this triple that you can extract from
0:35:02this sentence here — "Albert Einstein, a German theoretical physicist, published the theory
0:35:07of relativity".
0:35:09The open IE tool will tell you
0:35:12that you can transform the sentence into these two triples here:
0:35:17the "German
0:35:22theoretical physicist" part goes into this triple here,
0:35:28and similarly this part, "published the theory of relativity", gives
0:35:31you this triple.
0:35:36And that's how you build the graph, basically by using
0:35:39coreference resolution and
0:35:41information extraction.
0:35:44And another thing that was important was that building this graph
0:35:48sort of
0:35:50gives us a notion of
0:35:53which information is important — information that is repeated in the input —
0:35:58because every time we see
0:36:00a different mention of the same entity,
0:36:02we keep score of how many times this entity is mentioned,
0:36:07and we use this in the graph representation to give a score, a
0:36:10weight,
0:36:11to each node and each edge in the graph.
0:36:15So here, to go into it in a bit more detail:
0:36:18you
0:36:20construct the graph incrementally by going through your sentences. You first add the first sentence
0:36:24to the graph — you add the corresponding triples —
0:36:28and now you add this one here,
0:36:30and you see that "theory of relativity" was already mentioned,
0:36:33so now the weight of the corresponding node
0:36:39is incremented by one, and you go on like this.
0:36:42We also have a filter operation that says, you know, if
0:36:45a sentence has
0:36:46nothing to do with the query, we don't include it,
0:36:49right, and we do this using TF-IDF,
0:36:52to avoid including in the graph information that is totally irrelevant.
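The incremental construction described above can be sketched roughly as follows. This is a minimal illustration, not the actual system: the triples would come from an OpenIE tool after coreference resolution, and the filter here is a simple word-overlap stand-in for the TF-IDF relevance check.

```python
from collections import Counter

def build_graph(triples, query_terms):
    """Build a weighted knowledge graph from (subject, relation, object)
    triples: skip triples unrelated to the query, and increment a node's
    weight each time its entity is mentioned again."""
    node_weight = Counter()
    edges = []
    for subj, rel, obj in triples:
        words = set(f"{subj} {rel} {obj}".lower().split())
        if not words & query_terms:
            continue  # filter: this triple has nothing to do with the query
        node_weight[subj] += 1  # repeated mentions raise the node weight
        node_weight[obj] += 1
        edges.append((subj, rel, obj))
    return node_weight, edges

# Toy input (illustrative strings): "theory of relativity" is mentioned
# twice, so its node ends up with weight 2; the last triple is filtered.
triples = [
    ("Albert Einstein", "published", "theory of relativity"),
    ("Albert Einstein", "was", "German physicist"),
    ("Albert Einstein", "developed", "theory of relativity"),
    ("pizza", "is", "tasty"),
]
weights, edges = build_graph(triples, {"einstein", "relativity"})
```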
0:36:57Okay, so we built this graph, and then we are going to linearise the graph.
0:37:03So it's different from the previous approach: there we were going from a sequence to a graph, here
0:37:08we're going from the graph back to a sequence, because the graph is too big.
0:37:11You know, you could try a graph encoder — we didn't do it,
0:37:15it might be the next step — but this is quite a big graph, so I'm not
0:37:18sure how well these graph encoders would work.
0:37:21So we linearise the graph, but then, to keep some information about the graph,
0:37:26we use two additional embeddings per token. The encoder
0:37:31is a transformer in this case, so since it's not a recurrent network
0:37:34we have the position embedding added to the word embedding
0:37:38to keep track of where in the sentence a word is,
0:37:42and we add these two additional embeddings that give information
0:37:46about the weight of each node and edge,
0:37:49and its relevance to the query.
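As a sketch, the per-token input is just the sum of these factor embeddings, as in a standard transformer input layer. The sizes, and the idea of bucketing weights and relevance into discrete ids, are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # embedding size (illustrative)
vocab, max_pos, n_weight_bins, n_rel_bins = 100, 50, 8, 4

# One lookup table per factor.
word_emb = rng.normal(size=(vocab, d))
pos_emb = rng.normal(size=(max_pos, d))
weight_emb = rng.normal(size=(n_weight_bins, d))  # node/edge frequency
rel_emb = rng.normal(size=(n_rel_bins, d))        # relevance to the query

def embed(word_ids, weight_bins, rel_bins):
    """Token embedding = word + position + weight + query-relevance."""
    pos = np.arange(len(word_ids))
    return (word_emb[word_ids] + pos_emb[pos]
            + weight_emb[weight_bins] + rel_emb[rel_bins])

# Same word id (7) at two positions still gets distinct representations,
# because position and relevance differ.
x = embed([3, 7, 7], weight_bins=[0, 2, 2], rel_bins=[1, 1, 3])
```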
0:37:54So the global view of the model is:
0:37:58you have your linearised graph, as I said, with four different embeddings
0:38:03for each node or edge,
0:38:05and you process it with a transformer. We use
0:38:09memory-compressed attention — this is for better scaling — and top-k attention, where
0:38:14you only look at the positions in the encoder which have the top attention scores.
0:38:22So we encode the graph as a sequence,
0:38:26we encode the query, we combine both using attention,
0:38:30and then we decode from that.
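The top-k attention mentioned here can be sketched as ordinary dot-product attention where, per query, all but the k largest scores are masked out before the softmax. This is a simplification under assumption (single head, no scaling factor), not the full memory-compressed setup.

```python
import numpy as np

def topk_attention(q, K, V, k):
    """Attend only to the k encoder positions with the highest scores."""
    scores = K @ q                      # (n_positions,)
    keep = np.argsort(scores)[-k:]      # indices of the top-k scores
    masked = np.full_like(scores, -np.inf)
    masked[keep] = scores[keep]         # everything else gets weight 0
    w = np.exp(masked - masked.max())
    w /= w.sum()
    return w @ V, w

# Toy example: 4 encoder positions; only the 2 best-matching ones attended.
q = np.array([1.0, 0.0])
K = np.array([[2.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [0.5, 0.0]])
V = np.eye(4)
out, w = topk_attention(q, K, V, k=2)
```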
0:38:36So these pictures here show the amount of reduction you get from the graph,
0:38:46and the proportion of missing answer tokens — because you might think, okay,
0:38:50maybe, by compressing the text into this graph,
0:38:54by reducing the redundancies,
0:38:56maybe you lose some important information,
0:38:58but that's actually not the case.
0:39:00So what the first graph shows is that
0:39:03if you take the web hits, you have something like two hundred thousand tokens,
0:39:09and if you run our graph construction process, it reduces them
0:39:14to roughly ten thousand tokens,
0:39:17and we compare this with just extracting triples from the text and not
0:39:21constructing the graph, and you see that you would still have a lot of tokens —
0:39:24that would not be enough to reduce the size.
0:39:27And what the second graph shows
0:39:31is the proportion of missing answer tokens;
0:39:36the lower, the better —
0:39:40you don't want too many.
0:39:43So we are talking about comparing with the reference answer, and you want to have
0:39:46as many tokens in your output
0:39:49that come from the reference as possible, so you don't want too many missing tokens.
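The "missing answer tokens" statistic can be read as the fraction of reference-answer tokens that never appear in the compressed or filtered input: if filtering threw them away, the generator cannot produce them by copying. A rough sketch, assuming simple whitespace tokenization:

```python
def missing_token_proportion(reference, support):
    """Proportion of reference tokens absent from the support text."""
    ref = reference.lower().split()
    sup = set(support.lower().split())
    missing = [t for t in ref if t not in sup]
    return len(missing) / len(ref)

# Toy example (illustrative strings): 2 of the 4 reference tokens
# do not occur anywhere in the support, so the proportion is 0.5.
p = missing_token_proportion("fibres attract each other",
                             "tiny fibres attract dust")
```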
0:39:53Right, so what this shows: this is the previous approach using TF-IDF filtering, where you don't
0:39:59consider the whole two hundred thousand tokens but simply,
0:40:02I think, around eight hundred and fifty tokens,
0:40:06and this is what happens if you encode the graph built from the eight hundred and fifty
0:40:10tokens that the TF-IDF approach is using.
0:40:14And this axis
0:40:16is the number of missing tokens, so the higher,
0:40:19the worse.
0:40:21But what we see — so this is
0:40:25the one where we encode the whole input,
0:40:29so we encode the whole
0:40:34input text,
0:40:37the one hundred retrieved
0:40:41web pages,
0:40:43all the information we take from the web — is that the performance is actually better.
0:40:53And these are the generation results. In this case we use ROUGE, comparing against
0:40:59the reference answer — again, there are
0:41:02issues with this. So here we compare with the TF-IDF approach,
0:41:09with extracting the triples but not constructing the graph, and here with the graph, and
0:41:13you see you always get some improvement. The important point,
0:41:18mainly, is that we get some improvement with respect to the TF-IDF
0:41:22approach —
0:41:23you go from twenty-eight something to twenty-nine something —
0:41:27but also, and this is important, that we can really scale to the whole
0:41:31set of retrieved web pages.
0:41:35And here's an example showing the output of the system,
0:41:39which is, let's say, not
0:41:41overly impressive, but it also illustrates some problems with
0:41:46the automatic evaluation metrics.
0:41:49So the question is why touching microfibre gives such an
0:41:52uncomfortable feeling;
0:41:54then you have — so this is the reference, and this is the generated
0:41:58answer. The generated answer, you know, makes sense: "microfibre is made up of a
0:42:02bunch of tiny fibres
0:42:04that are attached to them;
0:42:06when you touch them,
0:42:07the fibres that make up the microfibre are attracted to each other,
0:42:11they are actually attracted to the other end of the fibre, which is what makes
0:42:15them uncomfortable" — so this part is a bit strange, but overall it makes sense,
0:42:19it's relevant to the question, and you have to remember it's generated
0:42:22from this enormous input, so it's not so bad.
0:42:26But what it also shows is that there are almost no overlapping words between the
0:42:31generated answer and the reference, and so it's an example where
0:42:35the automatic metrics would give it a bad score,
0:42:38whereas in fact this is a pretty okay answer.
0:42:48How much time do I have?
0:42:57Okay, so with this —
0:43:00okay, so one last thing about encoding
0:43:03is that sometimes,
0:43:05sometimes, again, you don't have so much data,
0:43:09so abstracting away over the data might help your model generalize.
0:43:15So here I'm going back to this task of generating from unordered
0:43:19dependency trees:
0:43:20you have this as input, and this is what you have to produce
0:43:23as output.
0:43:24And the idea here — this was another piece of work —
0:43:34was the following: before, you know, in this
0:43:39other approach, we were sort of
0:43:40turning the tree into a binary tree and then having this multilayer perceptron to
0:43:45order the trees, a local ordering of the nodes;
0:43:49here we just have an encoder-decoder which basically learns to map
0:43:54a linearised version of the unordered dependency tree
0:43:57into
0:43:59the correct order of the lemmas.
0:44:02So it's a different approach, and also, what we did is:
0:44:08we felt that for this task the ordering problem is not so much determined by the words,
0:44:12it's mostly dependent on syntax.
0:44:14So maybe we can abstract over the words, we can just get rid of them,
0:44:18and, you know, if this is right, it would reduce data sparsity,
0:44:22it would be more general; we don't want the specific words to
0:44:26have an impact, basically.
0:44:30So what we did is we actually got rid of the words.
0:44:34So here you have your input,
0:44:38an input dependency tree that is not ordered — suppose the sentence is "John eats an apple", for example —
0:44:42and what we do is we linearize this tree
0:44:46and we remove the words, so we have a factored representation of each node where we
0:44:50keep track of, you know, the POS tag, the parent node,
0:44:54and — I can't remember what this one is,
0:44:58I guess the position...
0:45:09zero, one, two... okay, I don't remember.
0:45:11Anyway, the important point is that we linearised the tree and removed
0:45:15the words,
0:45:16so we only keep,
0:45:17basically, POS tag information,
0:45:19structural information — what is the parent —
0:45:24and what is the grammatical relation, the dependency relation between the node and its parent.
0:45:29So here you see "eats", for example, is replaced by this id one: it's a verb,
0:45:35its parent is the root —
0:45:41ah sorry, it's related by the root relation to
0:45:42this node, to the root node. To take another example that may be clearer:
0:45:47"John", for example,
0:45:49which is the subject — "John" here is replaced by id four, it's a proper noun,
0:45:54so this is the POS tag, and it's the subject, and I think
0:45:57the parent node is missing here.
0:46:00Okay, so we delexicalised — we linearised and delexicalised the tree —
0:46:05and then we build this corpus where the target is the delexicalised sequence
0:46:10in the correct order, so here you see that you have the proper noun
0:46:13first, then the verb, the determiner and the noun.
0:46:16And basically we train a seq2seq model to go from here to here,
0:46:20and then we relexicalise: we keep a mapping of, you know,
0:46:24what id one is, and then after you generate you can just use the
0:46:28mapping to relexicalise the sentence.
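A minimal sketch of this delexicalise/relexicalise round trip. The token format (`id<i>_<POS>_<deprel>`) and the toy sentence are illustrative assumptions; the real factored representation also carries the parent node.

```python
def delexicalise(nodes):
    """Replace each word by an id plus its POS tag and dependency
    relation; keep a mapping back to the words."""
    mapping, delex = {}, []
    for i, (word, pos, deprel) in enumerate(nodes, start=1):
        token = f"id{i}_{pos}_{deprel}"
        mapping[token] = word
        delex.append(token)
    return delex, mapping

def relexicalise(ordered_tokens, mapping):
    """After the seq2seq has ordered the ids, map them back to words."""
    return " ".join(mapping[t] for t in ordered_tokens)

# Unordered input tree nodes for "John eats an apple" (toy example):
nodes = [("eats", "VERB", "root"), ("apple", "NOUN", "obj"),
         ("John", "PROPN", "nsubj"), ("an", "DET", "det")]
delex, mapping = delexicalise(nodes)

# Pretend the model predicted this delexicalised order:
predicted = ["id3_PROPN_nsubj", "id1_VERB_root", "id4_DET_det", "id2_NOUN_obj"]
sentence = relexicalise(predicted, mapping)
```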
0:46:32And what you see is that it really helps.
0:46:35So this surface realization task
0:46:38has data for, I think, about ten languages:
0:46:42Arabic, Czech, English, Spanish, Finnish, French,
0:46:45Italian, Dutch, Portuguese and Russian,
0:46:48and you see here the difference between
0:46:51doing the seq2seq where the tree
0:46:54contains all the words — where we haven't changed anything — and doing
0:46:58it without the words, the delexicalised version: you see that for all languages
0:47:02you get quite a big improvement in terms of BLEU score.
0:47:10And we used a similar idea here: so that was generating from unordered
0:47:15dependency trees, but
0:47:16there was this other task, you know, generating from abstract meaning representations —
0:47:21in fact here we built a new dataset for French.
0:47:25So here it's the same idea: we represent the nodes by a concatenation,
0:47:29the factored model, where
0:47:30each node is the concatenation of different types of embeddings — you know, the
0:47:34concept, the POS tag, the number, and the morphological and syntactic features —
0:47:39and again we delexicalise everything,
0:47:42and again we found — oops —
0:47:45yes, and again we found that, you know, we get this improvement: this is
0:47:49the baseline that's not delexicalised, and this is when we delexicalise, so
0:47:54you get two points of improvement.
0:48:03So, as I mentioned at the beginning, you know, the datasets
0:48:06are not very big; in particular, for example,
0:48:09for the surface realization challenge
0:48:15the training set is something like two thousand items.
0:48:22So you have to be a little bit
0:48:24clever sometimes,
0:48:27or constructive, in what you do with the training data.
0:48:29One thing we found is that
0:48:32it is often useful
0:48:37to extend your training data with information that is
0:48:44implicit in the available training data.
0:48:48So again, going back to this example, where the problem was word ordering:
0:48:54we attacked the problem by having this classifier that determines
0:48:59the relative order of a parent and a child.
0:49:03And so, you know, your training data was like this: you
0:49:07had the parent, and you had the child, and you had the position
0:49:10of the child with respect to the parent.
0:49:16And this is the data we had, and we thought, well, the model should learn
0:49:19that if this is true, then also this is true — that if,
0:49:24you know,
0:49:26if the child is to the left of the parent, it should also learn,
0:49:30somehow, that the parent is to the right of the child.
0:49:36But in fact we found that it
0:49:37didn't learn that. So what we did is we just added the inverse
0:49:41pairs: whenever we had these pairs in the training data, we added the symmetric pair to
0:49:46the training data, so we doubled the size of the training data,
0:49:50but we also gave more explicit information about what the possible constraints are — so usually,
0:49:56you know, "the subject is before the verb" is one thing, and
0:50:00"the verb is after the subject" is another.
0:50:03And again, you know, you see that there is a large improvement.
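The data expansion step can be sketched as: for every (parent, child, position-of-child) training example, also add the symmetric (child, parent, opposite-position) example. The tuple layout is an illustrative assumption; the idea is just that the inverse constraint becomes explicit and the data doubles.

```python
def expand_with_inverse_pairs(pairs):
    """pairs: list of (head, dependent, 'left'|'right') examples, where
    the side says where the dependent sits relative to the head.
    Returns the original data plus the symmetric pair for each example."""
    flip = {"left": "right", "right": "left"}
    expanded = list(pairs)
    for head, dep, side in pairs:
        # e.g. "subject before verb" also yields "verb after subject"
        expanded.append((dep, head, flip[side]))
    return expanded

data = [("eats", "John", "left"),    # the subject is before the verb
        ("eats", "apple", "right")]  # the object is after the verb
expanded = expand_with_inverse_pairs(data)
```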
0:50:14Another way to expand the data is to use computational linguistics tools that
0:50:19are available, and that was done already in two thousand seventeen by Konstas and colleagues,
0:50:25where the idea is — so this was for generating from AMR data —
0:50:29the training data was
0:50:32either manually validated or constructed
0:50:35for the shared task, but in fact there are
0:50:39semantic parsers that, if you give them a sentence, will give you the
0:50:43AMR — I mean, they are not one hundred percent reliable, but they can produce
0:50:49an AMR.
0:50:50So what you can do is just generate a
0:50:54lot of training data by simply running a semantic parser on available data,
0:50:58and this is, I think, what Konstas did: you basically parse two hundred thousand
0:51:03Gigaword sentences with this semantic parser,
0:51:07and then you do some pre-training on this data, and then you do some
0:51:11fine-tuning on the data of the shared task.
0:51:18And so we
0:51:20used this again, you know, for the first approach I showed — the
0:51:24graph encoder with the dual top-down/bottom-up
0:51:26encoding approach — and again, like in the other approaches,
0:51:29we see that this really improves performance.
0:51:36Okay, so I'm getting to the end.
0:51:38So, you know, I mentioned some
0:51:41things you can do for better encoding of your input and better training data;
0:51:46there are of course many open issues.
0:51:49One that I find particularly interesting is multilingual generation. So we saw in
0:51:54the surface realization shared task there are ten languages, but it's still a reasonably simple
0:51:59task, and you can have some data,
0:52:02the data taken from the universal dependency treebanks.
0:52:06So what would be interesting is, you know: how can you generate in multiple
0:52:10languages from data, from knowledge bases,
0:52:12or even from text — say, if you do simplification, can you simplify in different languages?
0:52:18The other open issue, I suppose, is interpretability.
0:52:23As I said at the beginning, you know, the standard approach is this encoder-decoder,
0:52:27end-to-end approach,
0:52:29which did away with all those modules that we had before.
0:52:32But in fact now,
0:52:33you know, one way to make the model more interpretable is to reconstruct
0:52:38those modules, right? So instead of having a single end-to-end system,
0:52:41you have different networks for each of the tasks, and people are
0:52:46starting to work on this, in particular with
0:52:51coarse-to-fine approaches,
0:52:52where you first, for example, generate the structure of the text and then you fill
0:52:56in the details.
0:52:59And then generalising versus memorising:
0:53:02there have been problems with, you know, datasets that are very repetitive, and it's
0:53:07really important to have very good test sets, to control for the test set;
0:53:10for example, a lot of the shared tasks do not
0:53:14provide a sort of unseen test set, in the sense that,
0:53:19I don't know, if you are generating newspaper text, you would like the test set
0:53:23also to test, you know, what happens if you apply your
0:53:28model to different types of text. So I think, you know, having a sort
0:53:32of out-of-domain test set is really important for testing
0:53:36the generalization of the system. This is also linked to, you know, what you can
0:53:39do with transfer learning, in addition, to go from one type of
0:53:43text to another.
0:53:44And that's it, thank you.
0:53:56Are there any questions?
0:54:07You have shown some results on acceptability of text generation; it was something like
0:54:13seventy-five percent before, or sixty percent, somewhere in the middle —
0:54:18that's annotation, and what I wanted to ask you is: is this like a
0:54:22zero/one, people say "I accept" or "I don't accept", or do you have a
0:54:28degree, like — you mean human evaluation? Yes, the evaluation —
0:54:32human annotation is usually on a scale from one to five.
0:54:36— Okay, because you showed a percentage, I think,
0:54:41at some point, so I'm wondering what that is.
0:54:52Sorry — to compare: in that case they just compare
0:54:58the output of two systems,
0:55:00so they compare the output of the first system to the output of the second,
0:55:04and then they say which one they prefer,
0:55:07so the percentage is, like, sixty-four percent of people preferred this one.
0:55:17— If you have them from one to five, okay... because if this is a preference test,
0:55:23right, do we know anything about inter-annotator reliability?
0:55:31— You mean for the score between one and five?
0:55:34No, I think —
0:55:41I'd have to go back to the paper, I don't remember,
0:55:44but I think it was not a one-to-five rating, or maybe I'm
0:55:47wrong; it was a comparison between two systems: they have the output of the two
0:55:52systems, they don't know which one is which, and they have to say which one they prefer.
0:56:15So, I'd like to thank you for the great talk. Since you covered all
0:56:22kinds of generation tasks — many in relation to SLU and text summarization,
0:56:30generating text from tables,
0:56:33conversational systems — I was curious:
0:56:39these are very different kinds of problems. Are the architectures and the main state-of-the-art approaches
0:56:46converging to one architecture, or are they very different, with different approaches?
0:56:54— So the question is whether we have very different neural approaches depending
0:56:59on the input, depending on the task,
0:57:01the generation task.
0:57:03So, initially, for a long time, for years, everybody was using this encoder-decoder,
0:57:10often with a recurrent encoder,
0:57:14and the difference was in what the input was. So in dialogue,
0:57:19for example, you're
0:57:21going to take as input
0:57:23the current dialogue — the user turn, plus maybe the context, or perhaps some information about
0:57:28the dialogue state, right?
0:57:30If you are doing question answering, you take the question
0:57:33and the supporting evidence.
0:57:35So it was really more about,
0:57:38you know, which kind of input you had, and that was the only difference. But
0:57:41now, more and more, people are paying attention to, you know,
0:57:45the fact that there are differences between these tasks: what is the structure of the
0:57:48input, what is the goal, do you want to focus on identifying important information, or,
0:57:54you know...
0:57:55The problems, I think, remain very different, so you have to —
0:57:59they are very different problems in a way. This is what I was trying to
0:58:02show, in fact.
0:58:04— But in dialogue generation, you know, people have tried different
0:58:10approaches to the encoder and decoder, and the problem is that
0:58:15you can see that the encoding is okay,
0:58:18but the decoder is not
0:58:26able to generate things that are more informative, say, than "okay" and various state transitions...
0:58:33Yes, so there is a known problem: your dialogue system tends to generate
0:58:38very generic answers, like "I don't know", or answers that are not very informative. We are actually
0:58:44working on
0:58:48using external information
0:58:50to have dialogue systems that
0:58:53actually produce more informative answers. And so the idea in this case is —
0:58:57the problem is what you retrieve: so you have your dialogue context, you have your
0:59:01user question, the user turn,
0:59:04and, a bit similar to the long-text approach, what you do is you look
0:59:09on the web, or in some sources, for some information that is relevant to
0:59:14what is being discussed,
0:59:16and you generate — so now you generate jointly with
0:59:19this additional information, and the hope is that this gives you more informative dialogue. So instead of,
0:59:25you know, evading with these empty utterances,
0:59:27the system now has all this knowledge it can use to generate something more informative.
0:59:34So there are a number of — you know, the ConvAI2 challenge, for example, is
0:59:38providing this kind of dataset, where you have a dialogue plus some additional
0:59:43information related to the topic of the dialogue; or Image-Chat, where you have
0:59:48an image,
0:59:50and the dialogue is based on the image,
0:59:52so again the idea is that the dialogue system should actually use the content of
0:59:58the image to provide something informative.
1:00:04And so — again, this slide — I think human
1:00:10evaluation is something that speaks to a lot of people in
1:00:12this room, because, at least for speech, it's been shown that you really need
1:00:17humans to judge whether or not something is adequate and natural, and many of those
1:00:22things, so
1:00:24I wonder — because this, to my understanding, was perhaps the only subjective human evaluation
1:00:30result your slides contained, so mostly people are optimising towards objective metrics — do you think
1:00:39there is a risk of overfitting to these metrics, maybe in particular tasks? Or,
1:00:45where do you see the role of humans judging generated text
1:00:51in your field, now and in the future?
1:00:54So human evaluation will stay important, because the automatic metrics —
1:00:59you need them, you need
1:01:00them to develop a system, and you need them to compare, you know, exhaustively, if
1:01:04you have the output of many systems; so you need some automatic metrics,
1:01:08but they are imperfect, right?
1:01:11So you also need human evaluation.
1:01:14Often the shared tasks actually organise a human evaluation, and
1:01:18they do this — I mean, I think it's getting better and better, because
1:01:21people are getting more experienced,
1:01:23and there are better and better platforms and, you know, guidelines on how
1:01:27to do this.
1:01:29We are not optimising with respect to those human
1:01:33judgments, because it's just impossible, right? So the overfitting would be with respect
1:01:38to the training data, where you do, you know, maximum likelihood —
1:01:43you try to maximize the likelihood of the training data, mostly using cross-entropy. That
1:01:48said, there is some work on using reinforcement learning, where you optimize with respect
1:01:53to your actual evaluation metric, for example the ROUGE number.
1:02:02— Thank you. So, to me, the main problem that you
1:02:07tackled in these
1:02:09generation tasks is going beyond the data that is right in front of you —
1:02:13looking at the internet, finding the information that you want, and so on.
1:02:17My question is: very often, the type of answer that you give depends on
1:02:21the type of person that is going to receive it. You will not employ the
1:02:24same sort of answer if you are talking to a young child
1:02:28or to an expert in the field.
1:02:31Is there any research on how you can tune or adapt
1:02:35the answers so that they fit the user?
1:02:40Not really that I can think of right now. I mean,
1:02:44people often find that if you have some kind of parameter like this that
1:02:49you want to use to influence the output — so you have one input and then
1:02:55you want two different outputs depending on this parameter —
1:02:58often just adding this to your training data actually helps a lot.
1:03:06So people do this with emotions,
1:03:09for example: should the text express, say, a happy emotion? So they
1:03:15might use an emotion detector, you know,
1:03:19something that gives you an emotion tag for the sentence, and
1:03:23then they would —
1:03:26I mean, you need the training data, right? But if you
1:03:30can have this training data, and you can label the training examples with
1:03:36the personalities that you want to generate for, then it works reasonably well. In fact the
1:03:41Image-Chat data is a nice example:
1:03:49for the same image you might have different
1:03:54dialogues depending on the
1:03:56personality. So the input includes a personality, and there are something like
1:04:01two hundred and fifteen personalities —
1:04:04they can be, you know,
1:04:06joking, serious, or whatever — and so they had the training data taking into
1:04:11account this personality,
1:04:13so you can generate dialogues
1:04:15about the same image
1:04:16with different outputs depending on the personality.
1:04:22— But, for example, would it be possible to put a constraint on the vocabulary that you can use
1:04:26for the output?
1:04:29— In the encoder-decoder?
1:04:30Yes, you could do that.
1:04:33This is not something people normally do; they just use the whole vocabulary and then
1:04:37they hope that the model is going to learn to focus on the vocabulary that
1:04:42corresponds to a certain feature. But maybe you could do that.
1:04:57— You already mentioned it somewhat, but
1:05:00this also raises ethical questions on the generated text — more, maybe, than
1:05:06in synthesis —
1:05:08that you really need to get it right, or
1:05:11you have some other problems: consistency, or an indication that something went wrong. Is
1:05:19this solvable with a statistical approach, or can you —
1:05:24can you solve this?
1:05:26Well, I mean, I think one —
1:05:30the problem that I see with
1:05:34the current approach, the neural approach to generation, is that it is not necessarily
1:05:39semantically faithful to the input, right? So, you know, it can produce things that
1:05:45have nothing to do with the input; we can see the problem.
1:05:47I'm not sure it's an ethical problem in the sense — you know, generators that are not
1:05:51really faithful to the input are not super useful either. But in an application,
1:05:55so, you know, for people — for industrial people who want to develop applications — clearly it's a
1:06:00problem, right, because you don't want to sell a generator
1:06:04that is not faithful.
1:06:08But, I mean, ethical problems we have plenty of, in general, in NLP.