0:00:15 This is the work of my PhD student, Elahe Rahimtoroghi, who can't currently leave the United States, so I'm presenting our work here while she's finishing her PhD; not a great situation.
0:00:33 Alright. So the motivation for this work is that narrative structures occur across all kinds of natural language genres: you see them in restaurant reviews, you see them in newspapers. This seems to be because the way humans make sense of the world is in terms of narrative structure, so it fits in very well with the talk this morning: people are always looking for coherence, and a lot of that coherence can be framed as narrative structure.
0:01:06 We would also argue that narrative understanding requires modeling the goals of the protagonist and tracking the outcomes of those goals: whether the goals are being fulfilled or thwarted.
0:01:17 And first-person social media stories are actually full of these expressions of desires and outcome descriptions. For example, here's something from a blog site, LiveJournal, which is very similar to where most of our data comes from. Somebody is telling a story about something that happened at a concert: "I dropped something; it was dark and I could hardly look for it with the cellphone. We spoke a little bit, but it was loud so we couldn't really talk. I had hoped to ask him to join me for a drink or something, but he left before the end and I didn't see him after that. Maybe I'll try Missed Connections."
0:01:56 So this sentence here, "I had hoped to ask him to join me for a drink or something", shows an expression of a first-person desire. One of the reasons we're interested in first-person stories is that we don't have to deal with coreference: it's quite easy to track who the protagonist is in a first-person narrative, so we can track the protagonist's goals, which makes the problem a little bit more tractable.
0:02:28 So what we do is identify goal and desire expressions in first-person narratives, like the "had hoped to" in the previous example; we have a bunch more, and I'll tell you more about how we get them. Then what we aim to do is infer from the surrounding text whether or not the desire is fulfilled: we want to actually read the narrative and be able to predict whether the desire is fulfilled or not.
0:02:56 So in this particular case, the one I showed you, we have the phrase "but he left before the end and I didn't see him after that", which clearly indicates that the desire was unfulfilled.
0:03:08 And, as I said, we have a corpus of about nine hundred thousand first-person stories from the blogs domain.
0:03:21 These first-person narratives are just rife with desire expressions; you can get as many as you could possibly want out of this kind of data, and they come in lots and lots of different forms: "I wanted to", "I wished to", "I decided to", "I couldn't wait to", "I aimed to", "I arranged to", "I needed to".
0:03:53 It's true that states can also express desires: if you say something like "I'm hungry", that implies you have a desire to get something to eat. So initially we had a goal that we might be able to do something with states, but we decided in this paper to restrict ourselves to particular verbs and past-tense expressions.
0:04:21 So, related work. The first paper on this, around 2010, was by Goyal, Riloff, and Daumé III, who were trying to implement a computational model of Wendy Lehnert's plot units for story understanding.
0:04:37 One of the main things you do in the plot units model is try to track and identify the affect states of the characters.
0:04:47 The dataset they used was Aesop's fables, and they manually annotated the fables themselves to examine the different types of affect expressions in narratives. One of the things they claim in this paper (it's a really interesting paper, if you haven't read it) is that affect states are mostly not expressed explicitly, like "the character was sad" or "the character was happy"; rather, they are implicated, and you derive the inferences by tracking the characters' goals. And they claimed in this seminal paper that, even though it's been a long-standing AI idea that what you want is to extract people's intentions and whether they're being realized or not, in natural language processing we need to do much more work on tracking goals and their outcomes.
0:05:38 There's also a recent paper by Snigdha Chaturvedi et al., where she picks up on this idea of tracking expressions of desire and their outcomes. They did this on two corpora very different from ours: they took MCTest, which is a corpus from Microsoft of crowdsourced stories suitable for a machine reading task (the stories are supposed to be understandable by seven-year-olds, so you get desire expressions like "Johnny wanted to be on the baseball team; he went to the park and practiced every day"), and they also took passages from Wikipedia and tracked desires there, where you get things like "Lenin wanted to be buried in Moscow". So their data is very different from ours.
0:06:26 When I first heard a presentation of this paper, I thought that the data we'd already been working on for several years for narrative understanding was so much more suitable, so primed for this particular task, that we had to try it on our own datasets.
0:06:53 So we made a new corpus, which is publicly available; you can download it from our corpus page. It's a really high-quality corpus, and I'm super excited about being able to do more stuff with it: three thousand five hundred first-person informal narratives with annotations.
0:07:15 In this paper I'll talk about how we model the goals and desires, and about some classification models we've built. We do a feature analysis of which features are actually good for predicting the fulfillment outcome, and we look at the effect of both the prior context and the post context.
0:08:05 OK. We're starting from a subset of the Spinn3r corpus, which is a publicly available corpus of social media collected for one of the ICWSM shared tasks, and we restrict ourselves to the subset of Spinn3r that comes from traditional journaling sites like LiveJournal. You can get quite clean data by restricting yourself to the parts of the Spinn3r corpus that come from particular blog websites.
0:09:22 So: we have a subset of the Spinn3r corpus, and we have what we claim is a very systematic, linguistically motivated method to identify these desire and goal statements. We collect the context before the goal statement and the context after, up to five utterances before and five utterances after. Then we have a Mechanical Turk task where we collected gold-standard labels for the fulfillment status. I actually also asked the Turkers to mark the spans of text that were evidence for fulfillment or non-fulfillment, but in this paper we don't do anything with the evidence.
0:10:04 Okay, so I kind of referred to this before: there are many different linguistic ways to express desires. One of the things my colleague Pranav Anand was struck by in the prior work was that it was limited in terms of the desire expressions it looked at: they just looked at "hoped to", "wished to", and "wanted to". I think that's motivated by the fact that the MCTest corpus is very simple, written for seven-year-olds, and crowdsourced, so maybe it didn't have very many expressions of desire in it. But our data is open-domain and very rich: we have complex sentences, complex temporal expressions, all kinds of really great stuff, so I really encourage you to have a look at the data.
0:10:47 So what Pranav did was go through FrameNet and pick every frame he thought could possibly contain a verb or state that would express a desire. We made a big list of all those, then looked at their frequency in the Gigaword corpus to see which were most frequent in the English language, not just in our data. We picked thirty-seven verbs, constructed past-tense patterns for them with regular expressions, and then ran those against our database of nine hundred thousand first-person stories, and we found six hundred thousand stories that contain verbal patterns of desire.
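As a rough sketch of this pattern-matching step: the verb list, the naive past-tense rule, and the exact regex shapes below are illustrative assumptions, not the paper's actual thirty-seven patterns.

```python
import re

# Illustrative subset of desire verbs (the paper's full list has 37).
# Verbs whose past tense needs consonant doubling are deliberately omitted.
DESIRE_VERBS = ["want", "hope", "wish", "decide", "need", "aim", "arrange"]

def past_tense(verb):
    """Naive past-tense form; real patterns would handle irregular verbs."""
    return verb + "d" if verb.endswith("e") else verb + "ed"

# First-person past-tense desire patterns like "I wanted to", "I had hoped to".
PATTERNS = [
    re.compile(r"\bI\s+(?:had\s+)?%s\s+to\b" % past_tense(v), re.IGNORECASE)
    for v in DESIRE_VERBS
]

def find_desire_expressions(story):
    """Return every matched desire expression in a story, grouped by pattern."""
    return [m.group(0) for p in PATTERNS for m in p.finditer(story)]
```

Running this over a story returns the matched spans, e.g. `find_desire_expressions("I had hoped to ask him to join me for a drink.")` yields `["I had hoped to"]`.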
0:11:28 So this is roughly what it looks like: we go five sentences before and five sentences after. The reason we go five sentences before is that there is a claim from oral narrative research that the structure of narrative often foreshadows something that's going to happen, so unlike the previous work we took the prior context. We have the prior context, the desire expression, and the post context, and our goal is to use the context around the desire expression to try to predict whether the expressed desire is fulfilled.
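The windowing scheme just described can be sketched in a few lines; the sentence-list representation and function name here are illustrative, not from the paper's code.

```python
def context_window(sentences, desire_idx, k=5):
    """Split a story (a list of sentences) into prior context, the desire
    sentence itself, and post context, taking up to k sentences per side."""
    prior = sentences[max(0, desire_idx - k):desire_idx]
    post = sentences[desire_idx + 1:desire_idx + 1 + k]
    return prior, sentences[desire_idx], post
```

With `k=5` this yields the eleven-sentence window (five before, the desire sentence, five after) used throughout the talk, truncating gracefully near story boundaries.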
0:12:02 So we sampled from the corpus according to a skewed distribution that matched the whole original corpus, and we put three thousand six hundred eighty samples out for annotation, exhibiting sixteen verbal patterns. We showed the Mechanical Turkers which desire expression they were supposed to attend to, because sometimes a story might have more than one in it, so we showed them the one they were supposed to predict the fulfillment status for.
0:12:31 We had three qualified workers per utterance. We asked them to label whether the desire expression was fulfilled and to mark the textual evidence.
0:12:48 We got good agreement on Mechanical Turk. We used three qualified workers: we make sure they can read English and are actually paying attention to the task. That's typically what we do for a task like this: we put it out to a lot of people, see who does a good job, then go back to the people who have done a good job and say we'll give you this task exclusively, we pay them well, and they go off and do it for however long it takes, like a week or two.
0:13:22 On the data we put out there, agreement was about seventy-five percent on the fulfilled class, sixty-seven percent on unfulfilled, and forty-one percent on unknown-from-the-context.
0:13:43 One thing to notice is that the verbal pattern itself heralds the outcome: how you express the desire is often conditioned on whether the desire is actually fulfilled. If you look at "decided to", it implicates that the desire is going to be fulfilled; if you use a verb like "hoped to", it implicates that the desire is not going to be fulfilled; but for something like "wanted to" or "needed to" it's more around fifty-fifty. So there is a prior distribution associated with the selection of the verb form.
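That per-pattern prior distribution can be estimated directly from the annotated labels; this is a generic sketch (the example data is made up, not the corpus statistics).

```python
from collections import Counter, defaultdict

def verb_priors(examples):
    """examples: (verb_pattern, label) pairs with label in
    {"fulfilled", "unfulfilled", "unknown"}. Returns, per pattern,
    the proportion of each fulfillment label."""
    counts = defaultdict(Counter)
    for verb, label in examples:
        counts[verb][label] += 1
    return {
        verb: {lab: n / sum(c.values()) for lab, n in c.items()}
        for verb, c in counts.items()
    }
```

A table of these priors per verbal pattern is essentially what the slide shows: "decided to" skews heavily toward fulfilled, "hoped to" toward unfulfilled.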
0:14:21 And so we have this dataset; like I said, you can download it. We think it's a really lovely testbed for modeling desires in personal narrative and their fulfillment: it's very open-domain, we have the prior and the post context, and we have pretty reliable annotation. So one of our contributions is simply the corpus.
0:14:42 So next I'll talk about the experiments we did. We defined feature sets motivated by narrative structure; some of these features were motivated by the previous work by Goyal and Riloff and by Chaturvedi's experiments. Then we ran different kinds of classification experiments to test whether we can actually predict desire fulfillment. We also applied our models to Chaturvedi's data, which is also publicly available, so we can compare directly how our models work on their data and on ours, and all of those datasets are publicly available.
0:15:23 Some of our features come directly from the desire expression. In this example: "Eventually I just decided to speak. I can't even remember what I said. People were very happy and proud of me for saying what I wanted to say." The first feature that's important is the desire verb itself, whether it's "decided to" or "hoped to" or "wanted to". Then there is what we call the focal word, which is the verb embedded underneath the desire expression; we pick that verb and stem it, so in this case it's "speak". We then look for other words related to the focal word in the context: we look for synonyms and antonyms of the focal word and count whether those occur. And we look for the desire subject and its mentions, all the different places where the desire subject, which in our case is always first person, gets mentioned.
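A minimal sketch of these focal-word features: the synonym and antonym sets are passed in as plain Python sets (in practice they would come from a resource like WordNet), and the prefix-based stem match is an assumption for illustration.

```python
def focal_word_features(focal_stem, context_tokens, synonyms, antonyms):
    """Count mentions of the focal word, its synonyms, and its antonyms
    in the surrounding context."""
    toks = [t.lower() for t in context_tokens]
    return {
        "focal_count": sum(t.startswith(focal_stem) for t in toks),
        "synonym_count": sum(t in synonyms for t in toks),
        "antonym_count": sum(t in antonyms for t in toks),
    }
```

For the example above, the focal stem would be "speak" and the counts would be taken over the eleven-sentence window around the desire expression.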
0:16:16 We also have discourse features, having to do with whether discourse relations are explicitly stated, classified according to their occurrence in the Penn Discourse Treebank. There is an inverse index in the Penn Discourse Treebank annotation manual that gives you all the surface forms that were classified as a particular class of discourse relation, so we just take those from there. We have two classes, violated expectation and met expectation, and we keep track of those.
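A sketch of that two-class discourse feature; the connective lists below are short illustrative stand-ins for the full PDTB-derived surface-form lists, not the actual ones used.

```python
# Illustrative connective sets; the real lists come from the PDTB
# annotation manual's index of surface forms per relation class.
VIOLATED_EXPECTATION = {"but", "however", "although", "unfortunately", "yet"}
MEET_EXPECTATION = {"so", "therefore", "indeed", "accordingly"}

def discourse_features(context_tokens):
    """Count explicit discourse-relation markers of each class
    in the context tokens."""
    toks = [t.lower() for t in context_tokens]
    return {
        "violated_expectation": sum(t in VIOLATED_EXPECTATION for t in toks),
        "meet_expectation": sum(t in MEET_EXPECTATION for t in toks),
    }
```

In the earlier example, "but he left before the end" contributes a violated-expectation count, which is exactly the signal that predicts non-fulfillment.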
0:16:52 We also have sentiment flow features, using the connotation lexicon: over the passages in the story, we look at whether the sentiment changes, whether it starts positive and goes to negative or starts negative and goes to positive, and that's a feature we keep track of.
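The sentiment-flow feature can be sketched as below; the per-sentence polarity scores are assumed to come from a lexicon lookup (e.g. the connotation lexicon), and the three-way labeling is an illustrative simplification.

```python
def sentiment_flow(sentence_polarities):
    """Given per-sentence polarity scores (e.g. aggregated from a
    connotation lexicon), report the direction of sentiment change
    across the passage."""
    if not sentence_polarities:
        return "stable"
    first, last = sentence_polarities[0], sentence_polarities[-1]
    if first > 0 and last < 0:
        return "pos_to_neg"
    if first < 0 and last > 0:
        return "neg_to_pos"
    return "stable"
```

A story that opens hopefully and ends with "he left before the end" would score `pos_to_neg`, a flow that correlates with unfulfilled desires.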
0:17:15 So we have these four types of features motivated by these narrative characteristics, and the paper goes into detail about the ablation experiments we do to test which kinds of features matter. We use a neural network architecture, a sequential architecture, to do this (I'm running out of time), and we also compare to plain logistic regression.
0:17:43 We have two different approaches for generating the sentence embeddings. We use pre-trained skip-thought models, and we concatenate the features with the sentence embeddings and use that as the input representation; we also have a convolutional neural net and a recurrent neural net. So we have this three-layer architecture, and what we do is sequentially go through the prior context and the post context.
0:18:12 We also did experiments where we distinguish them, where we tell the learner whether it's the prior context or the post context, and, surprisingly to me, it doesn't matter. So, just to recall, we have eleven sentences: five before and five after the desire expression. At each step we keep the desire expression in the input; each time we read in the next context sentence we keep the desire expression, and then we recursively go on to the next one. That's how we keep track of the context.
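The input scheme just described can be sketched as follows; this is only the construction of the per-step inputs (desire expression paired with each successive context sentence), not the actual network architecture.

```python
def sequential_inputs(prior, desire, post):
    """Build per-step inputs for a sequential model: each context
    sentence is paired with the desire expression, so the desire
    expression stays in the input at every step."""
    return [(desire, sentence) for sentence in prior + post]
```

Each `(desire, sentence)` pair would then be embedded (e.g. with skip-thought), concatenated with the feature vector, and fed to the sequential classifier step by step.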
0:18:52 And we did some experiments on a subset of DesireDB which is meant to match more closely the desire expressions that Chaturvedi worked on, containing only the expressions she looked at. I only have two minutes.
0:19:07 Okay, so we wanted to do these ablation experiments with these different architectures. The first comparison, against bag-of-words with skip-thought, shows that having these linguistic features actually matters for performance on this task, not just the embedding. We get an overall F1 of 0.7 for predicting fulfillment versus non-fulfillment.
0:19:38 We also have results that support our theoretically motivated claim that the prior context should matter, not just the subsequent context. This slide shows that having the prior context does indeed improve over having the desire expression alone, and of course with the whole context you can do even better.
0:20:01 Then we compare certain individual features: bag-of-words versus all features versus just the discourse features. Our best result is with just the discourse features. In my view this is actually kind of disappointing: the discourse features by themselves do better than all the features, so you essentially tell the model to just pay attention to those discourse features. An interesting next step would be to explicitly sample the corpus so that you selected material that didn't contain the discourse features, so you could see which other features come into play when explicit discourse features aren't there.
0:20:49 Interestingly, the same features and methods achieve better results on the fulfilled class than on the unfulfilled class, and we think that's because it's just harder to predict the unfulfilled case: it's more ambiguous, and the human annotators had that problem too.
0:21:06 I'm supposed to stop now. One result that was actually really surprising to us: we got much better results on Chaturvedi's dataset than they did themselves. Okay, so I'll stop there and take questions; I'll leave that slide up. Questions, please.
0:22:37 Nobody?
0:21:48 Right, so you said you're looking at just verbal patterns for these desire expressions. Do you come across nonverbal patterns that might express desire?
0:22:05 There are nonverbal patterns; you can easily see that if somebody can say "I expected to", they could also say "my expectation was that...". So we did a search against the corpus where we pulled a bunch of those things out: "my expectation", "my goal", "my plan", and also some of those states like hungry, thirsty, tired, whatever, that might indicate some kind of goal. And we just decided to leave those aside for the present, but they're definitely in there, so if you're interested in them you could probably find them. There are also semantic things like purpose clauses: a lot of these contexts don't actually contain the desire words; instead of "you want to go somewhere" you see "in order to go somewhere", so you don't actually get verbal patterns. So those are there.
0:23:03 I'm just wondering how many other kinds of patterns there might be.
0:23:06 I just think there are lots of other patterns. What's really interesting is how frequent "want" is: for our data, "wanted" is the most common of the verbal patterns, the most common expression, and you could do quite a lot if you just looked for "wanted to". But we have all these different ones, and we were also able to show that they have different biases as to whether the goal will be fulfilled or not.
0:23:38 I have a comment: usually, when you're talking about non-fulfillment, that's an indication of expectation, and I wouldn't have thought that the word "decided" generated that expectation; other words might, but probably not "decided".
0:24:08 So, regarding "decided": we had strong intuitions before we put the data out for annotation, and what the data shows is that "decided to" is fulfilled eighty-seven percent of the time, which is what I would expect, and unfulfilled nine percent of the time, right?
0:24:37 Okay.
0:25:11 It's interesting that you could see a difference there. I had a very strong intuition that a lot of these would be interesting because the outcome would be implicated, so I'm actually quite interested in the roughly ten percent of cases where someone says "decided to" and the desire is actually not fulfilled.
0:25:31 And there are these different cases of non-fulfillment, which we're looking at in our subsequent work. Sometimes the goal is not fulfilled because something else comes along ("I actually decided to do something else"), so it's not really unfulfilled; you just kind of changed your mind. Like: "We wanted to buy a PlayStation, and we went to Best Buy, and the Wiis were on sale, so we came home with a Wii." So maybe the higher-level goal is actually fulfilled, in that they wanted some kind of entertainment system, but the expressed desire was in fact not fulfilled. Those are maybe, I don't know, eight percent of the cases out of the ten percent that are not fulfilled.
0:26:16 Okay, since we're running over time, we should move on to the next speaker.