Speech Transcript - Lexical Acquisition through Implicit Confirmations over Multiple Dialogues

and then it means

I am from Osaka University, Japan.

I would like to talk about lexical acquisition through implicit confirmations

over multiple dialogues.

This is joint work between Osaka University and Honda Research Institute

Japan.

Okay, so

dialogue systems should learn through dialogue,

and the system should acquire

and

accumulate knowledge during dialogues. This is an example.

i used to a safe you know they are all that and that i-vector line

don't take shown you to pay once for try to write

and another a new that it i know that dialogue i like to fly to

right after the tightest that the other station

a three d the user utterances meetings that systems should apply and this kind of

information about the upright airline pilots

nice

and this can be used for the future recommendation

Currently, this kind of information is prepared manually by system developers,

but we think it should be acquired during the dialogue.

We are aiming at a closed-domain chatbot

that uses a knowledge base as necessary.

Assuming a dialogue corpus that includes all lexical items is unrealistic,

so we want to make a chatbot

that acquires new concepts through dialogue, which will reduce the cost of manually maintaining

the knowledge base. We are now building a chatbot in the food and

restaurant domain.

Our long-term target is to acquire

unknown terms and their categories,

and the chatbot should be able to continue the dialogue even for unknown

terms.

This example shows a user saying, "I tried to cook nasi goreng today,"

and suppose that "nasi goreng" is an unknown term for the system.

The simplest response is

"What is nasi goreng?", but such a simple question may be an abrupt question

for the user.

So our approach is lexical acquisition through implicit confirmation.

Here the system tries to acquire the ontological categories of

unknown terms.

i don't see is also an example of a single example i will try to

class you're going to the and that's a great unknown time interval you

First, the system predicts

the ontological category of the unknown term.

This is done by our previous method.

Our previous method

uses

character n-grams and the four types of Japanese characters as features.

and a

it can really

and this may be a intimacy

Then the system generates an implicit confirmation request with the predicted category c.

for example a good initial restaurant how or not on the geography

If the user says something like "I think so,"

that is the user response,

and the system determines whether the predicted category c is correct or

not from this user response.

Then the system acquires the knowledge:

if the category is determined to be correct, the system acquires

that nasi goreng belongs to the Indonesian category.

and this and it makes k y ix speech components are all mutually it's not

a good for this task

This is an example of explicit confirmation, but

there we need was incorrect a format on be the only one could argue that

system if the model really an italian

So we think that this kind of confirmation degrades the

user experience.

it's like to hand if also express

confirmation and it is correct

but very well yes so i sometimes have a much more mushroom are not purely

and that system asks if the mushroom are rumoured audio the italian

The user answers yes, and I think such explicit expressions degrade the user experience, so

our approach is to use implicit confirmation.

Okay, but determining whether the category is correct or not is

difficult,

because user responses

contain various expressions,

and that includes not only simple affirmative and negative responses.

okay these lists either you very edus samples so you that it is picked up

and or yesterday and the that's if their masks i want to one if you

to japanese food

Then the user says something like, "Not exactly. What are you talking about?"

By using this user response, the system can easily recognize that the

predicted category was incorrect.

on the other hand in it you example or you that it is based upon

goal yesterday under that extensive i wanted to stop and you the food

and they do that

i likely to

So this is difficult to determine.

and italian so

This response does not give a clear cue for determining

whether the predicted category is correct or not.

So our first problem is to take various

features into consideration, before and after the implicit confirmation request.

Another

problem to be solved is that users

do not always respond correctly.

That means that user responses are sometimes inconsistent.

so this is also incorrect

confirmation via estimate of the onion started around and the japanese with the

and this user response i've just

if a guy's if there are you gotta that this

is this indicates that

activity correct

and then incorrect knowledge will be added into the system's knowledge.

So our second problem is to exploit responses over multiple dialogues.

Okay, so let me summarize our proposed method. The first one is to

design a feature set

for machine-learning-based classification

that considers expressions other than simple affirmative or negative responses

and that exploits user utterances around the confirmation, that is,

before and after

the implicit confirmation request.

The second proposed method is to exploit the determination results over

multiple dialogues.

This becomes possible once the system is deployed. This is a

conventional one-to-one dialogue,

but we are now building a chatbot,

so the system can interact with multiple users

about the same content.

So we integrate the results from multiple

user responses

for determining

whether

the

predicted category is really the correct one.

Okay, so this is an overview of our method.

As I explained,

a user says some unknown term w.

Then the system generates an implicit confirmation with a

predicted category c, and the user responds with an

utterance.

The system calculates the probability for w and c from a single user

response at this point.

Next, the system

collects responses from other users in the same way,

like this.

After that, we calculate a measure from these probabilities. By integrating

the probabilities, we obtain a confidence measure for determining whether the category is

the correct one.

Okay, so I have explained the background and the proposed methods.

Now I will explain the first method and its results in more

detail,

and the data for the experiment.

Next, I will explain our

second proposed method and its results.

and without on the computer we computed my talk

for five middle part of our proposed with

well

So we calculate, from a single user response, the probability that the

predicted category c for the unknown term w is correct.

First,

let me introduce our notation, U1 and U2.

U1 is the user utterance

containing the unknown term w.

It is followed by

the implicit confirmation request

including the predicted category, and U2 is the user response to it.

Here we use

logistic regression for

determining whether the predicted category is correct or not,

and we incorporated these features.

The first group is the expressions in U2:

not only affirmative or negative expressions but also other expressions. We also

consider the expressions in U1 and the relationship between U1 and U2.

So finally we also incorporated the relationship between U1 and U2.

These are the listed features.

So this first part, six

features,

covers the

expressions in U2.

The first two correspond to the baseline, and we also

incorporated other features.

The second group represents expressions in U1, that is, the user

utterance before the implicit confirmation request.

And

the last one is the relationship between U1 and U2. That means

whether U1 and U2 contain

the same word or not.
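
[Editorial sketch] The classification step described above can be illustrated roughly as follows. This is a minimal sketch, not the speakers' implementation: the three binary features, the cue-word lists, the stopword list, and the toy training pairs are all assumptions made for illustration, and the logistic regression is a small hand-written gradient-descent version rather than any particular library.

```python
import math

# Hypothetical cue-word lists; the talk does not specify the actual expressions.
AFFIRMATIVE = {"yes", "right", "yeah"}
NEGATIVE = {"no", "not", "what"}
STOPWORDS = {"i", "a", "the", "to", "is", "it"}

def extract_features(u1, u2):
    """Map a (U1, U2) pair to a small feature vector (assumed features)."""
    w1, w2 = set(u1.lower().split()), set(u2.lower().split())
    return [
        1.0,                                     # bias term
        1.0 if w2 & AFFIRMATIVE else 0.0,        # affirmative expression in U2
        1.0 if w2 & NEGATIVE else 0.0,           # negative expression in U2
        1.0 if (w1 & w2) - STOPWORDS else 0.0,   # U1 and U2 share a content word
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, features):
    """Probability that the predicted category is correct."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def train(samples, labels, epochs=2000, lr=0.5):
    """Fit logistic regression weights with batch gradient descent."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(samples, labels):
            err = predict_proba(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad)]
    return weights

# Toy data: (U1, U2, label), label 1 meaning the predicted category was correct.
TOY = [
    ("i cooked nasi goreng today", "yes i love nasi goreng", 1),
    ("i ate pad thai yesterday", "right pad thai is great", 1),
    ("i tried borscht today", "what are you talking about", 0),
    ("i had paella yesterday", "no it is not", 0),
]
WEIGHTS = train([extract_features(u1, u2) for u1, u2, _ in TOY],
                [label for _, _, label in TOY])

def is_category_correct(u1, u2, threshold=0.5):
    """Regard the category as correct if the probability is larger than 0.5."""
    return predict_proba(WEIGHTS, extract_features(u1, u2)) > threshold
```

The 0.5 cut-off mirrors the decision rule mentioned later in the talk; everything else is a placeholder to be replaced by the real feature set and data.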

and also featured by

what is a before the result hundred

Next, data collection.

We collected user utterances before and after implicit confirmation requests

via crowdsourcing.

First, we asked crowd workers to input a sentence

about a specified term,

for example i eight by how to fold up for that

and then the system

generated an implicit confirmation request

that was either correct or incorrect.

These requests correspond to the specified term, so we prepared

implicit confirmation requests for each specified term,

for example italian particle for data twenty well i mean the dishes

and then we asked

the worker to respond to this

confirmation request.

So we prepared twenty terms and their corresponding correct and incorrect

implicit confirmation requests.

We asked

one hundred workers

and collected a total of two thousand utterances,

and after that we excluded invalid utterances.

So this is the result of using logistic regression with ten-fold cross

validation.

We regarded the category as correct if the probability was

larger than 0.5.

This row is the baseline and this one is the proposed method's result.

This

table shows the confusion matrix,

and we can see that the classification accuracy improved,

and

especially no precision of the detection of the product cut a woody

improved

and this the most significant feature was if able that the you might include the

cut they were eating use it is one

so that means and that same topic if the shared what the shared

the u one and this one

and that it is you insignificant a feature but not that it's a user included

start of it

This result shows that the proposed method improved the detection of

incorrect categories.

Let me move on to the next part,

the second proposed method.

Here we integrate the

probabilities.

This confidence measure is used to determine whether the category is correct from the

user responses.

Here we also used logistic regression.

Actually, we tested other regression functions such as random forest,

and here we show the output of the logistic regression.

We use the features listed here.

We regard category c as the correct one for the unknown term w

when the confidence measure exceeds a threshold. We change

the threshold value, and if the measure

exceeds the threshold,

the system adds the term

to the system's knowledge base.
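
[Editorial sketch] The integrate-then-threshold step can be illustrated as follows. Note the substitution: the talk integrates the per-response probabilities with a second logistic regression over a feature list that is not recoverable from this transcript, so this sketch swaps in a simpler stand-in, the average of the log-odds. The function names, the dictionary knowledge base, and the 0.9 threshold are assumptions for illustration only.

```python
import math

def logit(p, eps=1e-6):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def confidence(probabilities):
    """Stand-in confidence measure: mean log-odds mapped back to (0, 1)."""
    if not probabilities:
        raise ValueError("at least one user response is required")
    mean_logit = sum(logit(p) for p in probabilities) / len(probabilities)
    return 1.0 / (1.0 + math.exp(-mean_logit))

def maybe_acquire(knowledge_base, term, category, probabilities, threshold=0.9):
    """Add (term, category) to the knowledge base only above the threshold."""
    if confidence(probabilities) > threshold:
        knowledge_base[term] = category
        return True
    return False

if __name__ == "__main__":
    kb = {}
    # Probabilities from single responses by several users (made-up values).
    print(maybe_acquire(kb, "nasi goreng", "Indonesian", [0.95, 0.90, 0.97]))
    print(kb)
```

One inconsistent response pulls the averaged score down, which is the behaviour the second proposed method is meant to exploit: a single noisy user should not be enough to add or block an entry.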

So these are the experimental conditions. We used the same data

as I explained before,

and we divided them into training and test sets

to make that experience perfectly open use a block on the request

and the we selected in this policy is happening with that probability a longevity from

a forty nine or forty eight one of the discourse in that we have all

a lot of data fit

and the daily all that in the feature value problem that in response it according

to the computer vision

and this

Now I will show the results.

Here we present three points. The first is the performance improvement by

multiple user responses. The second one is how many responses are needed to acquire the categories

correctly.

The third one is how to set the threshold for the confidence measure.

This is the result for the first question.

So we introduced

the BEP, which means the break-even point.

It indicates the point where the precision rate is equal to the recall rate.
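
[Editorial sketch] The break-even point can be computed from scored examples roughly as follows. This is an illustrative sketch under assumptions: the scores and labels are made up, thresholds are swept over the observed scores only, and when precision and recall never coincide exactly the closest pair is averaged.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are accepted as correct."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        accepted = score >= threshold
        if accepted and label:
            tp += 1
        elif accepted:
            fp += 1
        elif label:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def break_even_point(scores, labels):
    """Value at the threshold where precision and recall are closest."""
    best_gap, best_value = None, None
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall(scores, labels, threshold)
        gap = abs(precision - recall)
        if best_gap is None or gap < best_gap:
            best_gap, best_value = gap, (precision + recall) / 2.0
    return best_value

if __name__ == "__main__":
    # Made-up confidence scores; label 1 means the category was truly correct.
    scores = [0.9, 0.8, 0.7, 0.4, 0.3]
    labels = [1, 1, 0, 1, 0]
    print(break_even_point(scores, labels))
```

Comparing this single number across different numbers of user responses is one way to read the precision-recall curves discussed next.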

so i received operational and recall car and the we can see that p b

e p value all and do not have to while larger than

the top and recall while so

that point and the diva any the larger than two

So this means that adding user responses improved

the logistic regression function's

performance

in determining whether the predicted category is correct or not.

The second question is how many user responses are needed.

So I plotted

the increase in the BEP value.

The horizontal axis is the number of user responses, n.

This graph shows that the increase in the BEP value was large

when n is small.

So this indicates that it is worthwhile to ask multiple users

for

responses to the same

implicit confirmation request.

We can also see that the increase diminished

as n became large.

so this means that the d needed to be done problem asking what use

The final question is how we can set the threshold.

We think that a high precision rate is required because

the system should avoid acquiring incorrect information.

So we set a high threshold so that the precision rate becomes almost one,

and we examined the recall rate in this case.

so we can see that the recall rate for indy five

well as a zero point two one table one seven five

So it is not so high, but we think that is okay because

we want to avoid acquiring incorrect information.

We also see that the recall rate increases as the number of responses increases.

So this means that, even with a high threshold, more responses enable the system to acquire more

categories along with a high precision rate.
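
[Editorial sketch] The threshold choice described above, fixing a very high precision and then reading off the recall, can be illustrated as follows. The data, the function names, and the default target precision of 1.0 are assumptions for illustration; the talk only says the threshold was set so that precision became almost one.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores >= threshold are accepted as correct."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        accepted = score >= threshold
        if accepted and label:
            tp += 1
        elif accepted:
            fp += 1
        elif label:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def best_recall_at_precision(scores, labels, target_precision=1.0):
    """Return (threshold, recall) with the highest recall meeting the target precision."""
    best = None
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall(scores, labels, threshold)
        if precision >= target_precision and (best is None or recall > best[1]):
            best = (threshold, recall)
    return best

if __name__ == "__main__":
    # Made-up confidence scores; label 1 means the category was truly correct.
    scores = [0.9, 0.8, 0.7, 0.4, 0.3]
    labels = [1, 1, 0, 1, 0]
    print(best_recall_at_precision(scores, labels))
```

The returned recall is the fraction of truly correct categories the system would still acquire while adding essentially no incorrect entries to its knowledge base.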

Okay, so let me summarize this talk.

Our ultimate goal is to realize a system that learns

through dialogue.

The problem tackled in this paper is to determine whether the predicted category is

correct or not after the implicit confirmation request.

We proposed two methods: devising a feature set, and

integrating the probabilities into one confidence measure.

The results showed that performance was improved.

Our future work is

twofold. The first one is to compare implicit and explicit confirmation

when you quit

We assume that implicit confirmation is better

from the viewpoint of user experience, but we need to verify that.

The second one is to incorporate the proposed method into a prototype.

Thank you for your attention.

Okay, so we have time for some questions.

and it's

okay

in future work

g u i

i think

right

you

three

so you

once

yes it's incentive for your comment and the i think we you

and you want to use it to undo a we need to

carefully designed that

com experiment and

i only that we needed to compare the just techniques speech type and but you

proceed

i and also

explicit and

not a kind of intuitive and based centre for document

a question

so one to the to do so it is to have the system so

is you

rubber goal is to tell you not just cool

so that gives the user chance to say no stop

or just

not do anything so it is not as intrusive so you don't do not problem

but sometimes very clear that this make this assumption

one point is don't being with this method in that sense about

so intent to talk about that

one cory to

enjoying the conversation

As you said, we think that this kind of repeated question is

very annoying.

We want to

keep the conversation continuing,

and to do that we are introducing implicit confirmation.

you just a i mean i do so it is not a question is just

the state so the user doesn't a response you don't to the dropped it is

you that's

most of your research on

there are lots of the target restaurant

What about combining the implicit and explicit? Intuitively, if you have high confidence in your

understanding you could use the implicit, and with very low confidence

maybe the explicit.

You know, that seems a little more natural, maybe getting over some of the deception

issues.

the first question

consider you could also they can

these things got nothing to annoy keeping a threshold how many questions are allowed to

ask but it those kind of techniques

so i already that under a at various you on the data that dialogue strategy

So we just think that repeating this kind of expression in a conversation is very annoying.

so but we need to

but i thought you could speechto coverage on implicit confirmation

so i don't

for convenience if one to control that confirmation process

Okay, so it is just about time, so let's thank the speaker.

Lexical Acquisition through Implicit Confirmations over Multiple Dialogues

Second WOCHAT Special Session on Chatbots and Conversational Agents (WOCHAT-SS)

Kohei Ono, Ryu Takeda, Eric Nichols, Mikio Nakano and Kazunori Komatani