0:00:15Thank you, Brian.
0:00:16Good morning.
0:00:17It is my honour to be here
0:00:19and this is my first time in Interspeech.
0:00:23I thought that
0:00:24Interspeech have an interesting culture that before the talk there is dancing and singing,
0:00:30but it was not offered to me.
0:00:32So
0:00:33and I saw many old friends here, many familiar faces
0:00:37and this community is technically like a cousin to me.
0:00:43I have learnt a lot from this community.
0:00:46I'm working in signal processing in general
0:00:52with application to communication, multimedia and many other things, but not speech.
0:00:56I don't know why, but I found the answer yesterday.
0:01:05thing, okay.
0:01:07Yesterday,
0:01:09when I first came to America as a teaching assistant
0:01:13I was teaching assistant for signal systems
0:01:16and I had to teach students there's linear shift in invariance ??,
0:01:21there's a concept ??
0:01:23And every time they were laughing behind me,
0:01:25so I asked them once: Why are you laughing?
0:01:28And they told me.
0:01:30F. You need to pronounce your F.
0:01:35That's why I cannot do speech processing.
0:01:38So
0:01:40today
0:01:42just like Brian said,
0:01:43there's no speech, there's no keynote, if I submit a paper to Interspeech it will
0:01:47be 'yes' or it will be rejected,
0:01:49because it doesn't match heading, keynote yesterday or keyword at all. However
0:01:58I hope this can be useful to you. I learnt something like hidden Markov model
0:02:03and deep learning
0:02:04from this community and I hope that what you see today, the idea, may be
0:02:08useful to you.
0:02:10Okay,
0:02:10so the story begins.
0:02:12First
0:02:14social media.
0:02:15This is a new phenomenon. More and more decisions and
0:02:20activity in our daily life are being recorded, tracked and shared.
0:02:25It's scary, right?
0:02:26Yes, it's scary. This is kind of user-generated big data.
0:02:31For example like our motion, our movement everywhere, they are being tracked.
0:02:35Yes, it is scary. However it is also a 'plus'.
0:02:40Some potential and very interesting and important research direction for us
0:02:45and machine learning is a very powerful tool
0:02:49as being used in this community and also potentially
0:02:52in many other enterprises, such as social media.
0:02:56Machine learning aims to use reasoning to find new and relevant
0:03:00information given some background knowledge. There're many applications
0:03:04and
0:03:05you all know, I believe many of you using that machine learning consists of three
0:03:10elements:
0:03:10representation, evaluation, automatisation. I will not get into detail of these.
0:03:15Although it is a very powerful tool
0:03:17there are limitations and constraints.
0:03:20Why?
0:03:21Because the generalization assumption may not hold.
0:03:24Especially in user-generated data scenario.
0:03:27Why?
0:03:28Because the user behaves differently at different time under different settings.
0:03:34You behave very differently when you are hungry or
0:03:36you need to go to restroom then you cannot sit here, yeah? You, we
0:03:40at different we may behave in a different way.
0:03:42And the training set may not be statistically consistent
0:03:46with the testing set.
0:03:48That was the assumption in this machine learning algorithm
0:03:52and also single objective functions
0:03:54cannot cover user's
0:03:56difference interests.
0:03:58Users have different interests and therefore they have different objective functions.
0:04:02And users are also rational and therefore they are all selfish. They all want to
0:04:08optimize their own utility.
0:04:10And we need to consider this.
0:04:12So
0:04:14the knowledge contained in the data
0:04:17can be difficult to fully exploited from
0:04:20this machine learning macroscopic view. Although it's very powerful.
0:04:25Because the data are the outcome of users' interactions.
0:04:29We cannot ignore the users' interactions.
0:04:32We need to explicitly consider interaction of users
0:04:36and basically that mean their decision process
0:04:38should be taken into consideration.
0:04:41And in the sequels you will find out that Game theory is
0:04:45powerful tool
0:04:46that can study users' strategic decision making, because we can take into
0:04:51multiple objective function at a single time and we can
0:04:54also take into consideration of users' local and individual interests
0:04:58into consideration.
0:05:01Okay,
0:05:02let's face it.
0:05:03Learning is for what? Learning is for making optimal decisions. We don't learn just for
0:05:10fun.
0:05:10And learning and decision making are coupled together
0:05:13due to network externalities. What is that? What does I mean network externalities? I mean
0:05:18that my decision, my action
0:05:20will affect your decisions and your behavior.
0:05:24We all affect each other. You made decision to come
0:05:26here and I made decision to come here, that's why we are all here.
0:05:30We affect each other. However what is the problem?
0:05:33The problem is that there is a missing link.
0:05:36And the missing link is that
0:05:38in a very traditional sense machine/social learning and strategic decision making are still two
0:05:45unrelated disciplines.
0:05:47And so here
0:05:48we propose
0:05:49a notion of decision learning to preach in between
0:05:53that
0:05:54when we are doing learning we also need to consider the decision making process.
0:05:59So this is a
0:06:00title of the
0:06:01talk. 'Decision learning' by that we mean learning with
0:06:06strategic
0:06:07decision making.
0:06:09Okay,
0:06:10I'm going to use three example to illustrate
0:06:12how decision learning work.
0:06:14First decision learning with Evolutionary User Behavior. Second is with Sequential User Behavior and then
0:06:21we'll talk about how we design mechanism
0:06:23from system perspective.
0:06:26First let's talk about Evolutionary User Behavior. Here I want
0:06:30to use information diffusion over social network as an example.
0:06:34Okay,
0:06:35now you see
0:06:37this is the Twitter hashtag during 2008 US Presidential elections.
0:06:43It
0:06:44captures spreading of comments and phrasing by candidates.
0:06:48And look at
0:06:49pink one, you can see that pink one
0:06:51this one says it is lipstick on a pig.
0:06:54It is a phrase by famous Sarah Palin, if you know her.
0:06:58And you can say that when a phrase was comment
0:07:01there is a duration and it'll reach to the top
0:07:04and then eventually it decreases and then goes down.
0:07:08You see that everything have a durations.
0:07:10And
0:07:11this is for the political comment
0:07:14and this was done in 2008. Now, if you are politician you want to measure
0:07:22that your
0:07:23message can be delivered. When to reach to the peak?
0:07:26When will be the starting point and when will be the end point?
0:07:28It had to be before the election not after, right?
0:07:31Okay, now this is the other point.
0:07:34That is online advertising for new product.
0:07:36How to predict
0:07:38the popularity of this advertisement and eventually what is the market share?
0:07:45Can we predict that?
0:07:47Okay, all of this relate to one problem that we call Information Diffusion Problem.
0:07:51How information diffuses?
0:07:54Users exchange information over social
0:07:56network. Okay, we're talking about this on the our social network
0:07:59and the study of this information diffusion,
0:08:01this problem is important. Why? Because if we understand the dynamics
0:08:06of this information diffusion. We can predict and maybe control
0:08:10the start/peak timing and distribution.
0:08:13And we can estimate a final reach of the populations
0:08:16it also perhaps we can identify the influential users and links
0:08:21for our purpose.
0:08:23Okay,
0:08:24now
0:08:25Dynamics In Information Diffusion.
0:08:28It is a sequence of decision making processes.
0:08:31It is not what we said, what we though that information just
0:08:35spreads by itself.
0:08:37No.
0:08:38For information to diffuse
0:08:40it relies on other users to forward or post the information they receive.
0:08:45Like I have my social network and if you are in my network, I post
0:08:48something and you will see that
0:08:50and once you see that you have to make a decision to post it or
0:08:53not
0:08:53and then from there now everybody when you are in somebody's social network
0:08:58everybody have a decision making process: Should I post, is the information exciting,
0:09:03are our friends interested in it? Or I don't want to post because
0:09:07it could be embarrassing.
0:09:09There is a decision making process here
0:09:11and information diffusion is an evolving process
0:09:15like our evolution.
0:09:17So
0:09:18for information diffusion we ask
0:09:21whether to post
0:09:23or forward or not.
0:09:25And for evolution process
0:09:27when there's a gene that have a mutant
0:09:29we ask
0:09:30whether to take mutant or not.
0:09:33So it's similar, so we want to relay this together and model this problem.
0:09:37And social network is always illustrated by a graph structure and we understand that so
0:09:42we are using the
0:09:43Graphical evolutionary game.
0:09:45Now
0:09:46what is this Graphical Evolutionary Game? Let me make a very simple explanation.
0:09:51Each player have
0:09:52a notion of fitness.
0:09:55How fit you are? How fit I am?
0:09:57Okay and so this is an evolution that we all know. The stronger one, the
0:10:03fittest will survive.
0:10:05A users' fitness
0:10:07is determined by this as you can see I tried to use this equation and
0:10:11this is the user's fitness.
0:10:13And B is the baseline fitness that
0:10:15depend on myself, okay.
0:10:17And this U
0:10:18is interaction payoff
0:10:20determined by the environment.
0:10:22And this alpha is selection intensity.
0:10:26So if I'm going to interpret this I would say
0:10:28that
0:10:29someone, like for someone here Haizhou Li here in Singapore
0:10:34his fitness depends on his own fitness
0:10:38plus the selection intensity multiplied by overall environment fitness of Singapore together, that is
0:10:44his own fitness strenght.
0:10:47I think that makes sense.
0:10:48So
0:10:50an evolution process has two important elements. One is selection and one randomness.
0:10:56These two are very important in the evolutionary process.
0:10:59So let me use this example. We have a graph
0:11:02and then randomly we select
0:11:05a node
0:11:06that can be a user.
0:11:08And once we select this one we have already randomness in here, now
0:11:13we are going to make selections, okay.
0:11:15So
0:11:16we have to compute fitness.
0:11:19Its own fitness and also its neighbours' fitness.
0:11:23I want to compute all these fitness and we select
0:11:26the strongest fits
0:11:28to imitate.
0:11:29It doesn't have to be imitation, I just used imitation as an example.
0:11:33There's a ?? protest and ?? despair there many different kinds of processes.
0:11:36Okay,
0:11:37so
0:11:38we need to know which one is the best
0:11:41fits
0:11:42and then we are going to imitate
0:11:45and
0:11:46this is the best fit and we imitate,
0:11:49okay.
0:11:49So this is the
0:11:51a typical evaluation process that is determined
0:11:55by users fitness.
0:11:57Now
0:11:58Information Diffusion Over Online Social Networks
0:12:00I have just say that
0:12:03online social network can be presented as a graph.
0:12:06As we can see it's a graph.
0:12:07And this information diffusion depend on the user's actions to forward or not depends on
0:12:13utility.
0:12:14If I like or my friend may like it I will calculate the utility of
0:12:19that and if
0:12:20that utility is strong I'm going to forward or post.
0:12:23If that is embarrassing then I am not going to do it. It'd damage my
0:12:27reputation.
0:12:28So this is a very similar to the
0:12:31Graphical evolutionary game.
0:12:33Now on the left hand side
0:12:35is the Graphical evolutionary game. On the right hand side we have the Social network,
0:12:39okay.
0:12:39Now
0:12:40there we have notion of fitness as I
0:12:43explain. Right hand side we have the user's utility. Now we
0:12:46basically we try to model the entire problem using this utility.
0:12:51Is there interest or not that we can do so. So on left hand side
0:12:55is that how can we map
0:12:58and relay that utility or our interest to the notion of fitness
0:13:02and then we can use Graphical evolutionary game which is a
0:13:05very powerful tool that have been useful sometime, that we can use.
0:13:09Okay, so how can we calculate that? I'd use a very simple example here.
0:13:15Now let's look at this graph.
0:13:17Now we have a user.
0:13:20He chooses strategy to foward.
0:13:23Okay. What is his own fitness?
0:13:25Here I choose B=1.
0:13:27This is based on his own.
0:13:29And the neighbour is what?
0:13:30We have Kf neighbor here. We have Kf neighbour
0:13:34with utility that forwards also.
0:13:37The rest of them K-Kf
0:13:39choose not to forward.
0:13:41Okay, so this is the utility of all his neighbours and you have selection intensity
0:13:46if that
0:13:46overall together with his own
0:13:49fitness. This is the overall fitness of
0:13:53the user who have the strategies to forward.
0:13:56For someone not to forward,
0:13:57we'll be doing the same,
0:13:59okay.
0:14:00Now for this
0:14:01network, we have this configuration probability.
0:14:05We have the fitness of the 'center guy'.
0:14:07Now we need to calculate the fitness of all his neighbours.
0:14:11Let's say someone with strategy to forward also.
0:14:14What is the
0:14:16fitness
0:14:17and if someone with strategy not to forward
0:14:20what is the fitness? So we calculated all this.
0:14:23Now
0:14:24with this we want to calculate what is the
0:14:28probability that someone will change from forward strategy
0:14:32to not to forward strategy?
0:14:34And we can calculate all this, okay. The details I will just omit.
0:14:37By doing all this
0:14:39we can find the dynamics of information diffusion.
0:14:43This dynamics of percentage of users who will
0:14:46forward the information will be captured by this equation that we can calculate
0:14:50and
0:14:51the final evolutionary stable state
0:14:54can be also obtained by this. And let me explain this. What is this
0:14:58if the utility of forwards, all forwards is much larger than the rest who will
0:15:03not do anything about it then
0:15:06one hundred percent of the users will all forward and post.
0:15:08Because it's too advantageous to them.
0:15:12If on the other hand
0:15:14not to forward have the highest utility and to forward
0:15:17have the lowest utility then noone is going to do it.
0:15:20Follows in between, there will be a stable percentage of population that will do that,
0:15:27okay. This looks complicated.
0:15:29Now K is degree of freedom. Meaning for each node how many friends you have.
0:15:34This assumes that
0:15:35(scale) is a large enough, okay. Let's assume K is sufficiently large. Now you can
0:15:40see that basically it
0:15:41is independent of the degree of freedom.
0:15:44And meaning what? Meaning only the utility can determine how much percentage of the population
0:15:50will forward or post information or not.
0:15:53Okay, now let's see this
0:15:56experiment. These are real data we acquired from Facebook. In this particular graph we have
0:16:024039 users with 88234 edges.
0:16:05And here we have ten
0:16:07subgroups, ten major user subgroups. First this is a social network so you can see
0:16:12that in
0:16:13the low scale it is a straight line here. This is so called scale-free network
0:16:18meaning
0:16:18that this is low scale so low scale people have very high
0:16:23degree of connectivity, meaning it's very powerful. It is exponentially less, okay. So that is
0:16:29something that we all knew before.
0:16:31Okay, now let's look at Evolutionary Stable States.
0:16:35That is
0:16:36now let's look at left hand side first.
0:16:39We use four cases as examples. So now we know the utility function, okay. Utility
0:16:44function of
0:16:45four parameters now are our model parameters. We want do model of information diffusion process.
0:16:50So now look at this. The first case.
0:16:53If utility of forwards is very high or higher than others, then what?
0:16:57One hundred percent of user or everybody will post.
0:17:01If
0:17:02it reduces
0:17:03and is not as good
0:17:05however still good enough then
0:17:07then there are some percentages in this case some sixty and some percent that will
0:17:12post or forward.
0:17:13If it's getting less
0:17:15then about thirty and some percent
0:17:17will post and forward.
0:17:18If on the other hand
0:17:20it's not good at all. The utility is so bad, there's nobody left to post
0:17:24then
0:17:25nobody is going to forward it
0:17:27and do anything. And there is a rising time and we can calculate that time
0:17:32and eventual population,
0:17:33that percentage of people. We said there are ten subgroups and in each group they
0:17:37all behave the same. This
0:17:38is just to show that.
0:17:40Okay, now this is something that we say: Okay, if we had a model parameter
0:17:45what is the behavior?
0:17:46Now, how about we don't have the model parameter. We don't.
0:17:49We only had the data and this of the data from MemeTracker.
0:17:53MemeTracker is this
0:17:54it is
0:17:55this news, this cycle network. It builds map
0:18:01for the daily news cycle, okay.
0:18:03By trying to link
0:18:05that what others,
0:18:06these are news report,
0:18:08for these information.
0:18:10You see as example here is a comment: we're not commenting
0:18:15on that story I'm afraid. Okay. With this
0:18:18then someone reports: we're not commenting on that. And there are three different links. And
0:18:23then somebody
0:18:23changed a little bit: we're not commenting on that story.
0:18:27And then there is a link, okay. So
0:18:30this is a very good source of
0:18:32to find these information, how you diffuse, okay. So we had this data and this
0:18:36is a huge amount of
0:18:37data. As you can see that we have more than 172 million news, articles and
0:18:42sources.
0:18:43Okay, so we might from here
0:18:45now what we want to do is
0:18:47we want to learn the utility
0:18:48that is, I have this vast amount of data,
0:18:51can I find a utility? I can reduce to few parameters to describe how informations
0:18:56may diffuse.
0:18:58Okay, this is the result.
0:19:00This is dynamic of four pieces of information.
0:19:03This curve associates with word
0:19:06Google Wave.
0:19:08This associates with the word
0:19:10Tehran.
0:19:11The grey line is a real data.
0:19:14And the red one is our fitting, our result.
0:19:17You can see it very well and the blue line
0:19:20is using these only
0:19:22using only the machine learning approach.
0:19:24So you can see that
0:19:26what we can do can fit much better.
0:19:29And most importantly that not only that
0:19:32we describe this information diffusion phenomena into, in this case only in two parameter, because
0:19:38we normalize these two, okay there are more and more. We only used two parameters
0:19:43we can describe how the information
0:19:44diffuse, when it stops, when it reached to the peak and the
0:19:48way it (ends) and if we only have partial information we can predict.
0:19:51I didn't have something to show you we can predict
0:19:55and also.
0:19:57Okay.
0:19:58Now
0:19:59let's do this experiment. We have five
0:20:02group of sites from the database, okay. We know this is a
0:20:05large database. Each group contains five hundred sites.
0:20:10And we estimate the final stable population,
0:20:13the lowest population that posts or forwards these information.
0:20:16So these five groups we can find very interesting.
0:20:20At the end it had different percentage of
0:20:23users in this particular group.
0:20:25That will post or forward information.
0:20:28And
0:20:31this is to show that the black one is, these are from,
0:20:36these are our model.
0:20:38And the red one is from the real dataset and you can see that they
0:20:42are very consistent.
0:20:43Okay, now back to this.
0:20:44What does this means?
0:20:46This is very interesting. We can see that group five
0:20:49behave cohesively
0:20:52or share major common interests.
0:20:54That means that those users in this particular group
0:20:59more than eighty one percent will agree with the social (group),
0:21:04with their own networking friends. Whenever they post something
0:21:07eighty one percent that they will also post.
0:21:11So
0:21:12and however in the group one. There only about
0:21:16nineteen percent will do so.
0:21:18Therefore if you hire an advertiser you want to
0:21:21do advertisements. Which group are going to forward to?
0:21:25And then to expect
0:21:27a high
0:21:28percentage of
0:21:30results that many people with see.
0:21:32That would be this group.
0:21:34So we can
0:21:35expect our group five in this particular case behave
0:21:38cohesively and share major common interests. However group one
0:21:43share rather very little common interest.
0:21:45This is to mine
0:21:47the coherence of group and
0:21:50if we can do anything about it or not.
0:21:53Okay.
0:21:54Now I'm going to change gear to talk about Sequential user behavior.
0:21:59Here I'm going to use some
0:22:01well known website that we collected data from to
0:22:07illustrate the problem. You all know Groupon. I think some of you may also use
0:22:10it and advantage of it,
0:22:11right? On Groupon we sometimes find out there is a very good deal for a
0:22:15restaurant
0:22:16that you always go and we buy from there.
0:22:19And
0:22:20and Yelp.
0:22:22We're also using Yelp.
0:22:23When I am here I am using Singapore app to determine which restaurant I want
0:22:29to go.
0:22:31Once we pull them together it gets very interesting.
0:22:33We found that
0:22:35this is a Yelp rating
0:22:36for some very good restaurant.
0:22:39We found that
0:22:41Yelp's star rating decline, okay you see this is declining after a succesful Groupon deal.
0:22:49With rating that's very high and
0:22:52many people buy this Groupon
0:22:54and then in the next few months you can see that
0:22:57its rating is declining.
0:23:00Why?
0:23:03This is a negative network externality
0:23:06in place here. The one I'm mentioned, I'm going to explain that.
0:23:09So this degrading quality may be due to overwhelming number of customers, because it's too
0:23:15successful. Many people buy and then in the following day many people will show up
0:23:20and they only have three waitress. And then you have one hundred people who show
0:23:24up. Quality would not be good.
0:23:26Kitchen is limited and may not be good.
0:23:29That could be the reason.
0:23:31Okay, so
0:23:33what is a phenomena here?
0:23:35Learning
0:23:36to get best deal. Everybody want to get best deal.
0:23:39And then they make a decision to take advantage of that and then their decisions
0:23:44affect each other.
0:23:45Reduce all this ?? fashion from each other.
0:23:49That's what we call
0:23:50negative network externality.
0:23:53Your decision and my decision
0:23:55eventually affect each other,
0:23:58okay.
0:23:59So how can we have model this problem?
0:24:01I like to go to a problem in machine learning called Chinese restaurant problem or
0:24:07Chinese
0:24:07restaurant process. Some of you are using it also.
0:24:10I don't need to explain this is Singapore if you don't know
0:24:13what Chinese restaurant problem is, just choose chinese restaurant you go to, okay.
0:24:17So
0:24:17Chinese restaurant problem
0:24:19we have limited round tables with unknown size
0:24:22and this round table is of unknown size could be cloud service or it could
0:24:25be a deal,
0:24:26okay.
0:24:27And customer which choose table to sit
0:24:29or open a new table.
0:24:32For the Chinese restaurant process we have infinite number of a tables,
0:24:36with infinite number of seats.
0:24:38A customer enters and with predefined probably to
0:24:43either choose table to sit or open new table.
0:24:46This is non-strategic because it's parametric, okay.
0:24:50So now
0:24:51we introduce Chinese restaurant game.
0:24:53We want to introduce this strategic behaviour, this decision making here. Why? Because
0:24:58eventually we want to understand the negative
0:25:02externality effect, to model it and to understand that.
0:25:06Okay, so this Chinese restaurant game.
0:25:09We have table
0:25:11with unknown size
0:25:12Rx (θ):
0:25:13is system state (θ). System state is the condition of the restaurant. How much money
0:25:16they have, how much
0:25:17budget they have and the environment, okay.
0:25:19A customer has signal S about the system state from
0:25:22advertisement, from friends, from wherever they have been before.
0:25:26Or from advertisement.
0:25:28Which table to seat
0:25:30to have maximum utility?
0:25:32You know that if you go to chinese restaurant some day it'd very unfortunate many
0:25:36people seat
0:25:36with you and eat, is that right?
0:25:39Utilities is very bad. It's not confortable at all.
0:25:42And this is a negative network externality.
0:25:44More customers at one table less space for each user and therefore
0:25:49utility is less
0:25:50for past and future.
0:25:52What does that mean in future?
0:25:54That mean
0:25:55when you come in
0:25:57based on the condition you sit at a table that may not be the best
0:26:01decision.
0:26:03In ten minutes
0:26:04ten people may sit with you, okay. So you need to predict the future and
0:26:08that is the problem and the
0:26:09future affect you decision right now.
0:26:11And that make it in fact difficult.
0:26:15Okay, so this is a sequential decision making problem, customers make decisions sequentially
0:26:20and information of observation that we have is that for every customer comes,
0:26:24every player, they can only seat
0:26:27at that moment how many people seat at the table.
0:26:30And then also the signal, okay, from those who seat before him
0:26:34that's all he can get.
0:26:35So when first customer arrives
0:26:38it picks up a table
0:26:39and then the second one
0:26:40pick another table, and so on and everybody picks one. Sequential decision making process.
0:26:49Can
0:26:49number three user make the best decision at that moment?
0:26:53No.
0:26:54If he cannot predict the future and see the future his decision at that moment
0:26:58may not be the
0:26:59best, because people not yet coming are going to affect him,
0:27:03okay.
0:27:04So now let's assume there is a perfect signal, meaning that
0:27:07everybody know all the signals, all the decisions other people may want to do given
0:27:11that
0:27:11condition that they have.
0:27:15I would not get to the details so I'll explain
0:27:17then there is equilibrium grouping meaning the best utility is
0:27:21that when you choose that
0:27:24with this signal from everybody it will be the best.
0:27:29When anyone changes to other table
0:27:32the utility will only reduce.
0:27:34So that would be the best strategy.
0:27:36When a signal is perfect.
0:27:38Let me explain this. So what will be the strategy?
0:27:41When everything was,
0:27:43all the condition if I come in I will know
0:27:46ten minutes or thirty minutes later of what people come here
0:27:48I have the perfect information. You know what?
0:27:50If I come in
0:27:51I'm going to know which table have the best utility and I'm going to choose
0:27:55that.
0:27:55The second one will choose that also, until the table is filled. And then they
0:27:58will go to the
0:27:59second best utility and do so, because the signal is perfect, right.
0:28:03So
0:28:05customers who come early
0:28:07have
0:28:08advantage
0:28:10but those were protect signals.
0:28:11In real life
0:28:13it's not that easy.
0:28:14It's imperfect signal. We cannot have the
0:28:17perfect signal and therefore we have to learn
0:28:20from the noisy signal to form our belief and this is the learning process
0:28:25and we can use Bayesian learning or some other thing to construct the belief that
0:28:29we have,
0:28:31okay.
0:28:32So
0:28:35now this.
0:28:36What is the best response? The best response is:
0:28:39let's choose the best utility.
0:28:42But in order to choose the best utility we need to know the final distribution
0:28:46of the users and like
0:28:48I said we need to know who come in, where they will seat?
0:28:52So this utility function depend on subsequent users' decisions
0:28:56and this is a negative network externality.
0:29:00And we need to predict the future decisions to find a best response
0:29:03therefore we had to ?? backward induction to do so. So that everybody is going
0:29:08to have optimal
0:29:09decision we can predict and then backward induction from there.
0:29:13Okay. Now this is the summary of this Chinese restaurant game and this is some
0:29:18example.
0:29:19Remember this?
0:29:20I say that
0:29:21if there is a good Groupon deal
0:29:25the utility will go down like that, okay. The red one is the real data.
0:29:29And now let's use linear model in this model
0:29:32the utility, okay.
0:29:33So
0:29:34this is utility that we model. We learnt from the real data that this is
0:29:37the utility.
0:29:38And based on this utility
0:29:40this utility and then we can attain all these parameters that we have and then
0:29:44we can use
0:29:46what we had just derived.
0:29:47We consider all this negative network extarnalities. We can predict the future to find optimal
0:29:52result.
0:29:53As you can see the blue one is learning with negative network externality that we
0:29:58propose and then
0:29:59the red one is without. Meaning only at that moment.
0:30:02We can increase ?? DSSR rating.
0:30:04We can indeed increased the rating by doing so.
0:30:08So if we consider this
0:30:11negative network externality then the strategy will perform much better.
0:30:17In fact we can also use this to devise strategy.
0:30:20This is a New restaurant's strategy, okay. Now we have a new restaurant.
0:30:24This is a low quality restaurant and a high quality restaurant.
0:30:29Now
0:30:30what is the number of customers?
0:30:34Look at this. For a low quality restaurant
0:30:37in order to increase the number of customers who arrive
0:30:41the deal price had to be less.
0:30:45If your quality is not good, you make it cheaper.
0:30:49If
0:30:50high quality restaurant even if deal price is a little higher you can still make
0:30:55a good profit.
0:30:57This is a signal quality. Your advertisement and also all affects to be none.
0:31:03If you a high-quality restaurant you want signal quality
0:31:06to be very good, so people know this is a good restaurant.
0:31:10If you have a low quality restaurant. Don't let know too much about you, okay.
0:31:14That would be better.
0:31:16And next we get to revenue. This is now what is the revenue?
0:31:20High quality and low high quality restaurants they can also all achive peak revenue.
0:31:26So
0:31:27for high quality restaurant to achive peak revenue
0:31:30the signal equality had to be good
0:31:32then the revenue was higher. Let people know
0:31:34and then
0:31:36through the word of mouth people will come.
0:31:39However for low quality restaurant it's not good.
0:31:42Don't let people..
0:31:44The better the signal quality the lower revenue you are going to have and see.
0:31:49Okay, so I think this all is common sense, however it's highly nonlinear
0:31:53and we had to model that and this is indeed a very highly nonlinear problem
0:31:57and I don't have this
0:31:59slide to show you. And in our paper you can take a look. So high
0:32:03quality restaurant should try
0:32:04to increase the signal quality and low quality address should hide the
0:32:09quality information and use low deal price to attract the customers.
0:32:15Indeed we developped a family of this. We had a Chinese restaurant game and we
0:32:18also had Dynamic Chinese
0:32:19restaurant game for people who come and go.
0:32:21And that is for example just like this Wi-Fi, okay. Many people come here and
0:32:25many people may leave and
0:32:27they decide which network they want to choose. And also we had this Indian Buffet
0:32:30Game.
0:32:31And that is
0:32:32interesting. We see all this in US, okay.
0:32:36Chinese restaurant in this all round tables you can
0:32:39see in Indian restaurant also had these buffet thing.
0:32:42Okay so
0:32:43we have multiple
0:32:45choice of what we can do
0:32:47and this can be applied to many different scenarios.
0:32:50And we've applied to for example we have seen this in Yelp, okay. There is
0:32:54Yelp on Amazon.
0:32:55And this is so for this review, okay. It's a sequential decision and Question and
0:33:00answer sites. You can see
0:33:01all these Question and Answers sites, we also had a problem on this.
0:33:05Also many other sites as long as you have this sequential decision process that come
0:33:10in
0:33:10and why these .. we can apply this to that, because social computing system in
0:33:16this case, they are all structured in the same fashion that
0:33:20users arise sequentially
0:33:21and to make decisions on
0:33:23whether to do something or not, whether to forward something or not.
0:33:26And also there are externalities amount users.
0:33:29Your decision, your actions will affect each other and also there is some uncertainty. So
0:33:34in this
0:33:35sense Chinese restaurant game will be a natural choice in modeling and analyzing these users'
0:33:40behavior, so
0:33:41that we can come up with the best
0:33:43piece of the model we have seen and also to devise some incentive mechanism,
0:33:47okay. To attain some results that you want.
0:33:50And I believe that in speech and language community
0:33:53that you also have something that you may want to do.
0:33:56Speaking of this mechanism
0:33:58we want to see this I believe that you use a lot of this a
0:34:01labeling also in this community.
0:34:03Now you have seen all this
0:34:05that's all from users' point of view, so you may ask can
0:34:08system designer design something based on this?
0:34:10Yes.
0:34:11So we want to see if we can design some mechanism to
0:34:14attain something that we really want.
0:34:17In this case I want to show you
0:34:19is that how can we collect
0:34:21good
0:34:22data
0:34:23okay.
0:34:25Large scale labeled dataset
0:34:28is very important in many applications. I think you all know better than me.
0:34:31Because more data leads to better accuracy,
0:34:34however large scale in-house annotation is very expensive.
0:34:38You need to have a thick pocket to do so.
0:34:41And often it becomes a bottleneck for that.
0:34:43And recently you have seen that microtask crowdsourcing is very popular.
0:34:49It claims to be a very promising way, because it can handle large volume, short
0:34:53time and low cost.
0:34:54All the benefits
0:34:56and a question is: is that so?
0:34:58We have many website that can do so,
0:35:01okay.
0:35:03There's a problem
0:35:04you see all the
0:35:05positive side, but there's something that didn't tell you.
0:35:08That
0:35:09due to limited budget, okay you and I don't have deep pockets, okay
0:35:13so that the collected data often have low quality.
0:35:18Why
0:35:19low quality? I will explain that. T mutual incentive thing, okay.
0:35:22Now
0:35:24for machine learning all signal processing, okay.
0:35:27Sometime we say okay
0:35:29let's deal with it, let's cope with this
0:35:31so let's filter out low quality data or cope with
0:35:34noise with this modified algorithm to be more robust and probable.
0:35:38But there is after-effect,
0:35:40okay. This is a our typical behavior up to that.
0:35:44Now with these decision learing solution
0:35:47maybe we can ask
0:35:49can we do a better job before that?
0:35:51Can we incentivize
0:35:54with a mechanism to ensure high quality data from the first place?
0:36:00Okay
0:36:02you want to give me more money? No, I don't have money to get you.
0:36:05Let's see if we can detain ?? something
0:36:07that can make you happy.
0:36:09We allow
0:36:09increase in your budget, maybe reduce your budget, okay. What that is
0:36:13okay.
0:36:15This
0:36:16crowdsourcing
0:36:18has such characteristic
0:36:20repetitive and tedious small tasks. Yes or no, correct or not correct. And you do
0:36:25that
0:36:25and how much do they earn?
0:36:27USD 0.04 cents for one thing.
0:36:29Too many
0:36:31too many of them have house or wife and they have nothing to do,
0:36:34when baby is sleeping they can make a little bit more budget, so they can
0:36:38help pay for some of the bill.
0:36:41So worker receives a small amount of reward and there's no competion among workers.
0:36:45So what's the problem?
0:36:47The problem is that this may lack proper incentive.
0:36:51It is profitable for a worker to submit poor solutions.
0:36:55Yeah?
0:36:56Because of the
0:36:58the number of solution is just too high, nobody is going to check on it.
0:37:02It'd be too expensive for you to check on it, right?
0:37:04Think about it. That's why it's expensive.
0:37:07So
0:37:08and however you want to check on all this, just like I said.
0:37:11Provide incentive is challanging. Why? Requesters typically have low budgets for each task
0:37:18and also to offer better incentive you have to pay more, so people can do
0:37:22better job, you need more
0:37:23money and also in order to verify the solution is also expensive. All these are
0:37:28problems.
0:37:29So what incentive mechanism should
0:37:31requesters employ to collect high quality solutions in a cost-effective way?
0:37:38So here we propose a mechanism
0:37:40that sounds interesting and we also have the Oracle analysts that found it can work
0:37:44and we also did
0:37:45experiments, so what is that? I'll show you.
0:37:47Now up to now there are two very basic reward mechanisms
0:37:51that we can do so. That we can evaluate the solution.
0:37:54One is called consensus mechanism. Consensus mean what? Okay
0:37:58now I have all these solutions people submitted.
0:38:00I do a majority vote. I don't know the right answer but
0:38:04I do a majority check. After majority check I know it's correct.
0:38:10Correct fits the majority check.
0:38:12Or I can do reward accuracy mechanism. Meaning I can
0:38:17I can verify
0:38:19the solution is correct or not with probability.
0:38:21It's too expensive with probability. I want to verify all of them, but I don't
0:38:25have time.
0:38:25If I had time I'd it myself.
0:38:27Okay, now I have these people so I may have some probability points, probability principle,
0:38:32probability
0:38:32that I'm going to make a sample
0:38:34and then to check on the solution.
0:38:37Now you can see these two had cost C or ??
0:38:41And this relates to a parameter and this is not under control of requester.
0:38:46So there's a problem here.
0:38:48This cost, the one I mentioned, is a fundamental problem.
0:38:51For the traditional (methods) it's a problem.
0:38:54So
0:38:55now
0:38:56before we do so
0:38:58let's
0:38:59come up with this incentive mechanism first that's called Mt, okay.
0:39:03This is called Incentive Mechanism via Training. Idea is this
0:39:06to employ quality-aware training to reduce mechanism cost.
0:39:11What does that mean?
0:39:13We will assign
0:39:15enforced unpaid training to workers when
0:39:18performing poorly.
0:39:20You flip off one poorly
0:39:22then they had to go to some training time and in this training time they're
0:39:25not going to make any money
0:39:27okay and to guess some credential back so before they can do that again.
0:39:32So this is more like a punishment if they don't do well.
0:39:35So
0:39:36okay now we had two states. Everybody had two states.
0:39:39What are they?
0:39:40Okay this is normal.
0:39:42They produce results, give them money so that they are happy.
0:39:46However
0:39:47if they don't do well
0:39:49they are going to go to the training state.
0:39:51And in this training state workers have to do a set of training tasks.
0:39:56Unpaid.
0:39:58Okay
0:39:58to gain qualification
0:40:00for the working state.
0:40:03Okay this can be just a policy
0:40:05to deter it,
0:40:06not necessary may be implemented. You will see that.
0:40:09So everybody now
0:40:10had two states.
0:40:11Now you can see two states.
0:40:13One of these is worker's action at working state.
0:40:16Now everybody's action at working state.
0:40:19This will determine the probability of this worker may continue to be in working state
0:40:24or
0:40:24he may go to the training state.
0:40:26And for how long it may get from training state back to working state.
0:40:31Worker will prefer to be here to make money.
0:40:34However if the job is not good enough he is going to go here. Nobody
0:40:38want to come over here.
0:40:41Okay, so worker's action at this moment
0:40:44may not only affect his immediate utility, but also his future utility or both
0:40:51together.
0:40:52There's a notion of this long-term expected utility here
0:40:56and basically for you formulate this problem
0:40:58it is
0:41:00is a like the
0:41:01Markov Decision Process (MDP).
0:41:02But this Markov Decision Process is not easy, because that
0:41:07the
0:41:08this MDP faced by each workers also depends on others' actions.
0:41:14Our action, how's the quality of our work is depending on how others are doing
0:41:19their quality work, so it is indeed
0:41:21Game-Theoretic Markov Decision Process. That means what you are doing affects what I am doing.
0:41:27What I'm doing will affect what you're doing.
0:41:29Okay
0:41:30so from here
0:41:31I will just summarize the results. Basically we show that
0:41:34with this Game-Theoretic MDP
0:41:37there is a guarantee of Nash Equilibrium
0:41:39as long as the number of training set N is large enough.
0:41:44If the number or training set,
0:41:45I mean the training problem to be done N is too small,
0:41:48nobody will worry, okay. I go to the training state, very soon I can get
0:41:52out just like somebody commits a crime,
0:41:54you put them in the jail and one day
0:41:56they come out. You know what? Many people walk out and in everyday.
0:41:59If you put them in for ten years. Very few will .. they will think
0:42:03twice before do something again.
0:42:05Okay, now
0:42:06we also show that
0:42:08if we
0:42:08follow certain design procedure
0:42:11we can achieve arbitrary low mechanism cost. The cost can be even
0:42:15very small, almost to zero, if we design it correctly.
0:42:21Okay, now let's see the experiment setup.
0:42:25Task, a very simple task
0:42:27this is to calculate the sum of two randomly generated double digit numbers.
0:42:33Second year,
0:42:34second grade student can do so, okay. And (workers are given) ten point for each
0:42:38accepted solution.
0:42:39And the goal is for
0:42:40these participants to maximize or to accumulate these points. This is not real money. We
0:42:45work
0:42:45with points, we are doing an experiment.
0:42:48And the text assigned to these participants comes in three sets. And each one is
0:42:53limited to three minutes, okay.
0:42:54Only in this limit had to do so.
0:42:58And so we had these three sets.
0:43:00The first set is that reward accuracy mechanism with sampling probability 1, that mean
0:43:05in this
0:43:06in this set
0:43:08we tell all the participants
0:43:10that we are going to check your every solutions.
0:43:13And the set number two
0:43:15with sampling probability 3.
0:43:17That we will only have thirty percent or probability 0.3 to check your solutions.
0:43:23That is set two. Set three is
0:43:26sampling probability policy 0.3 only, however we also tell them we also have this incentive
0:43:31mechanism.
0:43:32That is if you don't do well enough
0:43:34you are going to do some training problem before you can come back to do
0:43:38that.
0:43:39Okay, so what are the results that we get?
0:43:42If we tell them that we are going to have this sampling probability 1, you
0:43:46can tell that
0:43:47the accuracy, this is the histogram, number of wokers and their accuracy, the accuracy is
0:43:52quite good.
0:43:55With 0.3 probability you can see that most of them don't care
0:43:59okay, because only
0:44:020.3 probability will be checked.
0:44:04And this is the sampling probability.
0:44:08You can say .. how about this guy? You know there's always
0:44:11nerd, okay. And they always do good job.
0:44:15And now with this
0:44:17sampling probability 0.3. We tell them: hey we have this, if you got checked, you
0:44:21are going to put
0:44:22into there. Now you can see the result is basicly as good as this result.
0:44:29And you say: Okay, who are these students?, can they do this easily?
0:44:33They are all
0:44:34engineering students, graduated student, okay. With forty one of them to participate.
0:44:40So these show that
0:44:41in fact with such machanism we indeed can collect good data from beginning. We don't
0:44:46have to
0:44:47take it saying that data is not good enough and just take it from there.
0:44:52Let's
0:44:53do something you know
0:44:56do to better if we can collect much better data.
0:45:00Okay, the conclusion and final thought here is that here we see three social media
0:45:06examples
0:45:08to illustrate the concept of decision learning.
0:45:12In this information diffusion problem
0:45:15we
0:45:16try to learn
0:45:18users' utility
0:45:20functions
0:45:21for understanding and modeling strategic decision making.
0:45:25And for the Groupon and Yelp example that we've seen
0:45:28we try to learn from each others' interactions.
0:45:32We have this negative network externality
0:45:35for better decision making.
0:45:37And for this crowdsouricng we design a mechanism to
0:45:41enforce users' strategic behaviour to attain better quality data for better learning.
0:45:47So
0:45:49by decision learning
0:45:51we mean that it is the learning with strategic decision making.
0:45:55And these two are something that has to be considered together.
0:45:59And we can analyze users' optimal behaviour from user's perspective
0:46:03and we can also design optimal mechanism from
0:46:05system designers' perspective.
0:46:08Now for the
0:46:10coming big data tsunami. This is something that we envision.
0:46:14We are going to have big data here
0:46:17and these big data and
0:46:18is not
0:46:20?? steady data overlay
0:46:22that
0:46:23Bob and Alice is going to
0:46:26learn
0:46:27from this data and be it whatever model here. I say Markov decision process, it
0:46:30can be anything to learn from here.
0:46:32Once you learn something from here, you are going to make decision.
0:46:35But (when) you make decision you are going to take action and most of these
0:46:38are sequential actions. We are at
0:46:40different places and different people.
0:46:42And then you have the outcome and this outcome will come back to affect the
0:46:46data.
0:46:47Like I said the data is not static data
0:46:49and therefore
0:46:50Bob and Alice
0:46:52one can be in Singapore and one can be in US, their actions and their
0:46:56decisions will eventually affect each other,
0:46:57because the data have been changed.
0:47:00And this is something that
0:47:02we believe that is something that
0:47:04both
0:47:05challenging and the potential future for research. And I believe that
0:47:12this is something that also can be
0:47:15very promising in speech and language community.
0:47:20Okay, that's it. End of story. Thank you for your attention.
0:47:32So I am glad, I think we all made a good decision to come to
0:47:36professor's Ray Liu talk.
0:47:38We have time for few questions.
0:47:47I think this was a wonderful talk.
0:47:49I've noticed that in your Chinese restaurant decision process that you had this backward reduction
0:47:54stamp and
0:47:55you showed your backward reduction for just one time step, from T+1 to T.
0:48:00The question is: is it computationally feasible to do backward reduction from the end of
0:48:05the end of time to different time?
0:48:07Oh yes. That's correct.
0:48:49I think it's surprising issue, right? I mean it's easier
0:48:52okay to make one cent for task, if they are more difficult they make 10
0:48:55cent for task and they had to think twice
0:48:57right? Which one can make better profit?
0:49:01So I think that different
0:49:03what I show is just one dimension, meaning I just used one example. In that
0:49:08particular we
0:49:08designed mechanism in that way.
0:49:11In a different scenario
0:49:12you can design different mechanism in a different way. I think you should be all
0:49:17fine. There's
0:49:17no single solution here. This is just a concept, meaning that we can do so.
0:49:22And I used an example to illustrate that.
0:49:27Do I have to push? Yes.
0:49:29I am a little intriqued by the experiments that you have done.
0:49:32I'm just wondering that if we look at social networking
0:49:36you know first time you are using it, maybe you're kind of getting
0:49:42bogged by that information diffusion that happens.
0:49:46But you know second time or third time user, you just ignore all the incentives.
0:49:50I'm just going to do what I normally do.
0:49:52Say every time you do that experiments I am going to get a new set
0:49:55of
0:49:56people who are
0:49:57using this and then making decisions based on that?
0:50:02Okay, because there are some microphone problems I didn't hear clearly. But let me
0:50:08guess that I think you are saying that
0:50:10if they do many times they have different experiences so may game the system, right?
0:50:16So I think it would not (happen), because we use game theory to formulate that,
0:50:20we found
0:50:20ultimate equilibrium in that solution, so
0:50:23basicly doesn't matter how they game,
0:50:25they will have the best strategy that can achieve their best equilibrium solution for that
0:50:30so doesn't matter what they game. They only have
0:50:33some of the best strategy they can do, they can go.
0:50:35Whatever other strategy would not get them better so it's okay.
0:50:44Anymore questions?
0:50:53So seems like they want to make a decision to go for coffee
0:50:57early, but before that
0:50:59I would like to show our gratitude to a professor Ray Liu and our vice-president
0:51:05Haizhou will present
0:51:06a souvenir to professor Liu.