So my name is Vikram Krishnamurthy, and I'm presenting this talk; my co-authors are my graduate student Mustafa and also Lang White, who is in the audience. Lang is from the University of Adelaide.

So just to give you a pictorial description of what the problem is, it's kind of nice to start with something artistic. This is the main idea.

We are looking at a basic scenario where you're trying to track a moving target. In signal processing there have been zillions of papers where what you do is some sort of Bayesian estimation of the state of the target given noisy measurements.

I've written that abstractly here. Once you have those track estimates, typically you have some human being looking at a television screen and saying, okay, this target is doing something interesting, while this target is not doing something interesting. What we want to do is automate that high level; that's why we call this metalevel tracking.

In other words, the idea is this: given these dots, how do you do the estimation? Can you tell from these dots what sort of trajectory the target is going to follow?

And do this in a Bayesian sense: via some sort of nonlinear filter, estimate the trajectory.

So the goal is to automate several possible tasks which are of interest.

For example, you might have a target which you are interested in, which goes to a checkpoint, fills up fuel, goes to another point, and then on to a final destination. You are only interested in targets which follow this trajectory, which go to those specific points. How can we construct an estimate of such a trajectory in a Bayesian sense?

So this builds on a legacy tracker, and our goal is to see how one can construct good models for this, and once you do construct a good model, how you can actually compute with it.

Just to give you a further example, this is a typical example which kind of motivated why we did some of this research. This could be somewhere in a city, where typically road traffic goes like this, down a road. And then some day, due to the local people perhaps knowing that something was deployed in the road, maybe an IED or something, you see a lot of traffic bypass a particular part of the road. So the tracker sees all these dots of people bypassing that area.

So previously people were going in a straight line; now they kind of divert and then go back. How could you actually construct some kind of classifier for that, given the dots?

An obvious way of modelling this would be: in some sense I'm deviating some amount X from the road, and then coming back by the same amount. Now that amount X could be variable; maybe I go all the way out here and then come back. So we need a model which is scale invariant, which allows you to go in a particular direction for a random number of steps, and then remembers that number of steps to come back. We want to construct a model which does that.

It's kind of obvious that a standard finite-state Markov chain cannot do that: a Markov chain cannot embed a memory which is variable. Of course you can construct a state space of the maximal possible size, but that's ridiculous, because the number of possible states blows up; it is not practical. So we need something smarter than a Markov chain.

So that's basically the idea of what we're going to talk about.

From an abstract point of view, this is a special case of a trajectory which we call an arc, where your deviation away from the road is equal to the deviation back towards it, and this deviation can be random; it could be anything.

Another example could be a target which is circling a building, for example to do something suspicious, blow it up or whatever. So can you tell, from a bunch of dots, and isolate, only those trajectories where the target is circling a building, for example going in a rectangle? Or maybe, for example, the target is just going straight by and is benign. So how do you isolate different trajectories using some sort of nonlinear filtering operation?

Okay, so just to reinforce the idea: in conventional target tracking, which I guess fills lots of papers in signal processing and so on, the main idea is that you construct a state-space model for the trajectory of the target. That's fantastic if you're working over very short ranges of time, because you can approximate things by random walks. But over long periods of time the driver is not a random walk; he has a well-defined destination, and he is typically going through, for example, a street map, a digital map which you could construct from Google or whatever.

So how do you construct an efficient model which can capture that, and then the question is, how can you estimate with it?

So the main idea of the talk is that we are dealing with a particular class of models called stochastic context-free grammars, and maybe it's useful to start with this picture here.

If you did an elementary course in theoretical computer science, you would see that a Markov chain, or for that matter state-space models, are regular grammars. They assume that what happens now depends probabilistically on what happened at the previous time point. That's not true for most complicated trajectories, because if you're going to go in a square, you have a long-range dependency: your starting and ending points are the same. That's not Markovian.

In computer science there is a more general model called a context-free grammar, which is the level of abstraction we're going to talk about today, where you can actually construct several very interesting trajectories which are non-Markovian, such as rectangles and arcs of arbitrary size, which you can prove a Markov chain cannot generate. We can write these down and talk about them.

Now of course there are even more general languages, such as context-sensitive grammars, but they are of no use to us because they do not have polynomial-time algorithms; some things there are not computable in a way that is, from an engineering point of view, useful. There are restricted classes, just like the one I'm talking about; it's kind of a vanilla English, so to speak.

So typically, regular grammars are Markovian; that's the sort of stuff everybody does. What we're trying to do here is a slight generalisation to the context-free domain.

This is called the Chomsky hierarchy of formal languages, and our goal is to see whether one can construct the equivalent filters, such as the Kalman filter or the hidden Markov model filter and so on, in this domain where you have long-range dependence.

Okay, now let me give you a little more insight. You might say, why is this of any use? For starters, this is a very convenient model for human operators. You can imagine that, ideally, what happens is that after the tracker operates, as I showed in the first slide, we are trying to build something on top of that which helps the human automate the process of detecting whether a target is doing something interesting.

Now, it's pretty hard to teach the average soldier probability theory and detection theory and so on, and make them understand how to construct a Bayesian estimator. But you can automate that process by allowing the average soldier simply to state, at the metalevel, that if a target is doing this, it is suspicious, and if it is not doing this, it is not suspicious. Essentially you can quantify these rules into fairly interesting types of production rules, and you can check the consistency of those with a compiler, which I won't describe here, but it's pretty easy to do.

The second point is that once you do that, just as with regular grammars, in the context-free grammar domain you have polynomial-time algorithms for estimation.

A lot of this work has really been motivated by recent research in bioinformatics, where you also have long-range dependencies, and I want to give you some insight as to what I mean.

This is a typical bioinformatics-type example. This is a protein which coils, or equivalently DNA, a nucleotide sequence, which also coils like this. Now the dependencies here are spatial: this base here depends a lot on this one, even though it's pretty far away along the chain, because when it coils they end up spatially very close, rather than having a linear dependency. So Markovian-type processes cannot model this sort of dependency, where this guy is spatially very close to you but far away along the sequence.

Another example: think of a computer language. This is a thought experiment, not an actually useful thing, but it's a nice thought experiment. Suppose you take something like a computer language, like C or whatever. I have a begin-if, then I have a lot of stuff, and then at the end an end-if. If a computer language were Markovian, then with high probability the end-if would appear right after the begin-if, whereas in an actual computer language you have a lot of stuff in between. So now you can run the thought experiment: suppose I take this computer language, this code, and it's corrupted by noise. How would we decode it? I can't use a Markov chain, because that is not a valid representation of the language. So that's another example.


So the goal is: how can you construct useful models here? And the point is, they should be scale invariant, you see. All of these have structure that is scale invariant: I can have a begin-if, then an arbitrary amount of stuff in between, and then an end-if. The amount of stuff in between should not matter for the syntax of the language. That's something which we want as well.

Now it turns out that, for such models, people have shown (I'm not going to give you the studies here) that in terms of entropy these context-free grammars do fantastically. That is, for an equal-size parameterization of a model, if you look at the predictive entropy, these are extremely efficient models when you have this sort of scale invariance.

Okay, so let's now jump into the actual types of trajectories we want to talk about.

The first trajectory is a random walk, which is simply a Markov chain on your state space. This is widely used in target tracking, but as I said it's pretty useless, because no driver is drunk, simply flipping a coin to decide where to drive next; they go in a directed way from A to B.

A much more sophisticated model, on which my colleague Lang White has done some pretty cool work, is where you know the beginning and you know the end. It's called a reciprocal process, or a Markov bridge, which is a generalisation of a Brownian bridge: you clamp the two endpoints, you start with this random path, and you construct a dependency where each point now depends on both the next point and the previous one. So this is a one-dimensional Markov random field, and that's called a Markov bridge.
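To make the clamped-endpoint idea concrete, here is a minimal sketch of sampling a discrete-time Brownian bridge, where each step is drawn conditionally on the fixed endpoint; the Gaussian step model and all names here are illustrative assumptions, not code from the talk.

```python
import numpy as np

def brownian_bridge(T, x0, xT, sigma=1.0, rng=None):
    """Sample a discrete-time Brownian bridge clamped at x0 (t=0) and xT (t=T).

    Each interior point is drawn conditionally on the previous point and the
    fixed endpoint, which is the clamped-endpoints (reciprocal process) idea.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(T + 1)
    x[0], x[T] = x0, xT
    for t in range(1, T):
        remaining = T - t + 1          # steps from x[t-1] to the clamped end
        mean = x[t - 1] + (xT - x[t - 1]) / remaining
        var = sigma**2 * (remaining - 1) / remaining
        x[t] = rng.normal(mean, np.sqrt(var))
    return x

path = brownian_bridge(T=20, x0=0.0, xT=5.0)   # a random path from 0 to 5
```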

Now you can think of a much bigger generalisation of that, which is a stochastic context-free grammar. I start from A and I'm constrained to reach point B at a fixed future time, but I now want to go through particular waypoints, and these could for example be random ones. So I want the target to go from A to C to B, and in between to interpolate the trajectory. How can I now, given this sort of model, construct the actual trajectory from noisy estimates? That would be the question we ask.

Here are examples of cases where you cannot fit simple Markov chains. As I said, one of them is an arc: if I go in this direction, a, for n points, then an arbitrary number of points in this direction, b, and then come back, c, for n points, you can prove, by something called the pumping lemma from computer science, that such a string, with n a's, a bunch of b's, and then n c's, where n is arbitrary, cannot be generated exclusively by a Markov process. It is not possible; you can actually show this formally.

Similarly for a rectangle: suppose I go an equal number of points this way, then that way, then an equal number of points, and then back around; such a closed trajectory cannot be generated by a Markov chain. The intuition is that a Markov chain cannot store memory in its state: a Markov chain is simply a regular automaton. To store memory you need something called a pushdown stack; you push stuff onto the stack and pop it back off, and that is the key difference.
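As a minimal sketch of that pushdown intuition (illustrative probabilities, not from the paper), a counter acting as a stack is enough to generate equal-length deviate and return legs:

```python
import random

def deviate_and_return(p_away=0.7, p_mid=0.6):
    """Generate a deviate-and-return move string of the form a^n b^m c^n.

    A counter plays the role of the pushdown stack: each 'a' (step away
    from the road) pushes, and the return leg pops until the stack is
    empty, forcing the two legs to have equal length. This equal-length
    memory is exactly what a finite-state Markov chain cannot store.
    """
    stack, moves = 1, ['a']                # first step away from the road
    while random.random() < p_away:        # deviate leg: a^n, n geometric
        moves.append('a'); stack += 1
    while random.random() < p_mid:         # middle leg: b^m, independent of n
        moves.append('b')
    while stack > 0:                       # return leg: one pop per push
        moves.append('c'); stack -= 1
    return ''.join(moves)

print(deviate_and_return())                # e.g. 'aaabbccc'
```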

So anyway, those are the kinds of trajectories. I just want to very quickly give you some insight as to how this differs from the standard construction of a hidden Markov model, which is what you typically do when you deal with a tracker at a low level.

Most of you would know what a hidden Markov model is: a probabilistic function of a Markov chain. We start with the transition matrix, so this is an underlying Markov chain which evolves over time, and we observe the Markov chain through an alphabet, say a and b. These are the probabilities of observing a given state one, and these are the probabilities of observing a given state two, and so on; this is how you parameterize it.

Now, you can rewrite this in the type of rules we want to use to express a context-free grammar, on the next page, by simply saying that we have a bunch of terminals, which are what you observe, and a bunch of non-terminals, which are the states.

Essentially you can codify all of this into rules: if I start from some starting state S1, then I generate an a and remain in S1 with probability 0.54, which is pretty clear, because the probability 0.9 in my transition matrix of going from S1 to S1, times the probability 0.6 of generating a, is 0.54. I can do this for all possibilities, and these become the production rules of a grammar which can generate any string which is Markovian; this is how a regular grammar works. Then I simply start from, say, S1; I apply a rule, S1 gives me an a and S2 with some probability, and so on in my simulation. I keep generating, and what you see here is that the Markov chain grows, as a Bayesian network, from left to right; the dependency is just on the previous element. This is trivial; this is well known.
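A minimal sketch of this construction, assuming a hypothetical two-state HMM whose numbers are chosen so that the rule S1 -> a S1 gets probability 0.9 x 0.6 = 0.54 as in the slide (all other numbers are made up for illustration):

```python
import random

# Hypothetical two-state HMM: transition P(next | state), emission P(symbol | state).
trans = {'S1': {'S1': 0.9, 'S2': 0.1}, 'S2': {'S1': 0.3, 'S2': 0.7}}
emit  = {'S1': {'a': 0.6, 'b': 0.4}, 'S2': {'a': 0.2, 'b': 0.8}}

# Regular-grammar productions: state -> (symbol, next state), with
# probability P(symbol | state) * P(next state | state).
rules = {s: [((sym, s2), emit[s][sym] * trans[s][s2])
             for sym in emit[s] for s2 in trans[s]] for s in trans}

def generate(start='S1', length=10):
    """Grow the string strictly left to right: the Markovian dependency."""
    state, out = start, []
    for _ in range(length):
        options, probs = zip(*rules[state])
        (sym, state), = random.choices(options, weights=probs, k=1)
        out.append(sym)
    return ''.join(out)

print(generate())   # e.g. 'aababaaaba'
```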

Now let me give you some insight into the stochastic context-free grammar. Here are the same rules as on the previous page; the only difference is this last rule. What you see here is that we have a non-terminal, a state if you like, which maps not just to another state but to a bunch of states. And this is the key difference: the Chomsky hierarchy of languages tells you that any time you map from a non-terminal on one side to a terminal and multiple non-terminals on the other, the strings of such a language cannot be modelled by a Markov chain. This grammar would generate a real computer language, such as C or whatever, the other examples of things we have mentioned so far; of course this is a probabilistic version, because we put probabilities on the rules. The bigger generalisation, which we won't consider here, is context-sensitive: in a context-sensitive grammar you can have a terminal followed by a non-terminal on the left-hand side mapping to the right, so the rewriting depends on the context. We won't go into that.

Anyway, this is how you would generate a string from a stochastic context-free grammar using these rules, and what you notice here is the recursive embedding: it doesn't just grow left to right, it grows inside out. In other words, you see these two matched symbols get further and further apart, and yet they depend on each other. You see the long-range dependence, which you can picture here. This is, in very simple terms, how you generate from a stochastic context-free grammar.
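Here is a minimal sketch of such inside-out generation, with illustrative rule probabilities; repeatedly applying the self-embedding rule S -> a S c produces the matched, ever-more-separated pairs just described:

```python
import random

# A stochastic context-free grammar for the arc language a^n b^m c^n
# (rule probabilities are illustrative). Each application of S -> a S c
# pushes the matched 'a' and 'c' one level further apart: the long-range
# dependence a left-to-right Markov chain cannot capture.
rules = {
    'S': [(('a', 'S', 'c'), 0.7), (('M',), 0.3)],
    'M': [(('b', 'M'), 0.6),      (('b',), 0.4)],
}

def derive(symbol='S'):
    """Recursively rewrite non-terminals; anything not in rules is a terminal."""
    if symbol not in rules:
        return symbol
    options, probs = zip(*rules[symbol])
    rhs, = random.choices(options, weights=probs, k=1)
    return ''.join(derive(s) for s in rhs)

print(derive())   # e.g. 'aabbbcc': always equally many a's and c's
```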

Now to some mathematics; everything so far has been pictorial. How would you actually model this in terms of a stochastic process? Well, these are actually special types of branching processes, namely multi-type Galton-Watson branching processes, where the order of the offspring in the branching process matters. You can view this as saying, roughly speaking, that each realisation is a tree, a parse tree. So each realisation is a parse tree, and every time you realise it, you get a different parse tree, just as for state-space models you have a realisation in time. In this case you have the Chomsky normal forms and so on, and to prove that a string does not belong to a particular language there are well-defined tools, called pumping lemmas, which you can use to show that.

Okay, now I want to give you some more insight into how you can process these sorts of strings. I have a string, a bunch of observations which are noisy, and I want to say: has this target gone in a straight line, has it gone in an arc, or has it gone in a circle? These are context-free-grammar-type things which cannot be recognized by a Markov chain. So what sort of algorithms are available?

Well, in the Markovian case there are standard filters, like the hidden Markov model filter, which is analogous to the Kalman filter; if you want the maximum a posteriori state sequence estimate, there is the Viterbi algorithm; and if you want to estimate the parameters, there is maximum likelihood estimation via the expectation-maximization algorithm. It turns out all of these things generalize to the stochastic context-free grammar domain. Just as you have a forward-backward filter, in a stochastic context-free grammar, because you evolve on a parse tree, you have something called the inside-outside algorithm. Just as you have the Viterbi algorithm, in the stochastic context-free grammar case you have the Cocke-Younger-Kasami algorithm, which computes the most likely parse. So essentially what you have done is generalized the evolution of a random process on a straight line to something which evolves on a tree, and that's basically the key: the tree has long-range dependencies because everything starts from a common root.

Okay, so this here is the linear HMM, the Bayesian network picture, and this, for example, is the stochastic context-free grammar; this is called the parse tree.

It turns out that all these algorithms have polynomial complexity; what you lose is the following. Typically, for an HMM or any Markovian-type process, when you do filtering, the complexity of going through T points of data is linear in T: a Kalman filter processing a million points requires a million time steps. In this case, for a stochastic context-free grammar, it's actually cubic in the length of the data. It's still polynomial, but it's cubic.
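To see where the cubic cost comes from, here is a sketch of the inside (CYK-style) dynamic program for a grammar in Chomsky normal form; the toy grammar at the end, which generates the arc-like language a^n c^n, is an illustrative assumption, not from the paper.

```python
def inside_probability(tokens, unary, binary, start='S'):
    """Inside algorithm for a stochastic CFG in Chomsky normal form.

    unary:  {(A, terminal): prob}  for rules A -> terminal
    binary: {(A, B, C): prob}      for rules A -> B C
    The nested loops over (span, start, split point) give the O(T^3)
    cost mentioned above, versus O(T) for the HMM forward filter.
    """
    n = len(tokens)
    nts = {a for (a, _) in unary} | {x for key in binary for x in key}
    beta = {}                      # beta[(i, j, A)] = P(A derives tokens[i:j])
    for i, tok in enumerate(tokens):
        for a in nts:
            beta[(i, i + 1, a)] = unary.get((a, tok), 0.0)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for a in nts:
                beta[(i, j, a)] = sum(
                    p * beta[(i, k, b)] * beta[(k, j, c)]
                    for (aa, b, c), p in binary.items() if aa == a
                    for k in range(i + 1, j))
    return beta.get((0, n, start), 0.0)

# Toy CNF grammar generating a^n c^n: S -> A X | A C, X -> S C, A -> a, C -> c.
unary = {('A', 'a'): 1.0, ('C', 'c'): 1.0}
binary = {('S', 'A', 'X'): 0.7, ('X', 'S', 'C'): 1.0, ('S', 'A', 'C'): 0.3}
print(inside_probability(list('aacc'), unary, binary))   # 0.7 * 0.3 = 0.21
```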

So here is an example of some rules; these are discussed in a bit more detail in the paper, which gives the algorithms, the simulations and so on. Here is the flow of traffic in normal life; this is actually a figure from Google. One of the things you can do very nicely is codify different digital maps into constraints which you can use. So you have targets that just go in a straight line, a bunch of targets, maybe, each day, and then you notice that some day people are deviating like this. How would you automate that? How would you come up with the Bayesian inference that something funny is going on, that the behaviour of people has changed? So you would codify the rule that people go off the road by some amount and come back by an equal amount, which is an arc, and the amount they go by is a random amount. So the shape is fixed, but you have a memory which is variable, and that cannot be encoded in a Markov chain. And of course you measure all of this in noise, with a noisy sensor. You can construct a grammatical model for this fairly easily in terms of production rules, and once you do that, you can construct an optimal filter which gives you the optimal estimate of what's going on.
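As a hedged sketch of how such a classifier could look in the noiseless case, reusing inside_probability and the toy arc grammar from the earlier sketch (in the actual filter the observations are noisy, and the sensor likelihoods would enter through the terminal rules, just as emission probabilities do in an HMM):

```python
def classify(moves, arc_unary, arc_binary, p_cont=0.9):
    """Flag a track whose move string is better explained by the
    deviate-and-return (arc) grammar than by a straight-line model.

    Noiseless sketch: with a noisy sensor, the unary rules would hold
    observation likelihoods instead of 0/1 values.
    """
    p_arc = inside_probability(moves, arc_unary, arc_binary)
    straight = all(m == 'a' for m in moves)          # straight-line model:
    p_line = (p_cont ** (len(moves) - 1)) * (1 - p_cont) if straight else 0.0
    return 'suspicious' if p_arc > p_line else 'normal'

print(classify(list('aacc'), unary, binary))   # 'suspicious'
print(classify(list('aaaa'), unary, binary))   # 'normal'
```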

In the time available I of course cannot define how you derive these filters, but they are given in the paper and also in the references here.

So I just want to conclude with some of the main points.

The first point is that these syntactic models, constructed as context-free grammar models, allow you to look at complex trajectories of targets which cannot be dealt with in the Markovian framework. There are several classes of trajectories which you can prove cannot be generated by a Markov process, so in some sense this allows us to go to a slightly more general class. These models are meant to replace the human operator who sits and looks at tracklets coming out of a tracker; you want to automate that role.

The second point is that these algorithms are of polynomial complexity, so they are practical. Just as for a state-space model you would call this filtering, here you would call it a parsing algorithm, because you are really constructing a parse tree.

In many cases you can view this as a human-sensor interface, because there has been a lot of work done on the actual sensor signal processing. How can I automate the process of taking that estimate from the sensor level to something which helps the human determine whether something is happening? That's really what we're talking about here: it is a human-sensor interface, also known as middleware.

There are some very interesting systems-theoretic issues which I haven't even mentioned here. The first is: if we have a parse tree which keeps generating these non-terminals, does it eventually stop? Of course you need the eventual length of your string to be finite, so how do you ensure that is the case? Those are stability issues; they use some pretty cool theory of Galton-Watson processes, and essentially you need the spectral radius of a particular matrix, built from the rule probabilities, to be less than one.
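Concretely, for the illustrative arc grammar used in the earlier sketches, the check looks like this (the matrix entries follow those assumed rule probabilities):

```python
import numpy as np

# Mean progeny matrix of the branching process: entry (i, j) is the expected
# number of non-terminals of type j produced when a non-terminal of type i
# is rewritten. Rows/columns ordered as (S, M).
M = np.array([[0.7, 0.3],    # S -> a S c (0.7): one S;  S -> M (0.3): one M
              [0.0, 0.6]])   # M -> b M (0.6): one M;    M -> b (0.4): none
rho = max(abs(np.linalg.eigvals(M)))
print(rho < 1)   # True: subcritical, so the derivation terminates almost surely
```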

There are also constraints: if you know that the target is going to end up at some particular location, how do you constrain the probabilities in your model so that the model actually ends up there? Those are also pretty interesting problems.

Finally, there are several unresolved issues in this problem. As I said, much of the detail has been developed in the bioinformatics literature in only the last ten or fifteen years, and things like how sensitive your Bayesian estimator is to the prior, or whether it forgets its initialization geometrically fast, as is true for example of the Kalman filter: these are unresolved issues.

So, to summarize the picture, there has been a lot of research done in this area; I have just given you a couple of references here.

We also use this quite a bit not only to model trajectories but also to model, for example, radars. Complex sensors are not Markovian, because they have a feedback structure within them, and you can really view the output of a complex suite of sensors as a grammatical model. So essentially the idea is that, by doing this, you can apply Bayesian signal processing to come up with estimates. That's really all I wanted to say. Thank you.

[Speaker:] Time for questions, maybe?

[Session chair:] You all should really be asking questions.

[Audience:] Can you go back to your original picture at the beginning, where you had the legacy tracker?

[Speaker:] Oh, yeah, right, so...

[Audience:] You have here the tracklets coming into the metalevel, and you have an arrow pointing back saying feedback. Can you comment on that?

[Speaker:] So one additional point I didn't mention is this: suppose you can infer that the target is actually going in a rectangle. You can use that information to help you track at the lower level, which your low-level tracker does not know about; your low-level tracker is simply assuming the target is a standard target, flipping a coin to decide where to go. If you had this high-level inference that it is going to make a right turn, or if you look at the global maps and figure out the constraint that on this road you can only go left or right, you can't drive straight off the road, then you can of course utilize that information to improve your estimates, and that's really the point. I didn't mention it here, but of course there is a feedback structure where you can use that information to improve your tracker.


[Audience:] Could you please point out the differences between a stochastic context-free grammar and a Markov tree, as models?

[Speaker:] Okay, that's a good point.

The idea here is as follows; let's actually go back to that little picture. This would be a standard finite-state Markov chain, which is the conventional thing. This is our parse tree, which is really a Galton-Watson branching process: I start with a bunch of parents, they have children, those in turn have children, and so on, and it keeps growing. But the point is that the order and the names of the children matter; in a standard multi-type Galton-Watson branching process you only care that they have three kids or four kids, whereas here the names of the children matter, because they generate different trajectories. Eventually all these children stop, and the leaves are the thing you read: you read them from left to right, and that is, for example, your trajectory or your observed string. So that's the multi-type Galton-Watson process we are considering here. It's really the realisation of a tree whose size is not known ahead of time; you can only model the expectation of that, and you need to come up with estimates of it, and so on.

Okay, we are out of time; thank you very much, and we'll go to the next speaker now.