Thanks very much — it's very nice to be here, and thank you for organising the special session. I'll be talking about particle filters for solving the nonlinear filtering problem.
The problem is to estimate x, which is d-dimensional. We're given a dynamical system, which can be nonlinear in x, as well as a sequence of measurements, which can also be nonlinear in x. The usual approach in particle filtering is to compute the entire conditional density of x conditioned on all the measurements, which is a very ambitious thing to do, and hence the basic problem is computational complexity.
So here's a plot that we made about eight or nine years ago for the bootstrap particle filter — what I call the classical particle filter. It illustrates the problem in high dimensions, for different-dimensional state vectors: one-dimensional problems are easy and high-dimensional problems are hard — that's the message of the chart. The vertical axis is the estimation error relative to the optimal, so this line is optimal performance, and the horizontal axis is the number of particles. With a million particles and a ten-dimensional state vector, the error is ten orders of magnitude worse than optimal. That's the problem, and it's been the focus of research in particle filters for the last fifteen years or so.
So here's how the theory we're talking about today fixes this problem, at least for a certain class of examples. Here is the same problem for a thirty-dimensional state vector with a stable plant: you get essentially optimal performance even with just a few particles — that was a hundred particles. So if you contrast a ten-dimensional state vector with a million particles, ten orders of magnitude worse than optimal, with the theory I'll be talking about today, that's roughly a four-order-of-magnitude improvement, at least for this example.
That example was a stable plant; here is an extremely unstable plant, meaning all the eigenvalues are in the right half plane, with a five-dimensional state vector: essentially optimal performance for just a hundred particles. If the plant is higher dimensional, say thirty-dimensional, you need more particles to approach optimality.
But one shouldn't get too excited about that sort of result — these are just linear systems. It's an assurance that, at least for linear systems, the new theory I'll be talking about is interesting. Here is a highly nonlinear example: a hundred particles, six dimensions. The filter I'm talking about beats the extended Kalman filter by roughly an order of magnitude, so there's a hint that there's something interesting, useful, and new in the theory I'm talking about. I'll show you more nonlinear examples later.
uh
second question is the computational complexity the theory i'm talking about here which be call exact flow
as compared to say the bootstrap particle filter and other particle filters um
for are for different dimensional state vectors going from five to thirty
uh for a hundred thousand particles
the new theory is roughly four orders of magnitude faster that's running in matlab on a single
P C uh and the question is i is that that
quite surprising
It's because we have no proposal density and we never, ever resample particles. The bottleneck in a normal particle filter is the resampling step; by eliminating that, we've improved speed by about four orders of magnitude. So we have about three or four orders of magnitude fewer particles required, in addition to three or four orders of magnitude faster execution of the code. Put together, that's six to eight orders of magnitude faster computation for the same accuracy.
In addition to that, the theory I'll be telling you about this afternoon is embarrassingly parallelizable, meaning that each individual particle has its own individual computation, not interacting in a strong way with other particles. This allows one to code these algorithms on, say, GPUs; with a thousand GPU cores you might get a speed-up factor of a hundred. This is unlike other particle filters, where there's a very strong bottleneck because of the resampling — and again, because we do not resample, we've eliminated that.
So how does this work — what's the theory? Well, actually, the first question is: what's the problem? The problem is sort of surprising. Any nonlinear filter consists of these two parts: a prediction of the probability density, and the application of Bayes' rule. The prediction is a pretty complicated operation — it amounts to solving a partial differential equation — whereas Bayes' rule is as simple an operation as you could imagine: it's just multiplying two functions pointwise. And the very interesting, surprising, but true situation is that this simple step is where the problem is, for the most part. This is well known — it's been well known for many years, and it goes by the name of particle degeneracy, and this is just a cartoon to illustrate it.
What it means is this: the prior density of the state is well represented at these points, called particles, but you wish to multiply the red curve times the blue curve, and it turns out that quite often you're unlikely to have particles in the right positions to represent the product of the red and the blue curves. So somehow or other one would like to get a new set of particles, or move the particles, so as to get a good representation of the product. And that's what we'll do.
We do it starting with an idea that is not our idea — not a new idea. It first occurs in a paper from about ten years ago by Simon Godsill, who will be giving a talk later here. The idea is to define a flow of the probability density from the prior g to the posterior, the product of g times h, parameterized by a single scalar parameter lambda; lambda is like time, and lambda goes from zero to one, from the prior to the posterior. You can see that when lambda is equal to one, p will be equal to the product of g times h.
What you'd really like to do, however, is somehow or other compute a flow of particles, and that's what I'll tell you how to do. Flowing the particles means moving them using an ordinary differential equation from lambda equals zero to lambda equals one. So the name of the game is to compute the flow.
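The mechanics of "flowing" the particles can be sketched very simply: each particle is moved by integrating dx/dλ = f(x, λ) from λ = 0 to λ = 1. The flow function f below is a placeholder (the rest of the talk is about constructing it), and the Euler integrator is the crudest possible choice:

```python
import numpy as np

def flow_particles(particles, f, nsteps=100):
    """Move particles by Euler-integrating dx/dlam = f(x, lam), lam: 0 -> 1."""
    x = np.array(particles, dtype=float)
    dlam = 1.0 / nsteps
    for k in range(nsteps):
        lam = k * dlam
        x = x + dlam * np.array([f(xi, lam) for xi in x])
    return x

# Placeholder flow f(x, lam) = -x: every particle decays toward the origin,
# approximating x(1) = x(0) * exp(-1).
out = flow_particles([1.0, 2.0, -3.0], lambda xi, lam: -xi)
print(out)
```

In practice one would vectorize f over all particles at once; this loop form just makes the per-particle, non-interacting structure of the computation explicit.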
We can do that in a large number of examples, and the way to do it is to express the flow as a function, and it turns out the function is related to the solution of this partial differential equation. The partial differential equation is extremely simple — if you had to have a partial differential equation, this is about the nicest one to get: it's linear, it's first order, and it has constant coefficients. On the other hand, we're particular about the solution: we want the flow to be a stable dynamical system, and secondly, the left-hand side is a known function, but it's only known at random points — in particular, at the particles. Nevertheless, we can solve it using many different methods.
Furthermore, I should point out that it's a linear partial differential equation that's highly underdetermined: it's a single scalar-valued equation, but the unknown flow is d-dimensional, so generically it's underdetermined. There are many, many ways to solve it, and I'll show you a few in the brief period of time that I have to talk about the subject. The first one is the most concrete, the simplest to understand, and the fastest, for a special case. The special case is this: if the prior and the likelihood are both Gaussian, then we can solve the partial differential equation with a flow that's affine in x; the matrix A and the vector b are explicitly computable. And the very good news is that this dynamical flow is automatically stable — all the eigenvalues are in the left half plane.
A second simple, explicit solution that's also quite fast is given here. Suppose you're given the divergence of f; then you can pick the minimum-norm solution for f, as the generalized inverse of the gradient of the log of the density applied to the log-likelihood plus the divergence. And although that might sound expensive to compute, it turns out that the generalized inverse of this gradient is nothing more than its transpose divided by the square of its norm, and so the flow only amounts to this. So the flow of particles goes in the direction of the gradient of the log of the density, and if that gradient of the log density is zero, the flow stops — which is in agreement with one's intuition.
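In symbols, with the prescribed divergence folded into a scalar right-hand side r, the single scalar constraint f · ∇log p = r has minimum-norm solution f = ∇log p · r / |∇log p|². A small sketch of just that algebra (function name and inputs are hypothetical):

```python
import numpy as np

def min_norm_flow(grad_logp, rhs):
    """Minimum-norm f solving the single scalar equation f . grad_logp = rhs.
    The generalized inverse of the 1 x d row vector grad_logp^T is just its
    transpose divided by its squared norm; a zero gradient gives f = 0
    (the flow stops), matching the intuition in the talk."""
    n2 = float(np.dot(grad_logp, grad_logp))
    if n2 == 0.0:
        return np.zeros_like(grad_logp)
    return grad_logp * (rhs / n2)

f = min_norm_flow(np.array([3.0, 4.0]), 5.0)
print(f)   # points along the gradient direction, and f . grad_logp = 5
```

The point of the pseudoinverse view is that one scalar equation in d unknowns pins down only the component of f along ∇log p; the minimum-norm choice simply sets the other d − 1 components to zero.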
The third solution, which is fairly easy to grasp intuitively and quickly, is that we can turn the PDE into Poisson's equation — famous in physics — by asking that the vector field be the gradient of a scalar potential. The Green's function for Poisson's equation in d dimensions is well known; differentiate it to get the gradient. We can pick either of these two solutions, and it turns into an algorithm. The algorithm is the following: the direction of the flow of the particles is given as a sum over particles — basically the charge density in this charged-particle field — times something that looks an awful lot like, well, if d were equal to three, an inverse-square law. So it looks like Coulomb's law, famous from physics — in particular electromagnetics and fluid flow, and also gravity. So here we have particles, which are mathematical constructs to solve a nonlinear filtering problem, that flow according to a famous physical law, Coulomb's law. In addition to being interesting from a physical viewpoint, it's very fast.
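Structurally, the resulting flow at a point x is a sum of inverse-power terms centered at the particles, exactly as in electrostatics. The sketch below shows only that Coulomb's-law structure; the "charges" q_j stand in for the density-weighted source term of the actual algorithm and are illustrative:

```python
import numpy as np

def coulomb_flow(x, particles, charges):
    """Flow at x induced by point 'charges' at the particles:
    f(x) = sum_j q_j (x - x_j) / |x - x_j|**d, an inverse-square law when d = 3
    (the gradient of the free-space Green's function of the Laplacian)."""
    d = x.shape[0]
    f = np.zeros(d)
    for xj, qj in zip(particles, charges):
        r = x - xj
        rn = np.linalg.norm(r)
        if rn > 1e-12:          # skip the singular self-interaction term
            f += qj * r / rn**d
    return f

# One unit charge at the origin in d = 3: field magnitude at radius 2 is 1/4.
f = coulomb_flow(np.array([2.0, 0.0, 0.0]), [np.zeros(3)], [1.0])
print(f)
```

The naive double loop over evaluation points and particles is O(N²); the fast N-body machinery from physics (tree codes, fast multipole) applies directly, which is part of why this variant is fast.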
There are other ways to solve it. A concrete example, for a highly nonlinear problem, is to solve the problem where you're given measurements of the cosine of an angle and asked to estimate the angle — and of course that's highly ambiguous from a single measurement. So one has to depend on the motion of the angle over time, and on the knowledge of that dynamics.
The conditional density is highly multimodal — bimodal at the beginning — and so a Kalman filter or extended Kalman filter would diverge: the error would grow with time. But if you wait about twenty seconds, the particle filter that we designed converges and beats the extended Kalman filter. The reason for this, of course, is that extended Kalman filters cannot represent multimodal densities.
Here is a movie of the particle flow based on the very first measurement. There are about a thousand particles — each dot is a particle — in this two-dimensional state space, and there's a uniform distribution of uncertainty in angle and angle rate. As a result of making a single measurement, what the differential equations that we derive automatically do is create the conditional density: we go from a uniform density to a bimodal density.
I'm wondering how much time I have — about five minutes?
The most general solution for this PDE can be written out this way: in terms of the generalized inverse of this linear differential operator — which is a matrix composed with the divergence operator — plus a projection onto the d minus one dimensions orthogonal to that direction, where y is anything. So this is a very explicit, simple representation of the ambiguity, the arbitrariness, in the solution; the solution of this equation is highly arbitrary. The question is: can you pick a y that gives a good solution? The answer is yes: you can pick y to stabilize this flow, to make it a stable dynamical system, and you can do it in a very simple way — you can just pick y to be minus ten x, as an example.
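Pointwise, with the divergence term absorbed into a scalar right-hand side r, the general solution reads f = g⁺r + (I − g⁺g)y, where g is the row vector (∇log p)ᵀ and y is arbitrary. A sketch of that decomposition, using the talk's stabilizing choice y = −10x (inputs are illustrative):

```python
import numpy as np

def general_flow(grad_logp, rhs, x, k=10.0):
    """f = g^+ rhs + (I - g^+ g) y with g = grad_logp^T (a 1 x d row vector)
    and y = -k * x, the stabilizing choice suggested in the talk. The second
    term is the projection of y onto the (d-1)-dim subspace orthogonal to the
    gradient, so any choice of y still satisfies the constraint f . g = rhs."""
    g = np.asarray(grad_logp, dtype=float)
    n2 = float(g @ g)
    y = -k * np.asarray(x, dtype=float)
    if n2 == 0.0:
        return y
    return g * (rhs / n2) + (y - g * ((g @ y) / n2))

f = general_flow([3.0, 4.0], 5.0, [1.0, 1.0])
print(f, f @ np.array([3.0, 4.0]))   # the constraint f . grad_logp = rhs holds
```

Setting k = 0 recovers the minimum-norm solution; any nonzero y only adds motion in the null space of the constraint, which is exactly the freedom being exploited to get a stable dynamical system.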
Quickly, two nonlinear examples: a twelve-dimensional state vector, where you measure three components of the state through quadratic measurements. The error of the particle filter decreases as the number of particles increases, and you finally beat the extended Kalman filter performance by about two orders of magnitude; similarly for the cubic nonlinearity.
Let me summarize by repeating what I've said already and emphasizing a few facts. Speed is of the essence here, and what I've shown you is a new algorithm, based on a new theory, that's orders of magnitude faster than, certainly, the bootstrap filter and other standard particle filters. At least for the certain nonlinear examples that I've shown you, it's orders of magnitude more accurate than the extended Kalman filter.
It focuses on solving the key problem with standard particle filters — that is, the problem of particle degeneracy — and it does it by moving the particles to the right part of state space to compute the product of two functions corresponding to Bayes' rule.
We never resample particles, because we don't have to — and we'd rather not, because that's a time-consuming operation. Since we never resample, we never need a proposal density. We don't use any Markov chain Monte Carlo methods. We never represent the probability density itself; rather, we represent the log of the unnormalized probability density. I've asserted that this is highly parallelizable, and you might believe me — but if you don't, you can try it for yourself on your favorite examples.
A final point: is this magic, or fraud, or is there some reason one can grasp intuitively why we should be getting better performance? I think there's a hint at a reason. We're not solving the most general filtering problem — we are not solving all of the nonlinear filtering problems that particle filters can solve. We're solving the special class of problems where the probability densities are nowhere vanishing and continuously differentiable in the state, which is a more restricted class. So we can write differential equations and exploit the good ideas that mathematicians have cooked up over the last two hundred years, to our advantage. At least, that's my current view — maybe I'll think something else after a few years of reflection.
Questions, please. Simon?

Yes — nowhere vanishing. Of course, in terms of computational complexity, the structure of the problem can, as you well know, greatly influence the resulting computational complexity. But from a theoretical viewpoint, the condition is that it's nowhere vanishing and continuously differentiable.
Peter?

The particles are asked to follow the conditional density throughout the prediction step, as in any particle filter, and the particles are asked to follow the conditional density before and after Bayes' rule. So if the algorithm is doing what I've asked it to do, the particles do in fact represent the conditional density at every single point in the algorithm, at each step.
There's no proof that that happens. I have no idea what the sufficient conditions would be for such a proof. Obviously, you need enough particles; you need enough regularity of the densities, the dynamics, and the likelihood. We're nowhere close to having a theory that would be useful in guaranteeing such a result. It's proof by MATLAB — honestly, on the subset of examples that we chose.
For all the examples that we've worked out — six different classes of examples: lots of linear examples, quadratic, cubic, radar examples both with and without range measurements, and measurements of phase, the cosine — in all those examples that we've run, we get better, much better accuracy than the extended Kalman filter.
Right — for the linear cases, yes, and for the quadratic and cubic ones: for a given number of particles, we beat a selected set of standard particle filters. You probably know that there are lots and lots of particle filters, and we'll probably hear about new ones this afternoon that I don't know about, so it wasn't an exhaustive check. If you don't believe me, I'll give you the code and you can check it against your favorite set of particle filters as a point of comparison. I think that's the way to make progress.
Other questions? Yes?