Přepis řeči - PARTICLE FLOW FOR NONLINEAR FILTERS

0:00:13	thanks thanks very much to very nice to be here and
0:00:15	thank you for organising
0:00:17	special
0:00:18	session um
0:00:20	between talking about particle filters
0:00:23	uh uh uh to sell not your filtering problem so uh
0:00:29	problem is to estimate X
0:00:30	which is D dimensional
0:00:32	and were given a dynamical system which can be a linear
0:00:36	in X
0:00:37	as well as measurement sequence of measurements
0:00:39	which can be nonlinear in in and
0:00:42	the usual approach and particle filtering is to compute
0:00:46	the entire are
0:00:47	conditional density of X conditioned on all measurement
0:00:50	which is a very very and vicious thing
0:00:53	uh to to uh and hence the
0:00:57	basic problem is computational complexity
0:01:00	so here's a plot that we made about eight or nine years ago for the um bootstrap particle filter
0:01:07	uh what i call the classical particle filter it um
0:01:11	illustrates the problem in high dimensions
0:01:14	uh
0:01:16	uh a different dimensional state vectors a one dimensional problems to easy and high dimensional problems are hard it's the
0:01:22	message of the chart the
0:01:24	vertical axes is the error estimation error roll to the optimal so this is optimal performance
0:01:31	and the horizontal axis is the number of particles
0:01:34	uh a million particles uh
0:01:36	ten dimensional state vector
0:01:39	that ten orders of magnitude worse than optimal
0:01:42	that's the problem now
0:01:45	uh it's also been the focus of the research and particle filters for the last fifteen
0:01:51	or uh
0:01:53	so here's the
0:01:55	the theory are we talking about today um
0:01:58	fix this problem at least for a certain class of examples so here is the same
0:02:03	problem uh for a thirty dimensional state vectors stable plant
0:02:08	uh you get essentially optimal performance
0:02:11	even for just the few that's was a hundred
0:02:13	particle so if you contrast
0:02:16	um
0:02:19	uh ten dimensional state vector with a million particles ten orders of magnitude worse than optimal
0:02:25	to the theory you'll be talking about today
0:02:28	or uh about a four order of magnitude and prove meant
0:02:31	at least for this example
0:02:33	uh that example was a stable plant here is an extremely unstable plants meeting all the eigenvalues are in the
0:02:40	left have
0:02:41	in the right have
0:02:43	plane and uh five dimensional state vector essentially optimal
0:02:48	for just a hundred particles
0:02:49	if the plant is uh a higher dimensional say thirty dimensional
0:02:54	you need more particles to approach optimality
0:02:57	uh
0:02:59	but one shouldn't get too excited about that that's sort of result these are just linear system
0:03:05	um it's an assurance that at least for linear system the new three L be talking about
0:03:11	is interesting here is a highly nonlinear example
0:03:15	hundred particles six dimensions
0:03:17	the i'm talking about here
0:03:20	each the extended kalman filter by roughly
0:03:23	uh and order of magnitude so there's a hint
0:03:27	that there something interesting useful new in that theory um talking about all show you more
0:03:32	not an your examples later
0:03:35	uh
0:03:36	second question is the computational complexity the theory i'm talking about here which be call exact flow
0:03:44	as compared to say the bootstrap particle filter and other particle filters um
0:03:51	for are for different dimensional state vectors going from five to thirty
0:03:56	uh for a hundred thousand particles
0:03:59	the new theory is roughly four orders of magnitude faster that's running in matlab on a single
0:04:06	P C uh and the question is i is that that
0:04:09	quite surprising
0:04:11	because we don't resample we have no proposal density and we never ever resample particles
0:04:16	and the
0:04:17	the bottleneck in a normal particle filter is the resampling step by eliminating
0:04:23	that we've
0:04:24	uh improve performance by
0:04:27	about
0:04:27	for orders of magnitude
0:04:29	speed up so we have
0:04:31	about three or four orders of magnitude fewer particles required in addition to
0:04:37	three or four orders of magnitude faster execution of the code
0:04:42	um
0:04:43	so these put together
0:04:45	is
0:04:46	six to eight orders of magnitude faster computation for the same
0:04:51	accuracy
0:04:52	in addition to that
0:04:53	um the theory all be telling you about this afternoon is uh
0:04:59	embarrassingly parallel liza meaning that
0:05:03	each individual particle has its own individual computation not interacting
0:05:09	in a strong way with other particles uh
0:05:12	this allows one to code these algorithms and say gpus thousand gpus you might give the speed up
0:05:19	factor
0:05:20	of a hundred this is unlike other particle filters where there's a very strong
0:05:24	bottleneck because of the resample
0:05:27	and again because we do not
0:05:29	resample we've eliminated that
0:05:31	well
0:05:32	so um how does this work
0:05:35	uh what's the theory
0:05:37	um well actually the first question is what's the problem the problem is sort of surprising
0:05:43	um any nonlinear filter would consist of these two parts a prediction of the probably density
0:05:50	and the application of bayes rule
0:05:52	so this is a pretty complicated operation it amounts to solving a partial differential equation
0:05:58	where is this operation is uh not as simple as you can imagine is just multiplying two functions of
0:06:03	point
0:06:04	and the very interesting surprising
0:06:07	but true
0:06:08	situation is that that
0:06:10	this is where the problem is
0:06:12	uh for the most part
0:06:14	well known
0:06:16	oh it's been well known for many years and has goes by the name of particle the generous C and
0:06:21	this is just a partition the double straight
0:06:24	what it means uh
0:06:26	if the prior density of the state is well represented at these points school particles
0:06:32	but you wish to multiply the red curve times the blue curve it turns out that uh
0:06:37	quite often it's unlikely to have particles in the right position
0:06:41	to represent that problem of the red and the blue curve
0:06:44	so somehow or other one would like to
0:06:47	position the product
0:06:49	we're get a new set of particles or moves the particles
0:06:53	so was to be able to get got a good representation of the product
0:06:56	and that's what we'll do
0:06:59	a do it starting with an idea yeah that is uh uh not our our idea uh not a new
0:07:05	idea uh it
0:07:06	first occurs
0:07:08	in a paper about ten years ago by simon got soul will be giving a talk later
0:07:14	here uh
0:07:15	it's the to find of flow of the probably city
0:07:18	from the prior G
0:07:20	to the post E the product of G times a
0:07:23	or this single value parameter lambda a lambda is like time
0:07:27	lan to go from zero to one
0:07:30	from the prior to the posteriori
0:07:32	and you can see when the is equal to one
0:07:35	uh we
0:07:35	P will be equal to the product of G times
0:07:39	what you really like to do however is somehow or other computer flow of particle
0:07:44	and that's what i'll tell you how to do flow the means
0:07:47	move the particles
0:07:49	using an ordinary differential equation
0:07:52	uh from lambda at
0:07:53	zero to lambda at one
0:07:56	so the name of the game is to compute the flow
0:07:58	a
0:07:59	we can do that in a large number of examples and uh the way to do it is to uh
0:08:05	expressed that uh flow as a function
0:08:09	and tight turns out the functions related to the solution of this partial differential equation
0:08:15	partial differential equation is uh
0:08:18	stream really simple if you had to have a partial differential equation it's about the nice one
0:08:24	to get it's linear
0:08:26	it's first water and it has constant color fish
0:08:31	um
0:08:32	on the other hand
0:08:34	we're particular about the solution that is we want the to be a stable dynamical system and secondly the left
0:08:40	hand side is a known function
0:08:42	but it's only known a random points in particular at the particle
0:08:47	uh but we can solve but nevertheless uh using many different methods
0:08:51	it's a um it's furthermore i should point out it's a
0:08:55	say
0:08:57	it's a
0:08:58	it's a linear partial differential equation that's highly under determined
0:09:03	at at it's a single scalar valued partial differential equation
0:09:06	but it's in the known
0:09:08	so uh
0:09:09	is generically only the term
0:09:12	uh there are many many ways to solve it and i'll show you a few
0:09:16	in this brief
0:09:18	period of time that i have
0:09:19	the talk about um
0:09:22	subject the first one is the most concrete the simplest to understand and the fast
0:09:27	for a special case
0:09:28	the special cases
0:09:30	if the prior and the likelihood are both gaussian
0:09:33	then we can solve the partial differential equation this function that's uh affine index
0:09:39	the matrix a a
0:09:40	and the vector V or explicitly computable
0:09:43	and the very good news is that that dynamical flow
0:09:47	is automatically stable
0:09:49	well the eigenvalues are in the left half
0:09:52	right
0:09:53	a second simple explicit
0:09:55	solution
0:09:57	that's also quite fast is given here
0:10:00	suppose you were given the divergence of F then you can pick the minimum norm solution F
0:10:05	as the generalized inverse as gradient of the log of the uh
0:10:10	the density times the uh log likelihood blessed virgin
0:10:15	and uh although that's quite
0:10:17	possibly
0:10:18	come suitable um
0:10:20	it turns out that the um
0:10:25	yeah generalized inverse is nothing of this gradient is nothing more than the transpose divided by the square of the
0:10:31	norm and so with
0:10:32	like only amounts to this
0:10:34	so the flow oh
0:10:36	particles goes in the direction of the gradient
0:10:39	of a lot of the density
0:10:41	and uh if that
0:10:43	gradient along the density is zero
0:10:46	the flow stuff
0:10:47	which is in agreement with ones
0:10:50	intuition
0:10:51	uh yeah the third solution
0:10:54	which is fairly easy to grasp
0:10:56	uh intuitively
0:10:57	and quickly is that uh we can solve that P D we can turn the pde into plus sounds equation
0:11:03	famous and Z
0:11:05	by asking that the vector feel be the gradient of the scalar potential
0:11:10	and then um to greens function for plus sounds equation in even in D dimensions as is well it's that
0:11:17	differentiated to get the gradient
0:11:19	we can pick either either of these two solution
0:11:22	and it a
0:11:23	turns into an algorithm
0:11:25	with the algorithm is the following
0:11:27	the direction of
0:11:28	the flow of the uh particles is given as the sum over particles
0:11:33	uh equal to the uh basically the electron density in this uh
0:11:38	charged a particle feel
0:11:41	times uh something that looks an awful lot like uh
0:11:44	a well of D was equal to three this to be an inverse square
0:11:48	law so it looks like cool ones a cool hoodlums law famous from physics
0:11:53	uh in particular lecture min attics and fluid flow and also
0:11:57	gravity
0:11:59	so here we have particles which are mathematical constructs to solve a nonlinear filtering problem
0:12:05	that should slow according to a famous physical law called cool homes
0:12:11	law uh in addition to being interesting from a physical viewpoint it's very fast
0:12:17	other ways to solve
0:12:19	at uh
0:12:20	concrete example
0:12:23	for a highly nonlinear problem
0:12:25	is
0:12:26	to solve the problem where you were given the
0:12:29	measurements of the cosine of an angle and then asked to estimate the and of course that's highly in figure
0:12:35	a single measurement
0:12:36	um
0:12:37	so one has to depend on the motion i'll of
0:12:39	the angle
0:12:40	over time
0:12:42	and the knowledge of that dynamics
0:12:44	and um
0:12:45	the dense the conditional density is highly multimodal uh bimodal at the beginning
0:12:51	and so a kalman filter or extended kalman filter would die verge the error would grow with time
0:12:57	um but if you wait about twenty seconds the particle filter that we design on version
0:13:04	and B the extended kalman filter
0:13:06	and uh the reason for this is of course extended kalman filters do not cannot
0:13:12	represent multi-modal dance
0:13:14	uh i here as a movie
0:13:16	of the particle flow uh
0:13:19	based on the very first measurement
0:13:21	a part there's about a thousand particles it's dot as the particle
0:13:25	it's in this uh as two dimensional state space
0:13:28	and it there's a uniform distribution of uncertainty in angle and angle rate
0:13:34	and as a result of making a single measurement
0:13:37	what the differential equations that we derive
0:13:40	automatically do
0:13:42	is create that
0:13:43	conditional them
0:13:45	so we go from a uniform density to a bimodal
0:13:49	then
0:13:51	wondering what uh
0:13:53	five but five minutes
0:13:56	um
0:14:01	most general solution for this uh um P D E can be written out this way
0:14:06	terms the generalized inverse C sharp sharper of this when you're difference operator
0:14:11	which is a matrix post the divergence operator
0:14:14	a plus a projection and to the D minus one dimensions orthogonal to that
0:14:19	uh
0:14:20	direction and Y is anything
0:14:23	so this is a very explicit
0:14:25	simple representation of the
0:14:27	um
0:14:29	uh ambiguity of the arbitrariness in the solution
0:14:32	a
0:14:33	the solution of this equation is highly arbitrary um
0:14:37	and
0:14:38	a question is can you pick a Y
0:14:41	that is a good
0:14:42	uh solution
0:14:43	take it good solution the answer is yes you can pick Y
0:14:47	stabilise this flow to make that
0:14:49	that's stable dynamical system
0:14:51	and you can do it in a very very simple way can just picked white why the minus
0:14:56	ten X
0:14:57	as an exam
0:14:59	uh quickly to nonlinear examples
0:15:02	twelve dimensional state vector
0:15:04	you measure three components the a quadratic measurements
0:15:08	uh the error of the particle filter decreases as the number of particles increase
0:15:13	or
0:15:14	and you finally be the extended kalman filter performance by about two
0:15:19	orders of magnitude similar for the cubic
0:15:22	linearity
0:15:24	one summarise by saying um
0:15:28	like just repeating what i said
0:15:30	uh are ready
0:15:32	emphasising if you fact
0:15:34	um
0:15:35	speed is of the essence here
0:15:38	um
0:15:39	and what i've channel you is an is a new algorithm based on a new theory that's orders of magnitude
0:15:45	faster than
0:15:46	certainly the bootstrap filter
0:15:48	and other standard particle filter
0:15:52	at least for certain non linear examples that i've shown you it's orders of magnitude more accurate than the extended
0:15:59	kalman filter
0:16:01	uh it focuses on solving
0:16:03	the key
0:16:04	a problem with standard particle filters that is
0:16:08	the problem particle to C
0:16:10	and it does it
0:16:12	uh by moving the particles to the right
0:16:15	part of state space
0:16:16	to compute
0:16:17	the product of two functions corresponding to bases
0:16:21	rule
0:16:23	we never resample particles because we don't have to and we'd rather not because that's a time consuming operation
0:16:30	since we never resample we never need a proposal density
0:16:34	uh we don't use any
0:16:36	a markov chain monte carlo that method
0:16:39	uh we never read represent the probably density but rather we represent the log
0:16:44	of the a normalized probably density
0:16:47	i've asserted that this is
0:16:48	highly parallel lies able
0:16:50	and uh
0:16:51	you might believe need but if you don't you can
0:16:54	try for yourself when your favourite seven
0:16:57	you use
0:16:58	and a final point is um
0:17:00	a a is this magic were fraud or is there some reason
0:17:05	that one can grasp intuitively Y
0:17:09	we should be getting better performance
0:17:11	uh and i think
0:17:12	i hint at
0:17:14	a reason is
0:17:15	where lot solving the most general filtering problem
0:17:18	we are not solving wall of the
0:17:20	i only full during problems that particle filters could solve
0:17:24	we solving the special
0:17:26	class of
0:17:27	problems where the probably densities are nowhere vanishing
0:17:31	and continuously french
0:17:33	the state
0:17:34	which is a more restricted class
0:17:36	so that we can write differential equation
0:17:40	and exploit the good idea is uh a mathematicians of
0:17:45	cooked top over the last two hundred years
0:17:48	to a van
0:17:50	a at least that's my curve
0:17:52	maybe be think something else after uh
0:17:54	a few years of reflection
0:17:56	questions uh please
0:18:07	simon
0:18:16	uh nowhere vanishing
0:18:19	yeah
0:18:21	a course in terms of computational complexity
0:18:23	the structure of the problem um um
0:18:27	can greatly influence
0:18:28	as you well know the um resulting computational complexity that from a theoretical
0:18:33	you point a it is nowhere vanishing continuously differentiable
0:18:37	then sit
0:18:42	peter
0:19:02	the particles are asked to follow the conditional density
0:19:06	uh throughout the predict time prediction step
0:19:09	as in any particle filter and the particles are asked to follow the conditional density
0:19:15	uh at the big
0:19:17	but four and after bayes rule
0:19:19	so if the algorithm is doing what i've asked to do the particles do in fact
0:19:24	satisfy the conditional then
0:19:27	yeah every single point
0:19:28	in the L
0:19:29	and at each
0:19:30	step
0:19:32	there's no proof that that happen
0:19:34	i have no idea what these sufficient conditions would be on such or
0:19:38	proof obviously you need enough particles
0:19:41	you need an of um
0:19:44	regularity of the densities and the dynamics
0:19:47	and the likelihood
0:19:48	uh we're no close to having a theory
0:19:51	that would be useful in guaranteeing such a result
0:19:54	it's proof by map
0:19:57	on a honest on the subset of
0:19:59	examples that we chose
0:20:19	for for all examples that we've were we we have worked out uh six different classes of
0:20:24	exam
0:20:26	lots of linear examples
0:20:27	uh
0:20:28	quadratic
0:20:29	Q
0:20:30	uh rate examples both with and without range measurements
0:20:34	and the um
0:20:36	measurements of phase
0:20:38	the cosine
0:20:39	so when all those examples that we
0:20:41	run our
0:20:42	we get
0:20:43	better
0:20:44	much better
0:20:45	accuracy than the extended
0:20:47	comes to
0:20:52	right
0:20:55	for the linear K yes for the linear your cases um
0:20:58	and for the quadratic and Q
0:21:01	for a given number
0:21:03	particles we be the
0:21:05	uh
0:21:07	well
0:21:08	a selected
0:21:09	set of standard particle
0:21:12	you probably know that there's lots and lots of particle filter
0:21:14	and will probably hear about new ones
0:21:17	sept F known that i
0:21:18	that one of that so
0:21:20	an exhaustive check be
0:21:22	the
0:21:23	uh
0:21:24	uh suppose you not believe me i'll give you the code and you can check it on your very for
0:21:28	uh set of
0:21:29	particle
0:21:32	a point of comparison
0:21:33	i think that's a way to make progress
0:21:38	or questions
0:21:43	yeah

PARTICLE FLOW FOR NONLINEAR FILTERS

Particle Filtering for High Dimensional Problems

Přednášející: Fred Daum, Autoři: Fred Daum, Jim Huang, Raytheon, United States