0:00:14 Okay, welcome. My name is [inaudible], and I am from Germany. Today I will give a talk on MAP-based estimation of the parameters of a non-stationary process from noisy observations. The talk is organized as follows: first I give the motivation for the problem. Since our method is based on the conventional MAP method, I will briefly review that method, and then explain the modifications which are necessary to extend the method to noisy observations. The talk will be concluded with a summary and an outlook.
0:00:57 Okay, to start with the motivation: we consider a process which is described by a white Gaussian stochastic process, denoted by y_n, where n can be interpreted as a time index. Here on the right-hand side you see an example; the samples of this process are given in dark red, and what you can see is that the mean and the variance of this process vary with time. Now the problem is that you are not able to observe these samples of the process directly; you are only able to observe noisy samples, which are denoted by ŷ_n. We assume that the observation error is zero-mean and has a time-varying variance, which may be strongly time-variant, but which is known. The question is now: how can you find a simple method for the estimation of the time-varying mean and variance of a process of which you can only observe noisy samples?
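In symbols, the setup just described is presumably the following (the notation here is my own, not necessarily that of the slides):

$$y_n \sim \mathcal{N}\!\left(\mu_n, \sigma_n^2\right), \qquad \hat{y}_n = y_n + e_n, \qquad \mathrm{E}[e_n] = 0, \quad \operatorname{Var}[e_n] = \sigma_{e,n}^2 \ \text{(known)},$$

with the goal of estimating the time-varying $\mu_n$ and $\sigma_n^2$ from the noisy samples $\hat{y}_1, \dots, \hat{y}_n$ alone.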
0:02:07 The idea is as follows: we assume that the mean and the variance are slowly time-varying, and we want to exploit the correlations between successive values, that is, we want to exploit the a priori knowledge which we gain from the previous observations. For this reason we use a maximum a posteriori (MAP) based approach. Since this will be the basis for the method which we propose, I will first review the conventional MAP method, which I think everybody here knows.
0:02:43 For the first case we assume a stationary process whose parameters do not vary with time, that is, a fixed mean and variance, and we assume that there is no noise in the observations. The concept, I think, everybody knows: you have a number of observations y_1 to y_n, you start with the prior PDF which you gain from these observations, and you try to improve the estimates based on a new observation y_{n+1}. The concept is then that you just compute new estimates by performing the maximization of the posterior PDF, and I think everybody knows that this posterior is composed of the prior PDF, which carries the information from the observations gathered so far, and the likelihood of the new observation.

0:03:35 Now, what are its components? Since we have a Gaussian likelihood, you have to assume a conjugate prior, which in this case is a product of a scaled inverse chi-squared distribution and a Gaussian distribution, and you have four hyperparameters. Two of them, a location and a scale, represent the knowledge you have gained from the previous observations about the mean; in the same way, for the variance you have the degrees of freedom ν and a scale parameter σ̄². When you then get a new sample, you update the hyperparameters: you increase the scale and the degrees of freedom by one, which means you have one observation more, and the new estimate for the mean is a weighted average of the old value and the new observation, where the weighting factor for the new observation is inversely proportional to the number of observations plus one. A similar expression holds for σ̄². Once you have computed these parameters, you can compute the maximum of the posterior PDF, and you get the MAP estimates for the mean and the variance. This is the standard approach.
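As a minimal sketch of the recursive update just described, assuming the standard normal-scaled-inverse-chi-squared parameterization (the variable names and the exact mode formula are my reconstruction, not taken from the slides):

```python
def map_update(mu, kappa, nu, s2, y_new):
    """One conjugate update for a Gaussian with unknown mean and variance
    (normal-scaled-inverse-chi-squared prior), noise-free case.

    mu, kappa : location and scale (pseudo-count) describing the mean
    nu, s2    : degrees of freedom and scale describing the variance
    """
    kappa_new = kappa + 1                 # one observation more
    nu_new = nu + 1
    # weighted average: the new sample gets weight 1 / (kappa + 1)
    mu_new = (kappa * mu + y_new) / kappa_new
    # analogous recursive update for the variance scale
    s2_new = (nu * s2 + kappa * (y_new - mu) ** 2 / kappa_new) / nu_new
    return mu_new, kappa_new, nu_new, s2_new

def map_estimates(mu, kappa, nu, s2):
    """MAP point estimates: mode of the joint posterior in (mean, variance)."""
    return mu, nu * s2 / (nu + 3)
```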
0:04:54 Now an example. Here is an example for such a process: the true mean and variance are both chosen to be one, and you see a realization of five hundred samples. Below are the estimates which are obtained with this method, and on the right-hand side you see the posterior PDF as it appears after ten observations. As you can see, after ten observations it is still very flat, and its centre is quite a distance away from the desired point, which should be at (1, 1). Now, what happens if the number of observations increases? Then the distribution gets more peaky and gets closer to the desired point; you can see that you become much more certain about your estimates.
0:05:49 Okay, now what happens if you have a non-stationary process? You still have noise-free observations, but the parameters are assumed to be time-varying. What you can do is to introduce a forgetting mechanism and keep the scale and the degrees of freedom from being increased; that means you assign a constant value N to both of them. This means that you effectively use only the information of the last N observations from the past, and this value N of course introduces a trade-off between estimation accuracy and tracking ability: if you choose a high value for N, you have a very good estimation accuracy, but the tracking ability will of course be lower.
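A sketch of this forgetting variant, reusing map_update from above (exactly how the pseudo-counts are pinned is my assumption):

```python
def map_update_forgetting(mu, s2, y_new, N):
    """Update with forgetting: kappa and nu are held at the constant N, so
    roughly only the last N observations influence the estimates."""
    mu_new, _, _, s2_new = map_update(mu, N, N, s2, y_new)
    return mu_new, s2_new  # pseudo-counts stay pinned at N for the next step
```

A large N averages over a long effective window (accurate but slow to track); a small N tracks quickly but gives noisier estimates.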
0:06:31 Okay, now an example again. Here you have a Gaussian process with a time-varying mean and variance; the functions for both are given here, and you see a realization with two thousand samples. Below you can see the estimates for the mean on the left-hand side and the estimates for the variance on the right-hand side. You can see that the estimates of the variance in fact fluctuate more coarsely, since second-order statistics have to be estimated here. What happens now if you increase the number N? Then the estimates get smoother, of course, but the tracking ability is not as good. The variance can still be tracked, though, since the function for the variance varies very slowly in time.
0:07:23 Now, what happens if noisy observations are given? That is the interesting case, and I will now show what kind of modifications must be made. In the case of noisy observations the likelihood function changes: you can see that the variance of the observation error is added to the corresponding terms of the likelihood function. The problem is now that for this likelihood function there exists, of course, no conjugate prior, since in the likelihood the variance of the observation error is added to the process variance, and with respect to this sum the likelihood no longer matches the form of the prior distribution.
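Concretely, the likelihood of a new noisy sample presumably takes the form (again in my notation):

$$p\!\left(\hat{y}_{n+1} \mid \mu, \sigma^2\right) = \mathcal{N}\!\left(\hat{y}_{n+1};\, \mu,\ \sigma^2 + \sigma_{e,n+1}^2\right),$$

and because $\sigma^2$ now appears only inside the sum $\sigma^2 + \sigma_{e,n+1}^2$, multiplying this likelihood with a normal-scaled-inverse-chi-squared prior no longer yields a posterior of the same family.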
0:08:04 Now, what happens if you just apply the method without considering that error? You will get a bias. Here you see an example with a fixed mean and variance again, and the observation error is now assumed to be random: its variance is drawn uniformly from the interval given here on the right. On the left-hand side you see such a process: in dark red again the noise-free samples, and in blue the noisy observations. And what happens? You can see on the right-hand side what the algorithm actually estimates: the variance estimate is very biased, since the algorithm actually tries to estimate the variance of the composite process, that means signal plus observation error. And since the variance of this composite process fluctuates very heavily over time, the estimate is actually not a reasonable solution: whenever the error variance is high, the estimated variance is high as well.
0:09:14it to consider the observation error
0:09:17oh at uh um comes as
0:09:18two components
0:09:20first one is uh
0:09:22we proposed
0:09:23first find a good approximation of the maximum
0:09:26first you P yeah and the scale parameter
0:09:29and the second step
0:09:30we have proposed to approximate the posterior pdf
0:09:33with the same shape
0:09:35right I
0:09:36he's that the maximum of the true posterior and the approximate
0:09:39steering must match
0:09:41and
0:09:42we have
0:09:43assume the same degrees of freedom from for the steered yeah
0:09:47and the but and the approximate
0:09:48posterior you have whatever that means
0:09:51 Now I come to the first point. Here you see the true posterior PDF; it looks quite complicated, but I think two things here are important, and I will show both. In principle you could take this posterior as it is and perform a grid search over it, but this would on the one hand be computationally very expensive, and the second point is that even if you computed the maximum this way, you would have no clue about the scale parameters.

0:10:25 Now comes the whole idea. If you look at these expressions, which I have marked in colour, they resemble the expressions of the prior PDF. In the prior PDF these expressions are constants, but here these expressions are actually functions of the variance. And if you look at these functions, for example at the scale parameter for the mean, you see that this function always lies between the old parameter κ and the new parameter κ + 1, and the same holds for the location: it lies between the old mean and the new observation.

0:11:08 Now, our idea was motivated by the fact that only those values which are in the vicinity of the true variance matter, since the prior PDF will have high values in that region. For this reason we propose to approximate these functions of the variance by constants, by plugging in the variance estimate of the process obtained at the previous time step. If we do this, we get constants for the scale parameters and for the location of the mean.
0:11:48 The first advantage is that we avoid the maximum search in two dimensions, and the second advantage is that we directly get the scale parameters. And you can also see what happens if we do this. For example, look at this term here: if the observation error is very high, the new observation will be weighted down accordingly, and the new estimate will be approximately equal to the old estimate. That means that from a very noisy sample you cannot learn anything, so you stick to the old value. And what happens if the observation error is very low compared to the old variance estimate? Then this term approaches an expression which is equal to one, and that means that you can learn very much from this sample.
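The talk does not spell out the resulting expressions, but the limiting behaviour just described is exactly what a precision weighting of the new sample produces, so a plausible sketch looks like this (the specific form of the weight w is my assumption; the exact formulas are in the paper):

```python
def approx_mean_update(mu, kappa, s2_prev, y_noisy, var_e):
    """First step, mean part: the variance-dependent terms are frozen by
    plugging in the previous variance estimate s2_prev."""
    w = s2_prev / (s2_prev + var_e)  # -> 0 for huge noise, -> 1 for no noise
    kappa_new = kappa + w            # lies between kappa and kappa + 1
    # new location lies between the old mean and the noisy sample
    mu_new = (kappa * mu + w * y_noisy) / kappa_new
    return mu_new, kappa_new
```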
0:12:40 Okay, and the same of course holds for the mean. Now that we have found the mean and the scale parameter, in the second step we find the maximum of the posterior PDF with respect to the variance. We have shown in our paper that this is equivalent to finding the only root of a polynomial in a known interval, and this can be done very easily with a bisection method. So the evaluation of the new method can be done in a very simple and computationally efficient way, which is one of the advantages of this approach.
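The polynomial and its bracketing interval come from the paper and are only placeholders below; the bisection step itself is the textbook routine:

```python
def bisect_root(f, lo, hi, tol=1e-12, max_iter=200):
    """Find the single root of f in [lo, hi] by bisection. Assumes f(lo) and
    f(hi) have opposite signs, which holds here since the polynomial is
    stated to have exactly one root in a known interval."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or (hi - lo) < tol:
            break
        if (f_lo < 0.0) == (f_mid < 0.0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)

# hypothetical usage, with placeholders for the paper's quantities:
# var_map = bisect_root(log_posterior_derivative, v_lo, v_hi)
```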
0:13:17 Okay, and now we come to the second step. We have found the maximum of the true posterior, and we have found an approximation of the scale parameter. Now we approximate this posterior with a PDF which has the same shape as the prior, in order to apply the method recursively. For this we have to choose the hyperparameters. The first two hyperparameters, which refer to the mean, are already determined. Then we have to choose the degrees of freedom, which in a sense count the observations, and we set them to the value from the previous time step plus one; from this setting we also obtain the scale parameter for the variance.
0:14:02 Here is just an example of the true posterior PDF on the left-hand side and the approximated posterior PDF on the right-hand side. I do not know if you can see any difference: the true one is slightly rotated towards the right-hand side here, while this one is actually symmetrical with respect to this axis. But what I want to show is that they are actually quite similar.
0:14:27 Now an example: again a process with a constant mean and variance, and the observation errors are again random. We have a comparison between the conventional method and the proposed method. On the left-hand side you see first a comparison of the mean estimation. The conventional method also estimates the true mean, since the mean of the blue samples is of course the same as that of the dark red samples, because the observation error is zero-mean; but you can see that the proposed method's estimates are more accurate. The same holds for the variance estimates: you see that there is no bias here and the variance estimate of the proposed method is quite accurate, while with the conventional estimation method you see quite a strong bias.
0:15:22 Now an example for a non-stationary process: here we have a time-varying variance. We have a realization with two thousand observations, and the observation noise is again random; the upper bound of the interval it is drawn from is controlled by a factor c, which controls the maximum variance of the error. Here you see a comparison of the tracking performance: with the conventional method you can see that the estimates fluctuate very heavily, while our method is much more accurate here. And again you see, for the variance estimate, a very high bias for the conventional method, which is not the case for the proposed method.
0:16:10 I have just two slides left, I think, so the time will be okay. To assess the performance, we measured the root mean squared error while we varied the upper bound of the interval for the observation error variance. What you can see here are the RMSE results for the mean and the variance, for the conventional and the proposed method, and you can see that the overall performance is always improved compared to the conventional method, and that the improvements get more pronounced with an increasing level of observation noise.
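For reference, the metric used here is just:

```python
import numpy as np

def rmse(estimates, truth):
    """Root mean squared error between an estimated track and the true one."""
    e = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(e ** 2)))
```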
0:16:48 Now I come to the conclusion. We have proposed an approximate MAP approach for the estimation of the slowly time-varying parameters of a non-stationary white Gaussian random process, and we have shown that in the absence of observation noise it is equivalent to the conventional MAP method, but in the presence of observation noise it improves the estimation accuracy. What is also important is that the computational cost is low. The only restriction of this solution is that the variance of the observation error has to be known, and this is the next step after this paper: we have to analyse the effects of what happens if you do not know the variance of the observation error exactly but have just an estimate of it. But I can already say that this method will not be that sensitive to this. That is future work. Thank you for your attention.
0:17:49 [Session chair] Time for a couple of questions. Yes, please.

0:18:04 [Audience question, largely inaudible; apparently asking about practical applications of the process model]
0:18:13uh
0:18:15um um so far we assume all the cases
0:18:18is just a a just a a method uh
0:18:20if
0:18:21we have these assumptions and we can uh we can for a give some
0:18:25uh
0:18:26some method to estimate the problem
0:18:28may you that might be
0:18:30might be an application for example of you some uh
0:18:33sensor signals which and noisy and you have
0:18:35can do all the observation are a which you can expect
0:18:38and then you
0:18:39um
0:18:40i able to estimate something like a mean
0:18:42uh
0:18:43like a bias in the mean or something like is
0:18:46this week an application but
0:18:47we do
0:18:48we did not uh
0:18:50find
0:18:51and
0:18:51a calm concrete applications
0:18:54 [Audience question, largely inaudible]

0:19:16 Uh, no, we didn't analyse the connection with [inaudible]. But, yeah... Yes?
0:19:46 Uh, no, we didn't. You mean you propose to compare the performance of our algorithm with... which one?

[Audience clarification, inaudible]

0:20:11 Of course. No, we have actually only measured the estimation accuracy, with a measure like the RMSE or something like that. We just showed that this method works quite well and what happens when we vary the noise level; so we assessed the performance only with this kind of metric. Thank you.

0:20:37 [Session chair] Okay, thank you. And now to the next speaker.