0:00:13 Thank you. Good afternoon.
0:00:16 One of the problems that we try to solve in image processing is the analysis of data that have been exposed to geometric transformations. For example, we might want to register such data, or to classify them in a transformation-invariant way. A common approach for dealing with such problems is the use of manifold models.
0:00:36 In this work we have concentrated on pattern transformation manifolds. A pattern transformation manifold is the family of images that are generated by applying a certain set of geometric transformations to a reference pattern.
0:00:50 For example, if we take a pattern p, we denote its transformation manifold by M(p). We assume that p is an n-pixel image, so the manifold is also a subset of R^n. In this case, each image on the transformation manifold is a geometrically transformed version of p, and we define each transformation by a parameter vector λ in the parameter space Λ.
0:01:16 This parameter space Λ determines the type of the geometric transformation. For instance, it could be any combination of 2-D transformations such as rotation, translation and scale change.
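To make the parametrization concrete, here is a minimal sketch, assuming a rotation-translation-scale model, of how one manifold point U_λ(p) could be rendered; the function, parameter names and (row, col) coordinate convention are illustrative assumptions, not material from the talk.

```python
# Hedged sketch (not code from the talk): render one manifold point
# U_lambda(p) for lambda = (theta, t, s).
import numpy as np
from scipy import ndimage

def transform_pattern(p, theta, t, s):
    """Warp image p by scale s and rotation theta (radians) about the
    image center, followed by a translation t in (row, col) pixels."""
    c = (np.array(p.shape) - 1) / 2.0
    # affine_transform maps each output index o to input index A @ o + offset,
    # so A must be the inverse of the forward map s * Rot(theta)
    A = np.array([[np.cos(theta),  np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]]) / s
    offset = c - A @ (c + np.asarray(t, dtype=float))
    return ndimage.affine_transform(p, A, offset=offset, order=1)
```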
0:01:32 Our focus in this work is the following. We assume that we are given a set of geometrically transformed observations of a signal class, like the handwritten digit five in this illustration. From the observations, we try to construct a pattern transformation manifold; in particular, we want to find a pattern p such that the transformation manifold of p represents this data set well. So it is like a surface-fitting problem, but we fit a transformation manifold to the data instead. The problem, then, is to find this pattern p.
0:02:09 This kind of framework has some interesting applications, for instance the modeling and registration of the input data. Encoding is also possible: because we build the pattern p in terms of some parametric atoms, the model can also be used to code the input data. Another advantage is that we obtain an analytic model for our manifolds, so that we can synthesize new data on the manifold, and this makes it possible to compute exact distances between input images and the constructed manifold. This can be exploited in transformation-invariant classification settings: for instance, if we are given a test image under some geometric transformation that we want to classify, we just need to compute its distance to each class's transformation manifold.
0:02:55 Here is the outline of the talk: I will first formulate the problem, and then I will describe a solution that is based on computing a representative pattern p with a greedy algorithm, by selecting atoms from a parametric dictionary.
0:03:10 Here is the formulation. The manifold is denoted by M(p), and each image on the manifold is formed by U_λ(p); that means we take the pattern p and apply the transformation λ to it. We denote our input images by u_i; these have undergone some geometric transformations. What we are trying to do is find a common reference pattern p and model the input points as transformations of this common pattern p, plus some error term, where the error term e_i gives the deviation of the image u_i from the constructed manifold.
0:03:47 We assume that we know the type of the transformations; for instance, we know that it is rotation plus translation plus scale change. But we still need to register the input data, which means we need to compute a parameter vector λ_i for each input image.
0:04:01 Then we use this in constructing the pattern p as a combination of some atoms: p equals the sum of atoms a_j weighted by the coefficients c_j. We also assume that these atoms come from a parametric dictionary, which means that each atom in the dictionary is a geometrically transformed version of an analytic mother function. The mother function is denoted by φ here, and a geometric transformation U_γ is applied to it. Some possible examples for the generating mother function could be a Gaussian mother function, or the anisotropic refinement mother function, denoted by AnR. Here you see some atoms that are derived from the Gaussian mother function through some geometric transformations.
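As an illustration of such a parametric dictionary, the sketch below samples one Gaussian atom, i.e. the mother function φ(x, y) = exp(−x² − y²) warped by a translation, rotation and anisotropic scaling γ = (t_x, t_y, θ, s_x, s_y); the grid size, parametrization and unit-norm normalization are assumptions made for illustration.

```python
import numpy as np

def gaussian_atom(n, tx, ty, theta, sx, sy):
    """One atom: the Gaussian mother function warped by the atom
    parameters gamma = (tx, ty, theta, sx, sy), on an n x n grid."""
    y, x = np.mgrid[0:n, 0:n] - (n - 1) / 2.0
    # inverse-warp the grid into the mother function's coordinate frame
    xr = ((x - tx) * np.cos(theta) + (y - ty) * np.sin(theta)) / sx
    yr = (-(x - tx) * np.sin(theta) + (y - ty) * np.cos(theta)) / sy
    a = np.exp(-(xr**2 + yr**2))
    return a / np.linalg.norm(a)      # unit-norm atom
```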
0:04:47 And here is the formulation of this manifold-fitting problem. We would like to minimize the total distance of our input images to the constructed manifold, which we denote by E. We would like to achieve this by picking a subset of the atoms in the dictionary, so that p is the sum of these a_j with coefficients c_j, and by also optimizing the coefficients of these atoms, such that this total distance E is minimized.
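In symbols, the objective is E = Σ_i min_λ ‖u_i − U_λ(p)‖². A hedged, brute-force way to evaluate it is to approximate each projection by a nearest-point search over a sampled grid of parameter vectors, as below; the grid search itself is an illustrative simplification, not the method of the talk.

```python
import numpy as np

def manifold_error(us, p, lambda_grid, transform_pattern):
    """E = sum_i min_lambda ||u_i - U_lambda(p)||^2, by nearest sampled point."""
    M = np.stack([transform_pattern(p, *lam).ravel() for lam in lambda_grid])
    E, lambdas = 0.0, []
    for u in us:
        d2 = np.sum((M - u.ravel())**2, axis=1)
        j = int(np.argmin(d2))        # projection of u onto the sampled manifold
        E += d2[j]
        lambdas.append(lambda_grid[j])
    return E, lambdas
```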
0:05:15 Here is the general scheme of the greedy algorithm that we propose. We first choose an atom in the dictionary arbitrarily, a suitable one, and we set it as the initial pattern p; then we compute the projections of our input images onto the manifold. Then comes the main loop: at each iteration we select an atom a and a coefficient c such that we reduce the error, and we add this atom to our pattern. This updates the manifold, so now that the manifold has changed, we recompute the projections of the input images onto it, and we continue this loop until the data approximation error settles at a minimum.
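The loop just described, condensed into a hedged Python sketch. Here `project` (registration of the inputs on the current manifold) and `approx_error_hat` (the surrogate error Ê discussed next, returning the best coefficient and its error for one atom) are assumed helpers, not the authors' code.

```python
def fit_pattern(us, dictionary, project, approx_error_hat, n_iters):
    p = dictionary[0].copy()              # arbitrary (suitable) initial atom
    lambdas = project(us, p)              # register the inputs on the manifold
    for _ in range(n_iters):
        # pick the atom/coefficient pair that most reduces the surrogate error
        trials = [(a,) + approx_error_hat(us, p, lambdas, a) for a in dictionary]
        a, c, _ = min(trials, key=lambda t: t[2])
        p = p + c * a                     # add the atom: the manifold changes
        lambdas = project(us, p)          # so re-register the inputs
    return p
```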
0:06:00 Now, how do we handle the minimization of this error? Unfortunately, the error has a complicated dependence on the atom and the coefficient, for the following reason. Let us imagine that we are now in the j-th iteration, so we have already computed the pattern p_{j-1} and its manifold. If we take an input image u_i, its projection on that manifold is already computed, so we know the corresponding parameter vector λ_i. Now, when we update the manifold by adding a new atom, its projection point will change, and it will most probably correspond to a parameter vector λ_i' which is different from λ_i; and we do not know what this λ_i' will be. But if we write down the total distance E, we see that it depends on this unknown new value λ_i' of the parameter vector. Therefore, it is not easy to minimize this error E directly.
0:06:52 So we define an approximation Ê of E, and we minimize this Ê instead. This Ê is simply the sum of the tangent distances of the input points to the new manifold, and the tangent distance is defined as follows: given the new manifold, we obtain a first-order approximation of it around the projection points that are already known; then the estimated distance of u_i to this manifold is just the distance between u_i and this first-order approximation.
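A sketch of that tangent distance: linearize the updated manifold around the known projection parameters λ_i and take the least-squares residual of u_i against the tangent plane. The finite-difference Jacobian is an illustrative choice, and `manifold_point` is an assumed helper mapping a parameter vector λ to U_λ(p).

```python
import numpy as np

def tangent_distance(u, manifold_point, lam, eps=1e-4):
    """Distance from u to the first-order expansion of the manifold at lam
    (lam is the parameter vector as a numpy array)."""
    m0 = manifold_point(lam).ravel()
    # columns: numerical tangents d U_lambda(p) / d lambda_k
    T = np.stack([(manifold_point(lam + eps * e).ravel() - m0) / eps
                  for e in np.eye(len(lam))], axis=1)
    coef, *_ = np.linalg.lstsq(T, u.ravel() - m0, rcond=None)
    return np.linalg.norm(u.ravel() - m0 - T @ coef)
```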
0:07:23 To minimize Ê we actually do something pretty straightforward: we just try each of the atoms of the dictionary one by one, and for each atom we compute the optimal coefficient c that minimizes this error Ê. If we write this Ê as a function of c, we see that it has the form of a rational function; that means these functions f_i and g_i are polynomials of c. In general, such a function has several local minima. However, as we have seen in our experiments, most of the time it is possible to minimize Ê just with a simple descent algorithm; it is not that complicated a function in practice. So we try each atom and compute its optimal coefficient, and then, among all the atoms, we pick the best one, the one that gives the smallest Ê. We add this atom to the new pattern, weighted by its optimal coefficient, and we repeat this procedure.
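Because Ê is a rational function of the coefficient c, the per-atom search can be as simple as a bounded 1-D minimization with a few restarts to hedge against the local minima mentioned above; the bounds and restart count are illustrative, and `e_hat_of_c` is an assumed closure evaluating Ê for one candidate atom.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def best_coefficient(e_hat_of_c, c_max=10.0, n_starts=4):
    """Return (c*, E_hat(c*)) for one candidate atom."""
    best = (0.0, e_hat_of_c(0.0))
    edges = np.linspace(-c_max, c_max, n_starts + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):     # a few local descents
        r = minimize_scalar(e_hat_of_c, bounds=(lo, hi), method="bounded")
        if r.fun < best[1]:
            best = (float(r.x), float(r.fun))
    return best
```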
0:08:22 Now, some experiments, first on a synthetic pattern. In this experiment we use a transformation manifold model of dimension three: we have rotation and two-dimensional translation. We generate the synthetic pattern by adding some Gaussian atoms randomly. Then we construct four different data sets from this pattern; each data set consists of some random geometric transformations of this synthetic pattern, and to each data set we add Gaussian noise, with four different noise variances for the four data sets. We use a dictionary consisting of Gaussian atoms.
0:09:05 Here you see the data approximation error plotted with respect to the noise variance; the approximation error is the total squared distance of the input images to the computed manifold. You see that it has a linear variation with respect to the noise variance, which is an expected result. However, if you pay attention, the line does not pass through the origin, and this actually reveals the error of the algorithm itself. There are two main sources for this error: first, this is a greedy algorithm and it does not have an optimality guarantee; second, we use a dictionary of finite size, and this discretization also introduces some error.
0:09:46 The second experiment is on handwritten digits. This time we use a four-dimensional transformation model, because we also have a scale change in the data set. Here we use a hundred geometrically transformed handwritten fives and a similar dictionary. On the left you see some of the samples used in the experiment, and on the right you see the pattern that we obtain with twenty-four atoms. It looks like a typical representative of the handwritten digit five, despite the variation in the data set.
0:10:22 Also, for a numerical comparison, we have compared our method to some reference approaches, using as error measure the data approximation error. The first two references again compute progressive approximations of the data. In the first one, we applied matching pursuit on a typical pattern in the data set, which we have chosen to be the input image that is closest to the centroid of all input images. In the second one, we applied simultaneous matching pursuit on all input images, to achieve a joint sparse approximation of them.
0:11:06 Finally, as a third approach, we wanted to provide a comparison between our method and classical manifold learning. As is well known, typical manifold learning algorithms make use of the assumption that the data has a locally linear behavior on the manifold. So we compute this locally linear manifold approximation error, which is the sum of the terms E_i, where E_i is the distance between a point u_i and the plane passing through its nearest neighbors.
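A hedged sketch of this baseline error: for each input image, fit the affine plane through its k nearest neighbors by least squares and accumulate the squared residual; the value of k is an illustrative choice.

```python
import numpy as np

def local_linear_error(U, k=5):
    """U: one flattened input image per row. Returns sum_i E_i."""
    total = 0.0
    for i, u in enumerate(U):
        d = np.linalg.norm(U - u, axis=1)
        d[i] = np.inf                     # exclude the point itself
        nbrs = U[np.argsort(d)[:k]]
        center = nbrs.mean(axis=0)
        A = (nbrs - center).T             # plane through the neighbors
        coef, *_ = np.linalg.lstsq(A, u - center, rcond=None)
        total += np.sum((u - center - A @ coef)**2)
    return total
```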
0:11:39 In the error plots you see here, the curve of the transformation-invariant matching pursuit method that we have proposed gets the best error performance. The red curve corresponds to matching pursuit on the typical pattern; since that pattern is fitted to only one image, it cannot represent the variation of the whole data set that well. The situation is better when the pattern is estimated by applying simultaneous sparse approximation over all images, but it is still worse than our method.
0:12:17 And finally, some experiments on face images. This time the manifold is of higher dimension, because we also have anisotropic scaling. We have used face images of the same subject, but the data set also contains some variation of facial expression that we do not model; such facial expression variations are rather regarded as the source of the deviation from the computed manifold. Here you see some images taken from the data set, and on the right the face pattern that we have computed. It looks more or less like the face of the same person, with some kind of averaging of facial expression and illumination.
0:13:00 If you look at the error plots, we see that even if we still get the best error performance, the simultaneous matching pursuit now performs about the same. This is because the amount of variation among face images of the same person is much smaller compared to the variation among the handwritten digits, so a typical pattern of the data set is likely to approximate all the patterns. Meanwhile, if you look at this line, the locally linear approximation error is pretty high. The reason is that we have used just thirty-five images, so the data is sparsely sampled on the manifold, and the local linearity assumption does not hold anymore.
0:13:45 To wrap up, we have presented a method for the transformation-invariant sparse approximation of a set of signals. We build a representative pattern with a greedy algorithm, by parametric atom selection. The complexity of the method that we propose changes linearly with respect to the number of atoms in the dictionary; it is also linear with respect to the number of images in the input set, and it has a quadratic dependence on the dimension of the manifold and on the image resolution.
0:14:15 Furthermore, we have shown in another work that, under some assumptions on the transformation model and also on the structure of the dictionary, we can achieve a joint optimization of the atom parameters and the coefficients c. In this case we optimize over a continuous dictionary manifold, rather than over a fixed dictionary of discrete atom samples, and we get rid of the second error source mentioned earlier: we no longer have the error that comes from the discretization of the dictionary.
0:14:50 As a final remark, our work can be related to two areas in general: one is sparse signal approximation, and the other is manifold learning. What we gain over sparse signal approximation algorithms like MP and SMP is that we achieve invariance to geometric transformations of the data, because we use a transformation manifold model. On the other hand, the advantages we have over classical manifold learning algorithms are the following. First of all, we provide an analytic model for the data, and that has nice properties: it is differentiable, it is smooth. It can also be used to code the data with the parametric atoms. It allows the generation of new data on the manifold. And finally, it can still work if the sampling of the data set is sparse, whereas many manifold learning methods would require a much denser sampling. So, that is all, and thank you very much for your attention.
0:15:52 [Moderator] Thank you. A first question?
0:16:05 [Audience question, largely inaudible: apparently about how the convergence of the algorithm is guaranteed.]
[Speaker] Actually, what we do is that we do not minimize E itself; we minimize Ê, which is an approximation of it. So at each iteration we minimize this Ê, but this Ê is not equal to E: one reason is the first-order approximation, and the second reason is that when you do this minimization, the projection points change. For these two reasons, when you do this optimization, you cannot guarantee that you will reduce E. So what we do in practice is the following: we pick the best atom, we recompute the projections, and then we check whether the error has decreased. If the error does not decrease, we try another atom; that is, we do not pick the best one but the second best one, and then we try again. We only update the pattern if the error has decreased. Since we therefore reduce the error E for sure in each iteration, and the error has a lower bound, it has to converge at some point.
0:17:15 [Audience question, largely inaudible: apparently about whether this convergence argument holds for other types of transformations.]
[Speaker] I think with whatever manifold you define, I mean, with whatever kind of transformation, as long as you define this error E as the total distance of the data to the manifold and you make sure that it decreases at each iteration, the argument holds: if you have a decreasing function that is lower bounded, it has to converge after a while, since it is a monotonically decreasing function.
0:17:57 [Audience question, partly inaudible: the result presumably also depends on the dictionary that is used; could the dictionary itself be learned from the data?]
[Speaker] So, this is a question about dictionary learning, I guess. We have not done anything like, say, K-SVD, where you also get to optimize the atoms. One reason for this is that we really would like to retain the parametric form of the atoms: we need them to be differentiable functions, because we are computing tangent distances to the manifold, so they just cannot be arbitrary functions. That said, the extension I mentioned at the end, the joint optimization of the atom parameters, does touch the field of dictionary learning, because there we have a dictionary manifold, and what we do is optimize on this dictionary manifold; that is, we optimize the parameters of the atoms. So it is somehow related to dictionary learning, but we consider a differentiable setting: the mother function can be any differentiable analytic function, so it is generic in that sense.
[Audience] Yes, but it is not learned from the data.
[Speaker] No, we agree with that.
0:19:29 [Audience question] You said that you use the fact that there is a sparse approximation, but do you actually explicitly use a sparsity constraint in your optimization? The question is: how does this depend on your dictionary, and how do you take sparsity into account in your optimization problem? Have you introduced an L1 norm?
[Speaker] No, we do not take it into account like that.
[Audience] Then the sparsity you are talking about is in which domain?
[Speaker] Here there is sparsity in terms of these dictionary atoms that we use. The pattern p is the sum of K atoms, so if K is much smaller than n, the number of pixels that we have in the image, then this pattern p is sparse in the domain consisting of these atoms. So the main idea here is that you can decide to take, say, fifty atoms, keep the best fifty atoms, and you have a sparse approximation.
0:20:57 [Moderator] Any other questions? No? Then let's thank the speaker once again.