0:00:16 | Hi, I'll be presenting our |
---|
0:00:19 | paper, |
---|
0:00:21 | titled "The Grouped |
---|
0:00:23 | Two-Sided Orthogonal Procrustes Problem," |
---|
0:00:25 | which is joint work with my adviser Brian |
---|
0:00:27 | and with Peter. |
---|
0:00:32 | Okay, so a well-known problem that's been around a while is the orthogonal Procrustes |
---|
0:00:37 | problem, |
---|
0:00:37 | where we seek to identify |
---|
0:00:40 | an orthogonal transformation between two data matrices. |
---|
0:00:44 | In this context we seek to |
---|
0:00:46 | minimize the squared Frobenius norm |
---|
0:00:49 | of the difference between, |
---|
0:00:51 | in this case, |
---|
0:00:52 | the rows of B and the rotated rows of A. |
---|
0:00:56 | And this is useful: |
---|
0:00:58 | it serves as a geometric |
---|
0:01:02 | transformation, a rotation, to correct for |
---|
0:01:07 | rotational distortions between |
---|
0:01:10 | the matrices A and B. |
---|
0:01:15 | Moving on |
---|
0:01:16 | to a second-order version: there's the two-sided orthogonal Procrustes problem, |
---|
0:01:21 | where you have |
---|
0:01:23 | symmetric |
---|
0:01:25 | similarity matrices C and C_R, where the subscript R denotes the |
---|
0:01:28 | reference. |
---|
0:01:31 | In this problem, |
---|
0:01:36 | rather than the rotation acting on one side, we have it on both sides, |
---|
0:01:40 | in the form R-transpose |
---|
0:01:42 | C R, and again we're trying to match up |
---|
0:01:45 | this two-sided rotation with |
---|
0:01:51 | some reference |
---|
0:01:53 | similarity matrix. |
---|
0:01:54 | uh |
---|
0:01:56 | the says a bunch of applications uh |
---|
0:01:59 | one being a a covariance or or correlation |
---|
0:02:02 | a matrix matching so |
---|
0:02:04 | we eight |
---|
0:02:05 | a data matrix to we've induced |
---|
0:02:08 | uh a double rotation |
---|
0:02:10 | on the covariance matrix |
---|
0:02:12 | A more interesting problem |
---|
0:02:14 | is related to graph matching. So in isometric graph matching |
---|
0:02:20 | we're given two adjacency matrices, |
---|
0:02:22 | again C and C_R, |
---|
0:02:25 | which can be weighted, |
---|
0:02:27 | and the idea is to find a permutation matrix |
---|
0:02:35 | which serves to reorder the nodes of one graph onto the nodes |
---|
0:02:41 | in the other graph, |
---|
0:02:43 | matching in the Frobenius-norm score. So |
---|
0:02:47 | this is a hard problem, |
---|
0:02:48 | given its combinatorial nature. |
---|
0:02:53 | One popular method, |
---|
0:02:56 | Umeyama's method, |
---|
0:02:58 | involves relaxing this problem |
---|
0:03:01 | to an orthogonal |
---|
0:03:03 | constraint |
---|
0:03:05 | and then finding the closest permutation matrix |
---|
0:03:08 | via the Hungarian algorithm. |
---|
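The relaxation pipeline just described (relax the permutation constraint to an orthogonal one, then round back) ends with a projection onto the set of permutation matrices, which the Hungarian algorithm solves exactly. A minimal sketch of that rounding step using SciPy's assignment solver; the matrix `Q` is a hypothetical relaxed solution, not anything from the paper:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def nearest_permutation(Q):
    """Round a relaxed (roughly orthogonal) solution Q to the closest
    permutation matrix: maximize trace(P.T @ Q) over permutations P,
    which is a linear assignment problem."""
    rows, cols = linear_sum_assignment(-Q)  # the solver minimizes, so negate
    P = np.zeros_like(Q)
    P[rows, cols] = 1.0
    return P

# A noisy relaxed solution rounds to the dominant permutation pattern.
Q = np.array([[0.1, 0.9, 0.0],
              [0.8, 0.2, 0.1],
              [0.0, 0.1, 0.95]])
P = nearest_permutation(Q)  # picks the entries 0.9, 0.8, 0.95
```

Maximizing trace(P-transpose Q) is equivalent to minimizing the Frobenius distance from Q over permutations, since the norm of P is constant.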
0:03:13 | Okay, so if we go back to the orthogonal Procrustes problem: this has a known solution, |
---|
0:03:19 | and, given |
---|
0:03:20 | the right circumstances, |
---|
0:03:22 | it's unique, and it's given by the |
---|
0:03:25 | singular value decomposition of A-transpose B, where |
---|
0:03:29 | the singular values |
---|
0:03:30 | have to be |
---|
0:03:31 | transformed to |
---|
0:03:33 | all ones. |
---|
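The closed form described here fits in a few lines of NumPy: take the SVD of A-transpose B and set the singular values to one, giving R = U V-transpose. A small sketch with a noiseless sanity check; the matrix sizes and names are just illustrative:

```python
import numpy as np

def procrustes(A, B):
    """One-sided orthogonal Procrustes: argmin_R ||B - A @ R||_F^2
    over orthogonal R.  SVD of A.T @ B with singular values set to 1."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

# Noiseless check: B is A rotated by a known orthogonal Q.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 4))
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
B = A @ Q
R = procrustes(A, B)  # recovers Q up to floating point
```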
0:03:34 | For the two-sided problem, however, |
---|
0:03:38 | it's a bit more complicated, because |
---|
0:03:40 | we have a closed-form solution, but it's really a family of solutions. So |
---|
0:03:44 | if the eigendecomposition of C is V |
---|
0:03:47 | Lambda V-transpose, and that of C_R is V_R Lambda_R |
---|
0:03:51 | V_R-transpose, |
---|
0:03:54 | then a solution to this problem is given by |
---|
0:03:57 | V D V_R-transpose, where D is a diagonal matrix in which each diagonal entry |
---|
0:04:02 | is plus one or minus one. |
---|
0:04:07 | So right away we see that we can have |
---|
0:04:09 | two-to-the-n possible solutions, |
---|
0:04:11 | at least. |
---|
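One member of this family can be written down directly from the two eigendecompositions; the sign vector d selects which of the (at least) 2^n solutions you get. A sketch, with C constructed as an exactly rotated copy of C_R so that every sign choice achieves a perfect fit; the test data is illustrative:

```python
import numpy as np

def two_sided_solution(C, C_R, d):
    """A member of the closed-form family for
    min_R ||C_R - R.T @ C @ R||_F^2:  R = V @ diag(d) @ V_R.T,
    where C = V Lam V.T, C_R = V_R Lam_R V_R.T, and d is any
    vector of +/-1 signs."""
    _, V = np.linalg.eigh(C)
    _, V_R = np.linalg.eigh(C_R)
    return V @ np.diag(d) @ V_R.T

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 5))
C_R = X + X.T                          # symmetric reference
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
C = Q @ C_R @ Q.T                      # exactly rotated copy
R = two_sided_solution(C, C_R, d=np.ones(5))
# Every sign vector d gives an orthogonal R with the same objective value.
```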
0:04:14 | So that was a bit of history. |
---|
0:04:17 | In this talk we're going to consider a |
---|
0:04:21 | scenario where, |
---|
0:04:22 | for example, graph vertices don't just have |
---|
0:04:26 | similarity measures between nodes, but |
---|
0:04:31 | there's also some side information. |
---|
0:04:33 | And so this paper considers the |
---|
0:04:35 | case where |
---|
0:04:36 | the vertices |
---|
0:04:37 | are partitioned |
---|
0:04:38 | into groups. |
---|
0:04:41 | So, for example, |
---|
0:04:43 | let's say we're modeling a face: |
---|
0:04:46 | certain nodes belong to the eye, certain nodes belong to the nose, and so on. |
---|
0:04:50 | Or we can think of a spatial configuration: |
---|
0:04:56 | in fMRI analysis we might have |
---|
0:04:58 | certain regions of interest, so |
---|
0:05:00 | we have connectivity between |
---|
0:05:02 | voxels, but we also know which regions those voxels |
---|
0:05:05 | belong to. |
---|
0:05:09 | Okay. |
---|
0:05:10 | So suppose |
---|
0:05:11 | our dataset |
---|
0:05:12 | is partitioned into |
---|
0:05:14 | capital-M groups, |
---|
0:05:16 | so that for each point i there's |
---|
0:05:18 | some group that i belongs to, which |
---|
0:05:21 | we'll denote g(i). |
---|
0:05:24 | And the goal is |
---|
0:05:27 | again to |
---|
0:05:28 | estimate an R, |
---|
0:05:30 | as in the two-sided |
---|
0:05:32 | Procrustes problem, but |
---|
0:05:33 | here R is going to be block-diagonal, |
---|
0:05:36 | and those diagonal blocks correspond to the |
---|
0:05:40 | grouped nature of the |
---|
0:05:41 | problem. |
---|
0:05:43 | So we call this problem the |
---|
0:05:46 | grouped two-sided orthogonal Procrustes problem, and |
---|
0:05:50 | of course when M equals one we |
---|
0:05:52 | go back to the original, |
---|
0:05:54 | standard two-sided problem. |
---|
0:05:59 | So, |
---|
0:06:01 | given the block-diagonal constraint, |
---|
0:06:04 | it essentially introduces |
---|
0:06:06 | some linear constraints: that is, the elements of |
---|
0:06:10 | this large matrix R |
---|
0:06:12 | are such that |
---|
0:06:15 | the entry R_pq is zero if |
---|
0:06:21 | nodes p and q belong to different groups. |
---|
0:06:25 | Some benefits of this approach are that |
---|
0:06:29 | these constraints regularize the problem, so |
---|
0:06:32 | we might be reducing the effects of overfitting, |
---|
0:06:36 | and also we're using prior knowledge that |
---|
0:06:39 | we have about this graph. |
---|
0:06:43 | Also, we have |
---|
0:06:45 | fewer |
---|
0:06:46 | parameters to estimate, so |
---|
0:06:50 | we're dealing with a |
---|
0:06:52 | lower-dimensional problem. |
---|
0:06:56 | Okay, so for M equal to one we saw that it does have a |
---|
0:07:01 | closed-form family of |
---|
0:07:02 | solutions, |
---|
0:07:03 | but for M greater than one |
---|
0:07:05 | we're going to have to look at an iterative |
---|
0:07:07 | technique. |
---|
0:07:08 | So let C_ii and C_ii,R denote the within-region, or within-group, similarity structures, |
---|
0:07:16 | and C_ij, where i is not equal to j, will denote the cross-group |
---|
0:07:21 | similarity structures. |
---|
0:07:22 | So, |
---|
0:07:23 | again, given that this is going to be an iterative |
---|
0:07:26 | solution, |
---|
0:07:27 | we'd like to start from |
---|
0:07:29 | a good starting point, |
---|
0:07:32 | and a logical starting point would be to just consider the within-group similarities. |
---|
0:07:38 | So we can initialize |
---|
0:07:39 | each diagonal block |
---|
0:07:42 | of our rotation matrix |
---|
0:07:43 | to be |
---|
0:07:44 | V_mm D_m V_mm,R-transpose, |
---|
0:07:48 | where |
---|
0:07:50 | the V factors |
---|
0:07:52 | correspond to the eigendecompositions of |
---|
0:07:56 | C_mm and |
---|
0:07:58 | C_mm,R. |
---|
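This within-group initialization can be sketched as follows. The per-group sign matrices D_m are set to the identity here as a placeholder, since choosing their signs is exactly the first refinement step; the helper names and block sizes are illustrative:

```python
import numpy as np
from scipy.linalg import block_diag

def init_estimate(C_blocks, C_R_blocks):
    """Initial block-diagonal estimate built only from the within-group
    similarities: block m is V_mm @ D_m @ V_mm_R.T.  The +/-1 signs in
    D_m are all set to +1 here; the refinement step chooses them."""
    blocks = []
    for C_mm, C_mm_R in zip(C_blocks, C_R_blocks):
        _, V = np.linalg.eigh(C_mm)
        _, V_R = np.linalg.eigh(C_mm_R)
        blocks.append(V @ V_R.T)       # D_m = I
    return block_diag(*blocks)

rng = np.random.default_rng(2)
def sym(n):
    X = rng.standard_normal((n, n))
    return X + X.T

R0 = init_estimate([sym(3), sym(4)], [sym(3), sym(4)])
# R0 is 7x7, orthogonal, and zero off the diagonal blocks.
```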
0:08:02 | So from that starting point, our goal is to improve on this initial estimate |
---|
0:08:06 | by looking at the cross-region similarities. |
---|
0:08:09 | This is broken out into a two-step process. |
---|
0:08:12 | The first step is trying to |
---|
0:08:15 | assign appropriate |
---|
0:08:17 | plus-one or minus-one entries |
---|
0:08:19 | for the D matrices, |
---|
0:08:21 | and it turns out that, |
---|
0:08:25 | on expanding the problem, |
---|
0:08:29 | this reduces to a quadratic binary optimization problem, |
---|
0:08:34 | and so we can |
---|
0:08:36 | get an approximate solution |
---|
0:08:40 | by using the algorithms |
---|
0:08:43 | developed to solve max-cut. |
---|
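The talk attacks this sign-assignment step with max-cut machinery. As a rough stand-in, here is a simple greedy local search for a generic quadratic binary objective; the coupling matrix `W` is a placeholder for whatever matrix the cross-group terms induce, not the paper's actual construction:

```python
import numpy as np

def greedy_signs(W, sweeps=20):
    """Greedy local search for max over d in {-1,+1}^n of d.T @ W @ d,
    with W symmetric.  A simple stand-in for the max-cut-style solvers
    mentioned in the talk."""
    n = W.shape[0]
    d = np.ones(n)
    for _ in range(sweeps):
        improved = False
        for i in range(n):
            # Flipping d[i] changes the objective by
            # -4 * d[i] * sum_{j != i} W[i, j] * d[j].
            gain = -4.0 * d[i] * (W[i] @ d - W[i, i] * d[i])
            if gain > 0:
                d[i] = -d[i]
                improved = True
        if not improved:
            break
    return d

W = np.array([[0.0, -1.0],
              [-1.0, 0.0]])
d = greedy_signs(W)   # opposite signs maximize d.T @ W @ d here
```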
0:08:44 | The second step, after we've got our |
---|
0:08:48 | D matrices, |
---|
0:08:52 | is to refine the orthogonal |
---|
0:08:53 | estimates |
---|
0:08:54 | with a greedy strategy, |
---|
0:08:57 | and the way we do that |
---|
0:08:59 | is by selecting |
---|
0:09:01 | pairs of elements |
---|
0:09:02 | that are within groups and applying |
---|
0:09:05 | appropriate Givens rotations to |
---|
0:09:07 | decrease the objective. So this is a form of |
---|
0:09:11 | block coordinate descent. |
---|
0:09:16 | I'd like to expand a bit on |
---|
0:09:18 | the second step. |
---|
0:09:21 | We can view the matrix C as a |
---|
0:09:26 | noisy, |
---|
0:09:28 | rotated version of |
---|
0:09:30 | C_R, |
---|
0:09:33 | and, because of the noise, the eigenvectors |
---|
0:09:37 | of the within-group similarities |
---|
0:09:40 | are perturbed. |
---|
0:09:44 | So, in the factorization we had for our initial estimate, |
---|
0:09:48 | R_m equals |
---|
0:09:49 | V_mm |
---|
0:09:50 | D_m V_mm,R-transpose, |
---|
0:09:53 | we introduce |
---|
0:09:55 | another matrix U_m to get V_mm U_m D_m V_mm,R-transpose, |
---|
0:10:01 | and we view the U_m and the D_m |
---|
0:10:04 | as a corrective measure |
---|
0:10:07 | to decrease the objective. |
---|
0:10:10 | So the U_m's and the D_m's |
---|
0:10:15 | are there to incorporate |
---|
0:10:17 | the cross-region similarities |
---|
0:10:19 | to correct the |
---|
0:10:21 | perturbed eigenvectors. |
---|
0:10:27 | So if we let |
---|
0:10:29 | U be the block-diagonal |
---|
0:10:33 | matrix with blocks U_1 through U_M, |
---|
0:10:36 | then what we do |
---|
0:10:39 | in our greedy approach is generate these Givens rotations: we pick |
---|
0:10:45 | elements |
---|
0:10:46 | p_t and q_t |
---|
0:10:48 | that belong to the same group, |
---|
0:10:49 | and we successively apply Givens rotations |
---|
0:10:53 | at each iteration t to decrease the objective. |
---|
0:10:59 | The p_t and q_t |
---|
0:11:02 | are found using an approximate greedy selection, |
---|
0:11:05 | and the angle is found by a sampled line search. |
---|
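A minimal sketch of one such update: a Givens rotation on a within-group pair (p, q), with the angle chosen by brute-force sampling of the two-sided objective. The grid size and the choice of (p, q) are illustrative; the paper's greedy selection rule is not reproduced here:

```python
import numpy as np

def objective(R, C, C_R):
    """Two-sided Procrustes objective ||C_R - R.T @ C @ R||_F^2."""
    return np.linalg.norm(C_R - R.T @ C @ R) ** 2

def givens(n, p, q, theta):
    """n x n Givens rotation acting on coordinates p and q."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[p, p] = c; G[q, q] = c
    G[p, q] = -s; G[q, p] = s
    return G

def givens_step(R, C, C_R, p, q, n_angles=64):
    """One greedy update: sampled line search over the angle of the
    (p, q) Givens rotation; keeps R unchanged if nothing improves.
    p and q should lie in the same group so R stays block-diagonal."""
    n = R.shape[0]
    best_R, best_f = R, objective(R, C, C_R)
    for theta in np.linspace(0.0, 2 * np.pi, n_angles, endpoint=False):
        R_try = R @ givens(n, p, q, theta)
        f = objective(R_try, C, C_R)
        if f < best_f:
            best_R, best_f = R_try, f
    return best_R, best_f

# With C == C_R the optimum is R = I; one step shrinks the objective.
rng = np.random.default_rng(3)
X = rng.standard_normal((3, 3))
C = C_R = X + X.T
R0 = givens(3, 0, 1, 0.3)              # start slightly off the optimum
R1, f1 = givens_step(R0, C, C_R, 0, 1)
```

Iterating this over greedily chosen (p_t, q_t) pairs is the block coordinate descent the talk describes.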
0:11:11 | So, |
---|
0:11:11 | let's look at some results. |
---|
0:11:14 | This was tested on two datasets: the first was the MNIST handwritten digits, |
---|
0:11:19 | and the second was the Yale face database. So |
---|
0:11:23 | for MNIST we have |
---|
0:11:24 | ten groups |
---|
0:11:25 | for the ten digits, and for the face database we have fifteen groups for |
---|
0:11:29 | fifteen |
---|
0:11:30 | people. |
---|
0:11:32 | So each group |
---|
0:11:34 | was transformed by a random orthogonal matrix |
---|
0:11:38 | and then corrupted with noise. |
---|
0:11:41 | And what you see before you: |
---|
0:11:43 | the diagonal blocks |
---|
0:11:46 | of this matrix are the within-group similarity structures, |
---|
0:11:50 | and the off-diagonal |
---|
0:11:51 | elements are the cross-group structures. |
---|
0:11:53 | So you'll see, |
---|
0:11:56 | for example, the covariance of |
---|
0:11:57 | digits three and seven, |
---|
0:12:01 | pointed to here. |
---|
0:12:05 | Okay, so to assess accuracy, we looked at |
---|
0:12:11 | our estimate of R, denoted |
---|
0:12:14 | R-hat, where each |
---|
0:12:16 | block of |
---|
0:12:17 | the rotation matrix is updated, |
---|
0:12:20 | and we looked at |
---|
0:12:23 | the normalized inner product of |
---|
0:12:26 | R-hat with R, and we took the average, so this is a figure between |
---|
0:12:31 | minus one and one. |
---|
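The accuracy figure can be sketched as a normalized trace inner product. The exact normalization used in the paper is an assumption here, but trace(R-hat-transpose R)/n lands in [-1, 1] for n x n orthogonal matrices, with 1 being a perfect match:

```python
import numpy as np

def match_score(R_hat, R):
    """Normalized inner product <R_hat, R>/n = trace(R_hat.T @ R)/n.
    For n x n orthogonal matrices this lies in [-1, 1]; the paper's
    exact normalization is assumed, and the per-block scores are
    averaged the same way."""
    return np.trace(R_hat.T @ R) / R.shape[0]

print(match_score(np.eye(3), np.eye(3)))    # perfect match: 1.0
print(match_score(-np.eye(3), np.eye(3)))   # sign-flipped: -1.0
```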
0:12:31 | minus one and one |
---|
0:12:33 | um |
---|
0:12:34 | and we see that uh |
---|
0:12:36 | that at iteration zero |
---|
0:12:39 | uh |
---|
0:12:40 | this is the result if we just take in |
---|
0:12:43 | to account the within group similarities and then |
---|
0:12:47 | as as a T increases the number of iterations increase |
---|
0:12:51 | uh out knowledge of across region similarities |
---|
0:12:54 | uh help us |
---|
0:12:56 | to teen uh a better match |
---|
0:12:59 | and uh we |
---|
0:13:00 | and this is true for both |
---|
0:13:02 | you know of the M nist |
---|
0:13:03 | and the old |
---|
0:13:08 | Moving to the more interesting problem of graph matching: |
---|
0:13:13 | each group in this case was transformed by some random permutation matrix |
---|
0:13:18 | and then perturbed |
---|
0:13:18 | with noise, |
---|
0:13:21 | and, rather than solve for the |
---|
0:13:23 | R_m directly, |
---|
0:13:25 | after solving for each R_m at every time step, we then |
---|
0:13:28 | use the Hungarian algorithm |
---|
0:13:30 | to get the corresponding |
---|
0:13:33 | permutation matrix, |
---|
0:13:35 | and we plot the accuracy |
---|
0:13:38 | as the fraction of correct |
---|
0:13:41 | permutation matches. |
---|
0:13:44 | And for MNIST there's a |
---|
0:13:47 | steeper improvement |
---|
0:13:49 | between only looking at the within-region structure and |
---|
0:13:53 | looking at the cross-region structure. |
---|
0:13:58 | In conclusion, |
---|
0:14:00 | we posed the grouped |
---|
0:14:01 | version of the two-sided orthogonal Procrustes problem, |
---|
0:14:04 | and we derived an algorithm that's computationally efficient, given the Givens rotations, |
---|
0:14:11 | and simple to implement, |
---|
0:14:14 | and the results show that |
---|
0:14:16 | effectively |
---|
0:14:18 | utilizing both the within-group and the cross-group structure |
---|
0:14:21 | improves the estimation |
---|
0:14:23 | of the |
---|
0:14:24 | unknown |
---|
0:14:26 | orthogonal transformation. |
---|
0:14:31 | Thanks. |
---|
0:14:36 | Sure. |
---|
0:14:40 | I guess the obvious question is: what can you say about the convergence of this scheme? |
---|
0:14:49 | I would say that |
---|
0:14:52 | there are probably |
---|
0:14:54 | local minima, |
---|
0:14:56 | given the constraints, and so |
---|
0:14:59 | the key is starting from |
---|
0:15:01 | that good |
---|
0:15:02 | initial point. |
---|
0:15:07 | More than that |
---|
0:15:08 | I can't say. |
---|
0:15:13 | No. |
---|
0:15:14 | No. |
---|