0:00:13Thank you.
0:00:17I should apologise if I'm a bit incoherent in this talk. As was mentioned, I just came from the airport; there was a tornado in the Midwest, so I slept last night in the Dallas airport, debating whether to actually come here.
0:00:34So the good news is that this is a simple paper: it has a single message, and I'll get to it shortly.
0:00:44This is joint work with my students Samet and Amin, and as was mentioned it is on weighted compressed sensing, with a little bit on rank minimization.
0:00:53So here's the outline of the talk. What we want to look at is compressed sensing when you have some sort of prior information. In particular, we assume that you can break your signal into partitions and that you know roughly the sparsity of each partition. So you're not assuming that the sparsity is uniform over your signal; you know that it differs in different chunks, and we want to exploit that information.
0:01:19I'll talk a little bit about related work, and then formally define the problem.
0:01:23Then, basically, the contribution of this paper, as I said, is a simple thing: we give a simple, explicit rule for choosing the weights in weighted compressed sensing, that is, weighted L1 minimization.
0:01:33And then I'll say a few words about how to extend this to rank minimization problems, though I won't say a whole lot about that.
0:01:42Okay, I guess I need MathType here; in its absence, those squares are supposed to be boldface R's. So x is an N-dimensional vector. Just to introduce the notation: the ambient dimension is N, the number of measurements is M, and A is an M by N matrix.
0:02:08In compressed sensing, of course, the goal is to recover a sparse signal x using fewer measurements than the ambient dimension, and L1 minimization is a popular approach. A great deal of work has been done: people have looked at various algorithms and various models; the previous talk was on a different algorithm, the one before that on statistical models.
0:02:31In this talk I want to focus on doing compressed sensing with signals that have a certain structure, namely structure in the sparsity pattern, which is the following.
0:02:44So if you look at the signal here, the conventional assumption is that the signals one does L1 minimization on are uniformly sparse. But there are many applications where that is not the case: for example, in DNA microarrays, if you know something about the gene expression levels of what you are looking at; or other examples mentioned here, such as network problems. Sometimes you do compressed sensing iteratively, so you have prior information from earlier iterations, and people have done compressed sensing in conjunction with Kalman filters when you have time variations. So in many cases you might actually know a lot more about what the sparsity structure is.
0:03:35In particular, what we'll assume here is that you can break your signal up into chunks, and you might know that this chunk is less sparse and that chunk is more sparse. Again, you don't know where the zero and nonzero entries are, but you may have prior information.
0:03:53Okay, so a natural thing to do, if you have a collection of T sets which partition the ambient dimension into T disjoint sets, where each set might have a different sparsity, and you want to stay within L1 optimization, is just to weight each set differently. This notation means we take the entries of x supported on the set S_i of the partition, and we add up their L1 norms, each weighted by a particular w_i, so there is a different weight for each of the sets defining the partition; that is, we minimize sum_i w_i ||x_{S_i}||_1 subject to the measurement constraints. In other words, we do a weighted L1 minimization.
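To make the weighted program concrete, here is a minimal sketch in Python using cvxpy; the partition, the weights, and the measurement model are illustrative assumptions, not the paper's code.

```python
import cvxpy as cp

def weighted_l1_recover(A, y, sets, w):
    """Solve  min_x  sum_i w[i] * ||x_{S_i}||_1   subject to   A x = y.

    A    : (m, n) measurement matrix
    y    : (m,)   measurement vector
    sets : list of index lists S_i partitioning {0, ..., n-1}
    w    : nonnegative weight for each set
    """
    n = A.shape[1]
    x = cp.Variable(n)
    objective = cp.Minimize(sum(wi * cp.norm1(x[list(Si)]) for wi, Si in zip(w, sets)))
    cp.Problem(objective, [A @ x == y]).solve()
    return x.value
```

Setting all the weights equal reduces this to ordinary L1 minimization, which is a useful sanity check.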
0:04:41And so the problem setting that I want to look at, so that we can do some analysis, is this: let's assume that we know, either deterministically, as written here, or probabilistically, the fraction of nonzero entries in each set S_i of the partition; call it p_i. Knowing the fraction exactly would be a deterministic assumption, but all the analysis and results I'm going to describe extend to the case where this is actually a probability. What that means is that each entry in the set S_i is nonzero with probability p_i; that would be the stochastic characterization, and the results hold for both.
0:05:28Okay, so we assume knowledge both of where these sets are and of the probability of their entries being nonzero, or the proportion that are nonzero.
0:05:38And again, the aim here is to minimize the number of measurements that we need for recovery. So the question is: given knowledge of the p_i and the S_i, how should we choose these weights so as to recover x from the fewest number of measurements? In other words, how can we map the p_i and S_i to the w_i?
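For reference, a small sketch of the stochastic model just described, where each entry of block S_i is nonzero with probability p_i; the block sizes, probabilities, and Gaussian nonzero values below are illustrative assumptions.

```python
import numpy as np

def block_sparse_signal(block_sizes, p, seed=None):
    """Each entry of block i is nonzero with probability p[i]; nonzeros are Gaussian."""
    rng = np.random.default_rng(seed)
    blocks = []
    for size, pi in zip(block_sizes, p):
        support = rng.random(size) < pi          # Bernoulli(p_i) support pattern
        blocks.append(support * rng.standard_normal(size))
    return np.concatenate(blocks)

# Example: two blocks of 500 entries with 10% and 40% nonzeros, respectively.
x = block_sparse_signal([500, 500], [0.1, 0.4], seed=0)
```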
0:05:59So people have looked at this. There is prior work, the work of Vaswani and Lu, I think it's called modified CS, where they look at this problem using RIP-type results. There is an earlier paper of ours, mainly Amin's, where we actually answer this question exactly using Grassmann angle methods: essentially, if you choose particular weights, we can compute a threshold, in terms of the number of measurements, for recovering the sparse signal. It's a very detailed analysis; we can carry it out more or less if you have two sets in your partition, but once it goes beyond two, to three or four sets, it becomes intractable. In principle you can do it, but I wouldn't recommend it.
0:06:54And so the goal of this paper really was to somehow give some insight into how to come up with the weights.
0:07:02Okay, so what is the approach? We assume that the measurements are Gaussian; this allows us to push through some analysis. And so, rather than going through the Grassmann angle machinery, we employ another technique, one that we introduced in earlier work on rank minimization. It gives an approximation; well, not exactly an approximation: it's a bound, though not a tight one, and not an upper bound, if you will, and we use that to derive the weights.
0:07:35And so the interesting thing is that we end up with a very simple result: essentially, the weight for each set is one minus its probability of being nonzero, w_i = 1 - p_i.
0:07:48Intuitively this makes sense, because we're punishing the sparser sets: the sparser a set is, the larger its weight, so we penalize it more in the cost function, as we probably should.
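As a quick illustration of the rule, here is the mapping from priors to weights; the specific p_i values are made up.

```python
import numpy as np

p = np.array([0.05, 0.30, 0.65])   # assumed fraction of nonzeros in each set S_i
w = 1.0 - p                        # proposed rule: sparser sets get larger weights
print(w)                           # -> [0.95 0.7  0.35]
```

These weights would then be passed to the weighted L1 program sketched earlier.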
0:08:08So it's a very simple rule that one can use, and I'll show you results with it. Of course, the drawback is that it is suboptimal; we cannot claim optimality. But of course it costs essentially nothing computationally.
0:08:22So what is the strategy for arriving at this result? We first find a condition for recovering signals when you do weighted L1 optimization.
0:08:36For those of you who are familiar with the null space condition for recovering signals with L1 optimization: in the weighted case you get a very similar result, except that the weights now play a role. It turns out that you can recover a signal if and only if, for all vectors in the null space of your measurement matrix, the following inequality holds. It is a sum over the null space entries: assuming the signal has support K, the weighted sum over the intersections of the sets S_i with K must be dominated by the weighted sum over the intersections of the S_i with the complement of K.
0:09:23But in any event, this is an inequality that should hold for every vector in the null space, and that's why it is difficult to analyse directly.
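For readers following along without the slides, here is a plausible written form of the weighted null space condition described above; it is reconstructed from the verbal description, so the notation on the slide may differ.

```latex
% Weighted null space condition (reconstructed from the description):
% every x supported on K is the unique solution of the weighted L1 program
% if and only if, for every nonzero z in the null space of A,
\[
  \sum_{i=1}^{T} w_i \,\bigl\lVert z_{S_i \cap K} \bigr\rVert_1
  \;<\;
  \sum_{i=1}^{T} w_i \,\bigl\lVert z_{S_i \cap K^{c}} \bigr\rVert_1 ,
  \qquad \forall\, z \in \mathcal{N}(A) \setminus \{0\}.
\]
```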
0:09:31And so what we've done is replace this with a simpler condition, which is the following: this quantity, which we have to guarantee to be positive for all z in the null space of A, can be replaced by this quantity, which no longer has the null space vector in it. If this is positive, then it is actually a lower bound on the original quantity; a lower bound with high probability over A, so for almost any A you choose, this will be a lower bound.
0:10:09And the good thing about it is that there is no randomness in it: it depends on the weights w_i, on these alpha_i, which are the fractions of the set sizes relative to the ambient dimension, and on delta, which is the ratio between the number of measurements and the ambient dimension.
0:10:30So once you have this new quantity, you want it to be positive, and so you probably want to choose the weights to maximise it. That is what we'll do in a moment; for those of you interested, the details are in the paper, and this is based on some results of Gordon, invoking some Lipschitz concentration of measure arguments.
0:11:01But once you believe that, it turns out it is easy to optimize over the weights, and what happens is we get the result I mentioned earlier. And what's interesting about the result is that it depends only on the probabilities p_i; it doesn't depend on the set sizes.
0:11:19A few other things that I can mention. Again, this is all approximate, so if you believe these approximations, then I can look at this quantity r, which is the dimensionality reduction in recovering my signal: the ambient dimension minus the number of measurements I need for recovery. For regular L1 minimization, in the setting that we have with these probabilities and set sizes, this expression is how much dimensionality reduction you get; for the weighted L1, when you choose these weights, this other expression is what you get, and by a convexity argument it is clear that you get more dimensionality reduction with the weighted version.
0:12:09So this is just to give some intuition; it's not claimed to be optimal. You can look at this ratio, that is, how much improvement you get in dimensionality reduction; these curves correspond to different settings of the parameters, and the gain depends on the scenario.
0:12:35Here's another plot I want to show. As I mentioned, in an earlier paper we actually used this Grassmann angle machinery to compute the weights exactly, so let's look at a scenario here. This is a scenario where we have two sets of equal size; in one set a certain fraction of the entries is nonzero, and in the other a larger fraction. We need to choose w_1 and w_2, and what I vary here, omega, is the ratio of w_2 to w_1. For any omega you choose, it's a different weighted L1 optimization, and you might ask what the minimum number of measurements is that you need to recover the signal; that's what the y-axis shows. So, for example, if I do plain L1 optimization, that is, the two weights equal, I need roughly 0.55n measurements, whereas if I use the optimal omega, which you get from computing this curve (each point on the curve requires a Grassmann angle computation), you get something noticeably lower, a real improvement.
0:13:47This is another example. If you use the formula that we derive here, the ratio of weights you should use comes out, in this example, to roughly 1.5, which is somewhat off from the optimal ratio of between two and three; but in terms of how many measurements you lose, it is only about half a percent, so it is actually quite close to optimal. And the analysis, as I said, extends beyond two sets, which is what we look at next.
0:14:41Here again are some simulation results. This is the signal; it's a different model I assume here: the probability of each entry of the support being nonzero varies across the signal according to the profile shown. If you do plain L1 optimization on a signal with this structure, this is the recovery curve you get.
0:15:15However, if I break the signal into two chunks and use the weighting that we derived, with just one weight for each of the two chunks, then I get the other curve, which breaks down at a noticeably higher sparsity. So this shows that even though we're not using the full prior, even going from one set to two already gives a gain; you can keep refining the partition, but the gains get less and less.
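A minimal end-to-end sketch of the kind of comparison described here, reusing the block_sparse_signal and weighted_l1_recover helpers sketched earlier; the dimensions, priors, and Gaussian measurement matrix are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

n, m = 400, 200
sets = [list(range(0, 200)), list(range(200, 400))]
p = [0.05, 0.35]                                  # assumed nonzero fractions per block

rng = np.random.default_rng(0)
A = rng.standard_normal((m, n)) / np.sqrt(m)      # Gaussian measurements
x = block_sparse_signal([200, 200], p, seed=1)
y = A @ x

for label, w in [("plain L1", [1.0, 1.0]),
                 ("weighted L1 (w = 1 - p)", [1 - pi for pi in p])]:
    x_hat = weighted_l1_recover(A, y, sets, w)
    err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
    print(f"{label}: relative error {err:.2e}")
```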
0:15:50I'm probably short on time. I could say something about rank minimization, but essentially everything that I've mentioned applies to nuclear norm minimization as well. For those of you who aren't familiar with it, nuclear norm minimization is the analogous convex relaxation for recovering a low-rank matrix subject to linear constraints. You can do a similar thing there if you have prior information: you're looking for a low-rank matrix, and you have an idea of which subspaces it might lie in and what the probabilities are, so you can look at a weighted nuclear norm minimization problem.
0:16:39There's an analysis that is very similar, a little more involved, but you get an almost identical result in terms of how to come up with the weights. As I said, the details are in the paper, and again the message is very simple: for the weighted problem, this is the weight rule that comes out, and as before it doesn't depend on how many sets we have or what the set sizes are.
0:17:14So that's what we have.
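To illustrate the kind of formulation being referred to, here is a hedged sketch of weighted nuclear norm minimization with prior column/row subspace information. The particular way of encoding the prior, re-scaling components outside assumed subspaces U and V, is one formulation found in the literature on subspace-weighted matrix recovery; it is not necessarily the exact formulation in the paper.

```python
import numpy as np
import cvxpy as cp

def weighted_nucnorm_recover(A_list, y, U, V, w, shape):
    """Solve  min_X || Q_U X Q_V ||_*   s.t.   <A_k, X> = y_k  for all k.

    Q_U = U U^T + w (I - U U^T), with w >= 1, penalizes components outside the
    assumed column subspace U more heavily (similarly Q_V for the row subspace V).
    A_list is a list of measurement matrices A_k; U and V have orthonormal columns.
    """
    d1, d2 = shape
    X = cp.Variable((d1, d2))
    Q_U = U @ U.T + w * (np.eye(d1) - U @ U.T)
    Q_V = V @ V.T + w * (np.eye(d2) - V @ V.T)
    constraints = [cp.trace(A_k.T @ X) == yk for A_k, yk in zip(A_list, y)]
    cp.Problem(cp.Minimize(cp.normNuc(Q_U @ X @ Q_V)), constraints).solve()
    return X.value
```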
0:17:34[Answering an audience question] I don't think so. If you estimate where the boundaries are a little off, it means you'll be a little off in terms of which set each entry is assigned to, but it doesn't seem to make a whole lot of difference. You can already see here that it's not that sensitive: if the p_i's are off a little bit, your relative weights might change, but as long as you're not too close to the extremes, the performance doesn't change much. I haven't done the analysis, but that's what we see, and there are more plots than the ones I showed today.
0:18:37[Answering another audience question] So one thing we have looked at, where this kind of prior arises, is the following. Suppose you have a signal whose sparsity is beyond, say, the traditional L1 recovery threshold. You can still do L1 optimization; you'll get something that's not exact. But if you look at the resulting estimate, and one can make this rigorous, it turns out that where the larger coefficients are, there is a better chance that those are actually nonzero entries, and where you have smaller coefficients, there is a better chance they are zero. So you can use that as your prior to set the weights. And there are other situations like this: again, when you have time variations, or something like a time series that is sparse in time, you can always use your prior estimate to give you the weights.
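A minimal sketch of the two-step idea described in this answer, reusing weighted_l1_recover from the earlier sketch: run plain L1, split the coordinates into "large" and "small" groups by magnitude, assign each group an assumed nonzero probability, and re-solve with the w = 1 - p rule. The threshold and the probabilities are illustrative assumptions.

```python
def two_step_reweighted_l1(A, y, tau=1e-3, p_large=0.9, p_small=0.1):
    """Plain L1 pass, then a weighted pass that uses the first estimate as a prior."""
    n = A.shape[1]
    everything = list(range(n))
    x0 = weighted_l1_recover(A, y, [everything], [1.0])    # step 1: ordinary L1
    large = [i for i in everything if abs(x0[i]) > tau]     # likely true nonzeros
    small = [i for i in everything if abs(x0[i]) <= tau]    # likely zeros
    sets, probs = [], []
    for idx, pi in ((large, p_large), (small, p_small)):
        if idx:                                             # skip empty groups
            sets.append(idx)
            probs.append(pi)
    w = [1.0 - pi for pi in probs]                          # the w_i = 1 - p_i rule
    return weighted_l1_recover(A, y, sets, w)               # step 2: weighted L1
```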