0:00:19 Okay, welcome everybody. My name is Geert Leus, I'm from Delft University of Technology, and I will present some
0:00:25 joint work with Hao Zhu and Georgios Giannakis from the University of Minnesota.
0:00:30 So I will talk about a blend, basically, of total least squares and sparse reconstruction.
0:00:36 I think you've seen so many
0:00:38 sessions here at ICASSP on compressive sampling, so
0:00:42 we'll give a little twist to that in this talk.
0:00:46 As some of you might know, total least squares is a method where
0:00:52 you basically try to solve a set of linear equations,
0:00:55 but where you have perturbations both in
0:00:59 the data vector, which you normally also have in least squares, but also in
0:01:03 the system matrix, or the measurement matrix, or
0:01:06 whatever you want to call it.
0:01:08 This has some applications in statistics,
0:01:12 for instance in the errors-in-variables model, but
0:01:14 it also has a connection to linear algebra, because the solution
0:01:18 is actually based on computing a singular value decomposition.
0:01:23 People have also extended this total least squares principle to the case where you have some more prior knowledge
0:01:29 on these perturbations on the data vector and the system matrix.
0:01:33 This could be statistical
0:01:35 knowledge, but it could also be knowledge about the structure.
0:01:39 On the other hand, you have this large body of work, of course, on compressive sampling, where you for
0:01:43 instance use L1-norm minimization to
0:01:46 effect sparsity. I will not
0:01:48 go into details there;
0:01:50 there have been enough talks on that.
0:01:53 So what we do here is basically try to solve the compressive sensing problem, but in a case where you have
0:01:59 perturbations also in the data matrix. It somehow
0:02:02 compares to the
0:02:04 problem of the second speaker, but
0:02:06 this time we use some kind of statistical uncertainty on the system matrix instead of a
0:02:13 worst-case scenario.
0:02:16 These perturbations appear in compressive sensing in a
0:02:22 number of applications. For instance, you could have
0:02:25 non-idealities in the implementation of the compression matrix,
0:02:29 because often this is something that happens in hardware, and you often don't know exactly
0:02:33 what's going on there, so it's a realistic assumption that you
0:02:36 take into account some perturbations there.
0:02:39 But it also has a lot of applications in
0:02:44 compressive sensing techniques where
0:02:47 you basically try to do some sensing
0:02:49 and you use some kind of grid-based approach to sense
0:02:53 a location, or a direction of arrival, or a frequency,
0:02:56 and the targets might not be exactly on the grid, so
0:03:00 you can model that error
0:03:03 using a perturbation on the basis matrix.
0:03:08 There are some analyses of the performance of compressive sensing
0:03:13 including such a perturbation in the measurement matrix,
0:03:17 but those are usually performance analyses of standard sparse reconstruction methods like lasso or
0:03:23 basis pursuit or
0:03:25 those types of methods. So people have looked at what happens to the RIP, for instance, in
0:03:28 that case,
0:03:30 and how sensitive these sparse approximants are to that basis mismatch.
0:03:34 Here we are basically going to look at how to actually take those perturbations into account.
0:03:41 We also look at the statistical optimality of that problem and look at
0:03:47 global and local optima, and some of these results have already appeared in
0:03:51 a Transactions on Signal Processing paper recently.
0:03:54 Today we will focus more specifically on
0:03:58 the case where you have prior knowledge on the perturbations, such as correlations and structure,
0:04:04 and that leads then to weighted and structured sparse total least squares.
0:04:10 Here you see an outline of the problem.
0:04:15 You have a simple underdetermined system of equations y = Ax,
0:04:19 so we have fewer equations than unknowns, and we assume that the
0:04:24 unknown vector is sparse. Usually
0:04:26 this problem is solved
0:04:29 using some kind of
0:04:31 least squares, for instance regularized with an L1 norm on x, so that leads to some
0:04:35 kind of lasso problem,
0:04:37 where you try to minimize the squared two-norm of the residual,
0:04:42 which is regularized here with this L1 norm on x.
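[A minimal sketch of the lasso problem described here, solved with ISTA (iterative soft thresholding); this is an illustrative solver, not the one used in the paper.]

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator, the proximal map of the L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_lasso(A, y, lam, n_iter=500, x0=None):
    """Minimize 0.5*||y - A x||_2^2 + lam*||x||_1 by iterative
    soft thresholding (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1]) if x0 is None else x0.copy()
    for _ in range(n_iter):
        x = soft(x + A.T @ (y - A @ x) / L, lam / L)
    return x
```

Any off-the-shelf lasso solver would do equally well; ISTA is just compact enough to show the whole objective in a few lines.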
0:04:48 Now this only accounts, basically, for errors in the data vector y,
0:04:53 but when you also have errors in the A matrix, in that case
0:04:57 you have to think about other types of solvers,
0:05:02 and one of those is given by total least squares. So in a way it's some kind of
0:05:07 way to compute least squares in case you want to include some robustness against perturbations on this
0:05:13 system matrix.
0:05:15 So in that case you end up with
0:05:17 a problem like this, where you
0:05:19 basically try to minimize the squared Frobenius norm
0:05:24 of the total perturbations, both on the system matrix
0:05:28 A and the data vector y,
0:05:30 and this is again regularized now with
0:05:32 an L1-norm constraint on x.
0:05:35 And the constraint here,
0:05:38 the other constraint that you add, is basically the fact that,
0:05:43 if you include the perturbations, you have an equality between the perturbed data vector and
0:05:48 the perturbed data matrix times the unknown vector x.
0:05:54 So normally, without this sparsity constraint, you
0:06:00 basically have classical total least squares, in case you would have an overdetermined system, and then the
0:06:04 solution is actually
0:06:06 given by a singular value decomposition
0:06:08 that you have to carry out on the composite matrix of A
0:06:13 concatenated with the data vector y.
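[The classical, unregularized TLS solution via the SVD of the composite matrix, mentioned here, can be sketched as follows; a toy illustration, not the speaker's code.]

```python
import numpy as np

def tls(A, y):
    """Classical total least squares for an overdetermined A x ≈ y:
    take the SVD of the composite matrix [A y] and read the solution
    off the right singular vector of the smallest singular value."""
    n = A.shape[1]
    C = np.column_stack([A, y])
    _, _, Vt = np.linalg.svd(C)
    v = Vt[-1]                      # right singular vector for sigma_min
    return -v[:n] / v[n]            # rescale so the last entry is -1
```

With noise-free data the smallest singular value is zero and this reduces to the exact solution of A x = y.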
0:06:16 But here we also have this
0:06:20 extra L1-norm constraint on the x.
0:06:27 So basically this problem
0:06:30 we can solve in case we have no
0:06:32 further information on the structure of these perturbations;
0:06:36 that would be the unstructured case. But if you have,
0:06:40 for instance, statistical knowledge about these perturbations, you could think about
0:06:44 a weighted version of that problem,
0:06:46 weighted by the inverse of the covariance matrix of these perturbations.
0:06:50 Or you could also look at the structured version, where you take the structure into
0:06:54 account on
0:06:55 this composite matrix of A
0:06:58 concatenated with y.
0:07:00 And this happens in various applications, for instance in
0:07:05 deconvolution, or system identification, or linear prediction, where this
0:07:09 A matrix often will be Toeplitz or Hankel.
0:07:12 But you could also have circulant structures or Vandermonde structures,
0:07:16 for instance in harmonic retrieval.
0:07:18 And also when you have all these grid-based sensing approaches,
0:07:22 in that case this A is basically
0:07:25 constructed from these complex exponentials, so you also have this Vandermonde structure
0:07:30 in this type of problem.
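[As an illustration of the kind of structure meant here, a Toeplitz matrix is a linear function of the vector of its diagonal values; the indexing convention below is mine, for illustration only.]

```python
import numpy as np

def toeplitz_from_params(p, m, n):
    """Build an m-by-n Toeplitz matrix from the vector p of its m+n-1
    diagonal values; the map p -> T(p) is linear, which is exactly the
    affine-structure assumption used in structured TLS."""
    T = np.empty((m, n))
    for i in range(m):
        for j in range(n):
            T[i, j] = p[i - j + n - 1]   # constant along each diagonal
    return T
```

The same idea applies to Hankel, circulant, or Vandermonde-like structures: the matrix is determined by a small parameter vector, so one can perturb the parameters rather than the full matrix.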
0:07:32 And mathematically, the structure is basically modeled as some kind of function of this
0:07:41 parameter vector p.
0:07:44 So all the structure is modeled in this way: there's a unique mapping between the parameter vector
0:07:49 and this composite matrix, which is called S here.
0:07:53 And the problem that we're solving here is basically a weighted and structured sparse total least squares problem,
0:08:00 where we're going to minimize a squared weighted norm
0:08:04 on epsilon, which is now the perturbation not on the composite matrix S but on the
0:08:09 parameter vector. So you basically minimize
0:08:12 a norm on the perturbation of the parameter vector, which is the standard approach in
0:08:16 weighted and structured total least squares,
0:08:19 and now we again add the sparsity constraint
0:08:23 to that cost function,
0:08:26 again subject to this equality here, which is basically
0:08:31 the same equality as we had
0:08:33 on this slide, but now it's
0:08:35 given as a function of the parameter vector p and its
0:08:39 distortion. So this might not necessarily be a linear equation now;
0:08:44 that depends on how you model the structure.
0:08:48 So that's why we introduce an assumption,
0:08:52 where I will actually start with the second part, where we model S as a linear function, or an
0:08:58 affine function if you want,
0:09:00 of p. So basically what we assume is that
0:09:03 S(p) can be expressed as a linear function of p, so this constraint can be
0:09:08 transformed into a linear
0:09:10 constraint.
0:09:13 The first part is
0:09:14 more of a notational
0:09:17 assumption that makes things a little bit easier: we make the structure in S separable, so
0:09:22 we split the
0:09:23 parameter vector into a part of the parameters that's related to the system matrix A and a part that's
0:09:28 related to the data vector
0:09:30 y,
0:09:32 so that the two perturbations on those vectors can be,
0:09:35 for instance, whitened separately.
0:09:37 And after the whitening that happens here, you basically get the following problem.
0:09:41 So this epsilon_A is the perturbation on the parameter vector related to A,
0:09:45 and epsilon_y is the one
0:09:49 related to the data vector y, but they're both whitened now
0:09:52 according to their inverse covariance matrices.
0:09:55 And again, here there is now this linear expression, which is due to the fact that we
0:10:00 assume the linear form
0:10:01 for this structure in the matrix,
0:10:05 in the perturbation.
0:10:08 So this is what we have to solve.
0:10:09 And of course you could always
0:10:13 replace epsilon_y by
0:10:15 the solution that's given by the constraint and make it into an unconstrained problem. For instance,
0:10:21 you could replace epsilon_y by the solution that's given
0:10:24 here, so then you get an unconstrained problem, but it's
0:10:27 a non-convex
0:10:29 problem that you have to solve, because you have here both
0:10:32 x and the unknown perturbation on the
0:10:34 system matrix.
0:10:38 Before we start looking into solutions,
0:10:41 there is some optimality related to that problem.
0:10:46 You can interpret the solution of this
0:10:48 problem, in both x and epsilon, as a
0:10:51 maximum a posteriori optimal solution
0:10:53 under certain conditions.
0:10:55 The conditions are that you need Gaussianity,
0:10:59 that there's independence between the variables, and also that the parameter vector
0:11:06 of A
0:11:07 has an uninformative, or uniform, distribution,
0:11:10 and that the unknown vector x is Laplacian distributed. So
0:11:15 under those
0:11:16 circumstances you can show that
0:11:19 the solution to this problem gives you maximum a posteriori optimality.
0:11:25 So this problem has some statistical meaning.
0:11:33 So to solve this, we think first about an alternating descent method, where you basically
0:11:38 solve iteratively, alternating between epsilon_A, so the perturbation on the system matrix,
0:11:45 and x. So you could for instance fix epsilon_A,
0:11:48 so I could fix it here
0:11:50 and fix it here, and in that case it becomes like a
0:11:52 classical,
0:11:55 but slightly altered, lasso-like problem.
0:11:58 So basically the solution can be found by any algorithm that
0:12:02 has been proposed to solve
0:12:07 sparse reconstruction problems using the least squares cost function.
0:12:11 And then once that's solved, you can
0:12:13 use that solution to
0:12:17 find the result for the perturbation on the system matrix.
0:12:21 If x is given, everything becomes an unconstrained quadratic program,
0:12:26 so then for epsilon you can find the solution
0:12:28 in closed form.
0:12:30 And if you start with a perturbation that's equal to zero, you basically start
0:12:34 with the solution that's given by the classical lasso problem,
0:12:38 and you can show that you always
0:12:40 improve your cost function, and that you converge at least to
0:12:44 a stationary point.
0:12:48 For this cost function you can show this way that you will always improve upon the classical
0:12:53 solution, the classical lasso solution.
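[The alternating descent just described might be sketched like this, for the unweighted, unstructured case; the names and the inner ISTA solver are illustrative assumptions, not the authors' code. Each half-step cannot increase the cost, which is the monotone-improvement property mentioned in the talk.]

```python
import numpy as np

def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, y, lam, x0, n_iter=300):
    """Inner lasso solver for 0.5*||y - A x||^2 + lam*||x||_1, warm-started at x0."""
    L = np.linalg.norm(A, 2) ** 2
    x = x0.copy()
    for _ in range(n_iter):
        x = soft(x + A.T @ (y - A @ x) / L, lam / L)
    return x

def sparse_tls(A, y, lam, n_outer=10):
    """Alternating descent for min_{E,x} ||E||_F^2 + ||y - (A+E)x||^2 + 2*lam*||x||_1."""
    x = ista(A, y, lam, np.zeros(A.shape[1]))   # init with E = 0: classical lasso
    E = np.zeros_like(A)
    for _ in range(n_outer):
        r = y - A @ x
        E = np.outer(r, x) / (1.0 + x @ x)      # closed-form perturbation step
        x = ista(A + E, y, lam, x)              # lasso step with perturbed matrix
    return x, E
```

The closed form for E follows from setting the gradient of ||E||_F^2 + ||Ex - r||^2 to zero, with r = y - Ax; stopping after any number of outer iterations still leaves you no worse than the lasso initialization.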
0:12:57 Of course, to solve this lasso problem you can use your favourite solver,
0:13:01 but what you could also do is use coordinate descent, a coordinate descent method,
0:13:05 to solve
0:13:06 the lasso,
0:13:08 which basically means that
0:13:10 you fix all the entries except for one in your x vector, and you solve for that one
0:13:15 separately. Then it becomes like a scalar lasso,
0:13:18 which gives you a closed-form solution
0:13:20 by means of soft thresholding.
0:13:22 And you can
0:13:23 do all those iterations together, so you can basically alternate between
0:13:27 epsilon_A and then every entry of x,
0:13:30 and then go back to epsilon_A, and then solve for every entry of x separately,
0:13:35 and so on. So that gives you a global
0:13:37 coordinate descent method that can also be shown to converge to at least a stationary point.
0:13:47 Of course,
0:13:48 this is not necessarily the global optimum, but at least you know that you improve upon the initial
0:13:53 solution, which is the lasso solution.
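[The scalar soft-thresholding update just described could look like this; an illustrative cyclic version, not the authors' implementation.]

```python
import numpy as np

def cd_lasso(A, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for 0.5*||y - A x||^2 + lam*||x||_1.
    Fixing all entries but x_j gives a scalar lasso whose closed-form
    solution is a soft threshold."""
    x = np.zeros(A.shape[1])
    r = y - A @ x
    for _ in range(n_sweeps):
        for j in range(A.shape[1]):
            aj = A[:, j]
            r = r + aj * x[j]                                 # residual without coordinate j
            zj = aj @ r
            x[j] = np.sign(zj) * max(abs(zj) - lam, 0.0) / (aj @ aj)
            r = r - aj * x[j]                                 # put coordinate j back
    return x
```

Interleaving these scalar updates with the closed-form epsilon_A update gives the global coordinate descent scheme mentioned in the talk.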
0:13:57 So here are some numerical comparisons. We assume
0:14:01 that we have a twenty-by-forty A matrix, so the compression here is
0:14:05 fifty percent, you could say.
0:14:08 There's some Toeplitz structure in the A matrix, and we assume also different variances on
0:14:12 the perturbations on A and y.
0:14:15 Note that the perturbation on A
0:14:20 also has a Toeplitz structure,
0:14:22 and the signal vector x here is generated with ten nonzero entries.
0:14:26 What is shown here is the L0 error versus the parameter lambda,
0:14:31 and the L1 error versus the parameter lambda, where
0:14:34 this lambda basically gives you a trade-off between
0:14:37 solving the total least squares problem and
0:14:41 the sparsity of the solution. So the bigger the lambda, the more
0:14:44 weight you give to a sparse solution.
0:14:49 And you see the best solution here... so the L0 error, this
0:14:54 is basically related to support recovery. It's the percentage of
0:14:58 entries where the supports of the true solution and the estimated solution are
0:15:03 not the same,
0:15:04 so this tells you something about support recovery.
0:15:07 And there you see that if you take everything into account, so the blue curve here,
0:15:11 the weighted structured sparse total least squares, you get basically the best
0:15:15 sparsity recovery.
0:15:16 If you just take the weights,
0:15:19 the correlations, into account,
0:15:21 or just the structure, so these are the red curves and the black curves,
0:15:24 then you
0:15:26 get slightly higher
0:15:29 L0 errors.
0:15:31 And if you
0:15:33 don't take any weight or structure into account, you are again a little bit
0:15:37 worse, and lasso gives you basically the worst L0 error.
0:15:42 For the L1 error, so this is basically the L1 norm of the error, the
0:15:48 performance is closer together, but of course support recovery is
0:15:53 the most important in many of these applications.
0:15:58 Like I told you before, this approach is very useful in cases where you do sensing
0:16:03 and you use some kind of grid-based approach.
0:16:06 So that for instance can be used in direction-of-arrival estimation,
0:16:11 where you basically
0:16:14 divide the whole angular space into different grid points, into an angle grid,
0:16:20 where for instance every two degrees you pick a grid point.
0:16:24 And in that case you could express your received vector y as a linear combination of
0:16:28 array response vectors.
0:16:30 So basically this first vector here, a(theta_1), tells you
0:16:34 how the signal would be received if it
0:16:38 comes in at an angle of arrival of
0:16:41 theta_1.
0:16:42 So you get a linear combination of all these
0:16:45 array response vectors on the different grid points,
0:16:47 and this x contains the combining weights.
0:16:51 But of course the combining weights will be sparse, because only where you have a target will you have a
0:16:55 combining weight.
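[Building the grid of array response vectors described here might look like this for a uniform linear array; half-wavelength spacing and the function names are my assumptions, purely for illustration.]

```python
import numpy as np

def steering_matrix(n_ant, grid_deg, spacing=0.5):
    """One column per grid angle: a(theta)_k = exp(j*2*pi*spacing*k*sin(theta))
    for antenna index k of a uniform linear array. A target exactly at grid
    angle theta_i makes y = A x with x nonzero only at index i."""
    k = np.arange(n_ant)[:, None]                               # antenna index
    theta = np.deg2rad(np.asarray(grid_deg, dtype=float))[None, :]
    return np.exp(1j * 2 * np.pi * spacing * k * np.sin(theta))

# an angle grid with a point every two degrees, as in the talk's example
grid = np.arange(-90, 90, 2)
A = steering_matrix(8, grid)
```

A target between two grid angles is not represented by any single column, which is exactly the grid mismatch the perturbation model addresses.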
0:16:57 Of course, whenever you have
0:17:00 targets that are in between the grid points,
0:17:03 this equality will not be exactly true, and there's some kind of perturbation
0:17:07 on the
0:17:09 grid.
0:17:10 So you could say that the true
0:17:13 exponent, the one
0:17:15 of your source,
0:17:17 could then be modelled as
0:17:21 the exponent at your grid point
0:17:24 plus some linear correction.
0:17:27 Because, like I said before, we want to have the perturbations in a linear form,
0:17:32 so we want
0:17:33 to have an affine expression for the perturbations. So
0:17:36 that means that in this case we need some kind of
0:17:38 approximation, because there is
0:17:40 a lot of structure in
0:17:42 these perturbations but it's not linear, so we approximate it here
0:17:47 by an affine function of the parameter vector.
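[The first-order affine approximation being described can be written, for a grid point θ_g and a small off-grid offset δ, as (my notation, not the slides'):]

```latex
a(\theta_g + \delta) \;\approx\; a(\theta_g)
  + \delta \,\frac{\partial a(\theta)}{\partial \theta}\Big|_{\theta=\theta_g}
```

with the offsets δ collected into the perturbation parameter vector, so that the perturbed steering matrix is affine in those parameters, as the assumption requires.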
0:17:51 I'm not going to go into the details here.
0:17:55 And so that allows you then to
0:17:58 get a better estimate, because next to solving for x, you also allow these grid points basically to
0:18:03 shift to the true solutions.
0:18:04 So if you have
0:18:06 a source that is somewhere in between two grid points,
0:18:09 because you allow for perturbations on this A matrix, a grid point might
0:18:13 shift to the true solution.
0:18:16 So you get some kind of super-resolution effect for free with this approach.
0:18:20 Other approaches usually start from a rough grid and then refine the grid
0:18:24 in those locations where you have a target;
0:18:27 here you get it basically in one shot.
0:18:31 Here is an example where you have a number of antennas and ninety
0:18:35 grid points,
0:18:36 so every two degrees you have a grid point, and you have a source at one degree and one at minus
0:18:40 nine degrees,
0:18:41 so they are exactly in between two grid points.
0:18:45 And then you see that the classical lasso basically gives you four nonzeros,
0:18:49 basically the grid points around
0:18:52 the sources.
0:18:54 You could say, okay, we can interpolate those and then we get the solution, but
0:18:58 you can only do that if you know that you have only two sources. Of course, if you don't know
0:19:02 the number of sources, you could think that
0:19:04 there are four sources now in this problem.
0:19:07 While the weighted structured
0:19:09 sparse total least squares
0:19:11 gives you basically two peaks, in the red locations,
0:19:16 which correspond to these black arrows where the true sources are located. So the grid basically shifts to the
0:19:21 right position.
0:19:23 You see here also another small peak, but
0:19:25 it is some twenty dB below the
0:19:30 first one,
0:19:33 so you basically
0:19:34 can also do some kind of number-of-sources recovery using this approach.
0:19:40 So I think
0:19:41 that brings me to the conclusion. We've proposed a weighted and structured sparse
0:19:47 total least squares problem,
0:19:48 which is motivated,
0:19:50 first of all, by the fact that you have non-idealities in the
0:19:53 compression matrix, but it's also motivated by a lot of these sensing applications.
0:19:58 So we can account for correlations and structure in these perturbations,
0:20:03 and we looked at the MAP optimality of this problem,
0:20:09 and we looked at reduced-complexity alternating descent and coordinate descent solutions.
0:20:15 Ongoing and future research consists of recursive and robust implementations of this method.
0:20:21 We also try to see whether the SVD can be used in some of these problems:
0:20:25 since solving an SVD basically also boils down to an iterative method,
0:20:30 you wonder whether in those iterations you could also include
0:20:33 sparsity and still use
0:20:37 SVD-based types of methods to solve sparse total least squares.
0:20:43 So that concludes my presentation.
0:20:54 Any questions?
0:21:01 [Audience] I was wondering:
0:21:04 how much more is the complexity of your solution?
0:21:09 I mean,
0:21:09 what do you use for a large problem, say a
0:21:14 microphone array or something like that?
0:21:16 [Speaker] Well, I can say that
0:21:17 the complexity is basically determined by how you solve the initial
0:21:22 sparse reconstruction problem,
0:21:24 and you do that
0:21:26 iteratively, so you do that maybe
0:21:28 five times.
0:21:29 I don't know exactly how many iterations we used here, but
0:21:33 in general we don't need too many. I mean, you can stop whenever you want, right? After one iteration you
0:21:37 know that
0:21:38 you already improve upon
0:21:40 the classical sparse reconstruction method.
0:21:43 So you could say
0:21:46 we have twice the complexity,
0:21:48 because solving for the perturbation is a closed-form expression, so that's not
0:21:52 the biggest complexity.
0:21:54 So it depends basically on the solver that you use for this sparse reconstruction.
0:22:04 [Audience] Just one comment, in addition to the question of
0:22:06 how complex this is:
0:22:08 it can be sometimes less complex than
0:22:11 ordinary TLS,
0:22:14 because you have to remember that TLS
0:22:16 entails an SVD, really,
0:22:19 so it can be more
0:22:20 or less complex, however you
0:22:23 look at it.
0:22:25 What is worse,
0:22:26 I should mention, is that
0:22:29 when people have to regularize
0:22:31 the TLS,
0:22:34 then [inaudible]
0:22:37 so can you then just solve it with
0:22:39 some sort of iterative
0:22:41 optimization?
0:22:46 [Speaker] Yeah, the regularized TLS, with a quadratic constraint.
0:22:53 [Audience] Okay, can you
0:22:54 make the framework work
0:22:58 with,
0:23:00 instead of the L1,
0:23:01 a different norm?
0:23:04 [Speaker] Yeah, that's possible, because in
0:23:08 these iterations, at the start
0:23:10 of the iterations you basically set your perturbation to zero, and then
0:23:14 instead of a lasso problem you have another type of sparsity-regularized least squares
0:23:19 problem that you could solve in that step.
0:23:22 And whatever x you fix
0:23:24 as the solution, in the second step there's always a closed-form expression for the perturbations, so
0:23:29 you could change this
0:23:30 L1
0:23:32 norm.
0:23:32 [Audience] Thank you.
0:23:36 [Audience] Coming back to high-resolution localization and estimation techniques:
0:23:42 what's the
0:23:43 SNR threshold
0:23:45 when you use this kind of compressive-sensing-inspired technique?
0:23:49 [Speaker] Yeah, I don't know; we should
0:23:53 test it on more applications. So this is more, let's say, the theoretical framework,
0:23:57 and now we have to check this on many different applications.
0:24:01 The thing is, you can use here a kind of rough grid as an initial point, and the
0:24:05 question is how rough can you make your grid, how much can you correct for.
0:24:10 But that's something we didn't do analytically yet.
0:24:16 Yeah, we have simulations, but no theoretical analysis yet.
0:24:35 [Speaker] Well, for your question: we also have a spectrum sensing application, but
0:24:40 it's always compared to the standard sparse reconstruction methods.
0:24:48 So, is that the comparison you would like?
0:24:50 I mean...
0:24:53 Oh, okay, yeah, and actually,
0:24:59 this one I didn't show, but
0:25:01 this is when you do many
0:25:05 different Monte Carlo simulations. You see what happens if you
0:25:09 compare lasso, which is the blue curve;
0:25:12 this is what I showed before, where you have the four peaks,
0:25:14 right? You could say, okay, you interpolate, and then you get this blue dashed line.
0:25:19 But you see that even then,
0:25:20 the full weighted sparse total least squares still does better than the interpolation,
0:25:25 and if we interpolate, we don't gain anything, because we already have the good solution.
0:25:30 So this is again for the direction-of-arrival estimation.
0:25:33 So even with interpolation, although you need to know the number of sources for that interpolation,
0:25:37 even then we have better performance.
0:25:44 Okay, thank you.