"'kay" but everybody uh my name hear close and from dot university of technology and uh i will present some joint work with uh house you and joe just you're not "'cause" from the university of minnesota so will talk about a a a bland basically of uh of totally least squares and and sparse reconstruction so i think you use so many a sessions here at a i icassp on on compressive sampling so people give a a little twist uh to that in this in this stock uh so as some of you might known uh might know i totally squares is a is a method where you try to basically sold a a set of linear equations but you hear uh have perturbations both in the data factor which you normally also have a least squares but also in this system matrix or the measurement matrix or whatever you wanna wanna call it and this has some uh applications in in statistic for instance in the errors and variables model but it also has a connection to linear algebra because the solution it's actually based on a on computing a single value uh decomposition a people have also extended it is totally squares principle to a case where you have some more prior knowledge on these perturbations on the on the data vector and this just a matrix this could be uh statistical uh knowledge but could also be knowledge on the on the structure and on the other and you have this large body of work of course on compressive sampling where you for instance use the L one norm minimization to to effect sparsity so well not uh go into details i there talks enough off uh on that so what we do here is we basically tried to solve compressive sensing problem but in a case where you have perturbations also in the data matrix or somehow a kind of compares to the to the problem of the second speaker but but this time we we use some kind of statistical uncertainty on the on the system matrix instead of a a worst case a scenario and uh these perturbations they appear in in in compressive sensing uh through a a yeah in a in the number of of applications for instance you could have not i L at is in the implementation of a a of a compression matrix because off and this is something that should happen in hardware and you off do know exactly what's going on there so it it's a realistic assumption that you take into account some perturbations there but it also has a lot of applications in in uh and um compressive sensing uh uh techniques where you basically try to do some sensing and you use some kind of a great based approach to uh sense uh a location or direction of arrival or frequency and the targets might not be exactly on the grid so you also have you can model that's that's uh uh that our using a perturbation on the on the basis a matrix uh there are some uh uh a is on on the the performance of of compressive sensing including such a perturbation in the in the measurement may but uh those are usually performance analysis of of standard sparse reconstruction methods like a a so or uh or basis pursuit or a uh those type of method that so people have looked at what happens to the R P for instance in that case and how sensitive these sparse approximant are to that two basis mismatch here we basically gonna look at a and how but and to take those perturbations into account um uh uh we also look at statistical optimality of of that problem and look at uh global and and local optima and uh some of these results have i have already appeared in in a transaction on signal processing paper uh recently today we will focus uh uh more specifically on the the case where you have 
Here you see an outline of the problem. You have a simple underdetermined system of equations, y = A x, so we have fewer equations than unknowns, and we assume that the unknown vector x is sparse. Usually this problem is solved using some kind of least squares regularized with an L1 norm on x, which leads to the lasso problem: you try to minimize the squared 2-norm of the residual e, regularized with this L1 norm on x. That, however, only accounts for errors in the data vector y. When you also have errors in the A matrix, you have to think about other types of solutions, and one of those is given by total least squares; in a way, it is a way to compute a least squares solution that includes some robustness against perturbations on the system matrix. In that case you end up with a problem like this, where you try to minimize the squared Frobenius norm of the total perturbations, both on the system matrix A and on the data vector y, again regularized with an L1-norm constraint on x. The other constraint that you add is basically the fact that, once you include the errors, there should be an equality between the perturbed data vector and the perturbed data matrix times the unknown vector x. Without this sparsity constraint you basically have classical total least squares, in case you would have an overdetermined system, and then the solution is given by a singular value decomposition carried out on the composite matrix of A concatenated with the data vector y. But here we also have this extra L1-norm constraint on x.

We can solve this problem in case we have no further information on the structure of these perturbations, but if you do have, for instance, statistical knowledge about the perturbations, you could think about a weighted version of the problem, weighted by the inverse of the covariance matrix of these perturbations, or you could look at a structured version where you take the structure of this composite matrix of A concatenated with y into account. This happens in various applications, for instance in deconvolution, system identification, or linear prediction, where the A matrix will often be Toeplitz or Hankel, but you could also have circulant structures, or Vandermonde structures, for instance in harmonic retrieval; and also in these grid-based sensing approaches, where A is basically constructed from complex exponentials, you have this Vandermonde structure. Mathematically, the structure is modeled by writing the composite matrix as a function of a parameter vector p, so there is a unique mapping between the parameter vector and this composite matrix, which is also denoted S here.
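In symbols, the two formulations described above read roughly as follows; this is my own hedged transcription of what is said, since the slides themselves are not in the transcript, so the exact notation may differ.

```latex
% Lasso: a perturbation e is allowed only on the data vector y
\min_{x}\ \|y - A x\|_2^2 + \lambda \|x\|_1

% Sparse total least squares: perturbations on both the system matrix A and the data vector y
\min_{x,\,E,\,e}\ \big\|[\,E \;\; e\,]\big\|_F^2 + \lambda \|x\|_1
\quad \text{subject to} \quad y + e = (A + E)\,x
```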
The problem that we are solving here is then a weighted and structured sparse total least squares problem, where we minimize a squared weighted norm, not on the perturbation of the composite matrix S, but on the perturbation of the parameter vector. So you basically minimize a norm on the perturbation of the parameter vector, which is the standard approach in weighted and structured total least squares, and now we again add the sparsity constraint to that cost function, subject to the same equality as on the previous slide, but now written as a function of the parameter vector p and its perturbation. This constraint is not necessarily a linear equation; that depends on how you model the structure. That is why we introduce an assumption, and I will actually start with its second part: we model S as a linear, or rather affine, function of p, so this constraint can be transformed into a linear constraint. The first part is more of a notational assumption that makes things a little bit easier: we make the structure in S separable, splitting the parameter vector into a part related to the system matrix A and a part related to the data vector y, so that the perturbations on those parts can, for instance, be whitened separately.

After the whitening that happens here, you get the following problem: epsilon_A is the perturbation on the part of the parameter vector related to A, and epsilon_y is the perturbation related to the data vector y, both whitened according to their inverse covariance matrices, and the constraint is now a linear expression, due to the affine form we assumed for the structure. This is what we have to solve. Of course, you could always replace epsilon_y by the solution given by the constraint and make it an unconstrained problem, for instance by substituting the expression given here. Then you get an unconstrained problem, but it is a nonconvex one, because it contains both x and the unknown perturbation on the system matrix.

Before we start looking into solutions, there is some optimality attached to this problem: you can interpret its solution, in both x and the perturbations, as a maximum a posteriori optimal solution under certain conditions. The conditions are that the perturbations are Gaussian, that all the variables are independent, that the parameter vector has a non-informative, uniform distribution, and that the unknown vector x is Laplacian distributed. Under those circumstances you can show that the solution of this problem is the maximum a posteriori optimal estimate, so the problem is not just an ad hoc formulation; it has some statistical meaning.
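A hedged sketch of what this weighted and structured formulation looks like in symbols, again in my own notation: C_A and C_y below stand for the perturbation covariance matrices mentioned in the talk, and the whitening step simply turns the two quadratic forms into plain squared 2-norms.

```latex
% Weighted and structured sparse TLS: perturb the parameter vector, not the matrix entries
\min_{x,\,\varepsilon_A,\,\varepsilon_y}\
\varepsilon_A^{\mathsf T} C_A^{-1}\varepsilon_A
+ \varepsilon_y^{\mathsf T} C_y^{-1}\varepsilon_y
+ \lambda\|x\|_1
\quad \text{subject to} \quad
y(p_y + \varepsilon_y) = A(p_A + \varepsilon_A)\,x
```

Since A(·) and y(·) are affine in their parameter vectors, the constraint is linear in (ε_A, ε_y) for a fixed x; eliminating ε_y through the constraint gives an unconstrained cost that is bilinear in x and ε_A, hence nonconvex, as described above.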
To solve this, we first think about an alternating descent method, where you iterate between epsilon_A, the perturbation on the system matrix, and x. You could, for instance, fix epsilon_A here and here, and in that case it becomes a classical, only slightly altered, lasso problem, so the solution can be found by any algorithm that has been proposed for sparse reconstruction with a least squares cost function. Once x is given, you can use that solution to find the perturbation on the system matrix: with x fixed, everything becomes an unconstrained quadratic program, so the solution for epsilon_A can be found in closed form. If you start with a perturbation equal to zero, you start from the solution of the classical lasso problem, and you can show that every step improves the cost function and that you converge at least to a stationary point. So, within this cost function, you can show in this way that you will always improve upon the classical lasso solution. Of course, to solve the lasso step you can use your favourite solver, but what you could also do is use coordinate descent, as is often done for the lasso: you fix all the entries of the x vector except for one and solve for that one separately, which becomes a scalar lasso with a closed-form solution given by soft thresholding. You can merge all those iterations, alternating between epsilon_A and every entry of x, then going back to epsilon_A, and again solving for every entry of x separately. That gives you a global coordinate descent method that can also be shown to converge to at least a stationary point. This is not necessarily the global optimum, but at least you know that you improve upon the initial solution, which is the lasso solution.
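Here is a minimal sketch of that alternating scheme, under simplifying assumptions of my own: unstructured, unweighted perturbations applied directly to the entries of A, and a plain ISTA-style solver for the lasso step. It is meant to illustrate the structure of the algorithm (lasso step, closed-form perturbation step, repeat), not to reproduce the authors' implementation.

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator, i.e. the closed-form scalar lasso solution."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(B, y, lam, x0, n_iter=200):
    """Plain ISTA for min_x ||B x - y||_2^2 + lam * ||x||_1."""
    L = 2.0 * np.linalg.norm(B, 2) ** 2                 # Lipschitz constant of the gradient
    x = x0.copy()
    for _ in range(n_iter):
        grad = 2.0 * B.T @ (B @ x - y)
        x = soft(x - grad / L, lam / L)
    return x

def sparse_tls(A, y, lam, n_alt=10):
    """Alternating descent for a simplified, unstructured sparse TLS:
       min_{x,E} ||E||_F^2 + ||(A + E) x - y||_2^2 + lam * ||x||_1."""
    x = lasso_ista(A, y, lam, np.zeros(A.shape[1]))     # E = 0: start from the classical lasso
    E = np.zeros_like(A)
    for _ in range(n_alt):
        r = y - A @ x
        E = np.outer(r, x) / (1.0 + x @ x)              # closed-form minimizer over E for fixed x
        x = lasso_ista(A + E, y, lam, x)                # lasso step with the corrected matrix
    return x, E
```

Each x-update is a lasso with the corrected matrix A + E, warm-started from the previous x, and each E-update is the exact quadratic minimizer, so the overall cost is monotonically non-increasing, matching the convergence argument in the talk.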
Here are some numerical comparisons. We assume a twenty-by-forty A matrix, so the compression is fifty percent, you could say, and there is some Toeplitz structure in the A matrix. We also assume different variances for the perturbations on A and y; note that the perturbation on A also has a Toeplitz structure. The signal vector x is generated with ten nonzero entries. What is shown here is the L0 error and the L1 error versus the parameter lambda, which basically gives you a trade-off between solving the total least squares problem and enforcing sparsity: the bigger the lambda, the more weight you give to a sparse solution. The L0 error is related to support recovery: it is the percentage of entries where the support of the true solution and of the estimated solution do not agree, so it tells you something about support recovery. There you see that if you take everything into account, so the blue curve here, the weighted and structured sparse total least squares, you get basically the best support recovery. If you only take the correlations into account, or only the structure, so these are the red and black curves, you get a slightly larger L0 error; if you take neither the weighting nor the structure into account, you are again a little bit worse, and the lasso gives you basically the worst L0 error. For the L1 error, so the L1 norm of the estimation error, the performance curves are closer to each other, but of course support recovery is the most important in many of these applications.

Like I told you before, this approach is very useful in cases where you do sensing with some kind of grid-based approach. It can be used, for instance, in direction-of-arrival estimation, where you divide the whole angular space into grid points; say, every two degrees you pick a grid point. In that case you can express your received vector y as a linear combination of array response vectors: the first one, a(theta_1), tells you how the signal would be received if it comes in at an angle of arrival of theta_1, and you get a linear combination of these array response vectors over the different grid points, where x contains the combining weights. The combining weights will of course be sparse, because only where you have a target will there be a nonzero weight. But whenever a target lies in between the grid points, this equality will not hold exactly, and you get some kind of perturbation due to the grid. You can then model the true exponent of your source as the exponent at the grid point plus some linear correction, because, like I said before, we want the perturbations in a linear, affine form. That means that in this case we need an approximation, because there is a lot of structure in these perturbations but it is not linear, so we approximate it by an affine function of the parameter vector; I will not go into the details here.

This then allows you to get a better estimate, because next to solving for x you also allow these grid points to shift towards the true solutions. If a source lies somewhere in between two grid points, then because you allow for perturbations on the A matrix, a grid point can shift to the true solution, so you get some kind of super-resolution effect for free. Other approaches usually start from a rough grid and then refine the grid in those locations where you have a target; here you get it basically in one shot. Here is an example where you have ninety grid points, so every two degrees a grid point, and you have one source at one degree and one at minus nine degrees, so they are exactly in between two grid points. Then you see that the classical lasso basically gives you four nonzeros, at the grid points around the sources. You could say, okay, we can interpolate those and then we get the solution.
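To make the grid-based model and the affine correction concrete, here is a small sketch for a hypothetical uniform linear array with half-wavelength spacing; the talk does not specify the array or the exact parameterization, so the steering-vector model, the array size, and the first-order Taylor correction below are my own assumptions.

```python
import numpy as np

M = 10                                        # number of array elements (assumed)
grid = np.deg2rad(np.arange(-90, 90, 2))      # 90 grid points, one every two degrees

def steering(theta):
    """Steering vector of a uniform linear array, half-wavelength spacing (assumed model)."""
    n = np.arange(M)
    return np.exp(1j * np.pi * n * np.sin(theta))

def steering_deriv(theta):
    """Derivative of the steering vector with respect to the angle theta."""
    n = np.arange(M)
    return 1j * np.pi * n * np.cos(theta) * steering(theta)

# Dictionary of array responses on the grid, plus its derivative matrix:
A  = np.column_stack([steering(t) for t in grid])
dA = np.column_stack([steering_deriv(t) for t in grid])

# First-order (affine) off-grid model:
#   a(theta_k + delta_k) ~= a(theta_k) + delta_k * da(theta_k)
# so the perturbed dictionary is A + dA @ np.diag(delta), which is linear in the grid
# offsets delta -- the affine-structure assumption needed by the weighted and structured
# sparse TLS, with delta playing the role of the perturbation on the parameter vector.
```

With such a model, a source at one degree, midway between the grid points at zero and two degrees, can be captured by letting the corresponding offset shift the grid point towards it, which is the super-resolution effect described above.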
But you can only do that if you know that there are only two sources; if you do not know the number of sources, you might conclude that there are four sources in this problem. The weighted and structured sparse total least squares, on the other hand, gives you two peaks, at the red locations, which correspond to the black arrows marking where the true sources are located, so the grid basically shifts to the right positions. You also see another peak here, but it is about twenty dB below the first ones, so you can basically also do some kind of source-number detection with this approach.

I think that brings me to the conclusion. We have proposed a weighted and structured sparse total least squares problem, motivated first of all by the fact that you have non-idealities in the compression matrix, but also by a lot of these sensing applications. We can account for correlations and structure in the perturbations, we looked at the MAP optimality of the problem, and we derived reduced-complexity alternating descent and coordinate descent solutions. Ongoing and future research consists of recursive and robust implementations of this method. We also want to see whether the SVD can still be used in some of these problems: computing an SVD basically also boils down to an iterative method, so you wonder whether you could include sparsity in those iterations and still use SVD-based methods to solve the sparse total least squares. That concludes my presentation. Any questions?

Question: I was wondering about the complexity of your solution. What would you use for a large problem, say a large microphone array or something like that?

Answer: I can say that the complexity is basically determined by how you solve the initial sparse reconstruction problem, and you do that iteratively, maybe five times; I do not know exactly how many iterations we used here, but in general you can stop whenever you want. Right after one iteration you already know that you have improved upon the classical sparse reconstruction method, so you could say, for instance, that you have twice the complexity of the classical method, because solving for the perturbation is a closed-form expression and is not the biggest cost. So it depends basically on the solver that you use for the sparse reconstruction.

Comment from the audience: Just one comment on the question of how complex this is: it can sometimes be less complex than ordinary least squares, because, as was mentioned, people often have to regularize the least squares anyway, so in a sense it again comes down to some sort of iterative regularized least squares with a quadratic term.

Question: Can you change the L1 norm to a different penalty?

Answer: That is possible, because in these iterations, at the start you basically set your perturbation to zero, and then instead of a lasso problem you have another type of sparse recovery problem in that step.
You could solve that with whatever penalty you prefer, and the solution in the second step is always a closed-form expression for the perturbations, so yes, you could change the L1. Thank you.

Question: Coming back to high-resolution direction-of-arrival estimation techniques: what is the SNR threshold when you use this kind of compressive-sensing-inspired technique?

Answer: I do not know; we should test it on more applications. This is more, let us say, the theoretical framework, and now we have to check it on many different applications. The thing is, you can use a rather rough grid here as an initial point, and the question is how rough you can make that grid, how much you can correct for, but that is something we did not investigate analytically. We do have some theoretical analysis, and there is also a spectrum sensing application, but it is always compared to the standard sparse reconstruction methods; is that the comparison you would like? Actually, there is one result I did not show: for different Monte Carlo simulations, you can compare the lasso, the blue curve, which is the case from before with the four peaks. You could say, okay, you interpolate, and then you get this blue dashed line, but you see that even then the weighted sparse total least squares still does better than that interpolation, and if we interpolate our own result we do not gain anything, because we are already at the good solution. So for the direction-of-arrival example, even with interpolation, although you need to know the number of sources for the interpolation, we have better performance. Okay, thank you.