"'kay" but everybody uh my name hear close and from dot university of technology and uh i will present some
joint work with uh house you and joe just you're not "'cause" from the university of minnesota
So I will talk about a blend, basically, of total least squares and sparse reconstruction. I think you have seen so many sessions here at ICASSP on compressive sampling, so we give a little twist to that in this talk.
As some of you might know, total least squares is a method where you basically try to solve a set of linear equations, but where you have perturbations both in the data vector, which you normally also have in least squares, and in the system matrix, or the measurement matrix, or whatever you want to call it. This has some applications in statistics, for instance in the errors-in-variables model, but it also has a connection to linear algebra, because the solution is actually based on computing a singular value decomposition.
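As a reference, a minimal statement of the classical total least squares problem (in generic notation, with E and e denoting the perturbations of A and y; these symbols are labels of mine, not necessarily those on the slides) would be:

```latex
\min_{\mathbf{x},\,\mathbf{E},\,\mathbf{e}}\ \big\| [\,\mathbf{E}\ \ \mathbf{e}\,] \big\|_F^2
\quad \text{s.t.} \quad (\mathbf{A}+\mathbf{E})\,\mathbf{x} = \mathbf{y}+\mathbf{e},
```

whose solution, in the overdetermined case, follows from the right singular vector of the composite matrix [A y] associated with its smallest singular value.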
People have also extended this total least squares principle to the case where you have some more prior knowledge on these perturbations on the data vector and the system matrix. This could be statistical knowledge, but it could also be knowledge about the structure. On the other hand, you have this large body of work, of course, on compressive sampling, where you for instance use L1-norm minimization to enforce sparsity. I will not go into details there; there have been enough talks on that.
So what we do here is we basically try to solve a compressive sensing problem, but in a case where you have perturbations also in the data matrix. It is somehow comparable to the problem of the second speaker, but this time we use some kind of statistical uncertainty on the system matrix instead of a worst-case scenario.
These perturbations appear in compressive sensing in a number of applications. For instance, you could have non-idealities in the implementation of the compression matrix, because often this is something that happens in hardware and you do not know exactly what is going on there, so it is a realistic assumption to take some perturbations there into account. But it also shows up in a lot of compressive sensing techniques where you basically try to do some sensing and you use some kind of grid-based approach to sense a location, or a direction of arrival, or a frequency, and the targets might not be exactly on the grid. You can model that as a perturbation on the basis matrix.
There are some analyses of the performance of compressive sensing including such a perturbation in the measurement matrix, but those are usually performance analyses of standard sparse reconstruction methods, like the lasso or basis pursuit. So people have looked at what happens to the RIP, for instance, in that case, and how sensitive these sparse approximations are to that basis mismatch.
Here we are basically going to look at how we can take those perturbations into account. We also look at the statistical optimality of that problem, and at global and local optima; some of these results have already appeared recently in a Transactions on Signal Processing paper. Today we will focus more specifically on the case where you have prior knowledge on the perturbations, such as correlations and structure, and that leads then to weighted and structured sparse total least squares.
So here you see an outline of the problem. You have a simple underdetermined system of equations, y = Ax, so we have fewer equations than unknowns, and we assume that the unknown vector is sparse. Usually this problem is solved using some kind of least squares, for instance regularized with an L1 norm on x, and that leads to some kind of lasso problem, where you try to minimize the squared two-norm of the residual, regularized here with this L1 norm on x.
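Written out in generic notation (with lambda the regularization parameter that comes back later in the talk), that lasso problem is:

```latex
\hat{\mathbf{x}} \;=\; \arg\min_{\mathbf{x}}\ \|\mathbf{y}-\mathbf{A}\mathbf{x}\|_2^2 \;+\; \lambda\,\|\mathbf{x}\|_1 .
```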
Now, this only accounts for errors in the data vector y, but when you also have errors in the A matrix, you have to think about other types of solutions. One of those is given by total least squares; in a way, it is a way to compute least squares in case you want to include some robustness against perturbations on this system matrix. In that case you end up with a problem like this, where you basically try to minimize the squared Frobenius norm of the total perturbation, both on the system matrix A and on the data vector y, and this is again regularized with an L1-norm constraint on x.
The other constraint that you add is basically the fact that, if you include the errors, you should have equality between the perturbed data vector and the perturbed data matrix times the unknown vector x. Normally, without this sparsity constraint, you have classical total least squares, in case you would have an overdetermined system, and then the solution is actually given by a singular value decomposition that you have to carry out on the composite matrix of A concatenated with the data vector y. But here we also have this extra L1-norm constraint on x.
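So, as a sketch in the same generic notation, the sparse total least squares problem combines the two ideas:

```latex
\min_{\mathbf{x},\,\mathbf{E},\,\mathbf{e}}\ \big\| [\,\mathbf{E}\ \ \mathbf{e}\,] \big\|_F^2 \;+\; \lambda\,\|\mathbf{x}\|_1
\quad \text{s.t.} \quad \mathbf{y}+\mathbf{e} = (\mathbf{A}+\mathbf{E})\,\mathbf{x} .
```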
So basically, this is the problem we can solve in case we have no further information on the structure of these perturbations; that would then be the unweighted, unstructured case. But if you have, for instance, statistical knowledge about these perturbations, you could think about a weighted version of that problem, weighted by the inverse of the covariance matrix of these perturbations. Or you could look at a structured version, where you take the structure into account on this composite matrix of A concatenated with y.
This happens in various applications, for instance in deconvolution, system identification, or linear prediction, where this A matrix will often be Toeplitz or Hankel. But you could also have circulant structures, or Vandermonde structures, for instance in harmonic retrieval. And also when you have these grid-based sensing approaches, this A is basically constructed from complex exponentials, so you also have this Vandermonde structure in that type of problem.
Mathematically, the structure is basically modelled by writing this composite matrix as a function of a parameter vector p. All the structure is modelled in this way, so there is a unique mapping between the parameter vector and this composite matrix, which is called S here.
The problem that we are solving here is basically a weighted and structured sparse total least squares problem, where we are going to minimize a squared weighted norm on epsilon, which is now the perturbation not on the composite matrix S but on the parameter vector. So you basically minimize a norm on the perturbation of the parameter vector, which is the standard approach in weighted and structured total least squares. And now we again add the sparsity constraint to that cost function, subject to this equality here, which is basically the same equality as we had on the previous slide, but now given as a function of the parameter vector p and its perturbation epsilon. So this might not necessarily be a linear equation now; that depends on how you model the structure.
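Roughly, and again only as a sketch in generic notation (with Sigma the covariance matrix of the parameter perturbation epsilon and S(.) the mapping from parameter vector to composite matrix), the weighted and structured problem reads:

```latex
\min_{\mathbf{x},\,\boldsymbol{\varepsilon}}\ \boldsymbol{\varepsilon}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\varepsilon} \;+\; \lambda\,\|\mathbf{x}\|_1
\quad \text{s.t.} \quad \mathbf{S}(\mathbf{p}+\boldsymbol{\varepsilon})\begin{bmatrix}\mathbf{x}\\ -1\end{bmatrix} = \mathbf{0}.
```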
That is why we introduce an assumption, and I will actually start with the second part of it, where we model S as a linear, or affine if you want, function of p. So basically what we assume is that S(p) can be expressed as an affine function of p, so that this constraint can be transformed into a linear constraint. The other part is more of a notational assumption that makes things a little bit easier: we also make the structure in S separable, so we split the parameter vector into a part that is related to the system matrix A and a part that is related to the data vector y, so that the perturbations on those two can be whitened separately, for instance. After the whitening that happens here, you basically get the following problem. This epsilon_A is the perturbation on the parameter vector related to A, and epsilon_y is the perturbation related to the data vector y, but they are both whitened now according to their inverse covariance matrices. And the constraint here is now this linear expression, which is due to the fact that we assumed the affine form for this structure in the matrix, and hence in the perturbation.
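A minimal sketch of what such an affine structure model looks like (the fixed basis matrices S_i are generic placeholders of mine; for a Toeplitz A, each S_i would simply place one parameter on one diagonal):

```latex
\mathbf{S}(\mathbf{p}) = \mathbf{S}_0 + \sum_i p_i\,\mathbf{S}_i
\quad\Longrightarrow\quad
\mathbf{S}(\mathbf{p}+\boldsymbol{\varepsilon}) = \mathbf{S}(\mathbf{p}) + \sum_i \varepsilon_i\,\mathbf{S}_i ,
```

so that, for fixed x, the constraint is linear in epsilon, and for fixed epsilon it is linear in x.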
So this is what we have to solve. Of course, you could always eliminate epsilon_y by the solution that is given by the constraint and turn it into an unconstrained problem; for instance, you could replace epsilon_y by the solution given here. Then you get an unconstrained problem, but it is a non-convex problem that you have to solve, because you have here both x and the unknown perturbation on the system matrix.
But before we start looking into solutions, there is some optimality related to that problem. You can interpret the solution of this problem, in both x and epsilon, as a maximum a posteriori optimal solution under certain conditions. The conditions are that you need Gaussian perturbations, that there is independence between all the variables, that the parameter vector of A has a non-informative, uniform distribution, and that the unknown vector x is Laplacian distributed. Under those circumstances you can show that the solution to this problem gives you the maximum a posteriori optimal estimate. So this problem has some statistical meaning.
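As a rough sketch of where that MAP interpretation comes from (under the stated assumptions and in generic notation): whitened Gaussian perturbations and a Laplacian prior on x give a negative log-posterior of the form

```latex
-\log p(\boldsymbol{\varepsilon},\mathbf{x}\mid \mathbf{y})
\;=\; \boldsymbol{\varepsilon}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\varepsilon}
\;+\; \lambda\,\|\mathbf{x}\|_1 \;+\; \text{const},
```

so that minimizing the weighted quadratic plus the L1 penalty, subject to the perturbed data relation, coincides with maximizing the posterior.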
To solve this, we first think about an alternating descent method, where you basically solve iteratively, alternating between epsilon_A, the perturbation on the system matrix, and x. So you could, for instance, fix epsilon_A here and here, and in that case it becomes a classical, well, slightly altered, lasso-like problem. Basically, the solution can then be found by any algorithm that has been proposed to solve sparse reconstruction problems with a least squares cost function. And once that is given, you can use that solution to find the result for the perturbation on the system matrix: if x is given, everything becomes an unconstrained quadratic program, so for epsilon you can then find the solution in closed form. If you start with a perturbation equal to zero, you basically start with the solution given by the classical lasso problem, and you can show that you always improve your cost function and that you converge to at least a stationary point.
So within this cost function, you can show in this way that you will always improve upon the classical solution, the classical lasso solution. Of course, to solve this lasso problem you can use your favourite solver. What you could also do is use coordinate descent, a sort of coordinate descent, to solve the lasso, which basically means that you fix all the entries of your x vector except for one and you solve for that one separately; then it becomes a scalar lasso, which gives you a closed-form solution by means of soft thresholding. And you can merge those iterations altogether, so you basically alternate between epsilon_A and then every entry of x, then go back to epsilon_A, then again solve for every entry of x separately, and so on. That gives you a global coordinate descent method that can also be shown to converge to at least a stationary point. Of course, this is not necessarily the global optimum, but at least you know that you improve upon the initial solution, which is the lasso solution.
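To make those two building blocks concrete, here is a minimal Python sketch for the simplest special case, the unweighted, unstructured sparse total least squares problem; the function names, the fixed iteration counts, and the restriction to real-valued data are choices of mine, and the weighted and structured variant discussed in the talk would replace the simple closed-form perturbation update by its whitened, structured counterpart.

```python
import numpy as np

def soft_threshold(v, t):
    """Soft-thresholding operator: the closed-form solution of the scalar lasso."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def lasso_cd(A, y, lam, x0=None, n_sweeps=50):
    """Cyclic coordinate descent for  min_x ||y - A x||_2^2 + lam * ||x||_1."""
    m, n = A.shape
    x = np.zeros(n) if x0 is None else x0.copy()
    col_sq = np.sum(A ** 2, axis=0)           # squared column norms ||a_j||^2
    r = y - A @ x                              # current residual
    for _ in range(n_sweeps):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            r = r + A[:, j] * x[j]             # remove the j-th contribution
            x[j] = soft_threshold(A[:, j] @ r, lam / 2.0) / col_sq[j]
            r = r - A[:, j] * x[j]             # put the updated contribution back
    return x

def sparse_tls(A, y, lam, n_outer=10):
    """Alternating descent for the unweighted, unstructured sparse TLS problem
       min_{x,E} ||E||_F^2 + ||y - (A + E) x||_2^2 + lam * ||x||_1."""
    E = np.zeros_like(A)
    x = np.zeros(A.shape[1])
    for _ in range(n_outer):
        # Step 1: with E fixed, the problem in x is a (slightly altered) lasso.
        x = lasso_cd(A + E, y, lam, x0=x)
        # Step 2: with x fixed, the problem in E is an unconstrained quadratic
        # program with closed-form minimizer E = (y - A x) x^T / (1 + ||x||^2).
        r = y - A @ x
        E = np.outer(r, x) / (1.0 + x @ x)
    return x, E
```

The perturbation step uses the fact that, for fixed x, minimizing ||E||_F^2 + ||(y - Ax) - Ex||_2^2 over E is an unconstrained quadratic program with the closed-form minimizer shown in the comment.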
Here are some numerical comparisons. We assume that we have a twenty-by-forty A matrix, so the compression is fifty percent, you could say. There is some Toeplitz structure in the A matrix, and we assume different variances for the perturbations on A and y; note that the perturbation on A also has a Toeplitz structure. The signal vector x here is generated with ten nonzero entries. What is shown here is the L0 error versus the parameter lambda, and the L1 error versus the parameter lambda. This lambda basically gives you a trade-off between solving the total least squares problem and how sparse the solution is: the bigger the lambda, the more weight you give to a sparse solution. The L0 error is basically related to support recovery: it is the percentage of entries where the support of the true solution and of the estimated solution are not the same, so it tells you something about support recovery.
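As a small illustration of that L0-error metric (a sketch; the function name and the tolerance are my own), it could be computed as:

```python
import numpy as np

def support_error(x_true, x_hat, tol=1e-6):
    """Fraction of entries whose support (zero versus nonzero) differs between
    the true sparse vector and the estimated one."""
    return np.mean((np.abs(x_true) > tol) != (np.abs(x_hat) > tol))
```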
There you see that if you take everything into account, so the blue curve here, the weighted and structured sparse total least squares, you basically get the best support recovery. If you only take the weights, i.e. the correlations, into account, or only the structure, these are the red and black curves, then you get somewhat bigger L0 errors. If you do not take any weights or structure into account, you are a little bit worse again, and the lasso gives you basically the worst L0 error. For the L1 error, so this is basically the L1 norm of the estimation error, the performance curves are closer to each other, but of course support recovery is the most important in many of these applications.
Like I told you before, this approach is very useful in cases where you do sensing and you use some kind of grid-based approach. That can for instance be used in direction-of-arrival estimation, where you basically divide the whole angular space into different grid points, into an angle grid; for instance, every two degrees you can pick a grid point. In that case you can express your received vector y as a linear combination of array response vectors. Basically, this first vector here, a(theta_1), tells you how the signal would be received if it comes in at an angle of arrival of theta_1. So you get a linear combination of all these array response vectors at the different grid points, and this x contains the combining weights. But of course the combining weights will be sparse, because only where you have a target will you have a combining weight.
Of course, whenever you have targets that are in between the grid points, this equality will not be exactly true, and there is some kind of perturbation on the grid. So you could say that the true exponent, the true angle of your source, can then be modelled as the exponent at your grid point plus some linear correction. Because, like I said before, we want to have the perturbations in a linear form; we want an affine expression for the perturbations. That means that in this case we need some kind of approximation, because there is a lot of structure in these perturbations but it is not linear, so we approximate it by an affine function of the parameter vector. I am not going to go into the details here. That then allows you to get a better estimate, because next to solving for x, you also allow these grid points basically to shift towards the true solutions.
So if you have a source that is somewhere in between two grid points, because you allow for perturbations on this A matrix, a grid point might shift to the true solution. You get some kind of super-resolution effect for free with this approach. Other approaches usually start from a rough grid and then refine the grid in those locations where you have a target; here you basically get it in one shot.
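To illustrate the grid-based model and the first-order correction a(theta_g + delta) ≈ a(theta_g) + delta · a'(theta_g) that the affine approximation relies on, here is a small Python sketch for a uniform linear array; the function name, the half-wavelength spacing, and the particular steering-vector convention are assumptions of mine, not taken from the talk.

```python
import numpy as np

def ula_grid_dictionaries(n_ant, grid_deg, d_over_lambda=0.5):
    """Steering-vector dictionary on an angular grid for a uniform linear array,
    plus its derivative with respect to the angle, as used by the affine
    (first-order) off-grid model a(theta_g + delta) ~ a(theta_g) + delta * a'(theta_g)."""
    k = np.arange(n_ant)[:, None]                              # antenna index 0..n_ant-1
    theta = np.deg2rad(np.asarray(grid_deg, float))[None, :]   # grid angles in radians
    A = np.exp(2j * np.pi * d_over_lambda * k * np.sin(theta))       # columns a(theta_g)
    dA = A * (2j * np.pi * d_over_lambda * k * np.cos(theta))        # columns a'(theta_g)
    return A, dA

# Example: ten antennas, a grid point every two degrees; a source at one degree
# then falls exactly between the grid points at zero and two degrees.
A, dA = ula_grid_dictionaries(10, np.arange(-90, 90, 2))
```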
Here is an example where you have, I believe, ninety grid points, so every two degrees you have a grid point, and you have a source at one degree and one at minus nine degrees, so they are both exactly in between two grid points. Then you see that the classical lasso basically gives you four nonzeros, basically the grid points around the sources.
You could say, okay, we can interpolate those and then we get the solution, but you can only do that if you know that you have only two sources; if you do not know the number of sources, you could think that there are four sources in this problem. The weighted and structured sparse total least squares, on the other hand, gives you basically two peaks at the red locations, which correspond to these black arrows where the true sources are located; so the grid basically shifts to the right positions. You also see another peak here, but it is about twenty dB below the first two, so you can basically also do some kind of detection of the number of sources using this approach.
So I think that brings me to the conclusions. We have proposed a weighted and structured sparse total least squares problem, which is motivated, first of all, by the fact that you have non-idealities in the compression matrix, but it is also motivated by a lot of these sensing applications. We can account for correlations and structure in these perturbations. We looked at the MAP optimality of this problem, and we looked at reduced-complexity alternating descent and coordinate descent solutions. Ongoing and future research consists of recursive and robust implementations of this method. We also try to see whether the SVD can be used in some of these problems, since computing an SVD basically also boils down to an iterative method; you could wonder whether in those iterations you could also include sparsity and still use SVD-based types of methods to solve the sparse total least squares problem. So that concludes my presentation. Are there any questions?
I was thinking, how much more is the complexity of your solution? I mean, what do you do for a very large problem, say a large microphone array or something like that?
Well, I can say that the complexity is basically determined by how you solve the initial sparse reconstruction problem, and you do that iteratively, so maybe five times; I do not know exactly how many iterations we used here. In general you can stop whenever you want: right after one iteration you already know that you have improved upon the classical sparse reconstruction method. So you could say, for instance, that we have twice the complexity, because solving for the perturbation is a closed-form expression, so that is not the biggest complexity. It depends basically on the solver that you use for the sparse reconstruction.
Just one comment, in addition to the question of how complex this is: it can sometimes be less complex than ordinary TLS, because you have to remember that structured TLS itself entails an iterative solution, right, so it can be more or less complex. And, as was mentioned, when people have to regularize the TLS, then in a sense they again just solve some sort of iterative optimization, right?
Right, yeah, even the regularized TLS with a quadratic penalty, yeah.
Okay, and could you change the norm, for instance use a different norm instead of the L1 norm?
Yeah, that is possible, because in these iterations, at the start of the iterations you basically set your perturbation to zero, and then instead of a lasso problem you have another type of sparsity-regularized least squares problem that you could solve in that step. And whatever you fix there, the solution in the second step is always a closed-form expression for the perturbations. So you could change this L1 norm.
Thank you.
Coming back to the high-resolution localization and estimation techniques: what is the SNR threshold when you use this kind of compressive-sensing-inspired technique?
Yeah, I do not know; we should test it on more applications. This is more, let us say, the theoretical framework, and now we have to check it on many different applications. The thing is, you can use here a kind of rough grid as an initial point, and the question is how rough you can make your grid, how much you can correct for. That is something we did not investigate analytically; we have simulations, but not a theoretical analysis of that yet.
uh
well
uh
well for you so see you know we have also there is also spectrum sensing application but
so it's always compared to the there the the standard uh a sparse reconstruction methods
so uh
is it is that the comparison you would like
I mean...
Oh, okay, yeah, actually, this one I did not show, but here, over different Monte Carlo simulations, you see what happens if you compare the lasso, which is the blue curve; this is what I showed before, where you have the four peaks. You could say, okay, you interpolate those, and then you get this blue dashed line. But what you see is that even then, the full weighted and structured sparse total least squares still does better than the interpolation, and if we interpolate our own solution we do not gain anything, because we are already at the good solution. So this is a gain for the direction-of-arrival estimation: even with interpolation, although you need to know the number of sources for that interpolation, we still have a better performance.
Okay, thank you.