Thank you, Mister Chairman.

The topic of my talk is the relation between independent component analysis and MMSE. It is joint work with my PhD supervisor, Professor Bin Yang.

A commonly considered case for independent component analysis is the demixing of a linear noiseless mixture. In that case, the ideal demixing matrix W is the inverse of the mixing matrix A.

However, here we want to consider linear noisy mixtures, and the noise changes the ICA solution, so it is no longer the inverse mixing matrix. This can be modeled by the equation shown here, W_ICA = A^{-1} + ΔW, or, approximated for small noise, W_ICA ≈ A^{-1} + σ² W̄.

Prior work on noisy ICA mainly consists of methods to compensate the bias ΔW; they modify the cost function or the update equation of ICA. However, they require knowledge about the noise.

We, in contrast, are interested in the ICA solution for the noisy case without any bias correction, because we have made the observation that it is indeed quite similar to MMSE. Our goal is to find this matrix W̄ in that equation, and by this we want to explore the relation between ICA and MMSE theoretically.

A quick overview of my talk: I will start with the signal model and the assumptions. Then we will look at three different solutions for the demixing task, namely the inverse solution and the MMSE solution, which are two non-blind methods. Then we will look at the ICA solution, which is of course a blind approach. In the results section we will then see that ICA can indeed achieve an MSE close to the MMSE.

The mixing and the demixing process can be described by these two equations, which are probably well known to all of you: x is the vector of mixture signals, which are linear combinations of the source signals s through a square mixing matrix A of size N by N, plus some additive noise v, so x = A s + v. The demixed signals y are obtained by a linear transform W applied to the mixture signals x, y = W x.

The goal of the demixing is of course to get the demixed signals y as similar as possible to the original signals s.
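As a minimal sketch of this signal model (in Python/NumPy; the 2×2 mixing matrix, noise level, and Laplacian sources are purely illustrative assumptions, not values from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
n, L = 2, 10_000                      # number of sources / samples (hypothetical)

# zero-mean, unit-variance, non-Gaussian sources (Laplacian as an example)
s = rng.laplace(scale=1 / np.sqrt(2), size=(n, L))

A = np.array([[1.0, 0.5],
              [0.3, 1.0]])            # hypothetical square (N-by-N) mixing matrix
sigma2 = 0.1                          # average noise variance
v = np.sqrt(sigma2) * rng.standard_normal((n, L))

x = A @ s + v                         # mixing:   x = A s + v
W = np.linalg.inv(A)                  # here: the inverse solution W = A^{-1}
y = W @ x                             # demixing: y = W x  (equals s + A^{-1} v)
```

Note that y − s equals A^{-1} v here, which already shows how the inverse solution passes the noise straight through the inverse mixing matrix.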

We make a couple of assumptions. First, the mixing process should be invertible, which means A^{-1} must exist. The original signals are assumed to be independent with non-Gaussian pdfs q_i, with mean zero and variance one. Furthermore, we assume that the pdf q_i is three times continuously differentiable and that all required expectations exist.

For the noise, we assume that it is zero-mean with covariance matrix σ² R_v, where σ² denotes the average variance of v and R_v is a normalized covariance matrix. The pdf of the noise can be arbitrary, but symmetric; this means that all odd-order moments of the noise are equal to zero. Last, we assume that the original sources s and the noise v are independent.

Here is the first non-blind solution for the demixing task, the inverse solution: W_inv = A^{-1}. It has the property that it achieves a perfect demixing in the noiseless case. However, if there is noise, there is a danger of noise amplification, and this is especially serious if the mixing matrix A is close to singular. And of course it is only possible if A is known in advance or can somehow be estimated, so it is a non-blind method.

The second non-blind method is the MMSE solution, which is the matrix W that minimizes the MSE. The solution is given in this equation, and we can approximate it in terms of σ², as in the last line. Its properties are, again, that it is identical to the inverse solution if there is no noise, so we can achieve a perfect demixing in that case. However, we need to know the mixing matrix A and the properties of the noise, or we need to be able to estimate the second-order moments between s and x. So it is again a non-blind method.
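Under the stated assumptions (unit-variance independent sources, noise independent of the sources), the MMSE demixing matrix can be sketched as follows; the example matrix is hypothetical:

```python
import numpy as np

def mmse_demixing(A, sigma2, Rv=None):
    """MMSE demixing matrix W = E[s x^T] (E[x x^T])^{-1}.

    Assumes zero-mean, unit-variance, independent sources (E[s s^T] = I)
    and zero-mean noise with covariance sigma2 * Rv, independent of s,
    so that E[s x^T] = A^T and E[x x^T] = A A^T + sigma2 * Rv.
    """
    if Rv is None:
        Rv = np.eye(A.shape[0])            # normalized noise covariance
    Rx = A @ A.T + sigma2 * Rv             # E[x x^T]
    return A.T @ np.linalg.inv(Rx)         # E[s x^T] (E[x x^T])^{-1}

A = np.array([[1.0, 0.5], [0.3, 1.0]])     # hypothetical mixing matrix
W0 = mmse_demixing(A, 0.0)                 # no noise: reduces to A^{-1}
```

For σ² = 0 this reduces to the inverse solution A^{-1}, matching the property stated above.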

Now we come to the blind approach, the ICA solution. The idea of ICA is of course to make the demixed signals y statistically independent, since we assume that the original signals are statistically independent. We can define a desired distribution q(y) of the demixed signals y as the product of the marginal densities q_i of the original sources, and then we can define a cost function, namely the Kullback-Leibler divergence between the actual pdf p(y) of the demixed signals y and the desired pdf q(y).

The formula for the Kullback-Leibler divergence is given here; we just want to note that it is equal to zero if the two pdfs p and q are identical, and larger than zero if they are different. Hence we can solve the demixing task by minimizing this cost function using stochastic gradient descent, and the update equations are given here. The update ΔW depends on W^{-T} and on the correlation matrix of φ(y) and y, where the function φ_i is the negative derivative of the log-pdf of the original source.

At convergence, the update ΔW is of course equal to zero, and this is equivalent to saying that the correlation matrix E[φ(y) y^T] is equal to the identity matrix.

The properties of the ICA solution are that it is equal to the inverse solution if there is no noise, but the big difference is that we do not need to know anything about A or s, so it is a blind demixing. The only thing that we require is that we know the pdfs of the original sources, and the original sources must be non-Gaussian. If all the pdfs are different, then there is no permutation ambiguity, and if we know the pdfs perfectly, then there is also no scaling ambiguity; a scaling ambiguity only remains if the pdfs are merely estimated.

Now we come to the main theorem of the paper. We can show, by a Taylor series expansion of the nonlinear function φ_i, that the ICA solution is given by this equation, where R̃_v is a transformed correlation matrix of the noise and M is a scaling matrix which depends on the pdfs of the original sources through the parameters κ_i and ρ_i, which are given here. We just want to note that κ_i is a measure of non-Gaussianity: it is equal to one if and only if s_i is Gaussian, and in all other cases it is larger than one.

For comparison, we have written down the MMSE solution here, and if you compare this equation to the one at the top, you can see that they are indeed quite similar, except for the scaling matrix M. If M is approximately a matrix with all elements equal to one, then we can conclude that the ICA solution is close to the MMSE solution, and we can also show that in that case the two MSEs of the ICA solution and the MMSE solution are quite similar. The elements of the scaling matrix M are determined by the pdf q of the sources.

To make any further conclusions, we will assume a certain family of pdfs, namely the generalized Gaussian distribution. The pdf is given here, where Γ is the Gamma function and β is the shape parameter which controls the shape of the distribution. For example, for β equal to two we obtain the Gaussian distribution, for β equal to one the Laplacian distribution, and if we let β go to infinity we get the uniform distribution.

If we fix the variance to one, then we obtain ρ = β − 1, and the other parameters κ and the elements of the scaling matrix M are given in the plot and the table here. The diagonal elements M_ii are exactly equal to κ/2, and the off-diagonal elements M_ij are between 0.5 and 1.
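A sketch of the unit-variance generalized Gaussian pdf; the exact parametrization on the slide is not visible here, so the standard form with scale α fixed by the unit-variance constraint is assumed:

```python
import math

def ggd_pdf(s, beta):
    """Generalized Gaussian pdf with zero mean and unit variance.

    beta is the shape parameter: beta = 2 gives the Gaussian,
    beta = 1 the Laplacian, and beta -> infinity the uniform distribution.
    """
    # scale alpha chosen so that the variance alpha^2 Gamma(3/beta)/Gamma(1/beta) = 1
    alpha = math.sqrt(math.gamma(1 / beta) / math.gamma(3 / beta))
    norm = beta / (2 * alpha * math.gamma(1 / beta))
    return norm * math.exp(-((abs(s) / alpha) ** beta))
```

For β = 2 this evaluates to the standard normal density, consistent with the special cases named above.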

But maybe more interesting than these parameters is the question of what MSE ICA can achieve, and how close we can get to the MMSE estimator. For this we consider an example: two generalized Gaussian sources with the same shape parameter β, the mixing matrix given here, and Gaussian noise with identity covariance matrix. We have studied the relative MSE, that is, the MSE of the ICA solution divided by the MSE of the MMSE solution.

As you can see from the plot on the right-hand side, the relative MSE of the ICA solution is close to one for a large range of the shape parameter β; it is less than 1.06, so only six percent worse than the MMSE estimator. For reference, we have also calculated the relative MSE of the inverse solution for the two SNRs of 10 dB and 20 dB. You can see that the blind approach ICA even outperforms the inverse solution, which is a non-blind method, also for a large range of values of the shape parameter β.

Up to now we have considered only theoretical results, which are valid only for an infinite amount of data, since we have evaluated all the expectations exactly. But in practice you never have an infinite amount of data, so now we want to look at an actual Kullback-Leibler-divergence-based ICA algorithm with a finite amount of data. In practice we really don't use the standard gradient, but instead the natural gradient, because it has better convergence properties; the update equation is given here.
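A minimal batch sketch of the natural-gradient update ΔW = μ (I − E[φ(y) y^T]) W; the step size, iteration count, and the generic tanh score function are my assumptions and not necessarily the settings used in the talk:

```python
import numpy as np

def ica_natural_gradient(x, mu=0.05, n_iter=1000):
    """Batch natural-gradient ICA minimizing the KL divergence.

    phi approximates the score -(log q)' of the assumed (super-Gaussian)
    source pdf; tanh is a common generic choice for that role.
    """
    n = x.shape[0]
    W = np.eye(n)
    for _ in range(n_iter):
        y = W @ x
        C = np.tanh(y) @ y.T / x.shape[1]     # sample estimate of E[phi(y) y^T]
        W += mu * (np.eye(n) - C) @ W         # natural-gradient update
    return W

rng = np.random.default_rng(1)
s = rng.laplace(scale=1 / np.sqrt(2), size=(2, 20_000))  # unit-variance sources
A = np.array([[1.0, 0.5], [0.3, 1.0]])                   # hypothetical mixing
W = ica_natural_gradient(A @ s)                          # noiseless demo
y = W @ (A @ s)
```

At convergence, the sample correlation matrix of φ(y) and y is close to the identity, which is exactly the fixed-point condition mentioned earlier.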

Since we are now using a finite amount of data, not only the bias of the ICA solution from the MMSE solution is important; the covariance of the estimation also contributes to the MSE. Because we assume two identically distributed sources, ICA suffers from the permutation ambiguity, so we need to resolve it before we can calculate the MSE. And last, the scaling of the ICA components is slightly different from the scaling of the MMSE solution, so we also compensate for this before we calculate the MSE values.

Here, on the left plot, we show the MSE for Laplacian-distributed signals, for different SNRs and different sample sizes L. The black solid line is the MMSE estimator, and the colored lines are the actual performance of the ICA algorithm. As you can see, for a large enough sample size we can get quite close to the MMSE estimator, so we can achieve a very good MSE performance.

For low SNR we can also see that ICA outperforms the inverse solution; this is shown on the right-hand side, where we plot the relative MSE. The line with the downward triangles here, sorry, the triangles, corresponds to the inverse solution: for low SNR it increases quite dramatically, whereas the ICA solution still yields a reasonable MSE. The point where the ICA solution and the inverse solution cross depends on the mixing matrix and the sample size.

One last point that I want to mention here: we have also plotted the theoretical ICA solution, with the downward triangles, and you can see that it matches the performance of the actual ICA algorithm quite well, except for a very low SNR of 0 dB. This is because we have made a small-noise assumption in the derivation: we have only considered terms up to the order of σ² in the Taylor series.

We also wanted to study the influence of the shape parameter β on the performance. Here we plot the relative MSE of the ICA solution for different SNRs, namely 10 dB, 20 dB, and 30 dB. The general trend is: the more non-Gaussian the sources, so the closer β is to 0.5 or the larger β is, the lower the relative MSE, except for this case here for the SNR of 10 dB.

One last point: one might wonder why the relative MSE increases for increasing SNR, so why, if you go from 10 dB SNR to 30 dB SNR, the relative MSE goes up. This can be explained by the fact that the MSE of ICA for the noiseless case is not close to zero, but is lower-bounded by the Cramér-Rao bound, which depends on κ and ρ, and therefore the relative MSE increases for increasing SNR.

To summarize: in this paper we have derived the ICA solution and its MSE for the noisy case. We have seen that there exists a relation between ICA and MMSE which depends on the pdfs of the original sources, and that often the ICA solution, which is of course a blind approach, is close to the MMSE solution. We want to stress that the relation also exists when the nonlinearity φ_i does not match the true pdf. We have seen in the simulation results that in practice we can achieve an MSE close to that of the MMSE estimator with an ICA algorithm based on the Kullback-Leibler divergence, and also that not only the bias of the ICA solution is important, but that the covariance of the estimation also determines the performance.

To sum up everything, I want to state: blind demixing by ICA is in many cases similar to non-blind demixing based on MMSE.

Thank you for your attention, and if there are questions, please.

[Question inaudible]

Yes, this is assuming that there is no time dependence. And of course it depends: if you assume, for example, the wrong type of distribution, if you assume a sub-Gaussian pdf and the sources are super-Gaussian, then of course ICA does not work, so we assume the correct type. And it depends on the amount of mismatch: if the mismatch is reasonably small, it is still a good approach.

[Question inaudible]

Yes, it could be. Indeed, I have mentioned in the paper that you could use this derivation to derive a better φ function, which could yield a lower MSE, namely by obtaining a scaling matrix M whose elements are close to one. But the problem is that this obviously depends on the SNR, so again you would need to know it.