Yeah, thank you, Mr. Chairman. The topic of my talk is the relation between independent component analysis and MMSE, and it is joint work with my PhD supervisor, Professor Bin Yang.
A commonly considered case for independent component analysis is the demixing of linear noiseless mixtures. In that case, the ideal demixing matrix W is the inverse of the mixing matrix A. Here, however, we want to consider linear noisy mixtures. The noise changes the ICA solution, so it is no longer the inverse of the mixing matrix. This can be modelled by the equation shown here, W_ICA = A^{-1} + Delta_W, where Delta_W is the deviation caused by the noise; for small noise we can approximate this as W_ICA ≈ A^{-1} + sigma^2 W_bar.
Prior work on noisy ICA mainly consists of methods to compensate the bias Delta_W by modifying the cost function or the update equations of ICA. However, these methods require knowledge about the noise. We, in contrast, are interested in the ICA solution for the noisy case without any bias correction, because we have made the observation that it is in fact quite similar to MMSE. Our goal is to find the matrix W_bar in this equation, and by this to explore the relation between ICA and MMSE theoretically.
Here is a quick overview of my talk. I will start with the signal model and the assumptions. Then we will look at three different solutions for the demixing task: first the inverse solution and the MMSE solution, the two non-blind methods, and then the ICA solution, which is of course a blind approach. In the last section we will see that ICA can indeed achieve an MSE close to the MMSE.
The mixing and demixing processes can be summarized by these two equations, which are probably well known to all of you: X = A S + V and Y = W X. X is the vector of mixture signals, which are linear combinations of the source signals S through a square mixing matrix A of size N x N, plus some additive noise V. The demixed signals Y are obtained by a linear transform W applied to the mixture signals X. The goal of the demixing is of course to get the demixed signals Y as similar as possible to the original signals S.
We make a couple of assumptions. First, the mixing process should be invertible, which means A^{-1} must exist. The original signals are assumed to be independent with non-Gaussian pdfs q_i, with mean zero and variance one. Furthermore, we assume that each pdf q_i is three times continuously differentiable and that all required expectations exist. Regarding the noise, we assume that it is zero-mean with covariance matrix sigma^2 R_V, where sigma^2 denotes the average variance of V and R_V is a normalized covariance matrix. The pdf of the noise can be arbitrary but symmetric, which means that all odd-order moments of the noise are equal to zero. Last, we assume that the original sources S and the noise V are independent.
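As a minimal sketch, this signal model can be simulated as follows; the dimensions, the concrete mixing matrix, and the choice of a unit-variance Laplacian source pdf are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n, L = 2, 10000                              # illustrative dimensions and sample size
# independent, zero-mean, unit-variance, non-Gaussian (Laplacian) sources
S = rng.laplace(scale=1/np.sqrt(2), size=(n, L))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                   # square, invertible mixing matrix
sigma2 = 0.1                                 # average noise variance sigma^2
R_V = np.eye(n)                              # normalized noise covariance matrix
V = rng.multivariate_normal(np.zeros(n), sigma2 * R_V, size=L).T
X = A @ S + V                                # noisy linear mixtures: X = A S + V
Y = np.linalg.inv(A) @ X                     # demixed signals: Y = W X
```

The Laplacian scale 1/sqrt(2) makes the source variance exactly one, matching the unit-variance assumption above.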
Here is the first non-blind solution for the demixing task, the inverse solution: W_inv = A^{-1}. It has the property that it achieves perfect demixing in the noiseless case. However, if there is noise, it suffers from noise amplification, and this is especially serious if the mixing matrix A is close to singular. Of course, it is only possible if you know A in advance or can somehow estimate it, and hence it is a non-blind method.
The second non-blind method is the MMSE solution, which is the matrix W that minimizes the MSE. The solution is given in this equation here; for unit-variance sources it reads W_MMSE = A^T (A A^T + sigma^2 R_V)^{-1}, and we can approximate it in terms of sigma^2 as in the last line. Its properties are, again, that it is identical to the inverse solution if there is no noise, so we can achieve perfect demixing in the noiseless case. However, we need to know the mixing matrix A and the properties of the noise, or we need to be able to estimate the second-order moments between S and X. So again, it is a non-blind method.
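The two non-blind solutions and the small-noise expansion of the MMSE demixer can be sketched as follows; the mixing matrix and noise level are illustrative assumptions:

```python
import numpy as np

A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                   # assumed mixing matrix
sigma2, R_V = 0.01, np.eye(2)                # small white noise for illustration

W_inv = np.linalg.inv(A)                     # inverse solution
# MMSE solution for unit-variance sources (R_S = I):
W_mmse = A.T @ np.linalg.inv(A @ A.T + sigma2 * R_V)
# small-noise approximation: W_mmse ≈ A^{-1} - sigma^2 * Rt_V @ A^{-1},
# with the transformed noise correlation matrix Rt_V = A^{-1} R_V A^{-T}
Rt_V = W_inv @ R_V @ W_inv.T
W_approx = W_inv - sigma2 * Rt_V @ W_inv

# analytic MSE of a linear demixer W under this model:
# E||W X - S||^2 = tr((W A - I)(W A - I)^T) + sigma^2 tr(W R_V W^T)
def mse(W):
    E = W @ A - np.eye(2)
    return np.trace(E @ E.T) + sigma2 * np.trace(W @ R_V @ W.T)
```

Since W_mmse minimizes this MSE over all linear demixers, mse(W_mmse) is never larger than mse(W_inv).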
Now we come to the blind approach, the ICA solution. The idea of ICA is of course to make the demixed signals Y statistically independent, since we assume that the original signals are statistically independent. We can define a desired distribution of the demixed signals, q(Y), as the product of the marginal densities q_i of the original sources. Then we can define a cost function, namely the Kullback-Leibler divergence between the actual pdf p(Y) of the demixed signals and the desired pdf q(Y). The formula for the Kullback-Leibler divergence, D(p||q) = ∫ p(y) log(p(y)/q(y)) dy, is given here; we just want to note that it is equal to zero if the two pdfs p and q are identical, and larger than zero if they are different. Hence we can solve the demixing task by minimizing this cost function using stochastic gradient descent. The update equations are given here: the update Delta_W = mu (W^{-T} - E[phi(Y) X^T]) depends on W^{-T} and on this correlation matrix, where the function phi_i is the negative derivative of the log pdf of the original source, phi_i(y) = -(log q_i(y))'. At convergence, the update Delta_W is of course equal to zero, and this is equivalent to the condition that the correlation matrix E[phi(Y) Y^T] is equal to the identity matrix.
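The convergence condition E[phi(Y) Y^T] = I can be checked numerically at the ideal solution Y = S. A sketch assuming unit-variance Laplacian sources, for which phi(y) = sqrt(2) sign(y):

```python
import numpy as np

rng = np.random.default_rng(2)
L = 200_000
# unit-variance Laplacian sources (an illustrative choice of pdf)
S = rng.laplace(scale=1/np.sqrt(2), size=(2, L))
phi = lambda y: np.sqrt(2) * np.sign(y)   # phi(y) = -(log q(y))' for this pdf
C = (phi(S) @ S.T) / L                    # sample estimate of E[phi(Y) Y^T]
```

Up to sampling error, C is the identity: the diagonal equals one by integration by parts, and the off-diagonal terms vanish by independence and the symmetric, zero-mean pdfs.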
The properties of the ICA solution are that it is equal to the inverse solution if there is no noise, but the big difference is that we do not need to know anything about A or S, so it is applicable to blind demixing. The only thing we require is that we know the pdfs of the original sources, and the original sources must be non-Gaussian. If all the pdfs are different, then there is no permutation ambiguity, and if you know the pdfs perfectly, then there is also no scaling ambiguity; only a scaling ambiguity remains if the pdfs are merely estimated.
Now we come to the main theorem of the paper. By a Taylor series expansion of the nonlinear functions phi_i, we can show that the ICA solution is given by the equation shown here, of the form W_ICA ≈ A^{-1} - sigma^2 (M ∘ Rt_V) A^{-1}, where Rt_V = A^{-1} R_V A^{-T} is a transformed correlation matrix of the noise, ∘ denotes the elementwise product, and M is a scaling matrix which depends on the pdfs of the original sources through the parameters kappa and rho given here. We just want to note that kappa is a measure of non-Gaussianity: it is equal to one if and only if S is Gaussian, and in all other cases it is larger than one.
For comparison, we have written down here the MMSE solution, and if you compare this equation and the one at the top, you can see that they are indeed quite similar, except for the scaling matrix M here. If M is approximately a matrix with all elements equal to one, then we can conclude that the ICA solution is close to the MMSE solution, and we can also show that in that case the two MSEs, of the ICA solution and of the MMSE solution, are quite similar.
The elements of the scaling matrix M are determined by the pdf q of the sources. To make further conclusions, we will assume a certain family of pdfs, namely the generalized Gaussian distribution. The pdf is given here, where Gamma is the Gamma function and beta is the shape parameter which controls the shape of the distribution. For example, for beta equal to two we obtain the Gaussian distribution, for beta equal to one the Laplacian distribution, and if we let beta go to infinity, we get the uniform distribution. If we fix the variance to one, then we obtain rho equal to beta minus one. The other parameters, kappa and the elements of the scaling matrix M, are given in the plot and the table here. The diagonal elements M_ii are exactly equal to kappa divided by two, and the off-diagonal elements M_ij are between 0.5 and one.
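A sketch of the unit-variance generalized Gaussian pdf; the parameterization via the scale alpha is one common convention and is an assumption here, not copied from the paper:

```python
import math

def ggd_pdf(s, beta):
    """Unit-variance generalized Gaussian pdf with shape parameter beta."""
    # scale alpha chosen so that the variance equals one
    alpha = math.sqrt(math.gamma(1/beta) / math.gamma(3/beta))
    return beta / (2 * alpha * math.gamma(1/beta)) * math.exp(-(abs(s)/alpha)**beta)
```

With this convention, beta = 2 recovers the Gaussian pdf and beta = 1 the Laplacian pdf, while beta going to infinity approaches the uniform distribution.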
But maybe more interesting than these parameters is the question of what MSE ICA can achieve and how close we can get to the MMSE estimator. For this we consider an example with two generalized Gaussian sources with the same shape parameter beta. The mixing matrix is given here, and we assume Gaussian noise with identity covariance matrix. We have studied the relative MSE, that is, the MSE of the ICA solution divided by the MSE of the MMSE solution. As you can see from the plot on the right-hand side, the relative MSE of the ICA solution is close to one for a large range of the shape parameter beta: it stays below 1.06, so it is only six percent worse than the MMSE estimator. For reference, we have also calculated the relative MSE of the inverse solution for the two SNRs of 10 dB and 20 dB. You can see that the blind approach, ICA, outperforms the inverse solution, which is a non-blind method, over a large range of values of the shape parameter beta.
Up to now we have considered only theoretical results, which are valid only for an infinite amount of data, since we have evaluated all the expectations exactly. In practice, you never have an infinite amount of data. So now we want to look at an actual Kullback-Leibler-divergence-based ICA algorithm with a finite amount of data. In practice, one usually does not use the standard gradient but instead the natural gradient, because it has better convergence properties; the update equation, Delta_W = mu (I - E[phi(Y) Y^T]) W, is shown here.
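A minimal natural-gradient ICA sketch for a noiseless two-source Laplacian case; the step size, iteration count, and mixing matrix are illustrative assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)
n, L = 2, 20_000
# unit-variance Laplacian sources and an assumed invertible mixing matrix
S = rng.laplace(scale=1/np.sqrt(2), size=(n, L))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ S                                  # noiseless mixtures for this sketch

phi = lambda y: np.sqrt(2) * np.sign(y)    # score function for this source pdf

W = np.eye(n)
mu = 0.05                                  # fixed step size (illustrative)
for _ in range(1000):
    Y = W @ X
    # natural-gradient update: Delta_W = mu * (I - E[phi(Y) Y^T]) W
    W = W + mu * (np.eye(n) - (phi(Y) @ Y.T) / L) @ W
Y = W @ X                                  # demixed signals after convergence
```

At convergence the empirical correlation matrix phi(Y) Y^T / L is close to the identity, which is exactly the fixed-point condition discussed earlier.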
Since we are now using a finite amount of data, not only the bias of the ICA solution from the MMSE solution is important; the covariance of the estimation also contributes to the MSE. Furthermore, we assume two identically distributed sources, so ICA suffers from the permutation ambiguity, and we need to resolve it before we can calculate the MSE. Last, the scaling of the ICA components is slightly different from the scaling of the MMSE solution, so we also compensate for this before we calculate the MSE values.
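Resolving the permutation, sign, and scaling ambiguities before computing the MSE could look like the following; aligned_mse is a hypothetical helper for illustration, not the paper's code:

```python
import numpy as np
from itertools import permutations

def aligned_mse(Y, S):
    """MSE after compensating ICA's permutation, sign, and scaling
    ambiguities against the reference sources S (hypothetical helper)."""
    n = S.shape[0]
    best = np.inf
    for perm in permutations(range(n)):
        Yp = Y[list(perm), :]
        # per-component least-squares scale; a negative scale fixes the sign
        a = np.sum(Yp * S, axis=1) / np.sum(Yp * Yp, axis=1)
        best = min(best, float(np.mean((a[:, None] * Yp - S) ** 2)))
    return best
```

For a demixing output that is an exact permuted, scaled copy of the sources, this returns an MSE of zero.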
Here, in the left plot, we show the MSE for Laplacian-distributed signals, for different SNRs and different sample sizes L. The black solid line is the MMSE estimator, and the coloured lines are the actual performance of the ICA algorithm. As you can see, for a large enough sample size we can get quite close to the MMSE estimator, so we can achieve a very good MSE performance even for a low SNR. We can also see that ICA outperforms the inverse solution; this is shown on the right-hand side, where we plot the relative MSE. This line here is the inverse solution: for low SNR its relative MSE increases quite dramatically, whereas the ICA solution still yields a reasonable MSE. The point where the ICA solution and the inverse solution cross depends on the mixing matrix and the sample size. One last point that I want to mention: we have also plotted the theoretical ICA solution, with the downward-triangle markers here, and you can see that it matches the performance of the actual ICA algorithm quite well, except for a very low SNR of 0 dB. This is because we have made a small-noise assumption in the derivation: we have only considered terms up to order sigma^2 in the Taylor series.
We also wanted to study the influence of the shape parameter beta on the performance. Here we plot the relative MSE of the ICA solution for different SNRs: 10 dB, 20 dB and 30 dB. The general trend is that the more non-Gaussian the source, so for beta close to 0.5 or for large values of beta, the lower the relative MSE is, except for this case here at an SNR of 10 dB. One last point: one might wonder why the relative MSE increases for increasing SNR; if you go from 10 dB to 30 dB SNR, why does the relative MSE increase? This can be explained by the fact that the MSE of ICA in the noiseless case is not close to zero but is lower-bounded by the Cramér-Rao bound, which depends on kappa and rho. Hence the relative MSE increases for increasing SNR.
To summarize: in this paper we have derived the ICA solution and the MSE for the noisy case. We have seen that there exists a relation between ICA and MMSE which depends on the pdfs of the original sources, and that the ICA solution, which is of course a blind approach, is close to the MMSE solution. We want to stress that the relation also exists when the nonlinearity phi_i does not match the true pdf. We have seen in the simulation results that we can, in practice, achieve an MSE close to the MMSE estimator with an ICA algorithm based on the Kullback-Leibler divergence. We have also seen that not only the bias of the ICA solution is important, but also the covariance of the estimation contributes to the performance. To sum up everything, I want to state that blind demixing by ICA is in many cases similar to non-blind demixing based on MMSE. Thank you for your attention, and I am happy to take questions.
[Audience question, inaudible.]

Yes, this is assuming that there is no time dependence. And it of course depends on the mismatch: if you assume, for example, the wrong type of distribution, say you assume the sources are sub-Gaussian when they are in fact super-Gaussian, then of course ICA does not work, so you have to assume the correct type. Beyond that, it depends on the amount of mismatch; if the mismatch is reasonably small, it is still a good approach.
[Audience question, inaudible.]

Yes, it could be. As I have mentioned in the paper, you could use this derivation to derive the right phi function, which could give you a lower MSE by yielding a scaling matrix M whose elements are close to one. The problem is that this matrix obviously depends on the SNR, so you would again need to know the SNR.