Hello everyone.

My name is Sofia Karygianni, and I come from EPFL in Lausanne.

Today I will present this work, which is joint work with my adviser, Professor Pascal Frossard.

It is on manifold approximation with affine models.


This is the overview of my talk.

In the beginning I will talk briefly about the motivation. Then I will present the formulation of the problem and the algorithm that we used to solve it, and finally I will show you the results that we obtained.

So I guess the first question to answer is: why do we bother to use manifolds?

As we all know, transformation invariance is important in many applications like classification. Humans are in a position to relate the transformed versions of a signal easily. However, computers cannot do the same, and even in simple cases like the one shown here, where we have a simple transformation of an image, the pixel values of the images change significantly, and therefore computers are not in a position to directly relate the two signals.

It turns out that manifolds are a natural way to model all the transformed versions of a signal. This is true because manifolds are essentially low-dimensional structures embedded in higher-dimensional spaces.

When we have a transformation, what we actually have is a small number of parameters driving the appearance of a high-dimensional signal. Therefore, there is a direct link between the transformation parameters and the manifold's dimensions.

For instance, in this figure here we are seeing a one-dimensional manifold embedded in a three-dimensional space. Since the dimensionality of the manifold is low, it can be used to model a transformation with one parameter.

On the other hand, we all know that usually we want to have linear models, because they are simple, they are easy to handle, and they offer fast computations, especially for distances.

However, most real-world signals do not have linear manifolds, and in many cases these manifolds are unknown: we do not know their analytic form. What we can do in this case, since we cannot just use one linear model, is to use many linear models to model the manifold.

In this work we are going to use a set of affine models, which are just linear models translated from the origin; from now on we will call them flats.

When we are using such a representation for a manifold, it is important to respect the manifold geometry. That means that we should place the flats in such a way that they fit the data well and do not violate the underlying geometry.

So the setting here is that we have an unknown manifold of dimension d, where d is lower than N, the dimension of the space where the manifold lives.

We have a set of samples coming from this manifold, and in order to model the underlying geometry we also have the neighborhood graph, which is the graph that we get when we connect the neighbouring samples.

What we would like to achieve is to approximate the manifold with d-dimensional flats.


In general, each flat is going to represent a part of the manifold; in our case, since we have samples, it will represent a set of samples. We know that the lowest-dimensional affine space that includes all the points in a set is the affine hull. Therefore, what we would like is to look for sets of samples with d-dimensional affine hulls.

So if we had the samples shown there, we would like to separate them into groups so that each group has a one-dimensional affine hull, because the underlying manifold in this example is one-dimensional.
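To make the affine-hull idea concrete, here is a minimal NumPy sketch (illustrative only, not part of the talk's material): the dimension of the affine hull of a point set equals the rank of the differences from any fixed point, so collinear points in R^3 have a one-dimensional affine hull.

```python
import numpy as np

# Points sampled from a 1-D affine subspace (a line) in R^3.
# The affine hull's dimension equals the rank of the point
# differences (x_i - x_0).
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=10)
base = np.array([1.0, 2.0, 3.0])
direction = np.array([0.5, -1.0, 2.0])
points = base + np.outer(t, direction)          # 10 points on one line

diffs = points[1:] - points[0]
affine_dim = np.linalg.matrix_rank(diffs)
print(affine_dim)  # 1: the affine hull of collinear points is a line
```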


However, it is not that easy to compute the affine hulls. But what we can compute is the tangent space at each sample of the manifold.

We can also show that, in the case where we have a set with a d-dimensional affine hull, the tangent space at each of the samples is going to be identical to this affine hull. So in the previous example we would have these tangents for each of the samples, which is the same line as we had for the affine hull.

Based on this observation, what we can do is search for samples having similar tangents, group them together, and represent them by their mean tangent.

Following this formulation, we can quantify the cost of a grouping by summing, over all the samples in a set, the squared distances between the tangent at each sample and the mean tangent over the group.

The mean tangent is, in general, a d-dimensional subspace, and therefore it lives on the Grassmann manifold, which is the space of d-dimensional subspaces.

A common distance there is the projection distance, which is computed as shown here. It is based on the principal angles between the two subspaces: we sum the squared cosines of the principal angles, or canonical correlations, and subtract them from the dimensionality of the subspaces.

Then the mean tangent can be found as the Karcher mean, which is the tangent that minimizes the sum of squared distances over the tangents.
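As an illustration, here is a small NumPy sketch of the projection distance and of the mean subspace. One convenient property (an aside, not stated in the talk): for the projection distance, the minimizer of the sum of squared distances has a closed form, namely the top-d eigenvectors of the averaged projection matrices, so no iterative optimization is needed in this sketch.

```python
import numpy as np

def proj_distance(U, V):
    # Projection distance between span(U) and span(V), where U and V
    # have orthonormal columns.  The singular values of U^T V are the
    # cosines of the principal angles, so the distance is
    # sqrt(d - sum cos^2 theta_i) = sqrt(sum sin^2 theta_i).
    s = np.clip(np.linalg.svd(U.T @ V, compute_uv=False), 0.0, 1.0)
    return np.sqrt(np.sum(1.0 - s ** 2))

def mean_subspace(bases, d):
    # Closed-form minimizer of the sum of squared projection distances:
    # top-d eigenvectors of the average of the projection matrices.
    P = sum(U @ U.T for U in bases) / len(bases)
    w, Q = np.linalg.eigh(P)
    return Q[:, np.argsort(w)[::-1][:d]]

# Two 1-D subspaces of R^3 at a 45 degree angle.
U = np.array([[1.0], [0.0], [0.0]])
V = np.array([[1.0], [1.0], [0.0]]) / np.sqrt(2.0)
print(round(proj_distance(U, V), 4))   # 0.7071 = sin(45 degrees)

# The mean line bisects U and V, so it is equidistant from both.
M = mean_subspace([U, V], d=1)
print(np.isclose(proj_distance(U, M), proj_distance(V, M)))  # True
```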

Using this cost, we can formulate the problem as shown here. The input of the problem is a set of samples coming from the manifold and the neighborhood graph that is used to model the underlying geometry.

Then what we are trying to figure out is a partition of these samples into groups, so that we minimize the sum of the costs of the groups. In order to verify that we respect the underlying geometry, we also have another constraint which says that the subgraph corresponding to each group has to be connected.

The algorithm we used to solve this problem is a bottom-up approach, and it consists of two basic steps. The input is the sample set, the neighbourhood size K, and the number of flats that we want to use to approximate the manifold. In the first step we compute the tangent spaces; in the second we do the basic greedy region-merging procedure. The output is the groups that we have discovered and the flats that we use to represent them.

In the first step, we first use the parameter K to construct the K-nearest-neighbor graph. Then, based on this graph and by performing an SVD on each sample's neighbourhood, we compute the tangents. In order to account for possible noise or outliers in our dataset, we further smooth the tangents by using Gaussian kernels on the Grassmann manifold.
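The first step can be sketched as follows in NumPy. This is a minimal illustration under my own naming, not the authors' code: it builds a symmetric K-nearest-neighbor graph and estimates each tangent by local PCA (an SVD of the centered neighbourhood); the Gaussian-kernel smoothing on the Grassmann manifold is omitted for brevity.

```python
import numpy as np

def knn_graph(X, k):
    # Symmetric k-nearest-neighbour adjacency from pairwise distances.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    A = np.zeros((len(X), len(X)), dtype=bool)
    for i, row in enumerate(np.argsort(D, axis=1)[:, :k]):
        A[i, row] = True
    return A | A.T                      # symmetrize

def tangent_space(X, A, i, d):
    # Local PCA on the neighbourhood of sample i: the top-d principal
    # directions estimate the d-dimensional tangent space.
    nbhd = X[np.append(np.flatnonzero(A[i]), i)]
    centered = nbhd - nbhd.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return Vt[:d].T                     # orthonormal basis, shape (N, d)

# Noise-free circle in R^2: the tangent at each point is the line
# perpendicular to the radius.
theta = np.linspace(0.0, 2 * np.pi, 100, endpoint=False)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)
A = knn_graph(X, k=4)
T0 = tangent_space(X, A, 0, d=1)        # tangent at the point (1, 0)
print(np.allclose(np.abs(T0.ravel()), [0.0, 1.0], atol=1e-2))  # True
```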

The second step is the basic procedure of the algorithm, which is a greedy optimisation. The basic idea is that we start with M groups, each representing one sample at the beginning, and then at each iteration we merge the neighbouring groups with the minimum merging cost.


The merging cost is in fact the difference between the cost of the merged group minus the costs of the groups before the merge. If we use the formula we had before for these costs, we will see that it depends on the mean tangent over the merged group. Since this mean tangent is the outcome of an optimization procedure, it would not be feasible to compute it for every possible merge.

What we do instead is compute an upper bound for this quantity that does not depend any more on the mean tangent of the merged group, but only on the mean tangents of the groups as they are so far.

Using this upper bound, we start with M groups and then we iterate: at each iteration we compute the costs for all possible mergings, we decide which merging has the minimum cost, we perform it, and then we set the flat representing the new group as the mean tangent of the merged group. Then we check how many groups we have, and if we have reached the desired level, we stop.
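The greedy merging step can be sketched like this. It is a toy illustration with my own helper names, not the talk's implementation: for clarity it recomputes the exact merged cost for every candidate pair instead of the upper bound described above, and it enforces the connectivity constraint by only merging groups linked by a graph edge.

```python
import numpy as np

def subspace_dist2(U, V):
    # Squared projection distance between subspaces (orthonormal bases).
    s = np.clip(np.linalg.svd(U.T @ V, compute_uv=False), 0.0, 1.0)
    return float(np.sum(1.0 - s ** 2))

def mean_subspace(bases, d):
    # Closed-form minimizer of the sum of squared projection distances.
    P = sum(U @ U.T for U in bases) / len(bases)
    w, Q = np.linalg.eigh(P)
    return Q[:, np.argsort(w)[::-1][:d]]

def greedy_merge(tangents, edges, n_flats, d):
    # Start with one group per sample; repeatedly merge the pair of
    # neighbouring groups with the smallest merging cost until only
    # n_flats groups remain.  'edges' is the neighbourhood graph, which
    # keeps every group a connected region of the manifold.
    groups = [[i] for i in range(len(tangents))]
    def cost(g):
        M = mean_subspace([tangents[i] for i in g], d)
        return sum(subspace_dist2(tangents[i], M) for i in g)
    while len(groups) > n_flats:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                if any((i, j) in edges or (j, i) in edges
                       for i in groups[a] for j in groups[b]):
                    delta = (cost(groups[a] + groups[b])
                             - cost(groups[a]) - cost(groups[b]))
                    if best is None or delta < best[0]:
                        best = (delta, a, b)
        _, a, b = best
        groups[a] = groups[a] + groups.pop(b)
    return groups

# Toy chain: four 1-D tangents in R^2, two nearly horizontal and two
# nearly vertical; the chain graph links consecutive samples.
def line(angle):
    return np.array([[np.cos(angle)], [np.sin(angle)]])

tangents = [line(0.0), line(0.1), line(np.pi / 2), line(np.pi / 2 + 0.1)]
edges = {(0, 1), (1, 2), (2, 3)}
groups = greedy_merge(tangents, edges, n_flats=2, d=1)
print(sorted(sorted(g) for g in groups))  # [[0, 1], [2, 3]]
```

The similar-tangent pairs merge first because merging across the corner of the chain would pull the mean tangent far from both members.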

In fact, we do not count every single group, but only the ones that are significant; we define as significant the groups that hold more than two percent of the total samples. That is something that we do to make sure that we do not pay too much attention to outliers.

At the end of the algorithm, we output the groups that we have found and their representative flats, which are computed as the subspaces corresponding to the d largest eigenvalues of each group's data.

In order to evaluate the performance of our algorithm, we have compared it with three different state-of-the-art schemes.

The first one is hierarchical divisive clustering, which is a top-down approach; the linearity measure that is used there is the deviation between the Euclidean and the geodesic distances.

The second one is hierarchical agglomerative clustering. This is a bottom-up approach, and the linearity measure that is used is again the deviation between the Euclidean and the geodesic distances.

We have also compared with k-flats, which comes from the family of subspace clustering methods. However, it is not directly applicable to manifolds, because usually these methods do not have any constraint that would lead them to respect the manifold geometry.

An example is shown here: as we can see, the flats that the algorithm finds for representing the groups do not follow the geometry of the manifold.

For our experiments, the datasets that we used are the Swiss roll and the S-curve. The training set consisted of two thousand randomly sampled points, and the testing set of five thousand randomly sampled points.

For the testing, what we do is the following: for each testing sample, we compute its K nearest neighbours from the training set, and then by doing majority voting over them we find to which flat we should project it. Then, by projecting and finding the distance between the actual sample and its projection, we have a reconstruction error for this sample, and by summing over all the samples of the testing set we compute the mean squared error.
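The testing procedure can be sketched as below. The helper names and the toy flats are my own (hypothetical), and the example uses two parallel lines in R^2 as the "flats" so that the expected error is easy to check by hand.

```python
import numpy as np

def mse_reconstruction(X_train, labels, flats, X_test, k):
    # For each test sample: find its k nearest training samples, pick a
    # flat by majority vote over their group labels, project onto that
    # flat, and accumulate the squared reconstruction error.
    err = 0.0
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)
        votes = labels[np.argsort(d)[:k]]
        g = np.bincount(votes).argmax()
        c, U = flats[g]                 # flat = (offset point, orthonormal basis)
        proj = c + U @ (U.T @ (x - c))
        err += np.sum((x - proj) ** 2)
    return err / len(X_test)

# Training samples on two horizontal lines, y = 0 (label 0) and y = 1
# (label 1); each flat is the corresponding line.
X_train = np.stack([np.linspace(0, 1, 20), np.zeros(20)], axis=1)
X_train = np.vstack([X_train,
                     np.stack([np.linspace(0, 1, 20), np.ones(20)], axis=1)])
labels = np.array([0] * 20 + [1] * 20)
e1 = np.array([[1.0], [0.0]])
flats = {0: (np.array([0.0, 0.0]), e1), 1: (np.array([0.0, 1.0]), e1)}

# Both test points sit 0.1 away from their nearest line, so each has
# squared error 0.01 and the mean squared error is 0.01.
X_test = np.array([[0.5, 0.1], [0.3, 0.9]])
print(round(mse_reconstruction(X_train, labels, flats, X_test, k=5), 3))  # 0.01
```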

The results for the Swiss roll are shown here. On the horizontal axis we have the number of flats, and on the vertical axis the mean reconstruction error. As we can see, in general the performance of our scheme is better, especially in the middle range, from fifteen to thirty-five flats.

The picture is the same for the S-curve dataset, where the difference is even more obvious; again we see that for all numbers of flats, our scheme performs better than the rest.

We have also plotted the groups that we get for the S-curve, just to verify that each group is a connected region of the manifold.

What I have presented to you today is a greedy bottom-up algorithm to approximate manifolds with low-dimensional flats. The linearity measure that we use is the difference of the tangents, and we have seen that in our experiments it performs well on the manifold approximation problem.

Possible applications for this scheme could be data compression and multiview classification, and in general cases where we have to recognise a signal from any transformed version of it. Thank you.


Are there any questions?


Okay, a question: what is actually the regularity assumption on the manifold for your processing scheme? Because I imagine it would be difficult to learn a manifold that is irregular.

We can handle some irregularity, because we apply smoothing in the tangent space computation, but of course we have to be able to compute meaningful tangents. That means the manifold has to be differentiable at each sample, or we have to smooth it at the beginning. So the cases that we can handle are manifolds that are reasonably well behaved; we cannot have something really irregular, like sharp edges, or we would have to smooth it out in the beginning or apply some smoothing during the process.

Have you tried it on real data, to approximate manifolds of real images?

We are trying to do that now; it is part of our future work.


Any other questions?
