I think my job is easier, because a lot of the background has already been introduced by the earlier talks. What we're trying to do here is to uncover the cooperative regulation by transcription factors and microRNAs using a Bayesian model. Basically it's a factor regression model, or what we call a hybrid factor model.

The objective is to understand how gene expression, basically transcription, is regulated by transcription factors, which is common knowledge, and by microRNA, a small molecule that was recently discovered to also regulate transcription. As for the approach, I want to convince you that we can use this Bayesian factor model to serve that purpose, using a lot of biological prior knowledge on top of the mRNA and microRNA expression data.

Just a little bit of background, which has already been introduced by a lot of the previous speakers. This is the central dogma of molecular biology. It says that the information flow goes from DNA to mRNA and then to protein, and protein is the basic building block of all living cells.

Here my focus is on transcription, basically how DNA is transcribed into mRNA. The process looks linear here, but actually it is heavily regulated, mainly by two factors. The first one is a protein called a transcription factor. Now you're looking at the DNA. Transcription basically is the copying of one gene in the DNA into a small molecule called mRNA, and the mRNA is later translated into protein. The first regulator, the transcription factor, binds to the upstream promoter region of the gene.
For example, this is a gene, and the transcription factor controls the production, or expression, of its mRNA. Furthermore, people have recently come to understand that another small molecule, called microRNA (the shorthand for it is a little bit confusing), actually binds to a so-called [unclear] region of the mRNA to degrade the mRNA. So together, the transcription factor and the microRNA better explain the complexity of living things.

Why do we have this diversity? In the traditional view, if we look only at transcription factors, we pretty much have a similar set of transcription factors, and yet we are different. So microRNA gives you another layer of explanation.

So here is my goal. Right now we have microarrays and high-throughput sequencing that can be used to measure mRNA; microarrays, for example, were also introduced by the previous speakers. And it happens that, since microRNA is also an RNA, we can measure microRNA as well. So the goal here is to really understand how mRNA, that is, gene transcription, is regulated by microRNA and transcription factors, based on the mRNA measurements and the microRNA measurements. (This was supposed to be highlighted in yellow, but now it's black.) So basically that's the goal here, with these two factors.

So let's see what common approaches have been taken. Basically there is clustering, or networks, in a very simple way. You have probably seen these things quite often: it's a very messy network, where normally each node represents a gene and you have links connecting these different genes. The only meaning here is pretty much that if two genes are linked, they are likely to be associated.
But the problem is how we interpret this, especially if I want to understand transcription regulation. What does this link really tell us about transcription regulation? It's very difficult, actually, because all it says is that the genes are associated. It doesn't say whether the association is through transcription regulation, or some other interaction, and so forth. I was working on this before, but I have kind of stayed away, because it's so hard to interpret. When you present it to people, I don't even know what to tell them, and they don't know how to interpret it.

So basically I want to look at more of the details of the biology and see whether we can really model this process of transcription regulation and microRNA regulation. We start by modeling, like everybody does here. The transcription factor is a protein, so we call its protein activity x. The activity is basically a little bit like abundance: the more transcription factor you have, the more it will possibly regulate gene transcription. So we use x to represent the transcription factor's protein-level activity, and we use z to denote the expression level of the microRNA.

Since the microRNA and the transcription factor regulate the gene product, which is mRNA, we call that y; it can be measured by microarray data. Then we can express this by a simple linear relationship, where we say that the mRNA expression level y is due to the regulation by the protein activity x of the transcription factor and the microRNA expression level z, and a and b are the so-called regulatory coefficients. It's a very simple model, and it pretty much just models the case where there is one transcription factor and one microRNA.
The reality is actually a lot more complex: you have a lot of microRNAs and a lot of transcription factors, and you're going to have a more complicated model like this for one gene. And if you really model the entire genome, where you have possibly twenty to forty thousand genes, then you're looking at basically a matrix model like that.

In this case Y is the measured mRNA expression. It is a matrix in which each row represents a gene and each column represents a sample, for example a patient, or a time point. Each element represents the mRNA expression of one gene in one sample. Similarly, X is the transcription factor activity, and Z is the microRNA expression (sorry, "activity" on the slide is wrong), which, as I said before, can be measured by microarray or high-throughput sequencing. A is the so-called transcription factor regulatory strength, B is the microRNA regulatory strength, and E is the additive noise.

So basically, looking at this equation, we are given Y and Z, and we try to recover the unknowns from Y and Z based on this model. So the goal is: given Y and Z, we want to estimate A, B, and X. How are we going to achieve this?

Traditionally, this model is really a factor regression model: this part is the factor model and this part is the regression model. So this is nothing new, and you can see a couple of different solutions: PCA and ICA, which were already introduced by previous speakers, and NMF. But these types of models are not really sufficient to model the details of the biology.
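As an illustration of the matrix model described above, here is a minimal sketch that generates Y = AX + BZ + E for a toy problem. The dimensions, the numeric values, and the function names `matmul` and `simulate_expression` are assumptions chosen for illustration; they do not come from the talk.

```python
import random

def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def simulate_expression(A, X, B, Z, noise_sd=0.1, seed=0):
    """Return Y = A X + B Z + E, with E drawn as i.i.d. Gaussian noise."""
    rng = random.Random(seed)
    AX = matmul(A, X)
    BZ = matmul(B, Z)
    return [[ax + bz + rng.gauss(0.0, noise_sd) for ax, bz in zip(r1, r2)]
            for r1, r2 in zip(AX, BZ)]

# Toy sizes (assumed): 3 genes, 2 transcription factors, 1 microRNA, 4 samples.
A = [[1.5, 0.0], [0.0, -0.8], [0.0, 0.0]]           # genes x TFs, mostly zero
X = [[1.0, 0.0, 2.0, 0.5], [0.3, 1.2, 0.0, 0.0]]    # TFs x samples, non-negative
B = [[-0.7], [0.0], [-1.1]]                         # genes x microRNAs, non-positive
Z = [[0.2, 0.9, 0.4, 1.0]]                          # microRNAs x samples, measured
Y = simulate_expression(A, X, B, Z)                 # genes x samples, measured
```

In the inference problem of the talk only Y and Z would be observed; A, B, and X are written out here only so that the data can be generated.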
Let me give you some very simple reasons, looking at a relatively realistic scenario so you can get a sense of it.

First, if you want to use PCA to model this, basically PCA assumes that the loading matrix is an orthonormal matrix, so it would be a full matrix. But we now know that each transcription factor actually regulates only a very small set of genes, relatively speaking. It could be up to a couple of thousand genes, but in terms of the overall number of genes, which is twenty to forty thousand, that is sparse. So the matrix should be sparse.

Second, as to where you have a regulation, where there is a nonzero element, we have already accumulated a lot of knowledge about which transcription factor regulates what set of genes, so we should be able to incorporate this type of knowledge.

Thirdly, these samples represent, for example, patients. In the case of a disease, some patients actually have similar expression patterns, which can be used to define subtypes of the disease: if you have a similar subtype, your expression levels should be similar. So these columns should actually be correlated. X and Z, the transcription factor activity and the microRNA expression, should have some correlations, and you should have this grouping among the samples. The traditional methods do not really model it like this.

Also, the transcription factor activity should be non-negative, like what a previous speaker also argued, though that was in the case of gene expression; it's a similar argument. So we need to model non-negative transcription factor activity.
In the case of microRNA, which is known to down-regulate transcription, the loading matrix must be negative: the elements of this matrix need to be negative. So we need to somehow incorporate all these details of biology into the model, and I'm going to tell you how we model each one of them.

To start with, the modeling goals are to model the sparsity of A and B together with the prior knowledge, to model the non-negative transcription factor activity and the negative regulation by microRNA, and then the sample correlation. So we need a lot of things.

Let's start with the sparsity. We use exactly the same model that previous speakers were using here. I just want to point out that pi here basically denotes the prior probability of transcription factor l regulating gene g. There is a lot of prior knowledge available from databases, so we can really incorporate this prior knowledge into pi. So this is how we model the sparsity of A.

In the case of B it's actually a very similar model, except that here we have to use a negative-valued Gaussian to model the down-regulation by microRNA; B is the regulatory matrix of the microRNA. That's the only difference. Again pi is the prior knowledge, and there are databases for this as well. I just want to point out that microRNA regulation is still a very active area of research, so we don't really know exactly how microRNA regulates the genes, not at the level of transcription factors yet, but there are target prediction algorithms that can be used to give you some prior knowledge here. So that's how we model the sparsity and incorporate the prior knowledge.

Then let's model X, the transcription factor activity, which needs to be non-negative. Instead of a truncated Gaussian we use a rectified Gaussian. The only difference is that the rectified Gaussian has probability mass exactly at zero.
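The priors described above can be sketched as follows. This is only an illustration under stated assumptions: the talk does not give the exact distributions, so the spike-and-slab form (exactly zero with probability 1 - pi, Gaussian otherwise), the use of a folded zero-mean Gaussian for the negative microRNA loadings, and the function names `sample_loading` and `sample_rectified_gaussian` are all hypothetical choices.

```python
import random

def sample_loading(pi, sd, rng, negative_only=False):
    """Spike-and-slab style draw: exactly zero with probability 1 - pi.

    pi plays the role of the prior probability that a regulator targets a
    gene, for example taken from a database (assumed usage).
    """
    if rng.random() >= pi:
        return 0.0
    value = rng.gauss(0.0, sd)
    # For microRNA loadings, keep only the negative side (down-regulation).
    return -abs(value) if negative_only else value

def sample_rectified_gaussian(mean, sd, rng):
    """Rectified Gaussian: negative Gaussian draws are set to exactly zero."""
    return max(0.0, rng.gauss(mean, sd))

rng = random.Random(1)
# One row of A: a gene with one well-supported regulator and two unlikely ones.
a_row = [sample_loading(pi, 1.0, rng) for pi in (0.9, 0.05, 0.05)]
# One row of B: a microRNA loading is either zero or negative.
b_row = [sample_loading(pi, 1.0, rng, negative_only=True) for pi in (0.8, 0.1)]
# Transcription factor activity across five samples: non-negative, some zeros.
x_row = [sample_rectified_gaussian(0.0, 1.0, rng) for _ in range(5)]
```

Unlike a truncated Gaussian, which never returns exactly zero, the rectified draw lands on zero whenever the underlying Gaussian is negative.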
There are two benefits of using the rectified Gaussian: one is that it introduces additional sparsity, even in the transcription factor activity, and the other is that it gives a very nice functional form for the Bayesian derivation. So that's how we model the transcription factor activity.

Then there is the sample correlation, which defines, for example, subtypes of patients. The basic assumption is that the samples in the same subtype are similar, so it's naturally a cluster model, a mixture of Gaussians. The problem with a Gaussian mixture is that you need to know the number of clusters, so we actually use a Dirichlet process mixture. To be exact, a Dirichlet process mixture of rectified Gaussians is what we use.

Putting everything together, this is pretty much what the model looks like. We have all these different parts, the factor part and the regression part, and if you put it in a graphical model it looks like this. There are a lot of parameters to estimate, so we have to resort to sampling, in this case Gibbs sampling. Because of these carefully chosen prior distributions, we have all the conditional distributions in closed form, which is the basis of Gibbs sampling. I'm not going to go into the derivations, they are long, but suffice it to say that we are able to derive this Gibbs sampling solution.

We start by looking at simulation data, where we have one hundred genes, with settings that are comparable with the real situation shown later. We assume [unclear] clusters and [unclear]. We look at the mean square error of the estimates, and we also look at the sparsity estimation and the clustering performance.
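One way the sparsity estimation could be scored in such a simulation is by the precision and recall of the recovered nonzero pattern. The sketch below assumes that the sampler produces a posterior probability of being nonzero for each loading and that a loading is called nonzero when this probability exceeds 0.5; both the threshold and the function name `support_precision_recall` are assumptions, not details given in the talk.

```python
def support_precision_recall(true_matrix, posterior_prob, threshold=0.5):
    """Precision and recall of the estimated nonzero pattern of a loading matrix."""
    tp = fp = fn = 0
    for true_row, prob_row in zip(true_matrix, posterior_prob):
        for true_value, prob in zip(true_row, prob_row):
            predicted = prob > threshold   # called nonzero by the model
            actual = true_value != 0       # truly nonzero in the simulation
            if predicted and actual:
                tp += 1
            elif predicted and not actual:
                fp += 1
            elif actual and not predicted:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example (made-up numbers): 2 genes, 3 regulators.
true_A = [[1.2, 0.0, 0.0], [0.0, -0.5, 0.3]]
prob_A = [[0.9, 0.1, 0.7], [0.2, 0.8, 0.4]]
precision, recall = support_precision_recall(true_A, prob_A)
```

Precision asks how many of the called regulations are real, and recall asks how many of the real regulations were called.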
Just to give an idea of how the sampler behaves: in this case it actually converges rather fast. Of course this depends on the different settings and so forth, but generally it converges relatively fast. This is the cluster ID; in this case there were two clusters, and you can see it actually converges fairly fast.

We also look at the performance under different noise conditions. For example, in this case we look at the estimate of the sparse matrix, and we look at the precision and recall. When the noise increases, the precision [unclear], and then both go down in this case. If you look at the clustering performance, it also degrades, as does the estimation, with the increase of the noise.

Then we look at the database, because it supplies the prior knowledge, and the database has problems. There are two types of problems: whether the database really covers all the known knowledge, and whether what the database reports is correct. It's again a precision and recall problem. We set the precision of the database and look again at the performance. What can be seen here is that as the database precision increases, we are better able to really recover these regulations, the nonzero elements.

I can skip this one and jump right into the real data. We are using The Cancer Genome Atlas, which is an NIH project, and we take a look at [unclear]. In particular, we look at the gene expression data [unclear].
We extract all the conditions that have microRNA data; these are patient samples, and in addition there is one normal sample. [unclear] We have [unclear] transcription factors and seven microRNAs.

With these, we go to the databases and see what genes are possibly regulated by this set of transcription factors and the seven microRNAs, and we come up with one hundred thirty-five genes. That is to say, these one hundred thirty-five genes are regulated by these microRNAs and transcription factors under some conditions, because all of this prior knowledge is derived from different conditions; it is not necessarily true here [unclear].

As to the prior knowledge for transcription factor regulation, we go to TRANSFAC and extract the regulatory prior knowledge. For microRNA regulation we have our in-house prediction, from these two papers. So that's basically the setup of the experiments.

This is the posterior probability of the nonzero elements. As you can see, most of them are zero, that is, the probability of being nonzero is very small, and only a small set of probabilities are close to one. Out of all these possible links we uncover one hundred [unclear] regulations, so the result is sparse and interesting. If we look at the uncovered regulations, about one hundred fourteen are already reported in the database, and eleven that are in the database are not uncovered, but we pick up seven additional regulations that are not in the database.

And this is the regulatory network. Each node on this side represents a transcription factor, each node on this side represents a microRNA, and the circle here is made of small nodes stacked together, which represent genes.
Each link has a very clear interpretation: a link here pretty much means that this transcription factor regulates that gene. We can also use the loading matrix to indicate whether the regulation by a transcription factor is up-regulation or down-regulation, while for microRNA it is always down-regulation. This is a heat map of the loading matrix, so there are basically a lot of zeros here.

And this is the uncovered transcription factor activity from the factor model, together with the measurements of the microRNA. This is the fold change. Remember I told you that among the samples there is a normal sample, so we basically use the normal sample as the control to calculate the fold change; otherwise the transcription factor activity should be all positive.

And these are the clusters uncovered by the model. Basically the model uncovers three clusters, and you can see that the expression levels are more or less the same within a cluster. Then we look at the survival for each of these clusters, to see whether they do form some sort of subgroup, a disease subtype. So we look at the survival to see whether, after treatment, the patients in the same group have similar survival. As you can see, patients in different groups have different survival, meaning that this separation does mean something. It can indicate, basically, that the next time you see a patient in this group, you can say that this patient will possibly, after treatment, live longer than a patient in this other group.

Then we look at the p-values of these survival curves, and they clearly show that cluster one and cluster two have the largest survival difference, so the clusters can really be used as an indicator of the effectiveness of the treatment.
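The talk does not say which test produced the survival p-values. A common choice for comparing two survival curves is the log-rank test, so the sketch below assumes that test; the function name `logrank_p_value` and the toy survival times are hypothetical.

```python
import math

def logrank_p_value(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 for death, 0 for censoring."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [row for row in data if row[0] >= t]
        n = len(at_risk)
        n_a = sum(1 for row in at_risk if row[2] == 0)
        d = sum(1 for row in at_risk if row[0] == t and row[1])
        d_a = sum(1 for row in at_risk
                  if row[0] == t and row[1] and row[2] == 0)
        # Observed deaths in group A minus the number expected if both
        # groups shared the same hazard.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    return math.erfc(math.sqrt(chi_square / 2))

# Toy survival times in months (made up); every patient has an observed event.
p = logrank_p_value([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
neg_log_p = -math.log10(p)  # larger means a stronger separation
```

A negative log p-value of this kind makes it easy to compare clusterings: the larger the value, the more clearly the groups differ in survival.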
We also went back to the microRNA and gene expression data to see whether you can come up with a similar result by simply using the microRNA data, the gene data, or the combined microRNA and gene data, without going through the factor analysis, just performing clustering directly on these individual data. And the answer is no. These are the p-values; actually this is the negative log p-value. We have a significantly higher negative log p-value than if you look at microRNA, gene, or microRNA and gene together without using the factor analysis. So this really shows the effectiveness of this factor model.

That pretty much concludes my talk. I would also like to acknowledge the support from NIH and [unclear]. Thank you.

Thank you. We have time for one or two questions. Yes?

In your examples you have quite a small number of genes. Is it that the factor analysis is restricted to very small samples?

So far, yes.

I have a uniqueness question about the model. Can you go back to the slide with the matrix form? Yes, this one. I wonder about this problem: given Y and Z, can you find a unique solution for A, X, and B?

A very good question, and it's actually something really worth studying here. I can't give you a theoretical proof of whether there is identifiability or not. There are additional things I haven't really talked about. For example, we have to restrict the columns of the factor matrix to have unit variance, and the columns of [unclear] need to have the same variance.
Also, the rows of X and Z should be on the same scale: we have to do some preprocessing to put X and Z on the same scale, otherwise you can't identify A and B. But as to exactly what the conditions are, I can't say.

Okay, thank you. Let's thank the speaker again.