So, pitch detection is essential in computational auditory scene analysis. The problem we are interested in is what is known as the cocktail party problem: humans can pick out one voice from a mixture very easily, but machines really cannot do it, and the goal of computational auditory scene analysis is to replicate this human ability with a computational model of hearing.

In such a model we have several stages: a frequency analysis, then a search for discriminative features that model the speakers, namely pitch frequency, onsets and offsets, spatial diversity and so on, and then some grouping at the end. What we would really like is to apply a mask to the spectrogram of the mixture so that we can recover the underlying sources. Here we are interested in single-channel speech separation: we have two sources, both of them speakers, and we do not have any spatial information to exploit.

This is based on our previous work. We propose to use pitch as the main cue; other discriminative features and some prior knowledge about the speakers can also be trained and used in the separation. In order to obtain estimates of the underlying pitch contours of the mixed signal, we use a multi-pitch tracker, which estimates the pitch contours and also assigns them to the individual sources. Many of the existing works only estimate the contours and do not assign each contour to an individual source, and it is often assumed that one of the underlying sources is always the dominant one, which makes the prediction easier. We also differ from a lot of papers aimed essentially at music signals, in which the time-frequency components are more pronounced, so that tracking is relatively easy; for speech it is harder. What is required is robust pitch detection: for a single speaker this is well studied, but here we have another interfering source and we need to recover the pitch of both.

At a high level, the multi-pitch tracker has four stages: peak detection, grouping, separation and interpolation. The multi-pitch detection stage is inspired by earlier work that proposed a distortion measure for pitch estimation.
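The masking idea described above can be illustrated with a minimal sketch. This is not the implementation from the talk; the per-source magnitude estimates, the function name, and the parameters are all assumptions (for instance, the magnitude estimates could be built from the estimated pitch contours with a harmonic model).

```python
import numpy as np
from scipy.signal import stft, istft

def separate_by_binary_mask(mixture, mag_est_1, mag_est_2, fs=16000, nperseg=512):
    """Recover two sources from a single-channel mixture with a binary mask.

    mag_est_1, mag_est_2: assumed per-source magnitude spectrograms with the
    same shape as the mixture STFT (e.g. synthesized from estimated pitch
    contours via a harmonic model).
    """
    _, _, X = stft(mixture, fs=fs, nperseg=nperseg)
    # Each time-frequency cell is given to whichever source is estimated to dominate it.
    mask1 = (mag_est_1 >= mag_est_2).astype(float)
    mask2 = 1.0 - mask1
    _, s1 = istft(mask1 * X, fs=fs, nperseg=nperseg)
    _, s2 = istft(mask2 * X, fs=fs, nperseg=nperseg)
    return s1, s2
```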
Basically, the mixed signal is the sum of the two underlying sources, and the distortion is defined in terms of the deviation of the spectral densities of the underlying sources. In that earlier work the sources are described with a parametric model and the distortion is minimized in order to recover the pitch of each underlying source, one at a time. We use the same concept, but with a sinusoidal model instead, which is more suitable here. So the mixed signal is the sum of the two sources, and our goal in pitch detection is to minimize this distortion.

For this, the classic paper by McAulay and Quatieri showed that the signal can be approximated, in terms of sinusoidal modeling, by a sum of sinusoids located at the peaks of the spectrum, so we represent each source by the amplitudes and locations of its spectral peaks. However, the peaks do not occur exactly at integer multiples of the fundamental frequency, so another parameter is introduced in order to match the peak locations exactly. Because we do not have access to the peak locations of the underlying sources, we apply an approximation, which we found works pretty well in practice. So we first detect the peaks of the mixture and then assign each peak to a source: a peak that is very close to a harmonic of one of the sources is assigned to that individual source. The only remaining problem is then to estimate the two pitch values; we carry out the minimization and obtain an estimate of the pitch of each underlying source in each frame.

Here is an idea of how this works for a two-speaker mixed signal. We have two sources, one with a higher fundamental frequency and one with a lower one. You can see that the peaks of the mixture are not exactly at integer multiples of the pitch frequencies, so we adjust the additional parameter to get an exact match and then minimize the distortion.

The second stage, after peak detection, is grouping the pitch candidates. Pitch detection here is really a two-dimensional problem: we do not want point estimates, we want the whole contour. In the first frame we search for candidates; in the next frame, if the difference between a pitch candidate and the last value of an existing contour is small enough, the candidate is grouped into that contour; if it cannot be grouped into any existing contour, a new contour is created. This is repeated for every candidate in every frame.

The third stage is the separation. We take the contours we have obtained and compare them with the longest one: if a contour is closer to the longest track than a threshold, it is assigned to the same group; if not, it is assigned to the second group. In this way we basically separate the individual tracks into the two sources.
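Here is a minimal sketch of the grouping step just described, under my own assumptions about names and the continuity threshold: per-frame pitch candidates are chained into contours whenever a candidate lies close enough to the last value of an existing contour, and otherwise start a new contour.

```python
def group_candidates(candidates_per_frame, max_jump_hz=10.0):
    """candidates_per_frame: list over frames, each a list of pitch candidates (Hz).
    Returns a list of contours, each a list of (frame_index, pitch_hz) points."""
    tracks = []   # all contours found so far
    active = []   # indices of contours that were extended in the previous frame
    for frame, candidates in enumerate(candidates_per_frame):
        extended = []
        used = set()
        # Try to extend each active contour with the closest unused candidate.
        for ti in active:
            last_pitch = tracks[ti][-1][1]
            best = None
            for ci, cand in enumerate(candidates):
                if ci in used or abs(cand - last_pitch) > max_jump_hz:
                    continue
                if best is None or abs(cand - last_pitch) < abs(candidates[best] - last_pitch):
                    best = ci
            if best is not None:
                tracks[ti].append((frame, candidates[best]))
                used.add(best)
                extended.append(ti)
        # Candidates that could not be attached to any contour start new ones.
        for ci, cand in enumerate(candidates):
            if ci not in used:
                tracks.append([(frame, cand)])
                extended.append(len(tracks) - 1)
        active = extended
    return tracks
```

The greedy nearest-candidate choice and the fixed threshold are illustrative design choices; a frame with no candidates simply terminates the active contours, leaving the kind of gaps the interpolation stage later fills.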
The last stage is interpolation. Because there are gaps in the contours, we need some sort of interpolation in order to recover the missing pitch frequencies: sometimes a source is unvoiced for a moment, perhaps a fraction of a second, and we fill in these gaps by interpolating the estimated data. Here you can see an example of the resulting pitch contours. We also use another heuristic: if two tracks overlap in time, they cannot be assigned to the same source, since in the presence of two sources each source can have only one pitch at a time. You can see that the estimated pitch contours match the reference ones quite closely, so in this way we can detect the underlying pitch of each source.

Now for the results. We used mixtures with all gender combinations, male-male, female-female and male-female, with target-to-interference ratios from 0 to 18 dB, and we segment the signal with a Hamming window and a ten-millisecond overlap between frames. The reference pitch is extracted with Talkin's method, which is very robust and accurate. The error measures are the gross pitch error rate, the voiced/unvoiced error rate, and the separation error. We compare with one of the approaches from DeLiang Wang's group, which applies a gammatone filterbank with a number of channels, and with another method based on a form of harmonic suppression.

Here are the results for the pitch error rate versus the target-to-interference ratio. There are three sets of plots, showing the results for the target and the interferer for each gender combination; there are two lines for the baseline methods, and the dashed line stands for the proposed method. The proposed method performs better, by a significant factor, than the two other techniques for all combinations of mixtures. As you can see, for the target, when the signal is unvoiced we do not incorporate anything: the proposed method does nothing for the unvoiced parts and only works on voiced segments. It still compares very favourably with the other methods for all combinations, male-male, female-female and male-female. In terms of separation error, our method is very robust against variations of the target-to-interference ratio, and it gives better separation performance than the other methods.

There are a number of issues that should still be resolved. When two pitch contours cross each other, how can we assign them to different sources? When the pitch contours are very close together, I do not know whether the system can separate them. We are working to improve the performance by applying some prior knowledge about the speakers; we could also use spatial diversity as another cue, and I have been working on a Bayesian inference method to improve the performance. I would also like to thank the authors of the comparison method, who were the only ones who provided their code freely for other researchers, so we could really compare with their system.
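Relating back to the interpolation stage described above, here is a small sketch with assumed names; plain linear interpolation is used only as an illustration of filling the unvoiced gaps inside an estimated contour, not as the exact scheme from the talk.

```python
import numpy as np

def fill_pitch_gaps(pitch_hz):
    """Fill unvoiced/missing frames (marked as NaN) inside a pitch contour."""
    pitch = np.asarray(pitch_hz, dtype=float)
    voiced = ~np.isnan(pitch)
    if voiced.sum() < 2:
        return pitch  # not enough voiced frames to interpolate from
    idx = np.arange(len(pitch))
    filled = pitch.copy()
    filled[~voiced] = np.interp(idx[~voiced], idx[voiced], pitch[voiced])
    return filled
```

For example, `fill_pitch_gaps([200.0, float('nan'), float('nan'), 206.0])` fills the two missing frames at 202 and 204 Hz.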
You can download the code and some demos from my webpage, so it is easy to try. And now, finally, I am happy to take any questions or comments.

Question: By separation, do you mean the sequential grouping problem? It seems this is not actual signal separation.

Answer: That's right. The pitch tracker assigns the estimated contours to the two speakers, so the separation here is a classification into two classes.

Question: Do I understand correctly that the method does not make any attempt to decide which contour belongs to which speaker?

Answer: Not explicitly; we rely on the fact that we are getting two contours from the two speakers.

Question: Could you say something about the duration of the tracks, and whether this can run online?

Answer: I think the duration of the sentences is about two to seven seconds. The method automatically handles the voiced and unvoiced segments; for evaluation we use the reference pitch.

Question: Did you do anything specifically to recognize the case where one of the sources is not present?

Answer: No, we do not have anything specific for that.

Thank you very much.