0:00:15 Ladies and gentlemen, welcome to this expert session on trends in audio and acoustic signal processing.
0:00:23 It is nice that so many of you came, and thank you in advance for postponing your lunch break a bit.
0:00:31 I hope we will make it interesting.
0:00:34 I was just realizing that we could also use this opportunity to do some advertisement for our TC, which is the Technical Committee on Audio and Acoustic Signal Processing.
0:00:44 As I'm not really prepared for this, please take the whole thing as advertisement for our TC, and whoever wants to get involved, please contact us.
0:00:55 There are various ways of getting involved in our activities, and of course we first would like to tell you what this is about.
0:01:03 So, in my role as chair of this TC, I would like to present to you two experts, both from our TC, who represent the acoustic signal processing community and the audio community in a very specific and, I think, very renowned way.
0:01:21 I would first like to point to Patrick Naylor; please step forward so that you can be seen.
0:01:30 Patrick Naylor is from Imperial College London, and I think the most important thing about him right now is that he just recently published the first book on speech dereverberation.
0:01:45 For everything else, you might look at his slides, which also have very nice pictures.
0:01:52 On the other hand I have Malcolm, well known to the audio and especially the music community, although his scope goes actually much beyond that; he comes from industrial research.
0:02:07 I should not forget to mention that both actually have ties to both worlds: Malcolm is also teaching at Stanford, and Patrick also has industry relations.
0:02:20 So without further ado, I would say I should stop.
0:02:28 Well, thanks very much for coming along to this session; I hope it is going to be interesting for you.
0:02:34 We tried to think about what you might expect from this kind of session, and I have to say that the idea of trends is a very personal thing.
0:02:45 So we came to present what we personally think are hopefully interesting things, but obviously within the time constraints we can't cover everything.
0:02:54 Some of these things are easy to define, like counting papers as a measure of activity, or counting achievements, maybe in terms of accepted papers rather than submitted papers.
0:03:07 Some of them are much less, how would you call it, quantifiable; they are more soft concepts. But we tried to work around this a little and see what we could find.
0:03:21 So the first thing we did was to look at the distribution of submissions to the Transactions on Audio, Speech, and Language Processing.
0:03:30 I plotted this out; there is a lot of detail on this pie chart, but the thing to note is that there are some big subjects which are very active within our community in terms of the amount of effort going into them.
0:03:45 Speech enhancement is a big one and has been for a long time.
0:03:50 Source separation continues to be very active; we have had ICA sessions here at ICASSP.
0:03:58 Microphone array signal processing is still very big, showing up in something like thirteen percent of submissions.
0:04:05 Content-based music processing, let's just call it music processing: music is huge for us now and continues to grow.
0:04:17 This is a real evolution that we are seeing, maybe even a revolution, in our profile of activities.
0:04:26 We could also look at audio analysis as a big topic.
0:04:30 The ones that I have highlighted there are the ones that we are going to try to focus on in this session; as I mentioned, we can't possibly focus on everything.
0:04:39 So that leads us to music.
0:04:41 Music has become very big here, as Patrick mentioned, and this year at ICASSP there are three sessions, as you can see listed there.
0:04:49 There are a number of reasons I thought worth highlighting, just because it is interesting to see how the community developed.
0:04:54 The first reason is that the EDICS, which is how people describe the papers they are submitting to the conference, was changed to include music as an option.
0:05:05 It is a rather bureaucratic reason, but it probably has a lot to do with the fact that there are some music papers now at ICASSP, and I think that is a good idea.
0:05:15 A second reason is that there is a lot more content to work with; music is easy to work with since, as you know, we all own large collections.
0:05:22 And the third reason is that it has become very commercially relevant in the last few years; iTunes and Pandora are two examples of companies who are making a lot of money from music ideas.
0:05:37 As I mentioned, the data is easy; we all have large CD collections.
0:05:44 One of the things that is difficult about music is that it is all copyrighted, or at least all the stuff we want to work with is copyrighted.
0:05:50 One way that the community has dealt with this is MIREX, which I will talk about in a little bit.
0:05:56 Another way that the community has worked with this is to create what is called the Million Song Dataset, and the idea of this is to distribute features of the songs, not the actual copyrighted material.
0:06:11 Forgive me if I am wrong, but I think it is about a hundred features per song, and they are over time too.
0:06:19 Columbia and The Echo Nest provide this database online; there is a lot of data there that people can use, it is freely available, and it is very large.
0:06:29 I expect we will see more and more papers that use this database.
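The Million Song Dataset has its own file format and API, which is not shown here; purely as a toy illustration of the distribute-features-not-audio idea, this sketch computes a small time-varying feature set (the feature choices and function names are mine, not the dataset's):

```python
import numpy as np

def frame_features(x, sr=22050, frame=2048, hop=1024):
    """Per-frame loudness (dB) and spectral centroid (Hz): a stand-in for
    shipping time-varying features instead of the copyrighted audio."""
    feats = []
    for start in range(0, len(x) - frame + 1, hop):
        w = x[start:start + frame] * np.hanning(frame)
        mag = np.abs(np.fft.rfft(w))
        freqs = np.fft.rfftfreq(frame, 1.0 / sr)
        loudness = 10 * np.log10(np.sum(mag ** 2) + 1e-12)
        centroid = np.sum(freqs * mag) / (np.sum(mag) + 1e-12)
        feats.append((loudness, centroid))
    return np.array(feats)  # shape (n_frames, 2): features over time

# One second of a 440 Hz tone: the centroid sits near 440 Hz in every frame.
sr = 22050
t = np.arange(sr) / sr
F = frame_features(np.sin(2 * np.pi * 440 * t), sr)
```

Only `F` would be distributed; the waveform itself never leaves the feature extractor.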
0:06:34 MIREX, the Music Information Retrieval Evaluation eXchange, has been the best thing for the scientific component of music analysis and music processing.
0:06:42 This is the list of tasks being worked on for the 2011 competition.
0:06:48 Data matching is a big issue, and what the MIREX people do is provide an environment at the University of Illinois where people can run their algorithms on a large database of songs.
0:07:00 The songs never leave the university. So instead of getting the data, running your algorithms, and sending results back, you send your algorithm to the university,
0:07:08 in a particular environment, a Java environment; they debug it, they help you get it up and running, then they run the algorithm on their machines and their clusters and give you back the results.
0:07:18 I want to highlight three tasks, circled right there, that are very important and very popular.
0:07:26 One is audio tag classification: how you tag audio with various things. Is it happy? Is it blues? Anything you can think of can be a tag, and people work at that very hard.
0:07:38 Multiple fundamental frequency estimation and tracking has been popular for a long while; it was popular before MIREX started,
0:07:45 but MIREX, I think, with a common database, really raised the scientific level, because now people can compare things on common ground.
0:07:53 And the other one is audio chord estimation. In a sense a chord is just another tag, but a very specialized kind; it helps people understand music, and people work on it a lot.
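To give a flavour of what chord estimation involves, here is a generic textbook baseline (not any particular MIREX entry): match a chroma vector, the energy in each of the twelve pitch classes, against binary chord templates.

```python
import numpy as np

NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def chord_templates():
    """Unit-norm binary chroma templates for the 12 major and 12 minor triads."""
    temps = {}
    for root in range(12):
        for name, intervals in (('maj', (0, 4, 7)), ('min', (0, 3, 7))):
            t = np.zeros(12)
            t[[(root + i) % 12 for i in intervals]] = 1.0
            temps[NOTES[root] + name] = t / np.linalg.norm(t)
    return temps

def estimate_chord(chroma):
    """Pick the template with the largest inner product with the chroma vector."""
    temps = chord_templates()
    c = chroma / (np.linalg.norm(chroma) + 1e-12)
    return max(temps, key=lambda k: np.dot(temps[k], c))

# Energy on C, E and G should come out as C major.
chroma = np.zeros(12)
chroma[[0, 4, 7]] = [1.0, 0.8, 0.9]
print(estimate_chord(chroma))  # prints Cmaj
```

Real systems add smoothing over time and richer chord vocabularies, but this inner-product scoring is the usual starting point.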
0:08:03 Something else that has happened, and has been very active this year, is work on separation and analysis, and there are many very different model-based approaches.
0:08:13 This particular graphical model is from one paper I picked out.
0:08:23 It shows a sequence of notes along the top; in this case you have the score, so you know what is being played, and that is hard information to get.
0:08:30 Then from there you generate data about the harmonics: you have the amplitude, the frequency, and the variance of a Gaussian in the spectral domain.
0:08:45 Then you have the observable variables, which are the spectral slices.
0:08:49 What you are trying to do, given the note sequence, is to build or find the emission probabilities that describe the music.
0:09:01 From that you can do a lot of very interesting work; you can do things like the tagging I mentioned, for things like emotion and genre.
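A minimal caricature of this kind of emission model, with every parameter choice (harmonic count, amplitude decay, Gaussian width, scoring rule) purely illustrative rather than taken from the paper: each note predicts a spectrum of Gaussian bumps at its harmonics, and an observed spectral slice is scored against each candidate note.

```python
import numpy as np

def note_spectrum(f0, freqs, n_harm=5, sigma=20.0):
    """Expected magnitude spectrum of a note: one Gaussian bump per harmonic,
    amplitudes decaying as 1/h (all illustrative choices)."""
    s = np.zeros_like(freqs)
    for h in range(1, n_harm + 1):
        s += (1.0 / h) * np.exp(-((freqs - h * f0) ** 2) / (2 * sigma ** 2))
    return s / np.linalg.norm(s)

def best_note(slice_mag, freqs, candidates):
    """Score each candidate f0 by the inner product of its expected spectrum
    with the observed slice (a stand-in for an emission probability)."""
    slice_mag = slice_mag / (np.linalg.norm(slice_mag) + 1e-12)
    return max(candidates, key=lambda f0: np.dot(note_spectrum(f0, freqs), slice_mag))

freqs = np.linspace(0, 4000, 2048)
obs = note_spectrum(220.0, freqs)                    # a clean A3 "slice"
print(best_note(obs, freqs, [196.0, 220.0, 247.0]))  # prints 220.0
```

In the actual model these scores become probabilities in a graphical model conditioned on the score, rather than a hard argmax per slice.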
0:09:10 Something that is close to my heart, and that shows the kind of work being done in this area, is some work on audio morphing.
0:09:19 The question the authors wanted to ask was: what is the right way to think about audio perception in morphing?
0:09:29 If you do morphing linearly, the path in feature space should be a line: if you are morphing from one position to another, the feature moves along a line in the original domain, and you want the same sort of thing to happen in the auditory domain.
0:09:46 The graph shown here on the left, with poor print quality, just gives you a sense of a range of line spectral frequency envelopes.
0:09:56 On the right-hand side are all the perceptual measures that have been calculated based on these LSFs.
0:10:05 What they are doing is looking for one that is a straight line, which you can see in the middle there; some pieces work better than others, and I think that research is still ongoing.
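The underlying point, that a straight line in one parameter domain need not be straight in another, can be shown with a toy model; the "envelope" mapping below is invented for illustration and is not the paper's representation.

```python
import numpy as np

def lsf_to_env(lsf, freqs):
    """Toy spectral 'envelope': one Gaussian bump per line spectral frequency."""
    return sum(np.exp(-((freqs - f) ** 2) / (2 * 50.0 ** 2)) for f in lsf)

freqs = np.linspace(0, 4000, 512)
lsf_a = np.array([500.0, 1500.0, 2500.0])
lsf_b = np.array([800.0, 1800.0, 2800.0])

# Morph by moving on a straight line in LSF space...
alphas = np.linspace(0, 1, 5)
envs = [lsf_to_env((1 - a) * lsf_a + a * lsf_b, freqs) for a in alphas]

# ...and measure how far the midpoint envelope is from the straight-line
# midpoint in the envelope domain: nonzero means the path bends there.
midline = 0.5 * (envs[0] + envs[-1])
bend = np.linalg.norm(envs[2] - midline) / np.linalg.norm(midline)
print(bend)  # clearly greater than zero: the path is curved in this domain
```

The research question is then which (perceptual) domain makes the morph trajectory straightest.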
0:10:17 Right, so the Audio and Acoustic Signal Processing TC covers quite a wide range of areas.
0:10:27 I have to say that to me they are exciting, and I hope you feel that same excitement about the technologies being developed.
0:10:34 And I think we see trends where a lot of the ideas have been in the laboratory for many years and are now starting to come to the point of applications, industrial applications; we heard about some of these in the plenary.
0:10:47 In that kind of context, if we look at the research that we do, I ask the question of how much of it is driven by the ideas that we have for exciting applications,
0:11:00 and how much of it is fundamental: how much of it underpins technology with good algorithmic research.
0:11:08 So, is there a happy marriage here?
0:11:14 I hope the Duke and Duchess of Cambridge will forgive me for using that photograph, but there is a serious point behind this. Before we come to the serious point:
0:11:29 of course Prince William is very, very pleased, having now found his very fine bride; he has maximized his expectations, and I hope they had a very happy day.
0:11:48 Coming back to something a little bit more serious, I think things which look good have to be underpinned by algorithmic and fundamental research.
0:12:00 So if there is a trend towards things that look great, let's just not lose sight of the fact that the power behind them is the algorithms that we develop.
0:12:13 One of the areas of algorithmic research which is very hot, and has been for a long time, is array signal processing as applied to microphones, and maybe also loudspeaker arrays.
0:12:25 Here we see a number of applications. Hearing aids have been a very busy area for a long time, with many applications as well as excellent underpinning technology.
0:12:36 I do see now a big branch out into the living room, and the living room means TV; it means entertainment; perhaps it means an Xbox 360 with a Kinect microphone array; perhaps it means Sky TV.
0:12:51 So these are new applications which are really coming on stream now, and I think they will start to shape the way that we do research.
0:13:00 The tasks haven't changed that much: we still want to do localization, we still want to do tracking, we still want to extract the desired source from a mixture, be it noise or other talkers.
0:13:11 But a new task is to try to learn something about the acoustic environment by inferring it from the multichannel signals that we can obtain with the microphone array; this gives us additional prior information on which we can condition our estimation.
0:13:31 Another question is what kind of microphone array we should use and how we can understand how it is going to behave.
0:13:38 People started off perhaps looking at linear arrays, and are certainly extending into planar and cylindrical and spherical arrays, even distributed arrays that don't really have any fixed geometry.
0:13:50 The design of such arrays, including the spacing of the microphone elements and their orientation, is an important and expanding topic, I think.
0:13:59 People started off with linear arrays, a bunch of microphones in a line.
0:14:04 This is the well-known Eigenmike from mh acoustics: thirty-two sensors on the surface of a rigid sphere of eight centimetres or so.
0:14:13 From laboratory prototypes, these have now come into real products you can buy.
0:14:20 Connected to your TV set, Sky TV has the opportunity to include microphone arrays for relatively low cost, such that you can communicate using your living room equipment with very low-cost communications hardware.
0:14:39 The challenge there is that you are probably sitting four metres away from the microphone, so this is going to be, I think, a really hot application for us in the future.
0:14:52 Interestingly, people are still doing fundamental research, so I am pleased to see that. Here is a paper I picked out, I can't say at random, but it caught my eye.
0:15:01 Here is the problem: given N sources and M microphones, where should you put the microphones?
0:15:09 In this work, given a planar microphone array, there is some analysis which enables one to predict the directivity index obtained for different geometries, and therefore obviously then allows optimization of those geometries.
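As a sketch of the kind of quantity involved, here is a standard textbook directivity-index computation, discretized numerically; this is generic, not the cited paper's analysis, and all the geometry numbers are illustrative.

```python
import numpy as np

def directivity_index(positions, weights, theta_look, f=1000.0, c=343.0):
    """Directivity index (dB) of a line array in a 2-D isotropic field:
    power response in the look direction over the average response
    across all arrival angles."""
    k = 2 * np.pi * f / c
    def response(theta):
        d = np.exp(1j * k * positions * np.cos(theta))  # plane-wave steering vector
        return np.abs(np.vdot(weights, d)) ** 2
    angles = np.linspace(0, 2 * np.pi, 720, endpoint=False)
    return 10 * np.log10(response(theta_look) / np.mean([response(a) for a in angles]))

# Delay-and-sum weights, broadside look, 4 microphones spaced 5 cm apart.
pos = np.arange(4) * 0.05
w = np.ones(4) / 4
di = directivity_index(pos, w, theta_look=np.pi / 2)
print(di)  # a few dB of directivity at 1 kHz
```

Optimizing geometry then means searching over `positions` (and `weights`) to maximize this quantity, possibly averaged over frequency.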
0:15:29 Okay, so source separation is another hot topic and has been for a while.
0:15:34 I thought I should say that obviously trends start somewhere; a trend has to begin with a trendsetter.
0:15:42 I put up this photograph of Colin Cherry simply because I think he used to have the office which is above my office now, so I also feel some kind of proximity effect.
0:15:54 His definition of the cocktail party effect in his nineteen-fifties book on human communication is often quoted in people's papers.
0:16:03 The early experiments were asking about the behavior of listeners when they were receiving two almost simultaneous signals, and he called that the cocktail party effect.
0:16:14 I put the picture up on purpose because I don't think many people would really have a good image of what a cocktail party was in nineteen fifty; I guess it looks a bit different now.
0:16:29 But anyway, progress in this area has led us to be able to handle cases with both the determined and the underdetermined scenarios.
0:16:39 Clustering has been a very effective technique, the permutation problem has been addressed with some great successes as well, and now we are starting to see results in the practical context where we have reverberation as well.
0:16:56 The usual effect of reverberation is talked about in the context of dereverberation algorithms for speech enhancement, and this is something that I have myself tried to address.
0:17:08 Perhaps we are now at the stage where there is a push to take some of these algorithms from the laboratory and start to roll them out into real-world applications; then we will learn whether they work or not.
0:17:22 We have to address both the single-channel and the multichannel cases, often by using acoustic channel inversion if we can estimate the acoustic channel.
0:17:33 And although the slide title is speech enhancement, of course reverberation is widely used, both positively and with negative effects, also in music, so let's not lose sight of that.
0:17:48 The other factor which I wanted to touch on here was synergy: interdisciplinary research is often a favoured modality, and in our community we can see some benefits coming from cross-fertilization of different topic areas.
0:18:04 For example, dereverberation and blind source separation: we start to see papers where these are jointly addressed, with some good leverage from both types of techniques.
0:18:20 Or dereverberation coupled with speech recognition: a classical speech recognizer is enhanced such that it has knowledge of the models of clean speech but also has models for the reverberation, and by combining these it is able to make big improvements in word accuracy.
0:18:45 So I want to talk a bit about a recurring theme that I have been seeing over the last two years, both in this community and elsewhere, and I thought I would mention it here first: sparsity.
0:18:57 And no, we're not talking about my hair.
0:19:03 I first saw this in the matching pursuit work that was presented here in ninety-seven, I think; it was first done in the Transactions on Signal Processing in ninety-three.
0:19:14 At the time I thought it was interesting but a dumb idea, and so now I must correct myself.
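For reference, the matching pursuit idea itself fits in a few lines; this is a generic sketch over a random dictionary, not the original implementation.

```python
import numpy as np

def matching_pursuit(x, D, n_iter=10):
    """Greedy matching pursuit: at each step pick the dictionary atom
    (column of D, assumed unit-norm) most correlated with the residual."""
    residual = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_iter):
        corr = D.T @ residual
        k = np.argmax(np.abs(corr))
        coeffs[k] += corr[k]
        residual -= corr[k] * D[:, k]
    return coeffs, residual

# Overcomplete dictionary: 64-dimensional signals, 128 random unit-norm atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)

# A signal built from 3 atoms is recovered with a sparse coefficient vector.
x = 2.0 * D[:, 5] - 1.5 * D[:, 40] + 1.0 * D[:, 99]
coeffs, residual = matching_pursuit(x, D, n_iter=30)
print(np.linalg.norm(residual))  # shrinks toward zero as atoms are subtracted
```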
0:19:21 But it has shown up in a number of interesting places in the work that has been done at ICASSP and elsewhere.
0:19:28 Compressed sensing, a few years ago, was probably the best example in this community.
0:19:34 And we have seen it continue: sparsity has been a big part of the work that has been done on deep belief networks in machine learning, and I think that has been interesting.
0:19:51 In a lot of the papers that we saw this year, L1 regularization is a way of providing solutions that make sense when you have a very overcomplete, very complex basis set.
0:20:06 So I titled this slide sparsity, but it is probably better described as sparsity in combination with overcomplete basis sets, and I think that combination is interesting.
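A minimal sketch of L1-regularized fitting over an overcomplete basis, using plain iterative soft-thresholding (ISTA); all sizes and constants here are illustrative.

```python
import numpy as np

def ista(x, D, lam=0.1, n_iter=200):
    """Iterative soft-thresholding for min_a 0.5*||x - D a||^2 + lam*||a||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        z = a - grad / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

# 64-dimensional signals over 128 random unit-norm atoms: overcomplete.
rng = np.random.default_rng(1)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
x = 2.0 * D[:, 7] - 1.2 * D[:, 60]

a = ista(x, D, lam=0.05, n_iter=500)
print(np.count_nonzero(np.abs(a) > 0.05))  # only a handful of active atoms
```

The L1 penalty is what keeps the solution sparse even though the system `D a = x` is underdetermined.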
0:20:18 One example of that was talked about a little bit ago, in the session before this, in work using a cortical representation to model sound.
0:20:33 The cortex is probably the original sparse representation; it predates all of us.
0:20:40 The idea is that you want to represent sound with the least amount of biological energy, and what seems to work well there is to use spikes that represent very distinct sound atoms. How they all get put together is still a matter of discussion, but I think it has been and is going to be interesting.
0:20:59 The way it has been used is to take noisy speech, put it through this very overcomplete basis set,
0:21:12 and then filter it, keeping only the regions that are likely to contain speech.
0:21:19 So in a sense it is a Wiener filter, but in a very rich environment where it is very easy to separate speech from noise and things like that.
0:21:28 What is on the bottom is noisy speech, then the kind of filter that makes sense for speech, which for example has a lot of energy at around four hertz modulation rate, and then the clean speech on the top.
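The "Wiener filter in a rich representation" idea can be caricatured in the ordinary STFT domain; this sketch uses oracle knowledge of signal and noise purely to show the gain rule, which a real system would have to estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, frame, sr = 16384, 512, 16000
t = np.arange(n) / sr
clean = np.sin(2 * np.pi * 300 * t)          # stand-in for the speech component
noise = 0.5 * rng.standard_normal(n)
noisy = clean + noise

def stft(x):
    frames = x[: n // frame * frame].reshape(-1, frame) * np.hanning(frame)
    return np.fft.rfft(frames, axis=1)

X, S, N = stft(noisy), stft(clean), stft(noise)
gain = np.abs(S) ** 2 / (np.abs(S) ** 2 + np.abs(N) ** 2 + 1e-12)  # Wiener gain
est = X * gain

# The filtered spectrum is much closer to the clean one than the noisy input.
err_noisy = np.linalg.norm(X - S)
err_est = np.linalg.norm(est - S)
print(err_est < err_noisy)
```

The argument in the talk is that the richer and more overcomplete the representation, the more cleanly the speech-like regions separate from everything else, so the same gain rule works better.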
0:21:40 The deep belief networks are interesting, I think, for similar reasons; this all ties together.
0:21:46 What is shown on the left-hand side is a little bit of a waveform that has been applied to a restricted Boltzmann machine, which is just a way of saying that there is a learnable weight matrix that transforms the input on the bottom to an output on top.
0:22:05 You make a weight matrix, there is a little bit of a nonlinearity there, and you can learn these things in a way that can reconstruct the input:
0:22:16 you find basis vectors, the weights on the side there, such that given these hidden units you can reconstruct the visible units at the bottom.
0:22:28 They have been doing this in the image processing domain for a long time, and these are some results in the waveform domain that are new this year.
0:22:36 There is a bunch of things that often look like Gabors of various sizes, but the interesting thing is that you also see some very complex features. This is in the spectral domain, and you get these things that have two frequency peaks, which might be akin to formants.
0:22:53 And so they are applying that to speech recognition, and I think that is an interesting direction.
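A toy restricted Boltzmann machine with one-step contrastive divergence shows the learn-weights-then-reconstruct loop being described; the sizes, learning rate, and data here are invented for illustration, far smaller than anything used on real waveforms.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(V, n_hidden=8, lr=0.1, epochs=500):
    """One-step contrastive divergence (CD-1) for a small binary RBM.
    V: (n_samples, n_visible) binary data."""
    n_vis = V.shape[1]
    W = 0.01 * rng.standard_normal((n_vis, n_hidden))
    b_v, b_h = np.zeros(n_vis), np.zeros(n_hidden)
    for _ in range(epochs):
        ph = sigmoid(V @ W + b_h)                  # positive phase
        h = (rng.random(ph.shape) < ph).astype(float)
        pv = sigmoid(h @ W.T + b_v)                # reconstruct visibles
        ph2 = sigmoid(pv @ W + b_h)                # negative phase
        W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
        b_v += lr * (V - pv).mean(0)
        b_h += lr * (ph - ph2).mean(0)
    return W, b_v, b_h

# Toy data: two binary patterns; the RBM should reconstruct them well.
V = np.array([[1, 1, 0, 0, 1, 1, 0, 0],
              [0, 0, 1, 1, 0, 0, 1, 1]] * 10, dtype=float)
W, b_v, b_h = train_rbm(V)
recon = sigmoid(sigmoid(V @ W + b_h) @ W.T + b_v)
acc = np.mean((recon > 0.5) == (V > 0.5))
print(acc)  # reconstruction accuracy; close to 1 on this toy data
```

The learned columns of `W` play the role of the Gabor-like and formant-like features mentioned above.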
0:22:58 I am going to go out on a limb here, because I think the reason that sparsity is important is that it gives us a way of representing things that we can't do as well in other domains.
0:23:10 We grew up with the Fourier transform domain. What is on the left-hand side is two basis functions; with one basis set, just two frequencies, you can represent the entire subspace.
0:23:22 So the point shown there can be any point in that subspace, and you can do all those things. It is a very rich representation, as we all know; as long as you satisfy the Nyquist criterion, you can do anything.
0:23:34 I think that is exactly the problem with a dense representation like that.
0:23:38 An alternative is to look at something like an overcomplete basis and just pick out elements that you have seen before; so here are some synthetic formants.
0:23:47 The way I like to think about these things working is that if you build a system that exploits sparseness, whether it be a deep belief network, whether it be matching pursuit, whatever your favourite implementation technology is,
0:24:01 you can learn patterns that look like these formants. What is on the left is one vowel with different vocal tract lengths, and on the right-hand side is a different vowel with different vocal tract lengths.
0:24:15 The system on the right, with a sparse overcomplete representation, is just going to learn these kinds of things: it is going to learn vowels with different vocal tract lengths; it does not have to learn the entire space.
0:24:24 So if you want to process things, if you are working in this space, then only things that are valid sounds that you have seen before will be represented by the sparse basis functions, and you can do useful things with that. I think that is where this is going to be an important trend and an important direction for us.
0:24:44 One of the things we wanted to do is to reach out to different sectors of our topic area and put in some hopefully interesting quotations from experts in those fields.
0:24:55 Here is one that comes from NTT, the telecommunications company; thank you to our colleague there for this quote: "Remaining challenges in source separation could include blind source separation for an unknown or dynamic number of sources."
0:25:14 Notice that I have, somewhat artificially, inserted Cherry's photograph on the wall of the lab there.
0:25:22 Moving to the hardware arena: if we think about mixed-signal ICs, the people working on those really support what we want to do, so I think it is important to listen to the hardware guys as well.
0:25:39 From Wolfson Microelectronics: "Moore's law is driving DSP speed and memory capacity, enabling implementation of sophisticated DSP functions resulting from years of research. The end user experience", and maybe this is a wish rather than the reality of the moment, "is one of natural wideband voice communications, devoid of acoustic background noise and unwanted artifacts."
0:26:04 Seems to me like the hardware manufacturers are on our side.
0:26:09 We heard a little bit this morning about the Xbox Kinect, and from that team we have this contribution: "The applications of sound capture, enhancement and processing technologies shift", and it is a paradigm shift, "gradually from communications", which is where they originated and have their home, "mostly towards recognition and building natural human-machine interfaces", and he highlights mobile devices, cars, and living rooms as key application areas.
0:26:45 Malcolm, you get the last word.
0:26:46 Well, I don't need the last word, but we have one more slide, and we can decide whether the last word comes from Steve Jobs or from Lady Gaga.
0:26:54 In either case the message is the same: there are large commercial applications for the work that we are doing.
0:26:59 It started with MP3, which enabled this market, but there is still a lot to be done in terms of finding music, adding to things, understanding what people's needs are; we really haven't talked about that very much.
0:27:14 This is an information retrieval task: people looking for things that interest them, whether they be songs or music or whatever.
0:27:22 These are signals, and working with them is an important thing to do.
0:27:25 So I think both Lady Gaga and Steve Jobs can have the final word; thank you.
0:27:40 Thank you, Malcolm and Patrick.
0:27:43 Now we have very little time for discussion, but we certainly should not miss this opportunity to hear other voices as well.
0:27:52 Obviously the views we presented are not completely balanced; how could they be?
0:27:58 So maybe somebody in the forum would like to add something, and we can have a little discussion on it.
0:28:13 Thank you for that great summary.
0:28:15 I just want to add one more thing: we have two eyes and two ears, and they work together, and I think cross-modal issues are likely to be very important.
0:28:27 The eyes direct what the ears hear, and the ears direct the eyes, and so on; likewise, I think audio research and vision research should not proceed separately.
0:28:38 Thank you for this comment.
0:28:41 This is certainly something which we highly appreciate, and we always like to be in touch with the multimedia people, who don't see audio as the only medium.
0:28:53 Certainly there are many applications where we are actually working closely with vision people; just think about tracking.
0:29:03 If you want to track some acoustic sources and a source falls silent, then you had better use your camera.
0:29:11 So there are quite a few applications where it is quite natural to join forces.
0:29:19 Just to reinforce that: there was a nice paper, I don't remember who did it, where they were looking for joint audiovisual sources, and I think that is important.
0:29:30 And it can be easier now: the signals are no longer a big deal, so it is easy to get the data, and computer power is pretty cheap. It would be fun.
0:29:42 That will be hard to follow. Is there any research using binaural signal processing for musical signal processing?
0:29:59 So the question was whether there is any binaural music research. I don't know of any; I mean, people certainly worry about synthesizing high-fidelity sound fields.
0:30:14 The Fraunhofer group, for example, is working on synthesizing sound fields that sound good no matter where you are, and I have worked with people at Stanford who are interested in creating three-dimensional sound fields for musical experiences.
0:30:29 But I am not sure where it is going.
0:30:33 I mean, if you had asked me ten years ago whether we would have five-point-one speakers in the living room, I would have said no; look what has happened. So we had better not predict.
0:30:46 Anything else before lunch?
0:30:52 You talked about five-point-one speakers in the living room, and we are seeing a lot of new algorithms that let us do microphone array processing. Will we be seeing devices that let us do it? The Microsoft Kinect has a few microphones, and I have seen a few cell phones that have multiple microphones for noise cancellation. Will we have more devices that allow us to apply better processing algorithms?
0:31:16 So the question was whether we will have devices with the ability to let us implement our algorithms, with open APIs and so on and so forth.
0:31:26 I understand from this morning's talks that a software development kit will be available for Kinect, and that could be a lot of fun.
0:31:34 I think the hardware is there to enable us to do it, and the key point, I think, is one of the trends that we do see: a move in audio from single channel to multichannel.
0:31:48 That has been happening for a while, and there is no sign of it stopping, so we would expect the facilities, the processing power, the interoperability, and the software development kits to come with that as well.
0:32:05 Are there any other questions?
0:32:09 Then I have one final remark, which I make increasingly often, and I would like to put it as a challenge.
0:32:18 Acoustic sensor networks are out there, and they are discussed in many papers where nice algorithms are provided, always based on the assumption that all the sensors are synchronized.
0:32:36 This is actually a tough problem, and we in the audio community would benefit a lot if somebody could really build devices which make sure that all the audio front ends in a distributed system work synchronously, or can be synchronized.
0:32:53 The underlying problem is simply this: once you correlate signals of different sensors that have not exactly synchronous clocks, everything falls apart.
0:33:11 Just look at all our optimization and all the adaptive filtering stuff that we have: it is always based on correlations, of second or even higher order, and so this problem has to be solved.
0:33:24 So if you want to do something really good for us, then please solve this problem.
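The point that correlation-based processing collapses under clock offset is easy to demonstrate: resample one channel by a small rate error and watch the cross-correlation peak collapse. The 0.1 % offset and signal length below are chosen just to make the effect visible in a short example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
t = np.arange(n)
x = rng.standard_normal(n)            # signal seen by sensor 1

def peak_corr(a, b):
    """Peak of the normalized cross-correlation over all lags."""
    c = np.correlate(a - a.mean(), b - b.mean(), mode='full')
    return np.max(np.abs(c)) / (np.linalg.norm(a) * np.linalg.norm(b))

y_sync = x.copy()                     # perfectly synchronized second sensor
y_drift = np.interp(t * 1.001, t, x)  # second sensor's clock runs 0.1 % fast

s, d = peak_corr(x, y_sync), peak_corr(x, y_drift)
print(s, d)  # the synchronized peak is near 1; the drifting peak collapses
```

No single lag aligns the drifting pair over the whole record, which is exactly why adaptive filters built on correlation estimates fall apart without synchronization.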
0:33:32 So let us break for lunch now.
0:33:36 Thank you very much for attending.