Trends in multimedia signal processing. I'm Phil Chou, the chair of the MMSP TC, so I get to be the moderator. Today we have three panelists. They are all from the subcommittee on technical directions in our TC; their responsibility is to look for new trends, to make sure that the TC remains relevant and vital.

Francesco De Natale is the chair of that committee, and he is professor of telecommunications at the University of Trento. His interests are in media retrieval and video analysis for behaviour understanding. Among other things, he was technical co-chair of ICIP 2005 in Genoa, Italy, and since 2009 he's been coordinating a large European project on media retrieval.

Eckehard Steinbach is a full professor in the Department of Electrical Engineering and Information Technology at the Technical University of Munich. His interests are in the areas of audio-visual-haptic information processing and communication, as well as networked and interactive multimedia systems. Among other things, he was technical program co-chair for the MMSP workshop held in Saint-Malo, France, last year.

And then Enrico Magli is associate professor in the Department of Electronics at the Politecnico di Torino in Italy. His interests are in compressed sensing, remote sensing, error-resilient coding, distributed source coding, and security for images and video. Among other things, he is technical co-chair of the International Conference on Multimedia and Expo, ICME 2012, in Melbourne, Australia.

The important thing about a panel, in my opinion, is to get audience participation, so we will leave some time for sure to have you interact with the panelists and poke them. So prepare your questions and figure out which panelist you want to address. But I did want to cover some basic ground, so I asked each of them, for just about six minutes maximum each,
to cover some of the things that are compelling to them individually, and then we'll get to your questions. So, without further ado, let's go on to Francesco.

The time is short, so I will just quickly introduce some ideas. But before that, I want to tell you the general idea behind this presentation, which is that information and communication technologies, and multimedia in particular, are more and more cross-disciplinary, and this is probably the most important trend of the last years. Cross-disciplinary doesn't mean just between fields close to ours within ICT, but also with disciplines that we are not used to dealing with, for example psychology, social sciences, or cognitive sciences. We should be aware that the more links and synapses we create between these different fields, the more we can be innovative in the future, also in our own research fields. Let me give you some hints about this.

The first one is related to the field of media search and retrieval. We know that the signal processing community was able to provide great advances in the last years, providing for example descriptors for images, videos, music, and so on, and also models and metrics to measure distances and to address the semantic gap. But the fact is that we need much more in the future if we want to solve the problem. Looking at the most recent research in the field, we can see that the answer can be found by looking also to research areas which are adjacent to ours, or even not so near to us. First of all, for example, the representation of knowledge: how to represent the context, which would open a lot of possibilities to enrich what we know about the data, but still
there are no reliable models for doing that. Then the social dimension: many people are working on exploiting the knowledge contained in social media, for example in social networks, to gain information about the data. But again, no well-established solution is available in this field. [inaudible] So we need more complex, more sophisticated models that can account for this kind of information, again to allow solving problems like scalability, or working with media sources of information like the internet, and also dealing with diversity: being able to find not only near-duplicates or similar images, but also images, videos, or music that represent the concept we have in mind in a very diversified and complete way. Applications include multimedia search engines and other fields, like for example mobile search. These are just some examples; I don't have the time to go into detail about all of these. Let me just make some advertisement: we have a very large European project that is dealing with these problems, and on the project website you can find additional information.

Another field where an interdisciplinary view can really be the way of boosting the performance of current technologies is that of ambient intelligence, more recently also called context awareness or multimedia environments. [inaudible] This means that we have a lot of information available: we are able to capture from the environment much more information than the information
that a human can even perceive when staying in the environment. But the abundance of this information is not able to fill the gap between the knowledge that you have of the world and the knowledge that the system can have of the world. Again, what kind of problems are open and can be explored in this field? For example, interaction between people: not just understanding activities, but understanding events in which more than one person is involved, for example social behaviour, or understanding the emotions and the feelings of people, and so on. Of course, for this you need models also from other research areas, behavioural sciences and so on.

Well, we are more or less at the end, so I'll skip the example and go to the last point, which is interfacing. Interfaces are the last mile: we have to fill the gap between applications which are very sophisticated and the user. Also in this case there are a lot of open problems, for example designing interfaces which are culturally adaptable, accessible, and usable, and which can also provide access to increasingly complex environments, such as mobile environments. [inaudible] But the problem is difficult, and I have to close. If we look at the problems, they are still there, but the way of looking at the problems is maybe changing. Sorry to rush; the time is short.

Thanks, Phil. I want to add two trends to this discussion. This is of course my very biased personal opinion, but these are the two current trends. The first one addresses our quest for immersive communication. If you look at the
past years, we have seen tremendous progress in the area of telepresence systems, and this was mainly driven by advances in display and capturing devices, highly efficient audio and video coding, and the ever-increasing bandwidth of our communication networks. But if you look at these conversational services, or these telepresence systems, although we consider them to be bidirectional, in fact, if you look at the actual audio-visual data communication, it's really only unidirectional: whatever information flows from left to right and from right to left is more or less unrelated.

I believe that truly immersive communication requires that we are not only able to be present in a remote environment, but also that we are able to physically interact with objects or other people in this environment. And this we can only achieve if we augment traditional audio and video communication with the haptic modality, which addresses our sense of touch. Here the situation changes quite dramatically, because now the forward path and the backward path are no longer independent: every piece of information you send in one direction has a direct impact on the information that flows back to you, because the paths are coupled through the environment, or through the human and the machine.

I believe that the MMSP community is in a good position to address these challenges. Just a few ideas that might be interesting for you to look at. How about an MP3 for haptics, something that exploits the perceptual limitations of human haptic perception? Or, as we are used to in audio and video, finding objective quality metrics, something that is virtually non-existent in haptic communication. Or the recording and playback of a physical interaction session, something that is fundamentally different from the easy recording and replay of audio and video, and that has to do with
the coupling of the two paths. And other topics that we're familiar with, like error control, error resilience, and so on, can be extended to haptic communication.

The second trend that I would like to mention here is mobile visual localization. Basically, the idea is to recover your location from visual data that you capture with a mobile device that is equipped with a camera. I would say the most promising approach in this area is to infer your location by retrieving the most similar image from a geotagged reference database, so it boils down to a content-based image retrieval problem. But you might say content-based image retrieval is something that has been studied extensively during the past many years, so what is it that the MMSP community could contribute to that field?

Let's take a system perspective here. One way of running such a system would be: you extract salient features on your mobile terminal, you maybe compress them, you transmit them to a server farm in the network, you perform the actual image retrieval step on the server, for instance using a bag of visual words, and you return a location estimate to the mobile terminal. If you look, from the system perspective, at the round-trip time, with the upload of compressed features, the uplink capacity, which is typically quite constrained, and the processing, we will probably never be able to get this running at frame rate, with visual localization at twenty-five estimates per second. So this community could revisit the system perspective and basically reconsider the question of which process should run at which point: what is happening on the mobile handset, what is happening in the network, and what type of information is exchanged between the two, to be able to get to a fast, continuous localization. [inaudible] Another aspect which I think would also be interesting to look
at, for the MMSP community, is that the query image that you capture and the corresponding reference image for that location can be very dissimilar. Just to illustrate that with a scene outside of our campus: there are cars passing through, we have a mobile device, so there might be motion blur, the images of the reference database might be recorded in different seasons, winter and spring, and the construction situation in the scene might change. So two images that match might actually be dissimilar in most parts, and you have to focus on the parts that really are similar. Looking at the program here, in the MMSP sessions I could identify a few papers that address this issue, and two were already presented yesterday, so I would say there is actually quite some activity in this area. [inaudible]

Thank you, Eckehard.

So my presentation will be mostly focused on multimedia delivery. I tried to make a list of all the important aspects of multimedia delivery, and the list is impressive, because multimedia delivery actually spans several different application domains and technology domains. So we have [inaudible] for the signal representations, network coding, cross-layer optimization, communication and networking, HTTP and RTP for streaming, several networks, and several applications. For today I picked a couple of interesting things, so I'll be talking mainly about cooperation via peer-to-peer networking, and about [inaudible] using HTTP streaming.

Peer-to-peer has been around for some time now, and essentially relates to exchanging information in a cooperative way, using logical network connections that are built on top of the physical network
connections. A strong trend I see here, one that is going to significantly improve peer-to-peer performance, is the design of better overlays. If you look at the example on the left, typically the construction of an overlay is completely unaware of the underlying network topology, and so a peer might be connected to another peer on the other side of the world, and this increases delay and decreases performance. Whereas it would be possible to use geographic information to connect peers that are close to each other, in order to reduce the number of hops, and that obviously improves the performance. This is an important thing, and there is an ongoing standardisation effort on that, Application-Layer Traffic Optimization, or ALTO, in the IETF. An ALTO server is essentially a database that takes as input a pair of peers and provides as output a set of metrics, like bandwidth and delay, that allow one to estimate the actual distance between the two.

More than that, P2P also faces problems such as peer reputation, trust, and security. Some of those problems have found solutions in the social networking area, and so I think an interesting thing is adding a social networking layer to peer-to-peer, which can be used to help solve such problems.

Another important thing, and this is true for P2P but also for other systems: future systems should look at context information about the user, such as the position, direction, and speed of the user, because that information essentially defines the kind of informational content that we are interested in at a given time. For example, someone taking a train for a short commute is not likely to be willing to watch a two-hour movie, but rather something like short content. And that should be applied to search engines as well, in such a way that when we get a list of best matches for our query, those best matches
are actually tailored to the current context of the user, not just to general preferences.

The second topic is HTTP streaming. For a long time, multimedia delivery research has been focused on the use of the RTP protocol over UDP. But then we had a proliferation of several different networks, like WiMAX, 3.5G, Wi-Fi, and so on, and those networks are managed by different operators. Due to security problems we now have firewalls, and firewalls are reluctant to let UDP packets go through. So this has given rise to research on HTTP streaming, which is markedly different from RTP streaming: in a typical RTP session it is the server that pushes packets to the client, whereas in HTTP streaming it is the client that pulls packets from the server. So that's the difference, and it causes new problems, including for example rate control, which is not server-based any more but client-based.

So there are several challenges in HTTP streaming. We have the issue that HTTP streaming systems are not quite as good yet, performance-wise, with respect to conventional systems. And if someone proposes a new solution, it is hard to demonstrate that it is better than the existing solutions, because it's difficult to perform large-scale experiments on these systems. [inaudible] There is also a problem with multiple access, where users are competing for the same bandwidth, because each user wants to get the maximum share of the bandwidth, and this can create problems like oscillations of the available bandwidth over time, which have to be addressed. It is also a problem of cross-layer optimization, where for HTTP we have TCP rather than the conventional RTP. [inaudible] There are several standardisation issues as well, like MPEG DASH for example. If you are interested, there is a nice paper, which has just been published a couple of months ago in IEEE Internet Computing, that provides a
nice introduction to these topics. So these are the two I picked. There are many more, and my colleagues have touched upon some of them, like mobile search. There are problems in multimedia applications of cloud computing and cognitive radio, the problem of quality of experience in multimedia delivery, and many more. So that's just my biased opinion on the current trends, and that's it.

So now we come to the fun part. Thank you very much. I hope you have your questions, and have figured out who you want to address your questions to. If you would like to go first, you could step up to the microphone, or you could just raise your hand and shout.

One of the challenges that you pointed out, which is found again and again in all these applications, is the semantic gap. It has been there for a long time. So, in the future, can we at least bridge part of this gap? [inaudible] Is putting humans in the middle a way to solve this semantic gap? Can I hear your view about it? That is addressed to any of the panelists.

I think it's a compelling problem. Having users in the loop for semantics is possible in many ways. In general, what can be very useful is not a single user in the loop, but the social dimension: the possibility of exploiting the information contained in social networks is a very promising aspect. There are open problems, of course. The big problem is how to get the data, because the data are not public, and there are problems connected to privacy, so you can't simply get the data from these platforms. But there is a lot of information, for example the possibility of using tags, or a site like Wikipedia, for example, that would give a
relationship between things. So all kinds of information like that can be used together, in a statistical way, to enrich the knowledge about the media in general. [inaudible]

Has anybody used Mechanical Turk to do any multimedia processing? So that's one way to put humans in the loop, and I think I've seen some work in that direction. Other questions or comments?

Do you see significant differences in the just noticeable differences for haptic coding between normal and blind users? A blind user is presumably using that sense a lot more, and is probably more used to haptic interfaces. [inaudible]

Yeah, you're perfectly right. There are lots of studies that show that if at one point, for instance, you lose your sight, you go through a complete adaptation process and rely more on your haptic feedback, for instance for navigating in space, or for interacting with [inaudible]. And of course I think we can see that with our kids now, as they grow up with these touch displays, and you already have haptic interfaces that allow you to sense force feedback, vibrations, and so on. Of course, when you grow up with that, you have a very different attitude towards these types of signals than some of us. I'm not sure how many of you in the room have ever experienced haptic feedback, probably not too many, [inaudible] but for them it is natural to have that. So I guess this next generation is much better prepared to deal with these things than we are.

Any other questions? So I have one. What is the state and direction of work in multimodal perception, and its use in multimedia applications?

Maybe I can say something to that, in
the sense of audio, visual, and haptic. What has been known, at least in the psychophysics and haptics world, is that if you have the sense of touch also being addressed, you get, that's one thing, a higher degree of presence, so you feel more present in a virtual environment, or in a real remote environment. But then there are also cross-modal effects that you can exploit. [inaudible] Depending on the situation, the human may rely more, for instance, on the sense of touch. This really depends, even within one application, on the specific state. Just to give an idea: when you, for instance, approach a virtual environment with a virtual end effector, [inaudible] until you hit the surface. As soon as you touch the surface, the visual feedback is not important any more, and then you experience the roughness, the elasticity, and these kinds of characteristics of the surface haptically. And if you would ask the user how he experiences that environment now, he would say, "now I'm haptically experiencing it". So even within one application, that can change dramatically from one time instant to another.

This is what is called multimodal integration, or perceptual learning, where you basically learn to fuse modalities at different levels. There is a low-level, sensing type of fusion that is happening, but then there is also a kind of fusion at the decision level. A particularly interesting observation is when you have, in the visual channel, a stimulus that contradicts the haptics; then it's very interesting how the brain actually decides what the perception you have really is. This is something that is a longer-term research topic, also in the [inaudible] domain, to find out which of the different modalities is really dominant
and [inaudible] how it is combined with the others.

Other questions? Is anyone working in multimedia networking? No? Nobody's transmitting media over networks? Cloud computing, anything? OK, so no questions for Enrico. I have another question. I'm wondering what's on the horizon in terms of sensors, and also rendering devices, that we ought to be aware of.

In sensors, there's been a lot of interest in compressed sensing recently. Compressed sensing acquires signals using random projections, so that's a simple and powerful way of representing signals. That's potentially interesting in sensor networks, for example, where sensors are very power-constrained: you can't do much computation, and you may not be able to perform the compression on the sensor, because you don't have the computational power. So in that area, for example, compressed sensing might help. It can also be used to construct [inaudible] cameras, for example, that are much easier to handle in terms of processing than conventional cameras: far fewer samples, so much lower memory and computational requirements, and so on.

And rendering, anything there? New rendering devices? OK. Any final questions? This is your opportunity. So we've gone for thirty minutes. I hope that's been of some interest to you. Thanks very much.
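
To make the compressed sensing remark above concrete, here is a minimal sketch in plain Python. It assumes a Gaussian sensing matrix and a signal with a single nonzero sample; the function names, the sizes, and the use of matching pursuit for reconstruction are assumptions chosen for illustration, since the panel named neither a specific matrix nor a recovery algorithm. The point it tries to show is the asymmetry mentioned in the answer: the sensor only computes a few inner products, and the heavier work is left to the receiver.

```python
import math
import random


def make_sensing_matrix(m, n, seed=0):
    """Return an m x n matrix of i.i.d. Gaussian entries (the random projections)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Sensor side: y = Phi x, i.e. m inner products and nothing else."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def matching_pursuit(phi, y, iterations):
    """Receiver side: greedy sparse estimate of x from y = Phi x."""
    m, n = len(phi), len(phi[0])
    columns = [[phi[i][j] for i in range(m)] for j in range(n)]
    energy = [sum(c * c for c in col) for col in columns]
    residual = list(y)
    x_hat = [0.0] * n
    for _ in range(iterations):
        # Pick the column most correlated with what is still unexplained.
        corr = [sum(c * r for c, r in zip(col, residual)) for col in columns]
        j = max(range(n), key=lambda k: abs(corr[k]) / math.sqrt(energy[k]))
        # Project the residual onto that column and remove its contribution.
        step = corr[j] / energy[j]
        x_hat[j] += step
        residual = [r - step * c for r, c in zip(residual, columns[j])]
    return x_hat


# Hypothetical sizes: a length-64 signal observed through 8 random projections.
n, m = 64, 8
x = [0.0] * n
x[17] = 3.5  # the only nonzero sample

phi = make_sensing_matrix(m, n, seed=1)
y = measure(phi, x)                      # what the sensor transmits
x_hat = matching_pursuit(phi, y, iterations=1)

print(len(y), max(range(n), key=lambda k: abs(x_hat[k])))
```

With a single nonzero sample, the measurement vector is a scaled copy of one column of the matrix, so one greedy step is enough to locate it. Signals with several nonzero samples would need more measurements and more iterations, or a stronger solver than this sketch.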