0:00:12come a three of the and
0:00:16i hope we're or you had good sleep
0:00:18after is to is for circular
0:00:20so back to this go i i i hope you enjoy it
0:00:24and do you your the the music there
0:00:27so why won't read or the names some the links a to repaid use but in case you want to
0:00:32have
0:00:33with brave guys and are also uh invited to your garden party
0:00:37there will be doing so the on the uh on the conference web page
0:00:42i was especially a it is harm core just band but
0:00:45that was playing in one of the rooms that was re reexamine
0:00:48people from uh
0:00:49ordinary secondary school mode even music second can very group
0:00:53so uh a good and you can you can visit the
0:00:56their rip sorry
0:00:57or just type harm core prague to do ruin
0:01:01you will get it
0:01:02uh there is one a
0:01:05warm
0:01:07mixed white please
0:01:13oh yes
0:01:14okay so one practical remark about the restaurants
0:01:18we had the sum compliance like to you in the plumbing you
0:01:22uh a short clues from you use the not be type of food i won't
0:01:26so uh the the the solution is to ask or or do not a a uh uh the information desk
0:01:32and she will be happy to advise you of a restaurant in the vicinity of this place
0:01:37or or or or or or the there
0:01:40no uh uh this at all
0:01:42or the practical in four
0:01:44and we will have for an invitation a two us very here you are a conference
0:01:49number you are most probably only is from a some you know of pixels
0:01:58good morning
0:02:00this morning i would like to introduce to you
0:02:02the two thousand and twelve ieee international conference on emerging signal processing applications
0:02:09or E S P A twenty twelve
0:02:12my name is by most problem in is and i am the general so obvious park where B to out
0:02:18use but where and good where all these marked it works wrong
0:02:21it is a full quorum forums organise by the signal processing society
0:02:26it is expected to be the third major problem for some of the some side view
0:02:30but together with i ask for nice here
0:02:33but because it used
0:02:35it is addressed primarily to practising engineers
0:02:39of course everybody is welcome
0:02:42i was uh going to be a course in practice problem
0:02:45something else that makes use by be my when i see here
0:02:49is that it will be held every year at the same place and time
0:02:53it has been planned in conjunction with the consumer electronics show
0:02:58so it will be in las vegas
0:03:00in early january
0:03:03when we say practising engineer
0:03:06we mean someone who has a master's degree
0:03:09in general signal processing
0:03:11i i is or the presentations and expected to be applications already it
0:03:16and more can thing as a very clear to
0:03:19to first a good participation from mean dusting use by we also include but as patients without that paper
0:03:27of course papers first can be some you two
0:03:29and that these papers would be are in the R you i drip we explore
0:03:33but we're one from is mark expect to be a if you're at the girl
0:03:38in another major component of the conference
0:03:40is that the most rooms
0:03:42what we call show and they are here by cast
0:03:45and in have you sure we will have to house
0:03:48plan i to present creations
0:03:50and by discussions
0:03:52you can find more maybe from in the call for papers that we have a the information of both here
0:03:57by cash
0:03:58or blues and see me
0:04:02so here are all invited you to submit papers for is is where well
0:04:06one more thing
0:04:07the corn firms "'cause" not go about to the local yeah
0:04:10we can you to submit your idea for a ago
0:04:13and that there will be a three hundred the world
0:04:16for the log which or
0:04:17so i hope to see you all next january in las vegas it will be fun
0:04:22thank you
0:04:28thank you nose
0:04:29and it's time
0:04:32i
0:04:33i'm i'm to start the
0:04:35brisbane is primary
0:04:37or speaker will be a the channel
0:04:39and the like colleague your
0:04:42and chambers
0:04:44technical program chair
0:04:45we introduce or speaker
0:04:47from a
0:04:57good morning uh light is in uh uh and from then i'd like to uh gain by repeating what ones
0:05:03as Z and for that was of you who attended the uh check the evening uh yes to the evening
0:05:09it was a about very special
0:05:11that a page in and the uh music uh uh well as a a special uh experience that with
0:05:17thank you very much to the local organisers work i in that
0:05:24a bit by great applied to to uh
0:05:27introduce a a a it it
0:05:29doubt this morning
0:05:31just a battery quick look at is uh uh uh C B
0:05:36or by a will show you that uh
0:05:39it's a sort of battered simple
0:05:41whose who or but uh
0:05:44some of the leading
0:05:45academic
0:05:46and industrial
0:05:49unit
0:05:50in the U
0:05:51he V C he's degree in the nineteen eighty two
0:05:56from
0:05:56princeton
0:05:58berkeley
0:05:59and stanford
0:06:02following that he worked
0:06:03at
0:06:04AT&T bell labs
0:06:06multi he'll you to see
0:06:09the about
0:06:10how out so
0:06:11with that so and uh in california
0:06:14and since nineteen
0:06:17ninety eight
0:06:18he has worked at microsoft
0:06:21with a
0:06:22and uh in redmond washington U N
0:06:25where he heads communications and collaborative systems research
0:06:32philip has been a volunteer
0:06:34for the ieee signal processing society
0:06:39most recently he chaired the multimedia technical committee
0:06:45and old so it's a so on the signal processing society by committee
0:06:51each distinguish is a a with it B and twelve
0:06:55a side please
0:06:56paper award in nineteen ninety three
0:06:59and more recently in two thousand and seven
0:07:03he won the best paper award from the ieee transactions on multimedia
0:07:10so thank you again guy in to fit it for agreeing to uh make this presentation
0:07:16and i'm sure the whole audience is looking forward to his
0:07:20talk
0:07:21on immersive communication
0:07:23it
0:07:31thank you very much for the introduction it's a great honour to be on the stage especially with fred jelinek over
0:07:36here he was an inspiration to me early in my career
0:07:39and thanks to all of you for coming to the plenary the morning after
0:07:43the banquet i don't think i have ever been
0:07:46at a plenary right after the banquet so this is a new experience for me i'm kind of looking forward
0:07:50to seeing what these plenaries are actually like
0:07:54so i'll be talking about immersive communication
0:07:56um
0:07:57there won't be any equations
0:07:59in my slides
0:08:00now signal processing people sometimes don't really understand things unless there's
0:08:04math behind it so i'll give you some references
0:08:07uh
0:08:09signal processing magazine had a special issue on immersive communication that came out in january you can look at
0:08:14that
0:08:15and just last week a bunch of us submitted a paper to the ieee proceedings as part of their hundredth
0:08:20anniversary both of these are on immersive communication
0:08:24so the
0:08:25second
0:08:26paper here defines immersive communication to be
0:08:30exchanging natural social signals with remote people
0:08:33as in a face to face meeting in a way that suspends
0:08:36disbelief
0:08:38in being there
0:08:41this is uh
0:08:42not a new idea by any means the
0:08:44telephone was invented a hundred and thirty five years ago and that was the first great breakthrough in immersive communication
0:08:51it wasn't long after the telephone was invented and here's a cartoon that came out oh
0:08:55three years later
0:08:57uh that people were starting to look at much more immersive scenarios
0:09:01so this is
0:09:02by george du maurier
0:09:03um
0:09:05the
0:09:06caption at the bottom
0:09:08reads something like
0:09:10these
0:09:12parents are here in london
0:09:14watching their daughter in ceylon
0:09:18play badminton
0:09:20and
0:09:21father says beatrice
0:09:23come over here i want to whisper
0:09:25she comes over
0:09:26and says yes
0:09:27papa dear
0:09:28uh
0:09:29who's that charming
0:09:30lady
0:09:32playing on charlie's side
0:09:34she says
0:09:35she's just come over from london i'll introduce you after the game
0:09:39so we don't really have this kind of immersive communication system even today but perhaps something that comes fairly
0:09:44close
0:09:45are telepresence systems
0:09:47telepresence as defined by
0:09:50the industry conference
0:09:52newsletter wainhouse review is
0:09:55a video conferencing experience that creates the illusion that the remote participants are in the same room with you
0:10:01so probably the quickest way to get a sense of what that means today is to take a look at
0:10:07a thirty second cisco
0:10:09commercial
0:10:39so so one can be uh
0:10:44so here you've seen
0:10:45high definition video conferencing so compelling that one participant has forgotten that his counterpart is
0:10:53is remote
0:10:55so this cisco telepresence system
0:10:58and others
0:10:59like the H P halo and others
0:11:02offer
0:11:04high definition audio and video and lots of bandwidth
0:11:07um
0:11:08to try to create this illusion of being in the same room
0:11:12um
0:11:13is this a breakthrough in immersive communication or is it just a lot of
0:11:19high definition televisions and a lot of bandwidth well
0:11:21in a sense they're both
0:11:23okay so
0:11:24i contend that these are
0:11:26a bridge
0:11:26to the future and we're about to see
0:11:30rapid progress in this area over the next few years
0:11:34so to put that in context let's take a look at
0:11:37um
0:11:38a brief history of television
0:11:40television
0:11:41was invented in its current form in
0:11:43nineteen
0:11:44twenty
0:11:45six
0:11:46uh
0:11:47as a counterpart to the telephone
0:11:49so here you see one of the first television sets
0:11:52at AT&T bell labs
0:11:55uh
0:11:56next to a telephone
0:11:58because the television is meant to communicate
0:12:00as a visual part of the telephone
0:12:02so here it is
0:12:05AT&T president walter gifford
0:12:07uh
0:12:08in new york at bell labs talking to herbert hoover then secretary of commerce in washington D C
0:12:14in nineteen twenty seven
0:12:16so the first
0:12:17distance television call
0:12:20uh and shortly thereafter the television became
0:12:24the broadcast medium that we know today whereas
0:12:27video telephony
0:12:28took another forty years to become the AT&T picture phone
0:12:33uh
0:12:34the picture phone was a
0:12:35a stunning
0:12:37technical success but also a stunning
0:12:39commercial failure
0:12:41so by nineteen seventy nine when i was
0:12:45an intern at bell labs
0:12:47down the hall from me on the desk of my lab director bob lucky was the
0:12:52last remaining
0:12:53working
0:12:55picture phone in the world
0:12:57bob would often complain that nobody ever called him on it
0:13:04today forty years after that
0:13:06the experience is actually very similar
0:13:09uh
0:13:10except that now these little rectangular images are on general purpose computers
0:13:15um
0:13:16or on your phone now
0:13:18but pretty much the experience is the same they're small little rectangular images of video
0:13:24video conferencing
0:13:25similarly
0:13:26um has been
0:13:28the same
0:13:29uh
0:13:29for the last forty years in some sense here's the bell labs video conferencing system in nineteen sixty seven
0:13:35you can see the round
0:13:38tables and multiple monitors cameras on top of
0:13:41each monitor and the
0:13:43data monitor
0:13:44which looks very similar to the
0:13:46you know
0:13:47telepresence systems of
0:13:48today
0:13:50so
0:13:52why do i contend that
0:13:55there's about to be a series of rapid breakthroughs in immersive communication when
0:13:59in the last few years not much has happened in visual communication well
0:14:03several reasons
0:14:05the first is the internet
0:14:06okay so the internet
0:14:08has caused a divorce between
0:14:11the format
0:14:12of the content and the medium over which it is carried
0:14:16so in the past telephone calls were carried over telephone networks
0:14:20television over television networks or over
0:14:22radio and so forth
0:14:24today all those are carried over the internet so there's a big possibility now
0:14:29of
0:14:30inventing arbitrary formats and they'll all be carried over the internet
0:14:34the second is of course that the cost of computation bandwidth and resolution has been dropping exponentially for
0:14:42so long
0:14:43that they're essentially free now compared to what they were twenty years ago
0:14:48and the third reason is that technology
0:14:51evolves faster than biology
0:14:53okay so
0:14:54there will
0:14:56be a threshold at which
0:14:59the number of bits per second that we're able to capture transmit and render
0:15:04surpasses what we can actually pass through the neural cutset around our bodies okay so at that
0:15:10point
0:15:11um
0:15:12we should have freedom to do whatever
0:15:14we like
0:15:15and
0:15:16the question is
0:15:17what do we want
0:15:19the future of human
0:15:21communication to look like at that
0:15:24point
0:15:24so to the extent that hollywood is the
0:15:27uh keeper of our collective dreams
0:15:30you know
0:15:30the answer that hollywood
0:15:32would give is that they want
0:15:34communication to be immersive whether it's like
0:15:37uh
0:15:38the holodeck in star trek or the jedi council meetings in star wars or
0:15:43the matrix in the matrix or
0:15:45um
0:15:46in avatar
0:15:47um
0:15:48communication should be immersive
0:15:53however
0:15:53not all communication will be immersive
0:15:56okay we will continue to send S M S messages
0:16:00and the reasons are that you know there are some reasons such as privacy that we
0:16:05don't want to send everything
0:16:07about ourselves so this was anticipated by
0:16:10the jetsons in nineteen sixty two
0:16:13so here's
0:16:13jane jetson putting on her vanity filter
0:16:16before she makes an early morning video call to her friend gloria
0:16:20unfortunately gloria's
0:16:22own vanity filter has an embarrassing
0:16:24malfunction
0:16:26then there's one of my favourite cartoons
0:16:29uh
0:16:30it's this
0:16:31uh
0:16:31from peter steiner in the new yorker in nineteen ninety three
0:16:34on the internet
0:16:36nobody knows you're a dog
0:16:38you know that's
0:16:38still pretty much true today there are reasons why we don't want to reveal everything about ourselves
0:16:45and yet for the more complex
0:16:47human interactions
0:16:50the more creative things that we do with each other
0:16:52we need more immersion
0:16:54because we're social animals we have evolved to work with each other best face to face
0:16:59so how we position ourselves relative to each other
0:17:02you know where we sit
0:17:03how we gesture
0:17:05our eye gaze
0:17:06awareness of all of those things they're all very important
0:17:11and these have been studied
0:17:13by social psychologists and
0:17:15others engineers who build systems
0:17:18for example the work by nguyen and canny
0:17:21on the berkeley multiview project they built a system that
0:17:25preserves eye gaze in
0:17:26teleconferencing and they've shown that
0:17:28trust improves
0:17:30with correct eye gaze
0:17:32so
0:17:34eye gaze has been
0:17:36uh
0:17:36a subject of video conferencing for
0:17:39a long time there are many solutions
0:17:41half mirrors cameras behind screens
0:17:44view interpolation and many others here's an example of view interpolation where there's an upper camera
0:17:49view and a lower camera view and they can be interpolated to get
0:17:52um
0:17:53better eye gaze
0:17:57not only eye gaze is important it's also
0:17:59reference and gesture
0:18:01so here are some examples from the early nineties
0:18:04tang and minneman showed at chi ninety one
0:18:07a system that
0:18:08had a shadow of the remote collaborator behind the collaboration surface
0:18:13and a year later
0:18:14ishii had a system called clearboard
0:18:17where the actual video of the
0:18:19remote collaborator
0:18:21was projected onto the
0:18:22shared surface
0:18:26here's an example that's
0:18:28um
0:18:29more recent the H P connect board
0:18:31which
0:18:32uh
0:18:33is an update on ishii's clearboard
0:18:35in that
0:18:37perfect eye gaze is established
0:18:39by tracking
0:18:40the position of the
0:18:41observer's head as well as the eyes of the
0:18:44uh
0:18:45person in the video
0:18:47so that
0:18:48um
0:18:49the eyes in the video are always on the
0:18:52path
0:18:53between the observer and the camera which is located behind
0:18:56the screen
0:19:00and there are many other
0:19:01cues besides gaze and gesture
0:19:05um
0:19:06both auditory and visual starting perhaps with
0:19:09peripheral awareness of what's going on in the room
0:19:12and consistency between the local space and the remote space
0:19:15so if you look back at
0:19:17the telepresence systems today
0:19:20you'll see that
0:19:21these preserve those first two characteristics
0:19:24so
0:19:25peripheral awareness is provided by very large
0:19:29televisions
0:19:30okay so we know what's going on in that remote room
0:19:34um
0:19:36consistency between the local room and the remote room
0:19:39is preserved by
0:19:40making the
0:19:42desks look the same painting the walls the same colour in all the different rooms so
0:19:48there's
0:19:50no difference between the back
0:19:52of one room and the back of another
0:19:54um
0:19:56so
0:19:56if you talk to the H P
0:19:58guys
0:19:59they'll tell you that the halo system which was one of the first
0:20:02which was probably the first system in this category to come out
0:20:06was a collaboration between H P and
0:20:09the hollywood studio dreamworks
0:20:11and
0:20:12hollywood you know they know how to
0:20:15they know how to
0:20:17uh
0:20:20use illusion to suspend one's disbelief in actually being there
0:20:27but there are many other
0:20:28immersive cues spatial cues
0:20:31for example distance
0:20:34in audio is indicated by
0:20:38a combination of relative loudness
0:20:40direct to reflected energy and direct to reverberant energy ratios
0:20:44as we can hear in these clips
0:20:48oh yeah it's
0:20:50right and i seven is does two yeah
0:20:54so that one's
0:20:55presumably further away than this one
0:20:57oh yeah face
0:20:59a in the next set is
0:21:01does a to that was is probably adjusting the volume
0:21:05but you can
0:21:06uh i'm
0:21:07and of course direction is given by interaural temporal and interaural intensity differences between your
0:21:13two ears
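A minimal sketch of the interaural time difference mentioned above, assuming a rigid spherical head and a distant source (Woodworth's classic approximation). The function name, the default head radius and the speed of sound are illustrative assumptions, not values from the talk, and the interaural intensity difference is not modelled.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound_m_s=343.0):
    """Approximate interaural time difference in seconds for a distant source.

    Uses Woodworth's rigid-sphere formula ITD = (a / c) * (theta + sin(theta)),
    where theta is the source azimuth in radians (0 = straight ahead,
    +90 degrees = directly to one side). Valid for |azimuth| <= 90 degrees.
    The default radius and speed of sound are assumed, typical values.
    """
    if abs(azimuth_deg) > 90:
        raise ValueError("formula assumes a source in the frontal half-plane")
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound_m_s * (theta + math.sin(theta))

if __name__ == "__main__":
    for az in (0, 15, 45, 90):
        itd_us = woodworth_itd(az) * 1e6
        print(f"azimuth {az:3d} deg -> ITD {itd_us:6.1f} microseconds")
```

The delay grows from zero straight ahead to well under a millisecond at the side, which is the range a renderer has to reproduce for a virtual source to be heard in the right direction.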
0:21:14um
0:21:15and on the visual side
0:21:17there are many many
0:21:19cues i'll give you some examples of those
0:21:21relative size and perspective combined to give you a sense of
0:21:25distance
0:21:26and also absolute size
0:21:29these are the same cues that are used in the famous
0:21:32ames room illusion
0:21:34occlusion lighting and shadow are also very important for giving a sense of
0:21:39distance
0:21:40um
0:21:41in augmented reality it's very important to be able to paint objects into a scene such that they
0:21:46occlude the background and yet do not occlude the foreground
0:21:51lighting and shadow are very important not only for realism but also
0:21:55for actual depth perception so if you look at this picture of the footprints in the sand
0:22:01um
0:22:02lighting and shadow
0:22:03give you an impression of one
0:22:05print being impressed into the sand and the other one
0:22:09uh
0:22:10being above the sand
0:22:13so
0:22:14a show of hands how many of you
0:22:16perceive the top footprint to be the one that's impressed into the sand
0:22:21into the sand
0:22:23okay
0:22:24and how many the bottom
0:22:28okay uh
0:22:29maybe one percent
0:22:31think the bottom and ninety nine the top
0:22:33now keep your eyes fixed on that
0:22:40okay don't blink because if you blink you'll probably change your mind
0:22:47okay so
0:22:49even shadow has something to do with depth perception
0:22:54focus as well so here's a very large
0:22:57city in china
0:22:58but if we change the focus
0:23:01of the foreground and background it suddenly shrinks down to look something like a little model that might be
0:23:06uh
0:23:07next to your
0:23:07model train set
0:23:10of course there are the
0:23:12very strong cues of stereoscopy and motion parallax
0:23:16so some of you may be able to cross your eyes and fuse these images
0:23:20and you get a very strong sense
0:23:21of depth
0:23:23um
0:23:25and i'll show an example of motion parallax later on
0:23:27but
0:23:28so are these cues important for communication
0:23:32that's the question and i contend that they are
0:23:34clearly our current communication systems do not convey this information
0:23:39and clearly they are not
0:23:40satisfactory
0:23:44um
0:23:45they can't fulfil every need for communication that we have we continue to meet in person whenever we
0:23:51are able
0:23:52we continue to travel to work
0:23:54we continue to
0:23:56attend parties and
0:23:58weddings in person we continue to come to meetings like this you know why don't we just
0:24:03email all our talks
0:24:05our slides to each other
0:24:06and just stay at home well because there's something
0:24:10valuable we feel about being here in person
0:24:13and our current communication systems don't convey that
0:24:16but i
0:24:17believe that the tools of signal processing will help us
0:24:20bridge that gap
0:24:24so
0:24:25if you were a physicist you might take a different approach
0:24:32physicists i've read have actually succeeded in transporting matter
0:24:37a small molecule
0:24:39over a
0:24:41significant distance
0:24:42and this is how it would work if you were to use it for
0:24:46transporting you know somebody
0:24:49so to transport me to
0:24:50a different part of the world i would step into a matter transporter
0:24:53i would be frozen down to absolute zero
0:24:56i'd be turned into a little brick of
0:24:59bose einstein condensate
0:25:01in the process of cooling down i would emit light
0:25:03light would be transmitted to another place
0:25:07would shine on another little brick
0:25:09of bose einstein condensate
0:25:11and then
0:25:12my original state would be reproduced
0:25:15of course if i
0:25:16don't
0:25:17completely cool down to zero i may leave some state behind with unforeseen consequences
0:25:25it's also sobering to think about
0:25:27what quantisation and data compression would do on the way
0:25:32rate distortion optimized mpeg four thousand
0:25:38um
0:25:38but at microsoft research we're mostly computer scientists and signal processing people so
0:25:43we take a different approach which is
0:25:45we do the
0:25:46transportation of matter virtually rather than physically
0:25:51and we achieve that through coordinate transformation
0:25:55so if you imagine
0:25:57a person in a real space over here
0:26:00um
0:26:01he's got some coordinate
0:26:02system in his local space and we just do a coordinate transformation
0:26:07of that over into some
0:26:09you know map that coordinate system into the coordinate system of some other space
0:26:12and if we do that for several people then we get them into the same
0:26:16the same space
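The coordinate transformation described above can be sketched as a rigid transform (a rotation about the vertical axis plus a translation) that drops each participant's local coordinates into one shared scene. This is only an illustration under stated assumptions: the seating layout around a circular table, the axis conventions, and the names `seat_to_shared` and `seat_angles` are all hypothetical and do not come from the talk.

```python
import math

def seat_to_shared(point, seat_angle_rad, table_radius_m):
    """Map a point from one participant's local frame into the shared scene.

    Assumed local frame: origin at the participant's seat, y up, +z pointing
    from the seat towards the table centre, +x horizontal and perpendicular to z.
    Assumed shared frame: origin at the table centre, y up.
    The mapping is a rotation about the y axis followed by a translation,
    so distances and angles are preserved.
    """
    x, y, z = point
    cos_a, sin_a = math.cos(seat_angle_rad), math.sin(seat_angle_rad)
    seat_x, seat_z = table_radius_m * cos_a, table_radius_m * sin_a
    shared_x = seat_x - x * sin_a - z * cos_a
    shared_z = seat_z + x * cos_a - z * sin_a
    return (shared_x, y, shared_z)

def seat_angles(num_participants):
    """Spread participants evenly around the table (an assumed layout)."""
    return [2.0 * math.pi * i / num_participants for i in range(num_participants)]

if __name__ == "__main__":
    head_local = (0.0, 1.2, 0.0)  # a head 1.2 m above each local origin
    for angle in seat_angles(3):
        print(seat_to_shared(head_local, angle, table_radius_m=1.0))
```

Because every participant's captured geometry goes through the same kind of transform, a point one table radius in front of any seat lands on the shared table centre, which is what keeps gaze and gesture directions consistent between participants.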
0:26:18so depending on whether the space that they're being transported into
0:26:22is virtual or real
0:26:24it leads to two different approaches
0:26:27i'll
0:26:28um
0:26:29talk about both of these you might sort of
0:26:31identify the first one with
0:26:33the movie the matrix if you're familiar with that
0:26:37everybody goes into a virtual world
0:26:39um
0:26:40the second approach could be identified with the movie avatar in which you send
0:26:45a physical representative of yourself into a real physical
0:26:48space
0:26:49so i'll talk about
0:26:50um
0:26:51each of these
0:26:52starting with the first one
0:26:54so the virtual or
0:26:55the matrix movie approach
0:26:57can be applied to what we call fully distributed meetings
0:27:01so we have people who are
0:27:03in completely separate places around the world maybe
0:27:06you and your colleagues are writing a paper together and you're sitting in your offices at
0:27:10different universities
0:27:11and you wanna talk to each other and the idea is to make it seem as if you're sitting around
0:27:16the same
0:27:17table
0:27:18talking to each other
0:27:21so the way we are doing that is by
0:27:24capturing the geometry of each person
0:27:27doing a coordinate transformation to put them all into a common scene
0:27:31a life size scene
0:27:32and then you give each person a window on that scene it's basically their display
0:27:37and each person gets their
0:27:40you know
0:27:41own personal view of that scene
0:27:43um
0:27:44and people interact through that
0:27:47in this way preserving eye gaze gesture direction and so forth
0:27:51it doesn't matter what the projection is
0:27:53on
0:27:55a flat
0:27:56surface or
0:27:57a multiple monitor situation or maybe even a curved
0:28:00screen so
0:28:01what we've done is build
0:28:03a
0:28:04wide field of view curved screen
0:28:06and from a particular point of view make it appear as if
0:28:10the local desk is extended
0:28:12in space
0:28:14and then when we capture
0:28:16people's geometry using a depth camera that's situated on top of the
0:28:20screen
0:28:21then we can position those people around in space as if they're having
0:28:25a meeting so if you've never seen
0:28:27um
0:28:29what images from a depth camera look like
0:28:31this will give you
0:28:32a sense
0:28:34here's a still frame from a
0:28:36depth camera
0:28:38so you can
0:28:39render a person from an arbitrary view
0:28:43okay so basically that's what we do capture
0:28:45geometry and texture
0:28:47of
0:28:48each participant
0:28:50place them in this
0:28:51virtual world
0:28:52and then
0:28:54let them collaborate so here's
0:28:55an example of what you might see if you were attending one of these meetings
0:28:59your
0:29:00collaborator is sitting across the table from you
0:29:03uh
0:29:05maybe on the wall on the right there's some artwork that's glowing to reflect the mood of the conversation
0:29:10in the room
0:29:11on the back wall perhaps there's a large active computer display that can be used for the
0:29:17meeting
0:29:18on the left wall here might be a window out on to
0:29:22the
0:29:23skyline of singapore in this case or wherever else you may wanna hold your meeting
0:29:27as other participants join the meeting
0:29:29the people in the meeting can move around the table to make room
0:29:33people can bring in their own data
0:29:35other people may join the meeting and they come in from ordinary web cams or so so they don't actually
0:29:39have depth associated with them but they can be represented as
0:29:43virtual
0:29:44displays here
0:29:46um
0:29:47data such as what's showing in the centre of the table here can
0:29:50float above the
0:29:51table and be manipulated and so forth so
0:29:54these are some of the things you can do
0:29:56um
0:29:57in this
0:29:59scenario
0:30:02so that
0:30:04image of course
0:30:06that i showed was taken from one particular static point of view but as people move around
0:30:11uh
0:30:12the images move so
0:30:13here's a video that shows
0:30:16the effect of motion parallax
0:30:21so by tracking the position of the camera in this case
0:30:24you can change what is
0:30:27presented on an ordinary
0:30:29um
0:30:30display
0:30:31so that it appears as if there's some depth behind the display
0:30:36so this is just using motion parallax to give that sense of
0:30:38depth
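A small sketch of the geometry behind that effect, assuming a flat display lying in the plane z = 0, a tracked viewer in front of it (z > 0) and virtual scene points behind it (z < 0). The function name and coordinate convention are illustrative assumptions; the talk does not describe the actual renderer.

```python
def project_to_screen(eye, point):
    """Return the (x, y) spot on the screen plane z = 0 where a virtual point
    must be drawn so that it lines up with the tracked eye position.

    eye   : (x, y, z) of the viewer, z > 0 (in front of the screen)
    point : (x, y, z) of the virtual point, z < eye z (usually behind the screen)
    """
    ex, ey, ez = eye
    px, py, pz = point
    if ez <= 0:
        raise ValueError("viewer must be in front of the screen (z > 0)")
    if pz >= ez:
        raise ValueError("point must be farther from the viewer than the eye plane")
    # Parameter along the ray from the eye to the point where it crosses z = 0.
    t = ez / (ez - pz)
    return (ex + t * (px - ex), ey + t * (py - ey))

if __name__ == "__main__":
    near_point = (0.0, 0.0, -1.0)  # 1 m behind the screen
    far_point = (0.0, 0.0, -3.0)   # 3 m behind the screen
    for eye_x in (0.0, 0.2):
        eye = (eye_x, 0.0, 1.0)
        print(eye_x, project_to_screen(eye, near_point), project_to_screen(eye, far_point))
```

As the viewer moves sideways, points drawn for greater depths slide farther across the screen than points near the screen plane, and a point exactly on the screen plane does not move at all. That differential shift is the parallax cue.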
0:30:50the same can be done with audio actually
0:30:53okay so you can
0:30:54uh
0:30:55do head tracking
0:30:56by
0:30:57you know visually
0:30:58finding out where a person's head is
0:31:01point
0:31:02and then
0:31:05do a table lookup to find out
0:31:07the relationship between that person's head
0:31:09and
0:31:10a virtual source
0:31:12get the H R T F s for that and find out what are the signals
0:31:15that you want at each ear
0:31:17similarly you know what the relationship of the head is to each of these loudspeakers
0:31:22and then you can
0:31:23form the
0:31:24transfer matrix invert it
0:31:26do some crosstalk cancellation
0:31:28and
0:31:28produce at these loudspeakers
0:31:30signals that depend on
0:31:32where your head is
0:31:33at
0:31:34so that the
0:31:36perceived
0:31:38location of the virtual source stays put
0:31:41so it doesn't drift into one of the speakers as you move
0:31:45towards one speaker
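The transfer-matrix inversion step can be sketched at a single frequency as follows. This is a toy model under loud assumptions: ears are treated as free-field points with no head shadowing, each speaker-to-ear path is a pure delay with 1/r attenuation instead of a measured head-related transfer function, and the names `path_gain`, `crosstalk_canceller` and `speaker_feeds` are hypothetical. A real system would do this per frequency bin with measured responses and some regularisation.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def path_gain(src, dst, freq_hz):
    """Free-field transfer from a point source to a point receiver:
    1/r attenuation and a phase delay of r / c (a stand-in for a real HRTF)."""
    r = math.dist(src, dst)
    return cmath.exp(-2j * math.pi * freq_hz * r / SPEED_OF_SOUND) / r

def crosstalk_canceller(speakers, ears, freq_hz):
    """Invert the 2x2 speaker-to-ear transfer matrix H, where H[e][s] is the
    path from speaker s to ear e for the current tracked head position."""
    (a, b), (c, d) = [[path_gain(s, e, freq_hz) for s in speakers] for e in ears]
    det = a * d - b * c
    if abs(det) < 1e-9:
        raise ValueError("transfer matrix is ill-conditioned at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]

def speaker_feeds(canceller, desired_ears):
    """Speaker signals that produce the desired (left, right) ear signals."""
    (p, q), (r, s) = canceller
    left, right = desired_ears
    return (p * left + q * right, r * left + s * right)

if __name__ == "__main__":
    speakers = [(-0.5, 1.0), (0.5, 1.0)]   # left and right loudspeaker positions (m)
    ears = [(0.02, 0.0), (0.18, 0.0)]      # head shifted 10 cm to the right
    feeds = speaker_feeds(crosstalk_canceller(speakers, ears, 1000.0), (1 + 0j, 0j))
    print(feeds)
```

Recomputing the canceller whenever the tracker reports a new head position is what lets the desired ear signals, and hence the perceived source location, stay fixed while the listener moves.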
0:31:46apple
0:31:48so we've
0:31:49done experiments on different form factors as well
0:31:53this is an example
0:31:54of
0:31:55a wall
0:31:56where we've projected onto a holographic screen the same kind of thing you might see in a
0:32:02high tech show
0:32:03you know that store
0:32:05window
0:32:06um
0:32:07and
0:32:08we try to make it
0:32:09appear
0:32:10through motion parallax and stereo parallax
0:32:14that the person is standing in the same room with you
0:32:22so
0:32:23i want to
0:32:23take a
0:32:24five minute interlude here to talk about depth cameras because depth cameras have had a recent breakthrough
0:32:30in
0:32:31um
0:32:32in cost and performance
0:32:34so there are new sensors that are really available very readily to everyone here just like a
0:32:39webcam might be
0:32:41so let me tell you about them
0:32:43um
0:32:44you might have heard about the kinect for xbox three sixty
0:32:47it's a device
0:32:49like this
0:32:50um
0:32:51that makes
0:32:52you the game player
0:32:53the controller
0:32:55and when it was introduced in october
0:32:57last year
0:32:58it sold
0:32:59eight million units in the next sixty days making it the
0:33:03fastest selling consumer electronics device in history
0:33:07earning it a position in the guinness book
0:33:09of world records
0:33:45so that's enough of that
0:33:47uh
0:33:47basically it's
0:33:49a depth camera
0:33:50okay
0:33:51so at every
0:33:52pixel it produces the distance from the camera to something in the scene so it enables you to
0:33:58uh
0:33:59do
0:34:00some skeleton analysis for example
0:34:02of what people are doing enabling them to control
0:34:06the games
0:34:07so under the covers
0:34:09it's uh
0:34:11it's this kind of device we have
0:34:13um a regular
0:34:15rgb camera
0:34:16we have an infrared camera
0:34:18uh
0:34:19that's pretty much just like the rgb camera except that it has a filter in front of it
0:34:23allowing it to see only infrared
0:34:25there's an infrared projector that projects
0:34:27a known pattern on the scene
0:34:29and by correlating that known pattern with the observed infrared image
0:34:34you get a disparity at each pixel in the data
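A rough sketch of that idea on a single image row: slide a small window of the known pattern against the observed row to find the best-matching shift (the disparity), then convert it to depth with the standard triangulation relation depth = focal length × baseline / disparity for a rectified camera and projector pair. The matching cost, the function names, and the focal length and baseline used in the demo are illustrative assumptions; the device's actual algorithm and calibration are not described in the talk.

```python
import random

def match_disparity(reference_row, observed_row, x, half_window, max_disparity):
    """Return the shift d in [0, max_disparity] that minimises the sum of squared
    differences between observed[x + i] and reference[x + i - d] over a window."""
    if x - half_window - max_disparity < 0 or x + half_window >= len(observed_row):
        raise ValueError("window or search range falls outside the row")
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        cost = 0.0
        for i in range(-half_window, half_window + 1):
            diff = observed_row[x + i] - reference_row[x + i - d]
            cost += diff * diff
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Triangulated depth in metres for a rectified camera/projector pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

if __name__ == "__main__":
    rng = random.Random(0)
    reference = [rng.random() for _ in range(64)]   # stand-in for the known pattern
    shift = 3
    observed = [0.0] * shift + reference[:-shift]   # pattern seen shifted by 3 pixels
    d = match_disparity(reference, observed, x=30, half_window=5, max_disparity=10)
    # Focal length and baseline below are made-up illustrative numbers.
    print(d, depth_from_disparity(d, focal_length_px=600.0, baseline_m=0.075))
```

Repeating the search at every pixel gives the per-pixel disparity map mentioned above, and the triangulation step turns it into a distance at each pixel.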
0:34:37also in this unit is a
0:34:40microphone array designed by ivan tashev who's
0:34:43possibly here
0:34:45uh
0:34:46and
0:34:47there's a motor and a U S B
0:34:50uh
0:34:51port so the motor tips the
0:34:53unit up and down
0:34:56and all of this comes for a hundred fifty dollars
0:34:59so
0:35:00um
0:35:02it's not surprising that within two weeks after
0:35:06it was released
0:35:07it was hacked into
0:35:09and people hooked it up to their pcs
0:35:11and did all sorts of crazy things
0:35:13with it
0:35:14um there's a website here
0:35:16devoted to kinect hacks
0:35:18and
0:35:20by
0:35:21one month after release
0:35:24uh
0:35:25there were about ninety projects
0:35:27on that site
0:35:29uh
0:35:30in three months there were twenty four pages of
0:35:33projects and as of yesterday there were forty seven pages of projects and
0:35:38these are all crazy things from
0:35:41um
0:35:41robot
0:35:43navigation to
0:35:45seeing for the blind
0:35:47to
0:35:48creating
0:35:49storefront windows that react to people wandering in front of them and so forth and so on so
0:35:55if you're interested um
0:35:57in using this for some of your signal processing i invite you to jump in
0:36:01uh
0:36:02there is an open source driver but
0:36:05it doesn't have any of the skeletal tracking or ivan's
0:36:10uh
0:36:12mic array processing
0:36:13if you want that
0:36:15you can wait a few
0:36:16weeks
0:36:18i believe we'll be having a
0:36:20non commercial
0:36:22sdk available for all of you to use
0:36:25free of charge
0:36:31so
0:36:32i talked about
0:36:33uh
0:36:35talked about how we were using depth cameras
0:36:38for
0:36:39immersive communication here's an application that's very closely related
0:36:43and it will actually be released to
0:36:46xbox live gold subscribers
0:36:48um
0:36:50uh sometime this spring
0:36:52so i thought i'd show that to you
0:36:53it's called avatar kinect
0:36:55uh rather than showing you an advertisement i've recorded my own video here
0:36:59this is corey she's talking to me on her
0:37:02right and to sash on her left
0:37:05her image
0:37:06the image of her avatar is
0:37:07in this picture in picture thing
0:37:12i
0:37:13i
0:37:15now you can see her avatar has her emotions
0:37:21so in addition to
0:37:22uh
0:37:23in addition to skeletal tracking of arms we have to do facial expression tracking so
0:37:29you know
0:37:30when people are talking
0:37:32you have to figure out where they're looking so you can animate their avatars and so forth
0:37:37and you can imagine that um
0:37:40uh
0:37:42there are many other
0:37:45backgrounds you can use for this
0:37:47for this
0:37:49avatar kinect
0:37:50thing
0:37:50so you can do a
0:37:53a
0:37:54talk show for example with your friends and upload it to youtube and so forth
0:37:58so you can imagine that as
0:37:59uh the avatars get more and more
0:38:02um
0:38:02sophisticated
0:38:03eventually we may move into an era where immersive communication is something like a massively multiplayer video game
0:38:10where
0:38:10you know you enter into this world
0:38:13um
0:38:14and your avatars are controlled by your actual motions your expressions
0:38:18and your
0:38:19okay so now let me
0:38:20um
0:38:21return to the second
0:38:23approach
0:38:24um this is the
0:38:25real or so called avatar like
0:38:28avatar movie approach
0:38:29um where you send a physical representative of yourself
0:38:33to a real space
0:38:35so this works well for so called hub and satellite meetings where there are
0:38:39several people in a real space like a real conference room
0:38:42and there are some remote attendees
0:38:45called satellites and they send some
0:38:47proxy into the meeting
0:38:49so we call these embodied social proxies
0:38:52here's an example of an early
0:38:55uh
0:38:56early device
0:38:57uh
0:38:59a regular lcd used for the
0:39:01face
0:39:02uh
0:39:03fisheye cameras and pan tilt zoom cameras used for eyes
0:39:07phone for mouth
0:39:08and ears
0:39:09all of this is put on a cart
0:39:11there's a computer
0:39:12wifi a battery backup and so forth so they can attend
0:39:15meetings for you
0:39:17and these are
0:39:19uh our current meeting
0:39:23we have four of these carts in this particular room
0:39:26uh one here one here one here and one out of sight
0:39:30we have collaborators
0:39:31um
0:39:32around the world john is in silicon valley and over here is in cambridge
0:39:37england
0:39:37the meeting room itself is in redmond washington near seattle
0:39:42um
0:39:43some of these are
0:39:45um
0:39:46on carts that have to be wheeled in
0:39:48this one is actually robotic it walked through the door on its own
0:39:53the view from these carts of the room
0:39:55looks like this
0:39:56okay so you get a wide field of view
0:39:59peripheral awareness of what's in the room
0:40:01and if you want to see some detail like what's on the
0:40:05screen
0:40:06um
0:40:07you click on the screen the pan tilt zoom camera goes over there and gives you a high resolution version
0:40:14so we use these not only for our own meetings but we've done studies on them across real
0:40:19groups
0:40:20that are
0:40:21using them so we deployed them in four different microsoft product development teams where one of the
0:40:27members of the team was remote
0:40:29we deployed them for an eight week period
0:40:32uh
0:40:33conducted interviews and surveys at the beginning and at the end to determine
0:40:37um
0:40:39whether people use these things
0:40:41uh
0:40:42and by
0:40:44asking questions on the surveys that are based on the seven point likert scale
0:40:48um
0:40:49we determine meeting effectiveness awareness and social
0:40:52aspects by asking questions such as
0:40:55i think
0:40:55X has a good sense
0:40:57of my reactions
0:40:59or
0:40:59for awareness
0:41:00i'm aware
0:41:04i'm aware of what X is currently working on and what is important to X's work or for social aspects such as
0:41:09i have a sense of closeness to X
0:41:11and so by comparing the surveys at the beginning and at the end
0:41:15we can determine
0:41:16you know
0:41:17whether they've improved or not
0:41:19and
0:41:20um
0:41:21quite dramatically we see a lot of improvement across
0:41:24all the four different groups that
0:41:27we put them into and also across the different
0:41:29categories
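The before and after comparison described here can be sketched as a mean per-respondent change on a single seven point likert item. This is a minimal sketch and not the analysis used in the study; the function name and the sample scores in the usage note are hypothetical.

```python
from statistics import mean


def mean_likert_change(before, after, low=1, high=7):
    """Return the mean per-respondent change (after minus before).

    Both lists hold one score per respondent, in the same order,
    on a low..high Likert scale (1..7 by default).
    """
    if len(before) != len(after):
        raise ValueError("before and after must have one score per respondent")
    for score in list(before) + list(after):
        if not low <= score <= high:
            raise ValueError(f"score {score} is outside the {low}..{high} scale")
    # A positive result means agreement rose between the two surveys.
    return mean(a - b for b, a in zip(before, after))
```

For example, hypothetical scores of [3, 4, 2, 5] at the start and [5, 6, 4, 6] at the end give a mean change of 1.75 points. A real study would also need a significance test, which this sketch leaves out.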
0:41:34these are verified by user comments
0:41:36um
0:41:37one user for example
0:41:38said that
0:41:40it actually succeeds in creating a workable illusion
0:41:43that the remote team member is in the room
0:41:44that overcomes the barrier
0:41:46of physical distance
0:41:48and direct observation also showed that
0:41:51in contrast to
0:41:53the uh
0:41:54the audio conferencing that they had been using before
0:41:57it now allows rapid turn taking in conversation
0:42:00um
0:42:01following of what's going on on the whiteboard and brainstorming
0:42:04resolving issues at the meeting instead of waiting for the next time the person
0:42:09visits
0:42:10um
0:42:11saving some trips
0:42:12and also assisting non native
0:42:14english speakers to understand what's happening
0:42:18uh and all of all of the
0:42:20uh proxies were named
0:42:22and given hats
0:42:25and so they became
0:42:26part of the team
0:42:29these are not new ideas
0:42:31in the early nineties
0:42:32bill buxton and abigail sellen
0:42:35had their hydra system
0:42:37more recently paul debevec introduced a really cool
0:42:42volumetric display
0:42:44uh
0:42:46henry fuchs
0:42:46at unc
0:42:48has been doing some work on
0:42:50animatronics
0:42:54uh and
0:42:55there are telepresence robots
0:42:57and they've even enabled
0:42:59a boy to attend school even though he has some immune deficiency
0:43:03that was reported by cnn back in january
0:43:06and ieee spectrum and the new york times last september both had cover articles
0:43:11on these telepresence robots
0:43:14user comments in both of those articles reveal pretty much the same things that we had already observed so
0:43:19erico guizzo from
0:43:21ieee spectrum
0:43:23says that
0:43:24you know participating as a robot you feel you get people's attention
0:43:28and there's a better sense of being there
0:43:30and
0:43:31mike bells there
0:43:32so you get the same kind of interpersonal connection that you'd have
0:43:35um
0:43:36as if you were at the meeting
0:43:38a
0:43:39interestingly
0:43:40even in you know all those cases
0:43:42in the new york times and in spectrum
0:43:45uh
0:43:45the devices were given names and they were given hats
0:43:51there are many issues
0:43:52uh
0:43:54safety is one i went down to visit anybots
0:43:57robotics in uh
0:44:00uh in silicon valley and found that they had one of their humanoid robots
0:44:04lashed to the wall
0:44:06after it had punched a big hole
0:44:08which is still visible in the wall
0:44:10next to it
0:44:11so
0:44:12um
0:44:13there are some safety issues with robots
0:44:15wifi coverage you know what happens when robots
0:44:18walk out of
0:44:19out of range or run out of power
0:44:21um can they keep up with people who are walking
0:44:24you know how do they open doors or get on elevators
0:44:27you know you don't want people hacking into your robot and taking it over
0:44:31um and there are social issues like well do you want them to
0:44:34see better than humans hear better than humans
0:44:37um what height should they be is it socially acceptable to
0:44:40to touch them and move them you know so there are many
0:44:43issues um including um
0:44:45how close should they come to looking
0:44:48human
0:44:49uh and maybe you've seen in ieee spectrum some of ishiguro's robots
0:44:53these are quite fascinating
0:44:55uh
0:44:57and they're not just wax models they actually move
0:45:14and so this is geminoid dk here is
0:45:17professor henrik scharfe
0:45:19at aalborg university
0:45:21and apparently according to ieee spectrum his wife says that
0:45:24she prefers body number one
0:45:27but she suggests that he should always send body number two to the conferences and stuff
0:45:33so don't be surprised if at future icassps you see you know some
0:45:39some robots wandering around
0:45:42so
0:45:44so far i've talked
0:45:45pretty much about
0:45:46two ends of a spectrum in one approach we've had
0:45:49uh these virtual
0:45:51you know everybody
0:45:52transports into a virtual space and in the other approach
0:45:55you transport into a real space through your proxy
0:45:59um
0:46:00in some sense the first is about virtual reality which was
0:46:03uh
0:46:05promulgated by jaron lanier in the early eighties
0:46:08it's about embedding people into the computer's world
0:46:11um
0:46:12the second one is about
0:46:13ubiquitous computing promulgated by mark weiser in the late eighties
0:46:17and it's about embedding computational devices or computers
0:46:21in our world
0:46:23and so there's really um
0:46:25these are really at two extremes and i think what's really interesting is what could be in
0:46:30the middle and the first time i became
0:46:32aware of that space
0:46:33was attending
0:46:35a talk at stanford in the early nineties
0:46:39on um
0:46:41vrml virtual reality
0:46:43markup language it had just been invented and
0:46:46the inventors there were
0:46:48discussing
0:46:49the virtues of it and showing us
0:46:51how they had
0:46:53um
0:46:54made a model of the entire
0:46:56university and they were showing us how they could
0:47:02how we could
0:47:03navigate our way
0:47:05through it
0:47:10and so they wandered down the halls of the university and then came to a door
0:47:14of the conference room where we were sitting
0:47:18and
0:47:20i expected that they would just kind of open the door and
0:47:23inside we would
0:47:24you know we would see all of us sitting inside the room
0:47:28um
0:47:28and so i
0:47:29got a nervous feeling about oh maybe behind me through the door will come some giant eyeball and then
0:47:36there will be all these people behind it kind of looking at us
0:47:39so i realised that there's a big gap between
0:47:42uh
0:47:44the virtual reality part and the physical reality
0:47:47um and so that's an area that
0:47:50uh
0:47:53that's an area i think
0:47:54needs a lot more exploration
0:47:57i'll just say briefly that we've been doing some work in that area
0:48:01um
0:48:02we've called it parallel worlds at one point it's about overlaying real and virtual worlds on top of each other
0:48:07and allowing people to sort of cross back and forth between one and the other so how might
0:48:12people walk
0:48:13through a plaza for example
0:48:15or attend
0:48:16you know go to a museum these are real places how might they attend these remotely
0:48:21and
0:48:22experience them remotely and also have the people in these real places experience the remote people as if
0:48:28they were there
0:48:30uh
0:48:30you know in the real space
0:48:33so
0:48:34you know we've had applications to attending weddings for example
0:48:37or trade shows and conferences and going to poster sessions how might you have
0:48:41a virtual person come to your poster and you could converse with that person as if they were there
0:48:45so there are a lot of interesting signal processing issues how do you instrument the space to capture the
0:48:50the auditory
0:48:51part and the visual part
0:48:52um so you can represent them remotely
0:48:55and
0:48:55a more difficult thing is actually how you
0:48:58render the remote people into a real space
0:49:02so i
0:49:03want to conclude
0:49:04here
0:49:05um i've talked
0:49:07uh
0:49:09mostly from a technical point of view about immersive communication
0:49:12um here i'll just
0:49:14talk about
0:49:15the societal need so
0:49:18climate
0:49:19energy and environment are all
0:49:21the big words of the day
0:49:23uh i flew here
0:49:25uh from seattle
0:49:27it's five thousand miles
0:49:30another five thousand miles getting back and the amount of co2 i've released into the atmosphere
0:49:35uh
0:49:36is equivalent to all my local transportation
0:49:39you know i've done this uh
0:49:41just for this week i've used up my year's quota of local transportation
0:49:46uh
0:49:48probably many of you have also flown from overseas
0:49:52um
0:49:53collectively i would say we probably released into the atmosphere coming to this conference
0:49:58about
0:49:59a thousand tons
0:50:00of co2
0:50:02so
0:50:02i don't think it's something we can afford really into the future
0:50:06um in terms of productivity
0:50:08you know i spent
0:50:09three days getting here for for a meeting so that
0:50:12ratio is not really good
0:50:13um but even on a day to day
0:50:16uh basis in at least in the us
0:50:18people spend close to fifty minutes per day on average
0:50:22uh
0:50:23uh commuting
0:50:24perhaps about ten percent of their work
0:50:26day
0:50:28um
0:50:28information workers though spend about fifty six percent of their time in communicating with other people
0:50:34um
0:50:34sixty percent of them feel that they could do their
0:50:37job duties just as well at some remote location for example home so
0:50:42there are things we can do about that
0:50:44and
0:50:45uh
0:50:46thomas friedman says the world is flat but it actually needs more flattening in today's economy we need to bring
0:50:51people to jobs and jobs to people
0:50:54and there's physical security
0:50:56i guess the volcano in iceland started erupting again about a year ago
0:51:01a little over a year ago for a period of three weeks
0:51:04it
0:51:06caused the cancellation of a hundred thousand flights
0:51:09stranded eight million people across europe
0:51:12uh many of you may have been among that group
0:51:15uh
0:51:16and uh cost the airline industry to one a half a billion euro
0:51:21so uh the economic damage is real
0:51:24um
0:51:25you remember
0:51:26uh i two thousand three was cancelled because of sars so
0:51:30you know if there are viral outbreaks
0:51:32they can shut down whole economies
0:51:34um
0:51:35tsunamis and
0:51:37earthquakes
0:51:38uh
0:51:39you know these things can also have a big
0:51:41uh
0:51:42a big um
0:51:44effect
0:51:45um and terrorist threats right
0:51:49two more traditional reasons for having telepresence and immersive
0:51:53communication have to do with
0:51:56bringing families closer together
0:51:58and
0:51:59spreading health care and education
0:52:01uh around the world
0:52:04so
0:52:05um
0:52:07i'll just uh
0:52:08mention that
0:52:09you know many of these
0:52:11national academy of engineering grand challenges
0:52:14um
0:52:15from
0:52:16uh
0:52:22having to do with a the global climate
0:52:24as well as um health care
0:52:26and security are all addressed by
0:52:29things like telepresence
0:52:32so in conclusion um
0:52:33although
0:52:35visual communication has not really changed much in the last eight years
0:52:38the conditions are right due to the convergence of
0:52:41affordable technologies and high social need for
0:52:44human communication to break through to new levels of immersion
0:52:47and the signal processing community is in a unique position to be able to address
0:52:51the needed advances so
0:52:53i think we should embrace the opportunity
0:52:56thank you
0:53:09thank you for that exciting and informative talk
0:53:16i I and the uh feel that we have deal
0:53:19then it faded it can having you do they bring that to okay about new have a yeah yeah
0:53:26how can you tell
0:53:27yeah yeah yeah yeah
0:53:29she
0:53:30anyway we have an opportunity for some questions from the floor
0:53:46are uh
0:53:48oh we have one at the back if you would go to the mic please
0:54:00well i'm and you think of this end so
0:54:02got
0:54:03yeah order
0:54:05you immersive it
0:54:05in case
0:54:06to touch and what else
0:54:08i in greater converter
0:54:10um feeling
0:54:11so probably very important but uh it's the least well understood
0:54:16there are some articles in the immersive communication issue about haptics
0:54:21um
0:54:22uh
0:54:25you know i would say it's least well understood at this point so
0:54:29there's a lot of work to be done
0:54:35so uh
0:54:37basically when i was a a a a it be lot asking the little guys so
0:54:42oh
0:54:43looking at a them was station i ask them one question
0:54:46what about the band
0:54:47i mean
0:54:48you need a lot of they got to so and so one
0:54:51and so
0:54:53there are saying there are using a special line
0:54:56that's right
0:54:56so
0:54:58but everybody can these a special lines so
0:55:00i mean
0:55:02oh do you see that like a like that that's like this amount model of a got
0:55:05or the word you don't on so you're still basically but still
0:55:08you don't so a large amount of data
0:55:10and you need some basic the but well um
0:55:13bandwidth capacity is going to vary over a very wide range
0:55:18and
0:55:19so
0:55:20you can do various
0:55:21things to address lower bandwidth
0:55:24so the avatar kinect is an example where you're just controlling
0:55:27um
0:55:28your parametric avatar sending a series of parameters
0:55:32over the line so it's much more like a
0:55:34uh
0:55:35you know
0:55:36a game
0:55:37um
0:55:38and you still can get some amount of immersion in those
0:55:42situations so i think
0:55:44uh
0:55:45the amount of realism will just depend on how much bandwidth you have available to you
0:55:49um and there are strategies for
0:55:51going down to lower bandwidth
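The bandwidth argument for a parametric avatar can be illustrated with back of the envelope arithmetic. This is a rough sketch only; the parameter count, bit depths, frame rate and resolution are hypothetical, and the comparison is against uncompressed video, so a real codec would narrow the gap considerably.

```python
def parameter_stream_kbps(num_params, bits_per_param, frames_per_second):
    """Bitrate in kbit/s for sending a fixed set of avatar parameters per frame."""
    return num_params * bits_per_param * frames_per_second / 1000.0


def raw_video_kbps(width, height, bits_per_pixel, frames_per_second):
    """Bitrate in kbit/s for uncompressed video at the given resolution."""
    return width * height * bits_per_pixel * frames_per_second / 1000.0


# Hypothetical numbers: 100 parameters of 16 bits each, 30 frames per second.
avatar_kbps = parameter_stream_kbps(100, 16, 30)

# Hypothetical comparison point: uncompressed 640x480, 24-bit color, 30 fps.
video_kbps = raw_video_kbps(640, 480, 24, 30)

print(f"avatar parameters: {avatar_kbps:.0f} kbit/s")
print(f"raw video: {video_kbps:.0f} kbit/s")
print(f"ratio: {video_kbps / avatar_kbps:.0f}x")
```

Under these assumed numbers the parameter stream needs 48 kbit/s against 221184 kbit/s for the raw video, a ratio of 4608.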
0:55:56that's sufficient
0:56:08yeah well
0:56:09how much bandwidth you need for different amounts of perception is
0:56:12an area for research
0:56:31perhaps that one should be taken offline we have time for one last question
0:56:37oh okay okay i dean man
0:56:39one can not really right i
0:56:42you human you mean
0:56:45would you be able to you come and that share you
0:56:52um
0:56:53so no matter what technology we use
0:56:56there will be things that we just cannot
0:56:58replace
0:57:00with human to human communication is that what you're asking
0:57:02these
0:57:03so
0:57:04you know i suppose touch is probably one of the good examples that's difficult to do right
0:57:09now
0:57:09i
0:57:10i really can't say uh
0:57:12in the future
0:57:13whether smell and
0:57:15other things
0:57:16you know
0:57:17i mean i suppose eventually you just jack into the matrix and put something in the back of your brain
0:57:21and then
0:57:22uh
0:57:23replace everything
0:57:24you become a brain in a vat
0:57:27at that point
0:57:28uh but
0:57:29um
0:57:30what you lose
0:57:32philosophers would say we're all already brains
0:57:40sorry that's point two
0:57:42with
0:57:42hence
0:57:44okay we should uh i think uh it again and used to call thing to do
0:57:54X
0:57:58really