0:00:14 We're getting a good taste of 3D today. My talk is about semi-automatic 2D-to-3D conversion, merging graph cuts and random walks.
0:00:28 I will talk a bit about why 2D-to-3D conversion is definitely not dead, even though we have 3D cameras; then about our method, which merges two segmentation techniques, graph cuts and random walks; why our method is a semi-automatic approach rather than a fully automatic one; and then show some results.
0:00:54 There has definitely been keen interest in 3D over the last decade, probably more so over the last five years. 3D image perception requires two distinct images, and in 2D-to-3D conversion you don't have them: given one image, we want to synthesize the left and right views from which we can create a 3D image.
0:01:20 We know that the technology is plentiful; most of us have experienced a 3D setup, and some probably have one at home. The two most popular ways to produce 3D content are filming with a 3D camera, or taking a 2D image and converting it to 3D.
0:01:41 The best solution is to capture with a 3D camera, and some people already do: consumer 3D cameras exist, I have one myself, and there are movie producers, Avatar being a good example, where a portion of the movie was actually filmed with two cameras. However, there is still an extremely large proportion of content producers who do not want to, or refuse to, film with two cameras.
0:02:12 Either the equipment isn't available or they simply don't want it; they feel that their imagery, and the way they see their scenes, is better represented with a single camera. But there's still something we can do: they may want to convert to 3D afterwards, once they realize they can make a lot more money. And when people do film with two cameras, they realize it's expensive, and filming in 3D with two cameras changes your mindset: some things cannot be filmed in 3D, very fast close-up motion, for instance, and certain ways scenes are shot simply cannot be captured in 3D. So there's a drawback.
0:02:49 So many times 2D-to-3D conversion is preferred, especially for content you already have. A lot of movie houses have millions of hours of movie content they would love to make money from, and they would love to convert it to 3D to be able to do that. 2D-to-3D conversion is the tool for that.
0:03:11 Of course, we know there are many examples of 2D-to-3D conversion, and many of them work. We had industrial partners where our job was to help them find methods to aid them in doing 2D-to-3D conversion, especially for high-resolution cinema, and there is great demand for that. As much as we know that 3D cameras are out there, there is very strong demand for conversion. So as much as I applaud the previous lectures, 2D-to-3D conversion is still strong and won't die out soon.
0:03:47 So we want to recover, or focus on recovering, a depth map, and not necessarily the most accurate and precise one: we want to be able to show the relative depth of the scene. Once you have that depth map, as we've seen in previous talks, you can generate a stereo pair.
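As an aside, a minimal sketch of that last step, depth-image-based rendering, might look like the following. This is an illustration only, not the renderer used in the talk; the linear depth-to-disparity mapping and the far-to-near drawing order are assumptions.

```python
import numpy as np

def synthesize_right_view(left, depth, max_disp=16):
    """Naive depth-image-based rendering (illustrative sketch): shift each
    pixel by a disparity proportional to its depth.  `left` is an (H, W, 3)
    array, `depth` an (H, W) map in [0, 1] with 1 = nearest.  Returns the
    right view plus a mask of the disocclusion holes left behind."""
    h, w = depth.shape
    right = np.zeros_like(left)
    holes = np.ones((h, w), dtype=bool)
    disp = np.round(depth * max_disp).astype(int)
    # Draw far-to-near so that nearer pixels overwrite farther ones.
    for d in range(max_disp + 1):
        ys, xs = np.nonzero(disp == d)
        xt = xs - d                    # right-eye view: near pixels shift left
        keep = xt >= 0
        right[ys[keep], xt[keep]] = left[ys[keep], xs[keep]]
        holes[ys[keep], xt[keep]] = False
    return right, holes
```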
0:04:06 What we've done in the past, and what we aim to do, is a semi-automatic approach. We worked with an industrial partner in the past, a very well-known one, that converts high-resolution 2D movies to 3D. Their goal is not to do it quickly; their goal is to do it precisely. They can spend about three months converting about twenty minutes of a movie, and it's all done manually: they segment objects from an image, and they are very picky about how accurate that is. They want extremely accurate edges when they segment an object out; the depth values themselves matter less than the relative relationships, which they can adjust. So they like a semi-automatic approach, one where the user can give some indication of what things should be segmented, and the algorithm works from that information.
0:05:01 In the 2D-to-3D conversion world, a fully automatic approach is sort of the holy grail; people don't expect to see that happen. So the best results so far in the field have come from semi-automatic approaches.
0:05:15 Our method was inspired by Guttmann and his team. They got very good results with a pretty complex algorithmic scheme; it's quite good, and it's based on random walks, with user feedback to the machine for some training classification. When we saw their use of random walks, we realized we could build our method on random walks in a similar way. And from our interaction with industry we knew that graph cuts, which are very popular, help the segmentation; we also use random walks to refine. So we try to merge the two together.
0:05:54 So we merge random walks, which is a well-known technique, and graph cuts, which is an even better-known one, and we have modified versions of the two.
0:06:09 Instead of considering each label as one object in the image: when you do a semi-automatic approach, you usually put markers in the image, and if you're familiar with graph cuts, you know that with brush strokes you identify background and foreground, and we call each of those a label. So instead of the mindset of segmenting one object from the background, we modified graph cuts to do a multiple-object segmentation, where each label is essentially a depth. We allow the user to define an object, but also its relative depth.
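The talk doesn't spell out the modified energy, but a standard multi-label graph-cut formulation with depth layers as labels, which is what this description suggests, would be

```latex
E(f) \;=\; \sum_{p} D_p(f_p) \;+\; \lambda \sum_{(p,q) \in \mathcal{N}} |f_p - f_q|
```

where $f_p$ is the depth label of pixel $p$, the unary term $D_p$ encodes agreement with the user's scribbles, and the pairwise term penalizes neighbouring pixels that land on distant depth layers. Energies of this form are typically minimized with alpha-expansion.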
0:06:45 Random walks is a pretty well-known method; it comes down to the solution of a linear system. It's known to be very good on edges that are finely detailed and change gradually, but not very good at strong edges. Graph cuts, by contrast, is extremely good around strong edges, but not very good where edges change with low contrast. So we want to combine those two together.
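For illustration only, here is how the random-walk half could be prototyped with scikit-image's off-the-shelf random walker; this is not the authors' code, and the beta value and the layer-to-depth mapping are assumptions.

```python
import numpy as np
from skimage.segmentation import random_walker

# image: 2-D float array in [0, 1]; markers: int array of the same shape,
# 0 = unlabeled, 1..K = one label per user-scribbled depth layer.
prob = random_walker(image, markers, beta=130,
                     return_full_prob=True)        # (K, H, W) probabilities

# Hard segmentation, plus a soft depth map that keeps the gradual
# probabilities (layer k assumed to sit at depth depth_of[k]).
labels = prob.argmax(axis=0) + 1                   # markers are 1..K
depth_of = np.linspace(1.0, 0.0, prob.shape[0])    # hypothetical mapping
soft_depth = np.tensordot(depth_of, prob, axes=1)
```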
0:07:22 So what we first do, and I'll show this later when I get to the results, is have the user select different levels of depth in an image with easy brush strokes. We then apply random walks and do a random-walk segmentation of the image. We modified random walks into a scale-space random walk: we find that random walks is susceptible to noise, so we build a scale-space version of the image, do a random walk at each level, and merge the results over the scale dimension to get our random-walk result.
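A sketch of the scale-space variant, under the assumption (the talk doesn't give the exact merging rule) that the per-scale label probabilities are simply averaged:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.segmentation import random_walker

def scale_space_random_walk(image, markers, sigmas=(0.0, 1.0, 2.0, 4.0)):
    """Run the random walker on progressively smoothed copies of the image
    and merge the label probabilities across scales.  The sigma list and
    the plain average are illustrative assumptions."""
    probs = [random_walker(gaussian_filter(image, s), markers,
                           beta=130, return_full_prob=True)
             for s in sigmas]
    prob = np.mean(probs, axis=0)   # merge scales; noisy votes get diluted
    return prob.argmax(axis=0) + 1, prob
```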
0:07:59 Once we've done that, we take a graph cut of the same image, using the same input from the user, and we run our modified, multiple-label version of graph cuts. So we get a segmentation of the objects relative to each other over the multiple labels. We end up with two different kinds of segmentation, two depth-information results, and the point is to merge them together.
0:08:32 The merging of the two depth maps is essentially done with a geometric mean. Now, this is a preliminary result in how we try to merge these things together: we use a geometric mean so that, hopefully, in the areas where graph cuts is stronger, it overcomes the random-walks result. Our geometric mean tends to produce a good first result that holds up well experimentally. We have a student who is taking over this task, trying to figure out a more adaptive way of finding the weights for the geometric mean, which is currently static, and perhaps a different weighting scheme altogether.
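The merge itself is then tiny. A sketch with the static weighting the talk mentions (the specific weight value is an assumption):

```python
import numpy as np

def merge_depth_maps(d_gc, d_rw, w=0.5, eps=1e-6):
    """Weighted geometric mean of the graph-cut and random-walk depth maps,
    both scaled to [0, 1].  Where d_gc is near zero it pulls the product
    strongly down, which is how the crisp graph-cut borders can dominate
    the smoother random-walk estimate.  `w` is static for now; the talk
    mentions ongoing work on adaptive, image-driven weights."""
    return (d_gc + eps) ** w * (d_rw + eps) ** (1.0 - w)
```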
0:09:23 Let me show you a few results and how it's done.
0:09:30 For example, here is one of the original images on the left. The user is presented with the 2D image, and all the user is expected to do is mark relative depth on the image.
0:09:50 Anything that's white is considered, would be marked by the user as, close to us, and anything that's black would be far away; the grey values in between, and in this case we only have three, would be somewhere in the middle. But we can modify that to have many levels of disparity, of relative depth, so the user can mark multiple areas with their relative depths.
0:10:20 That's very similar to how GrabCut works, where you mark areas as foreground and background with strokes. That's what I mean by semi-automatic: we don't want the user to have to provide much more information than that.
0:10:37 On the top left is the modified RW, the modified random-walks depth map produced from the user's strokes, and on the bottom is the graph-cuts depth map. Notice we have some occlusion holes in the result. For those we initially just used a simple inpainting method; pretty much anything can be used there, though for the final results we want to use the modified random walk to actually fill them in.
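Since, as just noted, pretty much any inpainter will do for those holes, one off-the-shelf option, an illustration rather than necessarily what was used here, is OpenCV's Telea inpainting:

```python
import cv2
import numpy as np

# right_view: uint8 BGR image from view synthesis; holes: boolean mask of
# disoccluded pixels (e.g. from the synthesize_right_view sketch above).
hole_mask = holes.astype(np.uint8) * 255
filled = cv2.inpaint(right_view, hole_mask, 3, cv2.INPAINT_TELEA)
```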
0:11:19 As these results show, graph cuts gives a very good, binary-like segmentation of the objects, while the depth map we get from random walks has a more gradual change in depth. The point is to merge those two things together: using the geometric mean we hope to merge them, and we find we tend to create a more gradual, more realistic depth map by merging the two.
0:11:47 This is the synthesized right view from the original image, and this is an anaglyph image. Of course we don't have 3D glasses here, but if anyone's interested, there's a web address in the paper and it can be viewed with glasses online. It gives a sense of how the shift happens, how much disparity is shown.
0:12:11 In another example, the user just selects the background, the foreground, and puts a couple of strokes on, I think, a building. Here are the two depth maps, random walks and graph cuts; notice graph cuts gives a very nice, crisp foreground/background depth map. And here is the synthesized right view along with the anaglyph image; notice how much shift there is.
0:12:49 Of course, not everything's rosy. This is an example where it didn't work very well: the original is on the left, then the depth labels that the user provides. The issue here is with our graph cut, I think, but I believe the merging, with our ongoing investigation into how to merge the two, or how to weight them based on some image information, should let us create a better depth map from merging the two methods.
0:13:19 Again, I stress that the important point of our algorithm is that we don't want to weigh down the user: the only input from the user takes a couple of seconds, just drawing these types of lines. On top of that, if the user doesn't like the result, or something is not right, the user is able to go back and modify or erase their strokes, add some more, and help the algorithm work a little better.
0:13:54 Now, our results are preliminary. We know that our scale-space random walks works very well on its own; you can find the reference in the paper. But we feel that if we work more on merging the two methods, which each work very well, we should be able to get a very good approximation of the relative depth between objects, which we know is very important.
0:14:23 So our preliminary results have shown that the merged depth maps take, or try to take, the best of both these methods. Graph cuts provides us with nice, noticeable borders, which is why it's used so much in segmentation, and random walks allows us to capture texture and gradients a little better.
0:14:44 There are drawbacks too. When one of the depth maps is not correct, as I showed in the previous example where the graph cut was wrong, it degrades the result. But because it's such a simple method, it runs relatively quickly, a few seconds per frame, so the user can go back, modify and change their labels, and hopefully correct it.
0:15:08 Also, our weighting scheme is static. Again, these are very much initial results, and we're trying to find some kind of adaptive weighting.
0:15:25 As I mentioned, for future work students are looking at ways to better merge the depth maps. Another aspect of future work is to extend this to video, and we're also looking at whether, within the same kind of framework, we can merge the graph-cuts and random-walks results with other matching methods to model the scene.
0:16:01 Yeah, we can have some questions.
0:16:08 [Question, partly inaudible:] A very nice result. A weighted version of this mean has been proposed, where the exponent can vary between one and two. Have you tried using that instead of the geometric mean?
0:16:41 Okay. I'm not sure whether my students were aware of that paper at the time; I'd have to check on that. I'm not familiar with it myself, so I'd appreciate the reference.
0:17:01 [Question:] How do you determine the number of depth layers that you have in a scene?
0:17:08 It's all by the user; it's up to the user to define how many depth levels, depth layers, there are. The user can do just foreground and background if that's what they want, or they can do two or three or four, depending on how much detail or depth they want. It's up to the user.
0:17:37 [Question:] How about the consistency of the resulting depth maps over time, in a video sequence?
0:17:44 That's definitely a problem; at this time we haven't investigated that at all. Going from frame to frame, you're going to see issues with edges, and that flickering is definitely one of the biggest problems with 2D-to-3D video conversion. When we extend to video, that will be one of the top priorities.
0:18:03 [Question:] I also want to know whether the user needs to assign a depth value to each of the layers.
0:18:09 It's up to the user. If you're familiar with graph cuts, there the user just marks foreground and background; in our case the user thinks in terms of closer and further away. So if the user thinks there are three or four objects, or depth layers, they draw according to the objects and assign labels to them: which is the front object, which is further back from that object, and so on. Again, as I mentioned for the previous question, it's all defined by the user's perception of the image, and by what objects they consider close to or far away from the viewer.
0:18:44 [Question:] How accurately does the user have to set those depth values?
0:18:48 That's just it: what we care about is the relative depth. Companies like IMAX, for example, when they convert a movie, don't care about the actual depth; they care about the relative depth, because they will move objects further apart in 3D space to make the effect bigger. They're not there to make it seem realistic; they're there to make it extra-realistic, or super-real. So they care about the relative depth, and that's what we concentrate on. It's not like robotic vision, where things have to be precise.
0:19:36 [Question:] What happens if the user puts things in a depth order that doesn't match the actual scene, however improbable that situation is?
0:19:46 Well, then the whole depth changes. There's no way around it: because this isn't automatic, all these methods depend on the user's expertise and interpretation. So if a user wants to do that, that's fine; it's just going to make the result totally different, and things will simply shift. Thank you.
0:20:14 Okay, let's thank our speaker again.