We're getting a good taste of 3D today. My talk is about semi-automatic 2D-to-3D conversion, merging graph cuts and random walks. I'll talk a bit about why 2D-to-3D conversion is definitely not dead, even now that we have 3D cameras, then describe our method, which merges a segmentation technique based on graph cuts with random walks, explain why our method is a semi-automatic rather than a fully automatic approach, and show some results.

There has definitely been keen interest in 3D over the last decade, probably more so over the last five years. 3D image perception requires rendering two distinct images; in 2D-to-3D conversion, given one image, say the left view, we want to find, or rather create, the right view, so that we obtain a 3D image. We know the technology is out there: most of us have probably experienced a 3D setup, and some probably have one at home. The two most popular ways to produce 3D content are filming with a 3D camera or converting a 2D image to 3D. The best solution is to capture with a 3D camera, and these are readily available; even consumer 3D cameras exist, and I have one myself. There are movie producers, Avatar being a good example, where a portion of the movie was actually filmed with two cameras. However, there is still an extremely large proportion of content producers who do not want to, or have refused to, film with two cameras. Even when the equipment is available, they don't want to use it; they feel that the way they see their scenes is better represented with a single camera. Still, there is something we can do in 3D: they can convert to 3D afterwards, and then they realise they can make a lot more money. And when people do use two cameras, they realise it's expensive and difficult; filming in 3D with two cameras changes your mindset. Some things cannot be filmed well in 3D, for example very fast close-up motion; there are certain ways in which scenes
are shot that just cannot then be rendered in 3D, so there's a drawback. So many times 2D-to-3D conversion is preferred, especially for content you already have: a lot of movie houses have millions of hours of movie content they would love to make money from, and they would love to convert to 3D; 2D-to-3D conversion is the way to do that. Of course, we know there are many examples of 2D-to-3D conversion; much of our work has been with industrial partners, where our job was to help them find methods and aid them in doing 2D-to-3D conversion, especially for high-resolution cinema, and there is great demand for that. As much as we know 3D cameras are out there, there is very strong demand for conversion too.

So, as the previous lectures also discussed, 2D-to-3D conversion means we want to recover a depth map, and not necessarily the most accurate and precise one; we want it to show relative depth, as I'll explain. Once you have that depth map, as we've seen in previous talks, you can generate a stereo pair. What we've done in the past, and what we aim to do here, is a semi-automatic approach. We worked in the past with an industrial partner, a very well-known one that converts high-resolution 2D movies to 3D. Their goal is not to do it quickly; their goal is to do it precisely. They can spend about three months converting about twenty minutes of a movie, and it's all done manually: they segment objects from an image, and they care a great deal about how accurate that is. They want extremely accurate edges when they segment an object; the precision of those edges is their main concern. So they like a semi-automatic approach, one where the user gives some indication of what should be segmented and the algorithm works from that information. In 2D-to-3D conversion, a fully automatic approach is sort of the holy grail,
but people don't really expect to see that achieved soon, so the best results so far in this field come from semi-automatic approaches. Our method was inspired by Guttmann and his team: they got very good results with a pretty complex algorithmic scheme, but it's quite good, and it's based on random walks, with the user feedback fed to a machine-learning stage for training and classification. When we saw their use of random walks, we realised we could build something similar, and from our interaction with industry we knew graph cuts, which are very popular for segmentation; we then use the random walks to refine the result. So we try to merge the two together: we merge random walks, which is a well-known technique, and graph cuts, which is equally well known, and we use modified versions of both.

Instead of considering a single label per object: when you take an object in an image with a semi-automatic approach, you usually put markers on the image. With graph cuts, you draw brush strokes to identify background and foreground, each with a label. Instead of that mindset of segmenting an object against its background, we modify graph cuts to do multiple-object segmentation, where the labels essentially represent depth; we allow the user to define objects, but also their relative depth. Random walks is a pretty well-known method too; it reduces to solving a linear system, and it's known to be very good on edges with fine detail and gradual change, but not very good at strong edges, whereas graph cuts is extremely good around strong edges but not very good on edges with gradual change or low contrast. So we want to combine those two together. What we do first, as I'll show later with the results, is let the user select different depth levels in the image with easy brush strokes; we then apply random walks and do a random-walk segmentation of the whole image. We also modify the random walks into a scale-space version.
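To make the random-walks step concrete: the technique solves a sparse linear system (a combinatorial Dirichlet problem) in which the user's scribbles act as boundary conditions. Below is a minimal single-scale sketch of that idea, not the authors' implementation; it assumes 4-connected pixels and Gaussian edge weights, and the talk's method additionally runs this over a scale-space pyramid and merges the levels.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def random_walker(img, seeds, beta=90.0):
    """Random-walks segmentation of a 2D grayscale image.

    img   : float array with values in [0, 1]
    seeds : int array, 0 = unlabeled, 1..K = user scribble labels
    Returns a label map (argmax of the harmonic label probabilities).
    """
    h, w = img.shape
    n = h * w
    idx = np.arange(n).reshape(h, w)
    L = lil_matrix((n, n))

    def add_edge(a, b, wgt):
        # graph-Laplacian contribution of one weighted edge
        L[a, b] -= wgt
        L[b, a] -= wgt
        L[a, a] += wgt
        L[b, b] += wgt

    for y in range(h):
        for x in range(w):
            if x + 1 < w:   # right neighbour
                add_edge(idx[y, x], idx[y, x + 1],
                         np.exp(-beta * (img[y, x] - img[y, x + 1]) ** 2))
            if y + 1 < h:   # bottom neighbour
                add_edge(idx[y, x], idx[y + 1, x],
                         np.exp(-beta * (img[y, x] - img[y + 1, x]) ** 2))

    s = seeds.ravel()
    marked, unmarked = s > 0, s == 0
    labels = np.unique(s[marked])
    L = L.tocsr()
    Lu = L[unmarked][:, unmarked]   # unseeded block of the Laplacian
    B = L[unmarked][:, marked]      # coupling to the seeded pixels
    probs = np.zeros((n, len(labels)))
    for k, lab in enumerate(labels):
        m = (s[marked] == lab).astype(float)
        probs[marked, k] = m
        # Dirichlet solve: harmonic probability of reaching this label first
        probs[unmarked, k] = spsolve(Lu.tocsc(), -B @ m)
    return labels[np.argmax(probs, axis=1)].reshape(h, w)
```

With a large `beta`, the probabilities diffuse freely inside homogeneous regions but stall at strong intensity edges, which is why the technique handles soft, gradual boundaries well while remaining a single sparse linear solve per label.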
Because we find that random walks is susceptible to noise, we build a scale-space version of the image, do a random-walk segmentation at each level, and merge the results, and that gives our random-walks result. Once we've done that, we take a graph cut of the same image, using the same input from the user, and we apply a modified, multiple-label version of graph cuts, so we get a segmentation of the objects relative to each other over the multiple labels. We end up with two different segmentation results, and the point is to merge them.

The merging into one depth map is essentially done with a geometric mean. Now, this is a preliminary result of how we try to merge these things together: we take a geometric mean so that, hopefully, in areas where graph cuts is stronger it can overcome the random-walks result. The geometric mean tends to give a good first result and works well experimentally, but we have a student taking over this task who is trying to figure out a more principled way of finding the weights; the current weighting in the geometric mean is fairly ad hoc.

Let me show you a few results and how it's done. For example, this is one of the original images on the left. The user is presented with the 2D image, and all the user is expected to do is mark relative depth on the image. Anything marked in white is considered by the user to be close to us, anything black would be far away, and in between are grey values; in this case we only have three levels, but we can modify that to have many levels of disparity, of relative depth, so the user could mark multiple areas with their relative depths. That's very similar to how graph cuts normally works, where you mark areas as foreground and background; that's what I mean by semi-automatic. We don't want to require
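The merging step just described, a geometric mean of the two depth maps, is simple to sketch. The weight `w` and the epsilon guard below are my own illustrative choices, not values from the talk:

```python
import numpy as np

def merge_depth_maps(d_rw, d_gc, w=0.5, eps=1e-6):
    """Weighted geometric mean of two depth maps with values in [0, 1].

    d_rw : depth map from (scale-space) random walks (gradual transitions)
    d_gc : depth map from multi-label graph cuts (crisp object borders)
    w    : weight on the random-walks map; 1 - w goes to graph cuts
    eps  : guard so zero-depth pixels don't annihilate the product
    """
    merged = (d_rw + eps) ** w * (d_gc + eps) ** (1.0 - w)
    return np.clip(merged, 0.0, 1.0)
```

Note that a geometric mean is dominated by its smaller factor: a region where one map confidently reports depth near zero pulls the merged value down strongly, which matches the talk's observation that graph cuts can "overcome" random walks in areas where it is stronger.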
the user to provide more information than that. On the top left is the modified-random-walks depth map obtained from the user input, and the bottom one is the graph-cuts depth map. Notice that we have some occlusion holes in the result; initially we just fill those with a simple inpainting method, pretty much anything you use there works for these results, and we use the modified random-walk result to actually fill them in in the end. But we want to merge the two results: graph cuts gives a very good, binary-like segmentation of the objects in its depth map, while the random-walks one gives us a more gradual change, and the point is to merge those two things together. Using the geometric mean, we hope to merge those two, and we find we tend to create a more gradual and realistic depth map by merging them.

So this is the synthesized right view from the original image, and this is an anaglyph image; of course we don't have 3D glasses here, but if anyone visits the web address in the paper, it can be viewed online. It gives a sense of how the shift happens and how much disparity is shown. In another example, the user just selects the background and foreground with a couple of strokes on, say, a building; these are the random-walks and graph-cuts results. Notice what a very nice, fairly rough foreground/background depth map that gives, and here is the synthesized right view along with the anaglyph image; notice how much shift there is. Of course, not everything is rosy. Here is an example that didn't work very well: on the left is the original, then the labels the user provides; the issue is with our graph-cut result here, I think, but also with the merging. We're still investigating how to merge the two, perhaps with weights based on some image information, to try to create a better depth map from merging the two methods. Again, I stress that the importance of our algorithm lies in the fact that we don't want to overload the user: we want the only input from the user to take a couple of seconds, just drawing these
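The view-synthesis step mentioned above, shifting pixels by a depth-proportional disparity and then inpainting the occlusion holes, can be sketched as follows. This is a deliberately crude stand-in for real depth-image-based rendering: the grayscale input, the `max_disp` parameter, and the scanline hole fill are all simplifying assumptions of mine.

```python
import numpy as np

def render_right_view(img, depth, max_disp=8):
    """Synthesize a right view by depth-image-based rendering (DIBR).

    Each pixel is shifted left by a disparity proportional to its depth
    (depth in [0, 1], 1 = nearest). Painting in far-to-near order lets
    near pixels overwrite far ones; the occlusion holes left behind are
    filled by propagating the last valid value along each scanline
    (a very simple inpainting).
    """
    h, w = img.shape
    out = np.zeros_like(img)
    filled = np.zeros((h, w), dtype=bool)
    disp = np.rint(depth * max_disp).astype(int)

    # paint in far-to-near order so closer pixels win conflicts
    for flat in np.argsort(depth, axis=None):
        y, x = divmod(int(flat), w)
        nx = x - disp[y, x]
        if 0 <= nx < w:
            out[y, nx] = img[y, x]
            filled[y, nx] = True

    # fill occlusion holes with the nearest value to the left
    for y in range(h):
        last = img[y, 0]
        for x in range(w):
            if filled[y, x]:
                last = out[y, x]
            else:
                out[y, x] = last
    return out
```

For an anaglyph preview, the original image and the synthesized view would simply be placed in different colour channels; with a constant zero depth map the function returns the input unchanged, which is a handy sanity check.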
kinds of strokes on top of the 2D image. If the result isn't right, or something looks wrong, the user is able to go back and modify or erase their strokes, add some more, and help the algorithm work a little better. Now, our results are preliminary. We know the underlying methods work very well; you can find the references in the paper. But we feel that by merging together these two methods, which each work very well, we should be able to get a very good approximation of the relative depth between objects, which we consider to be what's important. Our preliminary results show depth maps that try to combine the best of both methods: graph cuts provides us with nice, noticeable borders, which is very much a segmentation strength, and random walks allows us to capture texture and gradients a little better.

There are drawbacks. One is that the depth map is not always correct, as I showed with the previous graph-cut result; but because it's such a simple method, it runs relatively quickly, a few seconds per frame, so a user can go back, modify and change their labels, and the modified result will hopefully be corrected. Also, our weighting scheme is static; again, these are very initial results, and we're trying to find some kind of adaptive weighting. As I mentioned, future work by my students is looking at ways to reduce the blur in those depth maps, and another aspect of future work is to try to extend this to videos, where, in a volumetric type of environment, we can merge the results of graph cuts and random walks across frames by matching a model of the scene. And with that, we can take some questions.

[Audience:] A nice talk. You proposed a weighted geometric mean whose weights can vary; have you tried using [another published method, partly inaudible] instead of the geometric mean?

[Speaker:] Okay, I'm not sure if my students were aware of
that paper at the time; I'll have to check on that. I'm not familiar with it myself; I'd have to check the reference.

[Audience:] My question is: how do you determine the number of depth layers?

[Speaker:] It's all by the user; it's up to the user to define how many depth levels, or depth layers, there are. The user can do just foreground/background if that's what they want, or two, three, or four layers, depending on how much detail, or depth, they want; it's up to the user.

[Audience:] How about the consistency of the resulting depth over time, in a video sequence?

[Speaker:] That's definitely a problem; we haven't investigated that at all yet. Going from frame to frame, you're going to see issues with edges and ripples; that's definitely one of the biggest issues with 2D-to-3D video conversion. When we extend to video, that's maybe one of the top priorities.

[Audience:] I also want to know whether the user needs to assign a depth value to each of the layers.

[Speaker:] Well, if you're familiar with graph cuts, the user just marks foreground and background; in this case, the user thinks in terms of closer and further away. If the user thinks there are three or four objects, or depth layers, they draw according to the objects and assign labels to them: which is the front object, which is further back from the front object, and so on. Again, as I mentioned for the previous question, it's all defined by the user's perception of the image and which objects they consider close or far away.

[Audience:] How well does the estimated depth match the actual depth?

[Speaker:] That's a good question. It's not exact; what we care about is the relative depth. Consider companies like IMAX, for example: when they convert a movie, they don't care about the actual depth, they care about the relative depth, because they will move objects further out in the 3D cube to make the effect bigger. They're not there to make it seem realistic; they're there to make it more,
you know, extra-realistic or super-real. So they care about the relative depth, and that's what we concentrate on; it's not like robotic vision, where things have to be, you know, precise.

[Audience:] What happens if the user puts things at the wrong depth?

[Speaker:] In that situation we're pretty much doomed; the whole depth map changes. There's no way around it, because in a semi-automatic method everything depends on the user's expertise and interpretation. If a user wants to do that, that's fine; it's just going to make the result worse.

[Session chair:] Okay, I think that's all we have time for; let's thank our speaker again.