0:00:16 So I'm going to present our work on entropy-based supervised merging for visual concept detection.
0:00:29 Here is the outline of my talk. I will briefly remind you about the bag-of-visual-words model, which is the basis of our work, then I will introduce the technique that we propose, which is entropy-based supervised merging, and finally I will present some experimental results.
0:00:49 The problem we are dealing with is the problem of visual concept detection. Basically, we want to build a system such that, when we feed it an image, the system is able to detect whether a concept appears or not in this image, for a given set of concepts.
0:01:10 The classical way to do this is the following: from the image we extract a feature vector, which is a representation of the visual content of the image.
0:01:33 This feature vector is then fed to a classifier, trained for each concept, which outputs the detection decision.
0:01:58 In the bag-of-visual-words approach, we first sample points over the image, on a dense grid or, sometimes, with an interest point detector, and on each of these points we compute a local descriptor of the region around it.
0:02:40 With this process, the representation of an image is a set of local descriptors.
0:03:00 So the first step is: we take all the images from the training set, and from each of these images we extract the local descriptors.
0:03:19 All these descriptors live in the descriptor space, and we cluster them, typically with k-means; each cluster is represented by its center, which we call a visual word, and the set of visual words is the visual dictionary.
0:03:47 Then, to represent an image, we compute its local descriptors, and each descriptor is assigned to the nearest visual word, so it is quantized to one of the cells of the partition of the descriptor space.
0:04:08 Once we have this assignment, we compute the histogram of occurrences of the visual words over the image, and this histogram is the feature representation of the image.
0:04:28 So the dimension of this representation is the number of visual words, that is, the dictionary size.
0:04:38 In fact, this representation is an estimate of the distribution of the image's descriptors in the descriptor space.
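(As a concrete illustration of the pipeline just described, here is a minimal sketch of the bag-of-visual-words encoding in Python; the use of k-means and all function names are assumptions made here for illustration, not the exact setup of the talk.)

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def build_dictionary(train_descriptors, k):
    # Cluster all training descriptors; the k centroids are the visual words.
    centroids, _ = kmeans2(train_descriptors, k, minit='++')
    return centroids

def bow_histogram(image_descriptors, dictionary):
    # Assign each local descriptor to its nearest visual word...
    d2 = ((image_descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    # ...and return the normalized histogram of visual-word occurrences.
    hist = np.bincount(words, minlength=len(dictionary)).astype(float)
    return hist / hist.sum()
```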
0:04:51 So we are using this representation as an approximation of the probability density of the descriptors.
0:05:08 This approximation is made by assuming that the probability density is constant within each cell of the partition of the descriptor space.
0:05:26 So of course, the larger the visual dictionary is, the finer the partition, and the better this approximation is going to be.
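(In symbols, with notation chosen here since the slides are not in the transcript: the normalized histogram estimates the probability mass of each cell, which amounts to a piecewise-constant density model.)

```latex
h_w=\frac{n_w}{N}\;\approx\;P(w)=\int_{R_w}p(x)\,dx,
\qquad
\hat p(x)=\frac{P\big(w(x)\big)}{\operatorname{vol}\big(R_{w(x)}\big)},
```

Here R_w is the cell of visual word w, w(x) is the nearest word to a descriptor x, and n_w of the N descriptors fall in R_w; the piecewise-constant assumption becomes exact as the cells shrink, hence the benefit of a larger dictionary.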
0:05:44 However, we cannot increase the dictionary size indefinitely, and doing so does not always improve performance: the description of the images becomes less reliable as the dictionary grows, and a larger dictionary also increases the complexity, both in the training of the classifiers and in the detection.
0:06:19 So what we are facing is a trade-off on the dictionary size, between the quality of the representation and the cost of the classification.
0:06:31 Now, an important remark about this model: the visual dictionary is built in a purely unsupervised way, and the clustering does not use the label information at all.
0:06:53 This is somewhat surprising, because the feature vector is then used by a classifier to detect the concepts, so we may suspect that there is some information lacking, and that the visual dictionary construction could benefit from the labels.
0:07:13 This is exactly what we propose in this presentation: our contribution is to use the label information in the construction of the visual dictionary, that is, in the construction of the feature representation itself.
0:07:33 Indeed, if we look at the descriptors together with the labels of the images they come from, we may find, for instance, two visual words which appear in images labeled with the same concepts.
0:08:02 In such cases, distinguishing between these two visual words actually does not matter for the detection, so it may be more interesting to merge them.
0:08:17 So how do we exploit this label information?
0:08:29 The way we do it is the following: instead of building the dictionary directly at the target size, we first build a much larger dictionary, actually several times larger than the desired size.
0:08:47 Then, starting from this large number of visual words, we iteratively merge them down to a final dictionary of the size we actually want to use.
0:09:03 To do so, we look at pairs of visual words and ask whether they carry the same information about the concept labels.
0:09:13 Basically, at each step we go over the candidate pairs and evaluate, for each of them, how much information would be lost if that pair were merged into a single visual word, all the concepts being considered.
0:09:40 We select the best pair and merge it, which decreases the size of the dictionary by one, and we repeat this process until we reach the target size.
0:09:59 Now, what does it mean exactly for two visual words to carry the same information about the labels?
0:10:16 The idea is that each visual word is associated with a distribution over the concept labels, and when two visual words have very similar label distributions, merging them into one loses almost nothing, so that the final set of visual words keeps as much as possible of the information we would like to have for the concept analysis.
0:10:45 In a more formal way, what we do is the following: for each concept and each visual word, we estimate the probability of the concept label given the visual word, simply by counting, for each visual word, the proportion of the descriptors assigned to it which come from images labeled with that concept.
0:11:10 Once we have these counts, we can easily compute the conditional entropy of the concept labels given the visual words, and from it the mutual information between the visual words and the concept labels.
0:11:26 Now, when we merge two visual words, the resulting dictionary carries less information about the labels, so what we want is to minimize this loss: at each step we merge the pair of visual words which minimizes the loss of mutual information.
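(Written out, with notation introduced here since the slides are not in the transcript: let n_{w,c} be the number of training descriptors assigned to visual word w which come from images labeled with concept c, and n_w their total for word w.)

```latex
P(c\mid w)=\frac{n_{w,c}}{n_w},\qquad
H(C\mid W)=-\sum_{w}P(w)\sum_{c}P(c\mid w)\log P(c\mid w),\qquad
I(W;C)=H(C)-H(C\mid W).
```

When two words w_i and w_j are merged into a single word w_ij, the merged word gets

```latex
P(w_{ij})=P(w_i)+P(w_j),\qquad
P(c\mid w_{ij})=\frac{P(w_i)\,P(c\mid w_i)+P(w_j)\,P(c\mid w_j)}{P(w_i)+P(w_j)},
```

and since H(C) does not depend on the dictionary, minimizing the loss of mutual information I(W;C) is the same as minimizing the increase of the conditional entropy H(C|W).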
0:11:50 And we repeat this merging process until we reach the desired dictionary size; a sketch of the whole greedy loop is given below.
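(A compact sketch of that loop, assuming we start from the count matrix just described; the naive O(K^2) pair search per step is kept for clarity, and the function names are made up here for illustration.)

```python
import numpy as np

def entropy_term(c):
    # n_w * H(C|w): this word's (unnormalized) contribution to H(C|W).
    n = c.sum()
    if n == 0:
        return 0.0
    p = c[c > 0] / n
    return -n * (p * np.log(p)).sum()

def supervised_merge(counts, target_size):
    # counts[w, c] = number of training descriptors assigned to visual
    # word w that come from images labeled with concept c.
    groups = [[w] for w in range(len(counts))]  # initial words behind each merged word
    counts = [row.astype(float) for row in counts]
    while len(groups) > target_size:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                # increase of H(C|W) if words i and j were merged
                loss = (entropy_term(counts[i] + counts[j])
                        - entropy_term(counts[i]) - entropy_term(counts[j]))
                if best is None or loss < best[0]:
                    best = (loss, i, j)
        _, i, j = best
        counts[i] += counts.pop(j)  # merge word j into word i
        groups[i] += groups.pop(j)
    return groups
```

Because the entropy is concave, the computed loss is always nonnegative, and the pair with the smallest loss is precisely the pair whose label distributions are the most similar, which matches the intuition given a moment ago.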
0:11:58 Note that in this basic scheme we merge based on all the concepts at the same time, so we obtain a single dictionary shared by all the concepts.
0:12:11 We could also make the visual dictionary concept-dependent, that is, build one dictionary per concept.
0:12:21 In this way, we only consider the labels of a single concept, and we merge the pairs of visual words which best preserve the information about that single concept, ignoring the others, so that each concept gets a dedicated dictionary.
0:12:40 We can also add an additional constraint on the merging: we may require that only visual words which are adjacent in the descriptor space can be merged, because without this constraint a merged visual word may correspond to a region which is not connected ("connex") in the descriptor space.
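(One way such a constraint could be realized, sketched under the assumption that "adjacent" means being among the other word's nearest centroids; the transcript does not specify the exact adjacency used in the talk.)

```python
import numpy as np

def adjacent_pairs(centroids, n_neighbors=5):
    # Candidate merges restricted to visual words whose centroid is among
    # the other word's n_neighbors nearest centroids.
    d2 = ((centroids[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    knn = np.argsort(d2, axis=1)[:, :n_neighbors]
    pairs = set()
    for i in range(len(centroids)):
        for j in knn[i]:
            pairs.add((min(i, int(j)), max(i, int(j))))
    return pairs
```

The inner loop of the merging sketch above would then simply skip any pair that is not in this set (updating the set as words get merged).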
0:13:08 Now, we have evaluated this approach on a video concept detection task: local descriptors are extracted from the images of the shots, and we use a support vector machine as the classifier.
0:13:48 The setup is the following: we fix the target dictionary size, and we compare the baseline dictionary, built directly at that size, with dictionaries obtained by merging from initial dictionaries two, four and eight times larger (without the connexity constraint for now).
0:14:09 The way we evaluate the performance of the system is with the mean average precision, which is the mean, over the concepts, of the average precision.
0:14:22 Basically, we apply the classifiers on the test data and we get a score for each concept on each shot; the shots are then ranked according to this score, and the average precision is computed on this ranked list.
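(For concreteness, this is the usual average precision over the ranked shots, averaged over the concepts; a minimal version, with names chosen here for illustration.)

```python
import numpy as np

def average_precision(scores, relevant):
    # Rank shots by decreasing classifier score, then average the
    # precision measured at the rank of each relevant shot.
    order = np.argsort(-np.asarray(scores))
    rel = np.asarray(relevant, dtype=float)[order]
    precision_at_k = np.cumsum(rel) / (np.arange(len(rel)) + 1)
    return float((precision_at_k * rel).sum() / max(rel.sum(), 1.0))

# mean average precision = mean of the per-concept APs:
# map_score = np.mean([average_precision(scores[c], labels[c]) for c in concepts])
```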
0:14:45 These first results are based on a target dictionary size of five hundred, with initial dictionary sizes of one thousand, two thousand and four thousand, that is, two, four and eight times the target size.
0:14:56 The baseline, a dictionary built directly at size five hundred, gives a mean average precision of about 7.7 percent. If we instead build the dictionary by supervised merging, we still end up with a final dictionary of size five hundred, but coming from one of the larger initial dictionaries, and we see that we get performances between 7.9 and 8.1 percent.
0:15:33 So the supervised merging does bring something over the usual construction.
0:15:39 These results are for the global scheme, which uses the whole label distribution over all the concepts; we also tried the concept-dependent scheme.
0:15:49 And actually, the concept-dependent results are not better: we did not gain in mean average precision by having one dictionary per concept.
0:16:08 The reason is probably that many concepts have only few positive examples, so the label information available for a single concept is weak: the annotation may be fine when it is used over all the training data at once, but taken per concept it does not bring enough information.
0:16:42 Furthermore, we also tried the experiment with a larger target size of one thousand.
0:16:54 So if we start the baseline with a dictionary of size one thousand, it again gives a precision around seven percent.
0:17:04 If we do the supervised merging from initial dictionaries two, four and eight times the target size, then we see that we still gain with respect to the baseline, so this merging process consistently outperforms a visual dictionary built directly at the same size.
0:17:38 Looking concept by concept, we can see that the merging improves the performance for the concepts whose precision is already reasonable, while for a number of concepts which are very difficult, whose precision is close to zero, it does not really change the performance.
0:18:02 But overall there is a clear gain, and that is the important point.
0:18:11 And now, what happens with the connexity constraint, that is, when we restrict the merging to visual words which are adjacent in the descriptor space? This did not give such good results.
0:18:32 Actually, with an initial dictionary of size one thousand we still increase the performance, but if we increase the size of the initial dictionary further, then we see that the performance drops.
0:18:48 So here, again, we have a problem with this constraint, which limits the pairs that can possibly be merged, and we decided not to use it.
0:19:04 We also applied the method in a setting where part of the images are, so to say, unlabeled [this part of the recording is largely unintelligible]: the entropy-based merging is then driven only by the data for which labels are available, and what we observe is that this does not hurt the detection.
0:20:00 Another way to look at all this is to fix the final dictionary size and to vary the initial dictionary size, and to see how this affects the performance of the classification.
0:20:27 So the experiment here is to reduce the dictionary size and see how far down we can go.
0:20:38 What we can see is that, starting from an initial dictionary of size two thousand and merging down to smaller and smaller sizes, the precision stays almost flat over a wide range, and it only starts to drop when the dictionary becomes very small.
0:21:13 Pay attention to the fact that reducing the size of the dictionary reduces the complexity of the classifier, which is an important point for the detection step.
0:21:24 So the supervised merging gives us a tool for dictionary size reduction which preserves the label information: a dictionary reduced by merging to a given size almost always keeps about the same performance, and this is the key point.
0:21:49 So we have here a technique which is efficient for balancing the detection performance against the complexity of the detection.
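(To reproduce that kind of size-versus-performance curve, one could sweep the target size with the merge function sketched earlier; train_and_evaluate below is a hypothetical stand-in for whatever classifier training and mean-average-precision evaluation are in use.)

```python
# Hypothetical sweep: merge one large dictionary down to several sizes
# and measure the detection performance at each size.
for size in (2000, 1000, 500, 250, 100, 50):
    groups = supervised_merge(counts, target_size=size)
    print(size, train_and_evaluate(groups))  # e.g. mean average precision
```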
0:22:03 Thank you.
[Applause. Questions from the audience follow; the questions themselves are mostly inaudible in the recording.]
0:22:19 [Question about the complexity of the method, inaudible.] Well, the complexity, we can actually [unintelligible] in the descriptor space.
0:22:41 Oh yes, and the distance is actually [unintelligible].
0:22:58 [Question about the keypoints, inaudible.] Oh, the keypoints, we detect them using [unintelligible] detection; it's the basic detection.
0:23:18 [Question, inaudible.]
0:23:35 Yes, I think the results would be very similar, because we would just be changing the number of visual words: we would use the same descriptor space, but we would need about as many words to represent the same information, so we don't think the results would change.
0:24:23 [Question, inaudible.] There is a simple formula; if you look at the process, the quantities involved are continuous. It's not a distance, strictly speaking, but it is quite simple to compute, a very simple formula.
0:25:02 Yes... that is all possible; that would be an interesting question to look at, and we will think about that.
0:25:33 But the point is that the issue is to balance the distance in the descriptor space against the label information: the distance in the descriptor space is what the initial clustering is based on, and this is why the initial dictionary matters. [The end of the answer is unintelligible.]