0:00:15 Good morning everybody. My name is [inaudible], from the University of Strathclyde, Glasgow, and I'm going to present our work, titled "DSP Embedded Smart Surveillance Sensor with SWAD-Based Tracker". If I speak too fast, just tell me to slow down. The other co-authors of the paper are [inaudible] from Texas Instruments and Professor [inaudible], also from the University of Strathclyde.
0:00:42 This is the outline of my presentation. First of all I will give a brief introduction to set up the scene, and then I will state the objectives of our work. After that I will show an overview of the entire system, and then I will talk in detail about the video analytics within the system. I will then show some results to benchmark our work, and finally I will draw the conclusions and show some future work.
0:01:11 Video surveillance is the monitoring of an environment by means of video cameras. This is convenient because we can use multiple cameras to surveil a wide area, and a single operator, sitting here, is able to look at fifteen video feeds at the same time. Another benefit is that we can analyse the video, and we can store it for future access.
0:01:34 But there is a problem: what happens when we have too many video feeds and the operator is asleep? In this case a suspicious individual is walking around, but the surveillance personnel is sleeping. So the problem is the level of attention, the reaction time and crime prevention. We don't want to use the surveillance footage afterwards, for a trial; what we want is to react straight away.
0:02:01 Video analytics is the semantic analysis of video data by computer systems, using image and video processing techniques. In this case we talk about smart surveillance, because we have the video feeds and we apply smart algorithms to analyse the video. When we have video analytics, what we want to achieve is smart sensors: the video analytics is embedded on processors which are then attached to the cameras, so we can create smart units and deploy the intelligence at the edge of the network.
0:02:38 When we have multiple smart surveillance sensors, like in this case, we can surveil a whole building in real time. We don't need to send all the video streams to a central station; we just need to send the relevant information, for example that an object is abandoned, or that a person has to be tracked.
0:03:02 So the aim of this work is to create a smart surveillance sensor for automatic tracking using a PTZ camera, that is, a camera that can be commanded to follow an object. The objectives are to implement the smart algorithms on a DSP board, to automatically control the PTZ from the board, and to be able to activate and deactivate the tracking algorithm from remote.
0:03:32 This is an overview of the system. In the centre you can see the EVM, which is the DSP board, and the PTZ, which is our camera. When they are connected together, we can process the video streams from the camera on the DSP; in this case we can talk about a smart sensor. The board is a Texas Instruments DM6437 EVM, which is a fixed-point DSP, and there is an Ethernet connection between the two. The camera is a PTZ unit which can pan 360 degrees and tilt [inaudible] degrees.
0:04:12 The software is implemented in C, with a minimal system on the board. We also have server software on the EVM, so we can send commands to activate and deactivate the algorithm remotely, and we run in real time, at more than twenty-five frames per second. On the PTZ we have an HTTP server, but this is proprietary, so we don't change anything there; we just send commands to drive the PTZ.
0:04:37 And this is the video analytics. Basically, we acquire the video stream, we de-interlace and decimate it, so we have smaller frames to process, and then we apply our tracking algorithm. The result of the tracking algorithm is used to control the pan and tilt: we send commands to the camera so we can follow the target, and this closes the loop.
0:05:02 So this is how: the video stream is acquired in YCbCr, we de-interlace it, and we discard the chrominance components, retaining only the luminance component, since the algorithm actually works on grayscale images. And then we decimate, so we have smaller frames.
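As a rough sketch of this front-end step (the 4:2:2 UYVY byte order and the decimation factor of 2 are my assumptions for illustration; the talk does not give these details):

```c
#include <stddef.h>
#include <stdint.h>

/* Extract the luminance plane from an interleaved YCbCr 4:2:2 frame
 * (UYVY byte order assumed) and decimate it by 2 in both directions,
 * so the tracker works on a quarter-size grayscale image. */
void luma_decimate(const uint8_t *uyvy, size_t w, size_t h,
                   uint8_t *gray_out /* (w/2) x (h/2) */)
{
    for (size_t y = 0; y < h; y += 2) {
        for (size_t x = 0; x < w; x += 2) {
            /* In UYVY, the Y sample of pixel x on row y sits at
             * byte offset (y*w + x)*2 + 1; chroma bytes are skipped. */
            gray_out[(y / 2) * (w / 2) + (x / 2)] =
                uyvy[(y * w + x) * 2 + 1];
        }
    }
}
```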
0:05:22 The tracking algorithm is based on template matching: we use a sum of weighted absolute differences, which is similar to the SAD, and then we have an adaptive template update. More details about this algorithm are given in the paper.
0:05:42 So, starting from frame i we have a region of interest Ri, and we try to find the best match for the template Ti. As you see here, we have the region of interest and we have the template, and we try to find the best match within this region of interest; this is the basic concept of template matching. The region of interest for the next frame is then defined as the area surrounding the best match, so in that case we have Ri+1, and this is our new region of interest.
0:06:16 To find the best match we minimise the SWAD coefficient you can see here. SWAD basically stands for sum of weighted absolute differences, where the weighting kernel is a Gaussian kernel. This is because we want to give more weight to the pixels in the centre of the target, and to discount the pixels at the edges of the template, which may belong to an occluding object or to the background.
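A minimal sketch of this matching step, with an illustrative 3x3 template and a Gaussian-like integer weight kernel of my own choosing (the actual kernel size and weights are not given in the talk):

```c
#include <stdint.h>
#include <stdlib.h>
#include <limits.h>

#define T 3  /* template side, illustrative only */

/* Gaussian-like weights: centre pixels count more than edge pixels,
 * so edge pixels belonging to an occluder or to the background
 * contribute less to the score. */
static const int W[T][T] = {
    {1, 2, 1},
    {2, 4, 2},
    {1, 2, 1},
};

/* Sum of weighted absolute differences between template tpl and the
 * TxT block of frame at (bx, by); stride is the frame width. */
long swad(const uint8_t *frame, int stride, int bx, int by,
          const uint8_t tpl[T][T])
{
    long s = 0;
    for (int y = 0; y < T; y++)
        for (int x = 0; x < T; x++)
            s += (long)W[y][x] *
                 abs(frame[(by + y) * stride + (bx + x)] - tpl[y][x]);
    return s;
}

/* Exhaustive search over the region of interest: return the (x, y)
 * of the block minimising the SWAD coefficient, i.e. the best match. */
void best_match(const uint8_t *frame, int stride,
                int rx, int ry, int rw, int rh,
                const uint8_t tpl[T][T], int *mx, int *my)
{
    long best = LONG_MAX;
    for (int y = ry; y <= ry + rh - T; y++)
        for (int x = rx; x <= rx + rw - T; x++) {
            long s = swad(frame, stride, x, y, tpl);
            if (s < best) { best = s; *mx = x; *my = y; }
        }
}
```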
0:06:48 To update the template, once we have found the best match, we compute the template for the next frame: we start from the current template and the best match, and we fuse them together using this formulation, which is basically an IIR filter driven by a learning factor. In this way we can incorporate changes in the appearance of the target into the template, keeping it up to date for the tracking in the next frames.
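The update rule just described, an IIR blend of the old template and the best match under a learning factor, can be sketched as follows. The integer fixed-point weighting is my assumption about how a fixed-point DSP would avoid floating point; the actual formulation is in the paper:

```c
#include <stdint.h>

/* Template update: T[i+1] = (1 - alpha) * T[i] + alpha * B[i],
 * where B is the best-matching block found in the current frame and
 * alpha_num/256 plays the role of the learning factor. Integer
 * arithmetic with rounding stands in for fixed-point DSP math. */
void update_template(uint8_t *tpl, const uint8_t *best, int n,
                     int alpha_num /* 0..256 */)
{
    for (int i = 0; i < n; i++)
        tpl[i] = (uint8_t)(((256 - alpha_num) * tpl[i] +
                            alpha_num * best[i] + 128) >> 8);
}
```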
0:07:19 Once we have the position of the target we can control the PTZ, that is, the camera, and we do this through simple HTTP requests sent to the HTTP server on the camera. Here you can see some commands for the PTZ: basically we have the IP address of the camera, the user name, and the command itself, which is six bytes sent to the camera. This is done from the DSP on the board to the camera over the Ethernet network. To decide how to control the PTZ, whether to move up, down, left or right, we detect whether the best match falls in the top region, the bottom region, the left region or the right region of the frame. The idea is that if the best match is near an edge, it is likely that the target is going out of the field of view, so we send the command and drive the PTZ up or down, left or right. In this way we are able to control the PTZ and follow the target.
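The edge test can be sketched like this; the zone margin and the returned command codes are placeholders of mine (the talk only says the real commands are six-byte strings sent over HTTP to the camera's proprietary server):

```c
/* Decide which way to drive the PTZ from the best-match position.
 * If the match centre enters a border zone of the frame, the target
 * is probably leaving the field of view, so we pan/tilt towards it.
 * Returns a placeholder command code, not the camera's real 6-byte
 * protocol: 'U','D','L','R' for the four directions, 0 for "stay". */
char ptz_command(int cx, int cy, int frame_w, int frame_h, int margin)
{
    if (cy < margin)            return 'U'; /* near top edge  -> tilt up   */
    if (cy >= frame_h - margin) return 'D'; /* near bottom    -> tilt down */
    if (cx < margin)            return 'L'; /* near left edge -> pan left  */
    if (cx >= frame_w - margin) return 'R'; /* near right     -> pan right */
    return 0;                               /* target well inside the view */
}
```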
0:08:22 These are some frames taken from the memory of the DSP. You can see that the black box is the region of interest and the red box is the target, that is, the best match; on the top left-hand side you can see the template for the current frame. As you can see, the target is moving, and at the top you can see the template is updated with the changes, so we can always find the best match.
0:08:51 For comparison we also use accuracy and precision. Basically we have the position given by the tracker and the position from the ground truth, and we compute the Euclidean distance between them: accuracy is the mean of this distance, and precision is its standard deviation.
0:09:06 We applied the algorithms, with MATLAB implementations, to four sequences. For the first sequence you can see that basically all the trackers follow the target, but the SAD and the NCC, the normalised cross-correlation, perform worse, because they are fooled by the pixels at the edges of the template. As you can see, in the middle part of the sequence they are fine, but they fail when the person in the video is partially occluded.
0:09:43 In the second sequence you can see that the normalised cross-correlation and the SAD diverge, which means that they lose the target, while the mean shift and the SWAD can still follow the target.
0:09:56 These are the other two sequences. Again we see that the normalised cross-correlation and the SAD, in the first graph, diverge, so again they lose the target. In the last example we have a lot of occlusion, and the mean shift also loses the target. So basically the SWAD-based tracker performs better than the SAD, the NCC and the mean shift across these sequences.
0:10:23 And here there are some numbers on accuracy and precision. You can see that the accuracy of the SWAD tracker is always lower, that is, better, than that of the other trackers across all the sequences, and the precision is usually lower as well. This proves once again that we have good tracking performance.
0:10:40 For the execution time: the algorithm is implemented on the DSP on the board, and the SWAD block matching takes seven milliseconds, so the whole processing is well below forty milliseconds per frame. This means we process at more than twenty-five frames per second, so we achieve our aim, which is real-time processing. This efficiency is obtained through intrinsics, which are C functions implemented for the particular architecture, in this case the fixed-point DSP architecture. We use the intrinsics for the sum of absolute differences, which work on groups of four bytes, or four pixels: in one cycle of the SWAD matching loop we process four pixels, so basically we cut the computation down by a factor of four.

0:11:41 As an example, the non-optimised version of the same algorithm takes sixty-three milliseconds, which is nine times more.
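A portable illustration of the idea: the inner loop consumes four packed pixels per iteration. On the C64x-family fixed-point DSPs this is the kind of work that intrinsics such as `_subabs4` and `_dotpu4` do in very few cycles; that mapping is my assumption about the implementation, since the talk only says SAD intrinsics working on groups of four bytes were used:

```c
#include <stdint.h>

/* Absolute difference of four packed bytes, then sum them: a portable
 * stand-in for what a C64x intrinsic pair does on one 32-bit word.
 * Processing 4 pixels per loop iteration is what cuts the block
 * matching cost by roughly a factor of four. */
static uint32_t sad4(const uint8_t *a, const uint8_t *b)
{
    uint32_t s = 0;
    for (int k = 0; k < 4; k++)
        s += (uint32_t)(a[k] > b[k] ? a[k] - b[k] : b[k] - a[k]);
    return s;
}

/* SAD over an n-pixel row pair, n a multiple of 4. */
uint32_t sad_row(const uint8_t *a, const uint8_t *b, int n)
{
    uint32_t s = 0;
    for (int k = 0; k < n; k += 4)
        s += sad4(a + k, b + k);
    return s;
}
```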
0:11:50 This is a working example of our system. You can see the board that drives the PTZ: the video stream from the PTZ goes into the board, the board analyses the video and drives the camera to follow the target. This is a recorded demo.
0:12:06 This is taken from a remote display, from the remote viewer. You can see that the camera is moving to follow the target as the target moves left to right, and here he goes far away from the camera. Even as he moves away from the camera, the algorithm is still able to track the target and control the PTZ, so we always keep the target in the field of view.
0:12:40 In conclusion, I presented a DSP embedded smart surveillance sensor which uses a PTZ camera to follow the target as it tries to move out of the field of view. The video analytics runs on the DM6437 fixed-point DSP, and the tracker we use is the SWAD-based tracker. The results show high accuracy and precision under partial occlusion.
0:13:08 For future work, we will try to include complete occlusion handling as well. This is taken from a paper of ours which is about to be published: here you see a video with the SWAD-based tracker without occlusion handling, and you can see that the tracker loses the target as it becomes occluded, while with the occlusion-handling technique we are able to recover the target as it comes out of the occlusion. So for future work we will try to implement this feature on the board as well.
0:13:47 This concludes my presentation. Thank you for listening, and if you have any questions I'm happy to answer.
0:14:21 At the moment we don't use the zoom feature of the camera, so yes, when the target moves close to the camera, the size of the target on the screen gets larger, and for simplicity we don't handle that. But, as you can see, this is partly solved because we update the template: as you see here the target gets smaller, and as he comes closer to the camera, we can incorporate the changes of the target into the template. At the moment we don't adjust the size of the template to the target; that's another thing to do in the future.
0:15:22 Right, this tracker is not for face tracking or for any particular object: it is generic target tracking, so it works as long as there is a target with, let's say, a good texture, so that you can discriminate the target from the background. Here we start from the face as an example, and then he moves closer to the camera; obviously the face becomes too big for the template. So it is generic, for any object, not only for faces.
0:16:21 You mean for the future work I mentioned? Okay. In that paper, to deal with complete occlusion, basically what we do is we don't update the template when the target is under occlusion. Instead of updating the whole template at the same time with the same weight, we have different weights for all the pixels in the template, so when the target goes under occlusion we don't update the occluded side, only the visible one. Eventually, when you are updating only a few pixels on one side, it means the target is going into occlusion, so over the next few frames you can discern that it is occluded. In that case you don't update anymore, and you say the target is occluded; then, since you have not updated the occluded side, when the target comes out of the occlusion the template is preserved, so you can again find the best match for your target.
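As a sketch of the mechanism just described (the per-pixel residual threshold, the blending rule and the visibility count are my own illustrative choices, not values from the paper):

```c
#include <stdint.h>
#include <stdlib.h>

/* Occlusion-aware template update: each pixel is refreshed only if it
 * still resembles the template (small residual). Pixels covered by an
 * occluder are left untouched, so the stored appearance survives the
 * occlusion. If too few pixels qualify, declare the target occluded
 * and freeze the template entirely until it reappears.
 * Returns 1 if the target is judged occluded, 0 otherwise. */
int update_with_occlusion(uint8_t *tpl, const uint8_t *best, int n,
                          int max_residual, int min_visible)
{
    int visible = 0;
    for (int i = 0; i < n; i++)
        if (abs((int)best[i] - (int)tpl[i]) <= max_residual)
            visible++;
    if (visible < min_visible)
        return 1;               /* occluded: keep the template as-is */
    for (int i = 0; i < n; i++)
        if (abs((int)best[i] - (int)tpl[i]) <= max_residual)
            tpl[i] = (uint8_t)((tpl[i] + best[i] + 1) / 2); /* blend */
    return 0;
}
```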
0:17:35 Yes, it can be adapted for that.
0:17:43 Well, usually in surveillance you have three components: the detection algorithm, the tracking algorithm, and then the classification, or something like that. This is only the tracking. To select the target you can either do it manually or you can use an automatic detection algorithm. Usually in surveillance systems you have a person driving the PTZ, trying to find something; they would then leave the PTZ on the target, and from there you would use this algorithm to track.
0:18:41 Okay, the template... it depends on the learning factor. In this case, as we process at more than twenty-five frames per second, we give more weight to the previous template than to the best match, but in a real application you can choose the learning factor according to what you want to give more weight to. If you want to be conservative, if you want to preserve the template, then you give more weight to your previous template. If you want to adapt very fast, then you give more weight to the best match; in that case you give, for example, thirty percent to the template and seventy percent to the best match, so you are able to incorporate the changes into the template faster.