0:00:13one introduction
0:00:14and uh i had a rabbit
0:00:16today i'm going to present improved rho-domain rate control with accurate header size estimation
0:00:22and this is joint work with my adviser professor a constant
0:00:28so we know that rate control is an
0:00:31essential part of a practical video encoder
0:00:35sometimes we care about the bit rate of the encoded bit-stream
0:00:40but in most
0:00:42real-time applications we want to control the size of each frame accurately
0:00:46and the rho-domain rate control method was originally proposed
0:00:50by just stopped at six three and i
0:00:54uh
0:00:55the basic assumption of rho-domain rate control is that the size of a frame is
0:01:00proportional
0:01:01to the percentage of nonzero coefficients
0:01:06after quantization in the frame
0:01:08and experimental results have shown that rho-domain rate control can
0:01:14control the size of the
0:01:16frames
0:01:17accurately
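A minimal Python sketch of the assumption stated above, that frame size is proportional to the share of coefficients that stay nonzero after quantization. The plain rounding quantizer and the names `qstep` and `theta` are assumptions for illustration and are not taken from the talk. In the rho-domain literature rho usually denotes the share of zero coefficients, so the quantity computed here corresponds to one minus rho.

```python
def fraction_nonzero(coeffs, qstep):
    """Share of transform coefficients that stay nonzero after quantization."""
    # Plain rounding quantizer without a dead zone (an assumption for illustration).
    quantized = [int(abs(c) / qstep + 0.5) for c in coeffs]
    return sum(1 for q in quantized if q != 0) / len(quantized)


def estimate_frame_bits(coeffs, qstep, theta):
    """Linear model: bits = theta * share of nonzero coefficients."""
    return theta * fraction_nonzero(coeffs, qstep)


# Hypothetical coefficients of one small block.
coeffs = [40.0, -12.5, 3.0, 0.4, -0.2, 7.9, 0.0, -25.0]
share = fraction_nonzero(coeffs, qstep=8.0)   # 4 of the 8 coefficients survive
bits = estimate_frame_bits(coeffs, qstep=8.0, theta=1000.0)
print(share, bits)
```

A larger `qstep` zeroes more coefficients, so the estimate shrinks; `theta` would have to be fitted from frames that were already encoded.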
0:01:19but when we want to use the rho-domain method with h.264
0:01:24the accurate estimation of the amount of header information becomes one of the
0:01:29problems
0:01:30this is because this side information is not of interest in the original rho-domain model
0:01:36so if we want to make rho-domain rate control
0:01:40work well with h.264
0:01:42we need to find a way to
0:01:44estimate the size of the header information
0:01:47in the picture
0:01:49this is our motivation
0:01:52and this is the outline of my talk today first is this motivation which we have just talked about
0:01:57next we will introduce some rate models for the header information
0:02:01and we will also introduce a
0:02:04two-stage rate control algorithm based on rho-domain rate control
0:02:07this is followed by some experimental results
0:02:10where we compare our rate control method with
0:02:13some
0:02:14previous work
0:02:16and finally we will draw the conclusion
0:02:20so
0:02:21first what is included in the header information
0:02:26we know the two most common headers are the slice header and the macroblock header
0:02:33the size of the slice header is relatively stable
0:02:38once we know the settings of the encoding
0:02:43we can accurately estimate the size of the slice header
0:02:47and the format headers
0:02:48but the size of the macroblock header changes
0:02:51from macroblock to macroblock and it is harder to estimate
0:02:56for intra macroblocks
0:02:59because the number of intra macroblocks is quite limited in p frames
0:03:04we simply use the
0:03:08header size of the already encoded intra macroblocks
0:03:11to estimate the header size of intra macroblocks in the current frame
0:03:16and for
0:03:17inter macroblocks we see that there are several different
0:03:21kinds of header information in an inter macroblock
0:03:25first is the motion information including the motion vectors and the reference frame indices
0:03:30then there is the type information
0:03:32the macroblock types and so on
0:03:35and then there is the cbp information so the coded block pattern and [inaudible] information
0:03:42and this figure shows the percentage of the
0:03:45different
0:03:47kinds of
0:03:48header information
0:03:49in relation to the
0:03:51header size
0:03:53the x-axis here is the frame index and the y-axis is the percentage of the different kinds of information
0:04:00so that the name
0:04:02oh really
0:04:04is relatively small
0:04:06so we simply use the information in the past to estimate the
0:04:12number of bits for macroblocks in the current frame
0:04:16and for the type information
0:04:18we have only the macroblock types and the block types
0:04:22we only need some counters and this type information is encoded using
0:04:26a fixed table
0:04:29so given the number of the different kinds of
0:04:33macroblocks
0:04:35we can estimate the size of the
0:04:38type information accurately
0:04:41and for motion information and cbp information we see that they change a lot from frame to
0:04:46frame
0:04:47and they also occupy a large part of the total header size
0:04:53so in the following we will focus on the estimation of the size of
0:04:57motion information and
0:04:59cbp information
0:05:03a
0:05:04let's start with the motion information in two thousand seven
0:05:08fashion
0:05:09who is a student of professor clear
0:05:12proposed a good rate model for the
0:05:17motion information
0:05:19so
0:05:21this model estimates the
0:05:24size of the motion information based on statistics of the
0:05:28motion vectors
0:05:29here this is the number of
0:05:32nonzero motion vector elements
0:05:34and a motion vector element is simply the horizontal or the vertical
0:05:38component
0:05:39of a motion vector
0:05:41and this is the
0:05:44number of motion vectors
0:05:46omega is a constant that we can derive from experiments
0:05:51and gamma is a parameter that we need to update from frame
0:05:56to frame
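The formula itself was on a slide and is not spoken, so the following Python sketch is only one reading of the description above: a scale factor `gamma` times a bracketed item made of the number of nonzero motion vector elements plus `omega` times the number of motion vectors. The function name and the example values are hypothetical.

```python
def motion_bits_from_mvs(motion_vectors, gamma, omega):
    """Estimate motion bits as gamma * (nonzero MV elements + omega * MV count).

    The bracketed form is an assumption reconstructed from the spoken description.
    """
    n_vectors = len(motion_vectors)
    # Each motion vector contributes two elements: horizontal and vertical.
    n_nonzero_elements = sum((mvx != 0) + (mvy != 0) for mvx, mvy in motion_vectors)
    return gamma * (n_nonzero_elements + omega * n_vectors)


# Hypothetical motion vectors of four partitions.
mvs = [(0, 0), (2, -1), (0, 3), (4, 0)]
estimate = motion_bits_from_mvs(mvs, gamma=3.0, omega=0.5)
print(estimate)  # 4 nonzero elements and 4 vectors give 3.0 * (4 + 0.5 * 4)
```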
0:05:57the experimental results there have shown that this model works well
0:06:03but in our experiments we find that this model is not able to
0:06:08precisely predict the resulting bit rate
0:06:11the resulting
0:06:12number of bits accurately
0:06:14here we encode different sequences at a given bit rate
0:06:19and the x-axis
0:06:22is this item in the bracket
0:06:25and the y-axis
0:06:26here is the
0:06:27number of bits
0:06:29and the red points here show the number of
0:06:33motion bits
0:06:34inside a frame
0:06:37so
0:06:39we see that
0:06:41there is not a very strong
0:06:44linear relationship between the number of motion bits and this item here
0:06:48we think the problem here is that h.264 does
0:06:53not directly encode the motion vector
0:06:56instead it uses predictive coding and what is encoded is the
0:06:59difference motion vector
0:07:01so we think that the number of motion bits should have a strong relationship
0:07:05with the statistics of the difference motion vectors
0:07:09so based on this idea
0:07:11we slightly modify this model
0:07:14for a better prediction
0:07:16and this is our modified model
0:07:19we see that now we use the number of nonzero
0:07:23difference motion vector elements
0:07:25and this is the number of zero
0:07:28difference motion vector elements
0:07:30and again omega is a constant we can derive from experiments
0:07:35and gamma is a parameter we need to update from macroblock to macroblock
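A hedged Python sketch of the modified model described above, which counts elements of the difference motion vectors instead of the motion vectors themselves. Reading the second count as the number of zero difference elements is an interpretation of the recording. The predictors are passed in by the caller here; in H.264 they are derived from neighbouring blocks, and that derivation is left out. All names are hypothetical.

```python
def difference_vectors(motion_vectors, predictors):
    """Difference motion vector = motion vector minus its predictor."""
    return [(mx - px, my - py)
            for (mx, my), (px, py) in zip(motion_vectors, predictors)]


def motion_bits_from_mvds(mvds, gamma, omega):
    """Estimate motion bits as gamma * (nonzero MVD elements + omega * zero MVD elements).

    The bracketed form is an assumption reconstructed from the spoken description.
    """
    elements = [e for mvd in mvds for e in mvd]
    n_nonzero = sum(1 for e in elements if e != 0)
    n_zero = len(elements) - n_nonzero
    return gamma * (n_nonzero + omega * n_zero)


# Hypothetical smooth motion field: the vectors are large but well predicted.
mvs = [(4, 2), (4, 2), (5, 2), (0, 0)]
preds = [(4, 2), (4, 1), (4, 2), (0, 0)]
mvds = difference_vectors(mvs, preds)
estimate = motion_bits_from_mvds(mvds, gamma=3.0, omega=0.5)
print(mvds, estimate)
```

In this example most motion vector elements are nonzero while most difference elements are zero, which is the situation where the two models would disagree.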
0:07:39we have done the
0:07:41experiments for the same sequences
0:07:45and we see that now
0:07:47the x-axis is this item in the bracket
0:07:51and the y-axis is the number of motion bits
0:07:55we see that now there is a linear relationship
0:08:00between the number of
0:08:02motion bits and this item here
0:08:04so we think this model can be used
0:08:06to predict the size of the motion information
0:08:11but
0:08:12if we look at the other points which show the total number of header bits in a
0:08:18frame
0:08:18we see that there is
0:08:22no strong relationship between
0:08:24the number of header bits and this item here
0:08:28as we have mentioned
0:08:29another very important part of the macroblock header size is the
0:08:34cbp information
0:08:37so
0:08:38to estimate the size
0:08:40of the header information we also need a
0:08:44model
0:08:45to estimate the size of the cbp information
0:08:50in the reference paper the author has proposed that the number
0:08:55of cbp bits should be proportional to the
0:09:00number of texture bits
0:09:02and from the rho-domain model the number of texture bits
0:09:06should be proportional
0:09:08to the percentage of nonzero coefficients
0:09:11so here
0:09:13we have done some
0:09:16experiments
0:09:17to check this
0:09:18relationship
0:09:20the x-axis here is one minus rho which is the percentage of nonzero
0:09:25coefficients
0:09:26and the y-axis is the number of cbp bits
0:09:31we see that although there is some kind of linear relationship
0:09:35it is not as strong as we have
0:09:37expected
0:09:38we think this is because the
0:09:42number of cbp bits depends not only on the
0:09:46number of nonzero coefficients
0:09:48but also on the distribution
0:09:50of these coefficients
0:09:53so in this paper
0:09:54we want to
0:09:57find a model which also considers the distribution of nonzero coefficients
0:10:02and
0:10:03this is our proposed rate model
0:10:06for cbp information
0:10:07we see here
0:10:12this is the number of nonzero macroblocks
0:10:15and this is the number of zero macroblocks
0:10:17and a zero macroblock is
0:10:19defined as a macroblock
0:10:22in which
0:10:23all the coefficients are zero
0:10:26so we use
0:10:27this number as an indication of the distribution of the nonzero coefficients
0:10:33and omega is
0:10:35still a constant and gamma is a parameter we need to update adaptively during
0:10:40the encoding
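A hedged Python sketch of a cbp model of the kind described above. The slide formula is not audible, so the bracketed item used here (nonzero macroblocks plus `omega` times zero macroblocks) is an assumption, and the rule for refitting `gamma` from the bits actually spent is just one simple choice, not necessarily the speaker's. All names are hypothetical.

```python
def cbp_item(mb_nonzero_counts, omega):
    """Bracketed item: nonzero macroblocks + omega * zero macroblocks (assumed form)."""
    # A zero macroblock is one whose quantized coefficients are all zero.
    n_zero_mb = sum(1 for n in mb_nonzero_counts if n == 0)
    n_nonzero_mb = len(mb_nonzero_counts) - n_zero_mb
    return n_nonzero_mb + omega * n_zero_mb


def estimate_cbp_bits(mb_nonzero_counts, gamma, omega):
    """Estimated cbp bits for the macroblocks described by their nonzero counts."""
    return gamma * cbp_item(mb_nonzero_counts, omega)


def update_gamma(actual_cbp_bits, mb_nonzero_counts, omega):
    """Refit gamma so the model reproduces the bits that were actually spent."""
    item = cbp_item(mb_nonzero_counts, omega)
    return actual_cbp_bits / item if item > 0 else 0.0


# Hypothetical number of nonzero coefficients in each of six macroblocks.
counts = [0, 5, 0, 12, 3, 0]
predicted = estimate_cbp_bits(counts, gamma=4.0, omega=0.25)
new_gamma = update_gamma(actual_cbp_bits=30.0, mb_nonzero_counts=counts, omega=0.25)
print(predicted, new_gamma)
```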
0:10:41so we see now
0:10:43the x-axis
0:10:45is this item in the bracket
0:10:47and the y-axis is the number of cbp bits
0:10:50we see that now there is a strong linear relationship
0:10:54between this item here and the number of cbp bits
0:10:57so we will use this model to estimate the size of the cbp information
0:11:03so these
0:11:05are the two models
0:11:06we propose
0:11:08and now we can introduce our two-stage rate control algorithm
0:11:12this is basically the same
0:11:15as
0:11:15proposed in the original rho-domain rate control
0:11:18model
0:11:19so we have two stages
0:11:21first we do frame level bit allocation so we determine how many bits we
0:11:26allocate to the frame
0:11:29and in the analysis stage we do motion estimation and the dct transformation
0:11:35and we store the coefficients
0:11:37the type information and the motion information in a buffer
0:11:40this will be used later by the encoding stage
0:11:43as well as by the rate control
0:11:45and in the encoding stage we actually encode the macroblocks
0:11:50for each macroblock we first estimate the number of header bits for the remaining
0:11:55macroblocks
0:11:56except the cbp bits for inter macroblocks
0:12:00we cannot include the cbp bits because
0:12:03they also depend on the selected qp if we use a different qp
0:12:11the number of nonzero coefficients will be different
0:12:15and the number of cbp bits will also be different
0:12:17so we can only estimate this together with the texture bits
0:12:21and then we subtract
0:12:22the estimated header bits from the
0:12:27current bit budget and we get the bit budget for
0:12:31cbp and texture bits
0:12:32next we find the qp such that the sum of the number
0:12:38of cbp bits
0:12:39and texture bits
0:12:42does not exceed the
0:12:44current bit budget
0:12:46we use our proposed model to estimate the
0:12:48number of cbp bits
0:12:50and for texture bits we use the rho-domain rate model
0:12:54then we use the selected qp to encode the macroblock and after
0:12:59each macroblock encoding
0:13:01we update the
0:13:02parameters in the rate models
0:13:05and after we have done this for all the macroblocks in the frame we can update
0:13:11the other parameters in the models for prediction of the
0:13:15next frame
0:13:17this is the basic work flow of our proposed
0:13:20two-stage rate control algorithm
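The qp selection step in the work flow above can be sketched in Python as follows. This is a simplified illustration, not the speaker's implementation: the two dictionaries are hypothetical per-qp estimates standing in for the cbp model and the rho-domain texture model, and choosing the smallest qp that fits the budget is an assumption (the talk only says the sum must not exceed the budget).

```python
def select_qp(bit_budget, header_bits, cbp_bits_by_qp, texture_bits_by_qp):
    """Pick the smallest qp whose estimated cbp + texture bits fit the budget."""
    # Header bits that do not depend on qp are taken out of the budget first.
    remaining = bit_budget - header_bits
    for qp in sorted(cbp_bits_by_qp):
        if cbp_bits_by_qp[qp] + texture_bits_by_qp[qp] <= remaining:
            return qp
    # Nothing fits: fall back to the coarsest candidate qp.
    return max(cbp_bits_by_qp)


# Hypothetical model outputs for three candidate qp values.
cbp_bits = {26: 120, 28: 100, 30: 80}
texture_bits = {26: 900, 28: 720, 30: 600}
qp = select_qp(bit_budget=1000, header_bits=200,
               cbp_bits_by_qp=cbp_bits, texture_bits_by_qp=texture_bits)
print(qp)
```

After the macroblock is encoded with the chosen qp, the actual bit counts would be used to refresh the model parameters before the next macroblock.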
0:13:22and we can have a look at some
0:13:26experimental results
0:13:27we encode different sequences at different bit rates
0:13:31and here we compare four
0:13:34rate control algorithms
0:13:36that is
0:13:37the rate control in
0:13:40x264 because our algorithm is implemented in the
0:13:45x264 encoder
0:13:47and the original rho-domain rate control
0:13:51without header size estimation
0:13:53and for the sake of comparison we also
0:13:58use the model in the reference paper to estimate the header size of
0:14:03each frame
0:14:05first we compare these four algorithms on the sequence level
0:14:12that is we want to see how close the
0:14:16actual bit rate is to the target bit rate
0:14:19we see that
0:14:21actually all four rate control algorithms
0:14:25perform well
0:14:26the resulting bit rate is very close to the
0:14:30target bit rate
0:14:31and here we also show the psnr
0:14:36comparing the
0:14:38three rho-domain rate control algorithms with the
0:14:41rate control in x264 we see that
0:14:44for the two
0:14:46rho-domain rate control algorithms
0:14:47with header size estimation we can achieve
0:14:50uh
0:14:52applies to pairs are
0:14:54this is on the
0:14:55sequence level and then we can go down to the
0:14:58frame level
0:15:00to see the frame size fluctuation of the different
0:15:03algorithms
0:15:04here we compare
0:15:07three rho-domain rate control algorithms
0:15:10we in of the uh to sequence for about
0:15:13uh uh and uh the type difference that is
0:15:16for hundred about
0:15:17and of see is that of it
0:15:19and this is
0:15:20the original rho-domain rate control without header size estimation
0:15:25and we see that compared with
0:15:27the two methods with header size estimation
0:15:30the fluctuation
0:15:32of the frame size here is larger
0:15:35so we see that with header size estimation we can
0:15:38reduce the
0:15:40frame size fluctuation in the
0:15:42sequence
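The talk does not say how frame size fluctuation is measured, so the Python sketch below only shows two plausible summaries: the largest deviation from the target frame size and the standard deviation of the frame sizes. The function name and the sample numbers are hypothetical.

```python
import statistics


def frame_size_fluctuation(frame_sizes, target_size):
    """Summarise how far the encoded frame sizes stray from the target size."""
    deviations = [size - target_size for size in frame_sizes]
    return {
        "max_abs_deviation": max(abs(d) for d in deviations),
        "std": statistics.pstdev(frame_sizes),  # population standard deviation
    }


# Hypothetical frame sizes in bits around a target of 400 bits.
sizes = [400, 420, 380, 400]
summary = frame_size_fluctuation(sizes, target_size=400)
print(summary)
```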
0:15:46and then we can go down to the
0:15:49macroblock level where we want to see the qp variation within a frame
0:15:53we know
0:15:53that macroblock level rate control algorithms allow the qp to be adjusted
0:15:58for each macroblock so that we can meet the target frame size accurately but if
0:16:04the qp changes too much
0:16:06we will have
0:16:08quality fluctuations within the
0:16:11frame
0:16:11so here we simply show some experimental results
0:16:14uh this adds a fifty cent of had was for him in the for the accused
0:16:20you can spend that
0:16:21and we compare the three rho-domain rate control algorithms
0:16:24this one here is again
0:16:29for the original rho-domain rate control without
0:16:31header size estimation
0:16:33so we see that
0:16:34at the beginning of the frame the qp here is
0:16:37very large
0:16:38and then
0:16:40the qp changes dramatically
0:16:44we think this is due to the lack of header size estimation
0:16:47so that at the beginning of a frame
0:16:49the rate control cannot estimate the
0:16:51resulting
0:16:52number of bits accurately and at the end of the frame it needs to
0:16:56change the qp
0:16:57dramatically
0:16:59and we see for the two algorithms
0:17:02with
0:17:03header size estimation we can achieve a smaller
0:17:06qp variation
0:17:09for example
0:17:10with
0:17:11our proposed rate control algorithm we see that
0:17:14for almost the whole frame the qp values
0:17:17do not change
0:17:19and here the qp changes only by a very small amount
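The qp variation within a frame can be summarised in several ways; the talk shows it as a plot and gives no formula. The Python sketch below reports the qp range and the number of changes between consecutive macroblocks, as an illustration only. The function name and the two qp traces are hypothetical.

```python
def qp_variation(qps):
    """Return (qp range, number of qp changes between consecutive macroblocks)."""
    changes = sum(1 for a, b in zip(qps, qps[1:]) if a != b)
    return max(qps) - min(qps), changes


# Hypothetical per-macroblock qp traces for one frame.
stable_trace = [30, 30, 30, 31, 30, 30]      # qp nearly constant
unstable_trace = [40, 38, 34, 30, 26, 22]    # qp corrected in every macroblock
print(qp_variation(stable_trace), qp_variation(unstable_trace))
```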
0:17:24so these are our
0:17:26experimental results
0:17:27and now we can draw
0:17:29the conclusion
0:17:29in this paper we have proposed rate models
0:17:32for estimating the size of the header information in h.264
0:17:36and we also introduced a two-stage rho-domain rate control algorithm
0:17:40with header size estimation
0:17:42and with header size estimation we can achieve better rate control accuracy we can
0:17:47achieve smaller frame size fluctuation in the
0:17:50sequence
0:17:51and we can also achieve
0:17:52smaller qp variation within a frame
0:17:56okay i think this
0:17:57is all of my talk
0:18:01we can have a couple of questions
0:18:08so you compute
0:18:11yeah
0:18:12yeah you compared
0:18:14i
0:18:15well
0:18:17i
0:18:18yeah
0:18:20yeah
0:18:21yeah so so
0:18:22this one
0:18:24the the the
0:18:25the right car is for the
0:18:28this this
0:18:28the
0:18:30yeah
0:18:31but in the reference paper actually for the source bits they
0:18:34use the
0:18:36quadratic model
0:18:38not the rho-domain model
0:18:40here we have used their model
0:18:42together with
0:18:43the rho-domain model
0:18:45this is the
0:18:46experimental result
0:18:48does
0:18:51it
0:18:52also works
0:18:54also works
0:19:07so it was that you straight
0:19:09the extreme
0:19:12the that's you extract you
0:19:14okay so in this experiment for each sequence we encode
0:19:18three hundred frames
0:19:21but only the first
0:19:22frame is an i frame
0:19:24and the following
0:19:26are all p frames
0:19:28oh
0:19:29be
0:19:29hmmm
0:19:30oh i hope will be you
0:19:33uh uh
0:19:35we do not use any b frames
0:19:38so we do not use any hierarchical b frames
0:19:41oh
0:19:59so
0:19:59uh_huh
0:20:01you
0:20:01usually larger size
0:20:04uh
0:20:05however
0:20:05yeah
0:20:06uh the only tried uh cues if the consistency
0:20:11we get a result
0:20:14what
0:20:17still
0:20:19hi
0:20:19which
0:20:20uh
0:20:21if
0:20:22okay because
0:20:23also do that
0:20:26i think
0:20:28for higher resolution it depends on the target bit rate
0:20:34so
0:20:36maybe if the if the uh target uh
0:20:39target bit rate is
0:20:40i
0:20:41i think
0:20:42uh picture piecemeal
0:20:44okay cap most of the
0:20:46basing the sequence
0:20:47so
0:20:48so how does that
0:20:49mission
0:20:50will be nice
0:20:52efficient
0:20:52but the of the
0:20:54uh
0:20:54bit-rate this but
0:20:56we don't in the middle
0:20:57little
0:20:59i think that uh
0:21:00uh
0:21:01how does that
0:21:01estimation
0:21:02we bring
0:21:03a lot of
0:21:04okay
0:21:06yeah
0:21:08you
0:21:09i
0:21:11if
0:21:12i
0:21:14as you
0:21:15which
0:21:18i
0:21:20you know
0:21:21yeah
0:21:23i
0:21:25oh
0:21:27actually we haven't tried high resolution
0:21:31yeah
0:21:32we haven't done any experiments for high resolution sequences
0:21:35but i think
0:21:40the percentage of nonzero coefficients
0:21:45on one hand depends on the size of the frame
0:21:48and on the other hand
0:21:49it also depends on the target bit rate
0:21:52so you you want to do
0:21:54right
0:21:54the high uh
0:21:55hi uh resolution second
0:21:57with a you are there is slow
0:22:00a only a little bit rate
0:22:03you you can to you
0:22:05i speech
0:22:06uh_huh you
0:22:08it
0:22:09yeah just
0:22:10it makes it
0:22:12yeah sure
0:22:13so
0:22:13this this case
0:22:14i think that uh
0:22:15um
0:22:16maybe the
0:22:17percentage of nonzero
0:22:19you you higher are much higher
0:22:21then our experiment
0:22:24okay okay