0:00:14so good afternoon everyone
0:00:16um this research is a joint effort between ins are N runs
0:00:21and the university of over in in finland
0:00:26as we all know um we can describe
0:00:29computer programs
0:00:30at different levels of abstraction by using different
0:00:34programming language
0:00:37um that the lower level of abstraction we can use for example as simply
0:00:43and then on the other extreme extremes we have a
0:00:46such languages is thus
0:00:47some functional of is
0:00:50and then in there between some commonly used language which used like see
0:00:56no these and different languages have a
0:00:59different properties
0:01:01if we program mean ask simply
0:01:03generally we think that we get
0:01:05more efficient implementation
0:01:08on the other hand
0:01:09if we go to the other extreme
0:01:12we get a nice features like code reusability
0:01:16we can analyse the designs
0:01:18we can be a very find that our programs are correct
0:01:22and also our productivity and programming probably goes up
0:01:28in this
0:01:30we're concentrating on a language called are we seek out
0:01:34which is pretty high level it's
0:01:36definitely about C
0:01:40and this R P C cal is a data flow language
0:01:44which is actually a subset
0:01:47of the cal language that has originally been developed that the U C berkeley
0:01:53R B C cal was recently
0:01:55standardise find that
0:01:57i S O standardisation organization
0:02:01and i will
0:02:02now so you a simple example
0:02:04how the programs look like in our we seek out
0:02:10this here is in a program that we can imagine
0:02:13it consists of three actors
0:02:16B and C
0:02:18and as we know in data flow actors are the elements that do would actual the processing
0:02:25now the data between these actors is transmitted over these
0:02:29five for boppers here
0:02:32there are no other connections
0:02:34these connections are be
0:02:36all the ones that exist between
0:02:38actors and enable the communication
0:02:42if we go inside an actor
0:02:44uh we see that
0:02:46these are actually a finite state machines
0:02:49so we have a different states
0:02:51like this
0:02:51skit age here
0:02:53and then we have a state transitions that we call actions
0:02:58and it's these actions that actually
0:03:01do the data processing
0:03:02so that consume data
0:03:04and they produce data
0:03:09the main difference of R V C cal compare to
0:03:13um traditional data flow language use like a D S that was
0:03:17in the previous presentation
0:03:19is that um
0:03:21are we see cal allows
0:03:22conditional execution
0:03:24so we can have a data dependencies in the al gore
0:03:30on the one side
0:03:31this is quite good
0:03:33because it makes the land which are applicable to a wide a set of applications
0:03:38we can have a data-dependent applications
0:03:41but there's also a now side that the line which is hard to analyse both score
0:03:47programmers and four compile
0:03:52now the topic of this work
0:03:54is that we want to improve the efficiency of the programs
0:03:58that we have described with these are we see cal language
0:04:04as you remember from this picture
0:04:07or are seek is here on the pretty high level of abstraction
0:04:11and what we want to do
0:04:13is to push
0:04:14the efficiency of this language
0:04:15closer to those
0:04:17programs that have been manually rate than for example in C
0:04:23we want to do this
0:04:26oh sorry
0:04:27um so you know are we seek how um each data flow actor runs really independently
0:04:34basically this is a good thing
0:04:36because it means
0:04:37that we can re use these
0:04:39actors in other programs also
0:04:45in practice when we look at the complete that application that's read and you know V C cal
0:04:50we actually find out that these actors are
0:04:53really depending on each other
0:04:55and the reason for this is that
0:04:57um and actor need state ah
0:05:00to be able to operate
0:05:02and if there's no data and actor cannot execute
0:05:06on the other hand
0:05:07if the five for buffer is full
0:05:09and actor condo parade either because there's no space for the output data
0:05:16so we try to
0:05:18automatically discover these inter dependencies between actors
0:05:23and with this information optimized the whole program
0:05:27and make it run fast
0:05:30and the approach that we chose here is
0:05:33dynamic dynamic
0:05:34program is
0:05:35which means that we actually observe
0:05:38a running program
0:05:40and based on the information we get from there we can generate the new program which hopefully is more efficient
0:05:47than the original one
0:05:51so now we can go into the details of our method
0:05:56um i told you about this feature of our we seek out that it enables
0:06:00data dependencies
0:06:02and the first step in our method is to
0:06:06these data dependencies the are we seek help program
0:06:11and we will call these
0:06:15signals that calls
0:06:16the data dependencies
0:06:18control signals
0:06:21and we have made a detection rule for finding these signals automatically
0:06:27and the rule is like this
0:06:29um if the value of data in coming from the five oh affects the behaviour of an actor
0:06:36is a control signal
0:06:39because this is a bit harder to grasp so quickly
0:06:42i will show you an example
0:06:44so this here is and are we seek out program
0:06:48we go to each and every actor in this
0:06:53and we inspect
0:06:55each and every input port
0:06:57of every actor
0:06:59at some point we arrive at this at at their here
0:07:04and the middle input port
0:07:06and we note this
0:07:07that the data that comes to this court
0:07:10actually affects the behaviour of these
0:07:13at actor
0:07:17we know that they signal here
0:07:19called E type
0:07:20is the control signal
0:07:25so now that we have discovered these control signals
0:07:29we can go on
0:07:36as we know the control signals we want to express the behaviour of the whole are we see cal network
0:07:42as a function
0:07:44of the data that values
0:07:46go to these control signals
0:07:51to be able to do that
0:07:53we have to insert
0:07:55spatial act there is
0:07:57in these are V see cal network
0:07:59we called than token gate
0:08:04so here an example
0:08:06uh suppose we have two act there's a a and B
0:08:10and we have a detected
0:08:12that the C Q know
0:08:13equals five four
0:08:15between then
0:08:16is a control signal
0:08:19with split the signal and in
0:08:21a token gate active or there
0:08:27so now there has been this bus of word
0:08:30strand and that has a here here
0:08:33um a strand
0:08:34we define as a sequence of actor execution or actor invocation
0:08:41we define it so that each value
0:08:45of of the control signal that passes to this token gate
0:08:49we in the bulk
0:08:50one strand and
0:08:51that this actor invocations sequence
0:08:54at runtime
0:08:58we can automatically detect
0:09:01these are
0:09:03the help of this
0:09:04token gate act
0:09:08we do with so
0:09:10we let one token at that time to this token gate
0:09:15and observe or the value of that token
0:09:18and then we record
0:09:20the set of actors
0:09:22that this so can invoke
0:09:27however unfortunately this is not enough
0:09:30because there's more flexibility and the cal model
0:09:34uh generally these actors can behave in many different ways
0:09:39for one control signal value
0:09:43so we also need to find out all the different actor behave years
0:09:48that can
0:09:49uh take place for one
0:09:51uh token gate value
0:09:59we observed
0:10:00that the behavior of an are we see cal actor
0:10:04can be fully predict the deed
0:10:06before it actually execute
0:10:08when wind you know know the following properties
0:10:12of an active or
0:10:14we have to know the values of the actors state variable
0:10:21we have to know how many tokens there are are at the input ports of that factor
0:10:28and we also need to know
0:10:29what the values of those tokens are
0:10:35and this set of features here
0:10:37we called the signature of the actor
0:10:41and ask we analyzed the operation of the B C cal network
0:10:47we record
0:10:49C nature of the actor
0:10:51you for we let it to execute
0:10:56and then
0:10:57for every signature
0:10:58we see what happens
0:11:00we record the set of
0:11:02state transitions
0:11:03that happened and
0:11:04in the act
0:11:08finally this is what we get
0:11:11you know we have a
0:11:12one uh and gate token
0:11:14or one control signal value
0:11:17it is it can have different values like this and here or or in here
0:11:23and each of these values in the boat
0:11:25one strand
0:11:27this is one and
0:11:29this here is the second one and this is that third one
0:11:33these is trans
0:11:34consist of actor invocations
0:11:37like the first strand here consists of six actor invocations
0:11:41one two three four five C
0:11:45and then in some actors we have a
0:11:48option on different behaviour patterns like this can behave in two different ways
0:11:54and this one can behave in five different weights
0:11:57and we differentiate between these
0:12:00alternative patterns turns
0:12:02based on the signature
0:12:06now that we have all this information
0:12:09we can go to the final phase of code generation
0:12:12where we create
0:12:14um the new program that hopefully is more efficient
0:12:18and the original one
0:12:23so we have a model to the functionality of the application with these
0:12:27gate token values and actor signature
0:12:32so now we generate a C code of course we could also generate other kind of code like
0:12:39pause col or a simply or whatever
0:12:42but we have chosen C
0:12:45practically we in sir
0:12:47a set of
0:12:48switch statements to these places where we have these
0:12:52uh different gate token mountains or active can each
0:12:56oh we don't have time to go to any example code
0:13:00but um we have to go directly to the
0:13:05we test it this uh method with
0:13:08a few versions of an and pick for part two
0:13:12video decoder
0:13:14um these where
0:13:15different implementations of these video decoders and
0:13:19we got
0:13:21different speed ups for different implementations
0:13:26some decoders
0:13:27became more than two times faster
0:13:30where on the ones gained all only about fifteen percent of speed
0:13:38so as a conclusion um
0:13:41we have presented a fully automated approach to speed up
0:13:45programs for than in R B C cal
0:13:49and the are average speed was about one point five times
0:13:54um for future work
0:13:57um based on what we have a learned from this
0:14:00uh work
0:14:01we hope that we can create that's static analysis method we which means that
0:14:07we don't beat you
0:14:08each sound mean a running program but we examine
0:14:12the source code of the program
0:14:14and hope to
0:14:16optimize the application to that
0:14:20and another direction of future work is improve in the code generation
0:14:26which should also improve the speed ups we get
0:14:30and finally
0:14:32we can make them in method applicable to a wide set of applications
0:14:36um it by
0:14:39um let's say rewriting writing that
0:14:41method such that we can have
0:14:43more than one token gate in an application
0:14:46the present
0:14:49method we have it only allows one token gate are application
0:14:53which is a bit twisted D
0:14:58so thanks for your
0:15:06a Q are there questions from a yes
0:15:14uh uh i use that it's uh dynamic approach right so you are basically a ink with some
0:15:20that's sample
0:15:21and uh
0:15:24i mean
0:15:24how many past samples did do not or when you on different test sequence
0:15:29does it it's to defend of the musicians or or i mean how you deal be we fact
0:15:34that the
0:15:35basically you need to have set college each and uh
0:15:38you have to learn a little
0:15:40to to basically optimized to set called that
0:15:43you can now
0:15:44yeah that that's
0:15:47yes you're completely correct so
0:15:49uh this dynamic or um code down at is means that
0:15:54the performance of this optimization is dependent on the input data we can also call it the training data
0:16:01uh we used to one uh a low resolution video sequence as the training data
0:16:08and then we test data the functionality and got these numbers
0:16:12by a high resolution
0:16:13well for high resolution sequence
0:16:16not where much longer
0:16:18but of course i mean there can always be situations
0:16:21not your training data that's not contain
0:16:25all the different options that are there and in that case
0:16:29the decoding might actually fail
0:16:31we have a some um um
0:16:33let's say safety mechanisms against that but
0:16:36that's why this
0:16:38static analysis would be much better
0:16:45i i just one question so you don't attempt to create a co
0:16:50you're training
0:16:52you have
0:16:53some sense or is there some way to get
0:17:08no at the moment there is no such mechanism
0:17:13i think if we would implement that we would already go to watch the
0:17:18static code analysis which we
0:17:21intend to to ran any case so i guess that's a clear direction of future work
0:17:33uh we thank you