thank you very much
as you can see i'm not the first author of this paper
but unfortunately the first author cannot be here
today because his family grew by a second daughter two weeks ago
i'm working at the international audio laboratories erlangen which is a
joint institution
of
the university of erlangen-nuremberg
and the fraunhofer institute for integrated circuits
the motivation for the work i'm going to present here
is that in music production you often rely on
mixing prerecorded material
samples
and you frequently need to adapt these samples to
a different musical context
than
the context they were recorded in
so in some cases you might need a key mode conversion
this means major to minor or vice versa
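As a rough illustration of what a key mode conversion does at the note level, the sketch below maps MIDI notes between a major key and its parallel minor. The talk only says "major to minor or vice versa"; the choice of the natural minor scale, the table names and the function `convert_mode` are assumptions made for this example.

```python
# Assumption: parallel natural minor, i.e. the 3rd, 6th and 7th scale degrees
# of the major scale are lowered by one semitone. Keys are semitones above
# the tonic.
MAJOR_TO_MINOR = {4: 3, 9: 8, 11: 10}
MINOR_TO_MAJOR = {v: k for k, v in MAJOR_TO_MINOR.items()}

def convert_mode(midi_note, tonic_pc, to_minor=True):
    """Map one MIDI note between parallel major and natural minor keys."""
    table = MAJOR_TO_MINOR if to_minor else MINOR_TO_MAJOR
    degree = (midi_note - tonic_pc) % 12          # semitones above the tonic
    return midi_note + (table.get(degree, degree) - degree)

c_major_triad = [60, 64, 67]                      # C4, E4, G4
c_minor_triad = [convert_mode(n, tonic_pc=0) for n in c_major_triad]
```

Only notes on the affected scale degrees move; all other notes are left where they are, which is why the conversion has to be selective rather than a global pitch shift.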
and
an
algorithm for
enabling this task
has been presented
at
previous conferences
this is called modvoc, the modulation vocoder
it is somewhat suited to this task
but we also found out that there are
special enhancements necessary
in order to address
the special requirements of this application
so i first want to give a short overview of the modvoc
analysis
which works in a single pass
with block-wise processing
which is shown in the block diagrams here
it does
first
a signal adaptive band-pass filtering
which is aligned with spectral centres of gravity
this means we first do a
dft analysis
and from the dft spectra
the centres of
gravity
in perceptually adjusted bands
are determined and the bands are adjusted to them so this decomposition is flexible
so from these
centre frequencies
and the bands around the centre frequencies we construct bandpass filters
and
the filtering is
done in the frequency domain
and an inverse
dft
gives back for each bandpass signal
a time domain signal
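The centre-of-gravity step can be sketched as a power-weighted mean of DFT bin indices inside each band. This is a simplified illustration, not the talk's actual estimator: the fixed band edges, the naive DFT and the function names `dft_power` and `centres_of_gravity` are assumptions, and the perceptual adjustment of the bands is left out.

```python
import cmath
import math

def dft_power(x):
    """Power spectrum |X[k]|^2 for k = 0 .. N/2 (naive DFT, fine for short blocks)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def centres_of_gravity(power, edges):
    """Power-weighted mean bin index inside each band [lo, hi)."""
    cogs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        weight = sum(power[lo:hi])
        if weight > 0:
            cogs.append(sum(k * power[k] for k in range(lo, hi)) / weight)
        else:
            cogs.append((lo + hi - 1) / 2)    # empty band: fall back to its middle
    return cogs

N = 128
signal = [math.sin(2 * math.pi * 10 * t / N) + 0.5 * math.sin(2 * math.pi * 40 * t / N)
          for t in range(N)]
power = dft_power(signal)
# Two hypothetical bands; each one contains a single partial.
cogs = centres_of_gravity(power, edges=[0, 32, 65])
```

Each centre of gravity lands on the partial that dominates its band, so a bandpass filter built around it follows the signal instead of a fixed frequency grid.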
and this time domain
bandpass signal
is then analysed for am and fm
so basically you have the carrier frequency which corresponds to the centre of gravity of this spectral region
and
the fm signal which gives the
instantaneous frequency offset
relative to this carrier frequency
and you get the instantaneous magnitude, the amplitude, in the am component
and then you can process the signal in this modulation domain
for example you can change the carrier frequencies
and still maintain the fine temporal structure
by keeping the am and the fm
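A minimal sketch of this AM/FM decomposition for one bandpass signal follows. It assumes the analytic signal is obtained by zeroing the negative DFT frequencies, takes the AM as its magnitude and the FM as the instantaneous frequency minus the carrier. The function names and the test signal are illustrative, not taken from the talk.

```python
import cmath
import math

def analytic_signal(x):
    """Analytic signal via a one-sided spectrum (naive DFT; len(x) must be even)."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(1, n):
        if k < n // 2:
            spec[k] *= 2          # double the positive frequencies
        elif k > n // 2:
            spec[k] = 0           # discard the negative frequencies
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def am_fm(x, carrier):
    """Return (AM, FM): magnitude and frequency offset to `carrier` in cycles/sample."""
    z = analytic_signal(x)
    am = [abs(v) for v in z]
    inst_freq = [cmath.phase(z[t + 1] * z[t].conjugate()) / (2 * math.pi)
                 for t in range(len(z) - 1)]
    return am, [f - carrier for f in inst_freq]

def resynthesize(am, fm, carrier):
    """Rebuild the band signal from AM, FM and a (possibly changed) carrier."""
    phase, out = 0.0, []
    for a, f in zip(am, fm):
        out.append(a * math.cos(phase))
        phase += 2 * math.pi * (carrier + f)
    return out

N = 128
carrier = 20 / N
x = [(1 + 0.5 * math.cos(2 * math.pi * 2 * t / N)) * math.cos(2 * math.pi * 20 * t / N)
     for t in range(N)]
am, fm = am_fm(x, carrier)
same = resynthesize(am, fm, carrier)       # reproduces the band signal
shifted = resynthesize(am, fm, 25 / N)     # new carrier, same envelope and FM
```

Passing a different carrier to `resynthesize` moves the component in frequency while the AM and FM, and with them the fine temporal structure, stay as analysed.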
in the synthesis
you have to combine the fm component with the maybe modified
carrier frequency
you have to
somehow bond the different
components from one block to the next block because these are temporal blocks as i said before
and
you do
an overlap-add
processing of the am and the fm
or instantaneous frequency
signals in order to get continuous
parameters
and then you do
the synthesis
and add up
all the signals from the different bands you had
decomposed the signal into
so this is the basic structure of the modulation
vocoder
but due to the structure with the relatively
long blocks in the dft analysis
you still miss some of the
signal
characteristics with this processing
and this is
one of the parts we address by the enhancements
the first of these enhancements is the so-called envelope shaping
this means
temporal envelopes within
the
dft blocks
might get lost or distorted
because you can get
dispersion and you can
lose phase
relations between the different components
and
this could cause temporal smearing of transients
and in this case it's better to use
an explicit
temporal envelope
and you get access to the parameters of
this temporal envelope
by doing an lpc analysis in the frequency domain
because correlation in the frequency domain
corresponds to multiplication in the time domain
this means with the coefficients
you get from lpc analysis
along the frequency axis
you get parameters
you can use for getting
a
time function
you could say a time response
which you can
then at the end
apply to get back the temporal envelope
this is what is done
with these
red blocks
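The idea of running LPC along the frequency axis can be sketched as follows. The example assumes a DCT as the frequency transform and the autocorrelation method with Levinson-Durbin recursion; the talk does not specify either, so these choices and all function names are assumptions. The all-pole model that normally describes a spectral envelope here describes the temporal envelope of the block.

```python
import math

def dct2(x):
    """DCT-II of a block (naive implementation, fine for short blocks)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def lpc(values, order):
    """Autocorrelation-method LPC; returns ([1, a1, ..., ap], residual energy)."""
    n = len(values)
    r = [sum(values[i] * values[i + m] for i in range(n - m))
         for m in range(order + 1)]
    a, err = [1.0], r[0]
    for m in range(1, order + 1):
        acc = sum(a[i] * r[m - i] for i in range(m))
        k = -acc / err                                   # reflection coefficient
        a = [1.0] + [a[i] + k * a[m - i] for i in range(1, m)] + [k]
        err *= 1.0 - k * k
    return a, err

def temporal_envelope(x, order):
    """Evaluate the all-pole model along time: one envelope value per sample."""
    n = len(x)
    a, err = lpc(dct2(x), order)
    env = []
    for t in range(n):
        w = math.pi * (t + 0.5) / n                      # time index mapped to [0, pi]
        re = sum(c * math.cos(i * w) for i, c in enumerate(a))
        im = sum(c * math.sin(i * w) for i, c in enumerate(a))
        env.append(err / (re * re + im * im))
    return env

N = 64
click = [0.0] * N
click[40] = 1.0                                          # a single transient
envelope = temporal_envelope(click, order=2)
peak = envelope.index(max(envelope))
```

For the single click the envelope peaks at the position of the transient, which is the information that can be reapplied after the modulation-domain processing to counter temporal smearing.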
these are the
enhancements for the
envelopes
another
enhancement
which is necessary
once you start modifying spectral components
is
that you have to take into account
that
musical sounds normally consist of a fundamental and a lot of harmonics, the overtones
and you should keep this in mind when you modify frequencies
so the overtones are quasi harmonic on the linear frequency scale
they are normally integer multiples of the fundamental frequency
on the other hand musical intervals are based on a logarithmic scale
and now it's
a question
when you modify frequencies in which way you should modify them
of course we want to modify them in the best way for
what we intend to do, for example for the transposition
and we have to consider
this
because if it's an overtone of one fundamental
frequency it has to be modified in accordance with the fundamental and not according to the musical scale
as it would be if it were a single tone
on another
part of the scale
so this leads to
some kind of ambiguity when you take one tone and just
look at it on its own
so that's why we have to get some additional interpretation
to find out whether it's a
fundamental frequency
or if it's an overtone, a harmonic component of
a more complex sound structure
this is just an example
of how intervals of the scale
can match the
harmonics
and just one example to pick out
could be the number five which is
five times the
fundamental frequency of one tone alone
it could also be
another note which is a major third
apart
in this diagram the octaves are not taken into account
so you can have
some ambiguities between
the
second and also the fourth
harmonic
which would then be just multiples of
octaves and so on
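The coincidence between harmonics and scale notes can be checked numerically. The sketch below assumes twelve-tone equal temperament; the helper name `harmonic_interval` is illustrative. It maps each harmonic number to the nearest interval in semitones above the fundamental, the pitch class with octaves folded away, and the deviation in cents.

```python
import math

def harmonic_interval(h):
    """Nearest equal-tempered interval of the h-th harmonic above its fundamental.

    Returns (semitones, pitch class with octaves removed, deviation in cents).
    """
    semitones = 12 * math.log2(h)
    nearest = round(semitones)
    return nearest, nearest % 12, 100 * (semitones - nearest)

table = {h: harmonic_interval(h) for h in range(1, 9)}
for h, (semis, pitch_class, cents) in table.items():
    print(f"harmonic {h}: {semis:2d} semitones, pitch class {pitch_class}, {cents:+.1f} cents")
```

The fifth harmonic falls on pitch class 4, a major third above the fundamental once octaves are ignored, and the second and fourth harmonics fall on pitch class 0, pure octaves. A component seen on its own therefore cannot tell you whether it is a note of its own or an overtone.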
so that's why you get
a kind of ambiguity between overtones
and
the musical scale
and that's why this
second enhancement has been added to the modvoc
which is called harmonic locking
as i said before the estimated fundamentals
have to be mapped directly
and then you have to decide for the other components
if it's an
overtone
then it has to be locked to the
transposition of its fundamental
so in the processing
you decide for every tone whether it's
locked
to another
frequency or whether it
has to be transposed on its own
and by this switch
you choose either the transposition
by the midi note based mapping which is done for the fundamental frequencies
or it is
transposed according to its fundamental
if it's
non locked as indicated here
it is mapped on its own
if it's locked it has to be locked to the fundamental frequency and its mapping
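The switch between note-based mapping and locking can be sketched as below. The decision rule used here, a plain test whether a component lies within a tolerance of an integer multiple of a lower component, is an assumption made for illustration; the talk does not give the actual criterion. The tolerance, the `note_map` format and all function names are likewise hypothetical.

```python
import math

def hz_to_midi(f):
    return 69 + 12 * math.log2(f / 440.0)

def midi_to_hz(m):
    return 440.0 * 2 ** ((m - 69) / 12)

def map_note(freq, note_map):
    """MIDI-note-based mapping: shift by the entry for the nearest note, keep detuning."""
    midi = hz_to_midi(freq)
    nearest = round(midi)
    return midi_to_hz(midi + note_map.get(nearest % 12, 0))

def transpose_with_locking(freqs, note_map, tol_cents=30.0):
    """Return new frequencies (ascending order); overtones follow their fundamental."""
    fundamentals, out = [], []              # fundamentals: (old_hz, new_hz)
    for f in sorted(freqs):
        new_f = None
        for f0, new_f0 in fundamentals:
            h = round(f / f0)
            if h >= 2 and abs(1200 * math.log2(f / (h * f0))) <= tol_cents:
                new_f = f * new_f0 / f0     # locked: same ratio as the fundamental
                break
        if new_f is None:                   # non-locked: treated as a fundamental
            new_f = map_note(f, note_map)
            fundamentals.append((f, new_f))
        out.append(new_f)
    return out

# C major to C minor: lower pitch classes E, A and B by one semitone.
to_minor = {4: -1, 9: -1, 11: -1}
c4, e4 = midi_to_hz(60), midi_to_hz(64)
components = [c4, e4, 5 * c4]               # 5 * C4 lies close to an E
result = transpose_with_locking(components, to_minor)
```

In this example the fifth harmonic of C4 stays at five times C4 because its fundamental does not move, whereas `map_note` alone would have lowered it by a semitone as if it were an E. The E4 fundamental is mapped to E flat as intended.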
now we come to the listening test
methodology
it's quite
a difficult task if you do
this kind of transposition
so we selected
midi samples
which we first had in the original domain
and we did
a midi transposition to obtain
a file which we could then put into the test
so this is the transposed
reference signal which is done by midi
and then transferred to a wave file
and on the other hand we take the original wave file
and we process it
with the transposition and then we can compare the two
and we have
different versions
three versions of the modvoc and one reference
transposition
system
also present
there's one commercial system available which is the direct note access in the melodyne editor by
celemony
and this has been available since autumn
two thousand and nine
and it also allows
selective editing of polyphonic music
but it performs a multi-pass analysis
and it does an automatic decomposition into notes and
a heuristic classification rule
but it also can be used to perform this
key mode conversion
and so that's why we also tried to compare our
approach with this one
these are the items we used
from the mutopia project we used some different signals
and
different midi files
as i said before
they are shown here
and with these we
tried to get some variety of more complex
orchestral music
and some more solo instrument
parts
so quite a mixture of
complexity of
content
these were the results of the
so-called mushra test, i don't want to go too much into detail
in this test we have
normally a hidden reference
which is
denoted by one
we have
an anchor which is just a
low-pass filtered signal which is denoted by number two
and we have the modvoc
the original modvoc is number three, modvoc with the harmonic locking is four
and modvoc with the harmonic locking and the
envelope shaping
is
five, and six is the dna, the
system we compared to
but first we want to see how
our enhancements work
for this one example we see the difference between four and five, this means the addition of
envelope shaping
what we see for the guitar is that
the guitar sounds much clearer
and so it is
somewhat preferred by
the listeners
and
here we have the difference between
the original modvoc and the modvoc with harmonic locking
which
delivered better results for the piano signal
we also see that in most of the cases
the dna
performed better
i can make a first summary of these results here
the harmonic locking really improves the timbre
the envelope shaping also improves the transient
parts
but dna was rated better for five
out of seven items
and the rating could cover different aspects
of
the sound change which was performed here
like unnatural sounding artifacts, melody or chord transposition errors
timbre preservation or pitch
and listeners in many cases reported a trend towards transposition
errors
in the dna
and
timbre problems for modvoc
so we made an additional test which was a preference test
on these main quality aspects to find out more if this is really the case
for this
we had twelve expert listeners
with both technical and musical background
and we had now the extended modvoc
and compared it to the dna
and
we also found out in the first test
that an unknown melody which is
a transposed version of the original midi
is
somehow hard
to grade for people so we did it the other way round, we did the transposition with
midi and then tried to
transpose it back to the original score
with the algorithms
for four signals
which are shown here, also orchestra and some mixture and piano
now we put these
files in the preference test
and the outcome was
quite clear in the sense
that as the people
reported before
there was quite a preference for
the melody transposition for modvoc which is shown here, modvoc is on the left side
these are the results for the melody transposition
and here are the results for timbre
which
show the clear preference for the dna
i can play an example
i can play all the files
in short versions in the order as they are shown here
first the original
i think there are some problems in the
melody transposition in the dna
which are hard to assess in these listening conditions
so another example, this is the piano, if we still have time i'll play also
this one here
"'kay" so um just a short summary
we now have the modvoc for selective transposition
of pitch
which is capable of real-time processing
and which can reproduce
transients
and
also improves the timbre by harmonic locking
and it's
preferred over the commercial system
in terms of transposition of the melody but the dna is
preferred
in timbre preservation
so maybe in general
both of the systems were in the range from fair to good so there's room for improvement
but the system is already
somewhat useful
thank you
any questions
one question i had was were these trained listeners, golden ears, or were they
for the preference test they were people who also had
some music background which is
quite important for
this timbre grading let's say
but they weren't signal processing specialists or golden ears
no
any other questions
one harder question
what would you like to do if you had all the signal processing power and all the smarts you could
want
what would you like to do about
this problem
i think
the problem
can be made a bit more complicated
if you imagine that you have tones which are
a mixture of
maybe harmonics and fundamental frequencies and so on
or different harmonics of different tones which match on the grid
so then of course the decomposition is much more complicated
and of course for this you would need quite a bit more computation so i
think this would be one of the ways a further improvement could be achieved
anything else
thanks
okay can you use the microphone
on your bullet point up there about reproduction of transients improved by lpc based envelope shaping could you comment
on what that is
yes we use the lpc parameters which are obtained in the frequency
domain and apply this as a time envelope in the time domain
this is what i showed with the red blocks in the
overview diagram
thank you