0:00:15 Well, thank you for that kind introduction, Joe. You're right that luggage is an issue with me, but when I don't have my luggage it's an even bigger issue for me. I appreciate the introduction, and I thank the organizing committee for inviting me, and especially for naming this town; I don't know, Joe, I've never had this happen before when I've come to give a presentation.
0:00:52 So let me start by asking for a show of hands: who among us has participated in a forensic-style evaluation of speaker recognition technology? That's good. I'm going to try to get more hands up, with interest, by the end of my presentation. Who has processed real forensic case data? That's pretty good, okay, so I'll be preaching to the choir for some of you. And finally, who has actually testified in court? That's good; very good, okay.
0:01:41 So let me talk about some of the interesting challenges in forensic and investigatory speaker recognition. The basic introductory material for my talk is to define the problem. In forensic and investigative speaker comparison, speech utterances are compared, and the process can be carried out either by humans or by machines. In the forensic case, typically this is for use in a court of law. This is very high stakes; it demands the best that science has to offer, and those of you who pay attention to trials on television are probably pretty nauseated by what you see out there, in terms of the expert witnesses that I'll be talking about later and the methods they use. The methods vary quite widely, and there is a very nice survey paper by Gold and French that describes some of the variations in these processes, and that variation is not necessarily for the good. It's important that the methods that are used be grounded in scientific principles and be applied properly.
0:03:23 And just as important is deciding when you should not accept a case, when it would be irresponsible to do so. This idea of when to apply, or not apply, the methods is also very important. So we're going to provide some analysis of the methods, and along the way I'll play some exciting examples that I hope will get you excited about how challenging this domain really can be.
0:03:57 And one of the things I want to do here, and in the broader sense at other conferences with wide diversity, is to improve communications among the research community, this great group here, and legal scholars. We have, for example, in speech, people like Bill Thompson, who wrote about the prosecutor's fallacy and was involved in the O. J. Simpson trial, so we've got a number of very high-profile legal scholars in the US involved, and also internationally; and of course the legal systems are different throughout the world, so you have to address these questions within their contexts. And then finally, I'm going to ask this community for help and present some things that you could actually get involved in to help us make progress.
0:05:07 So I'll start by giving some background, cover some example approaches, talk about some of the activities that are currently going on, request some things for the community to get involved in, mention some future ideas, and conclude.
0:05:29 Okay, so forensics and investigation differ primarily by whether the methods will be presented in a court of law. A lot of people doing investigation will try to use a similar process, one with the rigor necessary should it later be presented in a court of law. But the forensic community and the investigative community work on similar problems in terms of trying to establish facts; it's the actual presentation forum where they differ. Now, here I have a cartoon that shows the most canonical example of a speaker comparison: we have a known speech sample and a questioned speech sample, and you compare them, and there's some summary or analysis by the forensic examiner or analyst, who might write a report. And we're not done; that's the simple view of the world.
0:06:48 I was happy when I asked a number of friends for suggestions: Michael Jessen from the BKA kindly provided this table from his summer school that shows a little more granularity in terms of forensic versus investigative, including large-scale investigation, where you might actually be running automatic systems similar to IAFIS, the FBI's Integrated Automated Fingerprint Identification System, which conducts large-scale searches through databases. You can see here that they vary in terms of whether they will be presented in court, what kind of methods are used, the number of comparisons, and the type and style of work done on the data.
0:07:51 So let me now give just a couple of examples of some forensic situations. First, you might remember the 1996 Olympics and the Centennial Park bombing. There was a thirteen-second phone call that said: "There is a bomb in Centennial Park. You have thirty minutes." That's it. So now you've got this thirteen-second call, and the people at 911 are frantically trying to figure out the address of Centennial Park so that they can dispatch officers to the scene. A lot of time passes; they have a short time to clear the park. By the time the officers get there, two people are dead and a hundred and twenty people are injured. And now they have a suspect in custody who matches the description of someone that was seen at a payphone, and that person's name is Richard Jewell. They went to quite a bit of trouble trying to establish whether this person is the one on the call. It turns out the actual person who made the call escaped the scene and was not caught for seven years.
0:09:29 Another very high-profile and recent case: Trayvon Martin. This one had all sorts of the wrong things happening all at once: extreme mismatches of every type imaginable, these outrageous claims of justified shooting, and then, just to make it more interesting, the Orlando Sentinel newspaper decides to go hire some voice experts. I don't know if they quite appreciated the conditions under which they were working. First of all, it's hardly speaker recognition when the person is crying out for help, right? And I'll show you later some of the issues involved in that. So this was a very turbulent time in the US, with a lot of controversy regarding the kind of data that was involved in this case and how inappropriate the whole situation was. We have people, by the way, like George Doddington, who's here today, to thank for keeping the system on the rails; he was one of the expert witnesses.
0:10:57 So how hard is forensic speaker recognition? Well, a first step in that direction, though not truly forensic speaker recognition, was the NIST HASR evaluation. And actually, before NIST HASR there was an evaluation by NFI-TNO that worked with real forensic case data; I'll talk about that in a moment. But in the HASR evaluation, unlike conventional NIST evaluations, where there are so many trials that it's not really practical for humans to process the data, there was a paring down to make the number of trials manageable by humans. The process for doing that was a two-stage selection: first you use an automatic system to find the most confusable pairs, and then you refine that by using humans to find the most confusable pairs among the automatic system's confusable pairs. So you have very difficult data to work with, and the benefit is that now you can have an evaluation with a mere fifteen trials, which is manageable by humans. This was the beginning of the NIST style of evaluations in this direction.
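The two-stage winnowing just described can be sketched in a few lines. Everything below (pair names, scores, cut-offs) is invented for illustration; this is not the actual HASR selection pipeline, just its shape: an automatic pass keeps the most confusable different-speaker pairs, then a human pass keeps the hardest of those.

```python
def select_confusable_trials(pairs, auto_score, keep_auto, human_score, keep_final):
    """pairs: (sample_a, sample_b) tuples from *different* speakers.
    auto_score / human_score: callables; higher = more confusable."""
    # Stage 1: the automatic system ranks all pairs, keeping the top scorers.
    stage1 = sorted(pairs, key=auto_score, reverse=True)[:keep_auto]
    # Stage 2: human listeners re-rank the survivors, keeping the hardest few.
    return sorted(stage1, key=human_score, reverse=True)[:keep_final]

# Toy data: 6 pairs; keep 4 after the automatic pass, 2 after the human pass.
pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"),
         ("a4", "b4"), ("a5", "b5"), ("a6", "b6")]
auto = dict(zip(pairs, [0.9, 0.1, 0.8, 0.7, 0.2, 0.6]))
human = dict(zip(pairs, [0.3, 0.5, 0.9, 0.8, 0.4, 0.1]))
hard = select_confusable_trials(pairs, auto.get, 4, human.get, 2)
print(hard)  # the two pairs both stages rated hardest
```

Run at scale, the first stage is what makes the second affordable: humans only ever audition the automatic system's shortlist.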
0:12:37 So I don't know if you've heard these, but let me just play one here. Here is trial eleven; I'll play the two samples, and the question asked is: are these from the same source? Here's the first one. [audio plays] Here's the second. [audio plays] So, it's pretty impressive to me: that's supposed to be, as you can see by the truth label here, two people. Like I said in Brno, I would love to actually meet these two people, see that they really are two separate people, and have dinner with them; maybe it would be a high price for the meal. Those two people confused the humans and the automatic systems consistently in the first HASR evaluation, and it inspired a lot of people to look into this interesting problem. Unlike the traditional NIST SRE protocol, HASR of course allows human listening, so this is exciting. At the time all the data was in English, which might somewhat limit some of the human approaches, but it sure gave a nice flavor of the challenge, and this is difficult. But you know what? It's not nearly as difficult as the real thing, and I'll play that in a moment.
0:14:34 So, some challenges in speaker recognition for humans and machines; I have a few slides. The NIST evals have made progress on things like channel mismatch and distance to the microphone; by progress I mean progress in evaluating these effects; also in terms of duration and cross-language, although I'm not showing those here. So this is good, but there's a lot more going on in a lot of forensic case data. Typically in these scenarios the talkers are unfamiliar to the examiner; the talkers tend to be familiar with each other, and that affects their conversation style. There can be multiple talkers; there are all sorts of different styles: conversational, read-aloud, crying speech, for example, if you want to call it speech; and then accommodation, when you have familiar talkers adapting to each other. If there's a conversation that's part of the evidence, which is often the case, the talkers might be deceptive, and I have examples of this. Sometimes you're dealing with people who are mentally ill or medicated, and there can be all these situational mismatches to deal with. This goes on and on. But here's the thing: evaluations often isolate factors, so if you have an evaluation where you evaluated a few of these factors, the problem is that in real data they are combined in horrible ways, making it even more challenging to determine the performance of a system, or a human, or a human with a system. So you can have mismatch galore between the samples that are being compared, and also against all the information used to train our automatic systems, the background data and the hyperparameters; it goes on and on.
0:16:54 Then you have additional challenges in how this information should be presented, in terms of scores or decisions. We would in general be pretty strong advocates of reporting, say, log-likelihood ratios or something like that. But a lot of the forensic people I work with, the investigators, don't want to hear a log-likelihood ratio; they want to know whether they should go take action. This gets mathematically ugly in a number of ways, because of asserting prior probabilities to make decisions. This is a very hard and tenuous situation, and an area where this community has made some progress; I'm hoping that Odyssey will actually see more in this direction.
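To see why asserting priors is so treacherous, here is a toy odds-form Bayes calculation; the numbers are invented and have nothing to do with any real case. The same log-likelihood ratio supports opposite "go take action" conclusions depending entirely on which prior odds the decision-maker asserts, which is exactly what reporting only the LLR avoids.

```python
import math

def posterior_odds(llr, prior_odds):
    """Bayes' rule in odds form: posterior odds = likelihood ratio * prior odds.
    llr is assumed to be a natural-log likelihood ratio from the comparison."""
    return math.exp(llr) * prior_odds

llr = 2.0  # evidence favours same-speaker by a factor of e**2, about 7.4
# Weak prior: the suspect is only one of about 100 plausible callers.
weak = posterior_odds(llr, prior_odds=1 / 100)
# Even prior: a 50/50 assumption before hearing the evidence.
even = posterior_odds(llr, prior_odds=1.0)
print(weak > 1.0, even > 1.0)  # False True
```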
0:17:51 Then you have this whole issue of calibration, with system scores moving around and drifting, if you will, and this causes chaos among the analysts. One of the biggest challenges in a lot of this is building trust and confidence in the analysts or examiners: if your system starts misbehaving, they might stop using it, or do something kind of crazy. So there are a lot of issues around establishing trust and having the system be reliable, stable, and calibrated.
0:18:30 Then you have the issue of the court's question. We talked about the canonical example: I've got two speech samples; is the source the same? Well, that's not necessarily the question the court hands you. It might be: the other guy has just been murdered, and we don't have any recordings of his voice. So now what do you do? There is a whole bunch of challenges in figuring out how you deal with questions from the courts that aren't known in advance. One of the things I've been pursuing with some colleagues is to see whether those questions are somewhat negotiable, and whether we can build a pretty good menu of the history of these kinds of questions, to help us as developers build systems and acquire data that address the kinds of questions likely to come up.
0:19:37 Then you have this issue with the automatic systems, where people might think they're fully automatic, but often what happens is that there are models that have been built where a human has segmented the speech and decided which speech utterances are assembled to create the models. So you've got this kind of chicken-and-egg problem, right? I'm trying to recognize speakers, but when I'm training my models I need to do some segmentation. So there's that factor to keep in mind also.
0:20:18 Then there is a whole bunch of other things going on here. I've already talked about "when to punt," in terms of not accepting a case; "when to punt" is an expression from American football, and I'm not sure it translates internationally. And then there are some other issues about noise and degradation that are important to keep in mind, and we'll talk more about those in a moment.
0:20:50 So now let's actually hear some real case data. This is pretty fascinating, I think. I'm going to play some examples. The first one, I'll set it up for you: a triple homicide has just been committed. The suspect runs from the scene with one of the victims' cell phones and their Bluetooth, and he's calling his friend to come and pick him up. He's running as fast as he can, the wind is blowing, and it's a very difficult situation. So let me play this. [audio plays] So that has a lot of characteristics that you probably are not used to working with in, say, the NIST evaluations. This is really challenging stuff, and it gets better, because now we have the suspect in custody, and in his jail call he's kind of converted to sounding like Justin Bieber. So listen to this. [audio plays]
0:22:58 So that's a pretty big mismatch, wouldn't you say? I don't know what you would do with data like that. That's just one example of incredible mismatch, and not only between the samples themselves; well, maybe the last one isn't terribly unlike a lot of the training data our systems are built with, but I'd be surprised if our systems have been trained, and have their hyperparameters and background models, knowledgeable of data like that first sample. So this is extreme mismatch not only between the samples but against our systems. Let me play another example of a very complex situation, where you have some pretty stressed, overlapping talkers. [audio plays] How many talkers are there in that situation? It sounded like about three to me, but I'm not sure. And in a part I didn't play, at the beginning, you've got the operator answering 911, and then you hear the person whispering and then putting the phone into their pocket, where they found it later, unfortunately; he was then the victim. So this is the type of situation that gets into questions like: what question am I trying to answer? How many people were present? Who said what? The area of disputed utterances, as it is known in the forensic community. So these guys, of course, are all rounded up, and they're all claiming, "No, it's the other guy that shot him; I was just visiting," and so on. So those are the kinds of challenges you're dealing with.
0:25:24 Another example is a very interesting threat call, and this one has some timeliness about it as well. So listen to this first recording. [audio plays] The audio system in here is pretty good; I don't know if you could make that out, but the guy is basically giving the address of a house that's going to be attacked by gunmen tomorrow. Wow. Better decide what you're going to do. So they decide to bring in a suspect, and here's his interview. [audio plays] So there are a number of things going on. In that first call it seems like the person was, like in the movies, holding a handkerchief over the phone; it sounded like they had marbles in their mouth. In the second one, I don't know if they were medicated or what was going on there, but there is a lot of mismatch in that situation. And for investigative purposes, even though you're not in a court of law, it still has high stakes when you decide to take somebody into custody; I mean, that's a dramatic experience, right? So you still need to be cautious about how to proceed, but it's very difficult to make a quick decision in situations like this.
0:27:20 And this is just a small part of it. As Reva Schwartz at the US Secret Service says, it's always something, every case. There is a case where somebody had a sex change operation between the first sample and the second sample that were being compared. A lot of our systems are gender-dependent, so what do you do? There are just so many challenging situations that come up when you're dealing with real forensic case data. And I should add: when samples get elevated to the level of a national resource like Reva Schwartz, those are the hardest of the forensic cases; the easier ones can be handled at a lower level. So these are very challenging situations.
0:28:26 And one might ask: how do I figure out whether I should process this data, and whether it can be admitted in court? If I'm in the United States, I have an admissibility standard to deal with: Daubert. So, for example, in US federal court, and in about half of the US state courts, the judge will consider the admissibility of scientific evidence. But judges are often the first to admit that generally they're not scientists, so they have this sort of gatekeeper role pushed onto them. The idea, under Federal Rule of Evidence 702 on testimony by expert witnesses, is that the purpose is to assist the trier of fact, the judge or the jurors; if the evidence is going to be very confusing, then it's not admitted. So this is kind of loose, and the courts in the US have tried to structure it; they formed the so-called Daubert test, from Daubert v. Merrell Dow Pharmaceuticals, and basically four or five different factors, depending on how you read it, are introduced in this Daubert test.
0:30:12 So: has the method been, or can it be, tested? Well, one of the nice things about our community is that we do test a lot; I'm not sure that we test on this kind of data. Another: has it been subjected to peer review and publication? Well, our community is very good at publishing papers, and this Odyssey is just one of those excellent forums. Now we're in trouble: does it have a known error rate? Wow. Well, if you tell me what error rate you want, I can find the corpus that will probably give you that error rate. That's not the answer they want to hear, right? They want something pretty solid, much more certain, like, for example, DNA, which by the way also has variability, but that's a whole other story; at least it's relatively small compared to what we experience in the voice world. Are there existing standards controlling its use, and are they maintained? Well, currently there's very little in that area, but I'll be talking in a moment about some activities in that direction in the US, and about learning what's happening internationally, which is one reason I'm glad to be at this workshop. And then the last one is sort of this friendly thing: is it generally accepted by the scientific community? Then you get into all these problems, like: what's a community? What's the scientific community? This last part is also known as the Frye test, which predated the Daubert test.
0:32:12 So, looking at the basic anatomy of a speaker comparison system, you can form two parallel branches that start with feature extraction and creating models, then go through a comparison of the hypothesis that the samples match versus the hypothesis that they don't, and then produce a calibrated match score output. Now, that's fine. However, there are all these knowledge sources under the hood, and all these areas that are ripe for mismatch. So, for example, let's just take an i-vector system. We have this signal processing chain, and the different stages are shown here where we need all these different kinds of background information, whether it's hyperparameter tuning, the universal background model, the total variability matrix, or the covariance matrices that are needed to make these systems successful.
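As a toy illustration of how those buried knowledge sources bite, the sketch below compares one fixed pair of two-dimensional "embeddings" after normalizing them with two different sets of background statistics. It is not an i-vector extractor; the vectors and statistics are invented. The point is only that the score moves substantially when nothing changes but the background.

```python
import math

def whiten(x, bg_mean, bg_std):
    """Normalize an embedding with background-population statistics,
    standing in here for UBM / total-variability hyperparameters."""
    return [(xi - m) / s for xi, m, s in zip(x, bg_mean, bg_std)]

def cosine_score(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    return dot / (math.sqrt(sum(p * p for p in a)) *
                  math.sqrt(sum(q * q for q in b)))

known, questioned = [1.0, 2.0], [1.2, 1.8]

# Background statistics that roughly match the case data...
s1 = cosine_score(whiten(known, [0.9, 1.9], [0.5, 0.5]),
                  whiten(questioned, [0.9, 1.9], [0.5, 0.5]))
# ...versus statistics estimated from mismatched background data.
s2 = cosine_score(whiten(known, [0.0, 0.0], [2.0, 0.1]),
                  whiten(questioned, [0.0, 0.0], [2.0, 0.1]))
print(round(s1, 3), round(s2, 3))  # same pair, very different scores
```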
0:33:30but there's more
0:33:33what about calibration
0:33:35i need to train that system is well
0:33:39and a system that's not calibrated will drive in one is absolutely crazy
0:33:45and you lose their confidence and they'll stop using your system
0:33:51so this is a very important stage it's great the nico heads the paper here
0:33:56on
0:33:56calibration and weights to address this again
0:34:01one of nicholas favourite topics of mine too
0:34:05so basically you want to try to minimize all these nuisance as a some of
0:34:10which
0:34:12if you're processing single here's of samples at a time you can get a good
0:34:17handle on other nuisances are partly due on single pair comparisons
0:34:22those have to deal with logical consistency with the to use two samples matching
0:34:29and then another pair of samples matching but the others powder samples not match and
0:34:34when i say matching i don't mean that in the binary sense i mean scoring
0:34:39high
0:34:42 So, calibration is a good thing; it makes analysts happy and smiling when it works. Thank you to everyone who works on it.
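A minimal sketch of that calibration stage, assuming the common affine model llr = a*score + b fit by logistic regression on labelled development trials. The trial scores below are invented, and the plain gradient-descent fit is a toy, not any fielded calibration tool.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_affine_calibration(scores, labels, steps=5000, lr=0.1):
    """Fit llr = a*score + b by logistic regression
    (label 1 = same speaker, 0 = different speaker)."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # residual for this trial
            grad_a += err * s / n
            grad_b += err / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Drifted raw scores: same-speaker trials sit near 5, different near 3,
# so the raw numbers are useless as log-likelihood ratios.
scores = [5.1, 4.9, 5.2, 3.0, 2.8, 3.1]
labels = [1, 1, 1, 0, 0, 0]
a, b = fit_affine_calibration(scores, labels)
llrs = [a * s + b for s in scores]
print([x > 0 for x in llrs])  # positive LLRs for same-speaker trials only
```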
0:34:54 So now what? What do you do if you want to combine these methods? This also gets quite complicated. Do we weight these processes dynamically, taking into account when they are working in areas they've been developed and trained on, and de-weighting them when they are running a little outside the regions they've been developed for? How do we mitigate observational bias? You certainly don't want a human examiner to know the scores from the automatic system before they finish their evaluation. But it gets even more fine-grained than that: sometimes you hear content in the samples you're working on that can bias you, and you might consider removing that content at the expense of working with less data. You've got all these variabilities to deal with: the subjects of the samples themselves, and the humans who are actually conducting the comparison process; not all analysts are alike, for example, and the machines vary as well. There are issues of consistency and repeatability; I already mentioned the desire for logically consistent scores; and then there's having best practices to establish how to use these processes. Remember, one of the Daubert criteria is the existence of standards, and their maintenance, for invoking these processes.
0:36:45 So, to work on this, there are a number of evaluations that can help us. NFI-TNO, I think in 2003, had the very first one on real forensic data; that was a lot of fun. The agreement required that you destroy the data after you were done; unfortunately, we abided by the agreement, so we no longer have that data, but it was really very nice. The good news is that there might be more of that coming. Then we have the NIST HASR series, which isn't quite forensic, but it's probing some dimensions that I think will help us make progress in the forensic domain. And the next SRE might actually have real forensic samples. So, I think it's important to look at all this in the context of the Daubert factors, especially for application in the United States, but maybe throughout the rest of the world as well; they seem like pretty sound principles to me. But if there are additional factors used internationally, I would love to know about them, to make sure they're being addressed, at least in our work, as well.
0:38:14 So, some activities. In the US we have SWG-Speaker, the Scientific Working Group on Speaker Recognition. A lot of the efforts that motivated starting it came from the 2009 report of the National Research Council of the National Academy of Sciences, "Strengthening Forensic Science in the United States." It basically called all of forensic science on the carpet and said: the practice that's used for DNA is a gold standard; the rest of you should model it. It called into question things like carpet fiber analysis and tool marks, things that scientifically just didn't quite have the background in terms of their development, and that's partly because forensic science didn't grow up being developed by scientists. So one area that we've worked real hard to address with the investigatory and forensic voice working group is making progress on things like the different use cases and collection standards; I've already mentioned best practices, or "best practice," pun intended; standard operating procedures; and this new Type-11 standard. The scientific working group has a number of ad hoc committees, including our DET committee, which a number of you would probably be interested in, and committees on best practices, science and the law, and vocabulary, to get the whole community talking together.
0:40:08 The best practices committee, for example, deals with a number of areas, including the collection of audio recordings and the related data that goes with an audio recording: maybe the phone numbers, the handsets used, a number of things like that. Some of those factors should be passed to the examiner; others might cause bias you have to be concerned about. Then there's the transmission part of the standard, known as the Type-11 record, which you'll probably be hearing a lot about; and then the proper application, and also guidelines for examiners and reporting.
0:40:51 So here, for example, is how you form a standard transaction in this Type-11 framework. Basically, you create a transaction that has the known and questioned recordings, and then you've got the two Type-11 records that go with that, governing how to transmit that data. You have Type-2 information about the situation of each of those recordings; then you have another Type-2 that has all the information about the legal framework and justification; and then an overall Type-1 to enact the transaction. You go through a process where you do speaker recognition, scoring, and reporting, and then deliver the report back to the submitter.
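As a rough picture of such a transaction, here is an invented data-structure sketch; the field names are made up purely to show the shape described above, and the real record layouts are defined by the ANSI/NIST-ITL standard itself.

```python
def build_speaker_comparison_transaction(known_audio, questioned_audio, legal_info):
    """Assemble the pieces described above: one Type-1 to enact the
    transaction, Type-2 metadata, and a Type-11 record per recording."""
    return {
        "type_1": {"transaction": "speaker comparison"},
        "type_2_legal": {"framework": legal_info},  # justification for the request
        "records": [
            {"type_11": {"role": "known", "audio": known_audio},
             "type_2": {"situation": "custodial interview"}},
            {"type_11": {"role": "questioned", "audio": questioned_audio},
             "type_2": {"situation": "intercepted call"}},
        ],
    }

txn = build_speaker_comparison_transaction(b"<wav bytes>", b"<wav bytes>",
                                           {"jurisdiction": "example"})
print(len(txn["records"]))  # one Type-11 record per recording
```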
0:41:48 This is just one of seventeen types of transactions currently defined in this effort; I don't have time to go over all of them.
0:42:00 How does one actually arrive at a best practice? You can go through two branches: survey the community to see what candidate best practices are out there, or, the other branch, look for gaps and develop new best practices. But in all cases these go through a validation process that requires evaluation. Then, finally, when they've been evaluated, they will be proposed and accepted as an actual best practice, and maybe a step further as a proposed standard; this is all within the ANSI/NIST-ITL framework. Sometimes you need multiple best practices, especially for human-based approaches, because there's a lot of variability among analysts and in what their different talents are; if we had one standard that said human recognition should be done by structured listening, you would exclude eighty-five to ninety-five percent of the laboratories in the United States. And whenever you do an evaluation, you need to be very careful about the design and collection of data, and about how you keep it going.
0:43:19so there is some new efforts
0:43:22that all talk about later with this sack
0:43:26let me start with this simple request to the community
0:43:30so if you have candidates for best practices, please submit them to SWG-Speaker and
0:43:37the OSAC for consideration.
0:43:42pursue the downward factors, improve robustness,
0:43:46work with the analysts; there's nothing quite as eye-opening as working
0:43:50with an analyst and understanding the challenges they're dealing with,
0:43:54and participate in forensic-style evaluations.
0:43:58that's what we would really like to see
0:44:01the most.
0:44:03so here I just have a couple of ending slides; I'm almost finished.
0:44:08and the idea here is, as I mentioned, OSAC.
0:44:11okay, so the Organization of Scientific Area Committees: this is a new effort.
0:44:16it's housed at NIST.
0:44:18SWG-Speaker here will be absorbed into OSAC as its speaker recognition subcommittee.
0:44:25I've already mentioned the ANSI/NIST-ITL Type-11 records.
0:44:32The IAFPA has a great set of
0:44:36documents and a journal, and even their code of conduct,
0:44:42that you might be very interested in.
0:44:46there are a lot of other organizations. I basically had a list of
0:44:51about a quarter of this slide, then asked some friends for help. Thank you, everybody who sent
0:44:56me things;
0:44:56now I have too many things to actually talk about all of them,
0:45:00so I'll just highlight two here.
0:45:03and in fact,
0:45:06I mentioned the NFI folks are pursuing some new data that's in the
0:45:12forensic domain. I won't steal the thunder from their paper, which is later in the
0:45:16conference.
0:45:17and there are some big efforts in
0:45:20Europe, in the
0:45:23FP7 programme as well, for
0:45:26integrated voice systems that are multimedia, multi-
0:45:32source systems.
0:45:35okay, so let me conclude.
0:45:39speaker recognition is successfully used today in a variety of applications,
0:45:44but must be applied responsibly, with caution,
0:45:47and this is referencing the paper the chair mentioned at the beginning.
0:45:54we need to work more to address the factors in the forensic domain that
0:46:00degrade performance.
0:46:03real case data, as you heard, can be extremely challenging,
0:46:07and right now, if somebody wanted to ask, okay, that first example with the triple
0:46:12homicide, what kind of error rate could I expect
0:46:16in that situation, with all of those downward factors,
0:46:20nobody can answer that, even close.
0:46:26there are many challenges ahead of us
0:46:28that need to be addressed to answer these questions.
0:46:31please contact me if you have any ideas, and I think he said it
0:46:36best:
0:46:37sauna is a very good Finnish way to a decision,
0:46:41so maybe we can talk more about this in the sauna tonight.
0:46:45thank you so much
0:46:55well, thank you, Joe, very much.
0:47:00we ran a little bit longer, but we'd like to have five or ten minutes
0:47:05for questions. So, yes,
0:47:09who wants to
0:47:10begin?
0:47:13wait, the microphone is coming.
0:47:18my question is about the recordings,
0:47:22the recordings with the mismatch, especially the first low-quality one you played.
0:47:26there is the question of the intelligibility of the speech: if even a human cannot
0:47:30understand, for example, the first one, if you can't understand what they say, how
0:47:35can the machine deal with it?
0:47:37so the intelligibility of the speech is one part of it,
0:47:41and maybe one could say, okay,
0:47:45if the problem for a given bit of speech is that no one, expert or
0:47:49not, can understand it, we can exclude it from the beginning, or something like that, right?
0:47:53has this issue been addressed before?
0:47:56the intelligibility issue is an interesting one, because it comes up in one of the
0:48:00very first courtroom fiascos, with the Michigan State Police,
0:48:05with some voice evidence,
0:48:08where the testimony from one of the police was that the
0:48:16voice on that recording
0:48:18can only be this person, to the exclusion of all others. And then the judge
0:48:23played the recording:
0:48:24he couldn't understand it.
0:48:27so then he's asking, what makes you think that?
0:48:32and quickly this was overturned,
0:48:35or ruled out.
0:48:38then, stepping forward,
0:48:40as you saw with the structured listening,
0:48:43the first step there is to transcribe the speech into words and then look for
0:48:48these
0:48:48variations.
0:48:51you're in trouble if you can't transcribe the speech in the first place. Now,
0:48:57one thing that we need to be cautious about with the automatic systems:
0:49:02as long as they can detect speech, which isn't always the case,
0:49:07they'll process the data and produce a score.
0:49:11well, you shouldn't treat it like a black box;
0:49:15that score might be meaningless.
0:49:17so I don't really know how to directly address your question other than to share those
0:49:22observations, but if you're working on that, it would be good to know.
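The caution about black-box scores above, that an automatic system will happily emit a score even when it barely found any speech, can be sketched as a simple gate in front of the comparison back end. This is a minimal illustration, not a method from the talk; the function names, threshold, and stand-in detectors are all hypothetical:

```python
def guarded_score(audio, detect_speech, score_fn, min_speech_s=3.0):
    """Refuse to score when too little speech was detected.

    `detect_speech` stands in for a speech activity detector that
    returns the seconds of detected speech, and `score_fn` for a
    speaker comparison back end; both are hypothetical placeholders.
    """
    speech_seconds = detect_speech(audio)
    if speech_seconds < min_speech_s:
        # Not enough speech: returning no score at all is better
        # than returning a meaningless one.
        return None
    return score_fn(audio)
```

With a detector reporting one second of speech the gate refuses and returns `None`; with ten seconds it passes the comparison score through unchanged.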
0:49:28okay
0:49:29thank you
0:49:31what else?
0:49:40thanks for the talk. Well, at Interspeech in Lyon in France, I attended the
0:49:45forensic tutorial,
0:49:47and he said that when you have a suspect recording, they ask the suspect
0:49:53to repeat something, like to read it,
0:50:01so that it covers the same phonetic pronunciations as in the actual voice.
0:50:03can you just—
0:50:05can I, can I clarify?
0:50:05sorry, I was kind of listening to your presentation about
0:50:10the phonetic content, where you're actually looking at the underlying phones, right? Is that
0:50:15something you follow, a similar type of thing, where you get the suspect to pronounce the same
0:50:20set of phones?
0:50:22yes, so this gets down to,
0:50:25in one area, the methods being used.
0:50:29so the very old,
0:50:32antiquated method known as spectrographic matching
0:50:37actually requires at least twenty word-like units
0:50:42being spoken
0:50:45that match what's in the evidence.
0:50:48so one way they would deal with this is to give the person something to
0:50:53read to get those twenty word-like units.
0:50:56well as you can imagine read speech is disastrous if you're trying to study things
0:51:02like dialectal variation
0:51:05so
0:51:06what's good for the old
0:51:08spectrographic matching process is a disaster for modern
0:51:13methods like structured listening, which, I should add, are inspired by a lot of the
0:51:17methods used in Europe, in Germany by the BKA.
0:51:23so the
0:51:24recordings they could be talking about are done in the old-
0:51:28style manner.
0:51:30just as a subsequent question, then: if we're able to get some kind of speech recognition
0:51:36into the speaker ID systems,
0:51:38where there is some kind of phonetic alignment, is that not beneficial to the
0:51:45forensic community?
0:51:47well, in fact, some speaker recognition approaches
0:51:51have a layer where they're actually doing speech recognition and phone recognition,
0:51:58and a lot of that work was inspired by George Doddington, actually,
0:52:03and idiolect.
0:52:05and sure, whether it's in the recognition system itself or a by-product of the
0:52:11structured listening approach, speech recognition becomes a very important process. Whether it's automatic is a
0:52:19different question.
0:52:22but if there's a lot of data to analyze, it'll overwhelm the analysts if they have
0:52:27to manually do, say, phonetic transcription, which was the approach being used for
0:52:33quite a while.
0:52:35that is, the system I showed on that one slide helps to automate that,
0:52:40to speed the efficiency, in fact.
0:52:49next question.
0:52:52so in your lecture you mentioned DNA as the sort of benchmark, and
0:52:58of course that's scary for us, too; we're never going to be as accurate as
0:53:02they are. I think that's a problem in speaker recognition.
0:53:05but we do have valuable evidence to introduce; it's softer, it's weaker evidence.
0:53:12do you think the American legal system can understand the concept of weaker evidence and how
0:53:18valuable it can be? And do you think a likelihood ratio
0:53:22can be understood by a jury?
0:53:26okay, so there are multiple questions in that one.
0:53:30the first one:
0:53:31it is what the National Academy of Sciences was calling for, with a framework
0:53:37like DNA's.
0:53:39they weren't demanding, although it would be nice, that the performance be on par
0:53:44with DNA,
0:53:45but the scientific background behind DNA, and the very large
0:53:52studies that have been done there on all the evidence, are very nice,
0:53:57except, by the way, when you're dealing with DNA mixtures; but for the time being,
0:54:02just assume single-source DNA samples, because there's a whole other set of issues in
0:54:08dealing with some of those challenges. So
0:54:11DNA is not perfect, but it's extremely good.
0:54:15the next question, about whether jurors will be able to properly understand likelihood ratios:
0:54:23so Bill and colleagues are conducting a survey with mock jurors
0:54:30to actually see, when they're presented with
0:54:34evidence in different forms, whether it's likelihood ratios or a verbal description of what a
0:54:41log-likelihood ratio might mean, how that's interpreted by jurors. I don't know if
0:54:49he's published that paper, but it should be happening soon.
0:54:53and one thing that happened with Dorothy, who is also involved in
0:54:57this study, is that she came up with a very scary
0:55:02statistic: it was something like a quarter of jurors in the US
0:55:08don't understand fractions.
0:55:11what are we going to do,
0:55:13move to Europe? Well, I don't know what the ratio is
0:55:18in Europe, but wow, that's scary. So
0:55:23it's important that the general public be educated.
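Presenting a likelihood ratio alongside a verbal description, as in the mock-juror study just mentioned, is often done with a banded verbal scale. Here is a minimal sketch; the band edges and wording are illustrative only, loosely in the style of published verbal scales, not taken from the talk:

```python
def lr_to_verbal(lr: float) -> str:
    """Map a likelihood ratio to an illustrative verbal statement.

    Band edges and wording are hypothetical, for demonstration only.
    """
    if lr == 1:
        return "no support for either hypothesis"
    if lr < 1:
        # An LR below 1 supports the defence hypothesis: evaluate the
        # reciprocal and swap the hypothesis named in the label.
        return lr_to_verbal(1.0 / lr).replace("prosecution", "defence")
    bands = [
        (10_000, "very strong support for the prosecution hypothesis"),
        (1_000, "strong support for the prosecution hypothesis"),
        (100, "moderately strong support for the prosecution hypothesis"),
        (10, "moderate support for the prosecution hypothesis"),
        (1, "weak support for the prosecution hypothesis"),
    ]
    for edge, label in bands:
        if lr >= edge:
            return label
    return bands[-1][1]
```

The appeal of such a scale is exactly the point under debate here: a juror who cannot work with the number 100 may still work with "moderately strong support", at the cost of hiding how the number was estimated.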
0:55:29I don't know, but if I could comment on this last question: I'm not sure
0:55:35it's useful to ask the question. In fact, I have the answer: most
0:55:41people will not understand the likelihood ratio, and we know that, because even we are
0:55:46barely able to understand the likelihood ratio ourselves, and how
0:55:49it behaves.
0:55:51the reason is that
0:55:53the legal system in all the countries still expects the expert witness
0:55:58to come and explain.
0:56:01you know, we explain to people what it means, but we still give the results.
0:56:06so the issue is not the jury; it's not
0:56:10the layman.
0:56:11so why do we define a report?
0:56:15the calibration issue is used only,
0:56:17according to me, to give the court the opportunity
0:56:22to
0:56:24be able to estimate the quality of what we did, of the science, in
0:56:31the report.
0:56:32the likelihood ratio is defined so that if one expert
0:56:38in a report is using a global, well-defined likelihood
0:56:43ratio, another expert could
0:56:48review it and then argue for or against the method.
0:56:53then we are in a scientific language, not in the court language. Beyond
0:56:58that, the expert, in front of the people,
0:57:00gives his own opinion, taking his own risk,
0:57:05and this is not
0:57:06like calibration at all
0:57:09sorry, I don't want to cut that short, but I would like to allocate time
0:57:13to discuss this question later, maybe. Okay,
0:57:17last question.
0:57:19so, anyone?
0:57:20yes, you,
0:57:24George.
0:57:26well, likelihood ratios are a wonderful thing.
0:57:32the primary issue with the likelihood ratio is that it
0:57:38happens to be the output of a system that's estimating
0:57:42the likelihood ratio.
0:57:44if you actually know the likelihood ratio,
0:57:47it's perfectly wonderful to use.
0:57:50but the likelihood ratio that's output is supposed to be the true one, and most often
0:57:55it's not.
0:57:58maybe what you were just getting at is that we need to keep in mind
0:58:03we're always estimating likelihood ratios, and it's just another
0:58:09area where there's a cost of mismatch.
0:58:12you know, our systems are producing these estimates
0:58:15and
0:58:16using data that probably doesn't
0:58:18look anything like that first real case I showed.
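George's point, that a system outputs an estimate of the likelihood ratio rather than the true value, can be illustrated with a toy score model. Both pairs of Gaussian score distributions below are hypothetical; the point is only that the same raw score maps to very different LR estimates depending on which (possibly mismatched) data the score model was fitted on:

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def estimated_lr(score, same_params, diff_params):
    """LR estimate: p(score | same speaker) / p(score | different speakers),
    with each conditional modelled as a Gaussian fitted on training scores."""
    return gaussian_pdf(score, *same_params) / gaussian_pdf(score, *diff_params)

# Hypothetical (mean, std) pairs fitted on clean training scores ...
clean_same, clean_diff = (2.0, 1.0), (-2.0, 1.0)
# ... and on degraded, forensic-like scores where the classes overlap more.
noisy_same, noisy_diff = (0.5, 1.5), (-0.5, 1.5)

score = 1.0
lr_clean = estimated_lr(score, clean_same, clean_diff)   # about 55
lr_noisy = estimated_lr(score, noisy_same, noisy_diff)   # about 1.6
```

Calibration is the attempt to make such estimates trustworthy, and under the mismatch discussed throughout the talk, the calibration itself is what breaks.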
0:58:23so, with that,
0:58:25I don't—
0:58:27I have to close the session, unfortunately, and I want to thank you
0:58:32for your lecture. Okay.