0:00:15 | well thank you for that kind introduction j are |
---|
0:00:19 | you're right about luggage being an issue with me and |
---|
0:00:23 | but my clothes when i don't have my luggage are an even bigger issue |
---|
0:00:28 | for me |
---|
0:00:29 | so i |
---|
0:00:32 | appreciate the introduction and i thank the organising committee for inviting me |
---|
0:00:37 | and especially for naming this town i don't know joensuu i've never |
---|
0:00:45 | had this happen before when i went to give a presentation so please |
---|
0:00:52 | so let me start by asking for a show of hands |
---|
0:00:56 | who among us has participated in a forensic style evaluation of speaker recognition technology |
---|
0:01:06 | that's good |
---|
0:01:08 | that's good i'm gonna try to get more hands up with interest by the end of |
---|
0:01:12 | my presentation |
---|
0:01:16 | who has processed real forensic case data |
---|
0:01:23 | well that's pretty good okay |
---|
0:01:26 | so i'll be preaching to the choir for some of you |
---|
0:01:31 | and finally who has actually testified in court |
---|
0:01:37 | that's good |
---|
0:01:38 | very good okay |
---|
0:01:41 | so let me |
---|
0:01:44 | talk about some of the interesting challenges in forensic and investigatory speaker recognition |
---|
0:01:55 | the basic introductory material for my talk is |
---|
0:02:02 | you know basically to define the problem so in forensic and investigative speaker comparison |
---|
0:02:09 | the speech utterances are compared |
---|
0:02:13 | and the process can either be by humans or machines |
---|
0:02:17 | and |
---|
0:02:19 | in the forensic case typically this is for use in a court of law |
---|
0:02:24 | this is very high stakes |
---|
0:02:27 | it demands the best that science has to offer and those of you who pay |
---|
0:02:32 | attention to trials on television probably are pretty nauseated by what you see out |
---|
0:02:40 | there and what is happening in the world |
---|
0:02:43 | in terms of these expert witnesses that i'll be talking about later and the methods |
---|
0:02:49 | they use |
---|
0:02:52 | the methods vary quite widely and there is a very nice survey paper |
---|
0:03:00 | by gold and french that describes some of the variations in these processes |
---|
0:03:07 | and that's not necessarily for the good |
---|
0:03:12 | and |
---|
0:03:13 | it's important that these methods that are used be grounded in scientific principles and be |
---|
0:03:19 | applied properly |
---|
0:03:23 | and just as important |
---|
0:03:24 | is to decide when you should not accept a case |
---|
0:03:29 | when it would be irresponsible |
---|
0:03:33 | so this idea of when to punt or not apply the methods is also very |
---|
0:03:40 | work |
---|
0:03:43 | so we're gonna provide some analysis of the methods and play some exciting |
---|
0:03:49 | examples that i hope will get you excited about |
---|
0:03:52 | how challenging this kind of domain really can be |
---|
0:03:57 | and |
---|
0:03:58 | one of the things i wanted to do here and in the broader sense at other |
---|
0:04:04 | conferences with wide diversity |
---|
0:04:08 | is to improve communications among |
---|
0:04:11 | the research community this great group here and legal scholars |
---|
0:04:18 | you know we have for example in speech people like bill thompson |
---|
0:04:22 | who |
---|
0:04:23 | wrote the prosecutor's fallacy and was involved in the o j simpson trial so we've |
---|
0:04:30 | got a number of very high profile legal scholars in the us |
---|
0:04:35 | involved and also international and of course the legal systems are different throughout the world |
---|
0:04:43 | so you have to address these contexts these questions within context |
---|
0:04:50 | and then finally |
---|
0:04:52 | i'm going to ask this community for help and present some other things that you |
---|
0:04:58 | could actually get involved in and help us make progress |
---|
0:05:07 | so i'll start by giving some background |
---|
0:05:12 | cover some example approaches talk about some of the activities |
---|
0:05:18 | that are currently going on request some |
---|
0:05:21 | things for the community to get involved in some future ideas and conclude |
---|
0:05:29 | okay so with forensics and investigation basically they differ by |
---|
0:05:36 | primarily by whether the |
---|
0:05:39 | methods |
---|
0:05:41 | will be presented in a court of law |
---|
0:05:43 | a lot of people for investigation will try to use a similar process that has |
---|
0:05:50 | the rigour necessary should it be important to later present it in a court of |
---|
0:05:57 | law |
---|
0:05:58 | but the basic forensic community and investigative community |
---|
0:06:03 | work on similar problems in terms of trying to establish facts |
---|
0:06:09 | and the actual presentation form is where they differ now here i have a cartoon |
---|
0:06:17 | that shows basically the most canonical example of a speaker comparison we have a known |
---|
0:06:24 | speech sample and a questioned speech sample |
---|
0:06:28 | and you compare them |
---|
0:06:30 | and there's some summary or analysis |
---|
0:06:32 | the |
---|
0:06:34 | forensic examiner or analyst might write a report |
---|
0:06:39 | and |
---|
0:06:40 | we're done that's |
---|
0:06:43 | that's the simple view of the world |
---|
0:06:48 | then i was happy when i asked a number of friends for suggestions |
---|
0:06:54 | michael jessen from the bka kindly provided this table from his summer school |
---|
0:07:00 | that shows a little more granularity in terms of forensic versus investigative |
---|
0:07:08 | including a large scale |
---|
0:07:13 | investigation where you might actually be running |
---|
0:07:17 | automatic systems that are similar to iafis the fbi's integrated automated |
---|
0:07:23 | fingerprint identification system |
---|
0:07:25 | which conducts large-scale searches through databases |
---|
0:07:30 | and you can see here that they vary in terms of whether they will be |
---|
0:07:34 | presented in court |
---|
0:07:38 | what kind of |
---|
0:07:42 | methods are used |
---|
0:07:43 | a number of comparisons |
---|
0:07:45 | and the type and |
---|
0:07:48 | style of working on the data |
---|
0:07:51 | so let me now give just a couple examples |
---|
0:07:55 | of some forensic situations |
---|
0:08:01 | first you might remember the olympics in nineteen ninety six with the centennial park bomb |
---|
0:08:10 | there was a thirteen second phone call |
---|
0:08:15 | that said there is a bomb |
---|
0:08:19 | in centennial park |
---|
0:08:21 | you have thirty minutes |
---|
0:08:23 | that's it |
---|
0:08:24 | so now you've got this thirteen second call the people are frantically trying |
---|
0:08:31 | to figure out the address of where centennial park is at nine one one |
---|
0:08:37 | so that they can dispatch officers to the scene |
---|
0:08:42 | basically a lot of time passes they have a short time to clear the park |
---|
0:08:47 | by the time the officers get there two people are murdered a hundred and twenty |
---|
0:08:52 | people are injured |
---|
0:08:54 | and now they have a suspect in custody who matches the description of someone that |
---|
0:09:00 | was seen at a |
---|
0:09:01 | payphone |
---|
0:09:03 | and that person's name is richard jewell |
---|
0:09:06 | and |
---|
0:09:09 | they have quite a bit of pressure in trying to establish if this person is |
---|
0:09:15 | the one on the call |
---|
0:09:19 | turns out the actual person who made the call |
---|
0:09:24 | escaped the scene and was not caught for seven years |
---|
0:09:29 | another very high profile and recent case trayvon martin |
---|
0:09:36 | this was |
---|
0:09:39 | had all sorts of the wrong things happening all at once |
---|
0:09:44 | extreme mismatches of every type imaginable |
---|
0:09:48 | these outrageous claims of justified shootings and then |
---|
0:09:53 | just to make it more interesting the orlando sentinel newspaper decides to go hire some |
---|
0:10:01 | voice experts |
---|
0:10:03 | and |
---|
0:10:06 | i don't know if they quite appreciated the conditions under which they were working |
---|
0:10:13 | first of all it's hardly considered speaker recognition when the person is crying out |
---|
0:10:19 | for help right |
---|
0:10:21 | and i'll show you later some of the issues involved in that so this was |
---|
0:10:27 | a very turbulent time in the us |
---|
0:10:31 | and |
---|
0:10:32 | a lot of controversy regarding the kind of data that was involved in this case |
---|
0:10:38 | in how |
---|
0:10:40 | how inappropriate the whole situation is we have people by the way like george |
---|
0:10:45 | doddington who's here today for keeping the system on the rails he was one of |
---|
0:10:53 | the expert witnesses |
---|
0:10:57 | so how hard is forensic speaker recognition |
---|
0:11:01 | well |
---|
0:11:02 | a first step in that direction that actually is not truly forensic speaker recognition |
---|
0:11:10 | was this nist hasr evaluation and actually i misspoke before the |
---|
0:11:15 | nist hasr there was actually an evaluation by nfi-tno you know that |
---|
0:11:22 | actually used real |
---|
0:11:23 | forensic case data |
---|
0:11:25 | i'll talk about that in a moment |
---|
0:11:27 | but in the hasr evaluation |
---|
0:11:30 | you know unlike conventional nist evaluations |
---|
0:11:36 | you know where you have so many trials they're not really practical for |
---|
0:11:41 | humans to process the data |
---|
0:11:44 | here there was a paring down to make the number of trials manageable by humans |
---|
0:11:51 | and the process for doing that was a two-stage selection process where |
---|
0:11:58 | you would use an automatic system to find the most confusable pairs |
---|
0:12:03 | and then follow that by using humans to then |
---|
0:12:09 | find the most confusable pairs of the confusable automatic pairs so you have very |
---|
0:12:16 | difficult data to work with and the benefit of that was now you can have |
---|
0:12:21 | a you know evaluation |
---|
0:12:24 | with a mere fifteen trials that's manageable by humans |
---|
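The two-stage selection just described can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure nist actually ran: the `Trial` fields, the difficulty measure, and the toy numbers are all hypothetical.

```python
from typing import NamedTuple


class Trial(NamedTuple):
    trial_id: str
    same_speaker: bool        # ground-truth label (hypothetical field)
    auto_score: float         # automatic system score, higher means "more likely same"
    human_error_rate: float   # fraction of pilot listeners who answered wrongly


def auto_difficulty(t: Trial) -> float:
    # A same-speaker trial is hard when its score is low;
    # a different-speaker trial is hard when its score is high.
    return -t.auto_score if t.same_speaker else t.auto_score


def two_stage_select(trials, n_stage1, n_final):
    # Stage 1: keep the pairs the automatic system finds most confusable.
    stage1 = sorted(trials, key=auto_difficulty, reverse=True)[:n_stage1]
    # Stage 2: of those, keep the pairs that human listeners confuse most.
    return sorted(stage1, key=lambda t: t.human_error_rate, reverse=True)[:n_final]


trials = [
    Trial("a", True, 2.0, 0.1),
    Trial("b", False, 1.5, 0.8),
    Trial("c", True, -1.0, 0.6),
    Trial("d", False, -3.0, 0.0),
    Trial("e", False, 0.5, 0.9),
]
selected = two_stage_select(trials, n_stage1=3, n_final=2)
print([t.trial_id for t in selected])
```

The point of the sketch is only that the final handful of trials is hard for both machines and humans by construction, which is why results on such a set say little about average-case error rates.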
0:12:29 | and this was the beginning for the nist |
---|
0:12:33 | style of evaluations that are in this direction |
---|
0:12:37 | so i don't know if you've heard these but let me just play one |
---|
0:12:40 | here |
---|
0:12:43 | so here is trial eleven |
---|
0:12:46 | i'll play the two samples and the question |
---|
0:12:51 | that is asked is are these from the same source |
---|
0:12:55 | here's the first one |
---|
0:13:07 | here's the second |
---|
0:13:18 | so |
---|
0:13:19 | it's pretty impressive to me that |
---|
0:13:23 | that's supposed to be as you can see by the truth label here |
---|
0:13:27 | two people |
---|
0:13:29 | well like i said in brno i would love to actually meet these |
---|
0:13:34 | two people and see that they are two separate people have dinner with them |
---|
0:13:40 | you know maybe it would be a high price for the meal but |
---|
0:13:45 | those two people confused the humans and the automatic systems consistently in the first hasr |
---|
0:13:53 | evaluation |
---|
0:13:54 | and it inspired a lot of people to |
---|
0:13:57 | look into this interesting problem |
---|
0:14:01 | and unlike the traditional nist sre protocol hasr of course allows human |
---|
0:14:07 | listening |
---|
0:14:08 | so this is exciting |
---|
0:14:10 | at the time all the data was in english so that might somewhat limit |
---|
0:14:16 | some of the human approaches |
---|
0:14:19 | but it sure gave a nice flavour of the challenge and this is difficult |
---|
0:14:25 | but you know what it's not nearly as difficult as the real thing and i'll |
---|
0:14:30 | play that in a moment |
---|
0:14:34 | so some |
---|
0:14:37 | challenges in speaker recognition for humans and machines i have a few slides |
---|
0:14:43 | the nist evals have made progress in things like channel mismatch |
---|
0:14:49 | distance to the microphone by progress i mean progress in evaluating these |
---|
0:14:54 | effects |
---|
0:14:56 | also in terms of duration and cross language although not shown here |
---|
0:15:03 | so that this is good but there's a lot more going on in a lot |
---|
0:15:08 | of forensic case data |
---|
0:15:11 | so typically in these scenarios |
---|
0:15:16 | the talkers are unfamiliar to the examiner |
---|
0:15:20 | the talkers tend to be familiar with each other |
---|
0:15:24 | and that affects their conversation-style there can be multiple talkers there's all sorts of different |
---|
0:15:32 | styles conversational read aloud crying speech for example if you wanna call it speech |
---|
0:15:41 | and then accommodation when you have familiar talkers adapting to each other |
---|
0:15:47 | if there's a conversation that's part of the evidence which is often the case they |
---|
0:15:54 | might be deceptive |
---|
0:15:57 | and i have examples of this and sometimes you're dealing with people who are mentally |
---|
0:16:01 | ill or medicated and there can be all these situational mismatches to deal with |
---|
0:16:10 | this goes on and on |
---|
0:16:12 | but you know what it's actually a combination of these things so if you |
---|
0:16:18 | have an evaluation where you evaluated a few of these factors the problem is these |
---|
0:16:24 | are combined in horrible ways to make it even more challenging |
---|
0:16:29 | when you're trying to determine what is the performance |
---|
0:16:33 | of a system or a human or human with the system |
---|
0:16:37 | so you can have mismatch galore between the samples that are being compared and |
---|
0:16:43 | also all the information used to train our automatic systems the background hyper parameters it |
---|
0:16:51 | goes on and on |
---|
0:16:54 | then you have additional challenges in terms of how should this information be presented |
---|
0:17:01 | in terms of scoring or decisions you know we will be pretty strong advocates in |
---|
0:17:08 | general about say for example reporting log-likelihood ratios or something like that |
---|
0:17:14 | but |
---|
0:17:15 | a lot of the forensic people i work with |
---|
0:17:19 | the investigators don't wanna hear a log-likelihood ratio they want to know whether they should |
---|
0:17:23 | go take action |
---|
0:17:25 | this gets very bad in a number of ways mathematically ugly because of asserting prior |
---|
0:17:32 | probabilities to make decisions this is a very hard and |
---|
0:17:38 | a tenuous situation |
---|
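To see why turning a log-likelihood ratio into an action forces someone to assert a prior, here is a minimal sketch of the standard bayes' rule arithmetic. It assumes a natural-log llr, and the prior values are purely illustrative; nothing here is a method from the talk.

```python
import math


def posterior_probability(llr: float, prior: float) -> float:
    """Posterior probability of the same-speaker hypothesis.

    llr   -- natural-log likelihood ratio reported by a system or examiner
    prior -- prior probability of the same-speaker hypothesis (must be asserted
             by someone; it is not something the comparison itself provides)
    """
    prior_log_odds = math.log(prior / (1.0 - prior))
    posterior_log_odds = llr + prior_log_odds   # bayes' rule in log-odds form
    return 1.0 / (1.0 + math.exp(-posterior_log_odds))


llr = math.log(99.0)  # the evidence is 99 times more likely under "same speaker"
for prior in (0.5, 0.1, 0.01):
    print(prior, round(posterior_probability(llr, prior), 3))
```

The same evidence gives very different posteriors depending on the prior, which is the tension described here: the investigator wants a yes or no, but the yes or no depends on a number the system cannot supply.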
0:17:41 | and an area where this community has made some progress and i'm hoping through odyssey |
---|
0:17:46 | we'll actually see more in this direction |
---|
0:17:51 | then you have this whole issue of calibration with system scores moving around and drifting |
---|
0:17:59 | if you will and this causes chaos among the analysts |
---|
0:18:03 | so one of the biggest challenges in a lot of this is building |
---|
0:18:08 | trust and confidence in the analysts or examiners if your system starts misbehaving they |
---|
0:18:17 | might stop using it or do something kind of crazy |
---|
0:18:22 | so there is a lot of issues with establishing trust and having the system |
---|
0:18:27 | be reliable and stable and calibrated |
---|
0:18:30 | then you have the issue of the court's question so we talked about sort of |
---|
0:18:35 | this canonical example where i've got two speech samples |
---|
0:18:39 | is the source the same |
---|
0:18:41 | well that's not necessarily the question that the court has |
---|
0:18:46 | it's not make it but the other guys just been murdered |
---|
0:18:50 | and we don't have any recordings of his voice |
---|
0:18:53 | so now what do you do there is a whole bunch of |
---|
0:18:59 | challenges with trying to figure out |
---|
0:19:03 | you know how do you deal with the |
---|
0:19:06 | you know unknown in advance questions from the courts right now one of the things |
---|
0:19:11 | that i've been pursuing with some probably is to see what are those questions somewhat |
---|
0:19:17 | negotiable |
---|
0:19:18 | and can we get a pretty good menu of what the history of these kinds |
---|
0:19:24 | of questions are to help us as developers build systems and acquire data to help |
---|
0:19:30 | address the kinds of questions that are likely to come up |
---|
0:19:37 | then you have this issue with the automatic systems where you know people might think |
---|
0:19:41 | that they're fully automatic |
---|
0:19:43 | but often what happens is there are models that have been built |
---|
0:19:48 | where a human has segmented speech |
---|
0:19:52 | and |
---|
0:19:53 | decided what speech utterances are assembled to create models |
---|
0:20:00 | so you've got this kind of chicken or the egg problem right so |
---|
0:20:05 | i'm trying to recognize speakers but yet when i'm training my models i need to |
---|
0:20:10 | do some segmentation |
---|
0:20:13 | so there's that factor to keep in mind also |
---|
0:20:18 | then there is a whole bunch of other things going on here |
---|
0:20:22 | i've already talked about when to punt |
---|
0:20:25 | in terms of not accepting a case |
---|
0:20:29 | when to punt is the expression from american football i'm not sure that translates internationally |
---|
0:20:38 | and then there's some other issues about noise and degradation that are important to keep |
---|
0:20:44 | in mind |
---|
0:20:46 | and we'll talk more about those in a moment |
---|
0:20:50 | so |
---|
0:20:52 | now let's actually hear some real case data |
---|
0:20:57 | this is pretty fascinating i think |
---|
0:21:02 | i'm going to show some examples play some examples the first one |
---|
0:21:08 | i'll set it up for you a triple homicide has just been committed |
---|
0:21:15 | the suspect runs from the scene |
---|
0:21:18 | with one of the victims' cell phones and their bluetooth |
---|
0:21:22 | and he's calling his friend |
---|
0:21:25 | to come and pick him up |
---|
0:21:28 | he's running as fast as he can |
---|
0:21:31 | the wind is blowing |
---|
0:21:34 | and it's a very difficult situation so let me play this |
---|
0:22:02 | so that has |
---|
0:22:04 | a lot of characteristics that you probably aren't used to working with in say the |
---|
0:22:09 | nist evaluations |
---|
0:22:14 | and this is really challenging stuff and it gets better because |
---|
0:22:20 | now we have the suspect in our custody in his jail cell |
---|
0:22:27 | and he's kind of pretending to be like justin bieber |
---|
0:22:33 | so listen to this |
---|
0:22:58 | so that's pretty mismatched wouldn't you say |
---|
0:23:03 | i don't know what you would do with data like that |
---|
0:23:09 | so that's just one example of just incredible mismatch and not only |
---|
0:23:16 | between the samples themselves well maybe the last one isn't |
---|
0:23:21 | terribly unlike a lot of the training data that our systems are built with but |
---|
0:23:26 | i'd be surprised if our systems have been trained and have their hyper |
---|
0:23:31 | parameters and background models knowledgeable of data like that first sample |
---|
0:23:39 | so this is |
---|
0:23:42 | extreme mismatch not only between the samples but against our systems |
---|
0:23:48 | let me play another example |
---|
0:23:50 | of a very complex situation |
---|
0:23:54 | where you have some pretty stressed overlapping talkers |
---|
0:24:15 | how many talkers are there in that situation |
---|
0:24:20 | sounded about like three to me but you know i i'm not sure |
---|
0:24:25 | or |
---|
0:24:27 | you know and a part i didn't play is the beginning where you've got |
---|
0:24:31 | the operator answering nine one one and then you hear the person whispering and |
---|
0:24:37 | then putting the phone into their pocket |
---|
0:24:40 | where they found it later unfortunately |
---|
0:24:45 | who is then the victim |
---|
0:24:47 | so |
---|
0:24:48 | this is the type of situation so this gets into the questions like what |
---|
0:24:53 | question am i trying to answer how many people were present and |
---|
0:24:59 | who said what |
---|
0:25:01 | the area of disputed utterances as it is known in the forensic community |
---|
0:25:07 | so these guys of course are you know rounded up and they're all claiming no it's |
---|
0:25:13 | the other guy that shot him i was just visiting my friends or so |
---|
0:25:19 | so there's challenges like that to deal with |
---|
0:25:24 | another example |
---|
0:25:26 | is |
---|
0:25:27 | is a very interesting threat call |
---|
0:25:31 | and this one has some timeliness about it as well |
---|
0:25:36 | so listen to this first recording |
---|
0:25:50 | so the audio system in here is pretty good i don't know if you could |
---|
0:25:55 | make that out but the guy's basically giving the address that's going to be |
---|
0:26:00 | attacked by gunmen tomorrow |
---|
0:26:03 | wow better decide what you're gonna do |
---|
0:26:07 | so they decide to bring in a suspect |
---|
0:26:10 | and here's his interview |
---|
0:26:26 | so there's a number of things going on in that first call it seems like |
---|
0:26:33 | the person was like in the movies holding a handkerchief over the phone |
---|
0:26:39 | sounded like they had marbles in their mouth |
---|
0:26:41 | the second one i don't know if they're medicated or what's going on there |
---|
0:26:46 | but there is a lot of mismatch going on in that situation and you know |
---|
0:26:52 | for investigative purposes even though you're not in a court of law |
---|
0:26:58 | it still has high stakes when you go decide to take somebody into custody |
---|
0:27:04 | i mean that's a dramatic experience right so you still need to be cautious how |
---|
0:27:11 | to proceed with that |
---|
0:27:14 | but it's very difficult to make a quick decision in situations like this |
---|
0:27:20 | and you know |
---|
0:27:22 | this is just a small part of it |
---|
0:27:25 | as reva schwartz at the us secret service says it's always something every case |
---|
0:27:31 | there is a case where |
---|
0:27:34 | somebody had a sex change operation between the |
---|
0:27:39 | first sample and the second sample that were being compared |
---|
0:27:45 | you know so a lot of our systems that are gender dependent like what |
---|
0:27:51 | do you do you know there is just |
---|
0:27:55 | so many challenging situations |
---|
0:27:59 | that come up when you're dealing with real forensic case data and i should |
---|
0:28:04 | add |
---|
0:28:06 | that when samples get elevated to the level of a national resource like reva schwartz |
---|
0:28:13 | those are the hardest of the forensic cases the easier ones can be handled at |
---|
0:28:19 | a lower level |
---|
0:28:21 | so these are very challenging situations |
---|
0:28:26 | and one might ask how do i figure out if |
---|
0:28:31 | i should process this data |
---|
0:28:34 | if it can be admitted in court |
---|
0:28:38 | if i'm in the united states |
---|
0:28:40 | i have this |
---|
0:28:42 | admissibility standard with daubert |
---|
0:28:48 | so for example |
---|
0:28:52 | in us federal court and in about half of the us state courts |
---|
0:28:59 | the judge will consider the admissibility of scientific evidence |
---|
0:29:04 | but judges are often the first to admit that generally they're not scientists |
---|
0:29:09 | so they had this sort of gatekeeper role pushed onto them |
---|
0:29:16 | and the idea is |
---|
0:29:18 | under federal rules of evidence number seven oh two testimony by expert witnesses |
---|
0:29:25 | the purpose is to assist the trier of fact the judge or the jurors |
---|
0:29:30 | if the evidence is going to be very confusing |
---|
0:29:34 | then it's not |
---|
0:29:37 | admitted |
---|
0:29:40 | so this is kind of loose |
---|
0:29:42 | here the courts in the us have tried to |
---|
0:29:49 | structure this |
---|
0:29:51 | and a |
---|
0:29:52 | form this so called daubert test |
---|
0:29:55 | this is daubert versus merrell dow pharmaceuticals |
---|
0:30:00 | and basically four or five depending on how you read it different factors |
---|
0:30:08 | are introduced in this daubert test |
---|
0:30:12 | so has the method been or can it be tested |
---|
0:30:17 | well |
---|
0:30:18 | one of the nice things about our community is that we do test a lot |
---|
0:30:22 | not sure that we test on this kind of data |
---|
0:30:27 | another is you know has it been subjected to peer review and publication |
---|
0:30:33 | well our community's very good at publishing papers and |
---|
0:30:38 | this odyssey is just one of those excellent forums |
---|
0:30:44 | now we're in trouble |
---|
0:30:47 | does it have a known error rate |
---|
0:30:51 | wow well if you tell me what error rate you want i can find a |
---|
0:30:56 | corpus that will probably give you that error rate that's not the answer they wanna |
---|
0:31:01 | hear right they want something pretty solid much more certain like |
---|
0:31:07 | for example dna |
---|
0:31:10 | which by the way also has variability |
---|
0:31:14 | but that's a whole nother story but at least it's relatively small compared to what |
---|
0:31:19 | we experience |
---|
0:31:20 | in the voice world |
---|
0:31:22 | are there existing standards controlling its use |
---|
0:31:26 | and maintenance |
---|
0:31:28 | well currently there's very little in that area but in the us i'll be talking |
---|
0:31:33 | in a moment about some activities in that direction |
---|
0:31:37 | and |
---|
0:31:38 | learning about what's happening internationally which is one reason i'm pleased to be here at this workshop |
---|
0:31:45 | and then of |
---|
0:31:47 | the first one is sort of this friendly thing like you know is it generally |
---|
0:31:52 | accepted by the scientific community |
---|
0:31:55 | then you get into all these problems like what's a community what's the scientific community |
---|
0:32:01 | and |
---|
0:32:02 | this daubert part |
---|
0:32:05 | is also known as the frye test which predated the daubert |
---|
0:32:10 | test |
---|
0:32:12 | so looking at |
---|
0:32:13 | the basic anatomy of the speaker comparison system |
---|
0:32:17 | you can form |
---|
0:32:18 | two parallel branches |
---|
0:32:20 | the start with the feature extraction and creating models and then go through a comparison |
---|
0:32:26 | of the |
---|
0:32:29 | hypothesis that the samples match versus they don't |
---|
0:32:35 | and then produce a calibrated match score output |
---|
0:32:41 | now that's |
---|
0:32:43 | fine however |
---|
0:32:46 | there's all these knowledge sources that are under the hood |
---|
0:32:50 | and all these areas that are ripe |
---|
0:32:52 | for mismatch |
---|
0:32:54 | so for example let's just take an i-vector system |
---|
0:32:59 | so we have this signal processing chain |
---|
0:33:03 | and |
---|
0:33:05 | different stages here are shown where we need all these different kinds of background information |
---|
0:33:12 | whether it's |
---|
0:33:13 | hyper parameter tuning |
---|
0:33:16 | you know the universal background models |
---|
0:33:20 | i |
---|
0:33:20 | total variability matrix for the |
---|
0:33:24 | covariance matrix that's needed |
---|
0:33:26 | to make these systems successful |
---|
0:33:30 | but there's more |
---|
0:33:33 | what about calibration |
---|
0:33:35 | i need to train that system as well |
---|
0:33:39 | and a system that's not calibrated will drive an analyst absolutely crazy |
---|
0:33:45 | and you lose their confidence and they'll stop using your system |
---|
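One common way to put a number on how well or badly calibrated a set of scores is uses the log-likelihood-ratio cost, usually written cllr. The sketch below is an illustrative addition and not something shown in the talk; it assumes the scores are natural-log likelihood ratios and that ground-truth target and non-target trials are available.

```python
import math


def cllr(target_llrs, nontarget_llrs):
    """Log-likelihood-ratio cost in bits.

    0 is perfect; 1 is what a system that always outputs llr = 0 would get;
    values above 1 mean the scores are misleading as likelihood ratios.
    """
    def log2_1p_exp(x):
        # numerically stable log2(1 + exp(x))
        return (max(x, 0.0) + math.log1p(math.exp(-abs(x)))) / math.log(2.0)

    cost_tar = sum(log2_1p_exp(-s) for s in target_llrs) / len(target_llrs)
    cost_non = sum(log2_1p_exp(s) for s in nontarget_llrs) / len(nontarget_llrs)
    return 0.5 * (cost_tar + cost_non)


# Well separated and well calibrated scores
good = cllr([10.0, 8.0], [-10.0, -9.0])
# The same kind of separation, but every score shifted down by 12:
# discrimination is unchanged, yet the numbers no longer mean what they claim
drifted = cllr([-2.0, -4.0], [-22.0, -21.0])
print(good, drifted)
```

A drift like the one in the second case is the kind of behaviour that makes an analyst lose trust: the system still separates the speakers, but its reported scores point the wrong way for the target trials.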
0:33:51 | so this is a very important stage it's great that niko has a paper here |
---|
0:33:56 | on |
---|
0:33:56 | calibration and ways to address this again |
---|
0:34:01 | one of niko's favourite topics and mine too |
---|
0:34:05 | so basically you want to try to minimize all these nuisances some of |
---|
0:34:10 | which |
---|
0:34:12 | if you're processing single pairs of samples at a time you can get a good |
---|
0:34:17 | handle on other nuisances are partly due on single pair comparisons |
---|
0:34:22 | those have to deal with logical consistency with say two samples matching |
---|
0:34:29 | and then another pair of samples matching but the other pair of samples not matching and |
---|
0:34:34 | when i say matching i don't mean that in the binary sense i mean scoring |
---|
0:34:39 | high |
---|
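The logical consistency issue can be made concrete with a small check over triples of samples. This is a sketch under simplifying assumptions: the `scores` layout and the `threshold` are hypothetical, and a hard threshold is only a stand-in for "scoring high", since matching is not meant in the binary sense here.

```python
from itertools import combinations


def inconsistent_triples(scores, threshold):
    """Find triples where two pairs score high but the third pair scores low.

    scores    -- dict mapping frozenset({sample_a, sample_b}) to a pairwise score
    threshold -- score at or above which a pair is treated as scoring high
    """
    samples = sorted({s for pair in scores for s in pair})
    flagged = []
    for a, b, c in combinations(samples, 3):
        pairs = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(p in scores for p in pairs):
            continue  # skip triples that were not fully compared
        high = [scores[p] >= threshold for p in pairs]
        # a matches b and b matches c, yet a does not match c (or a permutation)
        if sum(high) == 2:
            flagged.append((a, b, c))
    return flagged


scores = {
    frozenset(("A", "B")): 5.0,
    frozenset(("B", "C")): 4.0,
    frozenset(("A", "C")): -6.0,
}
print(inconsistent_triples(scores, threshold=0.0))
```

A check like this only becomes possible when more than two samples are compared, which is why this kind of nuisance does not show up when pairs are processed one at a time.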
0:34:42 | so |
---|
0:34:43 | calibration is a good thing makes analysts happy and smile when it works |
---|
0:34:50 | thank you niko and everybody that works on it |
---|
0:34:54 | so now what |
---|
0:34:56 | what do i do if i want to combine these methods |
---|
0:35:00 | this gets also quite complicated |
---|
0:35:05 | and you know do we weight these processes in a dynamic fashion taking |
---|
0:35:11 | into account when they are working in areas that they've been developed and trained on |
---|
0:35:17 | and |
---|
0:35:18 | de-weighting them when they are running a little bit out of the regions that |
---|
0:35:24 | they've been developed for |
---|
0:35:27 | how do we mitigate the observation bias you know you certainly don't want a human |
---|
0:35:33 | examiner to know what the scores are from the automatic system before they can finish |
---|
0:35:39 | their evaluation |
---|
0:35:42 | but it gets even more fine grained than that sometimes |
---|
0:35:46 | you know you hear |
---|
0:35:48 | content in the samples you're working on that can bias you |
---|
0:35:53 | you might consider removing that content at the expense of working with less data |
---|
0:36:00 | you've got all these variabilities to deal with the subjects of the samples themselves the |
---|
0:36:06 | humans that are actually conducting the comparison process |
---|
0:36:10 | not all analysts are alike |
---|
0:36:14 | for example and the machines as well |
---|
0:36:18 | there's issues about consistency and repeatability |
---|
0:36:24 | i already mentioned logically consistent decisions and then |
---|
0:36:30 | you know having some best practices to establish how to |
---|
0:36:34 | use these processes remember one of the daubert criteria is the existence of standards |
---|
0:36:40 | and their maintenance |
---|
0:36:42 | to invoke these processes |
---|
0:36:45 | so fortunately there's a number of evaluations that can help us nfi |
---|
0:36:52 | tno i think in two thousand three had the very first one on |
---|
0:36:57 | real forensic data |
---|
0:36:59 | that was a lot of fun |
---|
0:37:02 | and you know the agreement we require that you destroy the data after you didn't |
---|
0:37:07 | unfortunately we divided by the agreement no longer have that the at the but |
---|
0:37:13 | that was really very nice |
---|
0:37:15 | but the good news is that |
---|
0:37:17 | there might be more coming |
---|
0:37:21 | then we have the nist hasr series which you know isn't quite forensic but |
---|
0:37:26 | it's probing some dimensions that will help us make progress i think in the forensic |
---|
0:37:30 | domain |
---|
0:37:32 | and the next sre |
---|
0:37:35 | might actually have real forensic samples and |
---|
0:37:41 | so |
---|
0:37:42 | are |
---|
0:37:43 | you know i think it's important to look at all this in the context of |
---|
0:37:47 | the daubert factors |
---|
0:37:49 | and |
---|
0:37:50 | especially for application in the united states |
---|
0:37:54 | but maybe throughout the rest of the world as well they seem like |
---|
0:37:58 | pretty sound principles to me |
---|
0:38:01 | but if there's additional factors that are used internationally i would love to know about |
---|
0:38:06 | them to make sure that they're being addressed at least in our work as |
---|
0:38:10 | well |
---|
0:38:14 | so some activities |
---|
0:38:16 | there's swg-speaker the scientific working group on speaker recognition |
---|
0:38:24 | here we have a history of starting this and |
---|
0:38:28 | a lot of the |
---|
0:38:30 | efforts were motivated by the two thousand nine report from the national research council |
---|
0:38:37 | national academy of sciences |
---|
0:38:40 | on strengthening forensic science in the united states it basically called all of forensic science |
---|
0:38:46 | on the carpet |
---|
0:38:47 | and said |
---|
0:38:51 | the practise that's used for dna is the gold standard |
---|
0:38:56 | the rest of you guys should model it |
---|
0:38:58 | they called into question things like carpet fibre analysis tool marks |
---|
0:39:05 | things that just scientifically didn't quite have the background |
---|
0:39:10 | in terms of their development |
---|
0:39:12 | and that's partly because forensic science didn't grow up being developed by scientists |
---|
0:39:19 | so one area that we're working hard |
---|
0:39:23 | to address with the investigatory |
---|
0:39:26 | voice working group |
---|
0:39:29 | actually is to make progress in different things like the different use cases and collection |
---|
0:39:36 | standards |
---|
0:39:38 | i already mentioned best practices |
---|
0:39:43 | standard operating procedures there's this new type eleven standard |
---|
0:39:48 | the scientific working group has a number of ad hoc committees |
---|
0:39:53 | including an rdt&e committee which a number of you would probably be interested |
---|
0:39:58 | in |
---|
0:39:58 | and the best practices committee |
---|
0:40:01 | science and the law |
---|
0:40:03 | and vocabulary to get kind of the whole community talking together |
---|
0:40:08 | so the best practices committee for example deals with a number of areas including collection of audio |
---|
0:40:14 | recordings |
---|
0:40:16 | the related data that goes with an audio recording you know maybe you know about |
---|
0:40:21 | the phone numbers the handsets used |
---|
0:40:24 | a number of things like that |
---|
0:40:26 | some of those factors should be passed to the examiner others might cause bias that you |
---|
0:40:32 | have to be concerned about |
---|
0:40:33 | then there's the transmission part of the standard known as the type eleven record you'll |
---|
0:40:38 | probably be hearing a lot about that |
---|
0:40:42 | and then the proper application |
---|
0:40:45 | and also guidelines for examiners and reporting |
---|
0:40:51 | so here for example is how you form a standard transaction in this type |
---|
0:40:56 | eleven framework basically you create a transaction that has the known and questioned recordings |
---|
0:41:06 | and then you've got the two type eleven records that go with that about |
---|
0:41:12 | how to transmit |
---|
0:41:13 | that data you have type two information about the situation of each of those recordings |
---|
0:41:22 | and then you have this type two that has all the |
---|
0:41:27 | information about |
---|
0:41:28 | the legal framework and justification and then an overall |
---|
0:41:33 | type one to enact the transaction and you go through this process where you |
---|
0:41:39 | do some speaker recognition scoring reporting and then deliver the report back to the submitter |
---|
0:41:48 | so this is just one of seventeen |
---|
0:41:51 | types of transactions that are currently defined in this effort i don't have time |
---|
0:41:57 | to go over all of them |
---|
0:42:00 | how does one actually arrive at a best practise |
---|
0:42:05 | you can |
---|
0:42:08 | go through two branches survey the community and see what candidate best practices there are |
---|
0:42:15 | the other branch is to look for gaps and develop new best practices |
---|
0:42:20 | but in all cases these are going to go through a validation process that requires |
---|
0:42:26 | evaluation |
---|
0:42:29 | and then |
---|
0:42:31 | finally when they've been evaluated they will be proposed and accepted as |
---|
0:42:37 | an actual best practise and maybe a step further as a proposed standard this is |
---|
0:42:43 | all within the ansi/nist-itl framework |
---|
0:42:48 | sometimes you need multiple best practices especially in human based approaches because there's a |
---|
0:42:53 | lot of variability among analysts and what their different talents are |
---|
0:42:58 | so if we had one standard that says human recognition should be done by |
---|
0:43:02 | structured listening |
---|
0:43:03 | you would exclude eighty five to ninety five percent of the laboratories in the united states |
---|
0:43:11 | whenever you do an evaluation you need to be very careful about the design collection of |
---|
0:43:16 | data funding how do you keep this going |
---|
0:43:19 | so there are some new efforts |
---|
0:43:22 | that i'll talk about later with the osac |
---|
0:43:26 | let me start with this simple request to the community |
---|
0:43:30 | so if you have candidates for best practices please submit them to swg-speaker and |
---|
0:43:37 | the osac for consideration |
---|
0:43:42 | pursue daubert factors improve robustness |
---|
0:43:46 | work with the analysts there's nothing quite as eye opening as working |
---|
0:43:50 | with an analyst and understanding the challenges they're dealing with |
---|
0:43:54 | and participate in forensic style evaluations |
---|
0:43:58 | that's what we would really like to see |
---|
0:44:01 | wrote the most serious |
---|
0:44:03 | so here i just have a couple of slides i know i'm running short |
---|
0:44:08 | and the idea here is i mentioned osac |
---|
0:44:11 | okay so the organisation of scientific area committees this is a new effort |
---|
0:44:16 | it's housed at nist |
---|
0:44:18 | swg-speaker will be absorbed in osac as its speaker recognition subcommittee |
---|
0:44:25 | i've already mentioned the ansi/nist-itl type eleven records |
---|
0:44:32 | i has a great set of |
---|
0:44:36 | documents and a journal and even their code of conduct |
---|
0:44:42 | that you might be very interested in |
---|
0:44:46 | there's a lot of other organisations i basically had a list of |
---|
0:44:51 | a quarter of this slide then asked some friends for help thank you everybody who sent |
---|
0:44:56 | me things |
---|
0:44:56 | now i have too many things to actually talk about all of them |
---|
0:45:00 | so i'll just highlight two here |
---|
0:45:03 | and in fact |
---|
0:45:06 | i mentioned the nfi folks are pursuing some new data that's in the |
---|
0:45:12 | forensic domain i won't steal the thunder from their paper which is why trim a |
---|
0:45:16 | conference |
---|
0:45:17 | and there is some big efforts in |
---|
0:45:20 | europe in the |
---|
0:45:23 | fp7 as well for |
---|
0:45:25 | b |
---|
0:45:26 | multi integrate voice systems that are multi media multi |
---|
0:45:32 | source system |
---|
0:45:35 | okay so let me conclude |
---|
0:45:39 | speaker recognition is successfully used today in a variety of applications |
---|
0:45:44 | but must be applied responsibly with caution |
---|
0:45:47 | and this is referencing the paper the chair mentioned at the beginning |
---|
0:45:54 | we need to work more to address the factors in the forensic domain that |
---|
0:46:00 | degrade performance |
---|
0:46:03 | real case data as you heard can be extremely challenging |
---|
0:46:07 | and right now if somebody wanted to ask okay that first example with the triple |
---|
0:46:12 | homicide what kind of error rate could i expect |
---|
0:46:16 | in that situation that is one of the daubert factors |
---|
0:46:20 | nobody can answer that even close |
---|
0:46:26 | there's many challenges to as |
---|
0:46:28 | that are needed to address these questions |
---|
0:46:31 | please contact me if you have any ideas and i think has he said it |
---|
0:46:36 | best |
---|
0:46:37 | someone is a very good finish way for a decision |
---|
0:46:41 | so maybe we can talk more about this and this on a nine |
---|
0:46:45 | thank you so much |
---|
0:46:55 | when you think drawable is a very much so |
---|
0:47:00 | we're a little bit over but we still have five or ten minutes |
---|
0:47:05 | for questions so yes |
---|
0:47:09 | who wants to |
---|
0:47:10 | begin |
---|
0:47:13 | wait for the microphone it's coming |
---|
0:47:18 | i four recording |
---|
0:47:22 | self recording for the mismatched especially the first equality play |
---|
0:47:26 | is that the question of the intelligibility of the speech is even a human cannot |
---|
0:47:30 | understand for example the first but you like you can understand what they say how |
---|
0:47:35 | the machine can't it with like |
---|
0:47:37 | so that the intensity of the speech is one part of |
---|
0:47:41 | like for special for maybe locates the say okay |
---|
0:47:45 | this problem for just a bit of speech is no one can ask expert or |
---|
0:47:49 | t so we can expose from the beginning or something like that right is |
---|
0:47:53 | is the issue addressed before so |
---|
0:47:56 | the intelligibility issue is an interesting one because it comes up in one of the |
---|
0:48:00 | very first courtroom fiascos with the michigan state police |
---|
0:48:05 | with some voice evidence |
---|
0:48:08 | where the testimony from one of the police was that the |
---|
0:48:16 | voice on that recording |
---|
0:48:18 | can only be this person to the exclusion of all others and then the judge |
---|
0:48:23 | played the recording |
---|
0:48:24 | he couldn't understand |
---|
0:48:27 | so then he's asking so how what makes you think |
---|
0:48:32 | and quickly this was overturned |
---|
0:48:35 | or ruled out |
---|
0:48:38 | then stepping forward |
---|
0:48:40 | as you saw with the structured listening |
---|
0:48:43 | the first step there is to transcribe the speech into words and then look for |
---|
0:48:48 | these |
---|
0:48:48 | variations |
---|
0:48:51 | you're in trouble if you can't transcribe the speech for that now |
---|
0:48:57 | one thing that we need to be cautious about with the automatic systems |
---|
0:49:02 | as long as they can detect speech which isn't always the case |
---|
0:49:07 | they'll process the data and produce a score |
---|
0:49:11 | well you shouldn't treat it like a black box |
---|
0:49:15 | that score might be meaningless |
---|
0:49:17 | so i don't really know how to directly address your question other than to share those |
---|
0:49:22 | observations but if you're working on that that would be good to know |
---|
0:49:28 | okay |
---|
0:49:29 | thank you |
---|
0:49:31 | what else |
---|
0:49:35 | which are |
---|
0:49:40 | thanks for torture i'll well also adding speech and leon in france i attended the |
---|
0:49:45 | forensic tutorial |
---|
0:49:47 | and he said that when i have a tracks recording a and the core suspect |
---|
0:49:53 | like in to them but like to rate |
---|
0:49:56 | so that covers assigned phonetic pronunciations in the actual choice |
---|
0:50:01 | can you just |
---|
0:50:03 | i can i can clear |
---|
0:50:05 | next we cringe but go a sorry i was kinda listening to your presentation about |
---|
0:50:10 | the phonetic content we actually looking at the london fines right is that is that |
---|
0:50:15 | occur something you follow similar type thing i you get the suspect to pronounce assigned |
---|
0:50:20 | twelve fines |
---|
0:50:22 | so this gets down to |
---|
0:50:25 | in one area the methods being used |
---|
0:50:29 | so the very old |
---|
0:50:32 | antiquated method known as spectrographic matching |
---|
0:50:37 | actually requires at least twenty word like units |
---|
0:50:42 | being spoken |
---|
0:50:45 | that match what's in the evidence |
---|
0:50:48 | so one way they would deal with this is to give the person something to |
---|
0:50:53 | read to get those twenty word like units |
---|
0:50:56 | well as you can imagine read speech is disastrous if you're trying to study things |
---|
0:51:02 | like dialectal variation |
---|
0:51:05 | so |
---|
0:51:06 | what's good for the old |
---|
0:51:08 | spectrographic matching process is a disaster for modern |
---|
0:51:13 | methods like structured listening which i should add are inspired by a lot of the |
---|
0:51:17 | methods used in europe in germany by the bka |
---|
0:51:23 | so this is |
---|
0:51:24 | those recordings that they could be talking about the old |
---|
0:51:28 | style manner |
---|
0:51:30 | just as a subsequent question then if we're able to get some kind of speech recognition |
---|
0:51:36 | into speaker id systems |
---|
0:51:38 | where there is some kind of phonetic alignment is that not beneficial to the |
---|
0:51:45 | forensic community |
---|
0:51:47 | well in fact some speaker recognition approaches |
---|
0:51:51 | have a layer where they're actually doing speech recognition and phone recognition |
---|
0:51:58 | and a lot of that work was inspired by george doddington actually |
---|
0:52:03 | and idiolect |
---|
0:52:05 | and sure whether it's in the recognition system itself or a by-product of the |
---|
0:52:11 | structured listening approach speech recognition becomes a very important process whether it's automatic is a |
---|
0:52:19 | different question |
---|
0:52:22 | but if there's a lot of data to analyze it overwhelms the analyst if they have |
---|
0:52:27 | to manually do say a phonetic transcription which was the approach being used for |
---|
0:52:33 | quite a while |
---|
0:52:35 | that is this bad system i showed on that one slide helps to automate that |
---|
0:52:40 | and speed the efficiency in fact |
---|
0:52:49 | but question |
---|
0:52:52 | you mentioned dna as the sort of benchmark and |
---|
0:52:58 | of course that's scary for us because we're never gonna be as accurate as |
---|
0:53:02 | they are i think that's the problem in speaker recognition |
---|
0:53:05 | but we have valuable evidence to introduce it's softer it's weaker evidence |
---|
0:53:12 | do you think the american legal system can understand the concept of weaker evidence and how |
---|
0:53:18 | valuable it can be and do you think a likelihood ratio |
---|
0:53:22 | can be understood by jurors |
---|
0:53:26 | okay so that is multiple ones |
---|
0:53:30 | the first one |
---|
0:53:31 | is what the national academy of sciences was calling for with a framework |
---|
0:53:37 | like dna |
---|
0:53:39 | they weren't although it would be nice demanding that the performance be on par |
---|
0:53:44 | with dna |
---|
0:53:45 | but they liked dna and the scientific background behind it and the very large |
---|
0:53:52 | studies that have been done there as evidence it's very nice |
---|
0:53:57 | except by the way when you're dealing with dna mixtures but for the time being |
---|
0:54:02 | just assume that you have dna samples where there is a whole nother level of |
---|
0:54:08 | dealing with some of those challenges so |
---|
0:54:11 | dna is not perfect but it's extremely good |
---|
0:54:15 | the next question about will jurors be able to deal with properly understand likelihood ratios |
---|
0:54:23 | so bill perhaps and it is conducting a survey of mock jurors to |
---|
0:54:30 | actually see when they're presented with |
---|
0:54:34 | evidence in different forms whether it's likelihood ratios or a verbal description of what a |
---|
0:54:41 | log-likelihood ratio score might mean to see how that's interpreted by jurors i don't know |
---|
0:54:49 | if he's published that paper but it should be happening soon |
---|
0:54:53 | and one thing that happened with dorothy going and see who is also involved in |
---|
0:54:57 | this study is a hybrid cy x where she came up with a very scary |
---|
0:55:02 | statistic and that was something like a quarter of jurors in the us |
---|
0:55:08 | don't understand fractions |
---|
0:55:11 | what are we gonna do |
---|
0:55:13 | move to europe i don't know how well i don't know what the ratio is |
---|
0:55:18 | in europe but wow that's this area so |
---|
0:55:23 | but it's important that the general public vad |
---|
0:55:29 | i don't know what but if i could commanders peace last question i'm not sure |
---|
0:55:35 | it's useful to ask the question in fact i have the answer but don't and |
---|
0:55:41 | pickle will not understand the likelihood ratio and we know all about because well for |
---|
0:55:46 | and able to understand likelihood ratio and how mine |
---|
0:55:49 | under the |
---|
0:55:51 | reason to and so like but there's |
---|
0:55:53 | you should still requesting for local overall system in all the countries to be expected |
---|
0:55:58 | to be a witness to coming from papa coped |
---|
0:56:01 | you know that we explain for people to you means but we still keep results |
---|
0:56:06 | but it so like to one issue is not the non-focal is not |
---|
0:56:10 | the lemon |
---|
0:56:11 | so why we define orifice |
---|
0:56:15 | that can break issue used only |
---|
0:56:17 | according to me to give you pour to needy to view of a party |
---|
0:56:22 | to |
---|
0:56:24 | i bouquet do to a estimated quality of what we didn't up of science in |
---|
0:56:31 | the ripple |
---|
0:56:32 | i like the ratio is defined for some difficult people use one expert |
---|
0:56:38 | in the park the report is using a global that if you meet all we'd |
---|
0:56:43 | like to ratio and of or expert could |
---|
0:56:48 | review baseball than the a firewall against to the middle |
---|
0:56:53 | and the we are in some to pick language not in the cold language after |
---|
0:56:58 | about the expert the younger people |
---|
0:57:00 | you see his own opinion and taking his own risk |
---|
0:57:05 | and this is not |
---|
0:57:06 | like calibration at all |
---|
0:57:09 | sorry i don't want to take that would a i would like to a location |
---|
0:57:13 | to discuss just question the later maybe k varies |
---|
0:57:17 | last question |
---|
0:57:19 | so one |
---|
0:57:20 | no you |
---|
0:57:24 | george the |
---|
0:57:26 | well the likelihood ratio is a wonderful thing |
---|
0:57:32 | the primary issue with the likelihood ratio use the |
---|
0:57:38 | happens to be the output of a system whose crazy |
---|
0:57:42 | the likelihood ratio |
---|
0:57:44 | if you actually know the likelihood ratio |
---|
0:57:47 | perfectly wonderful to use |
---|
0:57:50 | but the likelihood ratio audible supposed to most portion |
---|
0:57:55 | let's works |
---|
0:57:58 | maybe what you were just getting at is that we need to keep in mind |
---|
0:58:03 | we're always estimating likelihood ratios and it's just another |
---|
0:58:09 | area of mismatch |
---|
0:58:12 | you know our systems are producing these estimates |
---|
0:58:15 | and |
---|
0:58:16 | using data that probably doesn't |
---|
0:58:18 | look anything like that first real case |
---|
0:58:23 | so what you |
---|
0:58:25 | i don't |
---|
0:58:27 | i have to close the session unfortunately and i want to thank you |
---|
0:58:32 | by your jewelry okay |
---|
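
The closing exchange notes that a system never knows the likelihood ratio; it estimates one from calibration data that may look nothing like the case at hand. The sketch below illustrates that point under loud assumptions: same-speaker and different-speaker scores are each modelled by a single Gaussian, and every number is made up for illustration. It is not the method of any system mentioned in the talk, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_log_pdf(x, mu, sigma):
    """Natural log of the normal density N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def estimate_log10_lr(score, same_scores, diff_scores):
    """Estimate log10 LR for one comparison score.

    Assumption: same-speaker and different-speaker scores are each modelled
    by a single Gaussian fitted to the supplied calibration scores.
    """
    log_num = gaussian_log_pdf(score, mean(same_scores), stdev(same_scores))
    log_den = gaussian_log_pdf(score, mean(diff_scores), stdev(diff_scores))
    return (log_num - log_den) / math.log(10.0)


# Hypothetical calibration scores (made-up numbers, not from any evaluation).
matched_same = [2.0, 3.0, 4.0]
matched_diff = [-4.0, -3.0, -2.0]
# A second hypothetical set standing in for conditions unlike the case data.
mismatched_same = [0.0, 1.0, 2.0]
mismatched_diff = [-3.0, -2.0, -1.0]

case_score = 1.0
lr_matched = estimate_log10_lr(case_score, matched_same, matched_diff)
lr_mismatched = estimate_log10_lr(case_score, mismatched_same, mismatched_diff)
print(f"log10 LR, first calibration set:  {lr_matched:.2f}")
print(f"log10 LR, second calibration set: {lr_mismatched:.2f}")
```

The same case score yields two different estimates depending only on which calibration set is used, which is why the reported value should be read as an estimate tied to its calibration conditions.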