0:00:06thank you for the uh
0:00:07invitation to be here uh i it did come as a
0:00:11surprise because as you know uh
0:00:13or will immediately appreciate i'm not uh
0:00:15uh a voice
0:00:16or language recognition
0:00:19person uh but right from day one i realise that there are lots of issues uh circulating here but uh
0:00:24related to things that we've had to uh
0:00:27struggle with in connection with D N A
0:00:29i'm not even a D N A evidence
0:00:31person mainly i work in a kind of medical genetics context and my main uh bread and butter work is
0:00:37you know looking for disease genes and you know
0:00:40the effects of
0:00:41genes on interesting uh phenotypes
0:00:43uh but i've long had an interest in the interpretation of
0:00:47D N A evidence
0:00:48and uh
0:00:49try to contribute a uh
0:00:51a lot to the developments in the field over the
0:00:53uh the years and i'm pleased to say that
0:00:56we have made a
0:00:57a lot of progress
0:00:59um it's also clear that uh people in this
0:01:02uh community here have made a lot of progress in trying to get
0:01:06uh the
0:01:07the field on
0:01:09what i would regard as a more rigorous footing in terms of the
0:01:13interpretation and i i'm thinking
0:01:15uh in terms of framing it in the context of all the evidence
0:01:18in a way that will be
0:01:20comprehensible and meaningful in court
0:01:23uh and so i've done a little bit of background reading of uh
0:01:28uh oh
0:01:29interesting our work in the field by F and what team and several other people here
0:01:34and so uh
0:01:35it's clear that i you know don't have
0:01:37much to say about the
0:01:39the basics but what i thought
0:01:41i would um
0:01:42do is take a uh somewhat contrarian position and uh i would say there seems to be uh uh
0:01:48a kind of sense i got from the reading that the that the
0:01:51that the grass is greener in the next field uh
0:01:54that uh everything is solved and works very well
0:01:57for D N A evidence and and uh i'm going to tell you that that's not the case uh it's
0:02:02very complicated and messy there are some compromises that
0:02:05that sort of work
0:02:07we are saved
0:02:09to a large extent by the
0:02:11by the by the fact that D N A evidence is in general very good
0:02:14evidence uh
0:02:16and very powerful
0:02:17uh and so even if you make a mess of the interpretation
0:02:21uh the ultimate outcome might not be the wrong one but that's not always the case uh
0:02:26actually the reality in courts today of the presentation of D N A evidence
0:02:31it's still pretty dismal uh
0:02:33and uh when i get to the end i'll be talking about the
0:02:37the latest generation of low template D N A evidence
0:02:41very small amounts of D N A lots of stochastic effects
0:02:44uh and lots of
0:02:48those of you who are paying careful attention might
0:02:51recognise that
0:02:53i've changed my title
0:02:56somewhat i've talked about comparison here instead of recognition so we we have the same debate in
0:03:02in D N A evidence that we shouldn't talk about D N A identification because
0:03:06identification is
0:03:07is not possible and not the business of the scientific expert
0:03:13personally i'm a bit more
0:03:14lax and i'm happy to use words that make sense to the general public even if
0:03:18uh we have to be careful about understanding uh
0:03:20what they really mean but anyway you know
0:03:22in acknowledgement here i put voice comparison in my title
0:03:25but in the end although my goal was to try and think about relationships
0:03:30between D N A evidence and uh and voice evidence
0:03:33all the basic work has already been done by people here and i didn't feel i had very much to add
0:03:37so i'm really just gonna restrict myself to talking about the D N A evidence
0:03:41some of the problems that we've had
0:03:42some of
0:03:43my views on how to overcome them
0:03:45uh and then leave for the discussion uh
0:03:49the possibility for people to really raise parallels and you go advised me not to leave any time for discussion
0:03:54because it's a very controversial area but uh i'm going
0:03:57i'm going to try and take the risk
0:03:59uh but uh
0:04:01in fact
0:04:02i packed quite a lot of uh
0:04:03stuff into my slides and sounds to me a bit louder again
0:04:07just move it down a little bit
0:04:09the um
0:04:11and i i wouldn't have time to get through it all properly
0:04:14the um
0:04:16you have the luxury of knowing that
0:04:18uh you don't have to really get to grips with all of this material i just wanna give you the
0:04:22the flavour of
0:04:24the problems
0:04:24we worry about
0:04:26and uh for a historical
0:04:28perspective i'm gonna go back right to the sort of beginning of time so to speak of this whole um
0:04:33weight of evidence academic literature a lot of it springs from this famous case uh
0:04:38in california in nineteen sixty eight
0:04:41uh if you piled up all the papers written about how to interpret the evidence in this case correctly
0:04:46it would
0:04:47would go up to the roof and it wouldn't reach a conclusion the famous uh uh saying about the colours
0:04:52uh it it is a very very interesting case uh i i mean i think you get the details just
0:04:56from there that
0:04:57numbers were made up
0:04:59uh that might be associated with frequencies for various traits
0:05:03that the defendants possessed
0:05:06uh it was claimed the uh the true criminals also possessed
0:05:10uh and you know it's lots of fun you can give this to students 'cause there's lots of things wrong
0:05:14with this you know obviously those probabilities are just made up obviously independence is a problem
0:05:19but there's sort of more fundamental issues uh even if i could
0:05:22wave my magic wand and get rid of those problems and if those really were true probabilities and they really
0:05:27were independent
0:05:29what would the number you get by multiplying these probabilities
0:05:33together what would it be
0:05:34uh and
0:05:36how does it relate
0:05:38to the juror or the finder of fact
0:05:40in deciding whether they committed the crime
0:05:42well i'm i'm not gonna answer that problem
0:05:44for you here entirely but uh
0:05:46it's an interesting and and uh and difficult problem
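The multiplication step described here can be sketched in a few lines. This is only an illustration: the trait frequencies and the population size below are hypothetical and are not the figures used in the case, and the product rule is only valid if the traits really are independent. The second function shows one simple way of relating the product to the question the finder of fact faces, by asking how many other couples in a population would be expected to share all the traits.

```python
from math import prod

# Hypothetical trait frequencies, for illustration only
# (these are NOT the numbers presented in the actual case).
trait_frequencies = [0.1, 0.25, 0.1, 0.5]


def joint_frequency(frequencies):
    """Product rule: frequency of having every trait, assuming independence."""
    return prod(frequencies)


def expected_other_matches(frequency, n_other_couples):
    """Expected number of other couples sharing all the traits."""
    return frequency * n_other_couples


p = joint_frequency(trait_frequencies)
# Hypothetical population of one million other couples.
print(p, expected_other_matches(p, 1_000_000))
```

Even a small joint frequency leaves many expected matches when the population of possible alternatives is large, which is why the product on its own does not answer the question of guilt.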
0:05:50certainly one version
0:05:52to answering it
0:05:54and and one branch
0:05:56of the academic literature
0:05:58but i'm sort of merging things here that slightly uh
0:06:01there there was
0:06:05developed slightly differently but the the the sort of canonical problem for D N A evidence is uh
0:06:10we've got
0:06:11a sample left
0:06:13crime scene
0:06:14uh we've already had some discussion of the notion of a match
0:06:18is meaningful for D N A evidence not always
0:06:21um not for low template D N A and
0:06:24not for the older form of D N A profiles that was in use
0:06:27in the early nineteen nineties and is still occasionally
0:06:30crops up
0:06:31but let's just
0:06:32go along with the idea that you know there is a notion of a match of a yes no answer
0:06:37uh and we've got some frequency information again um
0:06:40let's not worry about where this frequency information comes from just believe it for the moment
0:06:46and so
0:06:47you know how
0:06:48how convinced
0:06:49should you be i think the sort of fallacy that uh has already been alluded to here is
0:06:55to think well one in a million is pretty small
0:06:57so he must be guilty
0:06:58a what you know what is that sort of logic there and
0:07:02that was a bit of a
0:07:04an academic literature uh a fun discussion that went on
0:07:07for quite a few years um
0:07:09it's uh
0:07:11in retrospect the answer
0:07:12seems very easy and you wonder how we managed to argue about it for quite a number of years but
0:07:17uh of the uh
0:07:18but uh anyway that's what academics uh are there for uh finding out
0:07:22uh problems to argue over and
0:07:24a kind of consensus
0:07:26eventually emerged around the
0:07:29you know what
0:07:30from uh
0:07:31an orthodox
0:07:33bayesian position would be a kind of standard and straightforward response
0:07:37you should be using
0:07:38bayes theorem uh and i put one version of it there um
0:07:43you can so
0:07:45uh introduce some notations C is the name of the person who committed the crime
0:07:50or let's say of course
0:07:52being the source of the D N A is not logically equivalent to committing the crime we'll just suppose it
0:07:56is here
0:07:56um and S is the name of the
0:07:59of the
0:08:03there are some subtleties and difficulties built into here i'm gonna spend a little bit more time and
0:08:08talking about
0:08:09uh and that revolves around the idea of
0:08:12of what is the
0:08:13alternate hypothesis
0:08:18in um
0:08:20as i i've already uh mentioned earlier this idea that i i think there's a bit of an impression
0:08:24that the you know the grass is greener in the next field and things are easy for D N
0:08:28A evidence but one of the things that's uh
0:08:30that's not easier in some ways uh is
0:08:33this uh
0:08:33specifying of the alternate hypothesis and
0:08:35and the uh
0:08:37can be
0:08:37quite uh
0:08:39a difficulty logically
0:08:40uh in a number of ways
0:08:43uh but yeah
0:08:44i'm going to assume so so in the
0:08:48the speech recognition literature people have been happy to just posit that the
0:08:52the null hypothesis the prosecution hypothesis if you like is the same source
0:08:56that the uh
0:08:58uh queried uh
0:08:59uh voice and the suspect's voice uh
0:09:02come from the same individual
0:09:03or different source
0:09:07for D N A evidence at least
0:09:08it can be that simple
0:09:10and uh i've chosen to break down the alternative hypothesis
0:09:13here into
0:09:15a number of hypotheses of the form X did it for various X
0:09:20uh but
0:09:21for more complex problems there are different ways uh to break down the evidence
0:09:26uh the alternative hypotheses
0:09:28for example if there are uh multiple
0:09:31uh D N A samples which is often the case there are lots of alternatives around
0:09:35different contributors to the different samples you know just uh
0:09:39often it's just assumed implicitly that there's a single contributor
0:09:42but for mixed samples
0:09:43that's not at all straightforward
0:09:45uh there's different alternative hypotheses a around uh
0:09:50and there's different alternative hypotheses around the number of contributors to the sample
0:09:56but um
0:09:58in this form here
0:09:59i'm just thinking about breaking down the alternate hypotheses into all the individuals who it could have been
0:10:04and we have to add up
0:10:05over these uh individuals
0:10:09logically you have to add up over everyone on earth
0:10:12uh and uh
0:10:13and that this idea came up actually in a court
0:10:16case and the judge was horrified at the idea that he has to sit there and think yeah
0:10:19uh about every person on earth one at a time
0:10:23uh but i wanna emphasise this point that logically you
0:10:26have to
0:10:28if you want to prove
0:10:30that a particular individual is the source of your D N A or the source of your voice recording
0:10:35logically that means that everyone else on earth
0:10:38is not the source
0:10:39and also alternate hypotheses around
0:10:42uh you know synthetic um voice fabrication and these kind of things all of those hypotheses
0:10:47uh have to be ruled out in order to establish the one hypothesis
0:10:51you
0:10:51care about
0:10:55so a little bit of uh
0:10:58manipulation we can write it now
0:11:02like this
0:11:03in uh again this is kind of just
0:11:05classic uh
0:11:06way of breaking down the evidence
0:11:08uh uh breaking down the calculation in bayes theorem
0:11:11and the idea is to introduce some notation here what i call the likelihood ratio
0:11:16and again i want to emphasise there isn't one likelihood ratio there are many and we encounter
0:11:22this difficult problem of how to combine the likelihood ratios
0:11:25uh although we we we
0:11:27we'd like to
0:11:29i'm thinking
0:11:31in terms of the
0:11:32D N A evidence being interpreted last
0:11:34and so
0:11:35this other ratio the prior
0:11:37i'm thinking of as incorporating all the other evidence i mean there's no
0:11:41logical reason
0:11:42for doing it that way around of course that's
0:11:43it's a nice
0:11:44coherence property of
0:11:46of the bayesian analysis you don't
0:11:48have to
0:11:50uh you get the same answer
0:11:52whichever order you analyse the evidence
0:11:55and in order to get it in this form you need to make
0:11:58uh assumptions uh
0:12:00various uh uh independence
0:12:02assumptions
0:12:03we could argue about that but they seem
0:12:04generally reasonable
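The slide itself is not in the transcript, so what follows is only a sketch of one standard way of writing the form being described. The symbols are assumptions: C is the true source and S the suspect as introduced earlier, E is the D N A evidence, x ranges over the alternative individuals, R_x is the likelihood ratio for alternative x, and w_x is the ratio of prior probabilities given all the other evidence.

```latex
\[
P(C = S \mid E)
  = \frac{P(E \mid C = S)\,P(C = S)}{\sum_{x} P(E \mid C = x)\,P(C = x)}
  = \frac{1}{1 + \sum_{x \neq S} R_x\, w_x},
\qquad
R_x = \frac{P(E \mid C = x)}{P(E \mid C = S)},
\quad
w_x = \frac{P(C = x)}{P(C = S)} .
\]
```

The second equality follows by dividing numerator and denominator by P(E | C = S) P(C = S). With this convention a small R_x counts against alternative x, which matches the way a likelihood ratio of one over a million is spoken of in the talk.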
0:12:08some putting the um
0:12:12no getting to be able to write
0:12:14uh weight of evidence in this form in a kind of forensic setting
0:12:18was a pretty big uh
0:12:20step forward it took yeah
0:12:22many years and lots of arguments and so on but it's
0:12:25you know pretty much accepted amongst a bigger
0:12:28community nowadays and
0:12:30it overcame a lot of problems that people
0:12:32uh struggled with i mean i've been in the field so long now that it's
0:12:36sort of hard to remember how difficult some of these troubles were
0:12:41but uh you know this basic idea that i mentioned to you well one in a million is really small
0:12:45he must be guilty
0:12:48it's not it's not true and people didn't know how to think about that
0:12:52uh until we were able to formalise the problem in this way
0:12:56uh and now it seems pretty easy to think about it for
0:13:00again this is a simplification and the general problem is not that simple but one way to think about it is
0:13:05how many alternate suspects
0:13:06there are
0:13:07uh and
0:13:09under some simplifying assumptions
0:13:12you essentially add up the likelihood ratios over your alternative suspects
0:13:17and so
0:13:18a likelihood ratio
0:13:20of one over a million
0:13:22isn't convincing uh if
0:13:24the number of alternate suspects
0:13:26is very large
0:13:28into the time that there's you know no fundamental logical problem here about this uh
0:13:32nice uh
0:13:33distinction between the role of the experts and the role of the
0:13:37of the of the finder of fact
0:13:39oh come back to that but but
0:13:41uh you know this is certainly only true under some simplifying assumptions
0:13:45and whenever i kind of present this idea in court
0:13:48i have to sort of be careful about wording like if you choose to assume that all the alternate suspects
0:13:53are equally likely
0:13:54uh then you come up with a formula like this
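A minimal sketch of the simplified formula, assuming (as the speaker says one may choose to) that every alternate suspect is a priori exactly as likely as the defendant and that the same likelihood ratio applies to each of them. The function name and the numbers of alternates are illustrative only.

```python
def posterior_source_probability(likelihood_ratio, n_alternates):
    """P(defendant is the source), assuming equal priors for everyone.

    likelihood_ratio is P(evidence | alternate) / P(evidence | defendant),
    so one over a million is written 1e-6.
    """
    return 1.0 / (1.0 + n_alternates * likelihood_ratio)


# The same one-in-a-million figure against differently sized pools of alternates.
for n in (10, 10_000, 1_000_000, 50_000_000):
    print(n, posterior_source_probability(1e-6, n))
```

With a million equally likely alternates the posterior is only one half, so "one in a million is really small" does not by itself imply guilt.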
0:13:57of course nowadays um
0:13:59likelihood ratios
0:14:01are much
0:14:02bigger or smaller whichever way around you do them uh than one in a million
0:14:05uh and so uh
0:14:08in typical cases the problem has
0:14:11vanished but again i wanna emphasise there's lots of cases out there with mixed profiles small amounts of D N A
0:14:16complex relatedness
0:14:18where all these issues still matter
0:14:22the role of relatives so there was uh
0:14:25again much confusion about this in the past
0:14:30this nice uh
0:14:31formalisation in terms of bayes theorem
0:14:34i i might slip into this
0:14:36language but another point
0:14:38for discussion is i don't think
0:14:40what i'm doing is fundamentally bayesian i tend to avoid
0:14:44the label bayesian uh i'm just using bayes theorem and it's a
0:14:46theorem of probability that all uh
0:14:49who uh adhere to the model of
0:14:51mathematical probability accept
0:14:53uh in fact i would say my approach is fundamentally non bayesian in ways that i will point out uh
0:14:58want to later
0:15:00uh i just remembered now i forgot to put a slide on this there was a mention of it uh
0:15:04that there was a big uh
0:15:05court case um
0:15:07in the U K a number of years ago where
0:15:09the uh
0:15:11there was strong D N A evidence implicating a defendant
0:15:15but there was quite a substantial amount of evidence in his favour in particular
0:15:19uh the victim of this crime gave a good description of the
0:15:22uh of the of the attacker uh and the defendant didn't match you know gross mismatch
0:15:27between the description and what he looked like
0:15:29but she also said in court this does not resemble the man
0:15:33you know he does not resemble the man
0:15:35uh he doesn't resemble
0:15:37the man that attacked me and
0:15:39so uh and he had an alibi and
0:15:42wasn't near the scene of the crime at the time and so
0:15:45um quite a complicated case went to several trials and uh
0:15:49the um i wasn't involved in that case but the the defence expert actually
0:15:54uh proposed to walk the jurors through
0:15:56a bayes theorem calculation
0:15:58uh with likelihood ratios for
0:16:01the uh
0:16:01description and likelihood ratios for the uh
0:16:04for the D N A evidence
0:16:05uh and likelihood ratios for the alibi evidence
0:16:08and suggesting values and the jurors were asked to multiply them together
0:16:12and the judge got quite enthusiastic about this and ordered somebody to go out and buy a dozen calculators for the
0:16:17jurors to uh
0:16:18multiply the numbers together but uh the judge kept getting zero and uh when he tried to do the calculation
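The multiplication the jurors were asked to do can be sketched as follows. The three values are hypothetical and are not the figures suggested in that case; they use the convention where a likelihood ratio below one counts against the alternative and a value above one favours the defendant. Adding logarithms is a common way to keep such products readable. The last line shows only that a very small product prints as zero when few decimal places are displayed; whether that is what happened to the judge is not stated in the talk.

```python
import math

# Hypothetical likelihood ratios, for illustration only.
likelihood_ratios = {
    "dna": 1e-6,         # strongly against an alternative source
    "description": 10.0,  # description evidence favouring the defendant
    "alibi": 4.0,         # alibi evidence favouring the defendant
}


def combined_log10_lr(ratios):
    """Sum of log10 likelihood ratios, assuming the items are independent."""
    return sum(math.log10(value) for value in ratios.values())


log_total = combined_log10_lr(likelihood_ratios)
product = 10 ** log_total
print(log_total, product)

# A small product shown to three decimal places reads as zero.
print(f"{product:.3f}")
```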
0:16:25anyway the um
0:16:28the guy was uh was convicted
0:16:30but it went to appeal and the appeal court was sort of
0:16:32horrified absolutely horrified about this uh complicated mathematical stuff that these uh wise old judges didn't understand
0:16:39uh was uh about having this in court and so
0:16:43the judgement was very severe that uh
0:16:45you know bayesian methods were not to be introduced in U K courts uh because you know uh
0:16:51wise old judges don't understand that and uh you know it's all sort of a power thing you know
0:16:56worried about losing that
0:16:57losing that power but i just thought it was a sort of amusing
0:17:00idea that any form of reasoning is is allowed there is no other
0:17:04rule as far as i know there's not
0:17:06uh all reasoning is allowed in a british court except
0:17:10the form of reasoning that's been established to sort of be logical and and rational and reasonable that's the
0:17:15only thing you're not allowed to present in a
0:17:17in a british court uh
0:17:19so that so that was a bit of an aside but that's why i sort of avoid the label
0:17:23bayesian and i think it's uh it is uh irrelevant and um
0:17:28to what we are doing
0:17:30uh and of course i don't uh
0:17:32explicitly introduce mathematical formalism
0:17:35whenever i'm giving expert witness evidence
0:17:37but i do try and talk jurors through this kind of thing and say
0:17:40imagine how many
0:17:42close relatives there are
0:17:44of the defendant
0:17:45what is the match
0:17:46what is the match probability for them
0:17:48uh and imagine how many unrelated
0:17:50people what's the match probability you gotta combine the total weight
0:17:54uh to come up with a
0:17:56and if that combined weight is not negligible
0:18:00you've got reasonable doubt about
0:18:03having the right guy
0:18:04uh unless there's other evidence uh implicating those
0:18:07uh or not
0:18:08implicating the brothers
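The way of talking jurors through relatives and unrelated people can be sketched as a sum over groups. All the numbers below are hypothetical: the counts, the match probabilities and the relative prior weights are chosen only to show the arithmetic, and real match probabilities for brothers depend on the profiling system used.

```python
def total_alternative_weight(groups):
    """Sum of count * match probability * relative prior weight over groups."""
    return sum(count * match_prob * weight for count, match_prob, weight in groups)


def posterior_source_probability(groups):
    """P(defendant is the source) given the combined weight of the alternatives."""
    return 1.0 / (1.0 + total_alternative_weight(groups))


# (count, match probability, prior weight relative to the defendant)
groups = [
    (2, 1e-2, 1.0),          # two brothers (hypothetical values)
    (1_000_000, 1e-9, 1.0),  # unrelated people (hypothetical values)
]

print(total_alternative_weight(groups), posterior_source_probability(groups))
```

In this made-up example the two brothers contribute twenty times more weight than a million unrelated people, which is the reason for asking about close relatives first.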
0:18:22yes many of these features of course i'm gonna be talking about aren't really relevant to voice recognition but i
0:18:27think some of them will be
0:18:28uh i'm talking uh
0:18:30i've got a sneak in the label here now i only mean genetic uh
0:18:33ideas of uh
0:18:34of ethnicity here
0:18:37relatedness matters for D N A evidence
0:18:41this is really the same issue as
0:18:43close relatives
0:18:44but it's just relatedness
0:18:46on a more distance scale
0:18:49uh so the relatedness of people in an isolated uh
0:18:53uh geographical or religious group
0:18:57uh yeah
0:18:58compared to relatives
0:18:59their relatedness is less
0:19:01relatives like cousins and so forth
0:19:03their relatedness is less but there's typically more of them
0:19:06and so they
0:19:07you know they
0:19:08kind of plausibly balance out
0:19:11i'll come back to this
0:19:13uh really really important issues are around uh and and a real fundamental difficulty that i don't think
0:19:19we really have solved
0:19:20is what to do about
0:19:23error uh
0:19:24labelling errors
0:19:25uh and outright
0:19:27evidence fraud
0:19:30but at least the bayes theorem
0:19:33tells us how to think about the problem and what
0:19:35the relevant issues are but
0:19:36but what it tells us is um
0:19:39is a little bit worrying i mean first of all one thing that's not worrying is that some
0:19:44critics of D N A evidence were going round saying well you know
0:19:47no human activity has an error rate less than about one in a thousand
0:19:51and therefore
0:19:52numbers like ten to the minus six ten to the minus seven ten to the minus eight
0:19:55come up in uh
0:19:57in in connection with D N A evidence are completely meaningless
0:20:00so that reasoning is invalid
0:20:02it is not the probability of any error that matters
0:20:04but only an error that generated the data that we observe
0:20:10there's a famous story that uh phil
0:20:12dawid uh
0:20:14pointed me to from
0:20:15price i think
0:20:16probably eighteenth century english philosopher to discuss this point that
0:20:20a printing error in the newspaper is more likely than you winning the lottery
0:20:25uh but nevertheless
0:20:26if you see a number printed in the newspaper as the winning lottery number
0:20:31you don't throw the newspaper out and say ah probably a printing error
0:20:35because uh
0:20:37the uh because it's not any printing error that matters the printing error that generated your number
0:20:42is much less likely than you winning the lottery and therefore you do
0:20:45throw the paper up and run down to the lottery office to claim your prize
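The lottery argument can be checked with a small calculation. This is a sketch under stated assumptions that are not in the talk: exactly one winning number out of n equally likely numbers, a misprint rate chosen for illustration, and a misprint that replaces the true number with one of the other numbers uniformly at random.

```python
def prob_really_won(n_numbers, misprint_rate):
    """P(you won | the newspaper prints your number)."""
    p_win = 1.0 / n_numbers
    # If you won, your number is printed unless there was a misprint.
    p_printed_if_won = 1.0 - misprint_rate
    # If you lost, your number is printed only if a misprint happens to land
    # on your particular number (assumed uniform over the other numbers).
    p_printed_if_lost = misprint_rate / (n_numbers - 1)
    numerator = p_win * p_printed_if_won
    return numerator / (numerator + (1.0 - p_win) * p_printed_if_lost)


# Hypothetical lottery with 14 million numbers and a one in a thousand misprint rate.
print(prob_really_won(14_000_000, 1e-3))
```

Under these assumptions the answer works out to one minus the misprint rate: any misprint is far more likely than winning, but a misprint that produces your particular number is far less likely, so the printed number is still very strong evidence.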
0:20:49um but
0:20:50this is a fundamentally a problem
0:20:53that uh
0:20:54some of the
0:20:56more reasoned critics of D N A evidence i don't think we can easily get away with this
0:21:00that uh evidence tampering
0:21:03doesn't avoid this problem because evidence
0:21:05uh tampering does generate
0:21:07the evidence observed
0:21:12she quickly
0:21:14a reasonable person's
0:21:16view of the probability that the police or somebody else
0:21:19tampered with the evidence
0:21:20in some way
0:21:21it's gonna be much greater than a match than a match probability or likelihood ratio
0:21:25in connection with D N A
0:21:29logically i think it is true
0:21:31but because of this
0:21:34this typically will swamp
0:21:36the uh the significance of the match probability
0:21:39of D N A evidence
0:21:40so that
0:21:41if you do get a good D N A profile match
0:21:44the actual number connected with it is pretty meaningless it's
0:21:48it's virtually impossible and it all really comes down to
0:21:51thinking about these kind of alternatives
0:21:54but what an expert can do about that in court
0:21:56is quite uh
0:21:58it's quite difficult you could you you can't even
0:22:01consider putting numbers on this kind of thing of course uh but uh
0:22:07you should alert jurors to this possibility
0:22:09uh and uh
0:22:11that it should be
0:22:13you know weighed
0:22:13in combination with the
0:22:15with the match probability for the D N A
0:22:23i don't have time to go through this here this is also sort of old stuff but this was a sort
0:22:27of fun
0:22:28debate well not fun really 'cause it did get a bit tedious it just went on and on and on
0:22:31and it still goes on this uh argument about the uh
0:22:35effect of the evidence i know some of you have read some of the literature so you'll be aware of these
0:22:38issues but uh
0:22:40uh some of you won't be
0:22:42i imagine case number one
0:22:45uh i just say you know he matches and it's one in a million probability of a match
0:22:49uh case number two i tell you those two facts but also tell you all by the way
0:22:54uh i found him by looking through our database of D N A profiles
0:22:57he was the only match
0:22:59in which case
0:23:00is the evidence stronger case one or case two
0:23:03the classical statistical viewpoint is that case two is uh evidence trolling
0:23:08uh you've gone through the uh
0:23:10uh you know you're going out fishing for hypotheses and we all know about multiple testing and bonferroni
0:23:15kind of thing
0:23:16uh evidence is much weaker
0:23:18if you go fishing for hypotheses
0:23:24uh if the defendant has been identified through a search in a database
0:23:29a D N A intelligence database
0:23:31of known uh
0:23:32previous offenders
0:23:33uh the the evidence is
0:23:37weaker
0:23:37than in a standard case
0:23:39and of course uh that's
0:23:41completely wrong uh this i i think the uh the standard uh
0:23:45statistical reasoning is just uh inappropriate here certainly
0:23:49all of that you know there's a classical argument about frequentist and bayesian views or
0:23:53in the literature but
0:23:54but often the frequentists and the bayesians get to roughly the same place in the end
0:23:58this is one example where they get to very different places
0:24:02and because of the strong logical foundations of the bayesian approach
0:24:06whenever the two of them disagree
0:24:08it's pretty much always the bayesian view that's right
0:24:11uh and
0:24:12and it is
0:24:13yeah what
0:24:15people who are worried about the evidence trolling idea and that weakening the evidence
0:24:20the problem with their approach is they are not
0:24:23being critical enough
0:24:24in the first place
0:24:25because even if i just
0:24:27even if there was no
0:24:29evidence trolling even if there was no database search i still have to logically
0:24:34to prove that this guy committed the
0:24:35crime i have to prove that every other person on earth
0:24:39didn't commit the crime
0:24:40those two
0:24:41formulations are equivalent statements of the problem that's a really tough
0:24:45task and nobody else
0:24:47nobody would have dared to
0:24:48even contemplate that in the past because it was unthinkable that you could prove that everyone else on earth didn't
0:24:53do it
0:24:53but you can now with D N A evidence you can think about it and
0:24:57arguably do it
0:24:58and so
0:24:59any amount of
0:25:01fishing or trolling for hypotheses
0:25:03uh doesn't change that fact
0:25:05you still have to prove that everyone else on earth didn't commit the crime
0:25:08and in fact
0:25:10it makes life
0:25:11better because everyone else in the database didn't match
0:25:14so that helps you in your task of showing
0:25:16that everyone else on earth didn't commit the crime
0:25:18'cause you've got a whole lot of people that you've shown that their profile doesn't match
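A sketch of the argument that a database search helps rather than hurts, under simplifying assumptions that are not spelled out in the talk: everyone in the pool of possible sources is a priori equally likely, the same match probability applies to each untested person, and a non-match in the database excludes that person with certainty. The pool size and database size are hypothetical.

```python
def posterior_after_search(match_prob, n_possible_sources, n_excluded=0):
    """P(defendant is the source) under equal priors.

    n_possible_sources counts the defendant; n_excluded is the number of
    other people whose profiles were searched and found not to match.
    """
    n_untested_alternates = n_possible_sources - 1 - n_excluded
    return 1.0 / (1.0 + n_untested_alternates * match_prob)


# Case one: a match with no database search.
no_search = posterior_after_search(1e-6, 1_000_001)

# Case two: the same match, plus 100,000 database profiles that did not match.
with_search = posterior_after_search(1e-6, 1_000_001, n_excluded=100_000)

print(no_search, with_search)
```

With these numbers the posterior rises from 0.5 to about 0.53: the exclusions remove alternates that would otherwise have to be accounted for, so the case against the one matching individual is slightly stronger, not weaker.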
0:25:25that latter argument is an argument about uh
0:25:28hypotheses rather than uh so this
0:25:31issue about probabilities of evidence versus probabilities of hypotheses
0:25:35i'll say a bit more about in a moment we'd like to separate the two but i
0:25:40insist that fundamentally it's not uh
0:25:42it's not possible to
0:25:43achieve that ideal in many situations
0:25:46that that's not really uh
0:25:50um so
0:25:51some of the things right feel that the
0:25:54the the
0:25:57of the problem
0:25:59even though it's just a sort of standard was no
0:26:01bayes theorem it wasn't obvious for about
0:26:04twenty or thirty years after the collins case when all this academic literature was piling up
0:26:08people didn't get to this position
0:26:10of just writing down bayes theorem and seeing its implications in the way i've described
0:26:15uh nowadays the majority of people even in the field don't
0:26:18uh so don't succeed in understanding the evidence this way but there's a big enough community of us that do
0:26:23uh that that isn't the problem
0:26:25but um
0:26:26there are many many uh
0:26:28problems that remain uh and uh
0:26:32i have already sort of stressed this one but this is one of my key points about
0:26:36the difficulty is that uh
0:26:37you know it's nice to think about a competition between the prosecution hypothesis
0:26:41and the defence hypothesis
0:26:42and uh
0:26:43you know many lawyers have argued with me that that fundamentally
0:26:47is what the whole legal system is based on the competition between two hypotheses
0:26:52and they sort of reject my idea
0:26:54but i've represented at but i
0:26:57claim that this is just a straightforward logical situation that in order to
0:27:01uh establish
0:27:02hypothesis one the prosecution hypothesis
0:27:05you must
0:27:05prove that every other competing hypothesis is false whether or not the defence puts it forward and of course in
0:27:11most uh legal systems the defence don't have
0:27:13to put forward any story at all of course
0:27:15and even if they do put forward a story
0:27:19usually advise
0:27:22the court in a
0:27:23that um the jurors uh in now
0:27:26that um
0:27:27they don't necessarily
0:27:29uh even if they disbelieve the defence story it doesn't necessarily mean the defendant is guilty these are separate questions
0:27:36to be
0:27:37answered separately
0:27:41it's inevitable that the forensic scientist has to make
0:27:47for example implausible hypotheses
0:27:49uh so
0:27:51uh in D N A evidence you've always got the it was my
0:27:54identical twin
0:27:56story which makes D N A evidence uh
0:27:58completely uh
0:28:01uh it's actually quite remarkable
0:28:03how rarely that is used i think that everyone would just laugh it out of court actually identical twins are
0:28:08not rare
0:28:09and it's very hard to prove that you don't have an identical twin
0:28:12uh so if any of you do commit a serious crime and end up in court with D N A evidence i do
0:28:16recommend you try the story that uh and uh
0:28:19uh i i i think logically it's hard to beat uh um
0:28:24the uh but nevertheless in practice acting queueing sixties unfortunately that uh
0:28:30but also in D N A evidence we have the number of contributors to the D N A sample even if there's no more
0:28:35than two alleles at every locus
0:28:37it doesn't follow that there's only one contributor
0:28:40there's no upper bound on the number of contributors
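The point about allele counts can be sketched as a lower bound. Since a person carries at most two alleles at a locus, the largest number of distinct alleles seen at any locus gives a minimum number of contributors, and nothing in the counts gives a maximum. The locus names below are real S T R loci, but the allele values are hypothetical.

```python
import math


def min_contributors(alleles_by_locus):
    """Smallest number of contributors consistent with the observed alleles.

    Each contributor carries at most two alleles per locus, so the bound is
    ceil(max distinct alleles at any locus / 2). There is no upper bound.
    """
    most_alleles = max(len(set(alleles)) for alleles in alleles_by_locus.values())
    return max(1, math.ceil(most_alleles / 2))


# Hypothetical crime-scene profile with at most two alleles at every locus.
profile = {"D3S1358": [15, 16], "vWA": [17], "FGA": [21, 24]}
print(min_contributors(profile))

# The same profile with a third allele observed at one locus.
mixed = {"D3S1358": [15, 16, 18], "vWA": [17], "FGA": [21, 24]}
print(min_contributors(mixed))
```

A result of one says only that a single contributor is possible; two or three people who happen to share alleles would produce the same picture, which is why the choice of how many contributors to consider involves judgement.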
0:28:42uh and
0:28:44i'm uh involved um right in the middle of a court case uh you know that i was giving evidence
0:28:48um on friday and have to go back and continue my evidence tomorrow
0:28:51uh and then the uh i it looks like one contributor to
0:28:55the crime sample
0:28:56and i did some calculations one contributor two contributors
0:28:59and of course that the that i'm now advising the prosecution in this case usually i'm advising the defence
0:29:04uh but the defence of course it jumped up and said you haven't done any calculations for three contributors
0:29:09and and of course i said well
0:29:11you know there's no sign of even two contributors so three contributors is ridiculous and they say ah but
0:29:15you cannot rule out the possibility of three contributors and i have to
0:29:18concede that uh
0:29:19uh i can't you know that's a subjective judgement
0:29:22uh and uh
0:29:24that kind of transgresses this idea of trying to maintain
0:29:29a clear logical distinction between the likelihood ratio
0:29:32uh and the uh
0:29:34and the probabilities of hypotheses
0:29:37and i think
0:29:40in this respect uh it is completely unavoidable uh you can't avoid
0:29:46making judgements about probabilities of hypotheses
0:29:48uh but nevertheless we should maintain the goal of rigorous behaviour which is to try and avoid as far as
0:29:54possible any assumptions about the hypotheses and
0:29:57you know to be aware of them and make them explicit as far as we can
0:30:02the um oh i didn't mention this one as well contamination rates also an issue here
0:30:08there is um
0:30:09sometimes it's easy to get confused with discussions of
0:30:12of priors because there is a
0:30:14prior on the hypothesis
0:30:16uh that he's guilty
0:30:18uh for example
0:30:20and that's very clearly not the business of the expert
0:30:24this is the
0:30:25it is the matter for the
0:30:27finder of
0:30:27fact and you know we have to be very careful in our wording to avoid any suggestion that
0:30:31we're expressing a view
0:30:32uh on the probability that he's guilty either before or after the evidence
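The separation described here, between the likelihood ratio that the expert reports and the probabilities of hypotheses that belong to the finder of fact, can be sketched with the odds form of Bayes' theorem. This is a minimal illustration only: the function names and every number below are assumptions made for the example, not figures from the talk.

```python
def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' theorem: posterior odds = likelihood ratio * prior odds."""
    return likelihood_ratio * prior_odds


def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1.0 + odds)


# The expert supplies only the likelihood ratio; the prior odds are for the court.
lr = 1_000_000.0  # hypothetical likelihood ratio
for prior in (1 / 10, 1 / 10_000, 1 / 10_000_000):  # hypothetical prior odds
    post = posterior_odds(prior, lr)
    print(f"prior odds {prior:.1e} -> posterior probability {odds_to_probability(post):.4f}")
```

The same likelihood ratio leads to very different posterior probabilities depending on the prior odds, which is why the expert's wording has to stay on the likelihood ratio side of the line.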
0:30:36but of course we
0:30:37we include priors for other quantities all the way along in particular the rate of contamination
0:30:42so with low template D N A profiles
0:30:44it's just amazing how difficult it is to get rid of contamination our environment is entirely covered with D N A
0:30:50you know for four meters around me
0:30:52there is my D N A stuff everywhere from my breath uh uh
0:30:56and you know touching things leaves your D N A
0:30:59uh it's a very kind of shocking
0:31:01so when you think about you know this room is entirely covered with D N A
0:31:06uh the um
0:31:07but uh very thin film obviously
0:31:11we cannot ever exclude that
0:31:14some of the alleles we see in a mixed profile got there
0:31:18not through any of the main contributors that we're thinking about as the offender
0:31:22but through some environmental contamination
0:31:25uh that's a really serious issue with low
0:31:28low template D N A profiles but in any case any assessment
0:31:31about contamination rate
0:31:33is effectively a prior judgement
0:31:36based on you know there is
0:31:38is it
0:31:40into that the
0:31:46i yeah
0:31:48as us
0:31:49i've got too much stuff here i won't say very much i haven't said anything really about the
0:31:53technology of the of the D N A profiling
0:31:56um so here's a little bit for those of you who don't know
0:31:59it's just that uh
0:32:01these are short tandem repeat profiles
0:32:03little words of D N A that are repeated a number of times
0:32:06and the number of repeats affects the length
0:32:09uh and the current technology is still not sequence based even though
0:32:13this may change in the future but there's so much investment in this technology now
0:32:17that it's hard to think about changing it
0:32:19we don't actually read the sequence we just
0:32:21measure the length of it in a fragment
0:32:23and the length is measured by running uh these fragments through a gel
0:32:27and there's a laser
0:32:28detector at the finish line and
0:32:31uh from the response
0:32:32the length
0:32:34usually we can interpolate
0:32:36the number of repeats so you might have seven copies
0:32:39of the repeat on one chromosome and nine on the other
0:32:41so your genotype would be represented as seven nine
0:32:44but uh unfortunately for this nice story partial repeats
0:32:48occur so this doesn't mean nine point three as a decimal number it means nine copies
0:32:53of the four base pair repeat and then
0:32:55a three base pair
0:32:58partial repeat
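The allele notation just described can be sketched in a few lines: a label such as 9.3 is read as nine full repeats plus three extra bases, not as a decimal number. The function names and the default four base pair repeat unit are assumptions for illustration; real loci differ in repeat unit length and in flanking sequence, which this sketch ignores.

```python
def parse_str_allele(label: str, repeat_unit_bp: int = 4) -> tuple[int, int]:
    """Split an allele label such as '9.3' into (full repeats, extra bases)."""
    whole, _, partial = label.partition(".")
    full_repeats = int(whole)
    extra_bases = int(partial) if partial else 0
    if not 0 <= extra_bases < repeat_unit_bp:
        raise ValueError(f"partial repeat must be shorter than {repeat_unit_bp} bases")
    return full_repeats, extra_bases


def repeat_region_length(label: str, repeat_unit_bp: int = 4) -> int:
    """Length in base pairs of the repeat region only (flanking sequence ignored)."""
    full_repeats, extra_bases = parse_str_allele(label, repeat_unit_bp)
    return full_repeats * repeat_unit_bp + extra_bases


for allele in ("7", "9", "9.3", "10"):
    print(allele, parse_str_allele(allele), repeat_region_length(allele))
```

Under these assumptions allele 9.3 is one base shorter than allele 10, which is why length measurement has to resolve single-base differences.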
0:33:00but uh
0:33:01nevertheless it is
0:33:02pretty much
0:33:04possible to
0:33:05to say yes no whether the um
0:33:08whether the fragment lengths match
0:33:10uh and this is sort of an idealised view
0:33:12of the
0:33:13electropherogram
0:33:15basically a time series plot as these fragments
0:33:17pass the finish line
0:33:18uh there they are
0:33:20there are
0:33:21dyes you know coloured dyes you can think about
0:33:24uh that distinguish
0:33:26the fragments from different loci
0:33:28uh and then
0:33:29different loci have
0:33:31alleles in different length ranges
0:33:33so that enables you in one test tube
0:33:36to uh
0:33:37analyse
0:33:38ten or twenty uh genetic loci
0:33:42in the current technology we don't i mean we'd love to be able to take into account heights of
0:33:47these peaks
0:33:48uh but we don't we we just use the binary yes no
0:33:51uh there is a peak here
0:33:54and i'll come back to that in a little bit if i have
0:33:56time because with small amounts of D N A that
0:34:02so there's a lot of problems with these issues of where
0:34:05does that where do the probabilities come from
0:34:10uh people have mentioned to me here and it's sort of true that in um
0:34:14you know in D N A
0:34:16it's easy 'cause we've got population genetics theory which generates uh
0:34:21uh which generates probabilities and of course
0:34:24largely population genetics theory is
0:34:26based on one of brno's famous sons of course mendel who was uh here and
0:34:31uh he did his work on the peas here um
0:34:35and but nevertheless
0:34:39labels as applied here
0:34:43near enough to being objective fact
0:34:46uh there's lots of elements of
0:34:47the theory that are subjective
0:34:49the strength of D N A evidence
0:34:51is all about relatedness
0:34:53uh and
0:34:54these questions of independence are all questions about relatedness
0:34:57how you model relatedness
0:34:59is you know a typical story in complex scientific evidence
0:35:03you think about all the people in this room we've got hugely complicated patterns
0:35:07of relatedness think of all my lineages
0:35:11mother father four grandparents great grandparents you know go back five generations we're
0:35:16up to large numbers of ancestors and then any other individual in this room
0:35:20has got
0:35:21you know the same so many lineages up to sixteen great grandparents
0:35:26all those lineages one of my sixteen great grandparents and one of your sixteen great grandparents
0:35:31all meet in a common ancestor at some point in the
0:35:34past and
0:35:35unless you think i i'm an alien from another planet but more or less the uh
0:35:39that's pretty substantial evidence that we all have common ancestors
0:35:45a fully detailed model would specify all the patterns of relatedness for every individual on a
0:35:51and of course that's ridiculous
0:35:52too complicated
0:35:53so we have to make simplifying assumptions
0:35:56and most of the models
0:35:57kind of break
0:35:58relatedness down into three levels known relatedness
0:36:01which is usually you know just one or two generations in the past
0:36:04uh relatedness due to unknown shared ancestors but understood to be
0:36:09on a relatively recent time scale and
0:36:12and how you define recent is how these theories
0:36:16uh and then the completely unrelated case
0:36:18is an idealised
0:36:19case where the ancestors are so far back
0:36:21that it really doesn't matter
0:36:22we can just
0:36:23assume independence
0:36:27they are kind of good enough models even if
0:36:29too few people really understand how they work
0:36:32but you know they're a great
0:36:34simplification of reality
0:36:35and i just wanna emphasise the uh
0:36:38subjectiveness underlying these models
0:36:43there's a lot of argument over the years about independence various independence assumptions that go together
0:36:48and that's of course important
0:36:50for you guys as well uh
0:36:53this is where we do have an advantage that uh in D N A
0:36:57the only dependence that matters
0:36:58is due to relatedness
0:37:01the other important point i want to make of course is that um
0:37:05you know it's kind of a meaningless thing to say A is independent of B
0:37:09for general real world objects
0:37:12uh independence is all about what information you condition on and if you get the conditioning right
0:37:18things are typically independent to a good enough approximation the example i've used is uh
0:37:23is uh
0:37:24reading ability and shoe size in children are not independent
0:37:29the better readers have bigger feet
0:37:32uh and uh that's a very well established fact and you can look at the correlation it's quite strong
0:37:37uh of course they're dependent because of the varying ages both of those things are correlated with age
0:37:42if you uh
0:37:43you condition on age
0:37:44the dependence goes away uh and uh
0:37:48if you condition if you do the right conditioning for D N A evidence
0:37:52uh you
0:37:53can make a reasonable assumption of independence
0:37:55and of course you know
0:37:57if you want to take a contrarian position which of course defences in court sometimes do you can never
0:38:01rigorously prove anything to be independent
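The shoe size and reading ability example can be sketched as a small simulation: both quantities depend on age, so they are strongly correlated overall, but within a single age group the correlation is close to zero. The linear relationships, noise levels, age range and sample size below are all made-up assumptions chosen only to show the effect of conditioning.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_children(n=8000, seed=0):
    """Shoe size and reading score each depend on age but not on each other."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(5, 12)                   # hypothetical age range in years
        shoe = 20 + 1.5 * age + rng.gauss(0, 1)    # hypothetical shoe size model
        reading = 10 * age + rng.gauss(0, 5)       # hypothetical reading score model
        rows.append((age, shoe, reading))
    return rows


rows = simulate_children()
overall = pearson([r[1] for r in rows], [r[2] for r in rows])
eight_year_olds = [r for r in rows if r[0] == 8]
within = pearson([r[1] for r in eight_year_olds], [r[2] for r in eight_year_olds])
print(f"correlation over all children: {overall:.2f}")
print(f"correlation among eight year olds only: {within:.2f}")
```

Conditioning on age removes the dependence in this toy model, which is the sense in which independence depends on what information is conditioned on.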
0:38:10what matters fundamentally in the match probability is a statement like this
0:38:14at a single locus the probability that an unknown individual X
0:38:17has genotype A B
0:38:28okay good
0:38:29just a speck
0:38:34what matters is affected by this conditioning and of course the probability that this guy's got A B given that
0:38:40this guy's got A B
0:38:41depends on the on their relatedness
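One widely used way to write such a conditional match probability is the Balding-Nichols formula, in which a parameter theta stands for the average relatedness in the relevant population. No formula is shown at this point in the talk, so the choice of formula, the function name and the allele frequencies below are assumptions made for illustration.

```python
def conditional_match_probability_het(p_a: float, p_b: float, theta: float) -> float:
    """P(unknown person is A B | defendant is A B) under the Balding-Nichols model.

    With theta = 0 this reduces to the unconditional product 2 * p_a * p_b.
    """
    numerator = 2.0 * (theta + (1.0 - theta) * p_a) * (theta + (1.0 - theta) * p_b)
    denominator = (1.0 + theta) * (1.0 + 2.0 * theta)
    return numerator / denominator


p_a, p_b = 0.1, 0.2  # hypothetical allele frequencies
for theta in (0.0, 0.01, 0.03):  # hypothetical relatedness values
    prob = conditional_match_probability_het(p_a, p_b, theta)
    print(f"theta={theta:.2f}: match probability {prob:.4f}")
```

Even a small theta raises the match probability at a single locus, and the effect compounds when probabilities are multiplied over many loci.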
0:38:44what doesn't matter
0:38:46and was argued about at great length
0:38:48is the dependence or otherwise of the two alleles
0:38:50within a locus
0:38:52so-called hardy weinberg equilibrium i mean again uh
0:38:56much discussion about this
0:38:57it's relatively unimportant
0:38:59i've
0:39:01emphasised in my writing this conditioning because that's what i see as the important one
0:39:05but of course there's a lot of other stuff in the conditioning as well all kinds of
0:39:08assumptions and
0:39:09background data
0:39:10uh that you are relying on and i'll i'll say more about that in a moment
0:39:15now i see i'm going to take uh
0:39:17i didn't intend to
0:39:18uh following because advice and uh use up all my time and not leave any for discussion but
0:39:23uh there's a bigger
0:39:25this is where i say that uh
0:39:27what i've been doing
0:39:29is fundamentally non bayesian although based on bayes theorem
0:39:32because uh all of these theories require
0:39:35uh parameter estimates
0:39:36and and everybody likes to put in plugin estimates
0:39:40the simplest thing to do
0:39:41but also
0:39:42you know
0:39:43you can think about what the different parameter estimates are and change them
0:39:48the parameters for us
0:39:49are really allele frequencies
0:39:51uh and this population genetics parameter which is the average
0:39:55relatedness in a community
0:39:57uh we've got various estimates of these
0:40:03we like to use
0:40:03plugin estimates it has an advantage that it keeps the
0:40:06the evidence specific to the case over here and all your training and background data that feed into the plugin
0:40:11estimates over there
0:40:13of course the ideal
0:40:15and and again this is the bayesian
0:40:17position would be to integrate out the unknowns
0:40:21to in a sense
0:40:22combine the data so uh and and
0:40:25you know phil dawid in
0:40:27london writes
0:40:27papers trying to do this where
0:40:29the you know the
0:40:30the data you condition on is not just the defendant's
0:40:33profile but all the profiles you've seen before
0:40:36uh that uh that form the background information
0:40:40so while recognising this ideal i think it is just a bit too complicated
0:40:46and so i have
0:40:47adopted a sort of compromise
0:40:49of using plugin estimates
0:40:51but in recognition
0:40:53uh of all these uh problems
0:40:57the about the um
0:40:59the expectation of a high power can be much greater than the power of the expectation
0:41:04so this is why putting in plugin estimates
0:41:07uh something like maximum likelihood estimates
0:41:10can be really really misleading uh because you know the effect of uncertainty is not symmetric
0:41:16uh when you've got high powers and products
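The point about expectations and powers can be checked numerically. As a sketch, suppose the uncertainty about an allele frequency p is described by a Beta(a, b) distribution; then the expectation of p to the power k has a closed form that can be compared with the k-th power of the expected value of p. The Beta assumption, the parameter values and the exponent are illustrative choices, not values from the talk.

```python
def beta_mean_of_power(a: float, b: float, k: int) -> float:
    """E[p**k] for p ~ Beta(a, b): the product over i < k of (a + i) / (a + b + i)."""
    result = 1.0
    for i in range(k):
        result *= (a + i) / (a + b + i)
    return result


a, b = 5.0, 45.0   # hypothetical uncertainty about p, with mean a / (a + b) = 0.1
k = 20             # hypothetical high power, as when multiplying over many alleles

power_of_expectation = (a / (a + b)) ** k
expectation_of_power = beta_mean_of_power(a, b, k)
print(f"power of the expectation: {power_of_expectation:.3e}")
print(f"expectation of the power: {expectation_of_power:.3e}")
print(f"ratio: {expectation_of_power / power_of_expectation:.3e}")
```

Plugging a central estimate of p into a high power therefore understates the match probability, which is one argument for choosing a plugin value near the top of the plausible range.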
0:41:21that's right you know there's again a vast amount of wasted literature in this
0:41:25field like there is in any academic field so you have people talking about how to do
0:41:29maximum likelihood estimates of these plugin parameters
0:41:33and it's just a complete waste of time because the maximum likelihood estimate or any kind
0:41:37of sensible estimate in the middle of the distribution is hopelessly wrong
0:41:40uh because of this problem here
0:41:44but i haven't really got a very good solution i just say well we want something near the top of
0:41:49the plausible range like a ninety eight or ninety percent or something like that
0:41:53uh although of course i haven't really got any formal
0:41:56justification for doing that
0:42:01a lot more to say what should i choose to include
0:42:04i can't resist talking a little bit i talked about
0:42:07the the probabilities coming from
0:42:10uh theories
0:42:12uh population genetics theories which sound very grand and i can easily get them past you know a courtroom
0:42:17who never sort of question me about any of these things but ultimately when you look into them
0:42:21it's all full of subjectivity and judgements and
0:42:24and i've chosen this theory and not that theory and so on
0:42:29of course many people are unhappy with that kind of subjective element and they want something rigorous
0:42:33and one way and again the sort of classical statistical position
0:42:37to get a kind of rigorous probabilities
0:42:39uh is to put it in the context of random sampling so there's a lot of literature out there and
0:42:43a lot of
0:42:44um critical thinking which is based around the idea
0:42:48that the suspect has been chosen randomly in a population
0:42:51uh now i've already talked about evidence tampering and uh relatively high probability that the police could fiddle with the
0:42:57but the possibility that the
0:42:59police are capable of uniform random sampling is completely ridiculous i don't want to accuse them of that of course
0:43:05all the experts
0:43:06in the world find it very difficult to do uniform random sampling
0:43:11uh and so
0:43:13you know many people think the
0:43:15uniform random sampling idea puts the
0:43:18field on a rigorous footing because these are objective probabilities
0:43:22uh and i would say yes
0:43:24objective but clearly nonsense uh the police haven't
0:43:26sampled suspects
0:43:27randomly
0:43:28this is just a completely made up assumption
0:43:30uh which is uh
0:43:34i i mean i'm probably overdoing it here i mean it is the kind of assumption that
0:43:37that people make and for good reason in some settings
0:43:40i don't think we need to make it here
0:43:42and it does lead to lots of
0:43:44problems and in particular
0:43:46the sort of
0:43:47endless endless arguments about in which population has the suspect been randomly chosen
0:43:53and i say
0:43:54here that you know because there is no such
0:43:56sampling and there is no such population it's like arguing over the number of wings on the tooth fairy
0:44:02yeah that that you know there is there is no such object so there's no point arguing about the properties
0:44:08you know there is a real fundamental problem here that the more narrowly you define the population
0:44:13the better it is for the defendant
0:44:15and we usually try to sort of
0:44:17lean in the defendant's direction but the only logical endpoint of this
0:44:21is uh
0:44:22the population of size one that includes the defendant and has a hundred percent frequency for the uh for the
0:44:27defendant's profile which of course is useless uh
0:44:30and of course
0:44:31so you get
0:44:32you lose all these nice advantages
0:44:34of the
0:44:35bayesian formulation because this random sampling hypothesis
0:44:38i don't suppose you could do it
0:44:39i mean you just can't do it in a bayesian way because it's just a ridiculous hypothesis that's got nothing
0:44:43to do with the
0:44:44with the with the evidence
0:44:46and all this stuff which works well in the
0:44:49framework i've been telling you about
0:44:51is hard to do in this
0:44:52in this setting
0:44:56this is an old topic of
0:44:57mine and i will skip over this one that the the the U S
0:45:01national research council did a report maybe fifteen years ago it still holds absolute sway in the U S
0:45:07it's all
0:45:08it's all based on this random man hypothesis
0:45:10and it's all
0:45:11kind of
0:45:12riddled with errors but it's interesting in the sort of social psychology of the field
0:45:17we had huge arguments about D N A evidence in the early nineteen nineties
0:45:21and in nineteen ninety six the mood was just right
0:45:23to kind of settle on a compromise
0:45:25and so the authority of the
0:45:27national research council in the U S was
0:45:29was such that everyone kind of lead on this
0:45:32uh and
0:45:33in some kind of prey
0:45:34consensus it sort of worked you know D N A evidence
0:45:37based on this has convicted a lot of people in the U S and they're probably all guilty
0:45:41but the fact is that it's completely riddled with misunderstandings and errors and uh and
0:45:46and the evidence is being overstated
0:45:48in almost every
0:45:49court case in the U S
0:45:51involving D N A evidence the evidence is routinely overstated because of
0:45:57you know the truth is that the evidence was probably pretty strong anyway and this is why we haven't had
0:46:01the kind of gross
0:46:03miscarriages of justice
0:46:04coming to light
0:46:06that would
0:46:07challenge these flaws
0:46:09so i won't go into that but all these things i've been talking about they
0:46:12they use on the stored
0:46:13but really the important thing was this uh population genetics
0:46:18what we
0:46:19care about is the conditional match probability
0:46:22but what they cared about was just the marginal probability
0:46:25and everything all the population genetics
0:46:27issues are in this conditioning
0:46:29uh and so
0:46:30by leaving that out
0:46:32they had a whole lot of population genetics experts on this committee they had big chapters on population genetics and completely missed
0:46:38the point
0:46:39uh and gave completely misleading recommendations
0:46:44i yeah i warned you that i tried to have
0:46:47too much material in this talk
0:46:48and um
0:46:50i've withdrawn these topics but i just wanna bring you up to date with some of the uh
0:46:54latest 'cause everything i've been talking about today i could've
0:46:56talked about years ago it's sort of the arguments of the nineties
0:47:00uh and uh early
0:47:01uh two thousands
0:47:03but what's really
0:47:05really come to a crunch this year in particular is what to do about this low template D N A
0:47:10way down to
0:47:11getting D N A from samples of just two or three cells and so there's huge stochasticity in the results
0:47:18the uh and of course many
0:47:20jurisdictions just say this is way too complicated and
0:47:23we don't want to touch this
0:47:25uh but
0:47:25more and more particularly in the uk more and more people are
0:47:29and it ended in and and uh it is potentially you know it doesn't mean
0:47:32that just from the slide
0:47:34it's rather than collect a fingerprint
0:47:36uh it's
0:47:38can be strong evidence to collect
0:47:40D N A from this way
0:47:42i think
0:47:45but we get all these kind of stochastic features
0:47:48i've got some slides yeah one huh
0:47:50time to
0:47:50but um
0:47:51these peaks that i showed you about you get
0:47:54so the top half is with a
0:47:56good amount of D N A
0:47:57and this is with a sort of moderately low amount of D N A and you get all these features
0:48:01like uh
0:48:02peak imbalance but most worryingly
0:48:04complete dropout of an allele there are two peaks there
0:48:07but only one showed up here
0:48:09and that's
0:48:11the the P C R reaction that underlies
0:48:13the whole thing
0:48:15with so few cells involved it can just completely fail if there's some uh
0:48:19uh you know mutation in the primer or something else goes wrong
0:48:23and you can get drop-in of contaminant alleles
0:48:26you would have thought that
0:48:27these high tech uh
0:48:29laboratories could keep
0:48:31the lab D N A free but it's absolutely impossible even just you know the plasticware
0:48:36that uh people use
0:48:39it's full of D N A just because D N A is everywhere in our environment
0:48:43uh it's impossible to keep it out
0:48:47so the little bit here about the
0:48:50the various
0:48:51so uh
0:48:52where to draw these thresholds
0:48:54uh that are being used you can see that
0:48:56the way the evidence is
0:48:57analysed is quite crude
0:48:59uh but this threshold means anything below this doesn't count
0:49:03so this he he is very strong evidence could be against individual but that peak is now we have
0:49:08so tall
0:49:09because it's uh because of the thresholding effect
0:49:11but this is
0:49:13the threshold
0:49:14for where there's a single peak above this
0:49:16we assume there's enough
0:49:18D N A that a
0:49:19partner hasn't dropped out and there's only one allele a true homozygote
0:49:23but a single peak below this such as that one there
0:49:28the black one uh it doesn't have a partner
0:49:30but because it's below the threshold
0:49:32it's considered a dropout all of this is sort of arbitrary and very unsatisfactory but that's
0:49:36where we're at at the moment
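The two-threshold reading of a locus described here can be sketched as a simple classification rule. The function name, the returned wording and the upper threshold of 150 are hypothetical; the lower value of 50 follows the detection threshold mentioned for the court case in this talk. This is an illustration of why the rule is crude, not a description of any laboratory's actual procedure.

```python
def classify_locus(peak_heights: dict[str, float],
                   detection_threshold: float = 50.0,
                   homozygote_threshold: float = 150.0) -> str:
    """Apply a binary yes/no reading to the peaks observed at one locus.

    Peaks below detection_threshold are ignored entirely. A single called peak
    above homozygote_threshold (a hypothetical value) is read as a true
    homozygote; a single called peak below it may have lost its partner.
    """
    called = {allele: h for allele, h in peak_heights.items() if h >= detection_threshold}
    if not called:
        return "no alleles called"
    if len(called) == 1:
        (height,) = called.values()
        if height >= homozygote_threshold:
            return "single allele, treated as homozygote"
        return "single allele, partner may have dropped out"
    return f"{len(called)} alleles called"


print(classify_locus({"13": 54.0}))   # just over the detection threshold
print(classify_locus({"13": 49.0}))   # just under it: the peak does not count at all
print(classify_locus({"12": 400.0}))
print(classify_locus({"12": 300.0, "13": 280.0}))
```

A peak of height 54 and a peak of height 49 are treated completely differently under such a rule, which is the arbitrariness being criticised.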
0:49:38uh i won't say so much about that case now
0:49:41'cause i'm running out of time but this
0:49:43is a zoom in on what the electropherograms actually look like
0:49:47and with these low amounts of D N A
0:49:49it's quite noisy this thirteen allele turned out to be quite important
0:49:53and at that time in this court case fifty
0:49:56on this axis was regarded as the threshold and you can see the allele
0:49:59thirteen
0:50:00reached a peak height of fifty four on this one run out of dozens and dozens of reruns of different
0:50:05samples from the crime scene
0:50:07that was the only time that it reached above fifty
0:50:10but it counted
0:50:11as a
0:50:11full allele and this
0:50:13because it's a rare allele turned out
0:50:15the strongest evidence against this guy
0:50:17so you can see what
0:50:18this peak here much bigger than that one
0:50:20is of no evidential value that's just an experimental
0:50:24artefact called stutter
0:50:27one yeah
0:50:28and this one here are assumed to be just background noise
0:50:31uh and so you can
0:50:33see that it's quite a sensitive issue about whether that's a real peak
0:50:37um but uh nevertheless it was counted as such
0:50:41and in that
0:50:42there were some three
0:50:45alleles that shouldn't be in there if the defendant really was
0:50:49the contributor of the
0:50:49sample and we had a lot of argument about
0:50:52how to deal with this
0:50:57the standard
0:51:01way of analysing this problem is a kind of version of the random man idea you work out the probability that
0:51:07a guy chosen at random in the population would be excluded by the evidence
0:51:12and there's a huge amount of problems with this
0:51:14you probably gathered i'm not a fan at all
0:51:16of this approach
0:51:18and i've got a whole list
0:51:19here of things that are wrong but i'm sort of rushing at the end of my talk
0:51:23uh so i won't go
0:51:24into any uh detail other
0:51:26than to say that in the
0:51:29the whole idea of inclusion and exclusion don't apply anymore when we've got a small amounts of D N A
0:51:35and uh
0:51:38just one of many uh problems with this approach
0:51:42and i was gonna talk you through a little bit
0:51:45of the
0:51:47how to work through a likelihood ratio in this problem in the way that i would
0:51:51think is at least uh
0:51:52somewhat acceptable
0:51:53but i want to
0:51:55i won't go into that so they're all these issues about modelling dropout
0:51:59but i'm going to skip over
0:52:06quite important with low level cases is masking you often have D N A from a victim which
0:52:11is at high level
0:52:12uh and it could be masking an allele from uh
0:52:15from the true uh
0:52:19so we need to take that into account
0:52:21drop in
0:52:22uh i've got just some little
0:52:24simulation results here that show no matter how much you feel this
0:52:27two P rule which is part of the
0:52:29random man idea is always claimed to be conservative
0:52:33it's not
0:52:34so these probabilities under various
0:52:36sorry likelihood ratios which are smaller than the likelihood ratio under the
0:52:40two P rule
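The claim that the two P rule is not conservative can be illustrated with a deliberately simplified single-locus model. Assume the crime scene profile shows only allele A, the suspect is heterozygous A B, there is one contributor, no drop-in and no relatedness correction. Assume also that each allele of a heterozygote drops out independently with probability d, and that a homozygote drops out with probability alpha times d squared. The model, the function names and all numbers are assumptions for illustration; the simulation results referred to in the talk are not reproduced here.

```python
def lr_two_p_rule(p_a: float) -> float:
    """Likelihood ratio implied by assigning match probability 2 * p_a."""
    return 1.0 / (2.0 * p_a)


def lr_dropout_model(p_a: float, d: float, alpha: float = 0.5) -> float:
    """Likelihood ratio when only allele A is seen and the suspect is A B.

    d     : dropout probability for one allele of a heterozygote (assumed)
    alpha : homozygote dropout is taken to be alpha * d**2 (assumed)
    """
    d_hom = alpha * d * d
    # Suspect is the source: allele B dropped out, allele A did not.
    numerator = d * (1.0 - d)
    # Unknown source: either A A with no dropout, or A plus another allele that dropped out.
    denominator = p_a ** 2 * (1.0 - d_hom) + 2.0 * p_a * (1.0 - p_a) * d * (1.0 - d)
    return numerator / denominator


p_a = 0.1  # hypothetical allele frequency
print(f"two P rule likelihood ratio: {lr_two_p_rule(p_a):.2f}")
for d in (0.01, 0.1, 0.5, 0.9):
    print(f"dropout {d:.2f}: likelihood ratio {lr_dropout_model(p_a, d):.2f}")
```

In this toy model the likelihood ratio is below the two P value for every dropout probability tried, and when dropout is unlikely it falls below one, so the single peak counts in the suspect's favour.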
0:52:41uh and i'll skip
0:52:43all of that oh i see this one is quite interesting if i
0:52:47it's quite interesting
0:52:49this is about
0:52:50what happens
0:52:51if the crime scene profile is null
0:52:54and if the defendant is heterozygous
0:52:57uh i'd say that's slight evidence against
0:52:59such a defendant so
0:53:01the typical position of almost everyone in the field would be to say that if the crime scene profile is
0:53:05null there's nothing there
0:53:06we can ignore it
0:53:08uh i say that's slightly incriminating
0:53:12if you didn't see anything it's more likely that the offender was heterozygous
0:53:16and so if your defendant is heterozygous that's slight evidence against him
0:53:20it's slight evidence in his favour if he's homozygous
0:53:22but if there's masking
0:53:24it can be dramatic evidence in favour of
0:53:26and that's sometimes
0:53:28not appreciated
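The argument about a locus where nothing is observed can be sketched with a toy calculation. Assume that both alleles of a heterozygote drop out independently, each with probability d, that a homozygote drops out with probability alpha times d squared with alpha below one, and that a fraction of the population given by the heterozygosity is heterozygous at the locus. These modelling choices, the function name and the numbers are assumptions for illustration, and masking is not modelled.

```python
def lr_null_locus(defendant_heterozygous: bool, heterozygosity: float,
                  d: float, alpha: float = 0.5) -> float:
    """Likelihood ratio for a locus at which no alleles were observed."""
    p_null_het = d * d            # both alleles of a heterozygote drop out
    p_null_hom = alpha * d * d    # assumed homozygote dropout probability
    numerator = p_null_het if defendant_heterozygous else p_null_hom
    denominator = heterozygosity * p_null_het + (1.0 - heterozygosity) * p_null_hom
    return numerator / denominator


h = 0.8   # hypothetical heterozygosity at the locus
d = 0.6   # hypothetical single-allele dropout probability
print(f"heterozygous defendant: {lr_null_locus(True, h, d):.3f}")
print(f"homozygous defendant:   {lr_null_locus(False, h, d):.3f}")
```

Under these assumptions the likelihood ratio is slightly above one for a heterozygous defendant and below one for a homozygous defendant, so a null locus is not strictly neutral.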
0:53:32i will
0:53:34do you have
0:53:36hesitate just like maybe on this case because
0:53:39it's sort of remarkable the idea
0:53:41is sometimes suggested that uh all the problems are solved
0:53:45in the uh
0:53:46in the D N A
0:53:47field and this is an example
0:53:48of what seems to me the most kind of scandalous uh
0:53:52uh miscarriage of justice
0:53:53as i understood the case
0:53:56the uh
0:53:57there was a known contributor to the sample
0:53:59uh and the case revolved around whether this whether or not this
0:54:03the
0:54:04suspected contributor was actually a true contributor
0:54:07this is what was seen in the crime scene profile you see several dashes which mean nothing was observed
0:54:12so both contributors were at very low levels of D N A
0:54:16and we have a substantial amount dropped out
0:54:18this was the sort of random man not excluded probability reported in court
0:54:23a total of one in ninety six thousand
0:54:25it seems to have been convincing enough for the guy to get
0:54:27convicted
0:54:28uh but if you start looking closely at this
0:54:32and there's some really uh
0:54:34scandalous things going on here
0:54:35uh look at this twelve and thirteen that was in the in the crime scene sample
0:54:40it's exactly the same as the genotype of the known contributor
0:54:43so arguably this is no evidence at all it's just reflecting the known contributor it doesn't tell us anything
0:54:49but the relevant likelihood ratio used for that locus was six point five
0:54:53uh because that's what you get from this
0:54:55random man not excluded formula which is completely illogical
0:54:58uh and completely misrepresents the evidence
0:55:01and uh when i
0:55:04the sort of likelihood ratio based theory
0:55:07that i'm talking about so i got some criticisms of the methods here
0:55:11i could
0:55:13modify the random man not excluded formula to be a bit more reasonable instead of ninety six
0:55:18thousand i would've got eight
0:55:19uh but when i did a likelihood ratio calculation
0:55:22that allows for example evidence to favour the defendant
0:55:25at some loci with likelihood ratio less than one
0:55:28i come up
0:55:29instead of ninety
0:55:30six thousand with a likelihood ratio of two
0:55:32uh this is
0:55:33you know virtually useless and
0:55:35i'm the worst uh
0:55:37uh we can study and i haven't i don't come across this really hardly any
0:55:41information there's literally three alleles
0:55:44in all of this
0:55:45that are attributable
0:55:46to this person and not to that person
0:55:49so it's really uh kind of uh
0:55:51shockingly weak evidence
0:55:55misunderstood and misrepresented in court
0:55:58uh and the guy was found guilty
0:56:02my uh
0:56:03conclusion as i said i had hoped
0:56:05to come back to draw more explicit parallels with
0:56:09voice problems but i'd i didn't really feel confident uh to do that
0:56:13uh i
0:56:15can tell you that uh
0:56:17there's a lot of progress being made with D N A evidence the situation is much better than it used to be
0:56:21uh but as the previous case just shows there's a lot still wrong
0:56:26and uh
0:56:27much remains unsatisfactory
0:56:32there are some fundamental problems with the logical approach
0:56:36that uh
0:56:38to which i don't think there's ever going to be a really satisfactory
0:56:40solution but it nevertheless
0:56:42provides the most useful framework for
0:56:46so i should stop
0:56:58very much
0:56:59you could be going on
0:57:02the i think we can
0:57:04a few minutes to
0:57:15i work with
0:57:16consuming no pollution
0:57:18automatic systems
0:57:22usually we select
0:57:24speaker comp
0:57:25from cool
0:57:29recuse work
0:57:36so when we do
0:57:39to me oh cues
0:57:43equipment we have
0:57:47different speakers
0:57:51well obviously it's um
0:57:53it's difficult to get it right and it seems to me that this is you you have to do some
0:57:59some version of this calibration on the basis of many
0:58:04speakers but
0:58:05but but let me see if i understand your question i mean the problem is about the limited
0:58:10selection of comparison
0:58:12is that what you say
0:58:13not mutation but um
0:58:16uh usually
0:58:19the question is
0:58:21should we only
0:58:23different speaker comp
0:58:25you know
0:58:26all speakers
0:58:27who sound simple
0:58:29the keys
0:58:30the case
0:58:31no obvious
0:58:33comes to us with
0:58:34two totally different sounding speakers
0:58:39uh i see that um
0:58:41well i i can i i
0:58:44do you
0:58:45you know the issues that i had to worry about a a quite a distinction is some overlap but there
0:58:49are fundamental uh
0:58:51differences and um
0:58:53i would
0:58:54that is that's obviously
0:58:57somewhat uh
0:58:58unsatisfactory but nevertheless i can see that it's going to sort of
0:59:02bias you in a difficult
0:59:04in in a bad direction because this is most
0:59:07this is the most challenging situation
0:59:09to distinguish the similar sounding voices
0:59:12and um
0:59:14any and by biasing harrington that should be a good bye
0:59:17i would
0:59:17it makes it more difficult for you to um
0:59:22or whatever get produce evidence for identity
0:59:26but what about all this
0:59:27summation of all
0:59:28accuracy and precision
0:59:34speaker so maybe we want to
0:59:37during to use
0:59:39hmmm sounds
0:59:41oh yeah
0:59:42more easy
0:59:46but if you're trying to distinguish
0:59:49same source
0:59:50from different sources if the different sources that you use
0:59:53are different but similar
0:59:56that makes that a hard comparison not easy so
0:59:59that's what i was suggesting that um
1:00:02there should be a
1:00:03yeah that's this
1:00:05it's good that you you have to do one
1:00:08something like we would like what you were doing it would be nice to have
1:00:12probably well designed experiments where you have uh
1:00:15speakers that are similar and
1:00:17speakers that are more different
1:00:18uh and you can see the range of differences um you know i have emphasised a lot the role of
1:00:23relatedness for D N A evidence but i don't know well
1:00:27you know of relative you know distinguishing
1:00:29brothers speaking for example whether that's harder than for unrelated individuals
1:00:34um but it
1:00:37ideally you'd like to
1:00:38be able to consider all those
1:00:42when you say um
1:00:43you might be overstating the precision
1:00:45but ultimately
1:00:46you want to process your
1:00:52and if you've given yourself harder task
1:00:55by having the different speakers being somewhat simple
1:00:58i like it
1:01:00the other way around
1:01:02you should
1:01:04you know
1:01:05so we can be more
1:01:11yeah okay maybe i missed something in the problem 'cause it does seem to me a harder task if you had very
1:01:14different sources you could distinguish them quite easily
1:01:17uh and so that's an easy task
1:01:19if you have similar sources trying to distinguish them is hard so you have given yourself a harder task it seems
1:01:24to me unless i've missed something
1:01:28maybe maybe we can chat a bit more later and get to the bottom of this
1:01:40i think if we don't
1:01:42my question
1:01:43tries to the conventional you should be using
1:01:47yeah imagine we we
1:01:49yeah but uh
1:01:50speech lab
1:01:51which analyze the highest you
1:01:53to present evidence
1:01:55and is
1:01:56which would be that uh we have a
1:01:58with the recording
1:02:04um we have
1:02:05we we are able to estimate multiple
1:02:12do you have
1:02:13you know actually you know not
1:02:15you would
1:02:16imagine that
1:02:24and correlation with them but you cannot
1:02:28you like
1:02:29to to make the problem but you
1:02:31kind of
1:02:32uh those uh
1:02:34multiple issues
1:02:36all of them
1:02:37small values maybe ten
1:02:40more than one thousand
1:02:44the idea is
1:02:45how to present that
1:02:46i didn't see what would you have
1:02:51i mean it was
1:02:54do you cannot
1:02:55the proof
1:02:57this do you
1:02:59the independence
1:03:04okay that is it
1:03:05an interesting and uh
1:03:07difficult question and i
1:03:09i feel
1:03:10instinctively as i was saying earlier that um
1:03:13you know what
1:03:15the right framework and some independence assumption you know ought to be uh
1:03:19more or less reasonable and you can never prove independence it does always uh you know
1:03:23like people
1:03:24spent a long time trying to prove
1:03:26independence of
1:03:28different alleles in D N A profiles
1:03:30it's a
1:03:31it's a futile um
1:03:33exercise
1:03:35ultimately
1:03:36but um
1:03:38so but if you if
1:03:41is a really serious problem i am M
1:03:44i'm just trying to think
1:03:46i need i need to understand what the dependence structure is to
1:03:49really help but if the dependence is a real
1:03:54insurmountable problem then i think you're stuck really i don't i really can't see how
1:03:59to make use
1:04:00of the
1:04:02multiple level
1:04:03because obviously you know if you did have independence you can multiply likelihood ratios and everything is uh
1:04:08it's uh
1:04:09is easy
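The point about multiplying likelihood ratios under independence can be sketched in a few lines. This is only an illustration, assuming the individual likelihood ratios are conditionally independent under both hypotheses; the function name and the example values (10, 10 and 1000) are hypothetical and not taken from any real casework.

```python
import math

def combine_independent_lrs(lrs):
    """Combine likelihood ratios ASSUMING independence under both hypotheses.

    Works in log10 space so that many small or large values do not
    underflow or overflow. Returns (log10 of combined LR, combined LR).
    """
    if not lrs or any(lr <= 0 for lr in lrs):
        raise ValueError("need at least one strictly positive likelihood ratio")
    log10_total = sum(math.log10(lr) for lr in lrs)
    return log10_total, 10.0 ** log10_total

# Hypothetical per-feature likelihood ratios (illustrative values only).
example_lrs = [10.0, 10.0, 1000.0]
log10_lr, combined_lr = combine_independent_lrs(example_lrs)
print(f"log10 LR = {log10_lr:.2f}, combined LR = {combined_lr:.0f}")
```

If the independence assumption does not hold, this product is not a valid likelihood ratio for the joint evidence, which is the difficulty being discussed.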
1:04:11the um or
1:04:14the analogy with um
1:04:17with D N A evidence is that is that the relatedness is the right
1:04:20thing to condition on
1:04:21things become uh
1:04:23can be independent once you've got the right
1:04:28conditioning but in general
1:04:32kind of model where there's some kind of latent variables so essentially
1:04:36relatedness is a latent variable
1:04:38uh and some kind of model you i
1:04:41there is
1:04:42a latent variable that
1:04:45encapsulates the common features of the different recordings
1:04:48to generate
1:04:51if you can
1:04:53uh condition on that latent variable and then integrate it out in some way or deal with it in some appropriate
1:04:59uh you know i feel as if there should be some modelling approach like that
1:05:02that would work and allow you to
1:05:04then make
1:05:05an independence
1:05:06assumption i mean just you know in general modelling
1:05:08dependent data this
1:05:10kind of um
1:05:11random effects models type things work
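A toy version of the latent-variable idea, as a sketch only. It assumes a Gaussian random-effects model that is not from the talk: each measurement is a hypothesis mean plus a shared latent effect b plus independent noise. Given b the measurements are independent, so the joint likelihood is obtained by conditioning on b and integrating it out numerically. All names and parameter values are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def joint_likelihood(ys, mu, sigma, tau, steps=4000):
    """p(ys | H): condition on the shared latent effect b, then integrate it out.

    Assumed model: y_i = mu + b + e_i, b ~ N(0, tau^2), e_i ~ N(0, sigma^2).
    Simple Riemann sum over b on [-8*tau, 8*tau].
    """
    lo, hi = -8.0 * tau, 8.0 * tau
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        b = lo + i * h
        cond = 1.0
        for y in ys:  # independent once b is fixed
            cond *= normal_pdf(y, mu + b, sigma)
        total += cond * normal_pdf(b, 0.0, tau)
    return total * h

def naive_likelihood(ys, mu, sigma, tau):
    """Product of marginal densities, wrongly treating the y_i as independent."""
    sd = math.sqrt(sigma ** 2 + tau ** 2)
    return math.prod(normal_pdf(y, mu, sd) for y in ys)

# Hypothetical values: three measurements, same-source mean 0, different-source mean 2.
ys, mu_same, mu_diff, sigma, tau = [0.2, -0.1, 0.3], 0.0, 2.0, 1.0, 1.0

joint_lr = joint_likelihood(ys, mu_same, sigma, tau) / joint_likelihood(ys, mu_diff, sigma, tau)
naive_lr = naive_likelihood(ys, mu_same, sigma, tau) / naive_likelihood(ys, mu_diff, sigma, tau)
print(f"joint LR (latent effect integrated out): {joint_lr:.2f}")
print(f"naive product of marginal LRs:           {naive_lr:.2f}")
```

In this toy model the naive product comes out larger than the joint likelihood ratio, because the shared effect means the three measurements carry less information than three independent ones would.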
1:05:14and i would go back
1:05:16you have you you have to explore
1:05:17to the extent where you can be
1:05:19reasonably confident about independence assumption and give
1:05:22some good arguments for it i mean i've always
1:05:25known i can still never
1:05:28prove independence any of the independence
1:05:30assumptions i make for the D N A profiles
1:05:32but i just try working from uh
1:05:35from reasoning that
1:05:36you know relatedness is just you know
1:05:39and if we model that we should have solved it
1:05:42and i i would have thought that some kind of avenue like that
1:05:44is the only option
1:05:46well about can can you tell me briefly what is the cause of the dependence that uh
1:05:50the dependence is what you see yeah we have
1:05:53as for
1:05:53two can be analysing different phones
1:05:57personally depends
1:05:59well then
1:06:00have different characters
1:06:01but a lot and come from the same source
1:06:04but once you conditioned on it being the same source
1:06:07uh anyway i think there's a modelling answer but if you really can't
1:06:12the dependence with some kind of modelling answer then i think you know you do
1:06:16have a real fundamental problem because ultimately you're gonna say well
1:06:19if there is dependence then how big could it be and uh
1:06:22if you can't really quantify that in some way then
1:06:25i don't think you can usefully give multiple likelihood
1:06:27ratios and
1:06:29court will figure it out
1:06:33you've got to do the work
1:06:37but you do with a
1:06:38people like working tonight
1:06:40for historical reasons B C we should emulate D N A yeah
1:06:45you've gone through to
1:06:46oh you're all the problems of the yeah yeah
1:06:48oh i i did i think you're doing exactly the right thing i wouldn't disagree with the strategy that order
1:06:52that's what i wanted to ask them what you
1:06:55should we still be saying we should ideally i i
1:06:58i think so yeah something where a lot for all these
1:07:01i mean i have to say difficulties remain otherwise i'm out of a job
1:07:04and the uh
1:07:07uh about
1:07:09it is true but we're definitely much better off than we were ten years ago
1:07:12and uh
1:07:13it's a bit like
1:07:14when you're teaching
1:07:16well almost anything
1:07:17in effect you tell the second year class forget everything we told you last year that was an over
1:07:21simplified version of the problem here is that
1:07:24here is the real
1:07:25thing and then in the third year you tell the students forget everything we told you last year that's
1:07:28an over simplified version of the problem here is the real thing
1:07:31uh and uh
1:07:33i i
1:07:34i mean i can remember
1:07:35now there are some things where i do actually literally tell the students that and uh
1:07:39and i think
1:07:41you have to
1:07:43you know get
1:07:45it's from the community by focusing on the
1:07:48simplified versions
1:07:51it is a step forward and then there will always be
1:07:53you know you're never going to
1:07:56overcome all the problems
1:07:57but is actually interesting that um you know the way these various complications that i've talked about many of them
1:08:02i don't think you do have an analogue for it in many cases you're better off
1:08:06uh we i think
1:08:08you suggested to me in conversation that
1:08:10you know we have these population genetics models
1:08:13and that so
1:08:14and basically this thing i'm talking about that
1:08:16all the dependence comes from relatedness and once we
1:08:19condition on the right level of relatedness we can get rid of the dependence
1:08:22and i agree that's a good point but there's lots of
1:08:25subjectiveness
1:08:27in those models
1:08:30we have
1:08:31many other
1:08:33problems that i was describing to you that i don't think you do have any
1:08:37analogy for so
1:08:39in many ways i think
1:08:41the grass is greener on your side of
1:08:42the fence i
1:08:47well i would like to
1:08:48thing but again
1:08:56we should do this