We often see episodes of CSI and Law & Order where DNA evidence is quoted as identifying a guilty person to an accuracy of one part in a billion or so. These numbers are impressive and often convincing to a jury. The underlying statistical calculations and what they mean are not usually discussed on these shows. Unfortunately, in real life, most juries do not understand the analysis that leads to those impressive numbers.
My interest in law is limited to video and an occasional speeding ticket, so I was unfamiliar until recently with a case that seems to be taught in every law school. At least you can find many references to “People v. Collins” including the usual starting point, Wikipedia.
The essence of this interesting case is that an elderly woman was robbed. She and another witness gave partial descriptions of the suspects. The prosecution was nearly at a loss because they didn’t have a firm identification, but they did have two suspects who fit the individual characteristics reported by the witnesses. To prove their case beyond a reasonable doubt, they assigned a probability to each of the reported characteristics:
- Part yellow automobile: 0.10
- Man with mustache: 0.25
- Woman with ponytail: 0.10
- Woman with blond hair: 0.33
- Black man with beard: 0.10
- Interracial couple in car: 0.001
I’m not sure where the probabilities came from. For instance, I don’t think one out of ten women on the street has a ponytail. However, that is not the point. Assume the assigned probabilities are correct. The prosecutor next assumed that all six characteristics were statistically independent and multiplied the probabilities together, deriving a probability of 1 in 12 million that all of these things would occur at once. Since odds that long certainly appeared to meet the test of guilt beyond a reasonable doubt, the jury, which was probably as uneducated in statistics as the prosecutor, convicted the couple.
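The prosecutor's arithmetic is easy to reproduce. The sketch below multiplies the six figures from the list above under the same independence assumption. It treats 0.33 as exactly 1/3, which is my assumption; the dictionary keys and the function name are illustrative only.

```python
from fractions import Fraction

# Probabilities as assigned by the prosecution (taken from the list above).
# Assumption: 0.33 is treated as exactly 1/3 so the product comes out exact.
assigned = {
    "part yellow automobile": Fraction(1, 10),
    "man with mustache": Fraction(1, 4),
    "woman with ponytail": Fraction(1, 10),
    "woman with blond hair": Fraction(1, 3),
    "black man with beard": Fraction(1, 10),
    "interracial couple in car": Fraction(1, 1000),
}


def joint_if_independent(probabilities):
    """Multiply probabilities together, which is valid only if they are independent."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result


joint = joint_if_independent(assigned.values())
print(f"1 in {joint.denominator // joint.numerator:,}")  # prints "1 in 12,000,000"
```

The multiplication itself is correct. What is questionable is the assumption baked into `joint_if_independent`.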
The case was fortunately overturned on appeal.
Longtime readers of this column will recognize several problems with the prosecution’s argument. The most glaring one in my mind is the blithe assumption that all these characteristics are independent. A man with a beard is obviously more likely to also have a mustache than a man picked at random. The next problem is the complete lack of a Bayesian computation that would take into account the available population.
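To see how much the independence assumption can matter, here is a minimal sketch comparing the two ways of combining the beard and mustache figures. The conditional probability of 0.80 is purely hypothetical, chosen only for illustration; it does not come from the case.

```python
# Figures from the prosecution's list.
p_bearded_man = 0.10
p_mustache = 0.25

# HYPOTHETICAL value, for illustration only: the chance that a man
# who has a beard also has a mustache.
p_mustache_given_beard = 0.80

# What the prosecutor did: treat the two traits as independent.
naive_joint = p_bearded_man * p_mustache

# The product rule without the independence assumption:
# P(beard and mustache) = P(beard) * P(mustache | beard)
dependent_joint = p_bearded_man * p_mustache_given_beard

print(f"assuming independence: {naive_joint:.3f}")
print(f"allowing dependence:   {dependent_joint:.3f}")
print(f"understated by a factor of {dependent_joint / naive_joint:.1f}")
```

With that hypothetical value, this one pair of traits alone makes the joint probability several times larger than the prosecutor's product suggests.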
The robbery took place in Los Angeles and the couple was arrested a few hours after the robbery. This means to me that the pool of potentially guilty couples was probably a few million. Assume 4 million couples could have physically been close enough to commit the crime. Then, even if the odds were as the prosecution (erroneously) computed, a reasonable jury could work out that the expected number of couples in this pool matching all those parameters is about 4 million × (1 in 12 million) ≈ 0.33. A combination that looks vanishingly rare for one couple is not rare across the whole pool, so the existence of another matching couple cannot be dismissed. That is, the data available to the prosecution, even when used incorrectly, did not meet the standard of proving guilt beyond a reasonable doubt.
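The pool argument can be sketched numerically. The 0.33 figure is the pool size multiplied by the per-couple odds. The code below also computes two related quantities under a simple binomial model, which is my assumption: every couple in the pool matches independently with the same probability. The pool of 4 million is the assumed figure from the paragraph above.

```python
import math

pool = 4_000_000           # assumed number of couples who could have been nearby
p_match = 1 / 12_000_000   # the prosecution's (erroneous) per-couple odds

# Expected number of couples in the pool who match every characteristic.
expected_matches = pool * p_match

# Binomial model: each couple matches independently with probability p_match.
# log1p keeps the tiny probability from being lost to rounding.
log_no_match = math.log1p(-p_match)
p_none = math.exp(pool * log_no_match)
p_exactly_one = pool * p_match * math.exp((pool - 1) * log_no_match)

p_at_least_one = 1 - p_none
p_two_or_more = 1 - p_none - p_exactly_one

# Given that one matching couple exists (the defendants),
# how likely is it that at least one more matching couple is out there?
p_another_given_one = p_two_or_more / p_at_least_one

print(f"expected matching couples:         {expected_matches:.2f}")
print(f"P(at least one matching couple):   {p_at_least_one:.2f}")
print(f"P(another match | one match):      {p_another_given_one:.2f}")
```

Under these assumptions, the chance that some couple in the pool matches is a bit under 0.3, and the chance of a second matching couple, given that one exists, is roughly 0.16. Neither figure is anywhere near 1 in 12 million.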
Also hidden in this case is a classic example of the “Prosecutor’s Fallacy,” which I discuss in my book as it occurred in the O.J. Simpson trial. I will discuss this fallacy in more detail later.