

 






IN THE COURT OF CRIMINAL APPEALS

OF TEXAS





NO. AP-75,062


JACKIE BARRON WILSON, Appellant

v.


THE STATE OF TEXAS




ON APPEAL FROM THE CRIMINAL DISTRICT COURT NUMBER THREE
DALLAS  COUNTY



 Johnson, J., filed a concurring opinion.


C O N C U R R I N G   O P I N I O N


	Numbers flummox many of us, and as a result, numerical evidence can become confusing
and misleading.  This is particularly true if the evidence that inserts numbers into the legal equation
is new and marginally understood.  We are now at that point with the evaluation of DNA evidence.  The
experts who come to court to present DNA evidence frequently come up with probabilities (1) of such
great magnitude that they are patently unsupportable to those who understand numbers and very
impressive to those who do not.  The following discussion assumes that the only evidence linking
the defendant to the offense is DNA.
	The first probability mistake that experts make is to treat all variables (2) as independent.  A
variable is independent if, when the numerical value of the variable changes, no other variables
necessarily change also.  In a rectangle, height and width are independent variables; changing the
height does not necessarily change the width.  A dependent variable is one that necessarily changes
in response to a change in another variable.  The area of a rectangle is the product of its height and
width and is therefore a dependent variable; if the height or width changes, the area necessarily
changes.
	The probability of two independent variables occurring at the same time is the product of the
probabilities for each: if each variable occurs one time in ten, the probability of both variables
occurring at the same time is 1/10 x 1/10, or 1/100 = one in one hundred.  Probabilities decrease
rapidly with the number of variables.  With only six independent variables that have individual
probabilities of one in ten, the probability of all occurring at once is one in a million.  The probability
decreases even more rapidly for variables that occur less often than one in ten times; for variables
that occur once in a hundred times, the probability of one in a million requires only three variables. 
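The arithmetic in the preceding paragraph can be checked with a short sketch.  It is offered only as an illustration: the function name joint_probability is hypothetical, and the multiplication is valid only on the assumption that the variables are truly independent.

```python
from fractions import Fraction

def joint_probability(p, n):
    """Probability that n independent variables, each with probability p, all occur together."""
    return p ** n

one_in_ten = Fraction(1, 10)
one_in_a_hundred = Fraction(1, 100)

print(joint_probability(one_in_ten, 2))        # 1/100
print(joint_probability(one_in_ten, 6))        # 1/1000000
print(joint_probability(one_in_a_hundred, 3))  # 1/1000000
```

Exact fractions are used instead of floating-point numbers so that "one in a million" is reproduced without rounding.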
	A problem arises when dependent variables are treated like independent ones.  In a California
case that arose from a 1964 robbery, (3) People v. Collins, an older woman returning from the grocery was accosted
from behind and did not see her attacker, who took her purse.  She did see a young woman running
from the scene and described her as weighing about 145 pounds, wearing "something dark," and
having blonde hair that was lighter than Janet Collins's hair was at the time of trial. A man who had
been nearby reported seeing a young white woman with a blonde ponytail running from the direction
of the robbery, but did not see the offense occur.  He described the woman as slightly over 5 feet tall,
of ordinary build, wearing a dark blonde ponytail and dark clothing.  He also reported that the young
woman got into a yellow or partly yellow car driven by a black man who had a beard and moustache. 
	The defendants, an interracial couple, were arrested and charged because they "sort of"
matched the physical descriptions, owned a car that was at least partly yellow, were newly married,
jobless, and broke.  They denied involvement and provided an alibi.  At trial, the state presented a
mathematics instructor from a state college as its expert witness on the probability that the
defendants were guilty.  The witness refused to assign probabilities to the various factors chosen by
the prosecutor, so the prosecutor proposed "probabilities" of his own: one in three young women
were blonde, one in ten wore a ponytail, one in ten cars was at least partly yellow, one in ten black
men had a beard, one in four men had a moustache, and one in a thousand couples was interracial. (4) 
The prosecutor then multiplied his own "probabilities" together and calculated that the profile would
match one in 12 million couples. (5)  The Collinses were convicted of the robbery based on the claimed
probability that no other couple in California matched the reported description of the perpetrators. 
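The prosecutor's multiplication can be reproduced as follows.  This is a sketch only: the dictionary name and labels are hypothetical, the six figures are the prosecutor's own unsupported estimates as recited above, and the product is meaningful only if the six factors are independent, which they are not.

```python
from fractions import Fraction
from math import prod

# The prosecutor's own estimates, as recited in the text (not measured data).
collins_factors = {
    "young woman is blonde": Fraction(1, 3),
    "young woman wears a ponytail": Fraction(1, 10),
    "car is at least partly yellow": Fraction(1, 10),
    "black man has a beard": Fraction(1, 10),
    "man has a moustache": Fraction(1, 4),
    "couple is interracial": Fraction(1, 1000),
}

# Naive product, treating every factor as independent.
claimed = prod(collins_factors.values())
print(claimed)  # 1/12000000
```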
	In reversing the conviction in 1968, the California Supreme Court noted that the evidence
was presented "[w]ithout presenting any statistical evidence whatsoever in support of the
probabilities for the factors selected." Collins, 438 P.2d at 36 n.9.  The Court also noted that all of
the selected factors were treated as independent and factually true and that there was no adjustment
for dependent variables or the possibility of mistake.  Id. at 39.  Most men with beards also have
moustaches, so a correction for the overlap was necessary.  Id. at 39 n.15.  The Supreme Court also
noted that the witness had failed to consider other plausible possibilities, for example, that the young
woman was a light-skinned African-American with bleached hair. (6) 
	Some of the bad guesses increased probability, others decreased it, but the expressed
probability itself was not reliable.  "Mathematics, a veritable sorcerer in our computerized society,
while assisting the trier of fact in the search for truth, must not cast a spell over him." Id. at 33. The
Collins Court also noted that, even if one accepted the prosecutor's guesses, appropriate calculations
indicated that there was a substantial probability that at least one other couple matched the
selected factors.  Id. at 43. 
	Multiplying the probabilities of all variables together, without regard to dependence, leads
to a probability that is too small, often far too small.  For example, suppose that variable A has a
probability of occurring one time in one thousand and always occurs with variable B, and that B always
occurs with A. 
A and B are dependent variables, and the probability that A and B will occur together is still one in
one thousand, because they never occur separately. If the probabilities of a random match to A and
B are improperly multiplied together, the probability of both A and B occurring together is 1/1000
x 1/1000 = 1/1,000,000, or one in a million, and is one thousand times too small.  The numbers soon
get out of hand.  One expert testified that a given profile occurred one time in 2.578 sextillion (2,578
followed by 18 zeroes), (7) a number larger than the number of known stars in the universe (estimated
at one sextillion). (8)  The population of Earth is about 6.5 billion, so anything in the sextillion range
is more than one hundred billion times larger than the population of Earth.  It is no wonder that, faced with
numbers too large to conceive, some juries simply dismiss DNA evidence as not helpful, not
persuasive, or not credible.  The other side of the coin is a jury that accepts any claim about
probabilities "because it's DNA."  They have all seen "CSI: Crime Scene Investigation" or "NCIS"
and "know" that DNA is infallible.
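The example of variables A and B in the preceding paragraph can be worked through in a few lines.  The variable names are hypothetical, and the sketch assumes, as the example does, that B is present every time A is present.

```python
from fractions import Fraction

p_a = Fraction(1, 1000)      # A occurs one time in one thousand
p_b_given_a = Fraction(1)    # ASSUMPTION from the example: B always accompanies A

# Correct treatment: P(A and B) = P(A) x P(B given A).
correct = p_a * p_b_given_a

# Improper treatment: multiply as though A and B were independent.
naive = p_a * Fraction(1, 1000)

print(correct)          # 1/1000
print(naive)            # 1/1000000
print(correct / naive)  # 1000
```

The last line is the factor by which the improperly multiplied figure understates the true probability.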
	The reality of the human genome is that some genes are recessive and are therefore dependent
on other genes for expression.  For example, blue eyes occur only if both parents pass on the gene
for blue eyes.  If one parent passes a gene for brown eyes, the probability is high that the child's eyes
will not be blue.  Many genes are intertwined to some degree; blue eyes often accompany blonde
hair, but some Irish have strikingly blue eyes and black or red hair.  It is very difficult to determine
the probability of a given characteristic because we do not have a map of how each gene affects
every other gene.  It may be that one person in ten has blue eyes and one in twenty is (really)
blonde, but because we know that the probability of blue eyes increases if the person is blonde,
simply multiplying 1/10 x 1/20 will not tell us the true probability of having both blue eyes and
blonde hair; the calculated value will be too low.  If you are Japanese, there is close to a one hundred
percent probability that you will have dark hair and brown eyes.  When the probability of a person
of Japanese ancestry having both dark hair and brown eyes is calculated, we must take that into
account.  All of these characteristics are controlled by DNA, and the same rules that apply to any
probability calculation also apply to calculating the probabilities of a DNA match; if areas A and B
of each DNA sample match, but A and B always occur together, A and B must be treated as one area
of matching, not two.
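The effect of dependence on the blue-eyes and blonde-hair example can be sketched numerically.  The figures of one in ten and one in twenty come from the text; the conditional figure of one in two is purely hypothetical and is chosen only to show the direction of the error, not to describe any real population.

```python
from fractions import Fraction

p_blue = Fraction(1, 10)              # figure used in the text
p_blonde = Fraction(1, 20)            # figure used in the text
p_blue_given_blonde = Fraction(1, 2)  # HYPOTHETICAL value, for illustration only

# Naive calculation: treats eye color and hair color as independent.
naive = p_blue * p_blonde

# Multiplication rule for dependent events: P(blonde) x P(blue given blonde).
adjusted = p_blonde * p_blue_given_blonde

print(naive)     # 1/200
print(adjusted)  # 1/40
```

Under the assumed conditional figure, the naive product is five times too low; with a different conditional figure the size of the error would differ, but whenever the two traits tend to occur together the naive product understates the true probability.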
	In this case, the claim at trial was that only 1 in 2,083 persons of Hispanic descent would
match appellant's DNA profile.  How was that number calculated?  We do not know.  An even more
basic question is: what makes one "Hispanic"?  Appellant's surname is Wilson, a name not
ordinarily thought to be Hispanic.  May we assume that appellant's father was not Hispanic?  Part
Hispanic?  Part African-American?  Part Western European?  Eastern European?  Asian?  Were the
differing probabilities of a non-Hispanic gene pool taken into account in calculating probabilities? 
How are probabilities for racial groups calculated in general?  How do we calculate reasonably
accurate probabilities for people like that famous self-described "Cablinasian," Tiger Woods? (9)  We
do not know. 
	Secondly, a statement that the DNA profile of the defendant occurs in only one in one million
members of a given racial group means just that: if the reference group is one million individuals,
one person will match; if you have two million individuals in the reference group, two individuals
will match, and so on.  Unfortunately, most people translate that statement, that only one person in
a million matches the profile, to mean that there is one chance in a million that the defendant is not
guilty.  Statistically, however, a city with ten million members of the reference group will include
ten individuals who match the profile, and thus, there is only a one in ten chance that the defendant
is guilty.
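The distinction drawn in the preceding paragraph can be expressed as a short calculation.  The function names are hypothetical, and the sketch rests on the assumptions stated at the outset of this discussion and in the code comments: DNA is the only evidence, the true source is within the reference group, and every matching person is equally likely to be the source.

```python
from fractions import Fraction

def expected_matches(group_size, match_rate):
    """Expected number of people in the reference group whose profile matches."""
    return group_size * match_rate

def chance_defendant_is_source(group_size, match_rate):
    """Chance that one particular matching person is the source.

    ASSUMPTIONS: DNA is the only evidence, the true source is in the group,
    and every matching person is equally likely to be the source.
    """
    return 1 / expected_matches(group_size, match_rate)

one_in_a_million = Fraction(1, 1_000_000)

print(expected_matches(1_000_000, one_in_a_million))             # 1
print(expected_matches(10_000_000, one_in_a_million))            # 10
print(chance_defendant_is_source(10_000_000, one_in_a_million))  # 1/10
```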
	In this case, the offense occurred in Dallas County.  Assuming a county population of one
million and an Hispanic population of thirty percent, 300,000 Hispanics live in Dallas County.  DNA
is very reliable as to gender, so if we assume equal occurrence of gender, there are 150,000 Hispanic
males in Dallas County.  Dividing 150,000 by 2083, we find that, statistically, 72 men in Dallas
County fit the profile.  But do we really know that the perpetrator lived in Dallas County?  Dallas
is part of the Metroplex, which has a population of more than 3 million, and we are a mobile society. 
In the Metroplex, there are 216 statistical men who fit the profile.  Could he be visiting from
Houston (216 men) or Chicago (360 men)?  Assuming that Mexico City is close to 100% Hispanic,
5,281 men in that city alone match the profile.  How many male residents of Mexico City (or
Guadalajara, Ciudad Juarez, Acapulco, etc.) were in Dallas County at the time of the offense?  Even
if we restrict the possibilities to Dallas County, a stated probability that only one Hispanic in 2,083
matches this profile does not mean that there is one chance in 2,083 that he is not guilty; it means
that the probability is one in 72 that he is guilty.
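The Dallas County and Metroplex figures above can be reproduced with the same kind of calculation.  This is a sketch under the assumptions stated in the text (a county population of one million, a thirty percent Hispanic population, and equal numbers of males and females); the function name is hypothetical, and the Metroplex is taken to be exactly 3 million with the same thirty percent share, which is an assumption made here for the arithmetic.

```python
from fractions import Fraction

def expected_matching_men(total_population, hispanic_share, match_rate,
                          male_share=Fraction(1, 2)):
    """Expected number of Hispanic males in a population who match the profile."""
    return total_population * hispanic_share * male_share * match_rate

match_rate = Fraction(1, 2083)      # frequency claimed at trial
hispanic_share = Fraction(3, 10)    # ASSUMPTION stated in the text

dallas = expected_matching_men(1_000_000, hispanic_share, match_rate)
metroplex = expected_matching_men(3_000_000, hispanic_share, match_rate)

print(round(dallas))     # 72
print(round(metroplex))  # 216
```

On these assumptions, and with no other evidence, the chance that any one of the 72 statistical matches in Dallas County is the source is about one in 72.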
	Finally, trial attorneys need to understand how to validate (or repudiate) DNA evidence. They
must begin with the reported match.  Prosecutors may leap from a lab report saying that the samples
match to an immediate conclusion that the defendant is guilty, thus the origin of the term
"prosecutor's fallacy." (10)  But is it really a match?  How many areas of the DNA strands coincide? (11) 
How big is the specified error range? Just as for fingerprints, the more areas that match, the more
likely that this is truly a match. (12)  If there appears to be a match, advocates then need to discover how
often the laboratory that did the DNA testing produces a false positive. (13)  Part of the problem for the
State of California in the O. J. Simpson trial was the revelation that the state's testing laboratory had
a false positive rate of 1 in 200, that is, one match in 200 was not, in fact, a match, thus opening the
door for the defense to argue that the sample really did not match Simpson's DNA. (14)
	Once it is established by the state that the two samples do, in fact, match within an
appropriate margin of error, the next question is whether the defendant is the source.  The random
probabilities that are routinely used are valid only for unrelated persons.  The closer the relative, the
greater the number of areas on the DNA strand that will match.  Identical twins have identical DNA. 
Parents will share DNA similarities with their children, and siblings will have many commonalities. 
Double first cousins will also have many commonalities; first cousins will have fewer
commonalities, yet still a significant number. The only living male in a given family has a high
probability of being the source, but a family with only sons over several generations will present a
greater challenge.  If the state shows that the defendant is the source, there is one more hurdle; can
the defendant be placed at the crime scene in the appropriate time frame?
	DNA is durable; it does not evaporate or dissipate, and the time at which it was deposited
on a surface cannot be directly determined.  If the DNA sample was retrieved from a place where
the defendant lives, works, or visits frequently, it is probably not probative, as one would expect to
find the defendant's DNA in those places.  Sex crimes aside, if the sample is from a place where the
defendant should not have been, the DNA, by itself, can confirm only that he was there at some time
and cannot, by itself, prove conclusively that he was there at the time of the crime.  By the same
token, DNA cannot prove that the defendant was not there at the time of the crime.
	DNA analysis is a powerful tool in determining guilt or innocence, and usually there is other
evidence that links the defendant to the offense, but we must remember that DNA analysis is
performed by humans and is not foolproof, nor are the conclusions drawn from the analysis always
correct.  Only if all the prerequisites for reliability (true match, correct source, and presence at the
crime scene in the applicable time frame) are satisfied can society have confidence that the DNA
evidence is, in and of itself, strong enough to support a conviction. 
	I concur in the judgment of the Court.

En banc
Filed: January 18, 2006
Publish

1.  In this context, a probability is the likelihood that the DNA of a randomly selected person will match the
DNA of a known sample. 
2.  In a defined system, a variable is a characteristic that has a numerical value that changes.
3.  438 P.2d 33 (Cal. 1968) (reversed and remanded for a new trial).
4.  In 1964, the probability of a young woman having a ponytail was probably higher than one in ten, but even
in California, the incidence of natural blondes was likely to be less than one in three.  Except for taxis, yellow cars
are uncommon; probably, many fewer than one in ten non-taxi cars were yellow.  In 1964, the proportion of interracial
couples was probably far lower than one in a thousand. 
5.  He then committed the "prosecutor's fallacy," discussed infra.
6.  Other possibilities include: the young woman wore a blonde wig; she was running, not because she had
robbed the older woman, but because she was late; she and the driver were merely car-pooling and were not a
couple; the perpetrators were not from California.  
7.  See, e.g., Ex parte Russeau, No. WR-61,389-01 (Tex. Crim. App. writ application pending) (trial court's
findings of fact and conclusions of law, p. 38) ("1 in 115.6 quintillion for Caucasians, 1 in 10.28 quintillion for
blacks, and 1 in 1.578 sextillion for Hispanics."); Benford v. State, 2005 Tex. App. LEXIS 840, No. 03-02-00686-CR
(Tex. App.-Austin, delivered February 3, 2005, unpublished, pet. ref'd) ("1 in 8.77 trillion [in the Caucasian
population], 1 in 7.46 trillion for African-Americans, and 1 in 4.52 trillion for Hispanics").
8.  http://imagine.gsfc.nasa.gov/docs/ask_astro/answers/970115.html.  "We believe that there are on the order
of 10^21 stars in our universe."
9.  Mr. Woods's father has Caucasian, African-American, Asian, and First People ancestors.  His mother is
Thai.
10.  "The attorney and social psychologist William Thompson and his student Edward Schumann seem to
have coined the term 'prosecutor's fallacy.'"   Gerd Gigerenzer,  Calculated Risks 154 (2002).
11.  DNA analysis scans fragments in 13 areas of high variability.  High probability of a true match is
assumed if four to five areas  match.  See www.ornl.gov/sci/techresources/Human_Genome/home.shtml, a site
funded by the United States Department of Energy Office of Science.  Given that only the individual whose DNA it
is will match all 13 tested areas, using an explanation of the probability of a random match that is expressed in terms
of how many areas match and the closeness of the matches instead of using statistics may be more persuasive to the
finder of fact.  Many people feel as Benjamin Disraeli did when he said, "There are three kinds of lies: lies, damn
lies, and statistics."
12.  The form of the DNA molecule is a double helix, similar in structure to a loosely coiled ladder with two
long, continuous "legs" and many "rungs."  Nuclear DNA has four nitrogenous bases, referred to as A, C, G, and T. 
Each base matches with only one other base, but because the sequence is directional, the order of the bases matters:
A + T and G + C are different from T + A and C + G. The bases are arranged in varying sequences from rung to rung
and the sequencing order encodes genetic information.  By way of simplistic explanation, in a laboratory analysis
DNA strands are broken into fragments, which are separated by electrophoresis, placed on a nylon membrane, and x-rayed.  The result is an autoradiogram, a series of bands that resemble the bars of a UPC, but are less well delineated. 
Because the bands are not sharply defined, a match may be found when the bands in two samples align within a
stated margin of error.  As the allowed margin of error becomes larger, the statistical reliability of the match becomes
smaller.  The nuclear DNA strands that are analyzed are "non-coding" sections, that is, nuclear DNA that has no
known function in the production of protein.  Non-coding nuclear DNA has less selection pressure than coding
nuclear DNA and therefore shows higher variability among individuals.  Higher variability makes it easier to exclude
or include individuals.  Mitochondrial DNA goes even further toward identifying individuals because it is separate
from nuclear DNA and different from it in that mitochondrial DNA comes solely from the mother and is therefore a
clone of her mitochondrial DNA rather than a blending of the nuclear DNA from both parents.  One of its uses is to
differentiate between persons who have different mothers, such as paternal half siblings, or cousins, or fathers and
sons.  The Y chromosome, present only in males,  is passed directly from father to son and may offer similar
information for males.
13.  A false positive is a test result that indicates that a factor is present when in fact it is not.  E.g., a blood test
indicates that the person is HIV positive, but in fact the person is not infected.
14.  Gerd Gigerenzer,  Calculated Risks 167 (2002). 
