GENEALOGY-DNA-L Archives


From:
Subject: Re: [DNA] Puzzling Result: Statistician Needed?
Date: Tue, 4 Nov 2003 14:39:10 -0500 (EST)
References: <1c4.11277eb5.2cd9175c@aol.com>
In-Reply-To: <1c4.11277eb5.2cd9175c@aol.com> (DNAforBrowns@aol.com)


Jim wrote:
> When B's results came back, however, he scored only 9/12 with A -- meaning
> it's highly unlikely the two men share a recent common ancestor, according to
> the statistical rules almost everybody uses in DNA genealogy.

When you say "9/12," you leave open the question of the genetic distance.
If that distance is, in fact, 3 (i.e., one step each on three different
markers), the interpretation is quite different from, say, two one-step
differences and a two-step difference. In the latter case, the distance
would be 6 (not just 4). That's an awfully big gap.
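
To make the arithmetic concrete, here is a minimal sketch. It assumes
the figure of 6 comes from squaring each marker's step difference
(1 + 1 + 4), while 4 is the plain sum of steps; that reading is an
assumption, since the formula is not spelled out above. The function
names and the repeat counts are made up for illustration and are not
anyone's real results.

```python
def marker_differences(a, b):
    """Absolute repeat-count difference at each marker."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return [abs(x - y) for x, y in zip(a, b)]


def match_count(a, b):
    """Number of markers with identical values (the 9 in '9/12')."""
    return sum(1 for d in marker_differences(a, b) if d == 0)


def stepwise_distance(a, b):
    """Plain sum of step differences."""
    return sum(marker_differences(a, b))


def squared_distance(a, b):
    """Sum of squared step differences (assumed reading of the '6')."""
    return sum(d * d for d in marker_differences(a, b))


# Hypothetical 12-marker haplotypes.
person_a = [13, 24, 14, 11, 11, 14, 12, 12, 12, 13, 13, 29]
# One step off on three different markers.
three_single_steps = [14, 24, 14, 11, 11, 15, 12, 12, 12, 13, 13, 30]
# One step off on two markers, two steps off on a third.
two_singles_one_double = [14, 24, 14, 11, 11, 15, 12, 12, 12, 13, 13, 31]

for label, other in [("three one-step", three_single_steps),
                     ("two one-step + one two-step", two_singles_one_double)]:
    print(label,
          f"{match_count(person_a, other)}/12",
          "stepwise:", stepwise_distance(person_a, other),
          "squared:", squared_distance(person_a, other))
```

Both hypothetical comparisons are "9/12", yet the first has distance 3
under either count, while the second comes out as 4 stepwise and 6
squared.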

On the other hand, if the distance is 3, then I think you need to
expand those results to 25 markers and see what happens, as Julia and
Kay have already suggested.

> But here's the puzzle in the case at hand: Out of 54 unique haplotypes found
> so far in our project (excluding B's own haplotype), B scored closer to A's
> haplotype than he did to any other haplotype in our project!

That's not a puzzle. If B had scored closer to some other participant,
you would simply have written off the supposed connection to "A" and
would now be diligently looking for evidence of a connection to "C".

The hallmark of the name "Brown" is multiple, independent origins.
You may think that 54 is a lot of haplotypes, but I can assure you
that your project will turn up many, many more of them if you keep at
it, and it is not at all unlikely that you will some day find a
haplotype that matches "B" at least 10/12. If that new haplotype
serves to bridge the gap between "A" and "B", then you will have
the possible nucleus of a star-shaped phylogenetic tree, but there
is no guarantee.

> If we assume A and B are not closely related, then I submit that B has an
> equal chance of being "closest" to EACH AND EVERY participant in our project.

Another factor that you didn't mention is haplogroups. I presume that
a goodly fraction of the participants are (tentatively) R1b, while
others are I, and a few other haplogroups as well. "B" will
necessarily agree much more closely with the members of his same
haplogroup (apparently including "A") than with the members of other
haplogroups. For that reason, "B" is not equally likely to be closest
to each participant. To take an extreme case, suppose "A" and "B"
were the only two R1a samples so far seen in the study -- they would
naturally agree most closely with each other, but the multiple origins
of the Brown surname would forbid you to make anything out of that
coincidence.

> Conclusion: It seems to me that the hypothesis of "no close relation" between
> B and A is falsified at approximately the 98% confidence level.

OK, suppose both "A" and "B" are R1b, along with 30 others. You might
think that puts you close to the 54-to-1 case you stated above. However,
you know that people close to the AMH are apt to turn up EXACT MATCHES
without even sharing a surname. The lesson is clear: beware of random
coincidence.
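
As a rough sketch of the numbers involved: under the "equal chance"
assumption, the odds that one particular haplotype ends up closest are
1 in the number of candidates. The snippet below compares the full
pool of 54 with a pool of 31 (A plus the 30 other supposed R1b
participants). The function name is hypothetical, and the equal-chance
model is itself the assumption under dispute, so treat this as
illustration only.

```python
from fractions import Fraction


def chance_closest_by_luck(n_candidates):
    """Chance that one named candidate is closest, if every candidate
    in the pool is equally likely to be the closest (an assumption)."""
    if n_candidates < 1:
        raise ValueError("need at least one candidate")
    return Fraction(1, n_candidates)


pools = {
    "all 54 haplotypes": 54,
    "same haplogroup only (A + 30 others)": 31,
}

for label, size in pools.items():
    p = chance_closest_by_luck(size)
    print(f"{label}: chance {float(p):.3f}, "
          f"naive 'confidence' {1 - float(p):.3f}")
```

Shrinking the pool moves the naive figure from about 98% to about 97%,
and even that still treats every same-haplogroup haplotype as equally
likely to be closest, which the AMH clustering makes doubtful.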

Of course, you still have the genealogical evidence. Don't let the
problematic DNA results distract you from the paper trail. While
you're waiting for the expanded DNA results, you could be looking
for weak links in the conventional research and trying to prove or
disprove them.

John Chandler

