GENEALOGY-DNA-L Archives


From: Ann Turner <>
Subject: Re: [DNA] Provenance of a DNA segment (importance of phased haplotypes)
Date: Mon, 29 Nov 2010 06:09:08 -0800
References: <8CD5D4D68281851-1580-323FF@webmail-m005.sysops.aol.com>
In-Reply-To: <8CD5D4D68281851-1580-323FF@webmail-m005.sysops.aol.com>


On Sun, Nov 28, 2010 at 11:55 AM, Kathy Johnston <> wrote:

>
> According to Ann:
> > Family Finder tolerates occasional mismatches,
> > which could be due to genotyping error or microdeletions.
>
>
> Do you think that Family Finder is tolerating too many mismatches? Are
> genotyping errors and microdeletions major sources of false positives? If
> these were not tolerated, then would there be way too many false negatives?
>

I don't know what the optimum number would be, but I have seen cases where
people match at both 23andMe and FTDNA. When I looked at the raw genotype
data, the 23andMe segment had 4265 consecutive SNPs with no isolated
contradictions. The FTDNA segment had 4028 SNPs within roughly the same
boundaries, with 5 isolated contradictions (opposite homozygotes, e.g. GG
versus AA), so allowing that many was the right call in this case.
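
For anyone who wants to repeat this kind of check on their own raw data, here
is a minimal sketch of how contradictions could be counted. It assumes the two
genotype lists are already aligned on the same SNPs and that no-calls are
written with "-"; the function names and the toy data are made up for
illustration and are not anything either company publishes.

```python
def is_contradiction(g1, g2):
    """True when two unphased genotypes share no allele, e.g. 'GG' vs 'AA'.

    Genotypes containing '-' (assumed no-call marker) are never counted.
    """
    if "-" in g1 or "-" in g2:
        return False
    return not (set(g1) & set(g2))


def count_contradictions(person1, person2):
    """Count contradictions across two aligned lists of genotype strings."""
    return sum(is_contradiction(a, b) for a, b in zip(person1, person2))


# Toy data: only the second SNP (GG vs AA) is a contradiction.
p1 = ["AG", "GG", "CT", "AA", "CC", "--"]
p2 = ["AA", "AA", "TT", "AG", "CC", "TT"]
print(count_contradictions(p1, p2))
```

A heterozygous genotype such as AG can never contradict anything at that SNP,
which is why only opposite homozygotes show up in the count.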

I have also seen a case where the percentage of contradictions was higher (6
out of 694). A match was declared at FTDNA but not at 23andMe. When I looked
at the raw data in that region for 23andMe, which happened to have denser
coverage, there were 39 contradictions out of 1463 SNPs, including a few of
the ones FTDNA allowed. FTDNA was too tolerant in that case: they were
genuine contradictions.
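
To put the counts above on a common footing, this short sketch just converts
them to percentages. The labels are my own shorthand for the two cases
described; the numbers are the ones quoted in this message.

```python
def rate_percent(contradictions, snps):
    """Contradictions as a percentage of SNPs in the segment."""
    return 100.0 * contradictions / snps


# (contradictions, SNPs) for each segment mentioned above.
cases = {
    "23andMe, segment matching at both": (0, 4265),
    "FTDNA, segment matching at both": (5, 4028),
    "FTDNA, match declared": (6, 694),
    "23andMe, no match declared": (39, 1463),
}

for label, (c, n) in cases.items():
    print(f"{label}: {c}/{n} = {rate_percent(c, n):.2f}%")
```

That works out to roughly 0.12% for the segment that held up, versus about
0.86% (FTDNA) and 2.67% (23andMe) for the one that did not.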

Keep in mind that I don't have a large sample size, and these may be extreme
cases. But they definitely occur.


> According to Bruce Walsh, the odds of a run of 5 cM being shared between
> two individuals (when using 500,000 markers) is 1 in 10 million, but FF does
> not use 5 cM as the threshold. It looks like 7.7 cM is the initial threshold
> FTDNA is using. When they report on a match of 5 cM, are you saying that
> they are ignoring important mismatches? It seems to me that the genotyping
> errors and microdeletions may be a significant source of kinship prediction
> errors. Should the technology be improved in order for it to really be ready
> for prime time? Are our own natural deletions getting in the way of
> accuracy?
>

Did Walsh give any derivation for his 1 in 10,000,000 statistic? Was he
maybe thinking of haplotypes, not genotypes? If it's really 1 in 10,000,000
with genotype data, I must have hit the jackpot <g>. I have seen a genotype
match fall apart at 23andMe, too, when I have access to father/mother/child
data and can compare haplotype to haplotype. I picked a case from Ancestry
Finder, which will display matches down to 5 cM. There was one apparent
segment of 5.2 cM with 836 SNPs in a child, not found in either parent (and
not listed in Relative Finder). That makes it suspicious. When looking at
various combinations of paternal and maternal haplotypes, this apparent
unity fractured into 66, 74, 106, and 120 short segments. It just happened
that when a short fragment from one parent ended, a short fragment from the
other parent picked up the slack, and this continued for quite some distance.
Allowing a paternal OR maternal allele in one person to match a paternal OR
maternal allele in the other person allows a lot of leeway in declaring a
match. Genotyping error was not an issue here -- it's just that we must
allow AMPLE opportunity to observe a mismatch.
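
Here is a toy illustration of that effect, not the algorithm either company
actually uses. It assumes each person is given as a (paternal, maternal) pair
of phased haplotype strings over the same SNPs; the names and the eight-SNP
data are invented for the example.

```python
from itertools import product


def run_lengths(flags):
    """Lengths of each consecutive run of True values."""
    runs, n = [], 0
    for f in flags:
        if f:
            n += 1
        elif n:
            runs.append(n)
            n = 0
    if n:
        runs.append(n)
    return runs


def genotype_run_lengths(p1, p2):
    """Runs where the unphased genotypes share at least one allele."""
    flags = [bool(set(a + b) & set(c + d))
             for a, b, c, d in zip(p1[0], p1[1], p2[0], p2[1])]
    return run_lengths(flags)


def haplotype_run_lengths(p1, p2):
    """Runs of identical alleles for each of the four haplotype pairings."""
    return {(i, j): run_lengths([x == y for x, y in zip(p1[i], p2[j])])
            for i, j in product(range(2), repeat=2)}


# (paternal, maternal) haplotypes; alleles are just labelled A and B.
child = ("AAABBBAA", "BBAAAABB")
other = ("AAAAAAAA", "AAAAAAAB")

print(genotype_run_lengths(child, other))   # one unbroken run of 8 SNPs
print(haplotype_run_lengths(child, other))  # no pairing runs longer than 4
```

At the genotype level the two people look compatible across all eight SNPs,
but no single pairing of haplotypes matches for more than four in a row: where
the child's paternal haplotype stops matching, the maternal one takes over.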

>
>
> > Those small segments may be even smaller and less
> > significant than we realize.
>
>
> I believe I heard Bruce Walsh say that if you have a lot of small segments,
> e.g. 50 of these, then that is a signal. When he talks about signals, I
> think he means there is significance. He also stated that the only "good
> signal" was a long block.
>

Has anyone observed that many short segments in a Family Finder match? My
matches show an average of 10 segments, with a range of 5 to 18.

Ann Turner

