GENEALOGY-DNA-L Archives



From: "Ken Nordtvedt" <>
Subject: [DNA] TMRCAs and Triangulation
Date: Mon, 22 Nov 2010 17:54:04 -0700


In an early volume of JOGG I derived some corrections to the simple TMRCA = n / 2M rule, based on the relationship between the haplotype pair's likely MRCA haplotype and the founding haplotype of the clade to which the pair belongs. The rule basically boiled down to this: if the likely MRCA haplotype is equal or very close to the clade founding haplotype, then the TMRCA distribution and its peak move out to greater times (up to twice as long), while if the likely MRCA haplotype is very far from the clade founding haplotype, then the TMRCA distribution and its peak move in toward the present. Between these lies a middle situation for the MRCA haplotype, for which no correction was suggested.

I recently found an entirely different methodology to treat and solve this situation. The final result completely agrees with the old JOGG formulation's solution for these corrections.

If one assumes two present haplotypes h1 and h2 which are separated by n one-step mutations, and they belong to a clade of age G* with founding haplotype hf, then we have a triangulation problem for finding the MRCA age G of the two haplotypes: both its distribution and its most likely value.

h1 ______________________ G                          G*
                         |_hc_______________________ hf
h2 ______________________|

If hc is one of the likely alternatives for the MRCA haplotype (and there are 2^n of them), and n* is the number of mutational steps between hc and hf, then the probability of this tree segment being realized is proportional to

Prob = G^n (G*-G)^(n*) exp(-MG)

times a constant involving mutation rates. Note that the exponent is -MG rather than -2MG (M is the sum of the haplotype's STR mutation rates): the two branches of length G contribute exp(-2MG), the branch of length G*-G contributes exp(-M(G*-G)), and their product is exp(-MG) times a factor that does not depend on G.

That distribution can be plotted as a function of G if desired. It peaks at:

G = (n / 2M) times 2 / [1 + n* / (M (G*-G))]

Since G also appears on the right-hand side, this is an implicit condition for the peak.

If the hc haplotype is the clade's founding haplotype, then n* = 0 and the G estimate is twice the simplistic n / 2M value.

If n* is equal to the simplistic or likely expectation M(G*-G), then n / 2M remains the G estimate. If n* is even larger because hc is very different from hf, then the G estimate falls below n / 2M.
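For anyone who wants to check these three cases numerically, here is a minimal Python sketch. It is only an illustration under stated assumptions, not part of the JOGG derivation. Setting the derivative of log(Prob) to zero gives n/G - n*/(G*-G) - M = 0, which rearranges to the quadratic M G^2 - (n + n* + M G*) G + n G* = 0, and the root lying between 0 and G* is the peak. The function names log_density and peak_G, and the numbers used (n = 6, M = 0.1 per generation, G* = 150 generations), are hypothetical choices made for this example.

```python
import math


def log_density(G, n, n_star, M, G_star):
    """Unnormalised log of G^n (G*-G)^n* exp(-M G), defined for 0 < G < G*."""
    if not 0.0 < G < G_star:
        return float("-inf")
    return n * math.log(G) + n_star * math.log(G_star - G) - M * G


def peak_G(n, n_star, M, G_star):
    """Peak of the density: smaller root of M G^2 - (n + n* + M G*) G + n G* = 0."""
    b = n + n_star + M * G_star
    # The discriminant is non-negative in exact arithmetic; clamp rounding error.
    disc = max(b * b - 4.0 * M * n * G_star, 0.0)
    return (b - math.sqrt(disc)) / (2.0 * M)


if __name__ == "__main__":
    n, M, G_star = 6, 0.1, 150.0   # illustrative values only
    print("simple n / 2M:", n / (2 * M))
    for n_star in (0, 12, 24):     # hc equal to hf, "expected" distance, far from hf
        print("n* =", n_star, "-> peak G =", peak_G(n, n_star, M, G_star))
```

With these illustrative numbers the simple rule gives 30 generations, and the peak comes out at about 60, 30 and 21 generations for n* = 0, 12 and 24, which matches the three cases described above. log_density can be evaluated over a grid of G values if a plot of the distribution is wanted.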

There are 2^n "first tier" candidates for the MRCA haplotype hc. The one which is closest to hf will have the largest probability. In the simplistic treatment of the h1/h2 TMRCA, all 2^n "first tier" alternatives for the MRCA haplotype would be equally likely. The triangulation knowledge provided by the earlier clade founding haplotype differentiates among these 2^n alternatives in likelihood. That triangulation knowledge also narrows the width (variance) of the distribution.
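The enumeration of the 2^n first-tier candidates can be sketched in a few lines of Python. This is a hedged illustration only: it assumes h1 and h2 differ by exactly one step at each of n distinct markers, so that each candidate hc takes either the h1 or the h2 value at every differing marker, and it assumes n* can be counted as the summed absolute step difference between hc and hf. The function name mrca_candidates and the four-marker haplotypes are made up for the example.

```python
from itertools import product


def mrca_candidates(h1, h2, hf):
    """List (n_star, hc) for every first-tier MRCA candidate, nearest to hf first."""
    diff = [i for i, (a, b) in enumerate(zip(h1, h2)) if a != b]
    # Assumption of this sketch: every differing marker differs by one step.
    if any(abs(h1[i] - h2[i]) != 1 for i in diff):
        raise ValueError("sketch only handles one-step differences per marker")
    candidates = []
    for choice in product((0, 1), repeat=len(diff)):
        hc = list(h1)
        for i, pick in zip(diff, choice):
            hc[i] = h2[i] if pick else h1[i]
        # Assumed distance measure: total one-step mutations between hc and hf.
        n_star = sum(abs(c - f) for c, f in zip(hc, hf))
        candidates.append((n_star, tuple(hc)))
    candidates.sort()
    return candidates


if __name__ == "__main__":
    h1 = (13, 24, 14, 11)   # made-up repeat counts
    h2 = (14, 24, 15, 11)
    hf = (13, 24, 15, 10)
    for n_star, hc in mrca_candidates(h1, h2, hf):
        print(n_star, hc)
```

Here n = 2, so four candidates are listed, and the one at the top of the list (smallest n*) is the candidate closest to hf.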


