From: Dienekes Pontikos <>
Subject: Re: [DNA] FW: Odds Are, It's Wrong - 5% of the time
Date: Sun, 21 Nov 2010 03:57:47 +0200
This point needs stressing. Quite often people model

P(n | g) where

n: number of mutations
g: number of generations

or P(ASD | g), or P(Variance | g),

i.e., how some measure of STR variation is distributed for a common
ancestor who lived g generations ago.

But, the problem of TMRCA estimation requires us to use the posterior

P(g | n) = P(n | g) P(g) / P(n)

i.e., it depends both on knowledge of how STR variation accumulates
with age [ P(n | g) ] and on a prior on the age of the common
ancestor [ P(g) ].

P(g) in turn depends on population history, e.g., reproductive skew or
growth rate. We have some idea of when the MRCA of two Y-chromosomes
from a population lived _before_ we look at their haplotypes.

With a _really_ tight P(n | g), the prior becomes irrelevant: P(n | g)
is concentrated in a narrow band around some _g_ and is close to zero
outside it, so the likelihood dominates the posterior.

However, P(n|g) derived from Y-STRs is anything but tight, so the use
of an appropriate prior is really important.
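
To make this concrete, here is a minimal sketch in Python of the
posterior calculation on a grid of g values. It is an illustration
under stated assumptions, not anyone's actual TMRCA method. The
assumptions are all mine: n is Poisson with mean 2 * MU * markers * g
(two lineages, no back-mutation), MU = 0.002 per marker per generation
is a placeholder rate, and the two priors (FLAT, and RECENT, which
decays with a 50-generation scale) are arbitrary stand-ins for
"population history".

```python
import math

def poisson_pmf(n, lam):
    """P(n | lam) for a Poisson count, evaluated in log space."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def posterior(n, rate_per_gen, prior):
    """Return P(g | n) for g = 1..len(prior).

    Bayes: P(g | n) = P(n | g) P(g) / P(n), normalised over the grid.
    """
    unnorm = [poisson_pmf(n, rate_per_gen * g) * p
              for g, p in enumerate(prior, start=1)]
    total = sum(unnorm)  # proportional to P(n); priors need not sum to 1
    return [u / total for u in unnorm]

def posterior_mean(post):
    """Posterior mean of g, in generations."""
    return sum(g * p for g, p in enumerate(post, start=1))

G_MAX = 2000   # grid cutoff in generations (assumption)
MU = 0.002     # hypothetical mutation rate per marker per generation
FLAT = [1.0] * G_MAX                                          # uninformative
RECENT = [math.exp(-g / 50.0) for g in range(1, G_MAX + 1)]   # favours recent MRCA

def compare(markers, n):
    """Posterior mean of g under the flat and the recent-ancestry prior."""
    rate = 2 * MU * markers  # expected mutations per generation, two lineages
    return (posterior_mean(posterior(n, rate, FLAT)),
            posterior_mean(posterior(n, rate, RECENT)))

if __name__ == "__main__":
    # A loose likelihood (12 markers) versus an artificially tight one.
    for markers, n in [(12, 1), (1000, 160)]:
        flat_mean, recent_mean = compare(markers, n)
        print(f"{markers} markers, {n} mutations: "
              f"flat prior {flat_mean:.1f}, recent prior {recent_mean:.1f}")
```

Under these assumptions the 12-marker case gives clearly different
posterior means for the two priors, while the (unrealistic)
1000-marker case gives nearly the same answer under both, which is the
"likelihood dominates" regime.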

On Sat, Nov 20, 2010 at 7:42 PM, James Heald <> wrote:
> This issue has implications in TMRCA estimation.  People sometimes use
> Frequentist confidence intervals to give a range estimate of potentially
> plausible TMRCAs -- ie they find the TMRCA for which obtaining n
> mutations or fewer would only occur with a probability of 0.05
> But, as above, this is *not* the same as being able to say that the
> actual TMRCA would be less than that value in 95% of the cases that n
> mutations or fewer were observed.
> Nor can one say that a person making the pronouncement that the TMRCA
> was less than that value has a precisely one in twenty chance of being
> wrong.

