GENEALOGY-DNA-L Archives


From: "Tim Janzen" <>
Subject: Re: [DNA] R-U152 and R-L21 on the European Continent
Date: Mon, 14 Dec 2009 00:42:44 -0800


Dear Ken,
Thanks for your comments about the proper confidence intervals for
interclade TMRCA estimates. I added formulas for the 95% confidence
intervals using your methodology to my TMRCA estimator program. Below are
the TMRCA estimates for the various options of markers followed by the 95%
confidence intervals using your methodology for the node of haplogroups A
and B:

10 slow markers: 147279 (+/-158727 years)
10 slow medium markers: 78714 (+/-77695 years)
10 medium markers: 8396 (+/-9838 years)
10 medium fast markers: 21794 (+/-20655 years)
10 fast markers: 13847 (+/-12909 years)
50 markers: 35245 (+/-16156 years)
10 YHRD markers using YHRD mutation rates: 8811 (+/-10198 years)
24 slow markers: 77576 (+/-52593 years)

Below are 95% confidence intervals I included in my recent message based on
what you referred to as the "textbook expression":

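10 slow markers: +/-77533 years
10 slow medium markers: +/-32218 years
10 medium markers: +/-6175 years
10 medium fast markers: +/-6750 years
10 fast markers: +/-3479 years
50 markers: +/-3070 years
10 YHRD markers using YHRD mutation rates: +/-5657 years
24 slow markers: +/-22008 years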

The 95% confidence interval is indeed quite wide for 24 slow markers
at 52,593 years using your methodology. Also note that even though we now
have a wider confidence interval for the 50 marker panel at 16,156 years,
the result is still clearly inaccurate and there is essentially no chance
that the true TMRCA for the node of haplogroups A and B is between 19089
years and 51401 years.
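
As a quick check of the arithmetic above, here is a minimal sketch that turns the 50-marker estimate and its half-width into interval bounds, and compares the wider half-width with the textbook value listed for the same panel. The helper name `interval` is only illustrative and is not part of the TMRCA estimator program.

```python
def interval(estimate, half_width):
    """Return (low, high) bounds for estimate +/- half_width."""
    return estimate - half_width, estimate + half_width

# 50-marker panel, node of haplogroups A and B (values from this message)
low, high = interval(35245, 16156)
print(low, high)  # 19089 51401

# Ratio of the wider half-width to the textbook half-width for 50 markers
ratio = 16156 / 3070
print(round(ratio, 1))  # 5.3
```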

If we look at the node for R-U106 and R-P312 using 67-marker
haplotypes these are the TMRCA estimates for the various options of markers
followed by the 95% confidence intervals using your methodology:

10 slow markers: 7594 +/-19078
10 slow medium markers: 7514 +/-12068
10 medium markers: 3602 +/-5231
10 medium fast markers: 3417 +/-4073
10 fast markers: 3118 +/-3271
50 markers: 3740 +/-2374
10 YHRD markers using YHRD mutation rates: 4073 +/-5610
24 slow markers: 6256 +/-7540

Below are the corresponding 95% confidence intervals based on what you
referred to as the "textbook expression":

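10 slow markers: +/-17606
10 slow medium markers: +/-9954
10 medium markers: +/-4045
10 medium fast markers: +/-2673
10 fast markers: +/-1651
50 markers: +/-1000
10 YHRD markers using YHRD mutation rates: +/-3846
24 slow markers: +/-6250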

Sincerely,
Tim

-----Original Message-----
From:
[mailto:] On Behalf Of Ken Nordtvedt
Sent: Sunday, December 13, 2009 3:33 PM
To:
Subject: Re: [DNA] R-U152 and R-L21 on the European Continent


----- Original Message -----
From: "Tim Janzen" <>
> The basic formula I am using for the 95% confidence
> interval in generations is this: 2*G/SQRT(2*G*sum of the mutation rates).
odd-ball outliers of either clade.


That's the textbook expression. M = .01, G = 1000 for 30,000 year TMRCA.
It is based on the erroneous assumption that variance of variance for a
marker is 2 m(i) G.

But even with that erroneous formula (good for very young ages)
dG then is 2 * 1000 / (2*1000/100)^(1/2) = 447 generations = 13,400 years
I am assuming your sum of marker mutation rates is 1/100
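
The worked number above can be reproduced with a few lines of Python. This is only a sketch of the textbook expression as quoted; the function name is illustrative, and the 30 years per generation is the figure implied by G = 1000 for a 30,000 year TMRCA.

```python
import math

def textbook_ci_generations(g, rate_sum):
    """Textbook 95% half-width in generations: 2*G / sqrt(2*G*sum of rates)."""
    return 2 * g / math.sqrt(2 * g * rate_sum)

G = 1000            # generations, i.e. a 30,000 year TMRCA
M = 0.01            # assumed sum of marker mutation rates (1/100)
YEARS_PER_GEN = 30  # implied by G = 1000 for 30,000 years

dG = textbook_ci_generations(G, M)
print(round(dG), round(dG * YEARS_PER_GEN))  # 447 13416
```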

Since then I have published many times that the formula should be:

2*G * 1/sqrt( Sum_i 2*G*m(i) / [1 + 4*m(i)*G] )
because the variance of variance for each marker is 2m(i)G[1+4m(i)G]
and for large G grows quadratically in G, not linearly.
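
To see how much the two expressions differ, here is a minimal Python sketch that evaluates both on a hypothetical panel. The panel of 10 markers with an equal rate of 0.001 each (summing to 1/100) is an assumption made purely for illustration, as are the function names; real panels mix fast and slow markers.

```python
import math

def textbook_ci(g, rates):
    """2*G / sqrt(Sum_i 2*G*m(i)), half-width in generations."""
    return 2 * g / math.sqrt(sum(2 * g * m for m in rates))

def corrected_ci(g, rates):
    """2*G / sqrt(Sum_i 2*G*m(i) / [1 + 4*m(i)*G]), half-width in generations."""
    return 2 * g / math.sqrt(sum(2 * g * m / (1 + 4 * m * g) for m in rates))

# Hypothetical panel: 10 markers at 0.001 per generation each (sum = 1/100)
rates = [0.001] * 10

for g in (10, 1000):
    t = textbook_ci(g, rates)
    c = corrected_ci(g, rates)
    print(g, round(t, 1), round(c, 1), round(c / t, 3))
```

With these assumed rates the two results agree to within about 2% at G = 10, while at G = 1000 the corrected half-width is 1000 generations against roughly 447 for the textbook expression, a factor of sqrt(5). That matches the remark that the textbook form is only good for very young ages.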

Ken


