GENEALOGY-DNA-L Archives


From: "Tim Janzen" <>
Subject: Re: [DNA] R-U152 and R-L21 on the European Continent
Date: Sun, 13 Dec 2009 15:14:59 -0800
In-Reply-To: <00ba01ca7c3d$8cfb01f0$6400a8c0@Ken1>


Dear Ken,
Thanks for your response. As I recall, I am using the 95%
confidence interval formula that you recommended in May 2008 when we were
initially developing these programs. See your message below. You are more
of an expert on statistics than I am. I would be happy to send you a copy
of my current program and you can take a look at how the confidence
intervals are generated. The basic formula I am using for the 95% confidence
interval in generations is this: 2*G/SQRT(2*G*sum of the mutation rates).
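
As a rough illustration of that formula, here is a minimal Python sketch. The function name, the 24 equal mutation rates, and the value of G are all hypothetical, chosen only to show the arithmetic; they are not taken from the actual program.

```python
import math

def ci95_half_width(generations, mutation_rates):
    """Half-width, in generations, of the 95% interval: 2*G / sqrt(2*G*sum(m))."""
    total_rate = sum(mutation_rates)
    return 2.0 * generations / math.sqrt(2.0 * generations * total_rate)

# Hypothetical inputs, chosen only to illustrate the arithmetic.
rates = [0.002] * 24   # assumed rate per marker per generation
G = 1000.0             # assumed TMRCA estimate in generations

half_width = ci95_half_width(G, rates)
lower, upper = G - half_width, G + half_width
print(f"G = {G:.0f} generations, 95% interval roughly {lower:.0f} to {upper:.0f}")
```

Because the half-width grows like the square root of G, the interval expressed as a fraction of G narrows as either G or the summed mutation rate increases.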

Sincerely,
Tim

From: Ken Nordtvedt [mailto:]
Sent: Friday, May 23, 2008 6:56 PM
To: Tim Janzen
Subject: Re: using weighted mutation rates for calculating ages of subclades

I believe the 95 percent confidence interval due to statistics will be plus
or minus SquareRoot(2 / (G * Sum w(i) m(i))) for deltaG/G.
For the S28/S21 MRCA that's plus or minus 50 percent.
So it costs to weight the markers.
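
A small Python sketch of the weighted expression above, reading it as the square root of 2 divided by (G times the weighted sum of the rates). The rates, the weights, and G are hypothetical, and the weights are assumed to be no larger than 1; this is not the weighting scheme of the actual spreadsheet.

```python
import math

def fractional_ci95(generations, rates, weights=None):
    """Approximate 95% +/- deltaG/G, read as sqrt(2 / (G * sum(w_i * m_i)))."""
    if weights is None:
        weights = [1.0] * len(rates)   # unit weights: every marker counts fully
    weighted_sum = sum(w * m for w, m in zip(weights, rates))
    return math.sqrt(2.0 / (generations * weighted_sum))

# Hypothetical values for illustration only.
rates = [0.001, 0.002, 0.004, 0.008]
weights = [1.0, 0.5, 0.25, 0.125]   # assumed: faster markers are down-weighted
G = 500.0

unweighted = fractional_ci95(G, rates)
weighted = fractional_ci95(G, rates, weights)
print(f"unit weights: +/- {unweighted:.0%}   down-weighted: +/- {weighted:.0%}")
```

With weights below 1 the effective sum of rates shrinks, so the fractional interval widens, which is one way to read the remark that weighting has a cost. With unit weights the expression is algebraically the same as 2/SQRT(2*G*sum of the mutation rates).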

It does not buy you anything to use large sample size for the interclade
MRCA age estimate. All pairs of haplotypes are basically sharing the same
branch line with each other. Sample size is to get a diverse
characterization of the clades as a whole so you are not taking the sum of
mutations between odd-ball outliers of either clade.

See my revised version of Generations2. I installed my own iteration lines
of code. There is an instructional comments box to the right of the upper
output cells.


-----Original Message-----
From:
[mailto:] On Behalf Of Ken Nordtvedt
Sent: Sunday, December 13, 2009 1:45 PM
To:
Subject: Re: [DNA] R-U152 and R-L21 on the European Continent

I don't know what statistical confidence intervals you are using, but they
seem much narrower than I believe should be the case.

For a simple TMRCA for two haplotypes, equivalent more or less to a very
deep interclade TMRCA, the one-sigma G estimate variance stated in
fractional units of the age, itself, is:

<dG^2> / G^2 = 1 / {2G Sum_i [m(i)/(1+4m(i)G)]}
with the sum over i running over the markers.

The above is of course for markers assumed to be the simple text book
markers.

My rough estimate using your 24 slowest markers is that for 30,000-year
TMRCAs the one-sigma dG/G is .36,
while even for extremely old TMRCAs it only gets as good as .29.

95 percent confidence interval is close to twice the one sigma values.
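
The one-sigma formula above can be sketched in Python as follows. The mutation rates are hypothetical placeholders (the actual rates of the 24 slowest markers are not given in this message), so the printed values will not reproduce the .36 figure. The large-G floor does not depend on the rates: each term m(i)/(1+4m(i)G) tends to 1/(4G), so the variance tends to 2/N, and for N = 24 the square root of 2/24 is about .29.

```python
import math

def one_sigma_fraction(generations, rates):
    """One-sigma dG/G from <dG^2>/G^2 = 1 / (2G * sum_i m_i / (1 + 4 m_i G))."""
    saturated_sum = sum(m / (1.0 + 4.0 * m * generations) for m in rates)
    return math.sqrt(1.0 / (2.0 * generations * saturated_sum))

# Hypothetical rates: 24 slow markers, all given the same assumed rate.
rates = [0.0005] * 24

for G in (100.0, 1000.0, 10000.0, 1e9):
    sigma = one_sigma_fraction(G, rates)
    # The 95 percent interval is taken as roughly twice the one-sigma value.
    print(f"G = {G:>12.0f}  one-sigma dG/G = {sigma:.3f}  ~95% = {2 * sigma:.3f}")

# Large-G limit: the variance tends to 2/N regardless of the individual rates.
floor = math.sqrt(2.0 / len(rates))
print(f"limiting one-sigma dG/G for {len(rates)} markers = {floor:.3f}")
```

The loop shows the fractional uncertainty falling toward the floor as G grows, so adding age alone cannot push it below the square root of 2/N; only more markers can.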

Ken

