
From: James Heald <>
Subject: Re: [DNA] P value (was chances are, it's wrong)
Date: Tue, 30 Nov 2010 15:10:31 +0000

On 30/11/2010 05:13, John Chandler wrote:
> the
> statement "P (theta within interval | data) = 95%" is *precisely* the
> definition of the 95% CI.

Popular myth, and true in some special situations, but in general *false*.

The definition of the 95% CI is this: a procedure for generating an
interval estimate for the unknown parameter generates a *confidence
interval* if, when you repeat the experiment a large number of times
with the same value of the unknown parameter, the interval generated
contains that parameter value in 95% of those repeated trials.
So, in the simplest case of a single parameter, with data that can be
summarised in a single sufficient statistic D, you set the right-hand
end theta_max of your CI so that the cumulative probability of
obtaining some data d less than D is 2.5%:


int from 0 to D of Prob (d | theta_max) = 0.025

or equivalently,

Prob (d<D | theta_max) = 0.025
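
As a rough sketch of how that condition can be solved in practice, the
code below finds theta_max by bisection. The exponential model (mean
theta) and the names cdf_exponential and upper_limit are assumptions
made for illustration only; any sampling distribution whose
Prob (d<D | theta) falls as theta grows could be plugged in instead.

```python
import math

def cdf_exponential(D, theta):
    """Prob(d < D | theta) for an exponential distribution with mean theta."""
    return 1.0 - math.exp(-D / theta)

def upper_limit(D, cdf, tail=0.025, lo=1e-9, hi=1e9, iters=200):
    """Find theta_max with cdf(D, theta_max) = tail by bisection.
    Assumes cdf(D, theta) decreases as theta grows."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)   # geometric midpoint: theta spans many decades
        if cdf(D, mid) > tail:
            lo = mid               # theta too small: data below D still too likely
        else:
            hi = mid
    return math.sqrt(lo * hi)

D = 2.0
theta_max = upper_limit(D, cdf_exponential)
# For this particular model the condition can also be solved by hand,
# theta_max = -D / ln(0.975), which gives a check on the bisection.
print(theta_max, -D / math.log(0.975))
```

Only the sampling distribution Prob (d | theta) enters this
calculation; no prior on theta is needed or used.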

That is in general *not* the same as the interval you get calculating
the conditional probability

Prob (theta | D)

using Bayes' theorem, and then fixing theta_max so that

Prob (theta>theta_max | D) = 0.025

or equivalently,

int from theta_max to infinity of Prob (theta | D) = 0.025

There *are* situations where the two approaches coincide, for example if
theta is a location parameter, and the Bayesian prior P(theta|I) is a
flat uniform distribution; or if theta is a scale parameter, and the
Bayesian prior P(theta|I) is a Jeffreys prior, P ~ 1/theta.

These are of course very important and frequently modelled situations.
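
The scale-parameter case can be checked numerically. The sketch below
again assumes a toy model of my own choosing: one observation D from an
exponential distribution with mean theta (so theta is a scale
parameter), and a prior P(theta|I) ~ theta**(-k). The function name
posterior_tail is hypothetical. With k = 1 the posterior probability
beyond the confidence-interval theta_max should come out at 0.025; with
another prior, say k = 2, it should not.

```python
import math

def posterior_tail(D, theta_max, k, steps=100000):
    """Prob(theta > theta_max | D) for one exponential observation D with
    mean theta and prior P(theta|I) ~ theta**(-k), k > 0.
    Substituting u = D/theta turns the posterior into
    u**(k-1) * exp(-u) / Gamma(k), so the tail beyond theta_max is the
    integral of that density from u = 0 to u = D/theta_max."""
    x = D / theta_max
    h = x / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h          # midpoint rule
        total += u ** (k - 1) * math.exp(-u)
    return total * h / math.gamma(k)

D = 2.0
theta_max_ci = -D / math.log(0.975)   # from Prob(d < D | theta_max) = 0.025

jeffreys_tail = posterior_tail(D, theta_max_ci, k=1)   # prior ~ 1/theta
other_tail = posterior_tail(D, theta_max_ci, k=2)      # prior ~ 1/theta**2
print(jeffreys_tail, other_tail)
```

So in this toy case the agreement between the two kinds of interval
holds for the 1/theta prior and is lost when the prior is changed,
even though the data and the sampling model are unchanged.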

But if the dependence Prob (d | theta) is not so simple (and it is not
so simple in the estimates of TMRCA), then the two sorts of interval do
*not* in general coincide.
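
A small numerical illustration of such a mismatch follows. To be clear
about what is assumed: this is not a TMRCA model. It is a made-up toy
in which d ~ Normal(mean theta, standard deviation sqrt(theta)), so
that theta is neither a pure location nor a pure scale parameter, and
the Bayesian side uses a flat prior on theta > 0. All function names
are hypothetical.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def likelihood(D, theta):
    """Density of d = D under the toy model d ~ Normal(theta, sd=sqrt(theta))."""
    return math.exp(-(D - theta) ** 2 / (2.0 * theta)) / math.sqrt(2.0 * math.pi * theta)

def confidence_upper(D, tail=0.025):
    """theta_max with Prob(d < D | theta_max) = tail, found by bisection."""
    lo, hi = D, 100.0 * D
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if normal_cdf((D - mid) / math.sqrt(mid)) > tail:
            lo = mid           # data below D still too likely: raise theta
        else:
            hi = mid
    return 0.5 * (lo + hi)

def integrate(f, a, b, n=100000):
    """Plain midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def posterior_tail(D, theta_cut, upper_factor=20.0):
    """Prob(theta > theta_cut | D) under a flat prior on theta > 0.
    The range is cut at upper_factor * D, assumed far enough out that
    the remaining posterior mass is negligible."""
    upper = upper_factor * D
    tail = integrate(lambda t: likelihood(D, t), theta_cut, upper)
    total = integrate(lambda t: likelihood(D, t), 0.0, upper)
    return tail / total

D = 10.0
theta_ci = confidence_upper(D)
bayes_tail = posterior_tail(D, theta_ci)
print(theta_ci, bayes_tail)
```

If the two approaches agreed, bayes_tail would be 0.025. In this toy
model it comes out noticeably larger, so the Bayesian 97.5% point for
theta sits above the upper end of the confidence interval.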

This thread: