GENEALOGY-DNA-L Archives



From: James Heald <>
Subject: Re: [DNA] FW: Odds Are, It's Wrong - 5% of the time
Date: Sat, 20 Nov 2010 17:42:35 +0000
References: <F9C440A2-FC59-4A9E-AAAC-85DEE9D2FAB0@GMAIL.COM>,<COL115-W50D879F102DC3996D9D454A03A0@phx.gbl>,<4CE7A3C0.7050702@ucl.ac.uk>,<COL115-W1464B78AF0292D6AEFA183A03B0@phx.gbl><COL115-W5950BB2C58A31B4806036EA03B0@phx.gbl>
In-Reply-To: <COL115-W5950BB2C58A31B4806036EA03B0@phx.gbl>


On 20/11/2010 13:35, Steven Bird wrote:
>
> The Frequentist definition of a p value of 0.05 is that the person conducting the t-test (or whatever test is being used) has a one in twenty chance of being wrong, simply due to bad luck. That means that 19 out of 20 times, on average, the experimenter is right. If that is not good enough, you only accept a lower p value before rejecting the null.
>
> I realize that a CI of 95% is not identical to a p=0.05; sorry for conflating the terms.
>

No. P-value has a very precise meaning in Frequentist statistics.
It is "the probability of obtaining a test statistic at least as extreme
as the one that was actually observed, assuming that the null hypothesis
is true".
http://en.wikipedia.org/wiki/P-value

In the dog-barking example, if 95% of the time when the dog is not
hungry it does not bark, then observing the dog barking is an event with
a P-value of 0.05.

This may lead a Frequentist to "reject the hypothesis that 'the dog is
not hungry', at the 5% significance level".


Now it is not at all clear to me what you mean by "the person conducting
the test has a one in twenty chance of being wrong" -- this is the kind
of language that can cloud the issue.

But if you mean "there is a one in twenty chance that the null
hypothesis was in fact true, given that it has been rejected", that is
not what a P-value means. That would be an estimate of the probability
of the null hypothesis being true, which is not a coherent idea in
Frequentist statistics, since Frequentism regards the null hypothesis as
something either true or false, and therefore not something to which a
Frequentist probability can be ascribed.
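
To put numbers on the dog-barking example, here is a minimal sketch using
Bayes' theorem. Only the 0.05 figure (the dog barks 5% of the time when it
is not hungry) comes from the example; the chance that it barks when
hungry (0.9) and the prior chances that it is hungry are made-up values,
chosen purely for illustration.

```python
def prob_not_hungry_given_bark(p_hungry, p_bark_if_hungry,
                               p_bark_if_not_hungry=0.05):
    """Bayes' theorem: P(not hungry | bark).

    p_bark_if_not_hungry = 0.05 is the P-value from the example.
    p_hungry and p_bark_if_hungry are hypothetical inputs.
    """
    p_not_hungry = 1.0 - p_hungry
    bark_and_not_hungry = p_bark_if_not_hungry * p_not_hungry
    bark_and_hungry = p_bark_if_hungry * p_hungry
    return bark_and_not_hungry / (bark_and_not_hungry + bark_and_hungry)


# Same P-value each time; only the assumed prior chance of hunger changes.
for p_hungry in (0.5, 0.1, 0.01):
    post = prob_not_hungry_given_bark(p_hungry, p_bark_if_hungry=0.9)
    print(f"P(hungry) = {p_hungry:<5} ->  P(not hungry | bark) = {post:.3f}")
```

With these assumed inputs the probability that the null hypothesis is
true, given a bark, runs from about 0.05 up to about 0.85, even though the
P-value is 0.05 throughout. The P-value alone does not fix it.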


This issue has implications in TMRCA estimation. People sometimes use
Frequentist confidence intervals to give a range estimate of potentially
plausible TMRCAs -- i.e. they find the TMRCA for which obtaining n
mutations or fewer would only occur with a probability of 0.05.

But, as above, this is *not* the same as being able to say that the
actual TMRCA would be less than that value in 95% of the cases that n
mutations or fewer were observed.

Nor can one say that a person making the pronouncement that the TMRCA
was less than that value has a precisely one in twenty chance of being
wrong.
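
A rough numerical sketch of the difference, under assumptions that are
mine and not part of the argument above: the number of mutations is
treated as Poisson with mean RATE * t, where t is the TMRCA in
generations; RATE = 0.148 (for instance 37 markers x 0.002 per marker per
generation x 2 lineages) and n = 2 observed mutations are hypothetical
values; and the three priors on t are arbitrary choices for illustration.

```python
import math

RATE = 0.148   # hypothetical: expected mutations per generation of TMRCA
N_MUT = 2      # hypothetical: number of mutations observed


def poisson_cdf(n, mean):
    """P(K <= n) for K ~ Poisson(mean)."""
    return math.exp(-mean) * sum(mean**k / math.factorial(k)
                                 for k in range(n + 1))


def frequentist_upper_bound(n, rate, level=0.05):
    """TMRCA at which P(n mutations or fewer | t) drops to `level` (bisection)."""
    lo, hi = 0.0, 1.0
    while poisson_cdf(n, rate * hi) > level:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n, rate * mid) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def posterior_prob_below(t_bound, n, rate, prior, t_max=1000.0, steps=100_000):
    """P(t < t_bound | n mutations) on a grid, for an unnormalised prior(t)."""
    dt = t_max / steps
    below = total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        mean = rate * t
        # prior x Poisson likelihood, dropping the constant 1/n!
        weight = prior(t) * math.exp(-mean) * mean**n
        total += weight
        if t < t_bound:
            below += weight
    return below / total


t_up = frequentist_upper_bound(N_MUT, RATE)
print(f"Frequentist 95% upper bound: {t_up:.1f} generations")

priors = {
    "flat in t": lambda t: 1.0,
    "proportional to t (favours older)": lambda t: t,
    "exp(-t/20) (favours recent)": lambda t: math.exp(-t / 20.0),
}
for name, prior in priors.items():
    p = posterior_prob_below(t_up, N_MUT, RATE, prior)
    print(f"prior {name}: P(TMRCA < bound | n) = {p:.3f}")
```

In this particular Poisson model a flat prior on t happens to reproduce
0.95, but the other two priors give values below and above it. So the 95%
attached to the Frequentist bound is not, in general, the probability
that the actual TMRCA lies below that bound.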

