**GENEALOGY-DNA-L Archives**

From:

Subject: Re: [DNA] Genetic Distance calculation method -- which method is most correct?

Date: Thu, 20 Nov 2003 22:42:34 -0500 (EST)

References: <OF1A2B197A.A0F0DD49-ON85256DE3.007525EA-85256DE3.0078EBB1@downstate.edu> <REME20031119183221@alum.mit.edu> <3FBD7168.7000706@kerchner.com>

In-Reply-To: <3FBD7168.7000706@kerchner.com> (message from Charles on Thu, 20 Nov 2003 20:59:04 -0500)

Charles wrote, quoting Bennett quoting Bruce:

> What is the "equivalent" number of single mutations for individuals off

> by two? Microsatellites mutate by the stepwise mutation model, wherein a

> mutation can move the number up one or down one. Hence, a rather formal

> statistical model has to be used to account for the actual number of

> mutations when individuals differ by (say) two mutations. Analysis of

> this model, which is straightforward but a little complex (it involves

> type II bessel functions, feel free to email me for details), shows that

> the expected number of actual mutations for individuals that are off by

> two is roughly 2.1.

> Hence, the correct equivalent number of single mutations is essentially

> 2, not 2*2 =4.

Unfortunately, that result is nonsense. The example is, in fact,

simple enough to explain to the whole list and requires no Bessel

functions. Consider two individuals who have between them actually

experienced exactly two mutations relative to a common ancestor. We

can neglect the bias toward increases because we are looking at the

difference between two individuals (who both would have the same

bias). Therefore, we have two equally likely cases: either the two

mutations canceled each other out, giving an observed difference of 0,

or the two mutations reinforced, giving an observed difference of 2.

We take the (equally) weighted root-mean-square of these two values

and get an expected observed difference of 1.4 when there are exactly

two mutations. (Actually, it's 1.414213..., i.e., the square-root of

2.)
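
Written out, the equally weighted root-mean-square of the two possible observed differences (0 and 2) is simply:

```latex
\[
\sqrt{\tfrac{1}{2}\cdot 0^{2} + \tfrac{1}{2}\cdot 2^{2}}
  = \sqrt{2} \approx 1.414
\]
```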

In short, if the observed difference is 2, the expected number of

mutations has to be considerably more than 2.1. How much more?

Well, let's take a wild guess and examine the case of 4 mutations.

Obviously, it's a little more complicated if there are exactly 4

mutations, since there are three possible outcomes (observed

differences of 0, 2, and 4), and these three outcomes are not

equally likely: 37.5% for 0, 50% for 2, and 12.5% for 4. (The interested reader

can easily confirm these percentages by writing down all the

possible combinations of pluses and minuses.) Anyhow, it is easy

to see that the observed difference strongly favors the "2" case,

and a quick calculation verifies that the RMS value is exactly 2.

Therefore, the equivalent number of single mutations is, in fact,

4, as I have always said.
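
For anyone who would rather not write out the pluses and minuses by hand, here is a minimal Python sketch of that enumeration. It assumes, as in the argument above, that every mutation is a single step up or down with equal probability; the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def observed_difference_distribution(n_mutations):
    """Exact distribution of the observed difference after n one-step mutations.

    Assumption: each mutation is +1 or -1 with equal probability.
    """
    counts = {}
    # Enumerate every combination of pluses and minuses.
    for signs in product((+1, -1), repeat=n_mutations):
        diff = abs(sum(signs))
        counts[diff] = counts.get(diff, 0) + 1
    total = 2 ** n_mutations
    return {d: Fraction(c, total) for d, c in sorted(counts.items())}


def rms_difference(n_mutations):
    """Root-mean-square of the observed difference for n mutations."""
    dist = observed_difference_distribution(n_mutations)
    return sqrt(sum(p * d * d for d, p in dist.items()))


if __name__ == "__main__":
    for n in (2, 4):
        dist = observed_difference_distribution(n)
        summary = ", ".join(f"{d}: {p}" for d, p in dist.items())
        print(f"{n} mutations -> {summary}; RMS = {rms_difference(n):.4f}")
```

For 4 mutations the enumeration gives 3/8, 1/2, and 1/8 for observed differences of 0, 2, and 4, and an RMS of 2, in line with the figures quoted above.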

I don't think any more needs to be said about that.

There is, however, more to the story. The stepwise model does NOT

give a correct picture because it doesn't allow for two-step

mutations. The full and complete answer has to allow for those and

has to be informed of the exact relative probabilities of one-, two-,

and three-step mutations (which we do not know with any precision).

On the other hand, since we ALSO do not know the absolute rate of

one-step mutations, this whole aspect can be swept under the same rug,

and we can define something called the "effective" mutation rate which

includes the small contributions of double and triple mutations. In

the end, then, we have the same simple rule of summing the squares of

the differences, and just a little footnote attached to the

still-fuzzy average mutation rate.
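
As a rough illustration of the sum-of-squares rule, the sketch below compares it with a plain count of steps for two haplotypes. The function names and the repeat counts are hypothetical and chosen only to show the arithmetic; they are not taken from any real test results.

```python
def sum_of_squares_distance(haplotype_a, haplotype_b):
    """Sum of squared per-marker differences between two haplotypes."""
    if len(haplotype_a) != len(haplotype_b):
        raise ValueError("haplotypes must cover the same markers")
    return sum((a - b) ** 2 for a, b in zip(haplotype_a, haplotype_b))


def step_count_distance(haplotype_a, haplotype_b):
    """Plain sum of absolute per-marker differences, for comparison."""
    if len(haplotype_a) != len(haplotype_b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(abs(a - b) for a, b in zip(haplotype_a, haplotype_b))


if __name__ == "__main__":
    # Hypothetical repeat counts at four markers (illustration only).
    person_a = [13, 24, 14, 11]
    person_b = [13, 22, 14, 12]
    print("sum of squares:", sum_of_squares_distance(person_a, person_b))
    print("step count:   ", step_count_distance(person_a, person_b))
```

Here the marker that is off by two contributes 4 to the sum of squares but only 2 to the step count, which is exactly the point in dispute.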

John Chandler

**This thread:**

- [DNA] Mutation rate and distant ancestors by "Nicholas Penington" <>
- Re: [DNA] Mutation rate and distant ancestors by
- Re: [DNA] Genetic Distance calculation method -- which method is most correct? by Charles <>
- Re: [DNA] Genetic Distance calculation method -- which method is most correct? by "Nicholas Penington" <>

- [DNA] Genetic Distance calculation -- message from Bruce Walsh. He asked me to post it to the list by Charles <>

- Re: [DNA] Genetic Distance calculation -- which method is best by
- Re: [DNA] Genetic Distance calculation - Comments re MacGregor and a further question by "Richard McGregor" <>

- Re: [DNA] Genetic Distance calculation - Comments re MacGregor and a further question by (VON HAMRICK)

- Re: [DNA] Genetic Distance calculation method -- which method is most correct? by Charles <>

- [DNA] mutation rate and distant ancestors correction by "Nicholas Penington" <>

- RE: [DNA] Mutation rate and distant ancestors by "Mike Harper" <>

- [DNA] Re:Mutation rate and distant ancestors by "Nicholas Penington" <>

- Re: [DNA] Mutation rate and distant ancestors by