From: Charles <>
Subject: Re: [DNA] Genetic Distance calculation method -- which method is most correct?
Date: Thu, 20 Nov 2003 20:59:04 -0500
In various posts I have seen you state:


> You have to square the differences
> John Chandler

I asked Bennett why they calculate the Genetic
Distance in their website tables by basically adding the differences
for the various markers linearly, while others such as you say that
the calculation should be made using the "sum of the squares of the
differences". Bennett forwarded my question to Dr. Walsh, who is their
scientific advisor on such matters. He replied as follows, and
Bennett granted permission to post Dr. Walsh's reply to
the List.


Here's a quick response:

What is the "equivalent" number of single mutations for individuals off
by two? Microsatellites mutate by the stepwise mutation model, wherein a
mutation can move the number up one or down one. Hence, a rather formal
statistical model has to be used to account for the actual number of
mutations when individuals differ by (say) two mutations. Analysis of
this model, which is straightforward but a little complex (it involves
type II Bessel functions, feel free to email me for details), shows that
the expected number of actual mutations for individuals that are off by
two is roughly 2.1.
Hence, the correct equivalent number of single mutations is essentially
2, not 2*2 =4.
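
For anyone who wants to check the arithmetic, here is a minimal Python
sketch. It is not Dr. Walsh's actual calculation. It assumes that the
total number of mutations separating two people at one marker is
Poisson distributed with mean lam (a hypothetical parameter, not given
in the reply) and that each mutation is +1 or -1 with equal chance. It
then averages the total number of mutations over all the ways of ending
up exactly k steps apart. The series it sums is the one that defines
the modified Bessel functions.

```python
from math import factorial

def expected_mutations(k, lam, terms=40):
    """Expected total mutations given an observed net difference of k steps.

    Assumptions (not from the original reply): the total mutation count
    is Poisson with mean lam, and each step is +1 or -1 with equal odds.
    """
    # A net difference of k can arise from v + k steps one way and v
    # steps the other way, for v = 0, 1, 2, ... (k + 2v mutations total).
    # The relative probability of each case is
    # (lam/2)^(2v + k) / (v! * (v + k)!).
    weights = [
        (lam / 2) ** (2 * v + k) / (factorial(v) * factorial(v + k))
        for v in range(terms)
    ]
    total = sum(weights)
    return sum((k + 2 * v) * w for v, w in enumerate(weights)) / total

# lam = 0.5 is an illustrative value only.
print(expected_mutations(2, 0.5))
```

With lam = 0.5 this gives about 2.04 for people who are off by two. The
figure grows as lam grows, so the exact value depends on the mutation
rate and the number of generations assumed.
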

In Dr. Walsh's example of a two-step distance at one marker, "simple
summing of differences" yields a Genetic Distance of 2. That answer is
a lot closer to 2.1, which he says is the correct answer using
"type II Bessel functions", than the answer from the squaring method,
which yields 4. Apparently Dr. Walsh, the scientific advisor for such
calculations, does not agree with using the "sum
of the squares of the differences" method of calculating the Genetic
Distance.
John, I am not a mathematician and thus I cannot debate this issue
directly with you. But I throw it out to others in this forum with
sufficient mathematical skills to debate the pros and cons of the
method used in their website tables to calculate Genetic Distance,
which is basically a "simple summation of the differences" (with
special attention to certain special markers to avoid duplications in
counting differences), as opposed to the "summing the squares of the
differences" method which you have advocated in your posts here. Is
there a right or wrong on this issue? Let's hear from the math and
physics majors. :-)


