**GENEALOGY-DNA-L Archives**

From: Charles <>
Subject: Re: [DNA] Genetic Distance calculation method -- which method is most correct?
Date: Thu, 20 Nov 2003 20:59:04 -0500
References: <OF1A2B197A.A0F0DD49-ON85256DE3.007525EA-85256DE3.0078EBB1@downstate.edu> <REME20031119183221@alum.mit.edu>
In-Reply-To: <REME20031119183221@alum.mit.edu>

John,

In various posts I have seen you state:

wrote:

> You have to square the differences

>

> John Chandler

I asked Bennett of FamilyTreeDNA.com why they calculate the Genetic

Distance in their website tables by simply adding up the

differences for the various markers while others such as you say that

the calculation should be made using the "sum of the squares of the

differences". Bennett forwarded my question to Dr. Walsh, their

scientific advisor on such matters, who replied as follows. Bennett

granted permission to post Dr. Walsh's reply to the List.

----------------------------------------------------------------

Bennett:

Here's a quick response:

What is the "equivalent" number of single mutations for individuals off

by two? Microsatellites mutate by the stepwise mutation model, wherein a

mutation can move the number up one or down one. Hence, a rather formal

statistical model has to be used to account for the actual number of

mutations when individuals differ by (say) two mutations. Analysis of

this model, which is straightforward but a little complex (it involves

type II Bessel functions; feel free to email me for details), shows that

the expected number of actual mutations for individuals that are off by

two is roughly 2.1.

Hence, the correct equivalent number of single mutations is essentially

2, not 2*2 = 4.

--------------------------------------------------------------------
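
To make the model in the reply above concrete, here is a minimal Python sketch of one way the expected number of actual mutations could be computed under a symmetric stepwise model. It is not Dr. Walsh's own calculation. It assumes that the total number of mutations separating the two individuals at one marker is Poisson with mean `lam` (a hypothetical parameter standing for mutation rate times elapsed time, which the reply does not specify) and that each mutation is +1 or -1 with equal probability. The function names are made up for illustration.

```python
import math

def joint_weight(n, k, lam):
    """Probability of n total mutations AND a net difference of k repeats.

    Assumes n ~ Poisson(lam) and each mutation is +1 or -1 with
    probability 1/2 (symmetric stepwise model, an assumption here).
    """
    if n < abs(k) or (n - k) % 2:
        return 0.0  # a net difference of k needs at least |k| steps, same parity
    ups = (n + k) // 2  # number of +1 steps needed to end at k
    poisson = math.exp(-lam) * lam**n / math.factorial(n)
    return poisson * math.comb(n, ups) / 2**n

def expected_mutations(k, lam, n_max=60):
    """Expected total number of mutations given an observed difference k.

    The sum is truncated at n_max, which is ample for small lam.
    """
    weights = [joint_weight(n, k, lam) for n in range(n_max + 1)]
    total = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / total

# Observed difference of two repeats, with a hypothetical lam of 0.5
print(expected_mutations(2, 0.5))
```

Under these assumptions the result stays close to 2 when `lam` is small and grows as `lam` increases, so the exact figure depends on the value chosen for `lam`.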

In Dr. Walsh's example of a 2 step distance for one marker the answer of

"simple summing of differences" yields 2 for Genetic Distance and that

answer is a lot closer to 2.1, which he says is the correct answer using

"type II Bessel functions" as compared to the squaring method, which

yields 4. Apparently Dr. Walsh, the scientific advisor for such

calculations at FamilyTreeDNA.com, does not agree with using the "sum

of the squares of the differences" method of calculating the Genetic

Distance.
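
For readers who want to see the two calculations side by side, here is a minimal Python sketch. The haplotypes are made-up repeat counts, not real results, and the function names are hypothetical. The sketch also ignores the special handling that FamilyTreeDNA.com applies to certain markers.

```python
def sum_of_differences(a, b):
    """Genetic distance as the simple sum of per-marker differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def sum_of_squared_differences(a, b):
    """Genetic distance as the sum of squared per-marker differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical repeat counts at four markers for two individuals
hap_a = [13, 24, 14, 11]
hap_b = [13, 22, 14, 12]

print(sum_of_differences(hap_a, hap_b))          # 0 + 2 + 0 + 1 = 3
print(sum_of_squared_differences(hap_a, hap_b))  # 0 + 4 + 0 + 1 = 5
```

The two methods agree whenever every marker differs by at most one step. They diverge only when a marker is off by two or more, which is exactly the case in Dr. Walsh's example.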

John, I am not a mathematician and thus I cannot debate this issue

directly with you. But I throw it out for debate in this forum, to be

tackled by others with sufficient mathematical skills to

debate the pros and cons of the method used by FamilyTreeDNA.com to

calculate Genetic Distance, which is basically a "simple summation of

the differences" (with special attention to certain special markers to

avoid duplications in counting differences), as opposed to the "sum of

the squares of the differences" method which you have advocated in this

forum. Is there a right or wrong on this issue? Let's

hear from the math and physics majors. :-)

Charles

