
From: "G. Magoon" <>
Subject: Re: [DNA] New results for TMRCA of Y-Haplogroups - based on 1000Genomes Project data
Date: Wed, 2 May 2012 21:01:11 -0400

Terry, very interesting analysis. Would you be able to elaborate on how you
obtained and processed the SNP/variant results? In particular, did you use
one of the (Phase I?) chrY VCF files produced by the 1000 Genomes Project, or
did you do your own variant calling on their data? How do you treat the
spurious heterozygous results? Is there any filtering for spurious
variants? Did you attempt any sort of correction to account for the fact
that the reference sequence is not the ancestral sequence (and the
associated ancestral/derived confusion)? Also, you mentioned the mutation
rate you used in answer to Ken, but I don't think I've seen mention of the
number of nucleotide sites... was this assumed to be the same from sample to
sample? Sorry for the barrage of questions here, but I think these are some
of the key issues that may need to be considered as you refine your
analysis. I think what you've done here is at the very least a promising
On May 1, 2012 8:22 PM, "Terry" <> wrote:

> Some new TMRCA results for Y-haplogroups, *without* relying on STR methods,
> are shown as a tree in Update 10 at the bottom of the webpage:
> The time to most-recent-common-ancestor (TMRCA) for the Y-chromosomes of a
> set of males is often estimated by methods based on STR values. If the
> individual mutation rates are known for all the STR markers used, along
> with a model for how the mutations occur and other details, then the TMRCA
> can be computed, with an uncertainty that depends on the STR markers that
> are used.
> There are of course issues related to the actual STR mutation rates, and
> also to the precise methodology that is used. Either way, the error bars
> are typically quite broad for STR methods that estimate TMRCA.
> What would be helpful is to have some independent method for determining
> TMRCA, and that is what is shown at the bottom of the link above.
> With full sequence genomes available for a large number of people in the
> 1000 Genomes Project, one can use that data to estimate the TMRCA of those
> particular Y-chromosomes that have been sequenced.
> Just count the number of nucleotide differences between any two of those
> Y-chromosomes, and use a nucleotide mutation rate, and that will give you
> an estimate for the TMRCA of those two Y-chromosomes. This method should in
> principle be quite accurate, although it does depend critically on knowing
> the Y-chromosome nucleotide mutation rate.
> As of late last month, I extracted a sample of 526 men, with the
> Y-chromosome well sequenced, from the 1000 Genomes Project.
> From that data, I then computed the tree - as shown in the link above - of
> all the Y-haplogroups represented by those men sequenced in the 1000
> Genomes Project. In addition, I used the Y-chromosome nucleotide mutation
> rate adopted by Cruciani et al (in The American Journal of Human Genetics,
> June 2011), to add a timeline from which one can read the TMRCA of any
> branch in that computed tree.
> Uncertainty ranges can be computed for each of the branch times. But the
> main uncertainty is that the Y-chromosome nucleotide mutation rate is not
> particularly well known. Despite that uncertainty, the preliminary results
> do look very plausible.
> Terry
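
The pairwise counting method Terry describes (count the nucleotide
differences between two Y-chromosomes, then divide by a mutation rate) can
be sketched roughly as follows. This is only an illustration under stated
assumptions, not Terry's actual procedure: the function names, the use of
None for an unusable call, the example numbers, and the placeholder rate of
1e-9 per site per year are all hypothetical. The sketch takes the number of
callable sites as an explicit input, and the interval is a simple normal
approximation to Poisson counting noise that ignores the uncertainty in the
mutation rate itself.

```python
import math
from typing import Optional, Sequence, Tuple


def count_differences(calls_a: Sequence[Optional[str]],
                      calls_b: Sequence[Optional[str]]) -> int:
    """Count sites where both samples have a usable call and the calls differ.

    None marks a site with no usable call (hypothetical convention, e.g. a
    site dropped as a spurious heterozygous result).
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("both samples must cover the same list of sites")
    return sum(1 for a, b in zip(calls_a, calls_b)
               if a is not None and b is not None and a != b)


def tmrca_years(differences: float, callable_sites: int,
                rate_per_site_per_year: float) -> float:
    """Estimate TMRCA in years from a pairwise difference count.

    Mutations accumulate along both lineages since the common ancestor,
    hence the factor of 2 in the denominator.
    """
    return differences / (2.0 * callable_sites * rate_per_site_per_year)


def tmrca_interval(differences: int, callable_sites: int,
                   rate_per_site_per_year: float,
                   z: float = 1.96) -> Tuple[float, float]:
    """Rough interval from counting noise only (normal approx. to Poisson)."""
    half_width = z * math.sqrt(differences)
    low = max(differences - half_width, 0.0)
    high = differences + half_width
    return (tmrca_years(low, callable_sites, rate_per_site_per_year),
            tmrca_years(high, callable_sites, rate_per_site_per_year))


if __name__ == "__main__":
    # Toy calls at four variant sites; the third site is unusable in sample A.
    sample_a = ["A", "C", None, "T"]
    sample_b = ["A", "G", "T", "T"]
    print(count_differences(sample_a, sample_b))

    # Hypothetical inputs: 100 differences over 10 million callable sites.
    print(tmrca_years(100, 10_000_000, 1e-9))
    print(tmrca_interval(100, 10_000_000, 1e-9))
```

With these made-up inputs the point estimate comes out at about 5000 years.
Note that the estimate scales inversely with both the rate and the number
of callable sites, so a sample with fewer well-covered sites would need its
own site count rather than a shared one.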
