GENEALOGY-DNA-L Archives



From: "Alister John Marsh" <>
Subject: Re: [DNA] age estimations based upon ASD-calculations- "Baby SNPs"
Date: Mon, 11 Dec 2006 18:08:19 +1300
In-Reply-To: <019201c71cc2$0aa38430$6401a8c0@Precision360>


Lawrence,

You mentioned the "baby SNPs". I am rather fond of the very slow STRs, the
"baby SNPs", and I have looked a little at some of the slower STRs.

ERRORS IN TRANSCRIBING TO Y-SEARCH: Often the informative markers are rare
off modal values on slow markers. It should be noted that lab errors might
be low, but transcription errors by customers copying results to Y-Search
might be higher, and they matter most for very rare off modal values on slow
markers. Say that for a marker, 999 in 1,000 people have a score of 12, and
1 in 1,000 has a score of 11. If there was a 1% transcription error rate by
customers copying results to Y-Search, about 10 of the 12s would be
mis-reported, likely about 5 as 11 and 5 as 13. The one who was really 11
has about a 99% chance of being reported correctly. So in this case there
might hypothetically be 6 reported as 11, of whom only 1 would actually be
11, and 5 should have been 12. Those reported as 12, by contrast, would
almost all actually be 12s. When dealing with particularly rare marker
scores, I find quite a few verifiable errors, and often suspect errors which
I can't prove. This does not invalidate the usefulness of rare marker
scores, but you have to be aware of the possibility of errors.
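
The arithmetic above can be sketched in a few lines of Python. This is only
an illustration of the hypothetical example: the function name, the 50/50
split of errors between one step down and one step up, and the idea that
errors are independent are all assumptions, not measured properties of
Y-Search data.

```python
def expected_reports(n_total=1000, n_true_rare=1, error_rate=0.01):
    """Expected Y-Search entries showing the rare score, split by origin.

    Assumes (as in the example above) that mis-copied modal scores land
    one step down or one step up with equal probability.
    """
    n_modal = n_total - n_true_rare
    false_rare = n_modal * error_rate / 2       # true 12s mis-entered as 11
    true_rare = n_true_rare * (1 - error_rate)  # true 11s entered correctly
    share_genuine = true_rare / (false_rare + true_rare)
    return false_rare, true_rare, share_genuine


false_rare, true_rare, share_genuine = expected_reports()
print(f"expected false 11s: {false_rare:.2f}")
print(f"expected true 11s:  {true_rare:.2f}")
print(f"share of reported 11s that are genuine: {share_genuine:.0%}")
```

Under these assumptions only about one in six of the reported 11s is
genuine, which is why a rare score is worth confirming with the person
before building on it.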

GUT FEELING: ....Don't tell anybody... but I use "gut feeling" a bit when
looking at very rare off modal scores. In small haplogroups, I often feel
that very rare off modal values on slow markers are informative, and I am
inclined to think two haplotypes with the rare score could be connected.
But then if you move to undifferentiated R1b, with a huge pool of
haplotypes, independent mutations to very rare values are more likely to
occur. But as you carve up R1b, and say just look at R1b1c6, a smaller
unit, sharing rare markers within that smaller subgroup becomes more
interesting again.

PAIRS OF OFF MODAL SLOW MARKERS: A very rare marker score by itself can be
interesting. But things get more interesting if two people share two or
more rare off modal values. I start thinking of "marker scores" as rare if
roughly less than 1% "of a particular haplogroup" have that score on a slow
mutating marker. I think you have to consider rarity at a haplogroup level,
rather than when considering a mixed pool of haplogroups.
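
As a rough sketch of checking rarity per haplogroup rather than in a mixed
pool, something like the following could be used. The record layout, the
marker name "slow_marker", the haplogroup labels "A" and "B", and all the
counts are invented for illustration; only the rough 1% threshold comes
from the paragraph above.

```python
def score_frequency(records, marker, score, haplogroup=None):
    """Fraction of records with `score` on `marker`.

    If `haplogroup` is given, only records from that haplogroup are counted.
    """
    pool = [r for r in records
            if haplogroup is None or r["haplogroup"] == haplogroup]
    if not pool:
        raise ValueError("no records in the selected pool")
    return sum(1 for r in pool if r[marker] == score) / len(pool)


# Invented records: score 11 is rare in group "A" but common in group "B".
records = (
    [{"haplogroup": "A", "slow_marker": 12}] * 498
    + [{"haplogroup": "A", "slow_marker": 11}] * 2
    + [{"haplogroup": "B", "slow_marker": 11}] * 30
    + [{"haplogroup": "B", "slow_marker": 12}] * 70
)

RARE_THRESHOLD = 0.01  # the rough 1% rule of thumb from the text

freq_a = score_frequency(records, "slow_marker", 11, haplogroup="A")
freq_mixed = score_frequency(records, "slow_marker", 11)
print("within A:  ", freq_a, "rare" if freq_a < RARE_THRESHOLD else "not rare")
print("mixed pool:", freq_mixed,
      "rare" if freq_mixed < RARE_THRESHOLD else "not rare")
```

In this made-up data the score counts as rare inside group "A" but not in
the mixed pool, which is the reason for judging rarity one haplogroup at a
time.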

RARE MARKER SCORES ARE OFTEN RECENT MUTATIONS: It surprised me at first,
but when dealing with very rare marker scores, I found that more often than
not they are recent (recent = say less than 1,000 years). Clearly, if they
were old, they would not be rare. Also, it is sometimes said that more
males are currently living than ever lived in all human history. If that
is the case, more than 50% of opportunities for a very rare mutation to
occur are in persons living today. That will in itself create a trend
towards finding that most rare mutations are recent, rather than old.

DO DETECTIVE WORK ON INDIVIDUAL HAPLOTYPES SHARING RARE MARKERS: If I found
that on a database of 10,000 there were 10 with a very rare marker score on
a very slow marker, quite often by looking individually at the haplotypes
concerned, I can eliminate a number as recent. For example, one of my
rare mutations also occurs in one of the type 3 Irish group. Because it
only occurs in one or two of the type 3 Irish group, it is a good bet that
it is a recent mutation in that cluster. Another occurs in a person, but
not in his close relations, so this must be recent in that family. If you
can eliminate say half the 10 this way, then you are down to a short list of
5. Then I look to see if there are many mismatches on other very slow
markers. If between two haplotypes sharing a very rare marker score there
were a lot of mismatches on very slow markers, I would tend to think the
chances of relationship drop a bit. If a marker score is rare, it is likely
not very old, so it would be unusual to find lots of mismatches on very slow
markers between two haplotypes that genuinely share it. Generally, you look
harder at a possible connection if the mutation difference is not too huge.
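
The last step, counting mismatches on other very slow markers for each
haplotype left on the short list, might be sketched like this. The marker
names ("slow1" to "slow4"), the haplotype labels and all the scores are
placeholders made up for the example, and which markers count as "very
slow" is something you would have to decide yourself.

```python
def slow_marker_mismatches(hap_a, hap_b, slow_markers):
    """Return the slow markers, tested in both haplotypes, where they differ."""
    return [m for m in slow_markers
            if m in hap_a and m in hap_b and hap_a[m] != hap_b[m]]


# Placeholder data: marker names and scores are invented for illustration.
slow_markers = ["slow1", "slow2", "slow3", "slow4"]
candidate = {"slow1": 11, "slow2": 13, "slow3": 8, "slow4": 12}
shortlist = {
    "hap_X": {"slow1": 11, "slow2": 13, "slow3": 8, "slow4": 12},
    "hap_Y": {"slow1": 11, "slow2": 14, "slow3": 9, "slow4": 10},
}

# Fewest slow-marker mismatches first: these are the ones to look harder at.
ranked = sorted(
    shortlist,
    key=lambda name: len(
        slow_marker_mismatches(candidate, shortlist[name], slow_markers)
    ),
)
for name in ranked:
    diffs = slow_marker_mismatches(candidate, shortlist[name], slow_markers)
    print(name, len(diffs), diffs)
```

This only orders the short list; it does not replace looking at where the
mismatches fall, since a single event can shift several markers at once.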

GENETIC DISTANCE OF 18 ON 37 MARKERS: You mention this in your example. As
there is at least one recorded case with 11 mutations on 37 markers between
father and son, where recLOH has occurred, you have to look at where the
mutations are occurring. I would not take 18 mutations as proof that two
people sharing a rare marker score are unrelated, without looking at where
the mutations have occurred.

OLD MUTATIONS ON VERY SLOW MARKERS: There are a few very old mutations on
very slow markers, such as DYS492=13 in R1b, which is a good pointer to
R1b1c9, or DYS455=8 which indicates I1a in I haplogroup. These are very
useful. The fact that these slow markers are such good indicators, is proof
that very slow markers are to a degree "baby SNPs".

MY DREAM PANEL OF 100 SUPER SLOW MARKERS: Initially when test companies
were selecting markers to offer, there seemed a trend to favour faster
mutating markers. I think that the very slow mutating markers were regarded
as too slow to be useful, as the focus was the very recent "genealogical
time frame". But I think there is a lot of interest now from people prepared
to think a little "deeper". Evidence of that is the number of people who
have taken SNP tests. I suspect that there are a lot of untapped super
slow STR markers "out there". As testing prices reduce with improved
technology, I would like to see a test company offer a panel of 100 super
slow STR markers. If an average mutation rate was say 0.0002, that is about
one mutation per 2,000 years in that panel, with not too many back mutations
since Y-DNA Adam to confuse things. If that enabled us to roughly group
people into 2000 year old clades on average, then the faster markers would
be pointers to more recent tree structure. To use SNPs to get people into
2,000 year old clades, might take millions of SNPs, which would be a
nightmare to keep track of, and test for. But 100 very slow STRs could be
relatively cheaply tested for, and only 100 sets of primers would be needed,
not millions of sets of SNP primers.
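
The panel arithmetic can be checked with a short calculation. This sketch
assumes the 0.0002 figure is a rate per marker per generation (the post
does not state the unit) and follows a single line of descent; the function
name and the generation lengths tried are my own choices for illustration.

```python
def years_per_panel_mutation(n_markers=100, rate_per_marker=0.0002,
                             years_per_generation=30):
    """Average years between mutations anywhere in the panel.

    Assumes `rate_per_marker` is per marker per generation and that the
    markers mutate independently along one line of descent.
    """
    panel_rate = n_markers * rate_per_marker  # expected mutations per generation
    generations_per_mutation = 1 / panel_rate
    return generations_per_mutation * years_per_generation


for gen_len in (25, 30, 40):
    years = years_per_panel_mutation(years_per_generation=gen_len)
    print(f"{gen_len} years/generation: about {round(years)} years per mutation")
```

With these assumptions the panel averages one mutation per 50 generations,
which is roughly 1,250 to 2,000 years depending on the generation length
used, the same order of magnitude as the figure above.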

STR CROSS OVER POINT TO "BABY SNP": I think that the mutation rate of the 4
markers you mentioned, 0.0002 or less, is a good order of magnitude to think
of. I am sure there must be 100 markers which could be tested with mutation
rates in this range.

John.


