From: David Faux <>
Subject: [DNA] Minority African and Native American Segments: Eurogenes and RHH mapping
Date: Tue, 30 Nov 2010 10:36:12 -0800


One of my greatest frustrations in autosomal genetic genealogy has been the
inability, to date, of available tools to “find” smaller but valid segments
from “minority” (for me) ancestors born in the 1700s. While one might brush
this off as expecting too much, it seems that with the technology (and new
reference samples) made available in the past couple of years it should be
do-able. I fully understand the concept of ancestors falling off the
genetic tree with time due to recombination, but have always believed that
some or all were still present, but in an unrecognized form.

My genealogy is crystal clear, all British and German with the exception of
a well-documented Mohawk ancestor (I can tell you her clan, League family,
and Native name) born in the mid 1700s; and an African ancestor in two lines
strongly suggested by circumstantial evidence (called a “Mullatto” by none
other than Sir William Johnson). Both were from Canajoharie NY, and the
former came to Canada (Six Nations Reserve) at the end of the Revolution.

Many (perhaps 25%) of descendants of George Fry and wife, who arrived in MA
in the 1630s, carry the AFAP mutation embedded in a conserved segment of
chromosome 5, which is a minimum of 7.17 Mb (5.43 cM). Hence segments of at
least this length can persist through 15 or so generations, from ancestors
who each contributed 1/32,768, or about 0.00003, of the genome. See:

So with an Ancestry Painting of 100% European, it would be easy to accept
the belief that African or Native American ancestors, even those half as far
back as the Fry family example, have been entirely removed from the genome
by recombination. However, a problem here is that recombination can result
in small segments that fall just under the detection threshold and so are not
recognized as such, or ones that have irregularities, such as the insertion
of a sliver of European and/or a SNP-poor region, resulting in some
fragmentation. So, it is entirely
possible that evidence of an ancestor sits “contentedly” in the genome, but
remains essentially invisible – unless more sophisticated testing is done.
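
As a rough illustration of the arithmetic above, the short Python sketch below computes the average share of the autosomal genome expected from a single ancestor n generations back, assuming simple halving each generation. The function name is made up for this example, and the figure is only an expectation: because of recombination, the amount actually inherited from any one distant ancestor varies and can be zero.

```python
from fractions import Fraction

def expected_fraction(generations_back: int) -> Fraction:
    """Average autosomal share from one ancestor n generations back (1/2**n)."""
    return Fraction(1, 2 ** generations_back)

# 15 generations matches the Fry family example; 7 and 8 are roughly
# "half as far back" (an assumption made for illustration).
for n in (7, 8, 15):
    share = expected_fraction(n)
    print(f"{n:2d} generations back: 1/{share.denominator} (about {float(share):.5f})")
```

For 15 generations this reproduces the 1/32,768 figure quoted for the Fry family example.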

It seems clear that African segments are more easily recognized than Native
American in an otherwise European genome since there are many more markers
that distinguish the former. There is more sharing of alleles between
Europeans and Asians as well as Native Americans, than there is between
Europeans and Africans. See the McGinnis et al., 2010 article:

Ancestry Painting as well as deCODEme Ancestry Origins still use only 3
reference HapMap groups, but talented members of this community have found
ways to introduce all 52 from the CEPH-HGDP panel, and recently many others
(e.g., Athabaskan / Na-Dene) from the academic pool, or have created their
own populations from data submitted by, for example, Poles or Finns. Hence
Doug McDonald, David W. (Eurogenes Project) and Dienekes (Dodecad Project)
are forging ahead and also using a variety of algorithms or methods to look
at admixture or affinity by comparing someone’s genome to this entire group.

We now have available whole genome analyses using diverse methods, admixture
analyses for groups from K=4 to K=36 (with some consolidation of the larger
number of populations into a broader category such as North Eurasian), and
segment analyses which are expanded versions of Ancestry Painting.

Returning to the search for the distant African and Native American
ancestors: Doug, for example, was able to provide a 1.2% NA “out of the
noise” categorization for a cousin closer than myself to the Mohawk ancestor,
although my own results were buried in the noise. The three NA segments for
my cousin identified on the segment analysis were not strong enough to
withstand the rigors of various analyses, so they remain tentative. He did
find an African block for myself at the p end telomere of chromosome 16 –
something in the 7 to 10 Mb range. An East Asian segment disappeared when
he added in the Middle East and South Asian populations (someone from NW
Europe would have 4.5 and 8.5% respectively of these groups as part of the
European structure) – so this helps to separate the chaff from the grain. He
also identified a prominent African block for my wife (Ozark ancestry) on
chromosome 8 and a large Native American segment on chromosome 9.

Now, along comes DavidW (Davidski, Polako) who takes the above study by
McGinnis entitled, “Visualizing Chromosome Mosaicism and Detecting Ethnic
Outliers by the Method of ‘Rare’ Heterozygotes and Homozygotes (RHH)”, and
applies it to participants. Here alleles that are very rare (often absent) in
a western European population are displayed in a chromosome array as
hash marks beside the specific location. At present the method does not
automatically identify the allele as, say, most likely African. The only way
to determine this is to look up the rs number in the amazing database
SPSmart, which shows pie diagrams for the
percentages in all world regions (which can be broken down further). It may
show for example that zero Europeans have the T allele, but that 30% of
Native Americans and 5% of East Asians show this variant. After sending out
location charts and mosaic segment charts for each participant, David
decided, after locating previously hidden African ancestry in two of the
CEPH-HGDP Sardinians, to amend the criteria for rare alleles. He decided to
try an experiment and sent new mosaic displays and associated rs numbers,
with chromosomal locations for each red heterozygous allele, to some who had
been in contact about the matter. He modeled the new approach after his
Sardinian study. The instructions were
simple: look for block-like structures. This could be done visually, via a
clustering of hash marks with large blank areas on either side, or by
scanning the position numbers for the array of rs numbers for each
chromosome, which was sent in a table.
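
The table-scanning step can be roughly automated. The following Python sketch is only an illustration, not the procedure DavidW used: it groups marker positions into clusters separated by large gaps and discards isolated markers. The function name, the 500 kb gap, the minimum of four markers, and the positions themselves are all assumptions made up for the example.

```python
def find_blocks(positions, max_gap=500_000, min_markers=4):
    """Group base-pair positions into clusters of nearby markers.

    Consecutive markers no more than max_gap apart stay in one cluster;
    clusters with fewer than min_markers are treated as strays and dropped.
    Returns a list of (start, end, marker_count) tuples.
    """
    blocks, current = [], []
    for pos in sorted(positions):
        # A large gap closes the current cluster and starts a new one.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_markers:
                blocks.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_markers:
        blocks.append((current[0], current[-1], len(current)))
    return blocks

# Hypothetical positions (bp) of rare heterozygous alleles on one chromosome.
rare_positions = [1_200_000, 1_350_000, 1_420_000, 1_610_000, 1_700_000,
                  25_000_000, 61_300_000, 98_750_000]
print(find_blocks(rare_positions))
```

With these invented positions, the five markers near the start of the chromosome form one candidate block and the three isolated markers are dropped as strays. Any block found this way is only a candidate and would still need the kind of follow-up checks described in this post.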

Although I had a few hundred rare alleles, it quickly became apparent that
most were “strays”. With what was admittedly a lot of work, I was able to
identify an apparent African segment on the p telomere of chromosome 16
(noted from Doug’s work and the original RHH chart). Also found was a
likely Native American segment on chromosome 2. Interestingly these were
the two which David had flagged using his own methods. What is also
interesting is that each contained an embedded AIM (ancestry informative
marker) cluster of 5 SNPs. For example, the chromosome 2 segment included 5
very closely spaced SNPs (e.g., all within, say, 300 kb) at which zero
Europeans and zero Africans, but varying percentages of Native Americans,
carried my rare allele. The same effect was seen for the “African” segment
on chromosome 16. These clusters seemed to flag these regions.
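
The frequency check described above can be sketched in Python. This is a toy illustration only: the SNP labels and every frequency value are invented, the function name is hypothetical, and in practice the numbers would come from looking up each rs number by hand.

```python
# Hypothetical frequencies of the rare allele at five closely spaced SNPs plus
# one stray. The SNP labels and all values are invented for illustration.
allele_freqs = {
    "snp_1": {"European": 0.00, "African": 0.00, "Native American": 0.30},
    "snp_2": {"European": 0.00, "African": 0.00, "Native American": 0.12},
    "snp_3": {"European": 0.00, "African": 0.00, "Native American": 0.45},
    "snp_4": {"European": 0.00, "African": 0.00, "Native American": 0.08},
    "snp_5": {"European": 0.00, "African": 0.00, "Native American": 0.21},
    "stray": {"European": 0.01, "African": 0.15, "Native American": 0.00},
}

def markers_pointing_to(freqs, target, others):
    """Return SNPs whose allele is seen in `target` but in none of `others`."""
    return [
        snp for snp, f in freqs.items()
        if f[target] > 0 and all(f[pop] == 0 for pop in others)
    ]

hits = markers_pointing_to(allele_freqs, "Native American", ["European", "African"])
print(hits)
```

A run of such markers sitting close together is what seemed to flag a region in my case; a single marker of this kind could easily be a stray.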

David bracketed the two regions with upper and lower SNPs and ran each using
an MDS (multidimensional scaling) plot. See the Sardinian example above for
what to expect: basically that my CA4 icon would be positioned between the
cluster of, say, Africans and the cluster of Europeans (because one of my
segments is European). The chromosome 16 African segment was validated, as
was the chromosome 2 Native American segment. However, because some groups
such as Athabaskans and Mayans are known to be admixed, the expectation here
is that my segment would cluster best with the most closely related admixed
individuals. The icon was indeed surrounded by only
Native Americans, with the closest being Athabaskans, Mayans, an East
Greenlander, and one Chukchi (Siberian from the peninsula closest to Alaska
and whose genomic structure most closely resembles that of Native Americans).
These were the three groups seen most commonly in Doug’s analysis of six
23andMe customers, Native Americans from the Great Lakes eastward. So a
“perfect fit” in this instance. An analysis of the length of the segment in
relation to the known genealogical contribution showed an exact match for
each (which is surely coincidence). What is being done here is to treat the
segment as if it were a whole individual and see which individuals are the
nearest neighbors to that segment. Other segments (e.g., Native American) from the same genome
could have a different history so the matching pattern may be different.
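
The idea of treating a segment as a whole individual and finding its nearest neighbors can be illustrated with a small Python sketch. Note that this swaps in a plain nearest-neighbor ranking on a simple genotype distance in place of the MDS plot that was actually used, and that the genotypes, sample labels, and function names are all invented for the example.

```python
def genotype_distance(a, b):
    """Mean absolute difference between two genotype vectors coded 0/1/2."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def nearest_neighbors(segment, references, k=2):
    """Return the k reference individuals closest to the test segment."""
    scored = [
        (name, population, genotype_distance(segment, genotypes))
        for name, (population, genotypes) in references.items()
    ]
    return sorted(scored, key=lambda row: row[2])[:k]

# Hypothetical genotypes (count of the rare allele, 0/1/2) at six SNPs
# inside the bracketed segment. All values are invented.
test_segment = [1, 1, 1, 0, 1, 1]
references = {
    "EUR_1": ("European", [0, 0, 0, 0, 0, 0]),
    "EUR_2": ("European", [0, 0, 1, 0, 0, 0]),
    "NA_1": ("Native American", [2, 1, 2, 0, 1, 2]),
    "NA_2": ("Native American", [1, 2, 1, 0, 2, 1]),
    "AFR_1": ("African", [0, 0, 0, 2, 0, 0]),
}

for name, population, dist in nearest_neighbors(test_segment, references):
    print(f"{name} ({population}): distance {dist:.2f}")
```

In this toy data the mostly heterozygous test segment lands nearest the two Native American references, with the Europeans next and the African reference farthest away, which loosely mirrors the "positioned between" pattern described above.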

As an aside I asked DavidW to kindly check the segment on my wife’s
chromosome 8. It was confirmed with a crystal clear positioning between the
African groups and the European populations. My conclusion is that this is
an amazing new technique for biogeographical analysis.

The point is that those with lower levels of minority ancestry who have not
yet “found” this ancestry represented in their genome may still be able to
find out that it is indeed present, just masked to some of the “standard”
methods designed to detect it. It was once dismissed as unlikely that,
even if present, these minority segments could be unmasked and forced to
show themselves. That is clearly not correct. As DavidW has said, a year from
now we may be laughing at the use of the RHH to find segments because the
field may have advanced to a significant degree. True, but for now this is a
revolutionary new approach for those of us who have small amounts of
minority ancestry, allowing us to peer into the genome and confirm that the
ancestor standing so proudly and prominently in the genealogy is installed in
the genome as well.

It is my understanding that DavidW will be taking a break for Christmas, so
stay tuned to his blog for any possible further efforts in the RHH or other
cutting edge work.

David K. Faux.
