TMG-L Archives
From: "John Davis" <>
Subject: Re: [TMG] Statistical Report - Part 2
Date: Fri, 5 Aug 2005 13:15:31 -0700
References: <20050805.110734.2428.3.dean.scribner@juno.com>
Interestingly, Huff's book was written tongue-in-cheek, especially the
title. It is actually a very good compendium of caveats to help the user
of statistics to avoid making errors with statistical data.
Unfortunately for me, the one copy I had was purchased with company
money, so I had to leave it in my library for the next guy when I
retired. Unfortunately for all of us, politicians and marketing
professionals apparently enjoy doing just what Huff cautions against, to
the detriment of us all.
In the case of the standard deviation, MIN, MAX, and the other basic
statistics in TMG, the program, to our advantage, is not attempting to
convince the user of anything, as politicians and marketing
professionals often do. The statistics are therefore rather neutral in
that regard. But there is an old adage of caution for programmers and
statisticians: GIGO, or Garbage In, Garbage Out.
Kevin wisely questioned the results of the statistical computations he
was getting. He found that if someone does not take care in assuring
that no data outside the population under consideration has somehow
inadvertently crept in, the results will contain a bit of garbage. In
other words, the stuff that we didn't expect to be included as part of
the named population is hidden away in there someplace, altering the
output of the statistical computations. Removing such data will purify
the population, and, if we can't remove it, we can at least take it with
that "grain of salt." It all must begin with understanding the
statistic, where it comes from, what it consists of, what it is
revealing to us.
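
To make the GIGO point concrete, here is a minimal sketch in plain Python (not anything TMG itself runs). The ages are made up for illustration, and I am assuming a population standard deviation; I don't know which formula TMG actually uses. It shows how a single stray record, such as a mistyped date that yields an age of 180, shifts the MAX, the mean, and the standard deviation.

```python
from statistics import mean, pstdev


def summarize(values):
    """Return (min, max, mean, population standard deviation)."""
    return min(values), max(values), mean(values), pstdev(values)


# Hypothetical ages at death for one family line (invented data).
clean = [62, 68, 71, 74, 75, 79, 81, 84]

# The same list plus one stray record that does not belong,
# e.g. a data-entry error that produces an age of 180.
contaminated = clean + [180]

clean_stats = summarize(clean)
contaminated_stats = summarize(contaminated)

for label, stats in (("clean", clean_stats), ("contaminated", contaminated_stats)):
    lo, hi, avg, sd = stats
    print(f"{label:>12}: min={lo} max={hi} mean={avg:.2f} sd={sd:.2f}")
```

One bad record out of nine is enough to inflate the spread several times over, which is why an odd-looking MIN or MAX is often the first clue that something has crept in.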
Thankfully, the Central Limit Theorem and its related statistics,
including standard deviation, are rather benign in and of themselves.
If the population is large enough and contains data of exactly the same
kind, and only that kind, the standard deviation will describe it
accurately, provided the distribution of the data points is roughly
normal (see previous post).
Now, "seeing" a cause-and-effect relationship every time we see a
statistical correlation is a really common error, often perpetuated
purposely by the above-named folks to get the public to jump to the
wrong conclusions. For example, they like to reverse cause and effect to
their advantage, or refuse to consider (or reveal) that the correlation
of two trends may itself be caused by a third unnamed, perhaps unknown,
factor that is acting on both.
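
The "third factor" trap can be shown with a small constructed example. This is only a sketch: the series `z`, `x`, and `y` and the wiggle patterns are invented, and because I built the data myself I know exactly how much `z` contributes to each trend, which is never true of real data. Both `x` and `y` are driven by `z`, so they correlate strongly with each other, yet once the contribution of `z` is subtracted the leftover parts are uncorrelated.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


# Hidden third factor (hypothetical), acting on both trends.
z = list(range(1, 9))

# Small fixed wiggles, chosen so they are unrelated to each other.
wiggle_x = [1, -1, 1, -1, 1, -1, 1, -1]
wiggle_y = [1, 1, -1, -1, 1, 1, -1, -1]

# Two trends that never influence each other; both simply follow z.
x = [3 * zi + w for zi, w in zip(z, wiggle_x)]
y = [2 * zi + w for zi, w in zip(z, wiggle_y)]

r_raw = pearson(x, y)

# Subtract the known contribution of z from each trend.
x_left = [xi - 3 * zi for xi, zi in zip(x, z)]
y_left = [yi - 2 * zi for yi, zi in zip(y, z)]
r_left = pearson(x_left, y_left)

print(f"correlation of x and y:            {r_raw:.3f}")
print(f"correlation after removing z:      {r_left:.3f}")
```

Anyone shown only `x` and `y` could be talked into believing one causes the other, when neither has anything to do with the other.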
One thing to keep in mind with the type of statistics that TMG generates
is the grouping of the data. I'm not sure how much grouping can be done
before generating these reports (maybe someone else can help with this),
but if, for instance, there are people of several known ethnic groups in
your family, and IF (big if) your database is large enough, one could,
say, separate out the ethnic groups.
You could expect to see the statistics change from group to group, to
roughly mirror what is already known about, for instance, life
expectancy among those groups. A rough separation of generations should
show an increase in life expectancy as we approach the present time. We
might be surprised to find that certain families are much hardier than
others, or more prolific. The standard deviation could reveal that
certain families almost always live to a ripe old age, while others'
deaths are widely scattered from childhood to old age.
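
As a rough illustration of what grouping buys you, here is a sketch done outside TMG in plain Python. The family labels and ages are invented, and I am again assuming a population standard deviation. Two families with fairly similar spans of years can have very different spreads, and lumping them together hides that.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical (family, age at death) records; invented for illustration.
records = [
    ("Family A", 78), ("Family A", 81), ("Family A", 84), ("Family A", 86),
    ("Family B", 3), ("Family B", 45), ("Family B", 67), ("Family B", 90),
]


def stats_by_group(rows):
    """Return {group: (mean, population standard deviation)}."""
    groups = defaultdict(list)
    for group, age in rows:
        groups[group].append(age)
    return {g: (mean(ages), pstdev(ages)) for g, ages in groups.items()}


per_family = stats_by_group(records)
overall_sd = pstdev(age for _, age in records)

for family, (avg, sd) in per_family.items():
    print(f"{family}: mean={avg:.1f} sd={sd:.1f}")
print(f"All together: sd={overall_sd:.1f}")
```

A small standard deviation for one family says its members nearly all reached a similar age; a large one says the deaths are scattered from childhood to old age. With only four records per group, of course, none of this would mean much, which is the "big if" about database size.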
I haven't been using these reports to any extent myself, since I haven't
really rolled up my sleeves and gone to work like I should on our
genealogy, but I can see that these very basic statistics could provide
a little harmless entertainment, and maybe even a little enlightenment,
along the way.
John Davis
Researching Davises, Shields, Corns, Fosters, Reagans, Kendalls, and who
knows what-all?
----- Original Message -----
From: "Dean Scribner" <>
To: <>
Sent: Friday, August 05, 2005 11:07 AM
Subject: Re: [TMG] Statistical Report - Part 2
> Kevin writes:
> >Opinions / Comments ?<
>
> When Mark Twain quoted someone else, saying "There are three kinds of
> lies: lies, damned lies, and statistics," he was telling us that
> statistics mean exactly what you want to believe they mean, or what you
> want others to believe they mean. Those of us on a salt-free diet need
> to be very careful about swallowing statistics. They must be accompanied
> by a whole lot of sodium chloride, and just a wee bit of common sense.
> Read Darrell Huff's book, "How to Lie with Statistics", 1954, a guide for
> marketing professionals and politicians.
>
> Dean