TMG-L Archives
From: Lee Hoffman
Subject: RE: [TMG] Statistical Report
Date: Tue, 02 Aug 2005 10:30:10 -0400
At 10:50 PM 8/1/2005, Kevin Sholder wrote:
>Thanks for your help with this and yes there are some irregular dates such
>as a few that had ? In the year or month. So this will be helpful.
While the Statistical report can certainly be helpful in cleaning up your
data sets and project, the Audit report is much more helpful. The
Statistical report will tell you about a few persons to look at for
possible clean up, but then you would need to re-run the report to see the
next persons, correct them, re-run the report, ad infinitum.
For clean-up purposes, the Audit report would be the best -- start with
extreme comparison values and slowly reduce or increase those values until
either you have viewed and corrected everything or you get lists of persons
that all show correctly. By this last statement I mean that you might get
a list of everyone married before age 18 and find that all on the report
actually were married before age 18 (most probably at age 17 or 16).
Once you have finished cleanup using the Audit report, the Statistical
report can give you a cross-sectional view of your data. In addition, if
you run it every so often, the averages should not change much. Similarly
the MINs and MAXs should not change much. When they do, you may want to
look at those changed areas to see that the data is correctly entered. If
you see a large increase in the number of Tags of a Tag Group (the MAX or
the AVG) then you may want to check that out.
POP 42403
AVG 0.7
STD DEV 0.6
MIN 0
MAX 7
The above is from my main data set for the Marriage Tags line. It
indicates there are 42,403 persons in the data set. It also indicates
there is an average of 0.7 Marriage Tags per person. This kinda makes
sense if you consider that not all persons have a Marriage Tag. Note that
the Standard Deviation is 0.6. This means that the typical range of Marriage
Tags (the average plus or minus one Standard Deviation) is from 0.1 to 1.3.
This is, of course, not quite true when you consider that some will not
have a Marriage Tag (the MIN value), so the AVG is skewed considerably in
the lower direction. Looking at the MAX value of 7, and comparing it with
the higher range value of 1.3, you can deduce that the MAX value is
probably much higher than the usual number of Marriage Tags. From this you
can guess that a large number of persons in the data set do not have a
Marriage Tag and that of those that do, most have only one or maybe two,
with the smaller remainder having 3, 4 or 5 and a few having 6 or 7.
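For anyone who wants to check the arithmetic, here is a minimal Python sketch of the reasoning above. The AVG, STD DEV, MIN and MAX figures come from the report; the list of per-person tag counts is purely hypothetical (invented so that it rounds to the same figures), and the use of a population standard deviation is an assumption, since I do not know which formula TMG uses.

```python
from statistics import mean, pstdev

# Figures taken from the Marriage Tags line of the report.
AVG, STD_DEV, MIN, MAX = 0.7, 0.6, 0, 7

# One standard deviation either side of the average, never below the MIN.
low = max(MIN, AVG - STD_DEV)
high = AVG + STD_DEV
print(round(low, 1), round(high, 1))        # 0.1 1.3

# How many standard deviations the MAX sits above the average.
max_distance = (MAX - AVG) / STD_DEV
print(round(max_distance, 1))               # 10.5

# HYPOTHETICAL data: 1,000 persons, most with zero or one Marriage Tag
# and a single person with seven. Not taken from any real data set.
counts = [0] * 350 + [1] * 600 + [2] * 45 + [3] * 4 + [7] * 1

# Population standard deviation is an assumption about the report.
print(round(mean(counts), 1), round(pstdev(counts), 1))   # 0.7 0.6
print(min(counts), max(counts))                           # 0 7
```

The made-up list shows that a data set where nearly everyone has zero or one tag, plus a very small number of outliers, is enough to produce these summary figures.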
Now, you ask, does this tell me anything new? Probably not, as logic will
tell you most of this anyway -- more or less. But it can be used as a
comparison from time to time to see if your data set is changing in any
significant way. So if you suddenly find that the MAX is still 7 and the
AVG has risen to 0.9, then you can figure that you have added a number of
Marriage Tags to your data set without a comparable increase in the
number of persons. If this fits with what you recall as having entered
then everything is fine. On the other hand, if you don't recall adding
very many Marriage Tags but have been concentrating on say Other Tags then
you may want to check further.
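The comparison described above can be sketched in a few lines of Python. This is only an illustration: the "earlier" figures are from my report, but the "later" figures (including the POP of 42,500) are hypothetical, as are the function name and the tolerance of 0.1. Because the AVG is rounded to one decimal place, the estimate of tags added is very rough.

```python
def changed_stats(old, new, tolerance=0.1):
    """Return the statistics that moved by more than `tolerance`."""
    flagged = {}
    for name in ("AVG", "STD DEV", "MIN", "MAX"):
        delta = new[name] - old[name]
        if abs(delta) > tolerance:
            flagged[name] = round(delta, 2)
    return flagged

# Earlier run: figures from the Marriage Tags line of the report.
earlier = {"POP": 42403, "AVG": 0.7, "STD DEV": 0.6, "MIN": 0, "MAX": 7}
# Later run: HYPOTHETICAL figures, invented for illustration.
later = {"POP": 42500, "AVG": 0.9, "STD DEV": 0.6, "MIN": 0, "MAX": 7}

flagged = changed_stats(earlier, later)
print(flagged)                              # {'AVG': 0.2}

# Rough totals: average tags per person times number of persons.
tags_before = round(earlier["AVG"] * earlier["POP"])
tags_after = round(later["AVG"] * later["POP"])
persons_added = later["POP"] - earlier["POP"]
print(tags_after - tags_before, persons_added)   # 8568 97
```

In this made-up case, roughly 8,500 Marriage Tags for fewer than 100 new persons would be worth a second look unless you remember entering them.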
Overall, the Statistical report is more of a "that's interesting"
report. But serious study can make it very useful at times.
Lee Hoffman/KY
TMG Tips: <http://www.tmgtips.com>
My website: <http://www.tmgtips.com/lhoffman>
A user of the best genealogy program, The Master Genealogist (TMG)