Sanity check is a new kiwitrees feature, available in version 3.1 and above. It is a response to an issue that has been described many times in PhpGedView, webtrees, and recently in kiwitrees. The title sums up the issue well, although I can’t claim credit for it. I ‘borrowed’ it from an excellent piece of online software called “Bonkers” that operates as a stand-alone sanity checker. If you want a more in-depth review of your data than kiwitrees offers, I highly recommend it. They describe the issue with an excellent quote:

Sooner or later, we all get to the point where we realize that there is a lot of data in our database that is just completely bonkers.

The intention with the kiwitrees sanity checker is to let you select one or more data issues you think might exist in your tree, quickly search for all examples, then click on a link to each of the records concerned to check and adjust the data as necessary.

Sanity checker is quite different to, and does not replace, the existing tool “Check for GEDCOM errors” which is designed to check your data for strict adherence to the GEDCOM specification. Sanity checker goes to the next level, checking for data entry errors that may not technically be GEDCOM errors, but do seem to be “bonkers”.

It is important to be aware that the term “bonkers” is extremely un-scientific 🙂 Bonkers does not always mean WRONG! But at least with this tool you have an opportunity to check. Some specific examples of acceptable “bonkers data” are noted in the descriptions below.

There is an important note on the sanity checker page, in red. It says “This process can be slow. If you have a large family tree or suspect large numbers of errors you should only select a few checks each time”. Please consider this before you tick all the boxes. How many checks you can do in one go will depend on the size of your tree, the number of errors, and the amount of memory available on your server. If an error occurs, a “fatal error” message will appear on the page. You can clear that by simply clicking your browser’s ‘back’ button. Then try again with fewer tools ticked. If even one tool is too much for your system, either ask your webhost for more memory, or use an external tool that does not depend on your server, such as “Bonkers”.

[dropcap1]F [/dropcap1]or its initial release, sanity checker has just a few tools covering some date issues, missing data, and duplicated data. More tools will be added over time, but feel free to comment here with any suggestions you have, ideally with some sense of a priority level for your request. You will also see that all these early tools relate only to individual data, not family or any other record type. I do hope to add those later, especially date issues around family events such as marriage. But they are more complex and resource intensive, so I’m starting with the easy stuff!
[clear][small_full]Date discrepancies[/small_full]
[h5 class=”hue”]1 – Birth after baptism or christening[/h5]
[h5 class=”hue”]2 – Birth after death or burial[/h5]
[h5 class=”hue”]3 – Burial before death[/h5]
These are largely self-explanatory. The first looks for baptism (BAPM) or christening (CHR) dates (whichever you use) and compares them with the birth (BIRT) date. It then lists any where it appears the person was baptised before they were born!

The second is similar, but looks for people who were not born until some date after their death (DEAT) or burial (BURI).

The third compares the burial (BURI) date against the death (DEAT) date, and lists anyone who appears to have been buried before they died.
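For anyone curious how date checks like these might work, here is a minimal Python sketch. It is not the kiwitrees code, just an illustration under some loud assumptions: it reads one raw GEDCOM individual record, only understands exact dates such as “12 MAR 1850” (anything like “ABT 1850” is skipped), and only looks at the first event of each type. The function names and the sample record are made up for the example.

```python
from datetime import date

MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}


def parse_exact_date(text):
    """Return a date for exact 'D MMM YYYY' values, or None for anything else."""
    parts = text.split()
    if len(parts) != 3 or parts[1].upper() not in MONTHS:
        return None
    try:
        return date(int(parts[2]), MONTHS[parts[1].upper()], int(parts[0]))
    except ValueError:
        return None


def first_event_date(record, tag):
    """Return the date of the first level-1 `tag` event in a raw INDI record."""
    in_event = False
    for line in record.splitlines():
        fields = line.strip().split(" ", 2)
        if len(fields) < 2:
            continue
        level, name = fields[0], fields[1]
        if level == "1":
            if in_event:
                return None  # the first event ended without a DATE line
            in_event = (name == tag)
        elif in_event and level == "2" and name == "DATE" and len(fields) == 3:
            return parse_exact_date(fields[2])
    return None


def date_discrepancies(record):
    """List which of the three date checks this record would trip."""
    dates = {tag: first_event_date(record, tag)
             for tag in ("BIRT", "BAPM", "CHR", "DEAT", "BURI")}
    issues = []
    birth = dates["BIRT"]
    if birth:
        if any(dates[t] and birth > dates[t] for t in ("BAPM", "CHR")):
            issues.append("birth after baptism or christening")
        if any(dates[t] and birth > dates[t] for t in ("DEAT", "BURI")):
            issues.append("birth after death or burial")
    if dates["DEAT"] and dates["BURI"] and dates["BURI"] < dates["DEAT"]:
        issues.append("burial before death")
    return issues


# Hypothetical record: baptised before birth, buried before death.
SAMPLE = """0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 12 MAR 1850
1 BAPM
2 DATE 1 JAN 1850
1 DEAT
2 DATE 5 MAY 1900
1 BURI
2 DATE 3 MAY 1900"""

print(date_discrepancies(SAMPLE))
```

Skipping approximate dates keeps the sketch short, but a real checker has to decide how to treat them, which is one reason some “bonkers” results turn out to be fine.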

[small_full]Missing data[/small_full]
[h5 class=”hue”]1 – No gender recorded[/h5]
No gender simply looks for individuals where there is no gender (SEX) recorded. It will find individuals with no SEX tag at all. It does not need to check for values other than the acceptable “M, F, or U”, nor for entries of just “1 SEX” as these are all converted to valid entries automatically on either import or edit.
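As a rough illustration only (this is a Python sketch, not the kiwitrees code), the check amounts to asking whether a raw individual record contains any level 1 SEX line. The function name and the sample records are invented for the example.

```python
def has_sex_tag(record):
    """True if the raw INDI record contains any level-1 SEX line."""
    for line in record.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "1" and fields[1] == "SEX":
            return True
    return False


# Hypothetical records: the second has no SEX tag and would be listed.
WITH_SEX = """0 @I1@ INDI
1 NAME Mary /Jones/
1 SEX F"""

WITHOUT_SEX = """0 @I2@ INDI
1 NAME Unknown /Jones/"""

for rec in (WITH_SEX, WITHOUT_SEX):
    print(rec.splitlines()[0], "has SEX tag:", has_sex_tag(rec))
```

The sketch deliberately ignores the value after SEX, in line with the description above that invalid or empty values are already converted on import or edit.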

[small_full]Duplicated data[/small_full]
These tools look for two (or more) similar records within a single individual’s data record, such as two births (BIRT), two deaths (DEAT) or two genders (SEX). They do not consider the content of those records, just their existence. While such duplication might appear to be “bonkers”, you should not assume that is always the case. The GEDCOM specification allows for recording multiple occurrences of the same event in situations where, for example, a researcher discovers conflicting evidence about a person’s birth event. The specification indicates that in such cases you should record each event in order of preference. Kiwitrees acknowledges this and always uses the first event as the “preferred” one.

Even multiple genders could be a deliberate record of evidence found or perhaps gender change during life. In this case the same “preference” rule applies. The first one found in the raw GEDCOM data is the one used to determine the display elements such as silhouette image, background colours etc.
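The “first one wins” rule can be pictured with a small Python sketch. Again, this is only an assumption-laden illustration of the idea, not how kiwitrees does it, and the function name and sample record are invented.

```python
def preferred_value(record, tag):
    """Return the value of the first level-1 `tag` line, or None if absent."""
    for line in record.splitlines():
        fields = line.strip().split(" ", 2)
        if len(fields) >= 2 and fields[0] == "1" and fields[1] == tag:
            # A tag with no value (for example a bare "1 BIRT") gives "".
            return fields[2] if len(fields) == 3 else ""
    return None


# Hypothetical record with two genders; the first one is the preferred one.
TWO_GENDERS = """0 @I1@ INDI
1 NAME Alex /Brown/
1 SEX F
1 SEX M"""

print(preferred_value(TWO_GENDERS, "SEX"))
```

Swapping the order of the two SEX lines in the raw data would change the result, which is why the order of duplicated facts matters.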

The check for a duplicated name is a very specific case. It only finds duplicates where the individual’s FULL name is IDENTICAL and entered twice.

[h5 class=”hue”]1 – Birth [/h5]
[h5 class=”hue”]2 – Death [/h5]
[h5 class=”hue”]3 – Gender [/h5]
[h5 class=”hue”]4 – Name [/h5]
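The four duplicate checks above could be sketched in Python roughly as follows. This is a simplified illustration, not the kiwitrees code: it counts level 1 tags in one raw individual record without looking at their content, and treats names as duplicates only when the full text after NAME is identical. The function names and sample record are hypothetical.

```python
from collections import Counter


def duplicated_tags(record, tags=("BIRT", "DEAT", "SEX")):
    """Return the level-1 tags from `tags` that occur more than once."""
    counts = Counter()
    for line in record.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "1" and fields[1] in tags:
            counts[fields[1]] += 1
    return sorted(tag for tag, n in counts.items() if n > 1)


def duplicated_names(record):
    """Return full NAME values that are entered more than once, identically."""
    names = []
    for line in record.splitlines():
        fields = line.strip().split(" ", 2)
        if len(fields) == 3 and fields[0] == "1" and fields[1] == "NAME":
            names.append(fields[2])
    return sorted(name for name, n in Counter(names).items() if n > 1)


# Hypothetical record: two births, one name entered twice, one variant name.
SAMPLE = """0 @I1@ INDI
1 NAME John /Smith/
1 NAME John /Smith/
1 NAME Johnny /Smith/
1 SEX M
1 BIRT
2 DATE 12 MAR 1850
1 BIRT
2 DATE 14 MAR 1850
1 DEAT
2 DATE 5 MAY 1900"""

print(duplicated_tags(SAMPLE))
print(duplicated_names(SAMPLE))
```

Note that “Johnny /Smith/” is not reported: a different spelling is treated as a legitimate alternative name rather than a duplicate.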