Hierarchies Help Source Organisation, Analysis and Citation

Jill Ball’s recent hangout on air entitled Let’s get organised caused me to pause and think. The question of how to organise the physical and digital ‘stuff’ we accumulate during genealogical research is a common one that elicits a wide variety of responses. This discussion revolved mainly around digital files.

The panellists broadly follow two patterns of file organisation, person oriented and source oriented. Person oriented systems typically arrange files in folders for surnames and individuals. Source oriented systems typically arrange files in folders for each source type. Some people also have place and project folders.

Retrieval strategies include using file naming conventions, tagging, assignment of unique file ids, indexes and spreadsheets. Panellists use Family Historian, Custodian, Evernote, Excel and coggle.it to help them keep track of the ‘stuff’.

In Provenance of a Personal Collection – Archival Accession, Arrangement and Description, I advocated recording source information in a hierarchical archival style catalogue. Archival catalogues typically arrange source items by the provenance and context of their creation and use, which is reflected in multiple levels of logical organisation. Storage is not necessarily the same as logical organisation.

Are Hierarchies Hard?

Genealogists create family trees. A family tree is a hierarchical branching structure where layers represent generations of ancestors. So, genealogists would readily adopt a source hierarchy, wouldn’t they? The discussion made it pretty plain that this is not the case.

Jill advocates a flat digital file structure. Among the panellists who use a hierarchy of physical or digital folders, the number of levels is restricted to no more than 2 or 3.

Why do people find hierarchies difficult?

Navigation through hierarchy levels is hard to get your head around. I have been looking for a tool that helps me draw hierarchical trees and visualise my catalogue structure. Thanks to Alex Daws’ suggestion I tried the mind mapping tool, coggle.it. My genealogical ‘stuff’ falls into the categories depicted:

Sue’s Genealogy Catalogue

For the interactive version follow this link: https://coggle.it/diagram/550aa664a7d032c23734a105/7e12c7b8433ffe43f86b5994f61abf9977f826ac312616349825bbda313db27a

I have included a selection of top level categories and expanded a few of them. On the right are the things I acquired from family and collaborators, and my personal documents. On the left are the things acquired from physical and digital archives.

Personal Collections

The personal collections are organised by their provenance, the person from whom the items came. This visual representation makes it easy for me to see I have omitted a helpful relative, Pat (how ungrateful am I!). I have expanded the part of Raymond’s collection which I discussed in Provenance of a Personal Collection. The sub-categories reflect the use (e.g. probate) and history (e.g. belonged to Winnie) of the items.

Complete collections can be organised without taking account of possible future additions. The branches colour coded in yellow are complete. Raymond and Mabel are deceased, and personal study projects relate to courses completed.

The personal collection labelled Sue is my own. It includes collections that were created by my own life, the types of things discussed in Fresh Starts, my genealogy business records named Family Folk, and the results of my research such as blog posts. The category named Genealogical Research Collection is my personal sin bin. Its arrangement reflects my early attempts to organise things, and indirectly documents my development as a researcher. Rather than rearrange things I have documented the existing arrangements.

Digital and Physical Archives

The left side of the source tree depicts my understanding of the arrangement of things I accessed through archives. I have expanded the top levels for just one record, the marriage of Joseph Wilson and Elizabeth Wilson at Claverley in 1808, that I discussed in Three Wilson-Wilson marriages and the Family History Library Experience.

The original marriage register is held by Shropshire Archives and the archive catalogue entry includes the hierarchy that shows how the marriage register fits into the archive’s collections:

Shropshire Archive catalogue

Notice that the top level is missing from the archival catalogue. Parishes are collected together in a group denoted by P or XP, but there is no catalogue entry for this group. Many archive catalogues could be made more user friendly by the inclusion of top level groups and a visual interface. This catalogue entry also refers to the microfiche copy of the registers.

Catalogue entries for a marriage record

I have followed the archive catalogue in my source tree, but added in the missing parish level and separated out the microfiche version. The Family History Library transcript and microfilm are arranged by call number and film number, a peculiarity of that institution.
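
A source tree of this kind can be modelled as a simple nested structure. The following is a minimal sketch in Python, not the catalogue itself: the node labels are paraphrased from this post rather than copied from the archive's reference codes, and the names `Node` and `paths` are made up for illustration. It shows that every item has one full path from the top level, however many levels sit in between.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of a catalogue hierarchy (hypothetical structure)."""
    label: str
    children: list[Node] = field(default_factory=list)


def paths(node: Node, prefix: tuple[str, ...] = ()) -> list[str]:
    """Return the full path, from the top level down, of every leaf item."""
    here = prefix + (node.label,)
    if not node.children:
        return [" > ".join(here)]
    found: list[str] = []
    for child in node.children:
        found.extend(paths(child, here))
    return found


# Labels are illustrative only; they are not the archive's actual
# reference codes or catalogue titles.
tree = Node("Shropshire Archives", [
    Node("Parishes (P or XP)", [
        Node("Claverley", [
            Node("Marriage register (original)"),
            Node("Marriage register (microfiche copy)"),
        ]),
    ]),
])

for line in paths(tree):
    print(line)
```

Because the path is computed from the tree, a missing level such as the parish group can be added in one place without renaming any of the items beneath it.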

Digital archives typically consist of an index or database that may reference a collection of digital images. Database entries are accessed via search functions. The arrangement of the digital image collection is similar, but not identical, to the arrangement of the physical archive in this case. Some digital image collections differ substantially from their physical counterparts.

In addition to the original, there are 6 different versions of the marriage record. They were derived from the original either directly or indirectly via several different copying processes, but that is hard to show on my source tree.

Citations and Source Identification

Traditionally many academic disciplines cited published and unpublished works in the form of a bibliographic citation, but only included the data they collected in summary form. In many disciplines it is now recognised that the academic paper alone is no longer sufficient and the underlying data also needs to be shared. How to Cite Datasets and Link to Publications explores the issues and makes proposals for scientific data sets. Citing genealogical sources is more similar to citing scientific data than to citing finished works.

Genealogists typically want to pinpoint a single record or piece of data within a data set. For the marriage example the following locate the record within each source item:

Original: page and record number
Microfiche: counted row and column numbers, record number
Transcript: page number, record number
Microfilm: item number, counted image number, record number
Digital image: browsing breadcrumb, image number
Database entries: search terms
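
One way to make these locators explicit is to record, for each kind of source item, which fields are needed to find a record within it. This is a rough sketch in Python under my own assumptions: the field names simply restate the list above, and `LOCATOR_FIELDS` and `missing_fields` are hypothetical names, not part of any genealogy software.

```python
# Locator fields needed for each kind of source item, restating the list above.
LOCATOR_FIELDS = {
    "original": ["page", "record number"],
    "microfiche": ["row", "column", "record number"],
    "transcript": ["page number", "record number"],
    "microfilm": ["item number", "image number", "record number"],
    "digital image": ["browsing breadcrumb", "image number"],
    "database entry": ["search terms"],
}


def missing_fields(kind: str, locator: dict[str, str]) -> list[str]:
    """List the locator fields not yet recorded for this kind of source item."""
    return [name for name in LOCATOR_FIELDS[kind] if not locator.get(name)]


# Placeholder values only; they are not taken from the real records.
print(missing_fields("microfiche", {"row": "?", "column": "?"}))
```

A check like this is a reminder that a microfiche locator is incomplete until the record number is noted alongside the counted row and column.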

Genealogists need to know exactly which source item was used, because they differ in accuracy and reliability. My source tree distinguishes between the 7 source items, but does not make the relationships between them clear. Here is how I think each was derived:

Claverley marriage register & derivative sources

Copying and processing potentially produces errors, so genealogists need to check against originals if possible. In the marriage example, I used the FamilySearch database to find the transcript and then checked the transcript against the microfiche copy of the original, because they were available at the time. Now I would use the high quality digital image that has since been published. The archive quite rightly restricts access to the original so that it is preserved.
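
The derivation relationships can also be written down as data, so that the route from any version back to the original is traced rather than remembered. The sketch below is in Python and rests on assumptions: only the microfiche being a copy of the original is stated in this post, the other links are placeholders for illustration, and `DERIVED_FROM` and `chain_to_original` are hypothetical names.

```python
# Which item each version was copied from. Only "microfiche" <- "original"
# is stated in the post; the other two links are assumptions for illustration.
DERIVED_FROM = {
    "microfiche": "original",
    "transcript": "original",        # assumption
    "database entry": "transcript",  # assumption
}


def chain_to_original(item: str) -> list[str]:
    """Follow the copying steps from a version back to its ultimate source."""
    chain = [item]
    while chain[-1] in DERIVED_FROM:
        chain.append(DERIVED_FROM[chain[-1]])
    return chain


steps = chain_to_original("database entry")
print(" <- ".join(reversed(steps)))
print(f"{len(steps) - 1} copying step(s) from the original")
```

Each extra step in the chain is another opportunity for error, which is why the version closest to the original is the one to check against when it is available.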

The complicated-looking citations in Evidence Explained identify the source, the equivalent of my source tree. Multi-level citations, indicated by “citing”, give the relationships between derived versions and the original.

I have tackled some quite complex ideas in this post. I hope you find some worth considering as your genealogy organisation systems develop. As Julie Goucher said, there is no one size fits all.

I thank Jill and all the panellists for challenging my assumptions, sharing their frustrations and confusion, and openly debating the issues. Conversations like this are valuable contributions that genealogy vendors and software developers need to hear. As a member of FHISO, I am listening.

© Sue Adams 2015

14 Comments on “Hierarchies Help Source Organisation, Analysis and Citation”

  1. Jackie Saulmon Ramirez says:

    Great – if complicated – post!


  2. luvviealex says:

    Hi Sue I’m so pleased that you found coggle useful. I really like your diagrams and actually reckon that they would work really well when people are just starting family history and trying to get a handle on Archives/sources and how records are created/managed etc. Putting a creation timeline next to your map of derivative sources is a really neat idea. Go you!


    • Sue Adams says:

      Hi Alex

      I think you make an important point that beginners could benefit from coggle type diagrams.

      I was sceptical about mind mapping/tree drawing applications, as I did not find them helpful for essay planning in my student days. Seeing your coggle diagram that explored sources was an inspiration. Would you care to publish it with explanation?


  3. Sue McCormick says:

    I am away from home right now. Maybe at home I can spread things out and work out what you have done; my immediate impression: WOW! I can never do THAT!

    Thank you for some interesting things to think about.


  4. lkessler says:


    I love your source diagram with the integrated timeline and especially like the diagram for “Sue’s Genealogy Catalogue”. This is a great presentation and I’ve not seen anything like this before.



  5. […] 6. Hierarchies Help Source Organisation, Analysis and Citation by Sue Adams on Family Folklore Blog […]


  6. Sue Adams says:

    Elizabeth Shown Mills has clarified layered citations in When Citing means Regurgitating in part as a response to this post.


  7. Jana Last says:


    I want to let you know that your blog is listed in today’s Fab Finds post at http://janasgenealogyandfamilyhistory.blogspot.com/2015/06/follow-friday-fab-finds-for-june-5-2015.html

    Have a wonderful weekend!


  8. In response to my question to Laura, posted privately on Google+, “How do archivists catalogue copies and derivative versions?”


    As usual, you’ve done a stellar job explaining this point, and I’m happy to comment on it. I’ll try to be concise; keep in mind that this is a complex issue with multiple ‘answers’ to your question.

    Let me say this first: it is rare that an (American) Archives maintains copies of materials. As you are aware, the primary purpose of an Archives is to house, maintain, and make accessible, unique materials that do not exist anywhere else. By virtue of that, maintaining copies of anything is contradictory to that goal.

    Example: I have seen many posts on SAA’s (Society of American Archivists) various listservs from Archivists debating the prudence of maintaining collections of newspapers or newspaper clippings. Generally, it’s a space issue, and when push comes to shove I’ve seen a fair number of Archivists who recommend ‘deaccessioning’ these materials (removing them from the collection and putting them in the bin). The thought process behind that is that they are not ORIGINAL; they are one of many printed copies. However, more often than not, there is a contingent of us who point out that the Archivist should first check to determine if the materials are available elsewhere. Again, preservation of original, primary and UNIQUE material is the goal.

    The second answer is that there isn’t a cataloging standard in American Archives. While we have a common nomenclature, the actual cataloging is done at the individual Archivist’s discretion. And, typically, how the materials are described would be dependent on whether it was one document, a small group of documents or an entire collection of documents. In my own collections (the ones I’ve processed…7 total, not including Family and/or personal Archives) in the rare instance that I kept copies of something, I would describe that both in the Scope and Content note and in the container listing (i.e., photocopy of correspondence between John Smith and Tom Jones, 12 Apr 1865; Xerox copy of catalog XXX dated XXX; digital instance of photograph of Mr and Mrs Wilbur Greene 1857).

    It is the researcher’s responsibility to properly cite the sources they’ve used, regardless of repository. Unfortunately, these more complex issues often ‘scare’ beginners away from even attempting to create citations. I say, in a lecture I give about the GPS, that it is better to create something than nothing and while the standard (here in the US) is to use “Evidence Explained” by Elizabeth Shown Mills, if they can simply capture WHERE they got the material it’s better than nothing. I usually get one grizzled veteran who gives me the Eye, but the reality is most people aren’t going to publish, so better to get them in the habit of writing down where they found material and hope they come to the realization that doing so properly is a good thing, than scare them off with the complex issues of how to cite a derivative of a secondary source (which most beginners think sounds like Greek). LOL

    I hope this answered your question, but if it didn’t, please let me know. You know I enjoy discussing this crossroads between my two passions!



    • Sue Adams says:

      Laura, Thank you for your informative and insightful reply.

      I am surprised that you say it is rare for an American archive to maintain copies. Microfilm and microfiche are a well established way of providing access while limiting damage to the originals through use. NARA’s microfilm catalogue lists 2486 collections. Is NARA an exception? Here in the UK the heavily used collections such as census and parish registers are routinely available on microfilm/fiche and you are expected to use the copies unless you have a very good argument for using the originals.

      The extent to which microfilm and increasingly, digital image copies are made available varies hugely between institutions. Some are filmed by external agencies such as the LDS church. Are such externally filmed records typically available at the institution that holds the originals?

      There is the question of just what is ‘original’. Even if it was easy to determine which of the extant copies of the Magna Carta was written first, all are important documents that still keep scholars busy!

      I am confused by your statement that there isn’t a cataloguing standard in American archives. Do you mean that no particular standard is widely used by American archives, and each chooses from the many standards that exist? From the SAA website it seems to me that the SAA maintains DACS, EAC-CPF and EAD.

      From your answer it looks like you would record the copies in the record of the original and not in a separate record. Is this dictated by archival software or database structure? Is there any one archival standard that accommodates this sort of thing better than others?

      I absolutely agree that beginners and many experienced researchers struggle with the copies of copies of derivatives scenario. I am trying to get a handle on how well archival standards deal with this in order to incorporate some sort of best practice into a genealogical standard for sources and citations being developed by FHISO (http://www.fhiso.org)


  9. […] Hierarchies Help Source Organisation, Analysis and Citation by Sue Adams on Family Folklore Blog […]

