What is an Item?
Posted: 26 Aug 2015 | Filed under: Genealogy issues, Genealogy software and data | Tags: Abinger baptism, bishop's transcript, database, digital image, Item, microfiche, microfilm, parish register, Source example
In recent posts, The Original in Context and How Many Copies?, I examined how an original item fits into the collection and how originals are copied into many different forms. Those transformations raise a critical question: What is an Item?
The archival standard, ISAD(G), defines an item as
“the smallest intellectually indivisible archival unit”, where a unit of description is “A document or set of documents in any physical form, treated as an entity, and as such, forming the basis of a single description”.
So, what does that mean and how can it be applied to genealogical sources? In the example of Alfred Munday’s baptism, the Abinger baptism register is described as an item. It is a single physical entity, a book. It has a particular and unique history in that it was created between July 1841 and May 1898 by the parish clergy filling in baptism details.
What happens to the item when copies are made?
Annual copies of parish registers were made for the bishop (Bishop’s Transcripts). In the Abinger example loose sheets of paper (or folios) of baptism, marriage and burial forms were filled in, and the diocese (or later custodian) filed these together by year. So, what is an item for these Bishop’s Transcripts? Although they share the same purpose of providing a copy for the bishop, the baptism, marriage and burial folios are copies of 3 separate registers. A logical argument can be made to consider the 2 baptism folios an item. I prefer to treat each folio as an item, as each is a separate and unique physical object.
Filmed records present different issues. Microfilms often contain more than one register, and a register may be spread over more than one film. The Abinger 1841-1898 Baptism Register is split over 2 microfilms, numbers 991768 (1841-1876) and 994421 (1874-1898). Both of these microfilms also contain other registers from Abinger and other parishes. The Family History Library catalogue sometimes marks parishes as separate ‘items’, but does not appear to consistently identify original registers or even sequential runs of data. When the microfilm is of something which is itself a copy, the relationship to the original may be obscured. The Abinger Bishop’s Transcripts are spread over 2 microfilms, with baptisms 1844-1850 on film 307739 and baptisms 1851-1857 on film 307740.
The distribution of images over microfiche presents similar problems to microfilm, except that a fiche typically holds fewer images than a film. In the Abinger case, microfiche were produced from microfilms of the original registers. The fiche that contains Alfred Munday’s baptism includes baptisms from 1835 to 1865.
It is clear that a microfilm or a microfiche is a single physical object, but the contents may vary and may not be adequately documented. A pragmatic approach is to treat each microfilm or fiche as an item. That leaves the problem of describing the contents of each item.
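One way to approach describing the contents of each item is to record, for every film, which source it carries and for which years. The following is only a minimal sketch in Python: the `Segment` class, the `films_for` function and the source labels are hypothetical names chosen for illustration, and only the film numbers and year ranges come from the Abinger example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One run of a source as it appears on one microfilm (hypothetical structure)."""
    film: str        # microfilm number, treated as the physical item
    source: str      # label for the register or transcript that was filmed
    year_from: int
    year_to: int


# Film numbers and year ranges as given in the Abinger example.
SEGMENTS = [
    Segment("991768", "Abinger baptism register", 1841, 1876),
    Segment("994421", "Abinger baptism register", 1874, 1898),
    Segment("307739", "Abinger bishop's transcripts (baptisms)", 1844, 1850),
    Segment("307740", "Abinger bishop's transcripts (baptisms)", 1851, 1857),
]


def films_for(source: str, year: int) -> list[str]:
    """Return the films whose recorded range for this source includes the year."""
    return [
        s.film
        for s in SEGMENTS
        if s.source == source and s.year_from <= year <= s.year_to
    ]


# The register's two films overlap for 1874-1876, so 1875 appears on both.
print(films_for("Abinger baptism register", 1875))
print(films_for("Abinger bishop's transcripts (baptisms)", 1850))
```

Keeping the film (the item) separate from the source it depicts makes overlaps and gaps visible, but it does nothing to document the other registers that share the same films.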
So far I have considered physical copies with a pragmatic approach to defining what an item is. In short, if you can pick it up and shake it and nothing falls off, it is an item. For artefacts that are in good condition, this holds true. Parts of fragile artefacts (please don’t annoy archivists by shaking) may become detached, but if the parts have been kept together, it still makes sense to treat the whole as an item. When parts of an artefact are split up and distributed, as in the example discussed in Book Breaking and Digitisation, the parts have separate custodial histories, so should be treated as separate items with a lot of documentation.
What about digital objects? Two types of digital objects have been identified for the Abinger example: digital images and databases.
A digital image, typically a jpeg file, is easily identified as an item that can be viewed and downloaded. In the Abinger case, the digital images mostly depict microfilm frames, but some parts of the collection have been directly digitally photographed. The number of pages of a manuscript (e.g. original register or bishop’s transcript) depicted in a traditional photograph or digital image varies, with 1 or 2 pages being common. When 2 pages are depicted, it is usually an opening of a book, such as the Abinger baptism register. Both sides of a folio do not appear in the same image, because you can’t photograph both sides at once. The Abinger bishop’s transcripts were photographed with 1 side each of 2 folios shown in each image, which results in the last page of baptisms and the first page of marriages being on the same image. As Ancestry has re-arranged the images into separate sets for baptisms, marriages and burials, this image appears twice.
Identifying items in databases is the hardest of all. The main online genealogy data vendors categorise data into collections, such as Ancestry’s ‘Surrey, England, Baptisms, 1813-1912’ and ‘London, England, Births and Baptisms, 1813-1906’ collections that contain the Abinger parish registers and bishop’s transcripts respectively. The distinction between collections is blurred by search facilities that permit broad searches on multiple collections. Each collection is a compilation of data from diverse sources, so the collection is not a good candidate for treatment as an item. The structure of online databases is not transparent. A database record, a single row in a database table, could be considered the smallest intellectual unit within a database. A parish register or bishop’s transcript entry may not actually equate to a single row in one table, but entries returned by queries could be stored that way.
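To illustrate how an entry might not equate to a single row in one table, here is a minimal sketch using Python's built-in sqlite3 module. The schema (the `register` and `entry` tables and their columns) and the `fetch_entry` function are entirely hypothetical; nothing is known about how Ancestry or any other vendor actually stores its data. The point is only that the single record a user sees can be assembled by a query from more than one table.

```python
import sqlite3

# In-memory database with a hypothetical two-table layout.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE register (
        register_id INTEGER PRIMARY KEY,
        parish      TEXT,
        event_type  TEXT,
        year_from   INTEGER,
        year_to     INTEGER
    );
    CREATE TABLE entry (
        entry_id    INTEGER PRIMARY KEY,
        register_id INTEGER REFERENCES register(register_id),
        forename    TEXT,
        surname     TEXT
    );
    INSERT INTO register VALUES (1, 'Abinger', 'baptism', 1841, 1898);
    INSERT INTO entry VALUES (1, 1, 'Alfred', 'Munday');
    """
)


def fetch_entry(conn: sqlite3.Connection, entry_id: int):
    """Return one flattened record built from both tables, or None if absent."""
    return conn.execute(
        """
        SELECT e.entry_id, e.forename, e.surname,
               r.parish, r.event_type, r.year_from, r.year_to
        FROM entry AS e
        JOIN register AS r ON r.register_id = e.register_id
        WHERE e.entry_id = ?
        """,
        (entry_id,),
    ).fetchone()


record = fetch_entry(conn, 1)
print(record)  # (1, 'Alfred', 'Munday', 'Abinger', 'baptism', 1841, 1898)
```

The tuple that comes back looks like one row, yet no table in this sketch holds it in that form; it exists only as the result of the query.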
Each time a copy is made, a transformation occurs. Transformations change the characteristics of an item. The data contained within items is split, re-arranged and compiled in different combinations each time.
© Sue Adams 2015
Re: “A database record, a single row in a database table, …”, the reality is more complicated, Sue, and makes it even harder to identify an “item”. All but the most trivial of datasets require more than one database table. However, the actual number and structure of the underlying tables is deliberately kept opaque; what the end-user sees is an abstraction formed from a database “JOIN” of queries from multiple tables that have been correlated with one another. To be accurate, we cannot talk about physical rows in a specific database table because all we’re allowed to see are abstractions: logical views that fall nicely in line with the types of query being performed.
Yes, real databases consist of many tables as a consequence of the data having been normalised, a fundamental principle behind relational databases. The abstracted view, a result of a query on the underlying database tables, is often simplified and presented in a flattened form. So, a view that returns a particular record, identified by a query (probably on the primary key), might more accurately be called an ‘item’. As you said, Tony, not so simple ;-).
For the user, the returned single record might be the best definition of an item. The storage and processing of the data are part of the processes that produce that result.
Maybe even worse than we’ve both suggested, Sue; that “row” in a joined query, or in a database “VIEW”, is transient, and the same tables (each of which would likely have its own primary key) might be used to deliver different result-sets in a different search context. In other words, there probably is no physical “item”.