Computational Analysis of Catalogue Data: Setup

Getting ready

You need to install AntConc 3.5.9 and download a data file to follow this lesson.

Installing and running AntConc

You can download AntConc from https://www.laurenceanthony.net/software/antconc/. This lesson has been developed and tested with AntConc version 3.5.9.

Please make sure you use version 3.5.9, as versions 4.0.0 and above have a substantially different visual interface and options.

There are versions for Windows, macOS and Linux.

Please follow the installation instructions on the main AntConc page. Mac users: if you have trouble running AntConc, Control-click the app and then select Open.

Downloading the data

Download IAMS_Photographs_1850-1950_selection3.txt, a plain-text (.txt) file containing the corpus described in IAMS_Photographs_1850-1950_selection3_readme.md. Clicking on the file name opens the file in a new browser tab, so right-click (or Control-click) the link to save it instead. NOTE: in Safari, right-click and select "download linked file"; in Chrome and Firefox, right-click and select "save link as…".
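If you want to confirm that the download saved the text file itself rather than a web page, a quick check along the following lines may help. This is an optional sketch in Python and not part of the lesson: the function name `summarise_text_file` is hypothetical, and it assumes the file is UTF-8 encoded and sits in your current working directory.

```python
from pathlib import Path


def summarise_text_file(path):
    """Return basic counts for a text file, plus a crude check for saved HTML."""
    # Assumption: the file is UTF-8; undecodable bytes are replaced, not fatal.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    start = text.lstrip().lower()
    return {
        "lines": len(text.splitlines()),
        "words": len(text.split()),
        # A saved web page usually begins with one of these markers.
        "looks_like_html": start.startswith(("<!doctype", "<html")),
    }


if __name__ == "__main__":
    target = Path("IAMS_Photographs_1850-1950_selection3.txt")
    if target.exists():
        print(summarise_text_file(target))
    else:
        print(f"{target} was not found in the current directory")
```

If `looks_like_html` is true, or the line count is very small, the browser probably saved the page rather than the file, and it is worth repeating the download with the right-click method described above.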

British Museum alternative

An alternative dataset (for use in episodes 10-12) is BM-MDG.zip. This dataset is derived from a dataset published by the British Museum. This data and any derived data are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

If you want to know more about the alternative dataset, see Baker, James, & Salway, Andrew. (2019, June 13). Creation of the BMSatire Descriptions corpus (Version v1.0). Zenodo. http://doi.org/10.5281/zenodo.3245037.

Lewis Walpole Library alternative

An alternative dataset (for use in episodes 13-15) is LWL_prints-data.txt. This dataset contains bibliographic metadata (543 records in total) from the Yale University Library, made available under a CC0 license. Details of Yale University Library’s Open Metadata Service are available on the Yale University Library website. In recognition of their support in the preparation and publication of this data, we thank Ellen Cordes, Daniel Lovins, and Yukari Sugiyama.

If you want to know more about the alternative dataset, see Andrew Salway and James Baker, ‘CatalogueLegacies/transmission: Analysis of transmission from BMSat to LWL (v1.0)’, Zenodo (2021), doi: 10.5281/zenodo.5148228, and James Baker, Andrew Salway, and Cynthia Roman, ‘Detecting and Characterising Transmission from Legacy Collection Catalogues’, Digital Humanities Quarterly 16:2 (2022), where the authors describe the process of creating the dataset. First, they write:

We acquired an export of 16,669 MARC 21 records from Orbis, Yale University’s online library catalogue, selected by choosing all Lewis Walpole Library records with “k” (two-dimensional nonprojectable graphic) in the “Leader - Type of Record” field.

They then explain:

In order to investigate the influence of the Catalogue of Political and Personal Satires, and in particular Mary Dorothy George’s voice, on later cataloguing at the Lewis Walpole Library, for our first selection of records we created a corpus of descriptions from the MARC field 520 that - based on evidence from other MARC fields in the record - could not have been based directly on the Catalogue of Political and Personal Satires. Specifically we aimed to select those Lewis Walpole Library records that describe satirical prints that are not in BMSat. Hence we selected all records that met all the following criteria: (i) the string ‘Satires (Visual Works)’ appears in the MARC 655 field, and/or ‘Caricatures and Cartoons’ appears in the MARC 600 field; (ii) in the MARC 500 and 510 fields there are no string matches for patterns that characterise the most common ways in which BMSat and British Museum registration numbers are written; (iii) the string ‘not in the catalogue of prints and drawings’ appears in the MARC 500 field (case insensitive matching); (iv) there are no matches for the string ‘ — british museum online catalogue’ in the MARC 520 field (case insensitive matching); and, (v) there is free text in the MARC 520 field but that free text is not enclosed in quotation marks. This gave 543 records
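The five criteria quoted above amount to a filter over MARC records, which can be sketched in Python as follows. This is an illustration only, not the authors' code. It assumes a hypothetical record shape (a dictionary mapping a MARC tag to a list of field strings), and the two regular expressions are stand-ins, since the quoted passage does not give the patterns the authors used to detect BMSat and British Museum registration numbers. Criterion (v) is read here as "at least one non-empty 520 value, and none wrapped in quotation marks", which is also an interpretation.

```python
import re

# ASSUMPTION: illustrative stand-ins for the authors' patterns, which the
# quoted text does not specify.
BM_NUMBER_PATTERNS = [
    re.compile(r"\bBMSat\s*\d+", re.IGNORECASE),  # e.g. "BMSat 1234"
    re.compile(r"\b\d{4},\d{4}\.\d+"),            # e.g. "1868,0808.4567"
]


def fields(record, tag):
    """Return all values for a MARC tag (hypothetical dict-of-lists record)."""
    return record.get(tag, [])


def contains(record, tag, needle, ignore_case=False):
    """True if any value of the tag contains the needle as a substring."""
    if ignore_case:
        needle = needle.lower()
        return any(needle in value.lower() for value in fields(record, tag))
    return any(needle in value for value in fields(record, tag))


def is_quoted(text):
    """True if the text starts and ends with a straight or curly double quote."""
    stripped = text.strip()
    return len(stripped) >= 2 and stripped[0] in '"\u201c' and stripped[-1] in '"\u201d'


def meets_criteria(record):
    """Apply criteria (i) to (v) from the quoted passage to one record."""
    # (i) genre or subject string marks the record as satire or caricature
    is_satire = (contains(record, "655", "Satires (Visual Works)")
                 or contains(record, "600", "Caricatures and Cartoons"))
    # (ii) no BMSat or registration number in the 500 and 510 fields
    notes = fields(record, "500") + fields(record, "510")
    no_bm_numbers = not any(p.search(v) for p in BM_NUMBER_PATTERNS for v in notes)
    # (iii) explicit "not in the catalogue" note, case-insensitive
    not_in_catalogue = contains(
        record, "500", "not in the catalogue of prints and drawings", ignore_case=True)
    # (iv) description not attributed to the BM online catalogue;
    # \u2014 is the dash that appears in the quoted string
    no_bm_online = not contains(
        record, "520", " \u2014 british museum online catalogue", ignore_case=True)
    # (v) free text present in 520, and not enclosed in quotation marks
    descriptions = [v for v in fields(record, "520") if v.strip()]
    has_unquoted_text = bool(descriptions) and not any(is_quoted(v) for v in descriptions)
    return (is_satire and no_bm_numbers and not_in_catalogue
            and no_bm_online and has_unquoted_text)
```

A record passes only if all five checks hold, so calling `meets_criteria` on each record in an export and keeping those that return `True` mirrors the "met all the following criteria" wording of the passage.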

Getting help

User support documentation can be found on the main AntConc page. There are also general and specialist tutorials about using AntConc available on the web, including: