Purchasing, Processing and Loading
Sets of Bibliographic Records in The HOLLIS Catalog:
Considerations


Libraries planning to purchase sets of bibliographic records from vendors have many things to consider, including how best to process and load these records into the HOLLIS Catalog. This document briefly summarizes these considerations. Libraries planning such data loads are required to review the Harvard University Library Bibliographic Standards (http://hul.harvard.edu/cmtes/haac/bibstan0.html) and to consult with the Data Loads Advisory Subcommittee.

Special considerations: Reproductions, data files, rare and special materials, and non-roman language materials are types of material that may require special attention during data loads. For example, it is increasingly likely that the HOLLIS Catalog already includes titles found in reproduction sets; vendor records for rare books may be inferior to the cataloging already in the HOLLIS Catalog; JACKPHY records will have transliteration and/or vernacular field considerations.

Several sets of bibliographic records have been loaded into the HOLLIS Catalog recently from various vendors, including records from OCLC's Major Microforms (91,760 in 1994), ATLA monographs (26,000 in 1990-95), and Primary Source Media's Western Books (5,209 in 1999). Examples may be viewed in the HOLLIS Catalog under the following series titles:

ATLA monograph preservation program
Black biographical dictionaries, 1790-1950
Columbia University oral history collection
Early American history research reports
Early American imprints. Second series
English books, 1641-1700
German baroque literature, Harold Jantz collection
Hein's legal theses and dissertations
National Resources Planning Board reports and records, 1934-1943
Shakespeare and the stage. Series one
Western Americana
Western Books on Asia: Japan
Western Books: The Middle East from the Rise of Islam
19th-century legal treatises

A. Determining Costs

There are three types of costs associated with loading sets of bibliographic records, which together constitute the total "purchase" price of a data load:

  1. Initial cost of the vendor records
  2. Charges by a third-party vendor for preparing records for loading into the HOLLIS Catalog
  3. Anticipated costs of manual clean-up after the data load

B. Evaluating Vendor Records

Harvard University Library has an established set of standards for records in the HOLLIS Catalog. Vendor records should be in MARC21 format, using the MARC21 character set. Vendor cataloging should follow AACR2, with subject headings established according to LCSH and/or MeSH. Obsolete fields should not be used. Non-roman-alphabet records must conform to transliteration guidelines in use in Harvard University libraries.

Request a representative sample set of records from the vendor; check the cataloging for completeness and accuracy. Fields to examine carefully include:

  1. Fixed fields

    a. Check the values for
    • Place of publication (MARC21 008, 15-17)
    • Date of publication (MARC21 008, 07-14)
    • Reproduction (MARC21 008, 23)

    b. Note the values assigned to
    • Cataloging source (MARC21 008, 39)
    • Encoding level (MARC21 Leader, 17)

    These values will affect the outcome of duplicate detection and resolution as records are loaded into the HOLLIS Catalog (for example, non-OCLC vendor records will likely have an encoding level of "blank"). The Data Loads Advisory Subcommittee will recommend appropriate values for these fixed fields for each data load.

  2. Identification numbers (MARC21 020, 035, etc.)

    Watch for a set ISBN (MARC21 020) that appears on multiple records, which may cause overlay problems in the HOLLIS Catalog.

  3. Name headings and added entries

    a. Are they complete and do they follow AACR2?
    b. Will additional authority processing be necessary?

  4. Subject headings

    a. Are subject headings present?
    b. Are headings taken from LCSH or MeSH?
    c. Will additional authority processing be necessary?

  5. Series

    Is there a series name associated with the set of bibliographic records? If so, does the HOLLIS Catalog already include entries for this series?
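The fixed-field checks in item 1 can be sketched in a few lines of Python. This is a minimal illustration, not a description of any HOLLIS or OIS tool: it assumes the Leader and 008 of each sample record are already available as plain strings, that the 008 follows the books layout (where position 23 is form of item), and the names `FixedFields` and `read_fixed_fields` and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedFields:
    date_type: str          # 008/06
    date1: str              # 008/07-10
    date2: str              # 008/11-14
    place: str              # 008/15-17
    form_of_item: str       # 008/23 (books layout; an assumption of this sketch)
    cataloging_source: str  # 008/39
    encoding_level: str     # Leader/17


def read_fixed_fields(leader: str, field_008: str) -> FixedFields:
    """Slice the positions named in the checklist out of the raw strings."""
    if len(leader) != 24:
        raise ValueError(f"Leader must be 24 characters, got {len(leader)}")
    if len(field_008) != 40:
        raise ValueError(f"008 must be 40 characters, got {len(field_008)}")
    return FixedFields(
        date_type=field_008[6],
        date1=field_008[7:11],
        date2=field_008[11:15],
        place=field_008[15:18],
        form_of_item=field_008[23],
        cataloging_source=field_008[39],
        encoding_level=leader[17],
    )


# Made-up sample values, for illustration only.
sample_leader = "00000nam a2200000 a 4500"
sample_008 = (
    "990101"  # 00-05 date entered on file
    "r"       # 06    type of date
    "1995"    # 07-10 date 1
    "1850"    # 11-14 date 2
    "mau"     # 15-17 place of publication
    "     "   # 18-22 not examined here
    "a"       # 23    form of item
    "     "   # 24-28 not examined here
    "000"     # 29-31 not examined here
    " "       # 32    undefined
    "0"       # 33    not examined here
    " "       # 34    not examined here
    "eng"     # 35-37 language
    " "       # 38    modified record
    "d"       # 39    cataloging source
)

fields = read_fixed_fields(sample_leader, sample_008)
print(fields)
```

Printing these values for every record in the vendor sample gives a quick view of which codes the vendor assigns, which is the information the Data Loads Advisory Subcommittee needs when recommending values for the load.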

In addition, the library should check the HOLLIS Catalog for print or reproduction copies of the sample titles, determine in each case whether the HOLLIS Catalog record or the vendor record is preferred, and keep statistics on the number of duplicates found.
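As part of the same sample review, the set-ISBN check from item 2 can be approximated by grouping records on their 020 values. The sketch below assumes the 020 $a values have already been extracted into simple Python lists; the record identifiers, the ISBNs, and the helper names `normalize_isbn` and `find_shared_isbns` are hypothetical.

```python
from collections import defaultdict


def normalize_isbn(raw: str) -> str:
    """Keep only the number: drop qualifiers such as '(set)' and any hyphens."""
    parts = raw.strip().split()
    if not parts:
        return ""
    return parts[0].replace("-", "").upper()


def find_shared_isbns(records):
    """Return {isbn: [record ids]} for every ISBN found on more than one record.

    `records` is an iterable of (record_id, list_of_020_values) pairs.
    """
    seen = defaultdict(set)
    for record_id, isbns in records:
        for raw in isbns:
            isbn = normalize_isbn(raw)
            if isbn:
                seen[isbn].add(record_id)
    return {isbn: sorted(ids) for isbn, ids in seen.items() if len(ids) > 1}


# Made-up sample data: two records carry the same set-level ISBN.
sample = [
    ("vendor-001", ["0123456789 (set)"]),
    ("vendor-002", ["0-12-345678-9 (set)", "0987654321"]),
    ("vendor-003", ["1111111111"]),
]

shared = find_shared_isbns(sample)
print(shared)
```

Any ISBN reported here is a candidate for causing overlays, since several distinct titles would match on the same number during duplicate detection.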

C. Pre-Processing Options

Before records load into the HOLLIS Catalog, several pre-processing options may be considered. Pre-processing supplied by an outside vendor may:

  1. Correct the fixed field place of publication code (MARC21 008, 15-17), if it reflects the reproduction instead of the original
  2. Correct the fixed field reproduction code (MARC21 008, 23)
  3. Correct the fixed field date code (MARC21 008, 07-14) to reflect the date in the 260 imprint field
  4. Assign fixed field cataloging source (MARC21 008, 39) and encoding level (MARC21 Leader, 17) codes as advised by the Data Loads Advisory Subcommittee, to prevent overlay problems in the HOLLIS Catalog
  5. Remove the 245 $h designation (unless it represents the original format of the piece); keep ISBD punctuation during this process
  6. Create a 533 field for reproductions (reproductions of special and rare materials may already have print cataloging counterparts in the HOLLIS Catalog)
  7. Create an 830 field for the series; if a reproduction set has its own series, add $i to the end of the series statement to retain the series during duplicate resolution
  8. Create appropriate holdings information for loading into the HOLLIS Catalog
  9. Create a call number in the holdings 852 field
  10. Perform authority work (headings which do not conform to LCSH, MeSH, or AACR2 are candidates for authority control services)
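To make options 1 through 4 concrete, the following sketch overwrites the relevant fixed-field positions while leaving the rest of the Leader and 008 untouched. It is only an illustration of the mechanics: the function names are hypothetical, the records are assumed to be available as raw strings, and the replacement codes shown are placeholders, since the actual values come from the cataloging itself and from the Data Loads Advisory Subcommittee.

```python
def overwrite(text: str, start: int, value: str) -> str:
    """Return `text` with `value` written at `start`; the length never changes."""
    end = start + len(value)
    if start < 0 or end > len(text):
        raise ValueError("value does not fit inside the field")
    return text[:start] + value + text[end:]


def correct_fixed_fields(leader, field_008, *, dates, place,
                         form_of_item, cataloging_source, encoding_level):
    """Apply pre-processing options 1-4 to one record's Leader and 008."""
    if len(leader) != 24 or len(field_008) != 40:
        raise ValueError("expected a 24-character Leader and a 40-character 008")
    expected_lengths = [(dates, 8), (place, 3), (form_of_item, 1),
                        (cataloging_source, 1), (encoding_level, 1)]
    if any(len(value) != size for value, size in expected_lengths):
        raise ValueError("a replacement value has the wrong length")
    field_008 = overwrite(field_008, 7, dates)               # 008/07-14
    field_008 = overwrite(field_008, 15, place)              # 008/15-17
    field_008 = overwrite(field_008, 23, form_of_item)       # 008/23
    field_008 = overwrite(field_008, 39, cataloging_source)  # 008/39
    leader = overwrite(leader, 17, encoding_level)           # Leader/17
    return leader, field_008


# Made-up vendor values; the replacement codes below are placeholders only.
vendor_leader = "00000nam a2200000 a 4500"
vendor_008 = "990101s1995    miu" + " " * 17 + "eng" + " d"

new_leader, new_008 = correct_fixed_fields(
    vendor_leader,
    vendor_008,
    dates="1850    ",
    place="enk",
    form_of_item="a",
    cataloging_source="d",
    encoding_level="1",
)
print(repr(new_leader))
print(repr(new_008))
```

Because both fields are fixed-length, the sketch validates lengths before and after every change; a shifted 008 would silently corrupt every position that follows it.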

D. Loading/Processing the Records

The Office for Information Systems is responsible for loading the data and for producing reports on the results of the data load, including lists of possible duplicates requiring resolution by staff.

Regular processing: Records are loaded and sent through duplicate detection and resolution processes.

Non-regular processing: It may be possible to set flags that will affect the processing of an entire data load. The Data Loads Advisory Subcommittee will evaluate each project to determine whether non-regular processing is advisable.

E. Clean-Up in The HOLLIS Catalog

The library that requests the data load is responsible for any data clean-up activities that the load requires; the library must plan accordingly for the staff workload. Figures from the recent load of 5,209 Primary Source Media records may be used as benchmarks for future loads:

  1. 352 (6.76%) unresolved possible duplicates
  2. 43 (0.83%) inappropriately merged HOLLIS/HULPR LOCs which had to be untangled

These problems were detected through regular OIS processing.
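For staff planning, the benchmark counts can be scaled to the size of a proposed load. The sketch below assumes, purely for illustration, that problem rates scale linearly with the number of records; a single load of one type of material is a rough guide at best, and the function name and the planned load size are hypothetical.

```python
# Counts reported for the Primary Source Media load.
BENCHMARK_RECORDS = 5209
BENCHMARK_UNRESOLVED_DUPLICATES = 352
BENCHMARK_INAPPROPRIATE_MERGES = 43


def scale_up(benchmark_count: int, planned_records: int) -> int:
    """Scale a benchmark count to a planned load size, rounding up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-benchmark_count * planned_records // BENCHMARK_RECORDS)


def estimate_cleanup(planned_records: int) -> dict:
    """Rough clean-up estimate, assuming the benchmark rates carry over."""
    return {
        "unresolved_duplicates": scale_up(
            BENCHMARK_UNRESOLVED_DUPLICATES, planned_records),
        "inappropriate_merges": scale_up(
            BENCHMARK_INAPPROPRIATE_MERGES, planned_records),
    }


# Hypothetical planned load of 10,000 records.
estimate = estimate_cleanup(10_000)
print(estimate)
```

The estimate covers only the problems that appear in regular OIS reports; it says nothing about clean-up that surfaces later through everyday use of the catalog.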

Other clean-up may include split files of name and subject headings, and duplicate 830 fields. These problems, however, will not appear in regular OIS reports; instead, they will be discovered during the everyday use of the HOLLIS Catalog by staff and patrons.

Data loads of reproductions are likely to affect records in the HOLLIS Catalog from various libraries and departments across the university, including film preservation units. Libraries responsible for data load clean-up may find it necessary to provide additional staff training so that the clean-up can be handled efficiently.


Report prepared by the Ad Hoc Data Loads Subcommittee: Lynda Kresge, Helen Schmierer, Janet Rutan, Mollie Della Terza, Anne Kern; approved by The Standing Subcommittee on Bibliographic Standards at its August 1999 meeting; reviewed and accepted by HOLLIS Steering Committee liaisons, July 2000; "USMARC" changed to "MARC21" throughout, February 2001; adjusted for current terminology, May 2004.


