Page Delivery Service (PDS)

What is the Page Delivery Service?
Who can use PDS?
What are PDS file requirements?
Captions for page images
Setting maximum delivery size
Sample Harvard collections using PDS

What is the Page Delivery Service?

The Page Delivery Service (PDS) delivers scanned page images of books, diaries, reports, journals, and other multi-page documents from the collections of the Harvard libraries to a web browser.

Documents delivered by PDS can be used in ways similar to their print counterparts, for example, browsed through a table of contents or viewed page-by-page. PDS also offers tools to manipulate pages (zoom, rotate, pan), a full-text keyword search of document contents, and an option to create a PDF of the document for printing.

To learn more about the PDS web interface, consult the PDS User Guide.

Who can use PDS?

Any Harvard organizational entity that is registered to use DRS is eligible to use PDS. All of the files associated with digital objects to be delivered in PDS must be stored in DRS. To verify your DRS status, check the DRS Owner List. To register for DRS, consult the Planning section of the DRS website. For more information on how to take advantage of the Page Delivery Service, send an inquiry to the Digital Projects Team in LTS.

What are PDS file requirements?

The Page Delivery Service can deliver page content as page image files, as plain text files, or as both. To use PDS, these image and text files, along with a structural metadata file, must be stored in the DRS.

PDS requires a separate digital file for each page of a page-turned object. For instance, if a page-turned object consists of images and text, each page needs a separate image file and a separate plain text, UTF-8 encoded OCR file.

A PDS document can consist of text only, of images only, or of text and images together. A page can have more than one image file, as long as the files represent the same page image and are in different formats. For instance, one page can have a TIFF file as a master copy, a JPEG 2000 file as a deliverable copy, and a plain text, UTF-8 encoded OCR file.
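The per-page rules above can be checked mechanically before deposit. The following is a minimal sketch, not part of PDS itself; the function name `page_problems` and the format labels ("tiff", "jp2", "jpeg", "gif", "text") are hypothetical and chosen only for illustration.

```python
# Hypothetical format labels; the real DRS/PDS identifiers may differ.
IMAGE_FORMATS = {"tiff", "jp2", "jpeg", "gif"}


def page_problems(formats):
    """Return a list of problems for one page.

    formats is a list of format labels for the files that belong to
    a single page, for example ["tiff", "jp2", "text"].
    """
    problems = []
    images = [f for f in formats if f in IMAGE_FORMATS]
    has_text = "text" in formats

    # A page must be backed by at least an image file or a text file.
    if not images and not has_text:
        problems.append("page has neither an image file nor a text file")

    # Several image files are allowed only if their formats differ.
    if len(images) != len(set(images)):
        problems.append("more than one image file in the same format")

    return problems
```

For example, `page_problems(["tiff", "jp2", "text"])` reports no problems, while `page_problems(["tiff", "tiff"])` flags the duplicated format.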

Page images of a document can be bitonal (black and white), grayscale, or color. PDS delivers page images in JPEG or GIF format, and can create delivery images from JPEG2000 JP2 or TIFF masters. The following table describes these options:

Page image deposited as: PDS will deliver as:

If more than one of these formats is available for a document, PDS will use this order of preference: (1) JPEG/GIF (whichever is listed first for that page in the METS file), (2) JPEG2000 JP2, (3) TIFF.
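The order of preference can be expressed as a small selection function. This is only a sketch of the rule as stated above, not PDS code; the name `choose_delivery_source`, the format labels, and the file identifiers are hypothetical.

```python
# Lower rank means higher preference: JPEG/GIF, then JPEG2000 JP2, then TIFF.
# The labels are hypothetical stand-ins for whatever the METS file records.
PREFERENCE = {"jpeg": 0, "gif": 0, "jp2": 1, "tiff": 2}


def choose_delivery_source(page_files):
    """Pick the preferred file for one page.

    page_files is a list of (format, file_id) tuples in the order the
    files are listed for that page in the METS file. Returns the chosen
    tuple, or None if no recognized image format is present.
    """
    candidates = [pair for pair in page_files if pair[0] in PREFERENCE]
    if not candidates:
        return None
    # min() keeps the first of several equally ranked items, so when both
    # a JPEG and a GIF exist, the one listed first wins.
    return min(candidates, key=lambda pair: PREFERENCE[pair[0]])
```

For a page listed as TIFF, then GIF, then JPEG, the function returns the GIF entry, because JPEG and GIF share the top rank and the GIF is listed first.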

Note: If the PDS TIFF-to-GIF service is used, TIFF images must be bitonal and use CCITT Group 4 fax compression. The required TIFF values are:

  • Compression = 4 (CCITT Group 4)
  • PhotometricInterpretation = 0 (WhiteIsZero)
  • BitsPerSample = 1
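A check against these three values might look like the sketch below. It assumes the tag values have already been read from the TIFF header into a plain dictionary keyed by tag name; how they are read (for example, with an imaging library) is left out, and the function name `tiff_to_gif_problems` is hypothetical.

```python
# Required values for the PDS TIFF-to-GIF service, as listed above.
REQUIRED_TIFF_TAGS = {
    "Compression": 4,                # CCITT Group 4
    "PhotometricInterpretation": 0,  # WhiteIsZero
    "BitsPerSample": 1,              # bitonal
}


def tiff_to_gif_problems(tags):
    """Compare a dict of TIFF tag values with the required values.

    Returns a list of human-readable mismatches; an empty list means
    the image meets the stated requirements.
    """
    problems = []
    for name, required in REQUIRED_TIFF_TAGS.items():
        actual = tags.get(name)  # None if the tag is missing
        if actual != required:
            problems.append(f"{name}: expected {required}, found {actual}")
    return problems
```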

Plain UTF-8 encoded text for a document must be deposited in DRS for the text to be searchable in PDS. Plain text files are created by using OCR (optical character recognition) during the page scanning process or by manually rekeying the text. The plain text can be used solely to support keyword searching in PDS, or it can also be offered as a display option. It is also possible to have PDS offer plain text as the only display option for a document.
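Because the text files must be UTF-8 encoded, it can be useful to confirm the encoding before deposit. The sketch below is one simple way to do that; the function name `is_valid_utf8` is hypothetical, and the check only confirms that the bytes decode as UTF-8, not that the OCR output is accurate.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A file's contents can be checked by reading it in binary mode and passing the bytes to the function. Text saved in a legacy encoding such as Latin-1 typically fails this check as soon as it contains an accented character.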

Structural metadata identifies all the components of a document, describes its structure, and allows for page-turning navigation. Each PDS document (or group of documents) must be accompanied by a structural metadata file formatted in XML according to the standardized METS schema, an encoding format for descriptive, administrative, and structural metadata for textual and image-based works. (See the Harvard METS profile for PDS for more information about the Harvard METS implementation for page-turned materials.)

This XML file also contains descriptive metadata about the original source material that is used for the display labels in PDS and to maintain a permanent connection between the digital object and the original source material. The XML metadata file can be produced in-house or by a reformatting vendor. Details about PDS structural metadata are provided in the PDS Workflow document.
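To give a rough idea of what structural metadata looks like, the sketch below builds a bare METS structMap in which each page is a div that points to a file. It uses only generic METS elements and attributes (structMap, div, fptr, TYPE, ORDER, LABEL, FILEID). The TYPE values, the file ID pattern, and the function name are assumptions made for illustration and are not taken from the Harvard METS profile for PDS, which should be consulted for the actual requirements.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def build_struct_map(page_labels):
    """Build a minimal METS structMap with one div per page.

    page_labels is a list of display labels in page order.
    File IDs of the form "img1", "img2", ... are hypothetical.
    """
    ET.register_namespace("mets", METS_NS)
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    document = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "document"})
    for order, label in enumerate(page_labels, start=1):
        page = ET.SubElement(
            document,
            f"{{{METS_NS}}}div",
            {"TYPE": "page", "ORDER": str(order), "LABEL": label},
        )
        # Each page div points to the file that carries its content.
        ET.SubElement(page, f"{{{METS_NS}}}fptr", {"FILEID": f"img{order}"})
    return struct_map


if __name__ == "__main__":
    tree = build_struct_map(["Title page", "Page 1"])
    print(ET.tostring(tree, encoding="unicode"))
```

The ORDER attribute carries the page sequence used for page turning, and the LABEL attribute carries the text a reader would see for that page.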

Captions for page images

DRS collection curators can request that image captions be applied to the page images stored in the DRS and delivered by the Page Delivery Service. See Image Captions [pdf] for more information.

Setting maximum delivery size

The default maximum size for digital images delivered by the Page Delivery Service (PDS) is 2400 pixels in the largest dimension. Curators can, however, set a lower maximum delivery size for their page images on a per-billing-code basis. See Image Delivery Size Restriction [pdf] for more information.
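The effect of a maximum delivery size can be illustrated with a short calculation. The sketch below assumes that an oversized image is scaled down proportionally so that its largest dimension equals the limit; the source does not describe how PDS resizes images, so the scaling and rounding here are assumptions, and the function name is hypothetical.

```python
DEFAULT_MAX_PIXELS = 2400  # default limit on the largest dimension


def delivery_dimensions(width, height, max_pixels=DEFAULT_MAX_PIXELS):
    """Return (width, height) after applying the size limit.

    Images already within the limit are returned unchanged. Larger
    images are scaled so the largest dimension equals max_pixels,
    keeping the aspect ratio (an assumption, not documented behavior).
    """
    largest = max(width, height)
    if largest <= max_pixels:
        return width, height
    scale = max_pixels / largest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

With the default limit, a 4800 by 3600 pixel image would be delivered at 2400 by 1800 pixels under these assumptions, while a 1000 by 800 pixel image would be unchanged.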

Sample Harvard collections using PDS

The Harvard/Radcliffe Online Historical Reference Shelf provides links to annual reports and narrative histories that are delivered by PDS.

The Women Working collection compiled by the HUL Open Collections Program offers a variety of digitized books, diaries, and manuscripts delivered through PDS.