Systems & Planning: Office for Information Systems

Highlights from FY 2008




Harvard’s HOLLIS (Harvard Online Library Information System) Catalog is the primary discovery tool for library collections located throughout the University including books, journals, electronic resources, manuscripts, government documents, maps, microforms, music scores, sound recordings, visual materials and data sets. In addition, HOLLIS serves as the Library’s processing system for managing, acquiring, cataloging, processing, and circulating library materials.

In FY 2008, as in the past several years, improvements to HOLLIS and its underlying Aleph software have been focused on increasing staff productivity. Throughout the year, OIS worked to improve key productivity tools and to add several services that allow library staff to process new materials more efficiently.

  • The Library Reporting System was enhanced and expanded to give library staff a facility for running regular, templated reports and for designing their own ad hoc queries against the library collections data. The reporting tool allows library managers to design more efficient workflows, to measure productivity, and to track and manage large collections of library materials.
  • With support from OIS, a university-wide task group expanded the use of a macro building tool that allows library staff to increase efficiency by automating many data entry procedures and to reduce errors in highly repetitive data entry.
  • In addition, OIS initiated electronic data interchange (EDI) for orders and invoices. This is yet another example of how routine tasks can be automated to improve accuracy and efficiency and to speed the delivery of new materials to library shelves.
  • Data from Biblioteca Berenson at the Villa I Tatti was migrated into Aleph in FY 2008, marking the end of a multi-year process to incorporate all of Harvard’s major library collections into one shared, integrated library system.

DASH—Digital Access to Scholarship at Harvard

On February 12, 2008, the Harvard Faculty of Arts and Sciences (FAS) approved an open-access resolution granting the University permission to make FAS scholarly articles openly available. The measure requested that the Harvard University Library create an open-access institutional repository to disseminate these articles. On May 1, 2008, the Harvard Law School (HLS) faculty approved a similar measure. With those mandates, HUL established an Office for Scholarly Communication to implement the open-access policies, and to partner with HUL’s Office for Information Systems in building and running the repository, which was given the name DASH—Digital Access to Scholarship at Harvard.

Working closely with Professor Stuart M. Shieber, director of the Office for Scholarly Communication, OIS chose the open-source DSpace software platform as a base from which to customize and build DASH. Each school at Harvard will be represented as a “community” in DASH. The DSpace software will also be substantially customized, with a Harvard look and feel; Harvard-specific communities, collections, and metadata; Harvard ID login for submitters; Harvard-specific licensing options; and other changes. A software developer was hired on a contract basis for the DASH project in mid-2008 and worked to implement these requirements in anticipation of a fall 2008 launch of a Harvard-internal beta version of the repository. The DASH beta’s purpose is to allow faculty and other Harvard researchers to deposit electronic versions of their papers in DASH, increasing the quantity of content in advance of a public unveiling in 2009.

Editor's note: DASH, in beta form, was made available to the general public on September 1, 2009.

Harvard–Google Project

In FY 2008, book-scanning projects associated with the Harvard–Google Project were completed in 10 Harvard libraries: the Botany Libraries (FAS), the Godfrey Lowell Cabot Science Library (Harvard College Library), the Fine Arts Library (HCL), Harvard Law School Library, Harvard University Archives (HUL), Lamont Library (HCL), the Eda Kuhn Loeb Music Library (HCL), the John G. Wolbach Library (FAS), Schlesinger Library (Radcliffe Institute), and Tozzer Library (HCL).

In April 2008, OIS began to download Harvard copies of all Google-digitized books. These files are stored and preserved in Harvard’s Digital Repository Service for possible future projects and to ensure long-term access to these valuable works.

Google Book Search for Harvard

Following an extended collaboration between the Harvard University Library and Google, a Harvard-customized version of Google Book Search was launched in March 2008. This significant new version offers users the option to search the full text of all books available in Google Book Search—whether contributed by Harvard, another library, or the publisher.

Users of Google Book Search for Harvard see “Find at Harvard University” links displayed with every item in a search result set. By clicking these links, library users reach individual catalog records when exact matches are found in HOLLIS—together with information on location and availability within the Harvard library system. If an exact match in HOLLIS is not found, a pre-populated HOLLIS search screen opens, making it easy for the patron to launch a new HOLLIS search session.
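
The link behavior described above can be sketched as a small resolver. This is only an illustration under stated assumptions: the report does not say how exact matches are determined, so matching on identifiers such as ISBN, the catalog base URL, the URL paths, and the function name are all hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical catalog address; the real HOLLIS URLs are not given in the report.
CATALOG_BASE = "https://catalog.example.edu"


def find_at_library_url(book: dict, catalog_index: dict) -> str:
    """Return a catalog link for a book found in a full-text search.

    catalog_index maps (identifier type, identifier value) to a catalog
    record number. Matching on identifiers is an assumption made for
    this sketch.
    """
    for key in ("isbn", "oclc"):
        ident = book.get(key)
        record_id = catalog_index.get((key, ident)) if ident else None
        if record_id:
            # Exact match: link straight to the individual catalog record.
            return f"{CATALOG_BASE}/record/{record_id}"
    # No exact match: open a search screen pre-populated with what is known.
    query = urlencode({
        "title": book.get("title", ""),
        "author": book.get("author", ""),
    })
    return f"{CATALOG_BASE}/search?{query}"
```

A book with a known identifier resolves to its record page, while any other book falls back to a search URL that already carries its title and author.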

Any Internet user can access Google Book Search for Harvard from the “Harvard Libraries” portal or by linking directly to

Google Book Search for Harvard is also incorporated in “E-Research @ Harvard Libraries” as a new entry on the “Quick Jump” e-resource list. Users with current Harvard IDs and PINs can access the full text of e-books licensed by Harvard. Users will be taken directly to the full text of the e-books selected.

Exposing Deep Content

In FY 2007, the University Library Council established as its number-two priority for OIS a project to provide access to Harvard digital collections through Internet search engines. Internet search engines “crawl” (i.e., find and index) content from what is known as the “surface web”: web pages that can be located by browsing a web site. They cannot normally index the “deep web,” that is, data stored in databases that can only be reached through a search interface (like HOLLIS). Library systems must be modified to make their deep web content available to search engines. For HUL, the applications that would benefit most from the greater exposure of Internet search engines were those that focus primarily on providing access to digital content, such as Virtual Collections, VIA, TED, OASIS, HGL, and the book objects deliverable by the Page Delivery Service (PDS).

To fulfill the goals of this project, OIS designed a set of modifications to these discovery applications that included creating a special index that would be search-engine-crawlable for all records in each system. This index points crawlers to special display pages for each data record. The display pages are optimized for search-engine indexing, augmented with Dublin Core metadata, and simplified in their presentation.
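
A minimal sketch of the two pieces described above, a crawlable index and a simplified per-record display page carrying Dublin Core metadata, might look like the following. The record structure, function names, and URLs are assumptions made for illustration and are not taken from the OIS implementation; the `DC.`-prefixed `<meta>` names and the `schema.DC` link follow the common convention for embedding Dublin Core in HTML.

```python
from html import escape


def render_record_page(record: dict) -> str:
    """Build a simplified display page with Dublin Core <meta> tags.

    record is assumed to look like {"id": ..., "dc": {"title": ..., ...}}.
    """
    meta = "\n".join(
        f'<meta name="DC.{escape(field)}" content="{escape(value)}">'
        for field, value in record["dc"].items()
    )
    title = escape(record["dc"].get("title", record["id"]))
    return (
        "<html><head>\n"
        f"<title>{title}</title>\n"
        '<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">\n'
        f"{meta}\n"
        "</head><body>\n"
        f"<h1>{title}</h1>\n"
        "</body></html>"
    )


def render_crawl_index(records: list, base_url: str) -> str:
    """Build a plain list of links, one per record, for crawlers to follow."""
    links = "\n".join(
        f'<li><a href="{escape(base_url)}/{escape(r["id"])}">'
        f'{escape(r["dc"].get("title", r["id"]))}</a></li>'
        for r in records
    )
    return f"<html><body><ul>\n{links}\n</ul></body></html>"
```

Because the index is ordinary linked HTML, a crawler can reach every record page by browsing alone, without ever submitting a query to the application's search interface.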

By July of 2008, OASIS, TED, and Virtual Collections had been updated in production. The result has been a dramatic increase in discovery of and access to Harvard collections by Internet users. For example, the OASIS application now receives six times as many hits from users who locate an OASIS finding aid in Google Web Search as it does from users who search through the Harvard-developed OASIS user interface. TED collections nearly doubled the number of hits in comparison to those from the TED user interface itself. Virtual Collections crawlability was released just as FY 2008 drew to a close. OIS expects that, as it continues to provide access to search engines, these numbers will grow and will go a long way toward making Harvard’s digital collections broadly available to scholars far beyond the walls of Harvard Yard.

Task Group on Discovery and Metadata

Following months of review, study, and consultation with several national experts, the Task Group on Discovery and Metadata presented its final report to the ULC in September 2007. As stated in the final report of the group:

“The Task Force was asked to provide the University Library Council (ULC) with both a way to think about these developments and with specific recommendations for short-term action. Unlike other, more formal, studies done elsewhere, the aim was not to present a systematic analysis of the options for libraries and a fully worked out action plan. Rather the work of the group was based on the assumption that there will be continuing and rapid change in these domains, and that the Council will need to revisit the topics repeatedly in years to come. Given this assumption, spending a considerable period of time studying the landscape and deciding on the “right” strategy is inappropriate. In this environment of dramatic change and continuous surprises, education and awareness are critical, and short-term plans are more effective than long-term ones.”

A full copy of the Task Group’s final report is available. An open presentation and discussion of the report for the Harvard library community was held in February 2008.

The primary recommendation of the task group was that Harvard should immediately move ahead with a project to evaluate, select, and implement a new platform for the HOLLIS online public catalog. A new discovery environment for Harvard would incorporate new and innovative features such as faceted browsing and relevancy ranking. The task group was reconstituted and charged to make a recommendation by the end of the academic year. With this challenging timeframe driving the process, the task force identified the five most likely system options and designed a process that would allow it to evaluate all options quickly and effectively. The criteria used to evaluate options were refined based on the work and recommendations of the previous group. Five sub-teams were formed so the evaluations could all be completed promptly. The Task Force met with vendors and communicated with peer institutions. An open meeting to update the library community and gather feedback was held on May 5, 2008. The final report was presented to ULC in July 2008.

Editor's note: The Task Force presented its final report to the ULC on July 1, 2008. Following intense technical evaluation, the ULC selected AquaBrowser as the new platform in February 2009. OIS launched the new platform in April 2009 at

Assessment and Measurement

In FY 2008, OIS took on a ULC priority project to better measure the use of systems and services developed through the University’s Library Digital Initiative (LDI). An analysis of the systems was undertaken and methods to measure the important functions of each were implemented. The project resulted in new online monthly reports of system statistics.

HGL Usability Study and Redesign

In 2006, the Harvard Geospatial Library (HGL) Standing Committee identified “usability” as a key factor in limiting the discoverability of HGL resources by the ever-widening customer base for geospatial data at Harvard. The use of digitized maps and map data was exploding among undergraduates, graduate students, and researchers at Harvard, and yet HGL retained a user interface from an era when only GIS specialists were expected to seek access to geospatial data sets.

In FY 2007, as a preliminary step to a new user interface design, the OIS usability and interface librarian conducted a usability study of HGL. The study tested a variety of real-world geospatial data discovery and mapping tasks, and was conducted with a set of users with a wide range of exposure to geospatial information systems and to HGL in particular. The results clearly identified a set of 17 priorities for redesigning and rethinking how users interact with modern discovery and mapping systems. The study also identified accessibility enhancements that would make the system usable by a wider range of potential users.

In FY 2008, the Harvard Geospatial Library underwent a comprehensive redesign of the user interface to incorporate design changes suggested by the HGL usability study. A GIS design consultant worked closely with Harvard College Library staff and OIS usability and development staff to create the new design. At the same time, the HGL software developer investigated new open-source user interface and mapping technologies that would give HGL modern, high-performance underpinnings. By the end of the year, development work was well under way, with a public launch of the new user interface planned for the middle of FY 2009.

Editor's note: The new HGL interface was launched on March 13, 2009.

Digital Preservation Program

Over the course of ten years, the Harvard libraries and museums, working through the Library Digital Initiative, have made substantial gains in developing the infrastructure and the expertise necessary to support the work of acquiring, licensing, scanning, cataloging, storing, and integrating digital library content into the academic enterprise. A key aspect of these developments has been the evolution of new forms of metadata. These are needed not only for discovery and access, but also for the long-term preservation of digital content, which stands as one of the key challenges for librarians in the 21st century.

As FY 2008 came to a close, HUL established a new Digital Preservation Program within OIS. The program is charged with providing leadership in digital preservation efforts across the University, as well as overseeing the rapidly expanding and strategically vital Digital Repository Service (DRS). Harvard’s DRS has been in production for more than eight years, and its future development will focus on the expanding demands for persistent digital asset management.

Initially, the Digital Preservation Program will focus on two broad areas:

  • defining additional infrastructure requirements, including enhancements to the DRS, that more fully support digital preservation, and
  • providing detailed analyses of several new formats for the DRS, including but not limited to PDF files and e-mail.

Communication and Outreach

OIS keeps the Harvard library community informed about access to resources, infrastructure development, digital library projects, and related activities through articles and announcements in Harvard University Library Notes, presentations throughout the University, and the Office for Information Systems web site.

In FY 2008, OIS redesigned its web site to accommodate the growth of information. The Library Digital Initiative web site was integrated into the new design and two new sections were added: one on digital library projects, and a news feature for announcements related to systems and services. The news feature includes a searchable archive of announcements.

A Selection of Web-Accessible Collections continues to provide easy access to many of Harvard’s subject-specific digital collections.

New Collections Based on the OIS Virtual Collections Service

Since 2006, curators and librarians have used the OIS Virtual Collections Service to harvest descriptions and links from Harvard union catalogs and to provide a customized, web-based catalog of these materials for library users. In FY 2008, OIS facilitated the launch of five new “virtual collections” using the Virtual Collections (VC) tool.