Systems and Planning

Highlights from FY 2009

New Discovery Platform/HOLLIS FY 2009

On April 7, 2009, students, faculty, and staff began exploring a trial version of a new web interface for HOLLIS—Harvard’s Online Library Information System. Located at http://discovery.lib.harvard.edu, HOLLIS reflects a new generation of creative thinking about searching on the web as it differs from searches in traditional library catalogs. As users experiment with the new system, the older version—dubbed “HOLLIS Classic”—remains in place, providing traditional search methods at http://hollisclassic.harvard.edu. Both can be accessed from the Harvard Libraries portal. The new system is designed to allow for incremental change and more rapid deployment of new features and sources of data.

DASH—Digital Access to Scholarship at Harvard

DASH, Harvard's open-access repository, provides web-based open access to the scholarly output of the University. On August 1, 2008, the Office for Scholarly Communication and OIS launched the beta version of DASH to the Harvard community. Developed by OIS using the open-source DSpace software platform, the beta release allowed Harvard faculty and other Harvard researchers to deposit electronic versions of scholarly journal articles in DASH in advance of a public unveiling later in 2009. DASH is designed to track the Harvard contributing authors for each article and to link to the Catalyst faculty profile application. By the end of FY 2009, over 1,000 articles had been submitted to the repository.

Editor's note: DASH was opened to the general public on August 31, 2009. Visit http://dash.harvard.edu.

HGL—Harvard Geospatial Library

The Harvard Geospatial Library (HGL) is a powerful tool for discovering, displaying, and downloading a growing digital selection of over 5,000 historical maps and geospatial data sets. As the culmination of a multiyear effort driven by the Harvard Geospatial Library Standing Committee in collaboration with Harvard's Center for Geographic Analysis, OIS launched a completely new user interface for HGL. Reflecting objectives identified in an earlier HGL usability study, the new user interface allows non-GIS specialists to search, browse, and interact with web-based digital maps and GIS data sets in an intuitive and straightforward manner. New functionality allows users to pan and zoom maps in ways that have been popularized by current web-based map applications and to easily view available data sets within any given geographic extent or by librarian-defined sub-collections.

The new HGL utilizes a combination of commercial and open-source components to support high performance and to interoperate with emerging web standards for map rendering. HGL can be searched both through its new web interface and through a plug-in module that integrates seamlessly with desktop GIS applications.

The new HGL interface was launched on March 13, 2009. Visit http://dixon.hul.harvard.edu:8080/HGL/hgl.jsp.

Exposing Deep Content

As a part of Harvard’s strategic objective to further the exposure of Harvard’s collections on the Internet and to make them available for teaching and learning, the University Library Council authorized a project to provide access to Harvard digital collections (Virtual Collections, VIA, TED, OASIS, and HGL) through Internet search engines. To fulfill the goals of this project, OIS implemented a set of modifications to these discovery applications, which allow search engines to access descriptive metadata embedded in Harvard's digital library catalogs.

OIS completed this project in FY 2009. The result has been a dramatic increase in discovery of and access to Harvard collections by Internet users. Finding aids in OASIS receive many more hits from users who locate finding-aid text through Google web searches than they do from users who search through the native, Harvard-based OASIS user interface.

Virtual Collections user referrals from search engines have increased from essentially zero in January 2008 to between 3,000 and 4,000 per month by the end of FY 2009.

WAX—Web Archive Collection Service

On February 9, 2009, the pilot public interface of Harvard’s new Web Archive Collection Service (WAX) was launched and made available to the University community. WAX began as a pilot project in July 2006, funded by the University’s Library Digital Initiative (LDI) to address the management of web sites by collection managers for long-term archiving. It was the first LDI project specifically oriented toward preserving “born-digital” material. WAX has now transitioned to a production system supported by the University Library’s central infrastructure.

For the pilot, which was designed to address the capture, management, storage, and display of web sites for long-term archiving, OIS collaborated with three University partners, each fielding a single project: the Harvard University Archives (Harvard University Library); the Arthur and Elizabeth Schlesinger Library on the History of Women in America (Radcliffe Institute for Advanced Study); and the Edwin O. Reischauer Institute of Japanese Studies (Faculty of Arts and Sciences, with sponsorship from Harvard College Library).

During the pilot, OIS explored the legal terrain and implemented several methods of mitigating risks, investigated various technologies, developed workflow efficiencies for collection managers and technologists, and analyzed and implemented the metadata and deposit requirements for long-term preservation in Harvard's Digital Repository Service (DRS).

By the end of July 2009, OIS had stored 5,159 ARC files for 1,405 WAX harvests representing 141 “seeds” (starting URLs) in the DRS. These included 335 MIME types and 12,133,528 resources (individual HTML pages, images, graphics, audio or video clips, style sheets, scripts, etc.), for a total of 392 gigabytes.
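
To put these totals in proportion, the short Python sketch below derives a few per-unit averages from the figures quoted above. The constant and variable names are illustrative only, and the byte-based averages assume decimal gigabytes (10^9 bytes), which the report does not specify.

```python
# Figures quoted in the report for WAX content stored in the DRS
# by the end of July 2009.
ARC_FILES = 5_159
HARVESTS = 1_405
SEEDS = 141
RESOURCES = 12_133_528
TOTAL_GIGABYTES = 392

# Assumption: "gigabytes" means decimal gigabytes (10**9 bytes).
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_MEGABYTE = 10**6


def average(total, count):
    """Return the arithmetic mean of `total` spread over `count` units."""
    return total / count


total_bytes = TOTAL_GIGABYTES * BYTES_PER_GIGABYTE

harvests_per_seed = average(HARVESTS, SEEDS)
arc_files_per_harvest = average(ARC_FILES, HARVESTS)
resources_per_harvest = average(RESOURCES, HARVESTS)
bytes_per_resource = average(total_bytes, RESOURCES)
megabytes_per_arc_file = average(total_bytes, ARC_FILES) / BYTES_PER_MEGABYTE

if __name__ == "__main__":
    print(f"Harvests per seed:      {harvests_per_seed:.1f}")
    print(f"ARC files per harvest:  {arc_files_per_harvest:.1f}")
    print(f"Resources per harvest:  {resources_per_harvest:,.0f}")
    print(f"Bytes per resource:     {bytes_per_resource:,.0f}")
    print(f"Megabytes per ARC file: {megabytes_per_arc_file:.0f}")
```

These are simple means over the whole pilot. Individual harvests and seeds may differ widely from them.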

WAX was built using several open-source tools developed by the Internet Archive and other International Internet Preservation Consortium (IIPC) members. These IIPC tools include the Heritrix web crawler; the Wayback index and rendering tool; and the NutchWAX index and search tool. WAX also uses Quartz open-source job-scheduling software from OpenSymphony.

To view the collections, visit http://wax.lib.harvard.edu.

For more information, visit http://hul.harvard.edu/ois/systems/wax.

E-mail Archiving Pilot

In January 2009, OIS began a two-year pilot project to explore the archiving and preservation of e-mail by analyzing the legal and policy issues involved in e-mail archiving; defining the basic workflows necessary for the archiving of e-mail; and defining and implementing a minimum technical infrastructure to support archiving by Harvard units.

The completed pilot, which is expected to provide hands-on experience to three partners, will result in actual archiving of e-mails from each of their collections. The three partners are: Harvard University Archives (Harvard University Library), Schlesinger Library (Radcliffe Institute), and Countway Library of Medicine (Harvard Medical School).

DRS 2

OIS is in the midst of major enhancements to its digital preservation infrastructure. These enhancements, planned for rollout over the next several years, are collectively labeled “DRS 2.” A major component of this work is aimed at robustly supporting a wider variety of digital formats in the Digital Repository Service (DRS) using standard library metadata schemas for their description. At the end of FY 2009, OIS released enhanced DRS software to support two new formats as a first step in the DRS 2 series. DRS support for these formats had been requested by libraries across the University.

New Formats

  • Portable Document Format (PDF)
    DRS now supports the deposit, management, and delivery of PDF files. This support is enabled by enhancements to Batch Builder, DRS Loader, and DRS Web Admin, as well as the development of the new File Delivery Service (FDS).
  • Opaque Containers
    The DRS has a mandate to support an increasing variety of file formats. However, full preservation support for a new format involves substantial research into preservation and delivery requirements, as well as best practices for creation. In the meantime, librarians are in possession of digital files that must be stored safely until full preservation support is available. The solution, as enabled by enhanced DRS software released at the end of FY 2009, is to store these materials in zipped files, which are in turn deposited as “opaque containers” in the DRS. These files are “opaque” in the sense that the DRS does not fully characterize the individual files within the container. At the same time, DRS storage and monitoring services are applied to the container, protecting the content of the files from damage. The digital files can be safely stored and retrieved via the FDS or the DRS Web Admin. In the future, the container may be expanded and migrated to directly supported objects.
  • New File Delivery Service (FDS)
    The FDS is a new DRS delivery service for the newly supported PDF files and opaque containers (ZIP files), as well as XML files, SGML files, and ICC color profiles that are deposited to the DRS and enabled for public delivery.
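
The opaque-container idea described above can be illustrated with a minimal Python sketch. It is not the DRS implementation, which this report does not describe. It assumes only that a container is an ordinary ZIP file and that "monitoring" can be approximated by recording a SHA-256 checksum at deposit time and recomputing it later. The function names (`build_opaque_container`, `fixity_digest`, `is_intact`) are hypothetical.

```python
import hashlib
import zipfile
from pathlib import Path


def build_opaque_container(source_files, container_path):
    """Zip the given files into one container without inspecting their formats.

    Each file is stored under its base name, so this sketch assumes the
    source files have distinct names.
    """
    container_path = Path(container_path)
    with zipfile.ZipFile(container_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_path in map(Path, source_files):
            archive.write(file_path, arcname=file_path.name)
    return container_path


def fixity_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, recorded_digest):
    """Check a stored container against the digest recorded at deposit time."""
    return fixity_digest(path) == recorded_digest
```

In this sketch the repository treats the container as a single byte stream: it can detect damage to the ZIP file as a whole, but it learns nothing about the formats of the files inside, which is the sense in which the container is "opaque."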

Verde

OIS implemented the Verde electronic resource management system on July 1, 2009, for the management of centrally purchased and licensed Harvard University Library electronic resources. By replacing the locally developed ERM system with Verde, OIS increased the efficiency of acquisitions and financial workflows for electronic resources, and now offers librarians increased access to licensing and cost-sharing information. A second phase of the project will extend Verde functionality to individual units for management of locally acquired electronic resources.

Scan and Deliver

A new electronic document delivery service, Scan and Deliver, enables patrons to obtain scans of book chapters or journal articles from participating Harvard libraries. Patrons can now place requests for eligible items through HOLLIS and HOLLIS Classic. Scan and Deliver links appear next to each eligible item on full-record displays. Requests are received and fulfilled by library staff using the ILLiad interlibrary loan system. 

New Collections Based on the OIS Virtual Collections Service

New Collections Based on the OIS TEmplated Database Service (TED)

Communication and Outreach

OIS keeps the Harvard library community informed about access to resources, infrastructure development, digital library projects, and related activities through articles and announcements in Harvard University Library Notes, presentations throughout the University, and the Office for Information Systems web site.

In FY 2009, OIS expanded its web site with a new Digital Preservation section. The new section contains information about HUL's digital preservation program and projects, as well as practical guidance on preserving digital content.

In August 2008, OIS initiated a monthly HULINFO digest to highlight the system and service enhancements released by OIS in the previous month.

A Selection of Web-Accessible Collections continues to provide easy access to many of Harvard’s subject-specific digital collections.