Harvard University Library


Harvard-Google Project

FAQ: Creating Access to Harvard Library Books in the Public Domain

Google is digitizing a significant number of Harvard's library books that are not under copyright restriction and making them available to Internet users through Google Book Search.

How many works are included in the Harvard-Google project?

The project could eventually cover more than a million works that are out of copyright.

What do Google and Harvard each bring to the project?

Harvard is providing access to out-of-copyright library books from the University's holdings of over 15.8 million volumes. Because Harvard has been collecting books for nearly 400 years, its library holdings cover a broad range of subject areas and include unusual editions, neglected and forgotten works, and unique copies.

Google is bringing expertise in digitizing and its powerful and widely used search technology. Google is bearing the direct costs of the digitization.

How does the scanning of out-of-copyright works benefit Internet users?

The project will dramatically increase Internet access to the public-domain holdings of the Harvard University Library, which, as the largest academic library in the world, includes more than 15.8 million public-domain and copyrighted volumes in approximately 80 physical locations.

The project will enable readers to use keyword searching to locate works that are of interest to them and to read out-of-copyright books online.

Through Google Book Search, readers will be able to find local libraries and booksellers where these books may be available.

Eventually, readers will have the option to download and print PDF copies of these public-domain works.

How will it benefit Harvard students and faculty?

Currently, Harvard's library books are distributed across some 80 library locations. Once a Harvard library book is scanned and processed, students and faculty will be able to use keyword searching as a powerful new tool for finding the book, and then will be able to browse the book, its table of contents, or its index from any location. Given that Harvard now stores 5 million volumes off site, increasing the number of books that can be browsed without physical retrieval will improve the Harvard community's access to the collection.

Will digitized out-of-copyright books be accessible through the University's HOLLIS Catalog?

As the initiative progresses, digitized works from the Harvard collections will become integrated with HOLLIS, the Harvard Online Library Information System.

A user who begins a search in the HOLLIS catalog will eventually see an indication that a book has been scanned by Google and a link to the Google copy for searching and reading online.

A Harvard user who begins a search in Google from a Harvard IP address will see a button that returns the user to HOLLIS, and to the physical book in the Harvard collection.

Are any of Harvard's public-domain books beyond the scope of this initiative?

Oversized books, along with other large-format library materials; books in poor condition; University records; and certain other items stored at the Harvard Depository are beyond the scope of this project.

What determines if a book is physically suited to scanning?

Harvard provides Google with clear guidelines on which books are physically suited to the scanning process. Any book that is likely to be damaged by scanning will not be digitized.

What if a Harvard user needs a book during the scanning process?

If a book is requested while it is being scanned, it can be recalled in the same timeframe as a book recalled from a patron.

What privacy policies cover the Harvard-Google Project?

While Harvard and Google both treat their records of book use as confidential, the two organizations collect and retain different kinds of information and have different privacy policies.

The Harvard University Library is committed to protecting the privacy of Harvard library patrons. Our policies conform to the Code of Ethics of the American Library Association, and can be found at http://lib.harvard.edu/comments/privacy.html.

Google collects and stores various kinds of information about use of its services, including Google Book Search. Google's privacy policies can be found at http://www.google.com/intl/en/privacy.html. When anyone uses Google Book Search or other Google services, whether or not accessed via the Harvard system, Google collects information about that use, and Google's privacy policies apply. This information may be more extensive than the information Harvard collects about library patrons and their use of the Harvard libraries.

Is Google working on similar projects with other major research libraries?

Google is working on related projects with Oxford, Stanford, Princeton, the University of California, the University of Wisconsin–Madison, the University of Michigan, the University of Virginia, the University of Texas at Austin, the New York Public Library, the University Library of Lausanne, the Bavarian State Library, the University Complutense of Madrid, and the National Library of Catalonia along with four affiliate Catalonian libraries, and over time may work with other research libraries as well.

How does this initiative differ from current digitization projects at Harvard?

Unlike traditional digitizing projects—such as those carried out under the University's Library Digital Initiative or the Open Collections Program—which are based on careful book-by-book selections of the very best or most appropriate materials on a specific topic, Google's approach is simply to digitize as many books as possible.

At some point in the future, does Harvard envision allowing Google to scan any of its in-copyright books?

Because of the potential benefits to the educational community and the public at large, the Harvard Library hopes eventually to make its in-copyright books available for digitization and discovery online. However, that is a decision the University will make in the future based on circumstances at the time. For the time being, Harvard is fully occupied dealing with out-of-copyright books, which have the additional important advantage that digital access to the entire work can be provided.

Does Harvard see benefits in the inclusion of in-copyright works in Google Book Search?

Yes. Scholars, students, and readers of all kinds benefit significantly from being able to use keyword searching to find books they would like to study or read. Google Book Search makes it possible to discover works relevant to a given subject or line of inquiry that traditional finding aids, such as card catalogs, may not identify, thus expanding access to past ideas on which others can build. This is especially significant for books that are out of print or hard to find.

In the case of in-copyright works, Google Book Search only displays a few small snippets of text showing use of the search term, unless the copyright holder has authorized a broader display, and then points the user to booksellers and local libraries where a copy of the book may be obtained.

Discovery is likely to lead to greater use, and to the resurrection of intellectual contributions and creative works that have been forgotten or overlooked.

How long will the project take to complete?

The public-domain project is expected to take several years to complete.

Who will scan Harvard's library books?

Working closely with Harvard librarians, Google personnel will scan the Harvard books included in the new initiative.

How does this project relate to Google Book Search and Google Scholar?

The project is part of Google Book Search, which also assists publishers in making books and other offline information searchable online. Google is now working with libraries to digitally scan their collections, and over time will integrate this content into the Google index. Visit the Google Book Search web site for specific product details.

Google Scholar is a related program for journal articles. Visit the Google Scholar web site for specific information.

Who are the contacts for more information?

For further information, contact John Longbrake, senior communications director for Harvard University, or Peter Kosewski, director of publications and communications in the Harvard University Library.

