How Does Project Gutenberg Work?

Converting Text From Print to Digital

© Jeff Miller

May 5, 2009
Project Gutenberg Website, Ed Willard
Ever wondered how Project Gutenberg converts books from print to digital? This article will take you step by step through the process.

Project Gutenberg is an online catalog of free downloadable books. The collection consists of books that are in the public domain, meaning their copyrights have expired and their text is free for anyone to use.

History

Project Gutenberg, founded in 1971 by Michael Hart, is the oldest digital library on the Internet. Hart was a student at the University of Illinois, where he was given access to the Xerox Sigma V mainframe and encouraged to do what he wanted with it. He chose to type in the text of the Declaration of Independence, which became the first document in the large collection that makes up Project Gutenberg today.

Distributed Proofreaders

Distributed Proofreaders is an organization that grew out of Project Gutenberg. Its sole purpose is to provide downloadable ebooks for Project Gutenberg by converting books from print to digital so they can be added to the catalog.

The Process

The following steps summarize how Project Gutenberg acquires the majority of its ebooks.

  1. An individual finds a book to submit to Project Gutenberg and verifies that the book isn't already on Project Gutenberg or on the in-progress list kept with Distributed Proofreaders. This list is called David's In Progress list.
  2. The individual verifies that the book is in the public domain using the eight rules on the Project Gutenberg website and any other tools at their disposal.
  3. The individual obtains a Clearance Line from Project Gutenberg to upload the book.
  4. After clearing the book with Project Gutenberg, the individual scans the book, runs OCR on the scans, and uploads the resulting files to Distributed Proofreaders.
  5. After Distributed Proofreaders receives the uploaded files, the documents are proofread two times by volunteers.
  6. After all of the book pages have been proofread, the files are sent to a Post Processor who combines all of the page files into a single file, does a final edit to deal with splits across pages, and checks for consistency in proofreading.
  7. A peer review is done on the completed file.
  8. The ebook becomes available on Project Gutenberg.
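
To make step 6 more concrete, here is a minimal Python sketch of what combining page files and dealing with splits across pages might look like. It is an illustration only, not the tooling Distributed Proofreaders actually uses: the function name join_pages and the rule "a page ending in a hyphen followed by a lowercase letter is a split word" are assumptions made for this example. A real Post Processor still has to check each join by hand, because some hyphens (as in "well-known") belong in the word.

```python
def join_pages(pages):
    """Combine proofread page texts into a single string.

    Hypothetical helper for illustration: if a page ends with a hyphen
    and the next page starts with a lowercase letter, the two halves are
    treated as one word that was split across the page break.
    """
    combined = ""
    for page in pages:
        text = page.strip()
        if not text:
            # Skip blank pages entirely.
            continue
        if combined.endswith("-") and text[0].islower():
            # Drop the trailing hyphen and glue the word halves together.
            combined = combined[:-1] + text
        elif combined:
            # Otherwise start the new page on its own line.
            combined += "\n" + text
        else:
            combined = text
    return combined


if __name__ == "__main__":
    sample_pages = [
        "It was the best of times, it was the worst of ti-",
        "mes, it was the age of wisdom.",
        "",
        "Chapter II",
    ]
    print(join_pages(sample_pages))
```

The sketch keeps each page on its own line unless a split word is detected, which leaves the decision about paragraph breaks to the person doing the final edit.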

Who Can Contribute Books?

Anyone who loves books and wants to help in the consolidation and digitization of classics can upload books to Project Gutenberg. Of course, it takes a certain level of technical expertise and an investment of time to scan and OCR the book. (OCR stands for Optical Character Recognition, a type of software that can ‘read’ a scanned text document and save it as ASCII text that can be edited on a computer.)

Other Ways Project Gutenberg Acquires Books

Aside from the acquisition process described above, Project Gutenberg also acquires books by buying them and performing the same steps itself. It gets these books from libraries, yard sales, book stores, and anywhere else they can be found at an affordable price. Books are also donated by individuals who want to contribute to the Project Gutenberg cause but don't have the resources to scan and upload their books.

Getting Involved with Project Gutenberg

If you would like to learn more about Project Gutenberg or get involved with its work, you can find more information on the Project Gutenberg website. The Distributed Proofreaders website has more details about what is involved in uploading books to Project Gutenberg.


The copyright of the article How Does Project Gutenberg Work? in Audiobooks/Ebooks is owned by Jeff Miller. Permission to republish How Does Project Gutenberg Work? in print or online must be granted by the author in writing.

