Million Book Project Explodes a Milestone

The Million Book Project, a university-led initiative to digitize books and make them available online, will have to change its name. The project has digitized over 1.5 million books, which the project says represents 1 percent of all the world’s books (that seems kind of high to me) and 20 of the world’s languages.

The books are now available at a single Web site, which must be flooded because unfortunately it’s not responding quickly at all. Be patient and it’ll load. The very green front page invites you to do a title search (advanced searching and browsing are also available) and warns you that you’ll need the DjVu or the TIFF plugin to view books.

I did a search for irrigation. I got 176 books returned, with over 20,000 pages. The books ranged from Budget Estimate Of Revenue And Expenditure Under Major Heads (from India) to The Economics Of Irrigation (from China). The book list is on the left, while the right side of the results page provides information on each book like date of publication, number of pages, table of contents, etc. Not all the information is available for all books, and not all books are completely available either. I saw a couple of “Book temporarily unavailable” messages, and at least one “15% limited access”.

(For which I am not going to overly ding them. Good grief, the project is to put a million books online, and the structure’s been put together for at least 1.5 million books. It’s an amazing amount of international cooperation and coordination. It would be like slamming the Wright brothers for not having honey-roasted peanuts at Kitty Hawk.)

Click on the title on the right side of the screen and you’ll get the book’s page in a new window. What you get depends on the site where you land. I looked at a couple of books that were apparently hosted in China and had a hard time paging through them. For one of them I got a 404 error. I had the best luck with the books hosted in Egypt. (Here’s an example.) In the case of this book I could look at it via DjVu or download a PDF file. I occasionally got a site timeout, which I suspect is because the site is still very busy.

You can do some advanced searching in addition to basic keyword searching at the main site. You can search by subject and language as well as hosting country and span of years. I wish that you could search by copyright status and percentage available, but I suppose you could get a certain amount of that done by searching for books copyrighted before 1923.

I am not going to run to this site the next time I need to do some research. I think this is more about possibilities — rapidly developing possibilities — than perfect implementation. And this is apparently not all of it — the project site has information on other projects including the Newspapers Digital Library and the Spoken Language Digital Library.
