The Library of Congress announced within the last couple of weeks a big upgrade to its Chronicling America Web site, a historic newspaper archive. The upgraded archive has an additional 380,000 pages, including newspapers from three new states (Louisiana, Montana, and South Carolina). The upgrade also extends the collection’s coverage further into the Civil War era.
The stats, for those of you playing along at home, now equal almost 2.7 million pages from 348 titles published between 1860 and 1922. The downside is that only 22 states and DC are available. (You can get a list of available newspapers here.)
The Library of Congress’ “Chronicling America” Web site, freely available at http://chroniclingamerica.loc.gov/, has added 287,000 newspaper images from 15 states and DC, which brings the total on the site to 1.7 million pages from 212 newspaper titles published between 1880 and 1922. I reviewed the site back in May so I’m not going to get deeply into it again, but I did want to mention a few things.
If you’re less interested in the text in the newspapers and more into the illustrations, don’t miss this set on Flickr called “Illustrated Newspaper Supplements”. 364 images ranging from people to places to events to an odd picture of a very small lady and a very large dog.
Chronicling America also has an API; you can get information on that at http://chroniclingamerica.loc.gov/about/api/. The syntax for the API seems pretty simple (and you don’t need an API key!). You can search for keywords, link to particular pages, etc.
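If you want to poke at the API from a script, here is a minimal Python sketch that builds a keyword search URL. The `/search/pages/results/` path and the `andtext` and `format` parameter names are my assumptions about the search syntax, and `build_keyword_search_url` is a made-up helper name, so check the API page before relying on any of it.

```python
from urllib.parse import urlencode

# Site root as given in the post.
BASE_URL = "http://chroniclingamerica.loc.gov"


def build_keyword_search_url(keywords):
    """Build a URL for a keyword search of the scanned newspaper pages.

    Assumptions (not confirmed by the post): the search lives at
    /search/pages/results/, 'andtext' carries the keywords, and
    'format=json' asks for machine-readable results.
    """
    params = {"andtext": keywords, "format": "json"}
    # urlencode escapes spaces and punctuation so the URL is safe to request.
    return f"{BASE_URL}/search/pages/results/?{urlencode(params)}"


if __name__ == "__main__":
    # No API key is needed, so the URL can be pasted straight into a browser.
    print(build_keyword_search_url("comet"))
```

The sketch only builds the URL and makes no network request, so you can inspect what it produces before fetching anything.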
If you want to get an idea of what’s available in the archives, but you can’t think of any good search terms, visit the topics list at http://www.loc.gov/rr/news/topics/topics.html. It’s pretty abbreviated (fewer than two dozen topics), but each one provides useful dates, suggested search terms, and a list of sample articles.
Heritage Microfilm has announced a partnership with the military news source Stars and Stripes which has led to an online digital archive for the Stars and Stripes newspaper. The new archive is available at http://starsandstripes.newspaperarchive.com.
At the moment, the archive has European and Pacific editions from 1948 to 1999. (Apparently at some points the Stars and Stripes has had almost three dozen different editions.) This is over one million pages of content. There are also plans to add more content, including the World War II era, Middle East edition, and additional date options for the European and Pacific editions.
Alas, the site is a paid archive. While you can initiate a keyword search with a really basic date range option (you can narrow your results by year) you can’t even see the list of results without a membership. Memberships range from yearly ($47.40) to a day pass for $4.95.
This one’s been sitting in my queue for a while; I’m glad I’ve finally got the time to review it. I’m not even sure how long it’s been around. But it’s really good. The Library of Congress has launched Chronicling America: Historic American Newspapers, in beta. The site’s free and available at http://chroniclingamerica.loc.gov/ .
The site currently has keyword-searchable, scanned newspaper pages from 1880 to 1910 in nine states and the District of Columbia. Why so limited? Because it’s still being added to, and more content will be put in over time. There’s also a huge directory of information on newspapers published in the US from 1690 until today. Let’s look at that first, then we’ll check out the scanned pages.
The front page has a link to the newspaper directory where you can browse by title, but skip that. Go right to the title search page. There you can do the GOOOOOD stuff: narrow down your search by state, county, or city; narrow it by span of time printed, find papers by ethnicity, frequency, or language, and of course search by keyword. And you’ll need to narrow down your search; just running a plain search for newspapers in New York found over 11000 results. *11000*. Yikes.
Results are listed alphabetically and there’s some data available; not enough, but some. Click on a paper name and you’ll get data like geographic coverage, dates of publication, language, frequency, publisher, etc. How much is available varies a lot; once I saw two papers with the same titles whose details varied slightly, but I didn’t get enough information on either one of them to tell them apart. This is really a jumpoff directory; find information on a paper here and use it to move on to searching richer sources.
The search of the newspaper pages, that’s completely different. It’s terrific! The way the search results are displayed is fantastic. But you’ll have to use the search page first: search by state or paper, by year or date range, and then use keywords which can include phrase or proximity search.
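The same search-page options (state, year range, phrase, proximity) can be sketched as a URL builder in Python. Every parameter name here (`state`, `dateFilterType`, `date1`, `date2`, `phrasetext`, `proxtext`, `proxdistance`, `format`) is my assumption about how the search form maps to a query string, and `build_page_search_url` is a hypothetical helper, so treat this as an illustration and not the site's documented interface.

```python
from urllib.parse import urlencode

# Assumed location of the page search; not confirmed by the post.
SEARCH_URL = "http://chroniclingamerica.loc.gov/search/pages/results/"


def build_page_search_url(state="", year_from=None, year_to=None,
                          phrase="", near_words="", distance=5):
    """Combine the search-page filters into one query string.

    All parameter names below are assumptions made for illustration.
    """
    params = {"format": "json"}
    if state:
        params["state"] = state
    if year_from is not None and year_to is not None:
        # Restrict results to a span of years.
        params["dateFilterType"] = "yearRange"
        params["date1"] = str(year_from)
        params["date2"] = str(year_to)
    if phrase:
        # Phrase search: the words must appear together, in order.
        params["phrasetext"] = phrase
    if near_words:
        # Proximity search: the words must appear within `distance` words.
        params["proxtext"] = near_words
        params["proxdistance"] = str(distance)
    return SEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_page_search_url(state="Virginia", year_from=1880,
                                year_to=1910, phrase="world series"))
```

Filters left at their defaults are simply omitted, which mirrors leaving a field blank on the search page.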
Your search results include thumbnails of full newspaper pages! That sounds incredibly unwieldy but the places where your keyword appears are highlighted. When you choose a result you’ll get your page enlarged, again with your keyword highlighted. (I love those highlights — love love love — but I wish they were something besides rose color. Maybe highlighter yellow or nuclear green. If you’re searching for a keyword and it appears only once on a page, you’ll occasionally find yourself in a game of “hunt the highlight.”) You can get the text of the page (though it appears to be machine OCR’d and it looks pretty bad), a PDF, or you can download an image. Best of all, you can use a feature called “Draw Zoom Box,” outline a part of the page you want to enlarge, and immediately you’ll go to that area of the page — with the keyword highlighting intact.
I was amazed at how smooth the zooming transition was. This is the most painless scanned-image newspaper searching I’ve done in a long time. In addition there are so many little extras — getting the pages in a variety of formats, several different levels of paper navigation even at the page-level viewing, and best of all, obvious permalinks to individual result pages. This project is going to be terrific. Between this and Wyoming’s project to digitize its newspapers, I may never read news from this century again. MORE PAPERS!
Congratulations to the Irish Times, which is celebrating its 150th anniversary! (I think that’s about 1 million in Internet years.) The newspaper announced yesterday that to celebrate it would be making access to its digital archive free until April 6th.
The digital archive goes all the way back to 1859 and is available at http://www.irishtimes.com/search/archive.html. Make sure you click on the 2nd tab, the one that reads “Digital Archive: Search the Irish Times Paper from 1859 to Present”. (There’s also a text archive but it only goes back to 1996.)
I did a search for “George Boole” and was surprised to get over 4700 results. The archive notes that quote marks work to keep keywords in a group, but I found in my search results lots of references to Saint George and none to the mathematician. It was only when I used the “Refine By” option in the search results to search for “George Boole” again that I got a much more reasonable 77 results.
Search results are listed oldest first and in this case the results started with the obituary of George Boole. Interestingly there are no other archive mentions of him again until 1956, and according to the trendline to the right of the search results, archive mentions seem to peak around 1994/1995.
Archive results show a snippet of the actual page image, not a text snippet. Click on the snippet and you’ll get an image of the newspaper page itself, with the section where your keywords appear highlighted. It might take a moment for the page image to load, but I found all the images, even back to 1859, pretty easy to read. You can click and drag on the page to move it around, or you can click on the thumbnail of the page image on the right side of the screen to move to a different part. Page sections seem to load article by article; this works fine for more recent versions of the newspaper (the last sixty or seventy years?) but for the 19th century editions, where pages are just rows and rows of columns, it can get a little confusing.
I spent a great deal of time browsing and reading the archives and was never asked for money or even to login or register. You’ve only got a week or so to enjoy this archive for free!