Three cheers to Anne F, who let me know about the new Chicago Foreign Language Press Survey from the Newberry Library. It’s available at
The Chicago Foreign Language Press Survey was actually published over 70 years ago; the Newberry Library has brought it into the 21st century. Here’s how the site describes it: “The Chicago Foreign Language Press Survey was published in 1942 by the Chicago Public Library Omnibus Project of the Works Projects Administration of Illinois. The purpose of the project was to translate and classify selected news articles that appeared in the foreign language press from 1855 to 1938. The project consists of 120,000 typewritten pages translated from newspapers of 22 different foreign language communities of Chicago.”
There are over 48,000 articles in the collection. They can be searched by keyword, browsed by groups (groups include Albanian, Filipino, Lithuanian, Croatian, and Slovak), browsed by year (1855-1940), browsed by “Codes” (this is a tree of subject headings, and a huge one), or browsed by source (there are over 400, from the 1933 World’s Fair Weekly to Zwei Jahrhunderte Chicago).
The subject matter spans a great deal, but there’s a lot to be found on the topics of immigration laws, assimilation, education, economics, and social mores. I found many interesting articles just searching for the names of figures of the time. A Russian newspaper wrote a very kind eulogy to Will Rogers in 1935, while in a Lithuanian newspaper I found a reference to a letter from Upton Sinclair (though, sadly, not the letter itself.)
I did a search for computer and got 45 results, mostly because the search engine was matching on things like compute. Attempts to alleviate this by searching for “computer” and +computer didn’t work; in fact, they made the results a lot worse. So be sure to use very precise or, ideally, multiple keywords when you search this resource.
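If you end up copying result text out of the site, one way around the loose matching is to filter for the exact word yourself. Here is a minimal sketch of that idea. It assumes you already have the text as plain strings; the sample snippets and the `exact_word_filter` name are made up for illustration, and nothing in it talks to the Newberry site.

```python
import re

def exact_word_filter(texts, word):
    """Keep only the texts that contain `word` as a whole word (case-insensitive)."""
    # \b marks a word boundary, so "compute" and "computed" do not match "computer".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [t for t in texts if pattern.search(t)]

# Hypothetical snippets standing in for search results.
snippets = [
    "The clerk will compute the totals by hand.",
    "A computer of tables was hired by the bureau.",
    "Computed interest was added to the account.",
]

matches = exact_word_filter(snippets, "computer")
print(matches)
```

This only narrows what you have already retrieved, so it is a complement to careful keyword choice rather than a replacement for it.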
That aside, I love the elegance of the results page. A permanent link to the search results is available at the top of the page. After that there are summaries of matching articles along with information about the original language, source, and date. Click on a summary for the full article, and, beneath the full article, images of the cards from which the article came. Clicking on the headline of the article took me to a direct link to the article with a little additional information, including the article and its information in raw XML.
Though the articles were translations, I did not find them awkward or difficult to read. I did find myself at times interested in a particular source, but didn’t find any additional information at Newberry. Going to the LOC’s historical US Newspaper Directory got me more data about titles. One time it didn’t have the title I was looking for (Cesky Odd Fellow), but it did have a similar title (Cesky republikan) which was also in Chicago.
With the wide matching that the keyword search does, you might have to do some experimental searching before you get the best results, but even a casual browse here turned up fascinating historical material.
Do you ever use Newswise? It’s kind of a press wire for universities, research institutions, non-profit organizations, and other groups that release what Newswise calls “knowledge-based news.” You can browse the available releases at http://www.newswise.com. I get a daily roundup of the news on the site and always find plenty of interesting releases.
Last month Newswise announced that it is making available RSS feeds for each Newswise channel. There are dozens of channels divided up into five categories; you can get the full list at http://www.newswise.com/channels/.
Take for example the diet and nutrition channel, which at this writing has 343 stories going back to December 28, 2008. You can filter the stories you see by type, date range, or institution. And the RSS feed is available at http://www.newswise.com/legacy/feed/channels.php?channel=101.
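If you would rather process a channel feed yourself than read it in a feed reader, something like the following sketch would do it. It assumes the feed is ordinary RSS 2.0 with `item`, `title`, `link`, and `pubDate` elements, which I have not verified for Newswise. The sample XML is invented so the example runs without a network connection, and `parse_items` is a name made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample standing in for a downloaded channel feed (assumed RSS 2.0 layout).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example channel</title>
    <item>
      <title>First example release</title>
      <link>http://example.com/articles/1</link>
      <pubDate>Mon, 03 May 2010 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second example release</title>
      <link>http://example.com/articles/2</link>
      <pubDate>Tue, 04 May 2010 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def parse_items(feed_xml):
    """Return a list of dicts (title, link, published) for each <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items

items = parse_items(SAMPLE_FEED)
for entry in items:
    print(entry["published"], "|", entry["title"])
```

To use it on a real channel you would fetch the feed URL first (for example with `urllib.request.urlopen`) and pass the response body to `parse_items`.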
I will probably keep getting the daily digests of new releases because I am interested in everything, but if you have a particular research focus I think you’ll find these targeted RSS feeds very useful.
So what exactly does it do? Poligraft comes as a standalone Web site or as a bookmarklet. I’m going to do this writeup using the standalone Web site as it’s easier to show. When you visit a Web page or a news story that contains political content, you can run it through Poligraft. Poligraft will give you the story along with context in a sidebar — which lawmakers have been receiving political donations from whom, where aggregated donations from companies go, etc.
For example, take this article from The New York Times: “Education Department Deals Out Big Awards”. I can take that URL and copy and paste it at Poligraft. (I can also paste the contents of an article if I don’t have access to the URL.)
Poligraft reprints the article, but with an information bar on the left. In this case the information bar is showing where political donations from one individual went, and where aggregated donations from several institutions went: to Democrats or Republicans. The information presented in the bar is just a pie chart, which is a little misleading. You’ll note that all of Cornelia Grumman’s donations went to Democrats (well, her one $250 donation). Meanwhile Johns Hopkins University has well over a million dollars in aggregate donations listed for the last 21 years, but has the same kind of little pie chart.
Each chunk of data on the information bar has a page with more details. The Ohio State University page shows top politicians donated to, as well as money spent on lobbying and issues lobbied about. Many of the individual names in the report pages are clickable, leading you, if you wish, down a political wonk rabbit hole.
I myself am enough of a wonk to appreciate this as a tool, but not enough of a wonk to really know how to use it (I had to go through several political stories before I found one that provided a lot of information.) I think as we get closer to the midterm elections it’ll be more useful as there will be more topical stories and more quotes from all sorts of organizations. Sunlight Labs is promising to add more data sets over time, too; I look forward to seeing that.
YouTube posted something intriguing on its blog yesterday: the announcement of the YouTube News Feed. On a Web site noted for cute kittens, laughing babies, and people explaining historical events while inebriated, the idea of honest-to-goodness news being distributed sounds more radical than it probably should.
The News Feed will be distributed via CitizenTube; if you go there you’ll see a blog-style front page with embedded YouTube videos. At this writing the top news story is an explosion of a gas storage tank that took place in North Carolina over the weekend. Each post includes a bit of context and view counts (though the view counts seem really, really low.)
Other stories covered on the page today include “Explosion Injures 15 German Police Officers,” “Knoxville Police Altercation Caught on Video,” and “Congressman Scuffles with Student.” How is YouTube finding this news? Its blog post notes that it is working with the University of California at Berkeley’s Graduate School of Journalism, but it also encourages video uploaders to tweet pointers to their own news videos @citizentube.
There weren’t a huge number of videos on the site (just 12 for the month of June), and it’ll be interesting to see if YouTube goes for depth or breadth. I think if the site goes far afield and pulls in a wide range of videos covering a number of topics, it’ll be interesting to surf. But if it’s just a few videos a day highlighting stories that are already covered elsewhere in depth, why would I watch this instead of video highlights on a news network site?
Thanks to Schelly at Tracing the Tribe for the heads-up about Footnote.com and another of its free offers: this one making its historical newspaper collection free for the month of May.
Footnote’s historical papers are at http://go.footnote.com/newspapers/. The site claims four million pages. Before you start in with the keyword searching, though, explore the galleries on the front page, including vintage comics, news of the weird, and “outrageous ads.” As with the other content, you will need to be logged in (accounts are free) to explore the galleries. After you have amused yourself with Nancy building robots and the Post Toasties ad, you can browse (newspapers from 46 states are available) or do a keyword search. (You can also use the browse page to search within newspapers from a particular state if you like.)
My keyword search for circus found 88,156 results, with further refinements available, including newspaper, last name, place, and year. Confining myself to the Chicago Tribune still gave me about 15,000 results. Sometimes the search results gave me a snippet of context; other times I just learned that the OCR software had found the word circus. You have the option to exclude OCR-only results, but that’ll leave you with a much reduced number.
The papers are browsable page by page, which is horribly distracting because they have everything: ads, photographs, comics, etc. The pages were occasionally dark and smudgy, but always readable.
One thing I like about this collection (and which you’ll find different from a lot of other collections) is how recent some of the newspapers are. I did a search for computer and found a newspaper with a computer ad from 1989. There aren’t as many recent newspapers available, of course, but it’s a nice addition after so many collections that don’t go past 1930 or so.
You’ve got ’til the end of May to enjoy this collection from Footnote. Just be strong and don’t find yourself going through all the pages of a 1923 newspaper, gawping at the ads and totally forgetting what you were searching…
New York Times Open announced yesterday Version 3 of the Times Newswire API. If you’re using version 2, don’t worry; that will be supported until August 2010. The Newswire API site with documentation and changes is at http://developer.nytimes.com/docs/times_newswire_api.
There’s not a huge number of changes here, but the new version does allow you to filter by sections and sources, and apparently integrates better with the Times Article Search API, though I haven’t tried that yet. The new section parameter is called section; you can either specify all or you can give specific section names; the documentation provides a URL for getting a full list of available sections. The source parameter only has three options: you can specify that you want items coming only from the New York Times, only from the International Herald Tribune, or from both papers.
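To make the two filters concrete, here is a rough sketch of building a request URL with a source and a section. Treat the details as my assumptions rather than the documented interface: the base URL, the path layout (source, then section, then `.json`), the `api-key` query parameter, and the short source codes `all`, `nyt`, and `iht` should all be checked against the documentation linked above. `newswire_url` is a made-up helper name, and the code only builds a string; it does not call the API.

```python
from urllib.parse import quote, urlencode

# Assumed base URL and path layout; confirm against the Newswire API documentation.
BASE_URL = "http://api.nytimes.com/svc/news/v3/content"

# Assumed short codes for the three source options: both papers, NYT only, IHT only.
SOURCES = {"all", "nyt", "iht"}

def newswire_url(api_key, source="all", section="all"):
    """Build a (hypothetical) Newswire request URL filtered by source and section."""
    if source not in SOURCES:
        raise ValueError(f"source must be one of {sorted(SOURCES)}")
    # Percent-encode the section name in case it contains spaces or slashes.
    path = f"{BASE_URL}/{source}/{quote(section, safe='')}.json"
    return f"{path}?{urlencode({'api-key': api_key})}"

url = newswire_url("YOUR_API_KEY", source="nyt", section="technology")
print(url)
```

Leaving both arguments at `all` would ask for everything, which matches the unfiltered behavior described above.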
While there are examples with the new parameters, I did not see any applications designed to take advantage of the newly available parameters. However, there’s always the Times Developer Network Gallery, which shows various applications built on Times APIs. New apps include We Read, We Tweet and Nooblast.
Thanks to the Internet Search Engine Database for the pointer to Nachofoto, a Web site that aims to provide relevant results for trending and hot search terms with recent photos and images. It’s in beta at http://nachofoto.com/.
This is an image search engine, but the idea is not to do standard image searches; instead you want to search for things that were recently in the news or which are relevant to the news. For example, I did a search for volcano.
The results page brings back images from everywhere. In one case when I tried it I got a source, but none of the photos loaded; I only saw that glitch a couple of times. I got images from CNN, Reuters, Yahoo News, and some more local sources. In a couple of cases it wasn’t clear why I got an image and I had to visit the page for more details, but most of the pictures were just what you would expect: the volcano erupting, delayed flights, airport chaos, etc.
There was one big exception, though: when I went to retake the screen shot, I saw that a story titled “10 reasons ‘Iron Man 2’ is hotter than an Icelandic Volcano” had filled the search results with pictures of Robert Downey Jr. While I personally have absolutely no objection to looking at pictures of Robert Downey Jr., it wasn’t really relevant to the search.
In addition to the pictures themselves, Nachofoto also provided suggestions for other search terms I might want to try (usually more specific) along with a slider to determine how recent I wanted my returned photos to be (they could be anywhere from a day old to a year old — in the cases of breaking news it would be great if you could even specify how many hours old something should be.)
In some cases Nachofoto is not going to be useful, as when recently I was looking for good image examples of a 30-degree angle. (Don’t ask.) But for current and breaking news this looks terrific.