New York Times Publishing More Subject Headings

The New York Times has published a bunch more subject headings to the Linked Data Cloud. I wrote about this last November when the NYT released 5,000 person/place/organization names as subject headings (or, as I noted then, you can think of them as tags). These new subject headings are what the NYT Open blog describes as “subject descriptors”: keywords related to article content instead of proper nouns. This release includes 498 of the most commonly used subject headings, which, like the names, are mapped to DBpedia and/or Freebase. The NYT hopes to eventually release all 3,500 of its subject descriptors.

You can browse through all the available subject headings (the descriptors and the proper nouns) at http://data.nytimes.com/. Look for the alphabetical browsing links in the middle of the page. I looked through the Ds and the first one I found was DNA (Deoxyribonucleic Acid) at http://data.nytimes.com/26507891352660881440. This page shows the first and most recent use of the subject descriptor, Freebase and DBpedia links, and an associated article count. There’s also a “scope note” that explains exactly what the subject heading covers. In this case the scope is “Used for any coverage that focuses on D.N.A. — whether in research, forensic science, genetics, etc.”

For details about using these headings in an article search, visit the Search API documentation. There’s also an API request tool for experimenting with searches without having to use an API key or build queries in a URL; here’s an example search for the DNA subject heading.

New York Times Offers Most Popular API

Woo! It’s API Wednesday. A couple of days ago the New York Times announced its new Most Popular API; the documentation is available at http://developer.nytimes.com/docs/most_popular_api. The Most Popular API is for getting links and data associated with the most frequently e-mailed, shared, and viewed NYT blog posts and content.

The Most Popular API request uses a REST format with responses in JSON, XML, and the new-to-me serialized PHP (.sphp). Essentially, when building a query for this API you’re requesting three things: what you want (most e-mailed, most shared, or most viewed), which section (all sections or one or more sections), and the time period you want to include (1, 7, or 30 days). I really like the share-types option, though: you can limit the “most shared” results by the method used to share the items (one or more types, with your options being Digg, e-mail, Facebook, Mixx, MySpace, Permalink, TimesPeople, Twitter, or Yahoo Buzz). It’d be neat to slice up the data and see what people are sharing on Facebook vs. Twitter vs. MySpace.
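To make the three-part request a little more concrete, here’s a minimal Python sketch that assembles a Most Popular request URL from those pieces. The base URL, the order of the path segments, the lowercase spellings (mostshared, all-sections, facebook), the semicolon separator, and the api-key query parameter are all assumptions for illustration; check the documentation for the real format before relying on any of it.

```python
from urllib.parse import quote, urlencode

# Assumed base URL and path layout (not confirmed by the post).
BASE_URL = "http://api.nytimes.com/svc/mostpopular/v2"

# Assumed spellings for the three "what you want" choices.
RESOURCE_TYPES = {"mostemailed", "mostshared", "mostviewed"}
TIME_PERIODS = {1, 7, 30}                   # days, as listed in the post
RESPONSE_FORMATS = {"json", "xml", "sphp"}  # formats listed in the post


def build_most_popular_url(resource_type, time_period, api_key,
                           sections=("all-sections",), share_types=(),
                           response_format="json"):
    """Assemble a request URL: what you want, which sections, what period."""
    if resource_type not in RESOURCE_TYPES:
        raise ValueError(f"unknown resource type: {resource_type!r}")
    if time_period not in TIME_PERIODS:
        raise ValueError("time period must be 1, 7, or 30 days")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response format: {response_format!r}")
    if share_types and resource_type != "mostshared":
        raise ValueError("share types only apply to most-shared requests")

    # Multiple sections or share types are joined with ';' (an assumption).
    parts = [BASE_URL, resource_type, ";".join(quote(s) for s in sections)]
    if share_types:
        parts.append(";".join(quote(t) for t in share_types))
    parts.append(f"{time_period}.{response_format}")
    return "/".join(parts) + "?" + urlencode({"api-key": api_key})


# Most shared-to-Facebook items over the last seven days.
print(build_most_popular_url("mostshared", 7, "YOUR-KEY",
                             share_types=("facebook",)))
```

Validating the inputs up front keeps an unsupported time period or a share type on a non-shared request from ever turning into a confusing API error.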

You need a key to use the Most Popular API, but NYT Open has a prototyping tool where you can experiment with the different request parameters without having to build a request or get a key. Here’s my request for the most shared-to-Facebook items for the last seven days.

New York Times Releases Version 3 Of Times Newswire API

New York Times Open announced yesterday Version 3 of the Times Newswire API. If you’re using version 2, don’t worry; that will be supported until August 2010. The Newswire API site with documentation and changes is at http://developer.nytimes.com/docs/times_newswire_api.

There aren’t a huge number of changes here, but the new version does allow you to filter by sections and sources, and apparently integrates better with the Times Article Search API, though I haven’t tried that yet. The new parameter for sections is called section; you can either specify all or give specific section names, and the documentation provides a URL for getting a full list of available sections. The source parameter has only three options: you can specify that you want items coming only from the New York Times, only from the International Herald Tribune, or from both papers.
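As a rough illustration of how the two filters combine, here’s a small Python sketch that validates a source/section pair and turns it into a path fragment. The source labels (nyt, iht, all) and the source-then-section path order are hypothetical stand-ins; the documentation has the actual values.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Hypothetical labels for the three source options described in the post.
SOURCES = {"nyt", "iht", "all"}


@dataclass(frozen=True)
class NewswireFilter:
    """A source/section pair; both default to 'all' (no filtering)."""
    source: str = "all"
    section: str = "all"

    def path(self):
        """Return an assumed 'source/section' path fragment."""
        if self.source not in SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")
        # Percent-encode the section name so spaces and slashes are safe.
        return f"{self.source}/{quote(self.section, safe='')}"


# Only International Herald Tribune items from a section named "world".
print(NewswireFilter(source="iht", section="world").path())
```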

While there are examples with the new parameters, I did not see any applications designed to take advantage of them. However, there’s always the Times Developer Network Gallery, which shows various applications built on Times APIs. New apps include We Read, We Tweet and Nooblast.

New York Times Congress API: Version 3

Oh, New York Times, how you irritate me with your constant talk of a paywall. (I don’t care if you institute it; I’m just tired of hearing the endless coy reveal and discussion. Paywall or get off the pot.) But no matter how much the NYT gets on my nerves, I always forgive it after a visit to the Open Blog. Latest from NYT Open: version 3 of the Congress API, which was announced in late February. Version 2 of the API will be supported until June 2010.

There are several additions and changes in the new version. New responses for the API include a list of members leaving office, chamber schedule, votes by date, and member sponsorship comparison. As for changes to existing responses, vote responses now include bill information, member bio responses now include current party and state attributes (and some social media information if available), and bill details responses now include version information. You can get an overview of all the changes here. Full documentation is here.

So how can you put the new API to work? The Open folks have put together a sample app at http://nytcongress.appspot.com/ that compares voting records between a pair of senators (they’re random; refresh the page to see different pairs). You also get a list of bills that the pair has cosponsored, with links to additional legislation information. (You can get the code for this application at http://github.com/dwillis/NYT-Congress-API-Demo.) There’s also a forum available for discussion of new applications, but it’s not active at the moment.
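For a feel of what a voting-record comparison involves, here’s a minimal Python sketch that works out how often two members voted the same way. It is not the sample app’s code: the data shape (a dict mapping a roll-call ID to a position string) and the position labels are hypothetical, and the votes below are made up.

```python
def vote_agreement(votes_a, votes_b):
    """Return (shared, agreed, percent) for two members' voting records.

    Each record maps a roll-call identifier to a position string.
    Only roll calls where both members voted "Yes" or "No" are compared.
    """
    counted = {"Yes", "No"}
    shared = [rc for rc in votes_a
              if rc in votes_b
              and votes_a[rc] in counted
              and votes_b[rc] in counted]
    agreed = sum(1 for rc in shared if votes_a[rc] == votes_b[rc])
    percent = 100.0 * agreed / len(shared) if shared else 0.0
    return len(shared), agreed, percent


# Made-up records for two hypothetical senators.
senator_a = {"s1": "Yes", "s2": "No", "s3": "Yes", "s4": "Not Voting"}
senator_b = {"s1": "Yes", "s2": "Yes", "s3": "Yes", "s4": "No", "s5": "No"}

shared, agreed, percent = vote_agreement(senator_a, senator_b)
print(f"Agreed on {agreed} of {shared} shared votes ({percent:.0f}%)")
```

Skipping roll calls where either member did not vote keeps absences from counting as disagreements, which is one reasonable choice among several.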

More great stuff from the NYT Open blog. Maybe as the US government starts releasing more data sets the NYT will start integrating some of that information into its APIs?…

More Subject Headings for NYT Data

The New York Times’ Open Blog announced this morning the addition of about 5,000 new subject headings to the Linked Open Data repository at http://data.nytimes.com/. I covered the initial release of 5,000 subject headings last November. These new subjects include geographic identifiers, organizations, and publicly-traded companies.

These subject headings, like the first crop, have been mapped to DBpedia and Freebase. In the case where subject headings are geographic, they’ve also been mapped to GeoNames, which you can learn more about at http://www.geonames.org/. Aaaaannnd you developer types will be interested to know that the resources are being published in JSON along with RDF/XML and HTML.
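If you pull the mapped links out of one of these records, a quick way to see which outside datasets a heading points to is to group the links by host. Here’s a minimal Python sketch of that idea; the host-to-label table and the sample URIs are illustrative assumptions, not values taken from an actual NYT record.

```python
from urllib.parse import urlparse

# Assumed hosts for the three datasets named in the post.
KNOWN_HOSTS = {
    "dbpedia.org": "DBpedia",
    "rdf.freebase.com": "Freebase",
    "sws.geonames.org": "GeoNames",
}


def group_links(uris):
    """Group mapped URIs by the dataset their host belongs to."""
    grouped = {}
    for uri in uris:
        host = urlparse(uri).netloc.lower()
        label = KNOWN_HOSTS.get(host, "other")
        grouped.setdefault(label, []).append(uri)
    return grouped


# Illustrative placeholder links for a geographic heading.
links = [
    "http://dbpedia.org/resource/Example_Place",
    "http://rdf.freebase.com/ns/en.example_place",
    "http://sws.geonames.org/0000000/",
]
for dataset, uris in group_links(links).items():
    print(dataset, uris)
```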

To get a sense of what the New York Times has available, you can download data records for people, organizations, and locations at http://data.nytimes.com. You will have to agree to a Creative Commons license and you will have to wait a while — you’ll be getting an XML file but it’s still a pretty big download!