The New York Times has published a bunch more subject headings to the Linked Data Cloud. I wrote about this last November when the NYT released 5,000 person/place/organization names as subject headings (or, as I noted then, you can think of them as tags.) These new subject headings are what the NYT Open blog describes as “subject descriptors” — keywords related to article content rather than proper nouns. This release includes 498 of the most commonly used subject headings, which, like the names, are mapped to DBPedia and/or Freebase. The NYT hopes to eventually release all 3,500 of its subject descriptors.
You can browse through all the available subject headings (the descriptors and the proper nouns) at http://data.nytimes.com/. Look for the alphabetical browsing links in the middle of the page. I looked through the Ds and the first one I found was DNA (Deoxyribonucleic Acid) at http://data.nytimes.com/26507891352660881440. This page of information shows the first and most recent use of the subject descriptor, Freebase and DBPedia links, and an associated article count. There’s also a “scope note” that explains exactly what the subject heading covers. In this case the scope is “Used for any coverage that focuses on D.N.A. — whether in research, forensic science, genetics, etc.”
For details about using these headings in an article search, visit the Search API documentation. There’s also an API request tool for experimenting with searches without having to use an API key or build queries in a URL; here’s an example search for the DNA subject heading.
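If you’d rather build the query yourself than use the request tool, here’s a minimal sketch of what an Article Search request filtered by subject descriptor might look like. The `des_facet` field name and the v1 endpoint path are my assumptions based on the Search API documentation linked above, and `YOUR-API-KEY` is a placeholder — check the docs for the current details.

```python
from urllib.parse import urlencode

# Assumed v1 Article Search endpoint; see the Search API docs.
BASE = "http://api.nytimes.com/svc/search/v1/article"

def subject_search_url(descriptor, api_key):
    """Build a search URL restricted to one subject descriptor.

    "des_facet" is the assumed facet name for subject descriptors.
    """
    params = urlencode({
        "query": "des_facet:[%s]" % descriptor,
        "api-key": api_key,
    })
    return "%s?%s" % (BASE, params)

url = subject_search_url("DNA (Deoxyribonucleic Acid)", "YOUR-API-KEY")
print(url)
```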
Woo! It’s API Wednesday. The New York Times announced a couple days ago its new API, the Most Popular API, the documentation for which is available at http://developer.nytimes.com/docs/most_popular_api. The Most Popular API is for getting links and data associated with the most-frequently e-mailed, shared, and viewed NYT blog posts and content.
The Most Popular API uses a REST request format with responses in JSON, XML, and the new-to-me serialized PHP (.sphp). Essentially, when building a query for this API you’re requesting three things: what you want (most e-mailed, most shared, or most viewed), which section (all sections or one or more sections), and the time period you want to include (either 1, 7, or 30 days.) I really like the share-types option, though: you can limit the “most shared” results to the method used to share the items (one or more types, with your options being Digg, e-mail, Facebook, Mixx, MySpace, Permalink, TimesPeople, Twitter, or Yahoo Buzz.) It’d be neat to slice up data and see what people are sharing on Facebook vs. Twitter vs. MySpace.
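To make the three-part structure concrete, here’s a sketch of how one of these request URLs might be assembled. The path layout (resource type, then section, then time period, with share types as an extra segment for “most shared”) is my reading of the documentation linked above — double-check the docs, as the exact segment order for share types is an assumption — and `YOUR-API-KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed v2 base path from the Most Popular API documentation.
BASE = "http://api.nytimes.com/svc/mostpopular/v2"

def most_popular_url(resource, section, days, api_key, share_types=None):
    """Build a Most Popular request URL.

    resource    -- "mostemailed", "mostshared", or "mostviewed"
    section     -- "all-sections" or semicolon-separated section names
    days        -- 1, 7, or 30
    share_types -- optional list like ["facebook", "twitter"];
                   only meaningful with "mostshared"
    """
    path = "%s/%s" % (resource, section)
    if share_types:
        path += "/" + ";".join(share_types)
    path += "/%d" % days
    return "%s/%s.json?%s" % (BASE, path, urlencode({"api-key": api_key}))

url = most_popular_url("mostshared", "all-sections", 7, "YOUR-API-KEY",
                       share_types=["facebook"])
print(url)
```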
You need a key to use the Most Popular API, but NYT Open has a prototyping tool where you can experiment with the different request parameters without having to build a request or get a key. Here’s my request for the most shared-to-Facebook items for the last seven days.
Google announced a ton of new APIs yesterday at the I/O conference, and to help me keep up with them I’m summarizing five of the announcements here, along with pointers to potentially useful articles and other resources.
The Google Buzz API
The official announcement: “This initial iteration of the API includes support for fetching public per-user activity feeds, fetching authorized and authenticated per-user activity feeds (both what the user creates, and what they see), searching over public updates (by keyword, by author, and by location), posting new updates (including text, html, images, and more), posting comments, liking updates, retrieving and updating profiles and social graphs, and more.”
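For a feel of what “fetching public per-user activity feeds” looks like in practice, here’s a sketch of the feed URL. The `activities/{userId}/@public` pattern is my assumption from the Buzz API documentation of the time, and `googlebuzz` is just an example user ID.

```python
# Assumed Buzz API base path; see the official Buzz API docs.
BASE = "https://www.googleapis.com/buzz/v1"

def public_activities_url(user_id, fmt="json"):
    """Build the URL for a user's public activity feed (assumed pattern)."""
    return "%s/activities/%s/@public?alt=%s" % (BASE, user_id, fmt)

url = public_activities_url("googlebuzz")
print(url)
```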
BigQuery and Prediction API
The official announcement: “BigQuery enables fast, interactive analysis over datasets containing trillions of records. Using SQL commands via a RESTful API, you can quickly explore and understand your massive historical data….Prediction API exposes Google’s advanced machine learning algorithms as a RESTful web service to make your apps more intelligent. The service helps you use historical data to make real-time decisions such as recommending products, assessing user sentiment from blogs and tweets, routing messages or assessing suspicious activities.”
(These are in limited preview release)
Google Latitude API
The official announcement: “You could, for example, build apps or features for: Thermostats that turn on and off automatically when you’re driving towards or away from home. Traffic that send alerts if there’s heavy traffic ahead of you or on a route you usually take based on your location history. Your credit card accounts to alert you of potential fraud when a purchase is made far from where you actually are. Photo albums so your vacation photos appear on a map at all the places you visited based on your location history.”
Google Font API
The official announcement: “The Google Font API provides a simple, cross-browser method for using any font in the Google Font Directory on your web page. The fonts have all the advantages of normal text: in addition to being richer visually, text styled in web fonts is still searchable, scales crisply when zoomed, and is accessible to users using screen readers.”
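In practice, using the Font API means adding a stylesheet link whose `family` parameter names a font from the directory. Here’s a small sketch that builds such a link tag; the `css?family=` request pattern follows the Font API documentation, and “Tangerine” is just an example font from the directory.

```python
from urllib.parse import quote_plus

def font_link_tag(family):
    """Build the <link> tag for a Google Font Directory font.

    Spaces in the family name are encoded as "+", per the
    css?family= request pattern in the Font API docs.
    """
    href = "http://fonts.googleapis.com/css?family=%s" % quote_plus(family)
    return '<link rel="stylesheet" type="text/css" href="%s">' % href

tag = font_link_tag("Tangerine")
print(tag)
```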
Google Storage for Developers
The official announcement: “Using this RESTful API, developers can easily connect their applications to fast, reliable storage replicated across several US data centers. The service offers multiple authentication methods, SSL support and convenient access controls for sharing with individuals and groups.”
(This API is in limited preview)
New York Times Open announced yesterday Version 3 of the Times Newswire API. If you’re using version 2, don’t worry; that will be supported until August 2010. The Newswire API site with documentation and changes is at http://developer.nytimes.com/docs/times_newswire_api.
There aren’t a huge number of changes here, but the new version does allow you to filter by section and source, and apparently integrates better with the Times Article Search API, though I haven’t tried that yet. The new section parameter is called section; you can either specify all or give specific section names, and the documentation provides a URL for getting a full list of available sections. The source parameter has only three options: you can specify that you want items coming only from the New York Times, only from the International Herald Tribune, or from both papers.
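Here’s a sketch of what a v3 request with the new source and section parameters might look like. The `/svc/news/v3/content/{source}/{section}.json` path layout is my reading of the documentation linked above (verify it against the docs), and `YOUR-API-KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed v3 base path from the Times Newswire API documentation.
BASE = "http://api.nytimes.com/svc/news/v3/content"

def newswire_url(source, section, api_key):
    """Build a Newswire v3 request URL.

    source  -- "all", "nyt", or "iht" (NYT only, IHT only, or both)
    section -- "all" or a specific section name from the docs' list
    """
    return "%s/%s/%s.json?%s" % (
        BASE, source, section, urlencode({"api-key": api_key}))

url = newswire_url("iht", "all", "YOUR-API-KEY")
print(url)
```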
While there are examples with the new parameters, I did not see any applications designed to take advantage of them yet. However, there’s always the Times Developer Network Gallery, which shows various applications built on Times APIs. New apps include We Read, We Tweet and Nooblast.
Wow! Yahoo announced yesterday its new Yahoo! Updates Firehose service in initial release. This thing sounds massive: “Yahoo! Updates aggregates social updates from Yahoo! and across the Web: It includes a real-time feed of every public action taken on our network … and elsewhere around the Web that users have authorized Yahoo! to make available.”
That includes over 750,000 ratings a day, 8,000 reviews a day, Flickr Uploads, Delicious bookmarks, favorited items on YouTube, etc. No kidding it’s a firehose!
The actual page for the new service is at http://developer.yahoo.com/social/updates/firehose.html, but if you just want to play with the queries you can by visiting http://developer.yahoo.com/yql/console/?q=select%20*%20from%20social.updates.search;. I had to be logged in to a Yahoo account before I could test things.
The service page provides plenty of examples for tracking all these real-time updates, including querying for words, searching for specific links, and even searching for activity by a user or a group of users (that particular example weirded me out a bit; be sure to check your privacy settings.) You can also pull out specific result fields; test results are shown in XML or JSON, and you get what a resulting REST query would look like.
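To show the shape of these queries, here’s a sketch of a keyword search against the firehose via YQL’s REST endpoint. The `social.updates.search` table name comes from the console link above; the public endpoint URL and the `query` field are my assumptions from the YQL documentation, and note that, as mentioned, this table may require you to be logged in to a Yahoo account.

```python
from urllib.parse import urlencode

# YQL's public REST endpoint (assumed; see the YQL docs).
YQL = "http://query.yahooapis.com/v1/public/yql"

def firehose_query_url(keyword, fmt="json"):
    """Build a YQL REST URL searching firehose updates for a keyword."""
    q = 'select * from social.updates.search where query="%s"' % keyword
    return "%s?%s" % (YQL, urlencode({"q": q, "format": fmt}))

url = firehose_query_url("earthquake")
print(url)
```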
In addition to the YQL Console Yahoo also has some excellent documentation for YQL, making it easy to imagine the possibilities of this huge flow of information. Not surprising at all that this is by the same company that came out with Yahoo Pipes — it might be just as addictive! I’m looking forward to exploring it more.
Oh, New York Times, how you irritate me with your constant talk of a paywall (I don’t care if you institute it; I’m just tired of hearing the endless coy reveal and discussion. Paywall or get off the pot.) But no matter how much the NYT gets on my nerves, I always forgive it after a visit to the Open Blog. Latest from NYT Open: version 3 of the Congress API, which was announced in late February. Version 2 of the API will be supported until June 2010.
There are several new additions and changes to the new version. New responses for the API include a list of members leaving office, chamber schedule, votes by date, and member sponsorship comparison. Changes to responses include vote responses (which now include bill information), member bio responses now include current party and state attributes (and some social media information if available), and bill details responses now include version information. You can get an overview of all the changes here. Full documentation is here.
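As an illustration, here’s a sketch of requesting one of the new response types — members leaving office. The `/svc/politics/v3/us/legislative/congress/...` path layout is my assumption from the documentation linked above (check the docs for the actual URI), and `YOUR-API-KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed Congress API v3 base path; see the full documentation.
BASE = "http://api.nytimes.com/svc/politics/v3/us/legislative/congress"

def members_leaving_url(congress, chamber, api_key):
    """Build a URL for the new members-leaving-office response.

    congress -- congress number, e.g. 111
    chamber  -- "house" or "senate"
    """
    return "%s/%d/%s/members/leaving.json?%s" % (
        BASE, congress, chamber, urlencode({"api-key": api_key}))

url = members_leaving_url(111, "senate", "YOUR-API-KEY")
print(url)
```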
So how can you put the new API to work? The Open folks have put together a sample app at http://nytcongress.appspot.com/ that compares voting records between a pair of senators (they’re random; refresh the page to see different pairs.) You also get a list of bills that the pair has cosponsored with links to additional legislation information. (You can get the code for this application at http://github.com/dwillis/NYT-Congress-API-Demo.) There’s also a forum available for discussion of new applications but it’s not active at the moment.
More great stuff from the NYT Open blog. Maybe as the US government starts releasing more data sets the NYT will start integrating some of that information into its APIs?…
WatchMouse last week launched a new site called API Status, available at http://api-status.com/. This Web site is very simple; it just monitors the status of APIs.
The site monitors the performance of 26 APIs including Facebook, Twitter, YouTube, Amazon, and PayPal. When you visit the site you’ll get a listing of the APIs with their current status, as well as a table showing the history of the API performance of the last seven days. Three icons denote whether the API is running okay, having issues, or down completely.
If you want to dig a little deeper, you can click on an API name. There you’ll get a bunch of details about the API’s performance, including a graph of its availability over the last seven days and how it’s performing globally.
This site may not be much use to you in your daily research, but if you’ve ever been in a situation when you’re trying to import XML into a Google spreadsheet and you can’t get it to work properly and you’re about to START THROWING THINGS — ahem, not that this has ever happened to me of course — it’s good to have a place you can quickly check an API’s status and, better yet, see how it’s been doing for the last 24 hours (because maybe you should be working on something else!)