The US Patent and Trademark Office announced today that it was entering into a two-year, no-cost agreement with Google to make bulk electronic patent and trademark public data available. The USPTO provides the data; Google hosts it for the public.
By the USPTO’s estimate this is going to be about ten terabytes of data, and includes patent grants and applications, trademark applications, and patent and trademark assignments, with more data (like trademark file histories) available in the future. Google’s hosting this data right now; you can see it at http://www.google.com/googlebooks/uspto.html.
Google divides these hosted files into two pages: patents and trademarks. Google notes that it is only hosting the data provided by the USPTO; it isn’t altering it or changing it in any way. And this is BULK hosting. The patent data download page lets you choose between several different types of information (Grant images, Grant full text, Grant bibliographic data, Published applications, Assignments, Maintenance fee events, USPTO Red Book and Classification information) but when you pick one of them you’ll get a list of years and a list of files. I downloaded a patent assignment file and looked at it. It was a small XML file.
I looked at the trademark data page, which consists of Grants & applications, 1870-2008, Recent applications, Recent assignments, and Trademark Trial and Appeal Board decisions. Recent Assignments was another huge set of zip files for the last three years. I downloaded one at random and opened it. It’s an XML file with information about the schema at the top and a bewildering array of text below, using that schema, that was meant for text parsers or bots, not people.
I’m kind of surprised that Google is making this available as ZIP files; maybe it wants you to download it to your own machines before you start slicing and dicing it. I don’t do a lot of trademark and patent research; all I know is that there’s a LOT of data here, and according to the press release there’s going to be even more.
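If you do download one of these ZIP files and want a quick feel for what is inside before you start slicing and dicing, something like the Python sketch below may help. It is only a rough sketch: it assumes each .xml member of the archive is a single well-formed XML document (real USPTO files may need more handling), and the element names in the tiny sample archive are made up for illustration, not taken from the USPTO schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(zip_source):
    """Count XML element tags in every .xml member of a ZIP archive.

    zip_source may be a file path or a binary file-like object.
    Assumes each .xml member is one well-formed XML document.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue
            with archive.open(name) as member:
                # iterparse streams the file, so a large XML file is not
                # loaded into memory all at once.
                for _event, elem in ET.iterparse(member, events=("end",)):
                    counts[elem.tag] += 1
                    elem.clear()  # drop the element's text and children
    return counts


def make_sample_zip():
    """Build a tiny stand-in archive in memory (made-up element names)."""
    xml = (
        b"<records>"
        b"<record><id>1</id></record>"
        b"<record><id>2</id></record>"
        b"</records>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("sample.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Swap make_sample_zip() for the path of a ZIP you downloaded.
    for tag, n in count_tags(make_sample_zip()).most_common():
        print(tag, n)
```

A tag count is a cheap first look: it shows which elements dominate the file without your having to read the schema at the top.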
Thanks and a gingerbread man to Slashfood for the pointer to a new database on food legislation from the Yale University Rudd Center for Food Policy and Obesity. Actually there are several resources at the Rudd Center, but let’s start with the database.
The database of US legislation related to food policy and obesity is available at http://www.yaleruddcenter.org/legislation/. Here you can search for legislation, get bill updates, get a list of bills that have been enacted into law, and get details on Congressional lobbying. The search is not a keyword search; instead, it’s a series of drop down menus that allow you to specify a particular state (or Federal legislation) and an issue. Once you’ve specified an issue, you’ll get a list of bills sorted by state. I searched for all legislation related to “Access to Healthy Food” and got around two dozen bills from Alaska to Washington.
Bills have their own pages which include status, summary, sponsor information, and links to the appropriate state’s legislation page. All the status updates I found were from February, March, and April.
If you’re interested in various food legislation issues, poke around the rest of the Rudd Center site. You’ll find a map of soft drink tax legislation (PDF format), podcasts, and policy briefs and reports.
Google was a busy little bee last week. Another of the things it announced was a new Government Requests tool, which shows information about requests for user data or content removal from government agencies worldwide. (This is for Google and YouTube.) This iteration shows data from July through December of 2009, and there are plans to add data in six-month increments. The tool is available at http://www.google.com/governmentrequests/.
The site is basically a map with two menus. You can see government requests for data, and you can see requests for content removal. Brazil tops the list both times, not what I would have expected. There were many, many more requests for data than there were for content removal.
Countries with some data associated with them are tagged with numbers. Click on a country’s number and you’ll get a window showing how many requests were made for data and content removal. You’ll also see what percentage of content removal requests were complied with, and what kind of removal requests they were. A lot of Brazil’s were court orders. I wonder why South Korea had so many AdWords removal requests?
I want more context. I know Brazil has a lot of requests, but how many Brazilian pages are in Google’s index (using site:br as the measuring stick)? About 193 million at this writing. So that’s about 1.5 content removal requests per million pages of indexed, country-code-specific content in Google’s index. Meanwhile, #2 Germany has about 626 million pages at this writing, which means, what, .30 content removal requests per million pages of indexed, country-code-specific content in Google’s index? Pardon me while I math out for a while…
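For anyone who wants to math out along with me, here is the per-million-pages arithmetic as a small Python sketch. The 626 million page count is the site: estimate for Germany mentioned above. The 188 removal requests is an assumption, back-figured from .30 × 626 ≈ 188, so plug in whatever the tool actually shows for the country you care about.

```python
def requests_per_million(requests, indexed_pages):
    """Return removal requests per million indexed pages."""
    if indexed_pages <= 0:
        raise ValueError("indexed_pages must be positive")
    return requests / (indexed_pages / 1_000_000)


# Germany: roughly 626 million pages from the site: estimate above.
# The request count of 188 is an assumption (see the note above).
germany = requests_per_million(188, 626_000_000)
print(f"Germany: {germany:.2f} per million pages")  # prints Germany: 0.30 per million pages
```

Both inputs are rough. A site: count is an estimate that covers only one country-code domain, so treat the result as a ballpark figure rather than a ranking.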
Thanks to Data Surfer for the pointer to the Traffic Safety Legislation Tracking Database, which tracks information on bills and chaptered laws in all fifty states and DC. It covers legislation from 2007 through 2010 and it’s available at http://www.ncsl.org/?TABID=13599. (The last update was April 6 so I guess updates to the site are ongoing.)
You can search by state, keyword, year, primary sponsor, topic, status, or bill number. I did a search for texting and got results for 34 bills in 14 states. Results are arranged alphabetically by state and each item of legislation (some of them enacted, many failed) includes status, date of last action, and summary. There’s also a history that you can expand for more details.
There were surprisingly few topics to browse through; you’ll have more luck doing keyword searching, especially if you’re interested in the intersection (um, no pun intended) between electronic communications and driving. I did get a couple of weird results on test searches; searching for Internet got me information on “Internet Violence Prevention” legislation, and searching for domestic found enacted legislation about texting and driving that was simply called “Criminal Law.”
The Harvard School of Public Health has released a new site called The Firearms Research Digest. The site has six years’ (2003-2008) worth of summarized research from social science, medical, criminology, and public health journals. It’s available at http://www.firearmsresearch.org/. The site will eventually be expanded to include research from 1988 to the present.
You can do a simple keyword search, or you can search by topic (a few dozen), year of research, or publication (there’s a huge list of publications available). I did a search for storage and got 47 results, which were divided into sublistings including topics, keyword, and title.
The results include a list of related topics. Each entry in the article list shows a title and expands into bibliographic information and a short summary of the article. Articles also get their own summary pages, which contain links for printing or e-mailing.
A full list of available research topics is at http://www.firearmsresearch.org/content.cfm/topics and there’s a short list of links at http://www.firearmsresearch.org/content.cfm/links. If you’d like to review some spotlighted research, you can do so at http://www.firearmsresearch.org/content.cfm/spotlight.
The Washington Watch Web site has launched a mapping site that takes and maps congressional earmarks. Wikipedia defines an earmark as “a congressional provision that directs approved funds to be spent on specific projects or that directs specific exemptions from taxes or mandated fees.” I thought the definition was more pejorative than that, but this is fairly neutral, an objective determination whatever you may think of the actual practice.
So anyway, Washington Watch is at http://www.washingtonwatch.com/bills/earmarks/ and at this writing has over 7,000 earmarks mapped. The front page has a map you can browse, or you can choose a state and/or representative whose earmarks you want to browse.
Choosing a state gets you a list of earmarks that includes a one-line summary of the earmark, the name of the recipient (company or institution), and the amount of money requested. The amount of money varied a lot — I saw requests for over five million bucks and I saw one for $4. (That’s right, $4. Got to be a typo, huh?) Click on a summary line and you’ll get a Wiki page of information about the earmark, including the contact information for the recipient company/institution and a more extensive summary. And because this is a Wiki, there’s a place to edit points in favor and points against the earmark. You may also leave a comment about the earmark. (Unfortunately most of the ones I looked at did not have comments.)
To the right of the information about the earmark are lists of things you can do about it. You can vote on it (yes/no), you can send out social media alerts on it, you can contact your representative about it, or you can send out an e-mail alert. An RSS feed lets you keep up with comments on a particular earmark.
The process of gathering up earmarks and information is still ongoing. Interested visitors are themselves invited to add earmarks to the database, using the form at http://www.washingtonwatch.com/earmarks/. There’s actually a contest going on; the top three enterers of earmarks will win, respectively, a Kindle, an iPod Shuffle, and, for some reason, a fruitcake.
There’s a lot of data here — and if enough people want to win a fruitcake there’s going to be even more — but you know what I’d really like to see? Badges. It would be nice if you could take a particular earmark and get a “This sucks!” or “This rocks!” badge, depending on your point of view. Actually some of these earmarks make me think I’d like to go a little further up the indignation scale…