Google Teaming Up With USPTO To Make Patent and Trademark Data Available

The US Patent and Trademark Office announced today that it was entering into a two-year, no-cost agreement with Google to make bulk electronic patent and trademark public data available. The USPTO provides the data, and Google hosts it for the public.

By the USPTO’s estimate this is going to be about ten terabytes of data, including patent grants and applications, trademark applications, and patent and trademark assignments, with more data (like trademark file histories) to come in the future. Google’s hosting this data right now; you can see it at http://www.google.com/googlebooks/uspto.html.

Google divides these hosted files into two pages: patents and trademarks. Google notes that it is only hosting the data provided by the USPTO; it isn’t altering it in any way. And this is BULK hosting. The patent data download page lets you choose between several different types of information (Grant images, Grant full text, Grant bibliographic data, Published applications, Assignments, Maintenance fee events, USPTO Red Book and Classification information), but when you pick one of them you’ll get a list of years and a list of files. I downloaded a patent assignment file and looked at it. It was a small XML file.

I looked at the trademark data page, which consists of Grants & applications, 1870-2008, Recent applications, Recent assignments, and Trademark Trial and Appeal Board decisions. Recent Assignments was another huge set of zip files for the last three years. I downloaded one at random and opened it. It’s an XML file with information about the schema at the top and a bewildering array of text below, using that schema, that was meant for text parsers or bots, not people.
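If you'd rather let a parser do the reading, here's a minimal Python sketch for peeking inside one of these downloads. It opens a zip file, streams every XML file inside it, and counts how often each element name shows up, which gives a rough outline of the schema without scrolling through the text by eye. The function name `tally_tags` is my own invention, and I'm assuming the XML parses cleanly with Python's standard library; a file that leans on entities defined in an external DTD may need extra handling.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def tally_tags(zip_source):
    """Count element tag names across every .xml member of a zip archive.

    zip_source may be a path to a downloaded zip or a file-like object.
    (Hypothetical helper for illustration, not part of any USPTO or Google tool.)
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Skip anything in the archive that isn't an XML file.
            if not name.lower().endswith(".xml"):
                continue
            with archive.open(name) as member:
                # iterparse streams the document, so a big XML file
                # doesn't have to be loaded into memory in one piece.
                for _event, elem in ET.iterparse(member, events=("end",)):
                    counts[elem.tag] += 1
                    # Drop the element's contents once it has been counted.
                    elem.clear()
    return counts
```

Calling something like `tally_tags("downloaded-file.zip").most_common(10)` (the file name is a placeholder) lists the ten most frequent element names, which is a reasonable starting point before writing anything more specific against the schema.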

I’m kind of surprised that Google is making this available as ZIP files; maybe it wants you to download it to your own machines before you start slicing and dicing it. I don’t do a lot of trademark and patent research; all I know is that there’s a LOT of data here, and according to the press release there’s going to be even more.

Whaddaya think?
