LOC + Twitter Archive. Where Do We Go Now?

I was actually out of town last Wednesday, when the news came out that the Library of Congress would be archiving the tweetosphere. I got a couple of text messages about it, and when I got home and checked my e-mail I saw that it was all over the news and that there was a lot of commentary on it.

My initial question (of course) was, “So is this a search engine?” Apparently it isn’t a search engine yet. The Twitter announcement reads “Only after a six-month delay can the Tweets be used for internal library use, for non-commercial research, public display by the library itself, and preservation” but doesn’t mention how that research will be carried out; meanwhile, the Library of Congress announcement doesn’t say much, nor does the LOC press release. So LOC-Twitter-Archive-As-Research-Utility seems to be somewhere off in the future, leaving me free to consider the Internet’s reactions and my own.

The responses I’ve been reading seem to be divided into two broad categories: “THIS IS SO COOL!” and “THIS IS A TOTAL WASTE OF MONEY!” I would tend more to the former than the latter, but it took me a while to decide that. And it brought me to a question.

(DISCLAIMER: I am not a librarian. I have a tremendous amount of respect for those who are, and I would never make a false claim to an MLS; I know they are earned. But I think a lot about libraries, and I’m thinking about ’em now.)

When libraries were established, information was available only in lumps. There were book lumps and magazine lumps and, later, there were audiotape lumps and microfilm lumps and other multimedia lumps. A librarian had a huge responsibility as a curator, to figure out which lumps should go in the library. After all, a building has room for only so many lumps.

Digital things are not lumps. Granted, you have to have space on a hard drive to store them, but more and more space is fitting into smaller and smaller packages. (And if you don’t believe me, check out this 128 GB Flash Drive.) So as things get more and more digital, a librarian loses space considerations as a motivator when making decisions of curation: what to keep and what to discard.

What replaces it? The question of relevance? (Who is to say what is relevant? And even if something is not currently relevant, is it historically relevant?) The question of need? (Another question I doubt one human could answer.) After thinking about it a while, I have my answer, which may be different from other answers, and which very well may change over time as the evolution of digital archiving forces us to ask this question over and over again.

My question is: what context does this provide?

What context does a digital archive provide? The answer may be “none,” if it’s an isolated forum, or it may be “nothing useful,” if it’s a spam-ridden Web archive. Or it may be “not enough,” if it’s a small, general collection that might be better combined with something else. In the case of Twitter, the answer to the question is “Lots. The billions of tweets in the archive are the spontaneous reaction to events from earthquakes to celebrity deaths to political elections to television shows.” No, most of the reactions are not profound and some of them are silly and obscene. So what? We’re human. We’re thoughtful and angry and sad and funny and observant and excited and profane.

The next question, and for some purposes the more important question, is: how do we explore this context? If that question is answered insufficiently or with useless tools, the context will not help us with our research or understanding. Then, and only then, will the Twitter archive, in my opinion, be a waste.

In a way we’re kind of lucky, as we’re witnessing the evolution of large-scale digital archiving. A hundred years from now, our descendants can look at historical events with human reactions to the current tune of 55+ million tweets a day. But with this evolution come questions, and experiments, and efforts that don’t go where you expect. That doesn’t mean you don’t make the effort, and it absolutely doesn’t mean you stop asking the questions.

