Information Trapping and Twitter

Ohai, I’m back.

Yeah, gone for a while. That darned meatspace. All kinds of stuff can happen and the next thing you know you haven’t put anything on your Web site in six months and people are e-mailing you asking if you’ve lost the keyboard.

But one of my 2009 resolutions was to get back here, since I like doing ResearchBuzz, I’m still crazy about search engines, and I missed y’all. There’s probably nobody left (who hangs around an empty RSS feed for months and months?), but if you’re still out there, I did miss you and I’m glad to be back.

I have been doing my Tech Talk thing over at WRAL, so I have been keeping up with my information trapping to a certain extent. But I had not yet delved into Twitter as a way to trap news and information about online search resources. I’ve been playing with it some this evening and wanted to share some conclusions. The Twitter search interface is available at http://search.twitter.com/; I’m Twittering at http://twitter.com/researchbuzz.

The basic Twitter search is simple keyword with the ability to use phrases, exclude words, etc. I tried a couple of sample searches and looked at the RSS feeds, and was struck first by how little overlap there is between Twitter and my more traditional news sources. There’s some, but far less than I expected. The second thing I noticed is that I think I’ll be excluding more words than I include. Between how quickly people can post and the apparent 15-item limit for Twitter RSS feeds, you really have to work to clamp down the flow and narrow down the kinds of search results you get.

The first thing I figured out is to always add -RT to your search, so you don’t get piles of retweets. You’ll still get a few, but it gets a lot of noise out of your feed.

The second thing I figured out is that I can take advantage of the patterns of Twittering ‘bots. Searching Twitter for “online library” gets a lot of results from one particular ‘bot, but they’re mostly formatted in the same way so they’re easy to remove.
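
If you pull the feed into a script, the leftover retweets and the same-shaped ‘bot posts can also be dropped after the fact. Here’s a rough sketch of that idea in Python. The ‘bot pattern is entirely made up for illustration (it assumes a ‘bot that posts “New title: something http://…”), and the function names are hypothetical, so swap in whatever shape you actually see in your results.

```python
import re

# Hypothetical pattern: assumes the bot's tweets look like
# "New title: <something> http://..." -- replace with the real shape you see.
BOT_PATTERN = re.compile(r"^New title: .+ https?://\S+$")

# Retweets that slip past -RT usually still contain "RT @user" somewhere.
RETWEET_PATTERN = re.compile(r"(^|\s)RT\s*@\w+", re.IGNORECASE)


def keep_tweet(text):
    """Return True if the tweet looks like neither a bot post nor a retweet."""
    text = text.strip()
    if BOT_PATTERN.match(text):
        return False
    if RETWEET_PATTERN.search(text):
        return False
    return True


def filter_tweets(tweets):
    """Keep only the tweets that pass keep_tweet, preserving their order."""
    return [t for t in tweets if keep_tweet(t)]
```

Feed it a list of tweet texts (however you got them out of the RSS feed) and it hands back only the ones worth reading.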

Third is that you can get a good idea of vocabulary even from one page of search results, and Twitter is tolerant of long queries. So if I want to get news about search engines but not necessarily SEO or rankings and placement, I’m going to have very little luck with “search engine”. I will however do much better with this:

“search engine” -rt -marketing -rankings -myths -optimisation -optimization -visibility -placement

… and even one page of those results only goes back about six hours. But do you see what I mean about excluding more words than I include?
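
Since these queries end up being mostly a list of exclusions, it can be handy to keep the list in a script and generate the query from it. This is a minimal sketch, not anything Twitter provides: `build_query` and `search_url` are hypothetical helper names, and the `/search?q=` path on the end of the search address is an assumption on my part, so check it against what your browser shows when you run a search.

```python
from urllib.parse import quote_plus


def build_query(phrase, exclude=(), drop_retweets=True):
    """Build a query: a quoted phrase followed by -word exclusions."""
    parts = ['"{}"'.format(phrase)]
    if drop_retweets:
        parts.append("-rt")
    for word in exclude:
        parts.append("-" + word)
    return " ".join(parts)


def search_url(query, base="http://search.twitter.com/search"):
    """URL-encode the query. The '/search?q=' path is an assumption."""
    return base + "?q=" + quote_plus(query)


noise = ["marketing", "rankings", "myths", "optimisation",
         "optimization", "visibility", "placement"]
query = build_query("search engine", exclude=noise)
print(query)
print(search_url(query))
```

When a new noise word turns up in the results, add it to the `noise` list and regenerate the query instead of retyping the whole thing.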

After I’d gotten a feel for what the keyword searches could do I went and took a look at the advanced search options, available at http://search.twitter.com/advanced. The geographic options are cool, though unfortunately not so useful in the kind of stuff I want to search for. On the other hand, the ability to limit tweets to only those which have links is very nice (see the checkbox down at the bottom). There is probably a way to take advantage of the emoticon search but I haven’t figured it out yet. You can also limit the tweets to those which ask questions.
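
The advanced form appears to work by adding operators to the plain keyword query, which means the same options can be tacked on in a script. A small sketch follows, assuming the links checkbox corresponds to `filter:links` and the questions option to a bare `?`. Those spellings are how I recall the operators list, so verify them there before relying on this. `add_filters` is a hypothetical helper name.

```python
def add_filters(query, links_only=False, questions_only=False):
    """Append advanced-search operators to an existing query string.

    Assumes 'filter:links' limits results to tweets with links and
    '?' limits them to tweets asking a question.
    """
    tokens = [query]
    if links_only:
        tokens.append("filter:links")
    if questions_only:
        tokens.append("?")
    return " ".join(tokens)


print(add_filters('"online library" -rt', links_only=True))
```

Limiting to links pairs well with the exclusion approach, since a tweet with a link is more likely to point at an actual resource.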

Searching Twitter is completely backwards from searching a full-text search engine, especially with an eye toward getting a usable and constant flow of information. On a regular search engine, you want to use as many search terms as possible to narrow down your results from a vast ocean of data. In Twitter, there’s still a vast ocean of data, but it’s divided into trillions of drops of water. It’s possible (and from what I’m seeing probably even a better idea) to narrow down with what you DON’T want, instead of trying to guess the right set of keywords from no more than 140 characters at a time.

I’ll be doing more experiments. In the meantime you can check out http://search.twitter.com/operators for a list of Twitter operators and special syntax.