Don’t Be a Snob About Searching the Web – A Cautionary Tale

Sometimes when I try to teach someone about search, I can’t quite get them to see the point. Why should they learn to use search engines well? Why not go straight to Wikipedia, or IMDB, or some other reference compilation? The large reference sites would surely have the answers they seek. And if they don’t – then the information’s probably not online, right?

Wrong, wrong, wrong. SO wrong. In fact, I recently had an experience that wonderfully illustrates how wrong this idea is.

If you’re a longtime ResearchBuzz reader you know I’m a fan of Mystery Science Theater 3000. There are, in fact, a lot of folks out there doing the “riffing” thing, for example Incognito Cinema Warriors, Josh Way, and my personal favorite, Toast and Rice.

I was watching one of Toast and Rice’s shorts, a 1951 number called The Outsider, when I realized I was seeing the actress who played Susan Jane in a lot of shorts. There she was in The Snob. There she was in The Gossip. And she’s actually a decent actress, unlike some of the kids in the shorts (there’s a kid in The Outsider named Junior, and every time he says his line “Is everybody ready for the big feed?” I just cringe.) So, I wondered, who is this actress, anyway?

I did start with IMDB – it lists shorts as well as TV shows and feature length movies. When I looked up The Outsider, I found a little information, including the name of the actress – Vera Stough.

Unfortunately Vera Stough’s page on IMDB didn’t have a lot. A list of credits spanning 1951 (The Outsider) to 1978 (an episode of Eight is Enough).

But it had enough that I called bullspit. 27 years doing movies and television, and no biographical information? If there were just the shorts credits I would assume she left acting after high school/college. But even according to the IMDB she was working steadily between 1951 and 1978. I was missing something, and so was IMDB.

At this point my random curiosity about an actress in 1950s shorts was now a search problem. And to paraphrase Vanilla Ice, when it’s a search problem, yo I must solve it. So I started digging.

For my first search I used the actress’ name and the names of two of her shorts, hoping the Principle of Mass Similar would find me useful stuff.

“vera stough” “the snob” “the outsider”

Paydirt on the very first page! And what a source – the Franklin Hills Residents Association newsletter, The Overview. The Summer 2004 issue (that link is to a PDF) of this newsletter has a substantial article on the actress, who now goes by Brady Rubin.

Brady Rubin does have a more substantial IMDB page, including a picture, and one credit – The Snob – that’s shared with the Vera Stough page.

Once you have both her names, then getting an even fuller picture of the actress’ life (by searching for both names) is easy. A newspaper article from 1986 reflects on her theater work. Searching just for the name Brady Rubin finds a review of an Ibsen play she apparently directed this past March.

Now of course you want to cross-check, and of course you want to get as many sources for your information as possible, and if I was pursuing this diligently I’d follow up to make sure that there aren’t, for example, two Brady Rubins, one of whom used to be Vera Stough and one who directs Ibsen plays. But the information I got to crack this search open was not on IMDB (both Vera Stough and Brady Rubin are denoted as being in The Snob, but I can find no indication that they’re denoted as being the same person). It was on a general Web search that I was lucky enough to get right the first time.

Wikipedia does not have it all. IMDB does not have it all. Do not assume that these big sites are pulling information from every corner of the Web, especially as you can get substantial information from very unlikely sources (like a newsletter for the residents of Franklin Hills!) Take the time and do a general Web search. You will often find surprising information that the larger sites either haven’t found or haven’t integrated into their own sites.

If you’d like to see Ms. Rubin’s acting chops without the riffing, The Eclectic Screening Room has a thoughtful overview of both The Outsider and The Snob with both shorts embedded in the blog post. She really is quite a good actress!

Keeping Up With Reddit: Use IFTTT, Not Google Alerts

If you consider just mainstream press coverage, you might come away with the impression that Reddit is a chaotic place full of jerks and that the information it aggregates is of no use to the serious researcher.

And you’d be a tiny bit right. Reddit does have jerks, for the simple reason that a community of that size can’t exist without having jerks. In 2015, Reddit had over 8.5 million users provide over 725 million comments on over 73 million submissions. I imagine it would be impossible, statistically speaking, to aggregate that much human interaction and not have some untoward behavior.

Similarly I imagine it would be difficult to aggregate 73 million submissions on a site as wide-ranging as Reddit and not have something turn up that would be of interest or of use to you. Your field of interest would have to be very limited.

Knowing the size of Reddit and knowing its level of activity, I assumed that there was information in there that would be useful to ResearchBuzz. I just had to figure out a way to winkle it out.

After six months of various experiments, I’ve finally settled on a useful way to monitor Reddit without too much hassle. My biggest takeaway? Google Alerts is not the right tool for the job.

Information Trapping Strategies for Reddit

Reddit is divided into categories called subreddits. Any registered user can make a subreddit and there are literally thousands and thousands of them. Metareddit lets you search for various subreddits. The nice thing about Metareddit is that you can search for a keyword and it’ll show you subreddits that are somehow relevant to that keyword, not just the ones that contain the keyword in the title. Thus a subreddit search for cows finds not only cows and happycowgifs, but also ranching and vegan (and SlothMemes and Bioshock. Sometimes understanding the connection takes a little digging.)

As you explore the subreddits, you may find that all you need to do is join Reddit and monitor a couple of particular subreddits. If you want to monitor for information on genomics, for example, then genomics and ClinicalGenetics might be all you need. In that case, get a Reddit account, join those subreddits, and enjoy! (The topic “Best Practices for Hands-On Reddit” is a whole ‘nother article.)

Alas, that won’t work for me. I’m interested in online databases and digital archives and so forth, which cover thousands of subreddits. If I tried to monitor just one or two, I’d miss so much.

I started my Reddit-monitoring with Google Alerts.

My Experiments With Google Alerts

I currently have 137 Google Alerts. Three of them are Reddit-oriented, and were generated after I did a lot of experimenting. With many of my experiments I found that I got too many results, non-useful results, or just weirdness. Finally I settled on three:

intitle:”new database”

intitle:”new tool”


As you might notice from the syntax, these are very crude searches. I want the keywords in the title, on Reddit. Adding “new” to the keywords database and tool was the only useful way I could find to narrow down the search results. You’d think “archive” as a lone keyword would be overwhelming, but it wasn’t. Publishing information on archives is not a big part of Reddit, as far as I’ve noticed.

The results I got in my e-mail for the Google Alerts didn’t tell me too much; I generally had to dig.

That’s not to say these Google Alerts weren’t useful at all. I found several things – mostly related to popular culture – that I’m sure I wouldn’t have found otherwise. But I got usable results so infrequently that I was sure I was missing other content.


If you’ve been reading ResearchBuzz for a while, you know about IFTTT. The site allows you to automate certain tasks using its different Channels. A channel can be a site (like Reddit) or a technology (like e-mail) or a platform (like Blogger) or even an appliance (Samsung has channels for a refrigerator, robot vacuum, and washer!)

IFTTT stands for “IF This, Then That”. You put the channels together in “recipes,” so that if something happens on one channel, something happens on another channel. As you might imagine, this lets you set up a lot of time-saving tasks. And best of all, IFTTT is free!

I didn’t even have Reddit on my mind when I went to set up some recipes in IFTTT — but browsing through the channels, I noticed there was one for Reddit. And then I noticed how powerful it was.

The Initial Setup

I’m going to assume you know the basics of IFTTT. If you don’t, MakeUseOf has a very thorough guide, which should teach you all you need to know about making recipes.

Start with the Reddit channel. You’ll need to connect that channel, and to do that you’ll need a Reddit account. Make sure you’ve got an IFTTT account and a Reddit account, and that your Reddit channel is connected. When it’s all set up it’ll look something like this:

[Screenshot: the connected Reddit channel on IFTTT]

Underneath this screen will be a list of popular IFTTT recipes using Reddit. That’s fun to browse, but the really good stuff is at the bottom of the screen: the triggers and actions.

Every IFTTT channel has a list of Triggers and Actions. A trigger is something that happens on the channel (the “If”.) The actions are things that happen because of something happening on another channel (the “then”.) The Reddit channel has a lot of triggers, but only two actions.

The trigger I use is New Post From Search. As the channel explains, “This Trigger fires every time a new post submitted on reddit matches a search query you specify.” There is a limit of 20 items per search so you need to get specific.

Let’s make a recipe. I want a recipe to send me an e-mail whenever a new Reddit post matches a search for online museum. We’ll start with the trigger New Post From Search. I’ve chosen the channel and the trigger, so I need to enter my search term:

[Screenshot: entering the search term for the New Post From Search trigger]

Note that entering your search term is absolutely all you have to do, though you can refine your search with Reddit’s search operators (more about that in a moment.)

(Wondering why I didn’t put the phrase online museum in quotes? Oddly, Reddit does not appear to support phrase search.)

Now you’ll need to create an action. Because I’m using this recipe to monitor Reddit for new resources, I’m having the results emailed to me. But if you look at IFTTT’s channels you’ll see you have other options, as well. You could add information to a Google Sheet. You could add the links found to Pocket (though I recommend you be sure your search is getting good results before you do that.) You could even automatically tweet something the monitor finds, though again you’d better be sure your results are fantastic (and that you’ve got the steely nerve required to auto-Tweet links found by automated Reddit searches.)

I find email works best for me, and IFTTT thoughtfully has a quick template of the results that I’ll get back from my Reddit searches. I already have the Email channel set up in IFTTT, so adding this action to my recipe only takes a moment.

PROTIP: If you don’t want to get an email every time the search finds a resource, IFTTT also offers an “EMail Digest” channel that will send you resource roundups either daily or weekly.

The last step is IFTTT showing you a summary of your recipe. Click “Create Recipe” and you’re good to go!

[Screenshot: summary of the finished recipe]

Just looking at this recipe, you’ve probably noticed the obvious thing that sets it apart from Google Alerts: you’re searching Reddit directly, not relying on Google to index everything perfectly. That alone is going to increase the quality of the resources you get.

But there’s a second step here that will make your search results even better: using Reddit’s own search syntax.

Tweaking Your Reddit Searches

This was a very basic Reddit search: online museum. We can do better than that.

Reddit’s got a list of advanced search operators. Quickly skimming this list will give you at least a few ideas for making your search better. Let’s hit the highlights.

Reddit Syntax

title: – Just what it says on the tin. Limits your search terms to the title field of a post. If you have general topic interests this is the most important search syntax you can use to keep from being overwhelmed.

site: – If you’ve got a search that’s giving you quality problems and you know your keyword is good, site can be your best friend. It’ll restrict which sites the links in your results come from. You don’t have to restrict your search to a particular domain; instead, you can search for a top-level domain like edu, gov, or a country code like uk. Try the following searches in Reddit:

title:genetics site:edu
title:genetics site:gov
title:genetics site:uk

See how different those are?
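If you want to compare a bunch of domain restrictions at once, a few lines of Python can generate the searches for you to paste into Reddit's search box. This is only a sketch: the function name site_variants is made up for illustration, and all it does is build strings from the title: and site: operators described above.

```python
def site_variants(keyword, domains):
    """Return one Reddit search string per domain restriction."""
    return [f"title:{keyword} site:{domain}" for domain in domains]


# Print the three example searches so they can be pasted into Reddit.
for query in site_variants("genetics", ["edu", "gov", "uk"]):
    print(query)
```

Swap in your own keyword and list of domains to see which restriction gives you the best results.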

nsfw: – Whether a post is considered “safe for work” or not. nsfw:no means you do not want posts which are not considered safe for work. I always set this one to no.

STUPID PET TRICK: You can do a standalone search for nsfw:yes without any other keywords or syntax if you’re just interested in the more — um — adult side of Reddit.

self: – Lets you specify whether a post is a text post, or whether it’s a submission with a URL. I want posts with resources, so I set this to self:no. On the other hand, if you were looking for information or anecdotes about something, you might set this to yes. Searching for title:spina title:bifida self:yes might give you insight into what people living with spina bifida or otherwise connected to it are struggling with or want to know more about, while title:spina title:bifida self:no gives you resources and news stories. I recommend you experiment with this before you use it.

When I use Reddit’s search syntax, my search recipe looks a little different. Instead of just online museum, it’s title:online title:museum nsfw:no self:no. Testing that search on Reddit gets me a reasonable number of useful results. I finish creating the recipe and bam, I’ve got a new information trap on Reddit.
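If you plan to build several of these searches, the pattern is regular enough to script. Here's a minimal sketch, assuming you want every word of your phrase in the title plus the nsfw: and self: settings I use. The function build_query and its parameter names are hypothetical, not part of Reddit or IFTTT; it only assembles the text you'd type into the search box.

```python
def build_query(phrase, nsfw=False, self_posts=False):
    """Turn a plain phrase into a Reddit search string.

    Each word gets its own title: operator, since phrase search
    in quotes does not appear to be supported.
    """
    terms = [f"title:{word}" for word in phrase.split()]
    terms.append("nsfw:yes" if nsfw else "nsfw:no")
    terms.append("self:yes" if self_posts else "self:no")
    return " ".join(terms)


print(build_query("online museum"))
# prints: title:online title:museum nsfw:no self:no
```

Calling build_query("spina bifida", self_posts=True) would give you the text-post version of the spina bifida search mentioned above.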

I’ve published this recipe on IFTTT. You can make your own copy and edit it to reflect the topics and ideas that you’re interested in as you explore using Reddit as a resource for links and information.

A Couple of Hazards

In the month or so I’ve been using IFTTT and Reddit, I have found a lot of useful sites. I’m getting more e-mail, of course, but since the titles are so clear, I’m sure most of the time when I follow a link it’ll be useful. Two particular things, however, are an issue:

Cascades: I call them cascades, anyway. Sometimes you’ll get a useful link multiple times. The Panama Papers database release is a great example. I saw a few mentions of it, then as the release date of May 9th got closer I saw more, and then an absolute flood a couple days before the database release. I just delete them as the subjects of the e-mail alerts make it clear they’re duplicates.

Old Resources: Sometimes someone will post a resource that’s very old thinking it’s new and cool. If you’re just gathering resources this won’t matter, but I try to keep ResearchBuzz focused on fresher links.

(What about spam? – Personally, I haven’t seen much spam on the Reddit resources I get. I think my keywords are keeping most of it at bay. If I did see more I would revamp my keywords or consider using the site: syntax to restrict my searches to at least the .edu and .gov top-level domains.)
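I delete cascade duplicates by hand, but if you send your results somewhere scriptable (a spreadsheet export, say) you could weed them out automatically. The sketch below is an illustration only. It assumes you have your alerts as (title, link) pairs, and that two links are "the same" if they match after dropping the scheme, a leading www., any query string, and a trailing slash. That rule is my simplification and may merge pages that really are different.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a link to host + path so near-identical links compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Query strings and fragments are ignored (an assumption, see above).
    return host + parts.path.rstrip("/")


def drop_duplicates(alerts):
    """Keep only the first alert seen for each normalized link."""
    seen = set()
    unique = []
    for title, url in alerts:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            unique.append((title, url))
    return unique


alerts = [
    ("New database released", "https://www.example.org/db/"),
    ("Database is live!", "http://example.org/db"),
    ("Something else", "https://example.org/other"),
]
print(drop_duplicates(alerts))
```

In this made-up example the second alert is dropped, because it points to the same place as the first.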

Why Don’t You Just Use an RSS Feed?

Reddit, unlike many sites, offers an RSS feed of its search results (You can see the one for the online museum search here). So why don’t I just put an RSS feed together instead of messing around with IFTTT?

There are a few reasons:

  1. Getting the results the IFTTT way means you’ll have more flexibility on how you can use them – sent to Pocket, auto-tweeted if you’re nervy enough, etc.
  2. The duplicates are easier to deal with when you get them individually in e-mail versus when you get them scattered all over an RSS feed – at least to me.
  3. My priorities. My e-mail alerts are a higher priority to me than my RSS feeds, because they’re targeted. When I get alerts sent to my e-mail, I know I’ll read them first. If I put them in an RSS feed it’s not clear when I’d get around to reading them.
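If you'd rather try the RSS route anyway, the feed address is easy to build from a search string. This is a sketch under an assumption: that Reddit serves search results as RSS at /search.rss with the search in a q parameter and sort=new for newest first. Check that against the feed link on your own search results page before relying on it. The function name search_feed_url is mine.

```python
from urllib.parse import urlencode


def search_feed_url(query, base="https://www.reddit.com/search.rss"):
    """Build an RSS feed address for a Reddit search (assumed URL layout)."""
    return base + "?" + urlencode({"q": query, "sort": "new"})


print(search_feed_url("title:online title:museum nsfw:no self:no"))
```

Paste the printed address into your feed reader to subscribe.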

This Just Works Better Than Google Alerts

When I first started using IFTTT as an information trap, I had no idea it would work this well. I wish I’d cottoned to it a long time ago! If you’ve been interested in Reddit but could not work out a good way to monitor it, try IFTTT. You’ll save a tremendous amount of time and I’d be shocked if you didn’t find useful resources.


Wolfram|Alpha Celebrates First Anniversary with Some New Features

Happy birthday, dear Wolfram|Alphaaaaaaa…. happy birthday to youuuu….. Search engine Wolfram|Alpha put up an interesting blog post Tuesday about its first anniversary and the way it has changed over the last year. The search engine also announced a few changes.

The home page is a bit different, pretty but still simple. If you’ve never quite “gotten” W|A, check out the examples by topic, so you can get an idea of what Wolfram|Alpha can do. If you really want to get under the hood, check out the still-incomplete entity index, which shows you very specific examples of what W|A covers in different categories. (This is still under development but it’s fascinating and I can’t wait to see how it fills out.) The home page also has settings now, too, though it’s just for background settings (the blue one is nice) and whether W|A shows hints or not. Looks like it relies on cookies to keep these settings.

There’s also some new content; the site now offers street maps; searching for something like Sydney Opera House shows, in addition to information about the structure itself, a street map to where the structure is located. There are also several ways to search for diseases; pulling up that URL will let you calculate disease risk, look at the incidence of disease in populations, get information on specific diseases, and more. I did find that I had to play with my searches a bit to get some of these results. And of course I knew a long time ago that the phrase random disease works.

W|A also announced that when the search engine doesn’t know the answer to a question, it’ll try to find the “nearest” query to interpret. It doesn’t work all the time, but W|A is working on making this better. I’ll need it, because I’m still not great at figuring out Wolfram|Alpha’s syntax sometimes, though I find myself using it more and more.

In fact, I’m using it so much that I find myself actually looking for a couple of features, though neither one of them is probably what W|A is made for. First of all is an expansion of random words. You can search W|A for random word and get a word with definitions, synonyms, etc. But though the definitions include the parts of speech, you can’t search for, say, random noun. I wish you could; it would be a handy tool for Mad Libs or generating random queries for Flickr. You can’t stack random queries, either, which is a shame. Wouldn’t it be a great creativity tool for writers if you could run the query random first name, random surname, random occupation, random city and get all the answers on one page?

Happy birthday, Wolfram|Alpha. You’re not getting as much attention as you probably deserve, but it hasn’t stopped you from evolving in new and useful ways. Keep it up!

Searching Dashboard Style

A couple of weeks ago Netvibes announced a new “dashboard engine” to get real-time updates, single-screen style. If you liked Mashpedia you’ll like this. To try it, click “Get Started” on the Netvibes site. You’ll get a page asking you to enter a keyword and then specify whether you’re searching for News, an Artist, or a Brand. I chose a “News” search for Goldman Sachs. The first thing Netvibes did was give me a series of photographs to choose a theme from. Then after I chose one I got my dashboard.

I was a little taken aback by this screen when I saw it; the initial user interface is so slick and the actual dashboard is kind of... boxy. But who cares! Plenty of consolidated data is here and so what if the presenting modules don’t have rounded corners? The default tab is for news and shows results from Flickr, Google News, Yahoo, and Google Blog Search. Each of the modules is customizable; you can change the number of items that show from a source, change the color, remove it entirely, etc. You can also edit the layout of the entire tab. (If you don’t like a widget view there’s also a “reader” view that makes the dashboard look more like a traditional RSS feed reader.)

Yes, this is only one tab of data! There are others; one for general information — more like a personalized portal than anything else — and three devoted to your actual search. There’s one for videos, one for general chatter (on social networks and elsewhere) and one for your search across several different Google properties. If that isn’t enough for you there’s a tab across the top of the page that’ll let you add more content modules, from news to travel. If you can’t make up your mind there’s also a list of essential widgets where you can start.

You don’t need to be logged in to play with Netvibes dashboard, but if you get a (free) account you get more functionality, like the ability to share your dashboards.

I was really impressed with this. The tabs and the widget options mean you can pack a lot of data flows into one screen. The only thing I saw that you’ll have to watch is that the modules don’t update automatically, as far as I could see, so you’ll have to periodically refresh the page. This is an excellent companion to Mashpedia or Cpedia if you’re using that.

Cuil Launches Cpedia, Web Aggropedia

Poor Cuil, a victim of what I like to call “Teoma Syndrome.” Teoma, for those of you who weren’t geeking out on search engines ten years ago, was a search engine that launched in 2000. It got TONS of publicity. Lots of people made noise about it. It was the alleged Google killer. Ask bought it in September 2001, spent some time working on it, and then it just kind of… faded out of the public consciousness. It’s still online if you want to try it. It’s not a bad engine, if you can ignore the smaller data pool and the fact that for every page of ten search engine results you get ten sponsored results (five above and five below.)

When Cuil was launched there was a similar level of fuss, a lot of press, and then poor reviews and some concern about results. Cuil had interesting ideas, but couldn’t compare to Google at launch.

But here’s the thing. If Cuil hadn’t been so constantly and overtly compared to Google, it could have stunk at launch and that would have been unfortunate but okay… it would have had the breathing room to get better. As it was, there were bad reviews at launch, it wasn’t Google, and it didn’t get the traction it could have. And that’s one of my complaints about this “Google killer” business.

The thing that kills Google — one week from now or ten centuries from now — will not look anything like Google. Yahoo and Ask are not threatening Google right now, Facebook and Twitter are. And if something like Facebook gains ascendancy, it won’t be a Facebook clone that overcomes it. It’ll be some site or service that we can’t imagine now, like BrainColorSearch, a USB-powered portable MRI that you hook to your computer while exploring the Internet. The portable MRI device scans your brain constantly as you search and each item of Internet content has an aggregated brain scan associated with it. Searchers using BrainColorSearch match for relevance and for a map that most consistently matches their brain (or how they have declared they want their brain to look — a specified mood.) Searchers end up exploring a Web that makes them the most neurologically comfortable. (Soon tweeners begin “Brain Bombing,” hacking the MRI devices to search for content items that generate bizarre, almost impossible MRI activity maps…)

Okay, that example was a bit goofy, but you get the idea. The next company to displace the leader will not be leader+10%, it’ll be something completely different and (at least for a while) strange.

Anyway, what was I talking about? Oh yes, Cuil. Cuil has moved away from being a Web-type search engine and has announced a change to something different, an attempt to aggregate “instant reference” pages. Sort of like Mashpedia, which I reviewed last week. Cuil’s new effort, Cpedia, doesn’t seem to go to the breadth of Web properties that Mashpedia does, but I found the results more relevant in some cases with nice clustering.

Start your search by entering a topic. I used Benjamin Franklin as my test search with Mashpedia so I repeated it here. As you can see, all the content in the result is from Web search; you won’t find Flickr, YouTube, etc. here except as the byproduct of a Web search. The results were a little odd; the first one was “huh?”-inducing (I couldn’t figure out why it ranked higher than the Wikipedia article) but the rest of it was good content. There were other tabs across the top of the page to provide more targeted content: Benjamin Franklin Parkway, Benjamin Franklin House, etc.

There were also related topics on the right. The query Benjamin Franklin spawned such related topics as “Gentlemen Scientists,” “Continental Congressmen From Pennsylvania,” and “United States Presidential Candidates, 1808.” These headings show a list of topics which, when clicked, take you to a new page of search results (though in the same browser window, so open some tabs if you want to go off exploring.)

At the top of the topic list is a window that opens real-time results. I didn’t get any for Benjamin Franklin, so I tried cows and I only got one for that, so I tried Padres and then I got a bunch, from Twitter and other news services. A slider allows you to specify how recent you want the information to be, from last hour to last day. I am not normally one to comment on design, but it’s annoying to have the main search page designed vertically, and the streaming results designed horizontally.

I can imagine using Cpedia in a complementary fashion to Mashpedia. I’d come to Cpedia first, to explore content topically and zero in on the names/descriptions/topic headers that really find me what I want. Then I’d take those to Mashpedia and use them to explore certain parts of the Web (Flickr, YouTube, Twitter) more deeply.