Learning Search

The Importance of Excluding Words When Setting Up Google Alerts

I have written a considerable amount about the importance of picking good words when creating Google Alerts and when doing other search exercises. I have also written a lot about Google Alerts itself. I have written less about the importance of excluding words, and I want to fix that today.

So Many Google Alerts, So Little Time

As I said in a recent article, Google Alerts is one of the three main ways I monitor the media and the Web for ResearchBuzz-worthy articles. As you might imagine, getting a constant stream of e-mails and skimming through them is a time-consuming process. I want to minimize that time as much as possible.

That’s why, when I first set up an alert, I run the search and go through the initial set of results. I’m looking for results that are irrelevant to my interests, but more than that, I’m looking for irrelevant results that come up over and over again. I’m going to get results from Google Alerts that I can’t use; there’s no getting around that. What I’m trying to discover is whether there is any way I can avoid as many of those as possible.

I usually find that there is, and that’s by excluding keywords in my Google Alerts searches. Excluding keywords works in three ways, and I want to show you examples of each: pattern avoidance, topic avoidance, and current event avoidance. I hope that seeing how I exclude keywords gives you ideas for crafting better Google Alerts of your own.

(NOTE: The term “avoidance” is pretty strong, and combined with “current event” it might sound like I’m burying my head in the sand. But I’m not; instead, I like to keep my news consumption separate from my Google Alerts consumption. When you read “current event avoidance,” understand that I’m just trying to keep my Google Alerts on ResearchBuzz-worthy stories and sources. Okay?)

Pattern Avoidance

If you’ve ever used WordPress or Movable Type or something like that, you know that each installation is laid out in a certain way. Archives might live in one part of the Web site, with each URL having the word archive in it. Or each post about an event might have the word “Calendar” in the page title.

Those are patterns – regular ongoing ways that a Web site (or a particular kind of CMS like WordPress) sets up its pages – whether it’s a similarity in page title, URL, or something else. When I go through my Google Alerts I can spot these patterns and make sure that I eliminate a few more useless results from my e-mail. Here are a couple of examples:

"online database" -inurl:blogspot

That Google Alert makes me look like I’m anti-Blogspot but I’m not; I just found that Blogspot blogs tended to be more links to other stories than actual writeups in and of themselves. Just removing Blogspot from the URL cut all of those out of my Google Alerts.

intitle:google -intitle:"google apps" -intitle:calendar (site:nc.us | site:tx.us | site:wa.us | site:ny.us | site:ca.us | site:museum | site:aero | site:edu | site:gov | site:mil)

This is kind of an extensive Google Alert, so let’s break it down. I’m looking for pages/articles with “Google” in the title. I know that’s going to get me a crazy number of results, so I’m limiting my search to several specific domains, like the state domains for North Carolina and Texas as well as top-level domains such as .gov, .mil, and .edu.

Restricting my search for “Google” in page titles to these domains will still get me a lot of search results, so I’m also excluding the phrase Google Apps (which will avoid announcements about Google Apps changes) and the word calendar from the page title (so I don’t get Google Calendar and event announcements).

intitle:database -inurl:library -inurl:jobs -intitle:"job vacancies" -inurl:academics -intitle:calendar (site:nc.us | site:tx.us | site:wa.us | site:ny.us | site:ca.us | site:museum | site:aero | site:edu | site:gov | site:mil)

This is a similarly crazy-long Google Alert. In this case I’m looking for pages and articles that mention databases in the title, but I’m restricting the domains I’m watching to the same group as in the previous example. I’m excluding the word library from the URL (because I don’t want news about databases that are restricted to library patrons only) along with the word academics (for much the same reason). I’m also excluding the word jobs from the URL (because I found a lot of results about job databases) and the phrase job vacancies from the title (because I was getting ads for database experts). As in the previous alert, the word calendar is excluded from the page title.
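
If you maintain several alerts that share the same group of domains, it can help to assemble the query from its parts rather than retyping it. Here’s a minimal Python sketch of that idea. It only builds the text you would paste into Google Alerts; the function name build_alert_query and its parameters are made up for illustration and are not part of any Google tool.

```python
def build_alert_query(terms, exclusions, sites):
    """Assemble a Google-style query string from its parts.

    terms      -- positive search terms, e.g. ['intitle:database']
    exclusions -- terms to exclude, written WITHOUT the leading minus sign
    sites      -- domains to OR together inside one parenthesized group
    (Hypothetical helper: it just joins strings in the article's query style.)
    """
    parts = list(terms)
    parts += ["-" + term for term in exclusions]
    if sites:
        group = " | ".join("site:" + s for s in sites)
        parts.append("(" + group + ")")
    return " ".join(parts)


# The domain group used in both long alerts in the article.
SITES = ["nc.us", "tx.us", "wa.us", "ny.us", "ca.us",
         "museum", "aero", "edu", "gov", "mil"]

query = build_alert_query(
    terms=["intitle:database"],
    exclusions=["inurl:library", "inurl:jobs", 'intitle:"job vacancies"',
                "inurl:academics", "intitle:calendar"],
    sites=SITES,
)
print(query)
```

Keeping the domain list in one place means a change to it (say, adding another state domain) only has to be made once before you regenerate each alert’s text.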

(Looking at that search you might wonder, “Can you make these searches too long?” I’m sure you can; I just checked and Google still has a query limit of 32 words. Excluding words counts against that query limit (I just checked that too.) That being said, I can’t imagine that there are many Google Alert type searches that would bump up against that limit. Even if they did, you could break the search out into sub-searches and set up Google Alerts for those instead.)
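
For anyone who does bump into the limit, breaking a search into sub-searches can be sketched like this. This is a rough Python illustration, not a definitive tool: it assumes a “word” is any whitespace-separated piece of the query other than a bare | separator, which may not match how Google actually counts, and the names count_words and split_by_sites are hypothetical.

```python
QUERY_WORD_LIMIT = 32  # the limit cited in the article


def count_words(query):
    """Rough word estimate for a query.

    ASSUMPTION: every whitespace-separated token counts as one word,
    except bare "|" separators. Google's real counting may differ.
    """
    return len([tok for tok in query.split() if tok != "|"])


def split_by_sites(base, sites, limit=QUERY_WORD_LIMIT):
    """Return one or more queries, each estimated to fit within `limit`.

    The base terms are repeated in every sub-query; only the site group
    is divided up.
    """
    room = limit - count_words(base)
    if room < 1:
        raise ValueError("the base query alone already reaches the limit")
    queries = []
    for start in range(0, len(sites), room):
        chunk = sites[start:start + room]
        group = " | ".join("site:" + s for s in chunk)
        queries.append(f"{base} ({group})")
    return queries


BASE = ('intitle:database -inurl:library -inurl:jobs '
        '-intitle:"job vacancies" -inurl:academics -intitle:calendar')
SITES = ["nc.us", "tx.us", "wa.us", "ny.us", "ca.us",
         "museum", "aero", "edu", "gov", "mil"]

# With the real limit this alert fits in a single query. A smaller,
# made-up limit of 10 is used here only to show the splitting.
sub_queries = split_by_sites(BASE, SITES, limit=10)
for q in sub_queries:
    print(count_words(q), q)
```

Each sub-query would then get its own Google Alert.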

When I was reviewing these searches, I noticed another type of issue. There were results that didn’t follow a pattern exactly, but which regularly included keywords that meant the story would not be one I was interested in. I have struggled a lot to get these exclusions right, but once I do, it really cuts down on the number of useless results I get.
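
One way to hunt for those recurring keywords is to tally the words in the titles of results you’ve already judged irrelevant. The Python sketch below is only an illustration of that idea: the titles are invented samples, the stopword list is an arbitrary assumption, and recurring_words is a hypothetical helper name.

```python
import re
from collections import Counter

# ASSUMPTION: a tiny hand-picked list of filler words to ignore.
STOPWORDS = {"the", "a", "an", "to", "of", "in", "on", "for", "and", "after"}


def recurring_words(titles, min_count=2):
    """List words that appear in at least `min_count` of the titles.

    Each word is counted once per title, so the number is "how many
    irrelevant results contained this word".
    """
    counts = Counter()
    for title in titles:
        words = set(re.findall(r"[a-z']+", title.lower()))
        counts.update(words - STOPWORDS)
    return [(word, n) for word, n in counts.most_common() if n >= min_count]


# Invented examples of results that were not useful.
irrelevant_titles = [
    "Twitter reacts to big win over rivals",
    "Twitter reacts after loss to champions",
    "Fans on Twitter react to new trailer",
]

print(recurring_words(irrelevant_titles))
```

The output is just a list of candidates. Whether a word (or a phrase built from it) is safe to exclude still takes judgment, since it may also appear in results you want.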

Topic Avoidance

I say topic instead of keywords because I’m trying to avoid a kind of news story or Web page, not just a particular keyword or set of keywords. You’ll see what I mean when I show you these examples.

new intitle:"social media" -"twitter reacts" -"win over" -"loss to"

Some news stories are writers going to social media, looking up tweets about a topic, grabbing a bunch of hot takes, and then embedding them into an article. Which is fine, but there’s not much chance those kinds of stories will be useful to me. Removing “twitter reacts” has cut down on those kinds of stories a lot. Removing “win over” and “loss to” removed those stories about social media reactions to sporting events.
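
Before committing to exclusions like these, you can try them against a handful of titles you’ve saved to see what they would drop. The Python sketch below is a local approximation only: Google applies exclusions to whole pages rather than just titles, the sample titles are invented, and passes_exclusions is a hypothetical helper that does a simple case-insensitive substring check.

```python
def passes_exclusions(title, excluded_phrases):
    """Return True if none of the excluded phrases occur in the title.

    Simplification: plain case-insensitive substring matching, which is
    not how Google itself evaluates an exclusion.
    """
    lowered = title.lower()
    return not any(phrase.lower() in lowered for phrase in excluded_phrases)


EXCLUDED = ["twitter reacts", "win over", "loss to"]

# Invented sample titles, two wanted and two unwanted.
sample_titles = [
    "Twitter Reacts to Tuesday's Game",
    "New Social Media Archive Opens at State Library",
    "Social Media Erupts After Team's Loss to Rival",
    "University Launches New Social Media Research Tool",
]

kept = [t for t in sample_titles if passes_exclusions(t, EXCLUDED)]
dropped = [t for t in sample_titles if not passes_exclusions(t, EXCLUDED)]

print("Kept:", kept)
print("Dropped:", dropped)
```

If a title you wanted ends up in the dropped list, that’s a sign the exclusion is too broad.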

“Social media,” even when restricted to just a page’s title, is a really terrible search term. It’s far too general. But you would not believe the number of good articles I’ve found because of it! Every time I think about getting rid of it, I remember the useful material I’ve found and keep it.

best review "drone flight simulator" -intitle:"google play"

My husband is into drones, and I keep this alert around to hear about new drone software. Removing “google play” from the page’s title keeps this alert very low-frequency, and I don’t get pages about apps and games in the Google Play store (I’m watching for desktop software).

intitle:"search engine" -seo

Much respect to the SEO folks, but it isn’t my bag. This very simple exclusion makes sure that my search results are more about search engine resources and less about ranking your site as well as possible.

Current Event Avoidance

While there is some change in the way words are used (new slang, etc.), excluding keywords to avoid patterns and topics is generally a one-time thing: set up the Google Alert, examine several pages of search results, and tweak your alert to maximize its usefulness. But occasionally a current event will come up that completely overwhelms your search results. In that case you might have to take time out to adjust your Google Alerts. I have two examples of this for you.

Tool

Old version: intitle:tool "new tool" -"james keenan"

New version: intitle:tool "new tool" -"james keenan" -band -album

The first time I created a Google Alert for “new tool,” I got results about the band Tool and its possible new album (apparently it’s been a long time and the fans are ready for another one). I looked up some information about the band and excluded “James Keenan,” who is the singer. That worked fine and cleared up my alerts. (Many sites doing stories about the new album would either quote Mr. Keenan or mention they tried to contact him.)

Now apparently the Tool album is either well underway or almost finished, and stories about it are not relying on quotes or information from Mr. Keenan at all. Excluding the words band and album from the search is cutting the Tool-related stories down considerably.

Threat

Old version: new intitle:"social media" -"twitter reacts" -"win over" -"loss to"

New version: new intitle:"social media" -"twitter reacts" -"win over" -"loss to" -school -threat

This is a tough one, and when I realized I had to adjust this alert I wanted to cry.

After the mass murder at Stoneman Douglas High School, my “social media” Google Alerts started going into overdrive, with story after story after story about threats to various schools causing lockdowns and cancelled classes and arrests. I didn’t want to just exclude the word threat, because I would miss stories like “Nerdocrumbezia Leaders Call Social Media A Threat to the Government.” So I ended up excluding both threat and school. I’m pretty sure I’m going to miss useful materials because of this, but I felt I had to do it. I also felt horrible about it.

Revisiting Current Event Exclusions

Whether you need to go back and revisit current event exclusions later depends on what you’re excluding. In these cases I’ll never be particularly interested in a new Tool album, and reading about social media threats to schools will not find me resources for ResearchBuzz. But if you’re excluding the word Olympics during the Olympics, of course you will have to go back and remove that later or risk losing relevant results.
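
If you keep notes on your alerts, one way to remember which exclusions need revisiting is to record the temporary ones separately from the permanent ones. The Python sketch below illustrates that bookkeeping. The Alert class and its field names are hypothetical, and the Olympics exclusion is an invented example based on the scenario above, not an alert from this article.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    """A search alert with permanent and temporary exclusions kept apart."""
    base: str
    permanent: list = field(default_factory=list)  # exclusions you never expect to undo
    temporary: list = field(default_factory=list)  # exclusions tied to a current event

    def query(self, include_temporary=True):
        """Build the query text, optionally leaving out temporary exclusions."""
        excluded = self.permanent + (self.temporary if include_temporary else [])
        return " ".join([self.base] + ["-" + term for term in excluded])


# Hypothetical: the social media alert during an Olympics.
social = Alert(
    base='new intitle:"social media"',
    permanent=['"twitter reacts"', '"win over"', '"loss to"'],
    temporary=["olympics"],
)

print(social.query())                         # use while the event dominates results
print(social.query(include_temporary=False))  # switch back once it has passed
```

Anything in the temporary list is a reminder to go back and update the alert later.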

Picking the perfect search term is important, but equally important is excluding terms. It’ll help you keep your search alerts focused and manageable.

This article was brought to you by my Patreon patrons. Their support helps keep ResearchBuzz going. Think a kind thought for them, or maybe consider becoming a patron yourself. Thanks!

5 replies »

  1. Tara, what a timely article. I am a rank novice when it comes to Google Alerts but I have started setting up a few and really appreciate knowing more about fine-tuning the results. Thank you!

  2. Hello, thanks for this article. Do you know what happens if you exceed the 32 word limit? I have some alerts set up that do, but are returning results, and am wondering if anything after the 32 words is just being ignored.

    • That’s exactly what happens! If you try those searches in Google you’ll get a little statement that all terms after the 32nd are being ignored. Hey, thank you for your bee-advocacy!
