Canada, Google, IFTTT, More: Wednesday Buzz, November 11th, 2015


Library and Archives Canada has launched a new database: “Immigrants to Canada, Porters and Domestics, 1899–1949”. “This online database allows you to access more than 8,600 references to individuals who came to Canada as porters or domestics between 1899 and 1949.”


Google has launched a new search results interface for tablets. “The new interface is very different from the old tablet design, which was a combination of mobile and desktop design in one. This tablet view uses a form of card-like results, with a skinnier top bar navigation and a lot of white space on the left and right of the search results.”

More Google: it is rolling out a new feature called “About Me”. The link goes to a good rundown from Venture Beat. “So far, About me looks like a decent start. Some information (like my last name) I definitely want other Google users to know, while other details (like my phone number) I would rather keep private. I can finally control these easily.”

More MORE Google: you may have known you could access a timer from Google’s search page. Now you can access a stopwatch too!


Possibly interesting: If you add a New York Times recipe to IFTTT, you can get a free 8-week digital subscription to the NYT. The small print: “Promotion is for November and for new NYTimes subscribers only. After adding any NYTimes Recipe you will be directed to The New York Times website where you may redeem a special code. All you have to do is enter your email address to claim a free trial subscription. Smartphone and tablet apps may not be supported on all devices. A few other restrictions apply.”


Wow: Facebook no longer counts 3rd party apps and yet still has over 1.5 billion users. “Some say Facebook inflates its monthly active user count by including people who shared from or used a Facebook-connected third-party app. But today Facebook quieted those critics with a 10-Q update to its SEC filing that says it now only counts people who used Facebook or Messenger directly. That means the 1.55 billion user count it gave yesterday on its earning report is real.”


The latest company to offer two-factor login? Why, it’s Twitch! Still can’t use it on Amazon, though. “Two-factor authentication (2FA) requires two different methods of verification to log in to your Twitch account: your password and your mobile phone. Each time you log in, you’ll enter your password and a unique code that we’ll send to your mobile phone. If your password is somehow compromised, your account will be inaccessible without the code we send your phone.”

Adobe Flash Player: still full of security issues, still a target for hackers. “Software maker Adobe issued an update on Nov. 10 to fix 17 critical vulnerabilities in its ubiquitous Flash player, the day after an analysis found that the program was the most popular target of exploit-kit developers.”

The state of North Carolina has upheld a ban prohibiting sex offenders from using Facebook. “The North Carolina Supreme Court has upheld a state law prohibiting registered sex offenders from using social networking sites such as Facebook that allow minors to join.”


Research indicates that positive emotions are more contagious than negative ones on Twitter. “[Emilio] Ferrara and [Zeyao] Yang used an algorithm that measures the emotional value of tweets, rating them as positive, negative or neutral. They compared the sentiment of a user’s tweet to the ratio of the sentiments of all of the tweets that appeared in that user’s feed during the hour before. Higher-than-average numbers of positive tweets in the feed were associated with the production of positive tweets, and higher-than-average numbers of negative tweets were associated with the production of negative tweets.”

Carnegie Mellon is developing a new tool to easily do visual analysis of large data sets. “Called Explorable Visual Analytics, or EVA, the tool uses a novel computer architecture that enables the analyst to explore raw data through dynamic visualizations with minimal time delay. It’s designed to help users make sense of ‘high-dimensional’ data — that is, data with lots of parameters.” And I mean big data sets: “To refine the design of EVA, the researchers have explored a multidimensional, 100-gigabyte database on the workforce from the U.S. Census Bureau’s Longitudinal Employer-Household Dynamics (LEHD) program.”

This is terrific. The Social Media Collective Web site is aggregating a reading list on the topic of algorithms as a social concern. “This list is an attempt to collect and categorize a growing critical literature on algorithms as social concerns. The work included spans sociology, anthropology, science and technology studies, geography, communication, media studies, and legal studies, among others. Our interest in assembling this list was to catalog the emergence of ‘algorithms’ as objects of interest for disciplines beyond mathematics, computer science, and software engineering.”

Good morning, Internet…

I love your comments, I love your site suggestions, and I love you. Feel free to comment on the blog, or @ResearchBuzz on Twitter. Thanks!

NANFWRIMO #6 & #7: Focusing Your Searches With Specialized Vocabulary

Everybody has a unique vocabulary. You may start with the same language as everyone around you, but over time you will develop words and phrases that you use only in specific situations – at work doing a specialized task, at home talking about a favorite meal, with friends mentioning something funny that happened five years ago and became part of your history with them.

You can even develop a customized vocabulary with your pets. When my cat Eggo hears me say “Apocalypse” she knows this means she better stop what she’s doing and scram, or the world as she knows it is going to end.

(This usually involves me chasing her down the hall with a squirt gun.)

Specialized vocabulary comes from all aspects of your life. As I noted, you may use certain words and definitions only with family. You end up using words that are associated with certain Web sites or software programs. (Hashtag! Vaguebook! Swipe left!) You even learn words just because you live in a certain time and place. (On Fleek! Throw shade! Netflix and chill. Hulu and commit! Amazon Instant Video and pensively brood!)

(I made that last one up.)

Often advice on Internet searching tells you to consider your keywords carefully; make sure you’re describing clearly what you’re looking for, get specific, etc. And there, often, the advice stops. But you can take your searches to the next level by taking advantage of specialized language. You can slant your searches all kinds of ways just by adding an extra word or two, even if it doesn’t seem to have much to do with your topic.

I have five tactics for you to try with specialized vocabulary. But first I need to teach you The Glossary Trick.

The Glossary Trick

Often you might find yourself researching topics that you don’t know a lot about. If you don’t know much about the topic, you almost certainly don’t know the vocabulary associated with the topic. Most of the time you can solve that problem by doing a quick search for the topic of your choice and the word glossary.

Try it:

chocolate glossary
sailing glossary
vaudeville glossary
telecommunications glossary

Your goal when looking at a glossary is not to instantly become an expert. It’s to gather words that will provide a greater depth to your search results. If you’re looking for rich, in-depth chocolate sites, for example, you can really change your search results by adding cocoa nibs or couverture or terroir.
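If you want to run the Glossary Trick across a batch of topics, here is a minimal Python sketch. The function names (glossary_query, search_url) are made up for illustration, and it assumes the usual Google search URL form with a q parameter. It only builds the queries and links; it does not fetch anything.

```python
from urllib.parse import quote_plus


def glossary_query(topic: str) -> str:
    """Return the topic followed by the word 'glossary'."""
    return f"{topic.strip()} glossary"


def search_url(query: str) -> str:
    """Turn a query string into a Google search URL (assumed URL form)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Topics taken from the examples above.
topics = ["chocolate", "sailing", "vaudeville", "telecommunications"]

for topic in topics:
    query = glossary_query(topic)
    print(query, "->", search_url(query))
```

Paste any of the printed links into a browser, skim the glossary, and collect the words you want to try in later searches.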

Sometimes this doesn’t work; if I’m looking for vocabulary for a place or a profession, the glossary trick can lead to spotty results. In those cases I will try searching for “only x knows” or “only x understands,” for example:

“only Californians (know | understand)”

(The parens denote two possible words separated by a |, which means “or”. So this query means either the phrase “only Californians know” or the phrase “only Californians understand.”)

The first search result when I try that is from a blog called Breezy Days and it gives me a few words I can try out. Nor-cal, So-cal? Fresno? D-land? Cali, as a possible excluded word? Looking at more search results might allow me to refine and lengthen my list of California-vocabulary.

You can try this with occupations too, though it’s less reliable:

“only nurses (know | understand)”
“only teachers (know | understand)”
“only engineers (know | understand)”
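The “only x knows” pattern is easy to generate for any group you are curious about. This is just a sketch: only_x_query is a hypothetical helper name, and it copies the parenthesized form shown above, with | standing for “or”.

```python
def only_x_query(group: str, verbs=("know", "understand")) -> str:
    """Build a quoted phrase query with alternative verbs separated by |."""
    alternatives = " | ".join(verbs)
    return f'"only {group} ({alternatives})"'


# Groups taken from the examples above.
for group in ["Californians", "nurses", "teachers", "engineers"]:
    print(only_x_query(group))
```

Swap in other verbs (say, "get" or "remember") through the verbs argument if the default pair comes up dry.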

Now let’s play with some different kinds of specialized vocabulary.

1. Time-based

Time-based vocabulary is simply slang that was popular for a defined period of time. It’s not a word like wow, which has remained in use consistently for a long time. It’s a word like tubular, which is very 80s, or groovy, which is more 60s/70s, or fleek, which is recent.

(Where you are the slang may have been different. YSMV: Your Slang May Vary.)

Slang like this is powerful because once it falls out of popular use, it’s gone. Nobody runs around saying “Gag me with a spoon” anymore. It is so associated with an era that people will use it to invoke a time period. When people are writing about the 80s, they’ll use grody to the max or gross me out the door or whatever. Adding time-specific slang terms to your queries will focus them in a way that no attempts to just describe the era could.

To see what I mean, run these four searches on Google:

fashion “like totally”
fashion groovy
fashion fleek
fashion “daddy-o”

Do you see how one little word or phrase can completely bend your search results?

Protip I: Sometimes you’ll find that a slang term has been co-opted by a later era. Searching for fashion tubular will get you a bunch of pages about Adidas sneakers. Just try another one.

Protip II: Using slang to focus your results on a certain era only goes back so far (at this writing, roughly the 1940s). If you try to use 23 skidoo or oh you kid, both phrases from the early 20th century, you’re going to have spotty results. As we humans continue to fill the Internet and explore our history, this may change. I honestly don’t know.

2. Place-based

I am southern and I say y’all a lot. I also say “bless your heart,” mostly un-ironically. You can slant all kinds of searches toward places or regions by adding a place-based term. Try:

“lemon pie”

“lemon pie” “bless your heart”

The second search is not only a lot more southern, it’s moved you past simple lists of recipes and into, well, basically lemon pie stories.

What slang is exclusive to your area? What happens when you add it to a general search?

Don’t think you have to use slang exclusive to your area, either. Try adding other regional slang to your searches. If you are less familiar with it, it might take some experimenting to change your search results in a meaningful way.

3. Activity-based

I’m going to work tomorrow, price my stacks, draw down my board and get a start on my seven-cards. That’s work vocabulary. This is specialized to me and what I do, so it wouldn’t work for an Internet search, but there’s plenty of industry-specific vocabulary.

Searching for retail information? You’ll make your results more industry-oriented by adding the words “loss prevention” or facing. Trying to get display ideas? Slatwall will turn your shelving search into a bevy of retail possibilities.

Think about the terms you use at work. Are there terms you have to explain to new employees? Those, if commonly used in your industry, are prime candidates for using in your searches to narrow your focus on a particular industry. And activity-based vocabulary isn’t just for work. It’s also for sports, hobbies, and even fandom! (What’s a parrothead? What are browncoats?)

4. Expertise-based

Expertise-based vocabulary is similar to activity-based vocabulary, but it’s more exclusive and indicative of a higher level of training. For example, you know in broad terms what a heart attack is. If you were a doctor you might call it a myocardial infarction.

If you do a search for

“heart disease” “heart attack”

you will get consumer-level information on heart health. On the other hand if you do a search for

“heart disease” “myocardial infarction”

you will get pages which are more dense with medical information and your search results will be prefaced with links to scholarly articles. Using the language that doctors (or other highly trained experts) are more likely to use generally gives you more information-rich pages.

Protip I: Don’t know how to make your medical topic more “doctor-y”? Try searching “another term for x”: another term for heart attack, or another term for stroke, or another term for high blood pressure. These will give you different terms for each of these conditions.

Protip II: This is more like a stupid pet trick, but try adding one of those ridiculously-specific ICD-10 codes to your search. Who knows, perhaps when you add “W61.33XA” you’ll slant your search toward initial chicken attacks. This doesn’t work most of the time, but when it does it’s hysterical.

5. Culture-based

There’s slang you learn because of where you grew up. There’s slang you learned because of when you grew up. And then there’s slang you learned because of HOW you grew up, and what you grew into. Your religion, your preferences for love, your ethnic background – the way you describe yourself and your context. These are all opportunities to change your searches to be more reflective of your life and your situation.

Or not. I’m not saying that if you’re southern you have to add y’all to every single search ever.

(DISCLAIMER: I am now tempted to try that for a week just to see how weird it would get.)

I am saying that words are symbols of how we describe our experiences to ourselves and each other, and because we all have different circumstances and contexts, we all use different symbol sets. If it helps you in one of your searches to add symbols that are reflective of your context, then that is a good thing. If you want to try to use symbols that are reflective of someone else’s context in order to get a different perspective, then that can be a good thing – as long as you remember that understanding how someone uses a word does not mean you understand how they experience their lives.

There’s one more specialized vocabulary that everyone can use, no matter what their symbol sets.

6. Outcome-based

Simply: what do you want to happen?

Do you want to cook something? Do you want to buy something? Do you want an opinion? Add words relevant to your outcome. Try these searches:

“lemon pie” nutrition servings

“lemon pie” shipping delivery

“lemon pie” reviews stars

When you’re adding outcome-based vocabulary to your search, think for a moment about your ideal search result – the recipe page, or the order page, or the review. Think about standard words that might appear on the page, and then add them to your search.

Protip: This is also a fantastic way to use exclusionary terms. Try searching for countertop -shipping -sale -order. You’ll find by excluding three words, you’ve removed a lot of sales stuff from your results and focused more on reviews, maintenance, etc.
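Outcome words and excluded words can be bolted onto a base query mechanically. Here is a small sketch of that; build_query and quote_if_phrase are names I made up, and the only syntax it relies on is the quoted phrase and the leading minus sign used in the examples above.

```python
def quote_if_phrase(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_query(base: str, include=(), exclude=()) -> str:
    """Join a base term with outcome words to add and words to exclude."""
    parts = [quote_if_phrase(base)]
    parts += [quote_if_phrase(term) for term in include]
    parts += ["-" + quote_if_phrase(term) for term in exclude]
    return " ".join(parts)


# Outcome-based: I want opinions about lemon pie.
print(build_query("lemon pie", include=["reviews", "stars"]))

# Exclusion-based: I want countertop information without the sales pages.
print(build_query("countertop", exclude=["shipping", "sale", "order"]))
```

Keeping the include and exclude lists separate makes it easy to reuse one set of outcome words across many different base topics.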

It’s very important to get your initial search terms right. Once you’ve got that, however, do a little experimenting with specialized language and vocabulary and see if that nudges your results closer to where you want to go.

Schizophrenia, Indigenous Peoples, USGS, More: Tuesday Buzz, November 10th, 2015


Now available: a database of clinical research data on schizophrenia. The data have been translated into one “language” and aggregated so it can be viewed in toto. “Despite hundreds of studies, schizophrenia remains poorly understood. In part, that’s because the findings of traditionally small individual schizophrenia studies are variable and difficult to replicate. The larger database of SchizConnect allows scientists to see broader results across 1,000 subjects instead of 100.”

A new online map contains information on lands around the world held by indigenous peoples. “LandMark is the first online, interactive global platform to provide maps and other critical information on lands that are collectively held and used by Indigenous Peoples and local communities. … LandMark currently provides information at two scales–community level and national level—allowing users to compare the land tenure situation across countries and within countries.”

The US Geological Survey (USGS) has released a new photo catalog. “The U.S. Geological Survey announced today that it has made part of a huge national repository of geographically referenced USGS field photographs publicly available. USGS geographers developed a simple, easy-to-use mapping portal called the Land Cover Trends Field Photo Map. The entire collection contains over 33,000 geo-referenced field photos with associated keywords describing the land-use and land-cover change processes taking place. Initially, nearly 13,000 photos from across the continental US will be available to the public, yet the online collection will grow as more processed photos become available.”

The National Science Foundation (NSF) is awarding funds to establish Big Data Hubs. “As a part of the Administration’s Big Data Research and Development Initiative and to accelerate the emerging field of data science, NSF announced four awards this week, totaling more than $5 million, to establish four Big Data Regional Innovation Hubs (BD Hubs) across the nation. Covering all 50 states and including commitments from more than 250 organizations—from universities and cities to foundations and Fortune 500 corporations—the BD Hubs constitute a ‘big data brain trust’ that will conceive, plan, and support big data partnerships and activities to address regional and national challenges.”


Google Maps is getting easier to use offline. “Now you can download an area of the world to your phone, and the next time you find there’s no connectivity—whether it’s a country road or an underground parking garage—Google Maps will continue to work seamlessly. Whereas before you could simply view an area of the map offline, now you can get turn-by-turn driving directions, search for specific destinations, and find useful information about places, like hours of operation, contact information or ratings.”

Do you want to check out the New York Times’ foray into VR? Here ya go.

Twitter has launched a public policy transparency page, whatever that is. “Because Twitter stands for open communication, we’re pleased to unveil, our new site covering the most critical policy issues facing our users, as well as providing an unprecedented level of transparency into how and with whom we engage politically in the U.S.” I’m going to assume this is meaningless until Politwoops comes back.

Facebook has launched a new feature called “Music Stories”. “On the Facebook iPhone app, songs and albums shared from the leading music services will become ‘Music Stories,’ a new post format which allows people to listen to a 30-second preview of the shared song (or album) while on Facebook. The preview is streamed from either Apple Music or Spotify (depending on the source of the link shared), and can be purchased from or saved to the respective music streaming service.”


It’s two years old, but this paper from Ian Milligan addresses issues that are coming more and more into prominence. Check out Mining the ‘Internet Graveyard’: Rethinking the Historians’ Toolkit. “‘Mining the Internet Graveyard’ argues that the advent of massive quantity of born-digital historical sources necessitates a rethinking of the historians’ toolkit. The contours of a third wave of computational history are outlined, a trend marked by ever-increasing amounts of digitized information (especially web based), falling digital storage costs, a move to the cloud, and a corresponding increase in computational power to process these sources. Following this, the article uses a case study of an early born-digital archive at Library and Archives Canada – Canada’s Digital Collections project (CDC) – to bring some of these problems into view.”


Wow! Apparently WordPress now powers 25% of the Web (including this here Web site.) “The latest data comes from W3Techs, which measures both usage and market share: ‘WordPress is used by 58.7% of all the websites whose content management system we know. This is 25.0% of all websites.’ While these numbers naturally fluctuate over the course of the month, the general trend for WordPress has been slow but steady growth.”

Collins has declared “binge-watch” the word of the year. “Meaning ‘to watch a large number of television programmes (especially all the shows from one series) in succession’, it reflects a marked change in viewing habits, due to subscription services like Netflix. Lexicographers noticed that its usage was up 200% on 2014.”


Comcast is having 200,000 customers reset their passwords but says it wasn’t hacked. “[A] package of personal data, including the e-mail addresses and passwords of Comcast customers, was listed for sale for $1,000 on a Dark Web site that was also marketing a number of other questionable goods. The Dark Web is a collection of sites that are publicly accessible but cannot be found by search engines.”

Good morning, Internet…


Flooding, The Guggenheim, Cylinder Recordings, More: Monday Buzz, November 9th, 2015


The Cumberland River Basin flood of 1939 now has a digital archive of images. “Tammy Kirk, U.S. Army Corps of Engineers Nashville District librarian, scanned more than 200 photos from the 1939 flood so that academics, members of the press, genealogists and local historians can research and utilize the images.”

The Guggenheim has launched its first online exhibit. But it’s a bit more than you might expect. “Troy Therrien, curator of architecture and digital initiatives at the Guggenheim, has a different approach for thinking about the role digital technology plays. He believes rather than incorporating digital works into the analog status quo, museums should be rethinking the architecture of exhibitions altogether. This is why last week, Therrien launched Åzone Futures Market, the Guggenheim’s first digital exhibition that allows visitors to invest in technologies of the future.”


The University of California Santa Barbara has relaunched the Web site for its digital archive of cylinder recordings. There are over 10,000 recordings in this collection, spanning the late 19th century to the early 20th. Note that these recordings capture the attitudes of our culture in that time, and therefore some of them would be considered both racist and offensive today.

Google Calendar now has a “trash” feature. “If you have edit rights for a given calendar in the web app, you will be able to click on its dropdown menu and access its Trash folder. From there, you can restore deleted calendar events or delete them forever.”

YouTube has added VR features to its Android App. “The company added support for virtual reality videos to its Android app Thursday and announced that any YouTube video can now be viewed with the company’s Cardboard virtual reality headset. The changes are limited to YouTube’s Android app for now, but iOS is coming soon, Google says.”


Ancestry is apparently making some of its military records free through November 11. From the NGS: “ is making some of its military collection FREELY accessible November 6-11, 2015. This is in honor of Remembrance weekend (Canada and England) and Veterans Day (US). I only checked these three countries. There may also be similar free access for other countries. Please do check and let us know where else you were able to gain free access.”

Researchers at Harvard and MIT have done a study to discover what makes infographics memorable. “In a new study that analyzes people’s eye movements as they look at charts, graphs and infographics, researchers have been able to determine which aspects of visualizations make them memorable, understandable and informative. The findings reveal how to make sure your own graphics really pop.” The article has some hints, but apparently the data are also being released in a database.

Do you have a page on Facebook? Ever wonder how the “Insights” statistics work? Yoast has you covered. “This post is about all the information that can be found in Facebook Page Insights, and what we feel does and doesn’t make sense to look at for most of our visitors. It all depends on the kind of website you have, of course.”

James Tanner of Genealogy’s Star test-searched for an ancestor on several different search engines, including Excite, Dogpile, and Lycos. I would love to see this search run with additional, more current engines like Gigablast, Startpage/Ixquick, and maybe Yandex (which does have an English interface).


Google has awarded grants to groups fighting racism. “The technology giant’s philanthropic arm chose organizations in the San Francisco Bay Area taking on systemic racism in America’s criminal justice, prison and educational systems, says Justin Steele, who leads’s Bay Area giving efforts.”

You may have heard of a popular blog called Grantland, which was run by ESPN. ESPN announced its shutdown late last month and the Internet Archive leapt into action to get the site archived. “You can see a visual representation of this effort if you look at the past few months of Grantland archives in the Wayback Machine, which crawls and preserves pieces of the web for the Internet Archive.”

From MIT Technology Review: Google Aims to Make VR Hardware Irrelevant Before It Even Gets Going. “Smartphones have sidelined digital cameras and other special-purpose devices. Now Google thinks mobile phones will shove virtual-reality headsets like the Oculus Rift into the shadows, too.”


Oh eww. There’s apparently a new kind of ransomware that holds entire sites for ransom. “This latest criminal innovation, innocuously dubbed ‘Linux.Encoder.1’ by Russian antivirus and security firm Dr.Web, targets sites powered by the Linux operating system….Typically, the malware is injected into Web sites via known vulnerabilities in site plugins or third-party software — such as shopping cart programs. Once on a host machine, the malware will encrypt all of the files in the ‘home’ directories on the system, as well as backup directories and most of the system folders typically associated with Web site files, images, pages, code libraries and scripts.”

A citizen of Scotland has been indicted for a Twitter-based stock manipulation scheme. “According to the indictment, [James Alan] Craig, 62, of Dunragit, Scotland, allegedly set up Twitter accounts using names similar to real market research firms for the purpose of manipulating stock prices. Craig issued tweets with false and fraudulent information about publicly-traded securities, causing the price of the securities to rapidly decline. Craig then bought securities of the targeted companies through his girlfriend’s brokerage account and later sold them at a higher price per security. Craig’s actions are alleged to have caused more than $1.6 million in losses to shareholders.”

The FCC has announced that it can’t force Google and Facebook to stop tracking their users. “The Federal Communications Commission said Friday that it will not seek to impose a requirement on Google, Facebook and other Internet companies that would make it harder for them to track consumers’ online activities.”

Wow, sounds like there’s some really horrible Android malware out there. “Lookout has noticed a trend toward Android malware that masquerades as a popular app, but quietly gets root-level access to your phone and buries itself deep in the operating system. If that happens, you’re in serious trouble. Unless you can walk through loading a fresh ROM or carefully modify system files over ADB, it may be easier to just replace the device, or have your phone company reflash it — a simple factory reset won’t get the job done.”

Google says a recent Samsung Galaxy phone has a whole host of bugs. “Google has revealed that Samsung’s flagship Galaxy S6 Edge Android smartphone suffered 11 ‘high impact’ security issues that were introduced by the company’s customisation of Android. Of the 11 bugs that were found in a week-long focus on Samsung’s device by Google’s Project Zero security bug hunting team, some could allow hackers to take over the device and steal personal data.” Looks like most of them have already been fixed.


Interesting reading from Exploration in science and ranking journals by novelty. “What is the problem with rankings based on citations? For one, they do not depend at all on what kind of science is being pursued – they make no distinction between novel and conventional science. Though a highly cited paper might play with novel ideas, there is no intrinsic reason for it to do so….To address this problem, we recently developed a new journal ranking approach that rewards playing with new ideas, rather than influence (Packalen and Bhattacharya 2015). Journals are ranked based on their propensity to publish articles that build on ideas that are relatively new. Journals that publish articles that build only on well-established knowledge – no matter how influential – are not rewarded in our ranking.”

Good morning, Internet…


NANFWRIMO #4 & #5: Strategies for Low-Stress Web Monitoring

Note: this article is so long that I’m counting it as two days’ worth of NANFWRIMO. Next article on Monday.

The Web is enormous. Billions and billions of pages, if I can get all Carl Sagan on you, with more being added every second. When you’re trying to monitor these new Web pages for information, the potential flood of data can be daunting. So many pages, irrelevant to your topic but glomming on to your search terms! So much spam! How can you possibly find the good stuff amidst all the dreck?

The truth is this: the Web is still great for finding new and useful information. With the increase in its size, however, and the amount of gaming and spamming going on, you have to think strategically about how you’re going to monitor its new pages. You can’t just put a couple of keywords in a Google Alert; you’ve got to be a bit more savvy. That’s what this article’s about: seven tactics to give you a rich flow of data from new Web pages without wasting a lot of time and tears on off-topic pages, junk, and spam.

For the purposes of this article I’m using Google Alerts. It has its shortcomings, but in terms of completeness I think it’s the best option out there for Web monitoring. (If you have alternative suggestions please, put them in the comments!) If you’ve got a Google Account, you’ve got access to Google Alerts – you can access it at and it’s free.

(If you’ve never used Google Alerts before and need an overview, WikiHow can help you out.)

1. Choose Your Search Terms Wisely

If your search terms are off, Google Alerts isn’t going to do much for you, so this first tip is the most important: Get your search terms right. Your goal here is a balance between the very specific terms that would get you exactly what you want but would very rarely come up in a Web page, and the broader terms that would get you interesting results but also a lot of uselessness to eliminate.

Think about your topic terms. Write them out. Which ones do you see a lot in the resources you review? Which ones do you rarely see? Google Alerts will give you a preview of results when you set up an alert. Test your terms. Which ones are giving you resource-rich, “meaty” results? Which ones are junk?

Sometimes it’s difficult to narrow down your search terms. For example, I want to learn about new digital archives. “Digital archive” is a general term, but at the same time it would be difficult for me to narrow my focus and still get a good spectrum of results. I can use more obscure terms, like “digital library,” but that’s more changing vocabulary than getting specific. If you really can’t think of a way to create pinpoint search terms for your topic, don’t worry; we’ll look at other ways to narrow your search in this article.

Even if you can’t come up with focused search terms for your topic of interest, see what you can do by adding time-related words to your query. For example, if I set up a Google Alert for “search engine”, I would spend hours going through junk results. But what about “new search engine”? Much better results. Fewer of them and they’ll be more timely. Maybe recent, updated, or latest can be integrated into your search terms and phrases.

2. Limit by Domain

You just can’t limit your search terms. Your topic is either very broad or defies specific description. That’s okay. Shift your attention to limiting where Google Alerts will find results.

Because my interests are in digital archives, online databases, etc., I find that focusing on certain top-level domains gives me quality results in workable quantities. Like this:

“digital archive” (site:edu | site:gov | site:mil | site:museum |

(Google does not need the parens to correctly parse this search. But it helps me organize my thinking and, in the cases where I’m doing very complex searches, helps me understand them when I go back and review months after creation.)

In this case I’m doing the fairly broad search for digital archive but limiting the results to only a few top-level domains, including .edu, .gov, and municipal/government sites in the state of North Carolina. This gives me a manageable flow of results.
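If you keep several alerts that share the same domain restrictions, it can help to generate the query text instead of retyping it. Here is a minimal Python sketch of that idea. The function names site_group and domain_query are my own illustrations, not part of any Google tool, and the domain list is just an example; the output is plain text you would paste into the Google Alerts box.

```python
def site_group(domains):
    """Join domains into an OR group such as (site:edu | site:gov)."""
    return "(" + " | ".join(f"site:{d}" for d in domains) + ")"


def domain_query(phrase, domains):
    """Quote the phrase and restrict it to the given domains."""
    return f'"{phrase}" {site_group(domains)}'


# Example domains only; swap in whichever ones suit your topic.
query = domain_query("digital archive", ["edu", "gov", "mil", "museum"])
print(query)
```

Keeping the domain list in one place means you can add or drop a domain once and regenerate every alert that uses it.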

Protip: When I limit my Web monitoring to only a few top-level domains, I do it with the knowledge that I am going to miss resources made available on .com and .org sites as well as other domains. I feel comfortable doing that because I also monitor news, social media, and RSS feeds. You must monitor multiple aspects of the Internet because some tools and terms will work better than others.

3. Limit by Area of Page Monitored

The messy HTML of a Web page is not as delineated as, say, an XML file. But it still has an identifiable title. When you’re frustrated with too many useless results from a Google Alert, give it a laser-like focus by searching only the titles of pages:

intitle:“new search engine”

You have instantly cut down the data pool that Google Alerts is searching by at least 90% when you’re monitoring only Web page titles.

Now, can you combine the ideas of narrowing your search by domain AND by page title? Yes. In fact, I’ve found this a great way to monitor Reddit. Visiting the site and browsing, even using the search engine, takes a lot of time and doesn’t find much. But this search in my Google Alerts generates a trickle of great resources I rarely find anywhere else:

intitle:“new tool”

Or maybe I’ve got a keyword that’s kind of general so I want to limit my search to just blogs:

intitle:“new web” ( | | site: |

Do you see what you’re doing here? You’re using special syntax to trim the billions-of-pages Web into manageable chunks. But you don’t even need special syntax to do that; you can also limit your results by excluding keywords.

4. Your Exclusions are As Important As Your Inclusions

When it comes to Web searching this isn’t said often enough: What you exclude is just as important as what you include. If you’re trying to monitor the Web and you keep getting the same old junk, consider that maybe your included search terms are fine and you need to exclude some words from your alerts.

I want to monitor news releases at certain top-level domains – basically I’m looking for press releases. At the same time I don’t want to get re-indexed archive pages, or job announcements. So I use this:

intitle:news release press contact -intitle:“blog archive” -inurl:jobs -intitle:“job vacancies” (site:edu | site:gov | site:mil)

This gets me what are generally resource announcements and skips job appointments and personnel-type stuff. When you find your Google Alerts are getting you useless results that are of a specific type, go through them and see if there are any keywords that you can use to exclude that class of results entirely.
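Once a query carries several exclusions it gets hard to read, so you may find it easier to keep the included terms, excluded terms, and domains as separate lists and assemble them. The following Python sketch shows one way to do that. The function build_alert_query is a hypothetical helper of my own, not a Google API; it only produces the text of a query like the one above.

```python
def build_alert_query(terms, exclude=(), domains=()):
    """Assemble included terms, excluded terms, and an optional site group.

    Each excluded item is prefixed with "-", the same minus-sign
    exclusion used in the query above.
    """
    parts = list(terms)
    parts += [f"-{item}" for item in exclude]
    if domains:
        parts.append("(" + " | ".join(f"site:{d}" for d in domains) + ")")
    return " ".join(parts)


query = build_alert_query(
    terms=["intitle:news", "release", "press", "contact"],
    exclude=['intitle:"blog archive"', "inurl:jobs", 'intitle:"job vacancies"'],
    domains=["edu", "gov", "mil"],
)
print(query)
```

When a new class of junk shows up in your results, you add one item to the exclude list and regenerate the query rather than editing a long string by hand.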

Speaking of special syntax and language, did you notice I used the special syntax inurl in that last example?

5. Inurl: Is Tricky But Useful

The inurl syntax searches for a character string in a page URL. Using it judiciously can mean short searches which yield information-rich results:

intitle:“digital archive” inurl:library site:edu

But this syntax requires caution. I can use inurl:library in the search above because many university libraries delineate their sites this way. It’s not quite standard, but it’s common enough that it works. You may find that your inurl searches are not standard or common enough, and end up eliminating a lot of resources you’d otherwise find. You’ll have to experiment.
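One way to experiment before committing to an inurl filter is to check the term against URLs of resources you already know are good and see how many it would drop. The Python sketch below does that with a simple case-insensitive substring test, which is only my approximation of inurl; Google's real matching may behave differently. The URLs are made-up placeholders on example.edu, and split_by_url_term is a hypothetical helper.

```python
def split_by_url_term(urls, term):
    """Split URLs into those containing the term and those that do not.

    Assumption: a case-insensitive substring test stands in for inurl:.
    Google's real matching may differ.
    """
    term = term.lower()
    kept = [u for u in urls if term in u.lower()]
    dropped = [u for u in urls if term not in u.lower()]
    return kept, dropped


# Hypothetical URLs for illustration only.
known_good = [
    "https://library.example.edu/digital-archive",
    "https://www.example.edu/library/collections",
    "https://lib.example.edu/archives/new",
    "https://www.example.edu/special-collections",
]

kept, dropped = split_by_url_term(known_good, "library")
print(f"kept {len(kept)}, dropped {len(dropped)}")
for url in dropped:
    print("would miss:", url)
```

If the dropped list contains a lot of pages you would have wanted, the term is probably not common enough in URLs to be a safe filter for your topic.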

Of course, sometimes your keywords are just too general for the Web, and you have to take strong measures to winnow down the data pool you’re monitoring.

6. Shift Your Focus to a Smaller Data Pool

Google Alerts monitors not only the Web but also news, blogs, what it calls “discussions,” and a few other Internet subsets. If you find, even after experimenting and revising your search, that you’re still getting too many non-useful results, consider shifting your Google Web alert to a Google News alert. You won’t get as many results, but they will be more focused and generally more timely.

Protip: Generally I let my Google Alerts monitor everything – News AND blogs AND Web and so on. It’s best to let these alerts run together. However, if you find that you’ve got a search term that attracts a lot of spam or off-topic results, setting your Google Alerts to bring your results only from News sites will generally get that topic back on track.

7. Regularly Revise and Update

Did you read Harry Potter and the Goblet of Fire? Let me channel Alastor Moody for a moment:

CONSTANT VIGILANCE!


The searches you ran in 2001 probably did not look like the searches you ran in 2010, which in turn probably did not look like the searches you run now. Even if your topics of interest don’t change, the resources you would use online to access and discuss them will change. Make sure that you’re regularly going through your alerts to see if you’re using keywords that are outdated and will limit your results.

At the same time, check to see if you might need to add some keywords. Since I cover a lot of Internet resources in ResearchBuzz, I want to be informed about new articles, tweaks, and updates about a number of tools that have been around for less than a couple of years.

When livestreaming became a big deal, I made sure I was covered in my Google Alerts:

(intitle:periscope | intitle:snapchat | intitle:meerkat | intitle:vine ) (site:edu | site:gov | site:mil)

When I discovered I was finding good content about new Facebook extensions buried in pages about other things, I made sure that topic got more prominence:

“new facebook tool”

When I started getting more and more results from a single site, I decided to break those results off into their own alert:

(“digital archive” | “online database”)

At the same time I might stop monitoring other topics. Bookmarking, for example, is not the topic it used to be. I’ve noticed that there’s less talk about online museums and more talk about digital archives. Digital library is another search term that seems to have gained in prominence lately, if my Google Alert results are any indication.

When you monitor topics regularly over a long period of time you will get a familiarity that’s not quite conscious. You’ll notice patterns. Something will bug you and you’ll realize you haven’t seen a particular keyword mentioned in quite a while. Another time you’ll see someone’s name and realize you’re seeing that name affiliated with a certain topic an awful lot — even though that name is not particularly famous or talked about. Use this developed knowledge — I’ve called it “spidey sense” — to refine your Google Alerts.

The size of the Web can be intimidating. Use the tools Google gives you — special syntax, the ability to exclude words, and restricting your search to only certain areas of a page — to get your results down to a usable level that lets you spend more time using what you find, and less time wading through page after page of useless results.