Blog Archives
Google Instant: Breaking It, Gaming It, and The Future
If you were within fourteen miles of the Internet yesterday, you probably heard about Google’s new search engine feature, Google Instant. Google announced it about 1pm EST, and for the rest of the day my Twitter and Hootsuite feeds were filled with opinion, derision, experiments, and arguments over whether Google Instant means the death of SEO or not. (I am strictly on the user end of search and don’t involve myself with SEO much — but no, Google Instant does not mean the death of SEO. It’ll change, certainly, but not die.)
What It Is
Google Instant is available at http://www.google.com/instant/, or just from the regular Google page. I thought you had to be logged in to a Google account to use Instant, but a quick test shows that this does not appear to be the case. Start typing your search. Google will try to guess what you’re looking for and provide search results as you type. If you start typing Roll Over Beethoven, Google will provide you with relevant results even before you finish typing the phrase.
Video and audio content are integrated with the results, and related searches are at the bottom of the page. In fact, aside from the instant refreshing, Google Instant’s results look very much like non-instant search results. The complaint about Google Instant that I’ve read the most is that Google’s anticipated searches and refreshed results are irritating and unwelcome. It’s easy to turn that off. In fact, it’s pretty easy to make Google Instant not work.
Making It Break
You might not want Google to decide what you’re searching for as you’re typing. You can stop Google doing that by putting a + in front of a query as you begin to type. If you do that, Google will refresh its results to ONLY what you’re looking for, and not what Google THINKS you’re looking for. (Depending on what your query is, this can also be a fun exploration down the rabbit hole; see yesterday’s article Turning Google Instant into a Quick Head Trip for details.)
Google Instant does not play nicely with certain search syntax, either. The query intitle:boogie works fine (though starting a query with syntax also seems to break Google’s suggested search results) but if you add the site: syntax, Google Instant gives up and stops. Try the query physics site:edu and Google stops showing any results at all, with the morose message “Press Enter to search.” (I think that’s Googlese for “You have completely confused me and I’m not going to play anymore.”)
Specifically excluding words seems to stop Google’s anticipatory behavior as well, though starting a query with an exclusion makes Google just sit there. (You can’t search for -mice, for example, and get a result.)
Google Instant retains a behavior that its old counterpart had which irritated me a lot. It tries to correct your search even when you know what you’re looking for. For example, if I start looking for Carolynn Jones, Google will decide that of course I must have meant Carolyn Jones, and will give me the results for the actress. You can stop Google doing that the same way you can stop the autosuggestions, by putting a + before your first query word.
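If you would rather script these experiments than type them by hand, here is a minimal Python sketch of the query forms described above. The helper names (literal and search_url) are my own invention, and the only thing assumed about Google is the familiar https://www.google.com/search?q=... URL shape. Whether Instant reacts to a prebuilt URL the same way it reacts to live typing is something I have not tested.

```python
from urllib.parse import urlencode


def literal(query):
    """Prefix the query with + to ask for exactly what was typed."""
    query = query.strip()
    return query if query.startswith("+") else "+" + query


def search_url(query):
    """Build a search URL; the /search?q= shape is the only Google detail assumed."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# The three kinds of query discussed in this section.
print(search_url(literal("Carolynn Jones")))  # + stops the spelling "correction"
print(search_url("intitle:boogie"))           # syntax that still works
print(search_url("physics site:edu"))         # syntax that made Instant give up
```

Note that urlencode escapes the + and the colon for you, so the operators arrive at the other end intact.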
Of course, you don’t have to go through all this trouble if you simply don’t want to use Google Instant; you can turn it off by going to Google Preferences, or by looking for the “Instant is on” pulldown menu and changing it to “Instant is off.”
Gaming It
Google has always had particular search quirks which are still available under Google Instant, only now they’re more fun. Want chicken for dinner? Looking for recipes? Interested in broiling but willing to consider other options? Make broiling the heaviest word in your search — that is, use it multiple times. If you use a word multiple times in a query, Google will, I think, look for the word multiple times in the search results, changing how they look. Typing chicken broil broil broil broil broil broil broil broil broil changes the results steadily and drops the result count lower and lower. Type in one more broil and the result count jumps back up again.
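Typing broil nine times gets old fast. A small sketch like the following could generate the whole series of queries so you can paste them in one after another and watch the counts. The function name weighted_queries is hypothetical, and nothing here talks to Google; it only builds the strings.

```python
def weighted_queries(base, heavy, max_repeats):
    """Return queries that repeat `heavy` 1..max_repeats times after `base`."""
    return [" ".join([base] + [heavy] * n) for n in range(1, max_repeats + 1)]


for query in weighted_queries("chicken", "broil", 3):
    print(query)
# chicken broil
# chicken broil broil
# chicken broil broil broil
```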
The word order will also change how your results appear. beer chicken shows a different set of results than chicken beer; the thing is it’s hard to switch a query word order without retyping it. (Is this something a Greasemonkey script could do?)
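A real Greasemonkey script would be written in JavaScript and would need to hook into the search box, which I have not attempted. The reordering part on its own is simple, though. Here is a rough Python sketch (word_orders is a made-up name) that lists every ordering of a query's words:

```python
from itertools import permutations


def word_orders(query):
    """Return every ordering of the words in `query`, original order first."""
    words = query.split()
    return [" ".join(order) for order in permutations(words)]


print(word_orders("beer chicken"))
# ['beer chicken', 'chicken beer']
```

The list grows quickly (three words give six orderings, four give twenty-four), and a query with a repeated word will produce duplicate entries, so this is only practical for short queries.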
Google also has a sort-of wildcard/proximity functionality using the * symbol. It used to be that one * equaled one word; therefore searching for “three * * * mice” would find the word three within three words of mice. Now it’s more ambiguous and I generally only use one * at a time. Google Instant seems to search the * okay; if I look for “I am * man” Google doesn’t suggest “I am Iron Man,” but rather what appears to be a song called “I am Man.” Furthermore, Google finds both “I am Man” phrases and phrases with an additional word, like “I am That man.”
Google doesn’t appear to have much in the line of stop words, either. +the will show you a 14 billion result count, and +a will show you an 18 billion result count.
For the Future
Google Instant is being touted as a big time saver. Eh, I don’t see it, not from the desktop. It’s not that I don’t do that many searches (actually, I tried to figure out how many searches I do on a daily basis and kind of startled myself). It’s just that it doesn’t take me nine seconds to type a search query to start with. And if many people are typing in only one or two words, I can’t see that it’s taking them nine seconds either.
But I can see where instant search would be really useful on a mobile phone. Searching on a mobile is very slow. If I had enough of a data connection on my phone to allow for suggestion and search result preloads, that would be fantastic.
I would also like to be able to slant Google Instant results. Say I’m researching lung transplants. I would like to be able to go to a relevant page, like a Wikipedia article, and tell Google to use the context of that page to slant its suggestions and results as I search. Not sure if that’s what Google Instant is really for, but if you want to talk about something that would be a time-saver….
Around the Web
As you might imagine Google Instant has precipitated a lot of comments from all quarters of the Web. For a view of it from inside Google, check out Matt Cutts’ blog post. Alexandra Petri says it makes her “nervous and jittery.” (Well, Google did call a recent update “Caffeine.”)
Microsoft Watch considers what Google Instant means to Bing. (Bing seems to me not to be going for search innovation, but instead to be focusing on intelligently integrating content into regular Web search. Which is smart, and which it’s doing well.) CNET wonders if Google Instant will be useful to mobile users. (Um, YES.) The Atlantic looks at the pros and cons of Google Instant and busts out the Gloom and Doom Burgers.
Not me; I’ll stay here peacefully chewing the Cheese Sandwich of Context. Yes, Google Instant is different, and it’s going to change the way we behave when we search. Whole philosophies will spring up around making use of this new tool. But we must absolutely not forget one thing: Google Instant is one step. One step in the evolution of our relationship with this huge data collection we’re gathering. In two years things will be different. In ten years things will be radically different. Yes, we will change as search changes. But it is my firm belief that search will always evolve more to accommodate us, and not the other way around.
Turning Google Instant Into a Quick Head Trip
As you might imagine I’ve been playing with Google Instant for about an hour now and I have lots and lots to write about. Expect a big ol’ article tomorrow. But in the meantime, I wanted to give you a quick tip that can turn Google Instant into a strange ramble through the Web.
Normally when you start typing in a query Google will try to anticipate what you’re looking for and will offer you search results based on its best guess. You can turn that behavior off if you put a + in front of the first query word. When you do that Google will refresh constantly as you type until you a) reach the end of your query or b) type something it has no results or suggestions for.
I took a poem from Pablo Neruda: The Tree Is Here, Still, In Pure Stone, and the line fire in the forest, blaze of the dust-cloud. Typing +fire in the forest, blaze of the dust-cloud slowly into Google (well, slower than I usually type) found the results refreshing with every word typed. News about wildfires flickered by a definition of “Fire in the hole,” which was followed by a kid’s book, Blaze and the Forest Fire — and I never actually found any results that had to do with Pablo Neruda’s poem.
Starting +The Count of Monte Cristo took me to lots of images of Count Von Count of Sesame Street. Typing in the lyrics to a Laurie Anderson song took me to Flickr pages, lots of sun-related music videos, and finally Sharkey’s Day. Typing in primary colors took me to Simply Red, then people, then huge chromatic explanations.
It’s like the biggest, most open free association game ever. If I could somehow hook this up to Wolfram|Alpha’s random word features, I think I might be able to make my cerebral cortex explode. Wait, that’s not a good thing, is it…
Google Revamps Its Translate Tool
Google announced last week some updates to its translate project. I normally don’t use Google Translate outside the regular Web interface, so I’m sure I’ve missed a lot as it’s evolved.
Google Translate lives at http://translate.google.com/. It looks a bit different from what I remember, with more access to tools “up front” and an overt language autodetect for anything you might type into the search box. And of course you can translate a document as well. Over fifty languages are available for translation. But as I got further into looking at the new Google Translate, I discovered that all the good stuff, once again, lies outside of Google Translate’s home page.
For example. Google Translate Search. When you run a search you can choose to have results from Google Translate Search, which you can find at the very bottom of the search options toolbar. Google will decide which languages are appropriate for your search, run the search, report which languages it used, and translate the search results for you.
I searched for pierogi. Google decided I should have results from Polish and Lithuanian searches, and gave me translated pages of results. I didn’t get too much into them, but the snippets indicated perfectly acceptable translations for machine-level work (of course, they were mostly recipes). I was amused to note that one of the results was the Polish Wikipedia translated into English. (There were far, far more Polish results than Lithuanian.)
If you don’t want to get that deeply into non-native-language search, keep an eye out for the Translate this Page links by the search results. Whenever you find a page that’s not in your native language, Google will give you a translation link.
Finally, Google has made Google translation a part of its shortcuts. You can do short translations from the Google search box. I found things like hello in German worked fine, but sometimes I had to specify using the word translate, like the query translate how are you in Swahili.
Note that you only get the translation. If you want to know what it sounds like, you’ll have to click on the link and go to the Google Translate page, where you’ll get a link to hear the translation.
I’m going to find the translate shortcut useful, but Google Translate Search’ll be pretty fun too, if I can remember to use it.
Google Now Providing Many More Domain Results
I’ve been hearing rumblings about this from various points on the Internet since last week, but now we have official information from Google. Google announced last week that search results would now show multiple results from the same domain. It used to be that the most results you would get from one domain would be two. Now you’ll get as many as Google thinks are appropriate.
This won’t be for all queries, of course. Google will only roll out the multiple results for queries that seem to indicate an interest in a single domain. For example, say I was interested in digital cameras on Amazon. I can do the query digital cameras Amazon (note that I don’t have to specify on Amazon or at Amazon or anything like that) and get the following page of results.
There is a sponsored link at the top of the results, and a pointer to shopping options, but aside from a few results at the very bottom of the page everything comes from Amazon.com.
Now this is going to be very handy if you are in fact looking for results from one place, but I was worried about doing actual company research. Would Google make it difficult to find company-related information that wasn’t on that particular company’s domain?
I did an experiment, searching for Target pharmacy. The first eight of the results were on the Target.com domain, Google’s new feature working as you’d expect.
Then I tweaked the search, adding one word, so the query was now Target pharmacy sucks. (NOTE: This was for an experiment only. I’ve used a Target pharmacy once to buy Sudafed and it was a perfectly acceptable experience.) This time none of the results were from Target, most were from discussions and surprisingly one result was pro-Target (the post was complaining a different pharmacy sucks.) I tried using softer words instead of sucks — bad, problem, and trouble — and only trouble brought me several results from the Target.com domain.
Does Google’s change mean that we’ll never see “two results from a domain” again in search results? Absolutely not. I wondered about that and found results in the old fashion in just a few minutes. I did a search for facebook researchbuzz and found two sets of two results from, well, Facebook and ResearchBuzz.
As a searcher, the lesson I’m going to take away from this change is to avoid using company and domain names in searches unless I really want to slant my results that way. And if I do have to use these names in a search, I’ll counterbalance them with descriptive terms that hopefully keep the results from being limited to one domain. But I don’t think this adjustment is going to radically change the way I do searching.
Google Offers OCR for Incoming Docs
How the heck did I miss this? Google announced last week that now when you import files into Google Docs (JPEG, GIF, PNG, or PDF) you have the option of running optical character recognition on them. This is really huge; this means that instead of just static files, you’ll be able to upload sets of words with which you can do further work. There appear to be some limits on what you can upload/convert (more about that shortly) but I find this really exciting.
When you open up Google Docs and choose Upload, you’ll get a screen to select the file you want to upload with an option for OCR like the one you see here. I didn’t have anything to OCR handy, so I went to Google Books and grabbed Analytical Psychology by Carl Gustav Jung. I set it to upload — it’s about 9MB — and Google Docs chugged away on it for several moments. After waiting a while Google Docs told me “Unable to Convert Document.” Well, phooey. So I went back to Google Books and tried again, this time with Damon Runyon’s Rhymes of the Firing Line. That one was a lot smaller — a little under 2MB.
That one uploaded fine, but it only got the disclaimer from Google Books and the title page, because apparently there’s a limit to how much of a PDF document Google will OCR. >facepalm<. (Another limit that I haven't mentioned: the OCR currently supports only English, French, Italian, German and Spanish.) Okay, I decided to try one more time, this time with a screen shot of a page from a Julian Huxley article from a 1965 Rotarian. (If you do a search of Google Books for the word psychedelic in magazine content available in full-text, this is the earliest result.) Google Docs processed it very quickly, but apparently didn’t like the two-column format, as the OCR was very poor. Here’s a sample:
Thc organization of power in competitive national units has reached iu; logical conclusion in the confron-lation of two grcat uppnscd blocs immobilized in thc grip of the cold war. Advance in thc tischnical of weaponry has given us weapons so powc rful than they cannon-we hope-be used: meanwhile na-lions are spcnding so much on amwmcnls that there is not enough lo mccl more than a fraction of other und more important psychosocial needs, Increasing emphasis on material products has lcd to wasteful ovcrcxploilatiun of Nature and a tllrcnlcncd shortage of natural rcsollrccs.
I guess what I’m getting at is that the Google Docs OCR is as it stands a bit on the fickle side; there are some size limits and apparently some layouts work better than others. But I am still excited about this. If it evolves to be a little less finicky and have fewer limits it’ll be an incredibly powerful tool for organizing PDF content. Personally I can’t wait.
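If you want to put a rough number on "very poor," one option is to compare a stretch of OCR output against a hand-corrected version. Here is a small Python sketch using the standard library's difflib. The "corrected" string is only my own reading of the first clause of the sample above, not text checked against the original magazine page, and a single clause says little about the whole page.

```python
from difflib import SequenceMatcher

# First clause of the OCR sample, exactly as Google Docs produced it.
ocr_text = ("Thc organization of power in competitive national units "
            "has reached iu; logical conclusion")

# ASSUMPTION: my best guess at what the page actually says.
corrected = ("The organization of power in competitive national units "
             "has reached its logical conclusion")


def similarity(a, b):
    """Return a 0..1 character-level similarity score between two strings."""
    return SequenceMatcher(None, a, b).ratio()


print(round(similarity(ocr_text, corrected), 3))
```

A score of 1.0 would mean the two strings are identical. Running the same comparison over uploads with different layouts would be one way to see which formats the OCR handles best.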
Google, Now With Caffeine
Google announced yesterday its new Web indexing system, which it says in the announcement “provides 50 percent fresher results for web searches than our last index, and it’s the largest collection of web content we’ve offered.” (Though not an exact document count, as Google stopped providing those a long time ago.)
Google has an illustration to show how the Web is being indexed. Before, it said, the old index had several layers. Now, the Web is analyzed in small portions and the index is being continually updated.
I’m afraid I misinterpreted that as “Before, Googleman stood beside a neatly-stacked index of information and had a fairly good idea of what was going on. Now, Googleman stands helplessly inside a maelstrom of content, constantly getting bombarded by multimedia.”
But I kid Google, though I wasn’t kidding about the maelstrom. Caffeine is huge. From the announcement: “If this were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles.”
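Out of curiosity I checked the iPod arithmetic in that quote. This is just back-of-the-envelope Python using the figures from the announcement, plus the standard 1,609.344 meters to the mile; the variable names are mine.

```python
# Figures quoted in Google's announcement.
total_gb = 100_000_000   # "nearly 100 million gigabytes"
ipods = 625_000          # "625,000 of the largest iPods"
stack_miles = 40         # "more than 40 miles"

METERS_PER_MILE = 1609.344

# Storage each iPod would need to hold.
gb_per_ipod = total_gb / ipods

# Length of each iPod if 625,000 of them span 40 miles end-to-end.
cm_per_ipod = stack_miles * METERS_PER_MILE * 100 / ipods

print(gb_per_ipod)            # 160.0
print(round(cm_per_ipod, 1))  # 10.3
```

So the quote implies 160 GB per iPod, each a bit over 10 centimeters long, which at least hangs together as a comparison.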
I remember when Google’s index went to a billion pages and that blew my mind. I don’t know what Google’s index stands at now, but I can do some guessing. I can do a search for a, for example (apparently it’s not a stop word anymore) and get 18,020,000,000 results at this writing. Do the same search for the last 24 hours and get 17,070,000,000 results. A search for the? About 11,490,000,000 results, with, very strangely, about 16,610,000,000 for the last 24 hours (sometimes Google’s numbers are odd.) site:com? 13,440,000,000. site:google.com? About 108,000,000 results.
As you know I’m very much into information trapping. I spend a lot of time figuring out how to find out about new places and resources without having to wait for other people to find them for me. Google Alerts on Google’s own index helps me a lot; I have many alerts set up to let me know when Google indexes particular kinds of content. (Are you interested in learning more about how to do this? Leave me a comment or drop an e-mail and I’ll write an article.) I have not seen in the last 24 hours any big change in the kind or amount of content that I’m getting, but I’ll keep an eye on it and let you know how it changes, if at all.
Google Offers (Partial) Encrypted Search
Google announced last week that it was starting to do full encryption for some search services. This shouldn’t be surprising to those folks who use GMail, as Google started doing full encryption for that quite a while ago.
You can now go to https://www.google.com/ and have full SSL encryption when you’re doing regular Web search. Emphasis on regular Web search; Google notes “when you search using SSL, you won’t see links to offerings like Image Search and Maps that, for the most part, don’t support SSL at this time.” The search will also be a little bit slower, as the encryption has to be processed (I doubt you’ll even notice.)
If you use a lot of open WiFi connections or public hotspots, you’ll appreciate the encryption of the Web search. But note that this is more about data security while you’re out among strangers and less about your privacy in relation to Google. Google still does retain search data and does use cookies (you can get Google’s general privacy policy here.) If you are more worried about your privacy in regard to Google, you might want to use Startpage, which is extremely privacy-focused and doesn’t even record user IP addresses anymore (see privacy details here.)
Matt Cutts has a few more thoughts that you can read on his blog.










