Quit Trying to Be the Next Google Dammit, Pt. 2: The Goal Should Be An Internet That Makes Us Better Humans
We have a houseguest, my husband and I. She is staying with us while she receives medical treatment, and will be here for a while.
I am on all my manners. I have almost stopped singing out loud to myself, and talking twee to the cat, and blurting out observations which make sense to me but no one else. I am cooking dinner and keeping the kitchen clean and checking twice a day to make sure there are plenty of clean towels in the linen closet. I do not feel much faith in my powers as a hostess: I am too big and rumpled and introverted and strange and I’m always convinced something will go wrong. I cooked pierogies and the house smelled like fried onions even hours later, and I went in the bathroom and cried because everything in the house would smell like fried onions forever and I was the worst person in the world.
Through all this I go back to the Internet over and over again to try to be better. To find good recipes to cook. To do medical research. To figure out how to make our ancient bathroom sparkle. To get rid of the fried onion smell, dammit. To be a more productive person and a more effective hostess for this family member with her blue cane who is so, so patient with me and makes me feel ridiculous for crying over food.
I don’t say to myself that I am using Google because it indexes so many Web pages so quickly and thus and such. I don’t say to myself that I’m searching PubMed because it has so much information organized in such a way. I say to myself that I want to use THIS resource or THAT resource because it’s helping me in doing a job at which I feel completely rubbish. It’s making me better.
Wouldn’t it be wonderful if instead of headlines touting “the next Google” (a phrase which has 2,660,000 matches on Google itself, by the way), stories and Web pages encouraged aspiring companymakers to build the things that make us more capable and stronger? To encourage people to, instead of merely reflecting an existing quo, build tools that will expand horizons and give us new ways of being and lead us to becoming better humans?
… I suppose that now that I have admitted in front of God and everybody to crying over fried onion stink that I should also tell you my secret dream. My secret dream is to have a place to send every bit of information I look at. I read literally hundreds of RSS feeds. I am subscribed to dozens of Google Alerts. And on my perfect day I would be able to match every bit of information to someone who would be delighted to have it.
That’s my particular itch. To direct information to people who could use it. That’s why I spend so much time reading those feeds and alert services — because there are so many great resources out there, and more coming every day, and y’all don’t know them, and that drives me nuts.
If I were building an Internet company, that would be what I would build. A delivery system to tell you about all the beautiful stuff I find. A system that’s so simple and easy to use that I could spend 99% of my time finding and reporting the beautiful stuff and only 1% of the time doing bullshit, which is anything that’s not finding and reporting beautiful stuff.
Well-meaning people would ask me, “Is it going to be like Google? Or Facebook?” And I would say “No no, if either of those worked for me I would be using them now.” And I would make something that worked perfectly for me, no matter what it ended up looking like. And then I would invite other people to play. And if they liked it, away we go! And if they didn’t, well, at least I had solved one of my own problems, yes?
Technology is for the purpose of us. We are not for the purpose of technology. When we aspire to merely imitate an existing structure we are doing ourselves a disservice. Even a better Google is still a Google. But to focus on solving a problem and letting people do better those things that make us so uniquely us — when that is your goal, you have moved outside history and technology becomes merely an element of construction and not a force that bends you.
Earlier this month I read an interesting article in ScienceNOW. It was about how people can recognize how they have changed in the past, but are less good at recognizing how they will change in the future. “Gilbert and colleagues call this effect ‘the end of history illusion,’ because it suggests that people believe, consciously or not, that the present marks the point at which they’ve finally stopped changing.”
I thought this was interesting because it’s a huge blind spot in one’s development as a person and may explain why it’s so hard for people to enact radical change on themselves (and it may also give some hints on how it could become easier to do so.) I also think it may explain how people see current companies and technology.
I thought of this study yesterday when I read an article on Mashable called Free Database of the Entire Web May Spawn the Next Google. It was an overview of a new non-profit that’s making a huge bucket of Web data that people can splash around in. This is great, but not new (ODP data was being used for the same purpose by sites like Oingo, and that was over a dozen years ago) and I found the idea that this might bring about “the next Google” to be as galling as it ever is. Only this time I’m going to write about it because I can’t stand it any longer.
Seventeen years ago this spring I wrote my first book on Internet and search engines. I have been reading and writing about search engines and finding things online ever since. And I would like to bring all this experience to bear and disclose something to you:
One day Google is going to suck.
This is not disrespect. It’s history. The more successful Google gets, the bigger it gets. The bigger it gets, the slower it moves. The slower it moves, the more difficulty it has in responding to rapid changes of technology. The more difficulty… you get the idea. The very fact of a company’s existence and the requirements heaped on it from all sides (from the government, shareholders, customers, employees) eventually coat it in layers of bullshit that have nothing to do with mission and innovation and everything to do with placating someone or other. The more success, the more of that there is. Bureaucratic barnacles.
Because we are always in the present, we can’t imagine the Internet without our right-now-essential tools. But eventually they will not be essential. Eventually the Internet will change enough that they will take a more minor role, specialize to the point that they appeal to a much smaller audience, or deprecate entirely.
HotBot? AltaVista? The Open Directory Project? All once hailed as great innovations, hugely useful, where-would-we-be-without-them, tools of the Internet. And now they all pretty much suck. (Though some people involved, like Rich Skrenta (ODP) and his search engine blekko, have moved on to greater things.)
I’m not saying that tomorrow Google is going to start sucking, and I’m not saying it sucks now. It doesn’t. I’m saying that it can’t be what it is indefinitely no matter how unstoppable and monolithic it looks now. And I’m saying that if you start off trying to “be the next Google,” you are setting yourself up for failure.
There are so many problems of discovery and usage on the Internet that have nothing to do with what Google does right, right now. Searching for podcasts is a pointless nightmare. It’s still hard to find and use “deep Web” resources like those which are found within library catalogs and online exhibits. Natural language searching has gone from being difficult and odd (but somewhat useful) to, in my experience, misunderstanding what I actually want. Special character searching is still a niche for engines like SymbolHound. Translation tools, while better, are still pretty bad. The only Twitter viewing/monitoring tool I can find that doesn’t make me want to punch a wall in frustration is Undrip.
Here’s my point: no matter how pervasive Google is, no matter how unshakable it looks, there are still issues with the way the Internet and the Web work. There are still structures to be invented and innovations to be made. And that will be true forever.
For your success, scratch what makes you itch. Look at the Web/Internet/whatever, see what pisses you off, and address that. Take Common Crawl’s excellent offerings and make your job easier. (Now I’m wondering what Wikia is doing with Grub.) What you do may overlap Google’s endeavors or it may not. But it seems to me you will be much more successful with that approach than by trying to replicate the success of what came before.
Theatre, New Search Engines, Maps, Real-Time Subtitle Translation, More: Morning Buzz, July 24, 2012
Birmingham Rep is getting a digital archive: “The REP 100 website – http://www.rep100.org – will contain more than 3,000 records of The REP’s historic productions – including photographs, letters, documents and other fascinating ephemera from its history and will be made available to the public, many for the first time, next year.”
From TechCrunch: “Ohloh Wants to Fill the Gap Left by Google Code Search”: “Besides code search, Ohloh features an exhaustive directory of open source projects, complete with statistics on how often the projects are updated.”
VentureBeat has an article about a new social search engine: Bottlenose. Going to try to give it a test drive next week.
The Census Bureau has launched a new database on HIV/AIDS statistics. “The database was developed in 1987 and now holds 149,000 statistics, an increase of approximately 10,800 new estimates in the last year, making it the most complete of its kind in the world.”
An e-mail based diary that prompts you with questions and then uses AI to generate more and more specific questions over time? MyFutureSelf sounds like a really interesting tool.
Google has announced lots more detailed maps: “And today, we’re launching updated maps of Croatia, Czech Republic, Greece, Ireland, Italy, Lesotho, Macau, Portugal, San Marino, Singapore and Vatican City…”
Nifty article from UberGizmo — real time subtitle translation. Apparently inspired by Google Glass, but using Microsoft’s translation APIs. Just saying.
Speaking of Google, have you heard about the new face blurring tool on YouTube?
The Internet Archive gives an update on its music collections. I think I’m going to be spending a lot of time in the DNA Lounge archives… good morning, Internet…
I first covered Blekko about a year ago (November 2010). After that, aside from mentioning it a few times, I haven’t talked about it, but I still like it and consider it to have a lot of good functionality. Recently Blekko announced improved search relevancy and a lot of automatic slashtagging.
Automatic slashtagging (slashtags are a way of categorizing information) now covers over 500 categories. Using slashtags, you can narrow down your search results contextually, which will get you a wider variety of more relevant results (ideally) than trying to narrow down using specific keywords.
Some of Blekko’s slashtags act as special syntax; using the slashtag /monte at the end of any search query will show you search results from three different search engines. The twist is that you don’t get to see which search engine produced which results until you choose the column of results most relevant to you. This is an interesting tool to use because you’ll see a) which results are tagged with which Blekko categories, and b) which sites on other search engines are banned on Blekko. (First impression: Blekko does not like eHow nohow.)
(I also like Monte because it emphasizes, to me, that search has not been solved. When playing with /monte I did not consistently pick one search engine for the best set of results. I didn’t even pick one most of the time. There’s still work to be done to make Web search the best it could possibly be, and nobody’s figured it out yet. Not even Google.)
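For the curious, here is a rough sketch of how a blind comparison like /monte could work in principle. This is only my guess at the idea, not Blekko's code: the engine names, the `blind_columns` and `reveal` functions, and the sample results are all made up for illustration.

```python
import random

def blind_columns(results_by_engine, rng=None):
    """Shuffle the engines and return unlabeled result columns plus a hidden key.

    results_by_engine: dict mapping a (hypothetical) engine name to its results.
    The caller shows only the columns; the key stays hidden until a pick is made.
    """
    rng = rng or random.Random()
    engines = list(results_by_engine)
    rng.shuffle(engines)
    columns = [results_by_engine[name] for name in engines]
    return columns, engines

def reveal(hidden_key, chosen_index):
    """After the user picks a column, tell them which engine produced it."""
    return hidden_key[chosen_index]

# Made-up sample data for three anonymous engines.
sample = {
    "engine_a": ["a1", "a2"],
    "engine_b": ["b1", "b2"],
    "engine_c": ["c1", "c2"],
}
columns, key = blind_columns(sample, random.Random(0))
picked = reveal(key, 1)  # the user liked the middle column best
```

The point of keeping the key separate from the columns is that the person choosing never sees a brand name until after the choice is made.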
Running a few slashtag searches did show that slashtags were useful at narrowing down results, but the downside is that I’m apparently not very good at guessing slashtags. There is a partial directory available at http://blekko.com/tag/show#tab3. If you want more suggestions you can also enter your search and a forward slash, and Blekko will suggest a variety of slashtags for you.
Blekko continues to impress. The only complaint I have is that I can’t find a complete slashtag directory, either for categories or for special syntax. Perhaps I’m looking in the wrong place?
I’ll have a review for you momentarily, but I just wanted to announce off the bat that I’ll be suing Blekko for copyright infringement.
I’m pretty sure I have ample evidence for prior ownership of Blekko as intellectual property; the word was used repeatedly in reference to my personal “brand” the last time I tried to serve vegetarian hamburgers at a family dinner.
Just kidding, of course. And I guess there’s some kind of requirement that search engines have odd names. At least Blekko is easy to spell. Blekko has actually been rumbling around for several months now, but launched its public beta yesterday at http://blekko.com. It’s gotten tons of coverage; maybe the search engine wars are revving up into their third cycle. After spending a little time with it, I gotta say: I like. I like very much. I’m a little worried about some of the search algorithms, and there’s a LOT going on in that front page.
But Blekko is letting me get under the hood of search results in a way that I’ve never seen before. I’m getting crazy amounts of data just from a search results page. I’m going to have to think about it and digest it, but the transparency of the SEO information available feels immediately like a reproach to Google. And the slashtag method feels like a reproach to DMOZ and any other searchable subject index that’s still sticking it out nowadays. The whole engine is like a hybrid of crowdsourcing, searchable subject indexes, and the ocean of data more appropriate to a full-text search engine (Blekko’s first search index is the product of a 3-billion page crawl.) There’s some hint of a clustering search engine in there too…
Blekko has the obligatory Minimalist Search Engine look; a search box and some discussion about slashtags.
Slashtags are Blekko’s big thing. They’re both a way to focus search results and a search syntax. If you have a Blekko account (they’re free) you can create your own slashtags and slash the Web your own way.
Blekko has launched with several hundred slashtags and the best way to get a sense of how they work is to try them. Start your search with a regular keyword — say, diabetes.
Your search results will reflect pretty standard Web sites for diabetes. But you’ll also see a number of recommended slashtags relevant to your search. If you want more information about diabetes as it relates to your diet, you can try a search for diabetes /diet (slashtags are always prefaced with a /, natch.) This will rerun your search for the diabetes keyword against a list of Web sites that have been slashtagged for “diet”. And you’ll end up getting pretty relevant results.
It looks like you can invoke multiple slashtags in a search, but it’s going to have the effect of searching those sites which appear listed under both tags. For example, the search for diabetes /diet /nutrition finds you results — but only 27 of ‘em.
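To make the narrowing behavior concrete, here is a minimal sketch of what I observed: a keyword search restricted to sites that carry every slashtag in the query. It is a toy model under my own assumptions, not Blekko's implementation. The site lists, page titles, and function names (`parse_query`, `search`) are invented for the example.

```python
# Hypothetical slashtag lists: tag name -> set of sites carrying that tag.
SLASHTAGS = {
    "diet": {"dietsite.example", "health.example", "recipes.example"},
    "nutrition": {"health.example", "nutritionfacts.example"},
}

# Hypothetical index of (site, page title) pairs.
PAGES = [
    ("dietsite.example", "Diabetes meal plans"),
    ("health.example", "Diabetes and nutrition basics"),
    ("nutritionfacts.example", "Vitamin D overview"),
    ("news.example", "Diabetes research roundup"),
]

def parse_query(query):
    """Split a query into plain keywords and slashtags (tokens starting with /)."""
    keywords, tags = [], []
    for token in query.split():
        if token.startswith("/") and len(token) > 1:
            tags.append(token[1:].lower())
        else:
            keywords.append(token.lower())
    return keywords, tags

def search(query):
    """Return pages matching all keywords, limited to sites under every slashtag."""
    keywords, tags = parse_query(query)
    allowed = None  # None means "no slashtag restriction"
    for tag in tags:
        sites = SLASHTAGS.get(tag, set())
        allowed = sites if allowed is None else allowed & sites
    results = []
    for site, title in PAGES:
        if allowed is not None and site not in allowed:
            continue
        if all(word in title.lower() for word in keywords):
            results.append((site, title))
    return results
```

In this toy index, `search("diabetes")` returns three pages, `search("diabetes /diet")` returns two, and `search("diabetes /diet /nutrition")` returns one, because only one site sits under both tags. That shrinking result count is the same effect as the 27 results above.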
If you search for a slashtag that doesn’t exist, Blekko will a) recommend some alternate ones and b) give you the option to create your own. If you search for a tag that someone else has created, you’ll have the option to search their tags. (Try diabetes /australia to see an example of that.)
There are special slashtags that let you narrow and sort your results in a certain way, but let’s put those aside for the moment, because I want to address the search results. Whereas many search engines have gone more and more toward giving you a minimum amount of data for each search result, Blekko busts the results wide open, providing huge amounts of information for each result. Let’s take a close-up look at a search result.
There are several information links for each search result. Taking them one by one, here’s what they look like:
Tag — Add a slashtag (or multiple slashtags) to this search result.
SEO — Get CRAZY, CRAZY amounts of SEO information for this search result page. You will be able to see charts for inbound link distribution, crawl stats, the number of inbound links, and the number of site pages. You can compare two domains and see what other sites are duplicating this site’s content. SEO is not really my forte; if you want a more thorough overview of the SEO tools, you can check out this article at Search Engine Land.
Links — Inbound links for a given result page.
Cache — Shows a cache of the page with your search terms highlighted.
IP — Allows you to search for all pages/sites hosted on the IP of the result page.
Chatter — Make a comment about that result page.
Spam — This isn’t an information link but an action link. If you click the spam link, you’ll remove the page from your search result and you will never, never see it again.
If you click on the prefs link on the upper right part of the page, you’ll have the option to see other information in your search results, including Source (shows the page source in browser), RSS (shows a link to the RSS feed, if any), and Similar (shows sites similar to the one you’re looking at.) There are also occasional unusual information links, like Adsense, which lets you search for an Adsense code across multiple sites.
Okay, let’s move back to those slashtags. Some slashtags (like /diet and /nutrition) allow you to narrow the focus of your content, but others allow you to change how the results display. Appending /date to your search will allow you to sort your results by the most recent additions to Blekko. (diabetes /nutrition /date).
/images and /videos limit results to images and videos, respectively. I didn’t have much success combining these slashes — searching for diabetes /images /date didn’t seem to show results in order any differently from diabetes /images. Appending /rss to your search gives you the results as an RSS feed.
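Here is a small sketch of the distinction between topical slashtags and the special ones that change how results display. Again, this is my own illustration rather than anything Blekko has published. The `SPECIAL_TAGS` set only lists the special slashtags mentioned in this review, and the `added` field on each result is a hypothetical stand-in for "when Blekko added the page."

```python
from datetime import date

# Assumption: only the special slashtags named in this review are listed here.
SPECIAL_TAGS = {"date", "images", "videos", "rss"}

def split_slashtags(query):
    """Separate a query into keywords, topical slashtags, and special slashtags."""
    keywords, topical, special = [], [], []
    for token in query.split():
        if token.startswith("/") and len(token) > 1:
            name = token[1:].lower()
            (special if name in SPECIAL_TAGS else topical).append(name)
        else:
            keywords.append(token)
    return keywords, topical, special

def apply_special(results, special):
    """Sort newest-first when /date is present; otherwise keep the given order.

    results: list of dicts with hypothetical 'title' and 'added' (date) fields.
    """
    if "date" in special:
        return sorted(results, key=lambda r: r["added"], reverse=True)
    return list(results)

# Made-up results for illustration.
results = [
    {"title": "Older page", "added": date(2010, 10, 1)},
    {"title": "Newer page", "added": date(2010, 11, 1)},
]
keywords, topical, special = split_slashtags("diabetes /nutrition /date")
ordered = apply_special(results, special)
```

Treating the special slashtags as a separate pass over the results is just one way to model it. It would also explain why combining two of them might not behave the way you expect if one pass ignores the other.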
Rich Skrenta has taken the idea of search engines and given it a twist — several twists, actually. I am trying to figure out how to add Blekko to my daily routine, to slashtag resources as I review them and add them to morning buzz. I realized while playing with Blekko that it’ll be an excellent bookmark manager for me. (Don’t tell Blekko; I don’t know if they intended that.)
But I’m a bit worried about the essentials. Sometimes the search results I got were just weird. Searching for strawberry gave me “Strawberry Perl for Windows” as the first result. Searching for “Cow” gave me the Center on Wisconsin Strategy as the third result and a vegetarian restaurant guide as the fourth result (okay, that one was pretty funny.) So the algo needs a little bit of tweaking. (Looking at Google’s results I see that those two search terms also have their own strangeness. I need to do some more experimenting.) But if enough people get on board with adding slashtags and eliminating spam, this could be a very clean, well-organized index.
I would also (I can’t believe I’m typing this) like to see more social aspects to the search tools. You can follow individual users’ slashtag sets, but not all of a user’s content (as far as I can tell), and you can’t upvote or “like” user-created slashtag sets (as far as I can tell.)
A great, great start. I can’t wait to get more time to play with Blekko.
Michael Fagan’s Fagan Finder (http://www.faganfinder.com), a search tool which has been around for ages and ages, has gotten several updates recently. You can read all about it at the Fagan Finder blog. Some highlights:
The video and movie search engine now has specialized categories. You can search large sites like YouTube, but also an array of sites with educational video, how-to, and news. Hey, how about the content from the Internet Archive?
The news search engine also includes some options for blog search, and unfortunately just brings home how limited the options for blog search are to start with. There are a couple of video and semantic search engines too.
The academic search is really nice. Categories of resources to search here include scholarly papers, online courses and video, flashcards and quizzes, and books. I was a little surprised to not see the HathiTrust as one of the book search options. Did I miss it?
Finally, the search engine page includes several choices for real-time as well as alternative search engines like Wolfram|Alpha and DuckDuckGo.
I was surprised to not see a social search category or a code search category, but what’s here is extensive. Fagan Finder has a basic design that’s not AJAXy and slick, but I’ll take useful and informative over AJAXy any time.