I spend a lot of time thinking about how search engines can narrow down information pools and provide richer search results without relying overmuch on the searcher themselves.
Look at it this way: you’re searching for something because you want to know more about it. You want to know more about it because you have a gap in your knowledge (or you want to confirm that you don’t). Relying on the searcher’s query past a certain point risks carrying the searcher’s own errors in knowledge into the search, which damages the quality of the search results.
Google and other big search engines know that, of course; they’ve developed extensive search technology to ideally get you where you need to go on the Web with only the most amorphous requests. The problem is that it’s non-transparent. You don’t know how Google is using your search to get you from point A to point B. You just know that your search worked. Will it work the next time Google changes that non-transparent algorithm? Who knows.
That’s why I focus on ways to inform your search without taking it over or rendering it non-transparent. I might use authoritative information about Web space, as with Super Edu Search. I might try to focus on a particular area, as with Backyard Scholarship. I might use time-bounded searches, as with Contemporary Biography Builder.
Or I might use indicators of interest and past attention. Page view counts are a wonderful record of how popular Wikipedia topics were and are. Why not take advantage of that?
I love exploring Wikipedia, but it can be daunting. Say I want to learn more about bass guitarists who play jazz, so I head over to Wikipedia. When I open the American jazz bass guitarists page, this is what I see:
Wikipedia presents category pages in alphabetical order, which is to its credit – they’re striving to present data in a neutral way. Great! But that doesn’t tell me which of these players are popular. It doesn’t tell me which ones are active or part of online discussions or controversies. It gives me no place to start.
You might not need a place to start when the category is small or when you’ve got a lot of time. But when you want to get to the heart of the matter and find the popular pages in the category, try Category Cheat Sheet, at https://searchgizmos.com/ccs/ .
Category Cheat Sheet takes the first 500 pages of a Wikipedia category and evaluates the most recent month’s page views for each. It then re-sorts the category by page view count and provides brief summaries for the top 20 most popular pages. It also provides links back to Wikipedia pages if you need more than an overview.
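To make the re-sorting step concrete, here’s a minimal Python sketch. It is not the actual Category Cheat Sheet code. The function name rank_category, its parameters, and the sample titles and view counts are all made up for illustration, and it assumes the category titles and their monthly page view counts have already been fetched from somewhere.

```python
from typing import Dict, List, Tuple


def rank_category(
    titles: List[str],
    monthly_views: Dict[str, int],
    max_pages: int = 500,
    top_n: int = 20,
) -> List[Tuple[str, int]]:
    """Re-sort category pages by page views, most viewed first.

    titles:        page titles in the order the category lists them
                   (alphabetical, in Wikipedia's case).
    monthly_views: hypothetical mapping of title -> recent monthly views.
    """
    # Only look at the first `max_pages` entries of the category.
    considered = titles[:max_pages]

    # Pair each title with its view count; pages with no data count as 0.
    counted = [(title, monthly_views.get(title, 0)) for title in considered]

    # Python's sort is stable, so pages with equal counts keep their
    # original (alphabetical) order.
    counted.sort(key=lambda pair: pair[1], reverse=True)

    # Keep only the most popular pages.
    return counted[:top_n]


if __name__ == "__main__":
    # Placeholder data, not real Wikipedia numbers.
    sample_titles = ["Alpha", "Bravo", "Charlie", "Delta"]
    sample_views = {"Alpha": 10, "Bravo": 300, "Charlie": 300}
    for title, views in rank_category(sample_titles, sample_views, top_n=3):
        print(f"{views:>6}  {title}")
```

Keeping the ranking separate from the fetching means the same function works no matter where the page view numbers come from.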
Here’s what that jazz bass player category looks like when it’s run through the CCS:
With every entry you get the name of the article, a recent monthly page count, and a summary. The links to full Wikipedia articles open in a new tab so you can skim through the list, click on anything that looks interesting, and then review the tabs separately. I found in making this that it’s a much friendlier way (for me at least) to explore a category, and it definitely exposes the popularity bias errors I would have made (I thought Larry Graham would rank a lot higher).
I’m releasing it as a standalone tool because it’s useful and fun to play with, but Category Cheat Sheet is actually one half of something else I’m building. I’m still wrestling with the problem of doing a successful general topic Web search with as little knowledge as possible. By gathering up the most popular names in a category and applying some (very basic) language analysis, I’m hoping I can turn a Wikipedia category name into a specific, useful Google search for that topic. Stay tuned.
Love this so much! I write a lot of copy for franchises I know little about. Being able to sort the Foods in Halo by relevance instead of alpha is primo. Now I don’t have to click into each to see if it’s in the actual games or a novel tie-in or whatever. 👍