RB Search Gizmos

Mining Wikipedia’s Page View Counts With Gossip Machine

Researching famous people has always been a favorite search puzzle of mine. Google works to a point, and there are little search tricks you can use to narrow down your results, but digging into substantive news and information about celebrities and the well-known is difficult with a general search engine. If they’re really, really famous, it gets even tougher.

Thinking about this, I mused about famous people, about references to them, and about where those references appear, like Wikipedia. Wouldn’t there be a way to gauge public interest in a famous person via Wikipedia?

I came up with a hypothesis: why would a Wikipedia page get an unusually high number of views? Because more people are interested and looking at it, of course. And why are they looking? Because they were reminded of the page’s topic, probably through a news story or similar happening.

Therefore, Wikipedia page view counts aren’t just page view counts, they’re fossilized attention. They’re markers in time for when a topic has an unusual level of interest.

So why not find those markers and translate them to news searches?

That’s what Gossip Machine does. You can use it at https://researchbuzz.github.io/Gossip-Machine/.


Enter the topic you’re interested in, the year you want to search, and how newsworthy you want the dates to be. (When the setting is “VERY Newsworthy,” days must have at least 190% of an average day’s pageviews, while the “Gossip Fiend” setting requires only 150%.)
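To make the percentages concrete, here is a minimal Python sketch of that kind of threshold filter. It is not Gossip Machine’s own code. The function name, the dictionary layout, and the sample counts are all made up for illustration; only the two multipliers come from the settings described above.

```python
from statistics import mean

# Multipliers taken from the two settings described in the post.
# Any other settings the tool may offer are not covered here.
THRESHOLDS = {"VERY Newsworthy": 1.90, "Gossip Fiend": 1.50}


def newsworthy_days(daily_views, setting):
    """Return (date, views) pairs at or above the setting's share of the average."""
    average = mean(daily_views.values())
    cutoff = average * THRESHOLDS[setting]
    return [
        (day, views)
        for day, views in sorted(daily_views.items())
        if views >= cutoff
    ]


# Hypothetical counts, invented for this example (average is 160 views a day).
sample = {
    "2016-07-14": 60,
    "2016-07-15": 100,
    "2016-07-16": 100,
    "2016-07-17": 340,
    "2016-07-18": 260,
    "2016-07-19": 100,
}

print(newsworthy_days(sample, "VERY Newsworthy"))
print(newsworthy_days(sample, "Gossip Fiend"))
```

With these invented numbers the stricter setting keeps only July 17, while the looser one also lets July 18 through.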

Gossip Machine goes through every day of a year’s worth of page counts and returns the days that match your settings, along with links to Google News and Google Web searches for that date.
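For the curious, here is a rough Python sketch of the plumbing such a tool needs: a URL for a year of daily counts, a parser for the response, and date-restricted search links. The assumptions are mine, not a description of Gossip Machine’s internals. The sketch uses the public Wikimedia Pageviews REST endpoint, and it narrows Google results with the `tbs=cdr` and `tbm=nws` URL parameters, which are widely used but not officially documented. All function names are hypothetical.

```python
from datetime import date
from urllib.parse import quote, urlencode

# Assumption: the public Wikimedia Pageviews REST API is the data source.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article, year, project="en.wikipedia.org"):
    """Build the URL for one calendar year of daily, human-only page views."""
    title = quote(article.replace(" ", "_"), safe="")
    return f"{API_ROOT}/{project}/all-access/user/{title}/daily/{year}0101/{year}1231"


def parse_daily_views(payload):
    """Map ISO dates to view counts from a decoded pageviews response."""
    views = {}
    for item in payload.get("items", []):
        stamp = item["timestamp"]  # formatted as YYYYMMDDHH
        day = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
        views[day.isoformat()] = item["views"]
    return views


def search_links(topic, day):
    """Build Google Web and News links restricted to a single day.

    Assumption: the undocumented tbs=cdr date-range parameters still behave
    the way they commonly have; Google could change them at any time.
    """
    stamp = f"{day.month}/{day.day}/{day.year}"
    params = {"q": topic, "tbs": f"cdr:1,cd_min:{stamp},cd_max:{stamp}"}
    web = "https://www.google.com/search?" + urlencode(params)
    news = "https://www.google.com/search?" + urlencode({**params, "tbm": "nws"})
    return {"web": web, "news": news}


links = search_links("Snoop Dogg", date(2016, 7, 17))
print(pageviews_url("Snoop Dogg", 2016))
print(links["news"])
```

Nothing here touches the network. Fetching the URL (and sending a descriptive User-Agent header, which Wikimedia asks API clients to do) is left out to keep the sketch short.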

Let’s do a couple of examples. The default search for Gossip Machine is Snoop Dogg in 2016 at the VERY Newsworthy setting. I click the “Fire Up the Gossip Machine” button and get a list of 2016 dates when Snoop’s page had an unusual level of interest and visits. Each date has a link to do a Web search or a News search. Gossip Machine also tells you the average page view count so you can be prepared for the odd results you might get for topics with a low view count.

Screenshot from 2022-08-29 09-07-25

I clicked on the July 17 news search to see what was going on that day, and yeah, that looks pretty newsworthy!

Screenshot from 2022-08-29 09-09-44

Clicking on the Web link might bring you links to other news stories or multimedia.

Screenshot from 2022-08-29 09-12-28

Do you see how that Wikipedia page view count – a marker of increased interest – can super-focus your search results?

It doesn’t work just for people, either. You can search for things like medical conditions, locations (remember, you’re looking for things that people might look up on Wikipedia, so the name of a California city might work better than a really general search like California itself) or even events.

Gossip Machine also does an initial search to find your topic page, so don’t worry about getting the name exactly right. If, for example, you look up magic mushrooms in the year 2020, Gossip Machine resolves your query to the topic Psilocybin mushroom and returns results for that page:

Screenshot from 2022-08-29 09-28-36
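That lookup step can be sketched in a few lines of Python. This is my guess at one workable approach, not necessarily how Gossip Machine does it: ask the MediaWiki search API for the single best match and take its title. The function names are hypothetical, and the sample payload is hand-written in the shape that API returns rather than a captured response.

```python
from urllib.parse import urlencode


def topic_search_url(query, lang="en"):
    """Build a MediaWiki full-text search URL that asks for one result."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": 1,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


def best_title(payload):
    """Return the top search hit's title, or None when nothing matched."""
    hits = payload.get("query", {}).get("search", [])
    return hits[0]["title"] if hits else None


# Hand-written example payload; a real response carries more fields per hit.
sample = {"query": {"search": [{"title": "Psilocybin mushroom"}]}}

print(topic_search_url("magic mushrooms"))
print(best_title(sample))
```

The resolved title is what you would then feed to a page view lookup, which is why a loosely worded query can still land on the right page.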

You’ll note that in this example this topic’s page has a much lower view count than Gossip Machine recommends for good results. It’s still worth checking at least one or two links in the set of results, especially if they’re grouped together around a single date like this one is. In this case there was certainly relevant news:

Screenshot from 2022-08-29 09-31-46

I don’t think I’m quite done with Gossip Machine. If there’s interest, I can add more sources besides Google News and Web search, and I’ve been thinking about adding a triangulation feature: find the most popular dates that two different Wikipedia pages have in common and get the news for those.
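Since triangulation doesn’t exist yet, here is only a speculative Python sketch of one way it could work. Every choice in it is an assumption: each page’s daily views are compared to that page’s own average, a date counts only if both pages clear the cutoff, and shared dates are ranked by the weaker of the two spikes. The function names, the 1.5 cutoff default, and the sample counts are all invented.

```python
from statistics import mean


def interest_ratios(daily_views):
    """Express each day's views as a multiple of the page's average day."""
    average = mean(daily_views.values())
    if average == 0:
        return {day: 0.0 for day in daily_views}
    return {day: views / average for day, views in daily_views.items()}


def shared_spikes(views_a, views_b, cutoff=1.5, limit=10):
    """Return (date, ratio_a, ratio_b) for days when both pages spiked."""
    ratios_a = interest_ratios(views_a)
    ratios_b = interest_ratios(views_b)
    shared = [
        (day, ratios_a[day], ratios_b[day])
        for day in ratios_a.keys() & ratios_b.keys()
        if ratios_a[day] >= cutoff and ratios_b[day] >= cutoff
    ]
    # Strongest shared spike first, judged by the weaker of the two pages.
    shared.sort(key=lambda row: (-min(row[1], row[2]), row[0]))
    return shared[:limit]


# Hypothetical counts for two pages over the same five days.
page_a = {"2020-05-01": 100, "2020-05-02": 100, "2020-05-03": 400,
          "2020-05-04": 100, "2020-05-05": 100}
page_b = {"2020-05-01": 50, "2020-05-02": 300, "2020-05-03": 300,
          "2020-05-04": 50, "2020-05-05": 50}

print(shared_spikes(page_a, page_b))
```

Ranking by the weaker spike keeps one page’s huge day from dragging in a date the other page barely noticed. Summing the two ratios would be an equally defensible choice.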

What do you think?
