RB Search Gizmos

Find Popular Wikipedia Pages by Date and Sort Them By Type: WikiPopPulse

I call GPT-4 “Curly” because it says “certainly!” all the time and every time it does I think of Curly Howard from The Three Stooges. “Soitenly!” Happily it does not mind when I call it Curly.

Screenshot from 2023-04-04 08-37-37

Curly’s great to talk to about programming because it doesn’t get impatient with my million questions and will happily expand on the most minute aspects of an API (for which I thank it from the bottom of my autistic, detail-loving heart).

In fact, we were shooting the breeze about a Wikidata property, P31, when Curly mentioned that the property has MANY possible values. P31 is the “Instance of” property for Wikidata. It specifies the kind of thing a Wikipedia article is about. So, the Cleopatra page is an “Instance of” a human. The YouTube page is an “Instance of” a social media platform. The John Wick movie page is an “Instance of” a film.
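For the curious, here is a minimal sketch of how the “Instance of” values could be read out of a Wikidata entity record. It assumes the JSON shape returned by Wikidata's wbgetentities API, where each claim sits under claims → P31 → mainsnak → datavalue → value → id. The function name and the hand-trimmed sample record are made up for illustration; this is not WikiPopPulse's actual code.

```python
from typing import Any, Dict, List


def instance_of_ids(entity: Dict[str, Any]) -> List[str]:
    """Return the Q-ids listed under P31 ("Instance of") for one entity."""
    ids = []
    for claim in entity.get("claims", {}).get("P31", []):
        snak = claim.get("mainsnak", {})
        # "novalue" and "somevalue" snaks carry no datavalue, so skip them.
        if snak.get("snaktype") != "value":
            continue
        ids.append(snak["datavalue"]["value"]["id"])
    return ids


# Hand-trimmed sample record for illustration only; Q5 is Wikidata's "human".
sample_entity = {
    "claims": {
        "P31": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "datavalue": {
                        "value": {"id": "Q5"},
                        "type": "wikibase-entityid",
                    },
                }
            }
        ]
    }
}

print(instance_of_ids(sample_entity))  # ['Q5']
```

The function returns Q-ids rather than names; turning Q5 into “human” takes a second lookup for the item's label.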

Naturally I thought, “Hmm, hidden classification system! Wonder what I can do with it.” I had already made tools for exploring Wikidata properties by category (Wikidata Property Peeker and Wikidata Quick Dip), so I looked around at categories for one that would give me an interesting mix of instance types. I didn’t find anything I liked until I checked Wikipedia’s most-viewed pages list.

Wikipedia has been keeping page counts since late 2015, something which still overjoys me on a regular basis. Wikipedia’s page views have inspired me to make all kinds of tools, like Gossip Machine and Category Cheat Sheet and Clumpy Bounce Topic Search, so why not use the page counts in exploring the various P31 possibilities? Curly and I got to talking about it and made WikiPopPulse ( https://searchgizmos.com/wikipop/ ). I think I ended up accidentally making a framework for what will end up being a much larger thing, but let me show you what it does so far.

Using WikiPopPulse

To start using WikiPopPulse, specify any date between January 1, 2016 and yesterday. WikiPopPulse will go to Wikipedia, get the top 100 Wikipedia pages for that date, group them by “Instance of” property, and present them to you in a drop-down menu. Here’s yesterday:
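The steps above can be sketched roughly like this. The URL pattern is the Wikimedia Pageviews API's “top” endpoint, which returns each article with a rank; everything else (the function names, the fallback group name, the sample ranks and labels) is a hypothetical stand-in, and the real tool may well do it differently. The network fetch is left out so the sketch runs on its own.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, List, Optional

TOP_URL = (
    "https://wikimedia.org/api/rest_v1/metrics/pageviews/top/"
    "en.wikipedia/all-access/{y}/{m:02d}/{d:02d}"
)


def top_pages_url(day: date, today: Optional[date] = None) -> str:
    """Build the most-viewed-pages URL for one day, enforcing the date range."""
    today = today or date.today()
    if not (date(2016, 1, 1) <= day <= today - timedelta(days=1)):
        raise ValueError("date must be between 2016-01-01 and yesterday")
    return TOP_URL.format(y=day.year, m=day.month, d=day.day)


def group_by_instance(
    articles: List[dict], instance_labels: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Group article titles by "Instance of" label, keeping the original rank."""
    groups = defaultdict(list)
    for entry in articles:
        title = entry["article"]
        # A page with several P31 values is listed under each of them.
        for label in instance_labels.get(title, ["(no instance of)"]):
            groups[label].append(f"{title} ({entry['rank']})")
    return dict(groups)


# Invented sample data: the ranks are not real page view figures.
sample_articles = [
    {"article": "YouTube", "rank": 3},
    {"article": "Cleopatra", "rank": 12},
    {"article": "Hypothetical_Untyped_Page", "rank": 40},
]
sample_labels = {
    "YouTube": ["social media platform"],
    "Cleopatra": ["human"],
}

print(top_pages_url(date(2023, 4, 3), today=date(2023, 4, 4)))
print(group_by_instance(sample_articles, sample_labels))
```

Each key of the resulting dictionary would become one entry in the drop-down menu, and the number in parens is the rank carried over from the top 100 list.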

Screenshot from 2023-04-04 09-25-55

Pick a category and you’ll get a listing of the pages in that category. The number in parens beside the name of the page refers to the page’s original position on the top 100 list. Let’s look at television series.

Screenshot from 2023-04-04 09-29-06


One thing I personally dislike about popularity lists like this is that the further away you are from the content timewise, the less it makes sense. Look at a popular Wikipedia pages list from a year ago, for example, and see how many things you recognize. Major items, certainly, but a lot of it just fades into the background hum of history being made while you’re busy being you (which is the best use of your time, so do carry on).

Anyway, I hate that, so MY popularity list has date-based searches for both Google News and Twitter so you can see why the item was on the list in the first place. In this case the list is only from yesterday, but I don’t know anything about current TV, so I’m already baffled. I’ll click on the Google News link for Secret Invasion and get this in a new tab:
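A date-bounded search link of this kind can be built with nothing more than URL parameters. The sketch below is one plausible way to do it, not necessarily how WikiPopPulse builds its links: it assumes Google's custom date range parameter (tbs=cdr:1 with cd_min and cd_max) and Twitter's since: and until: search operators, and the function names are invented for the example.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def google_news_url(topic: str, day: date) -> str:
    """Google News search limited to a single day."""
    d = f"{day.month}/{day.day}/{day.year}"  # Google expects M/D/YYYY here
    params = {"q": topic, "tbm": "nws", "tbs": f"cdr:1,cd_min:{d},cd_max:{d}"}
    return "https://www.google.com/search?" + urlencode(params)


def twitter_search_url(topic: str, day: date) -> str:
    """Twitter search limited to a single day."""
    # Assumption: until: excludes the date given, so use the following day.
    next_day = day + timedelta(days=1)
    query = f'"{topic}" since:{day.isoformat()} until:{next_day.isoformat()}'
    return "https://twitter.com/search?" + urlencode({"q": query})


print(google_news_url("Secret Invasion", date(2023, 4, 3)))
print(twitter_search_url("Secret Invasion", date(2023, 4, 3)))
```

Putting the topic in quotes for the Twitter query is a design choice to keep multi-word titles together; a looser search would drop the quotes.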

Screenshot from 2023-04-04 09-38-58

Sometimes you’ll find articles that have no topic mentions in Google News but plenty on Twitter, and vice-versa. Those tend to be pretty interesting!

Sorting by the “Instance Of” property wasn’t particularly revelatory for yesterday’s list, but it can really set a mood on certain days. For example, Russia invaded Ukraine on February 24, 2022. Here’s what the instance representation was for the most popular Wikipedia pages on February 25, 2022:

Screenshot from 2023-04-04 09-42-40

Kind of gives you a top-level view even before you look at specific pages, doesn’t it?

But the really nice thing about organizing these pages by category is that I have the opportunity to connect them to specific external resources beyond date-bounded Google and Twitter searches. The human category links to official websites as well as Facebook and Twitter accounts when available (going back to yesterday’s date for this example):
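Wikidata stores these kinds of links as properties on the item itself: P856 is the official website, P2002 is the Twitter username, and P2013 is the Facebook ID. Here is a rough sketch of pulling them out of an entity record and turning them into links, again assuming the wbgetentities JSON shape. The helper names and the sample values are invented, and this is a guess at the approach rather than the tool's real code.

```python
from typing import Any, Dict, Optional


def first_value(entity: Dict[str, Any], prop: str) -> Optional[str]:
    """Return the first plain string value recorded for a property, if any."""
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            return snak["datavalue"]["value"]
    return None


def external_links(entity: Dict[str, Any]) -> Dict[str, str]:
    """Collect whichever of the three links the entity actually has."""
    links = {}
    website = first_value(entity, "P856")  # official website (a full URL)
    twitter = first_value(entity, "P2002")  # Twitter username
    facebook = first_value(entity, "P2013")  # Facebook ID
    if website:
        links["website"] = website
    if twitter:
        links["twitter"] = f"https://twitter.com/{twitter}"
    if facebook:
        links["facebook"] = f"https://www.facebook.com/{facebook}"
    return links


def _claim(value: str) -> dict:
    """Build a minimal claim for the sample record below."""
    return {"mainsnak": {"snaktype": "value", "datavalue": {"value": value}}}


# Invented sample record: no Facebook ID, so that link is simply left out.
sample_human = {
    "claims": {
        "P856": [_claim("https://example.org/")],
        "P2002": [_claim("example_user")],
    }
}

print(external_links(sample_human))
```

Because missing properties are skipped rather than filled with blanks, the same function works for any instance type, which is handy if more categories get their own links later.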

Screenshot from 2023-04-04 09-52-33

I’ll probably start by doing obvious things like connecting film items to Rotten Tomatoes results and IMDb listings, connecting country types to reference resources, etc., but I think, as I said earlier, I’ve accidentally created a framework. I imagine I’ll be endlessly tinkering with WikiPopPulse as I come across little niche resources for various instances and wire them in.
