Images, Art, Facial Recognition, More: Sunday Morning Buzz, May 17th, 2015


Wolfram|Alpha has launched an image identification tool. “Now I’m excited to be able to say that we’ve reached a milestone: there’s finally a function called ImageIdentify built into the Wolfram Language that lets you ask, “What is this a picture of?”—and get an answer. And today we’re launching the Wolfram Language Image Identification Project on the web to let anyone easily take any picture (drag it from a web page, snap it on your phone, or load it from a file) and see what ImageIdentify thinks it is…” Warning: you can play with this for hours. I uploaded an image of one of my cats and it got it spookily correct (“Calico cat”) but then I uploaded a picture of a person and it misidentified him as a fire extinguisher. It seems to do best with images without lots of details.

Amit Agarwal, who has no need to prove how brilliant he is but keeps doing it anyway, has created a tool to send bulk personalized Tweets and DMs.


May be useful depending on your research needs: a roundup of 60 facial recognition databases.

Interesting! Using a ‘bot to help people discover art. “Artbot, developed by Desi Gonzalez and Liam Andrew in the HyperStudio research group of Comparative Media Studies/Writing (CMS/W), is a mobile website app that mines both user preferences and event tags to provide serendipitous connections to the local art scene…. Artbot enables users to select their interests from a list that ranges from medieval art to surrealism and from ancient history to photography. At the same time, the app scrapes data from museum websites to find artists, movements, and themes that link events to each other in various ways. Artbot then cross-references the data collected to generate event recommendations.”


Chromecast has gotten some updates. “Ever since Google launched the Chromecast in July 2013, the company has been steadily updating the HDMI dongle with new capabilities and features. Today, the company has announced six new apps for its $35 streaming media stick: CBS All Access, HGTV, FOX Now, FXNOW, Pluto TV, and Haystack.”

Library and Archives Canada has put more WWI service files online. “As of today, 155,110 of 640,000 service files are available online…”


YouTube “How To” video searches are way up in 2015. “People trying to figure out how to accomplish a home improvement project, fix their hair or prepare a recipe have helped grow YouTube’s ‘how-to’ searches by 70 percent year-over-year.”

More YouTube: what’s YouTube’s most-watched game? Why, it’s Minecraft. “Think about that for a minute. YouTube’s list of the top 10 biggest games on the site, based on a decade’s worth of viewing hours, features long-running game franchises like Call of Duty and Grand Theft Auto. But it’s six-year-old Minecraft that comes out on top.”

From Search Engine Land: How Google made it a little harder to reach from outside the US. “Last fall, things were quietly changed. Instead of that link always being at the bottom of country-specific versions, it was altered to appear only the very first time someone tried to reach and got redirected to their country-specific version. On subsequent attempts, it would not be shown.”

There are concerns going around about a phantom Google update. “HubPages, a collection of more than 870,000 miniblogs covering everything from the ‘History of advertising’ to ‘How to identify venomous house spiders,’ saw its Google search traffic plunge 22 percent on May 3 from the prior week. Of the company’s 100 top pages, 68 lost visitors over that stretch.”

HathiTrust, in its blog, has a post about quality and OCR issues. “For the digital content we ingest, HathiTrust has established specifications related to image formats, resolution, color space, and other characteristics. Rigorous validation ensures that these specifications are met. The methods of production or processing of digitized items may leave fingerprints of some sort, however. These may be benign, such as the presence of digitization color targets, added coversheets, book cradles, or a characteristic coloration of pages, which do not generally interfere with the display or understanding of the original object and its content. They may also be more serious, including mis-colorations of pages, human fingers in the images, systemic cropping, warping, or bolded or light text—problems that do interfere with legibility or clarity of the image.”

The Digital Public Library of America (DPLA) is now on Pinterest. Good morning, Internet…

I love your comments, I love your site suggestions, and I love you. Feel free to comment on the blog, or @ResearchBuzz on Twitter. Thanks!

Wolfram|Alpha, Olympics, India, More: Saturday Buzz, February 1, 2014

Wolfram|Alpha now has data about languages spoken in the US.

Korea has fined Google $196,000 for unauthorized data collection.

Over at TechCrunch, an interesting discussion about whether Yahoo is building its own search products. If it is, I hope it goes all-out. Confidential to Yahoo: there are lots of search gaps out there. Searchable subject indexes come to mind…

Fun! Winter Olympians to follow on Instagram.

Speaking of Olympics, I had no idea there was a database devoted to tracking cases where athletes have been suspended for doping. Unfortunately for casual curiosity, the Anti-Doping Database is subscription-based.

A really nice Facebook image cheat sheet. I need this as a poster.

From Boing Boing: The Library of Congress is adding digitized Carl Sagan items to the LoC Web site.

A column in the New York Times compares social Q&A apps Jelly and Need. For the author’s purposes (which sound like mine) Need absolutely wins.

From Small Business Trends: The 7 Best WordPress Alternatives. After using FrontPage (MANY years ago), Movable Type, and WordPress for ResearchBuzz, I can say with much confidence that I’m sticking with WordPress. But I was trying to use WordPress for a work problem, and it wasn’t working very well. I ended up using a site creator called Wix, which is not mentioned in this article, and I highly recommend it. (DISCLAIMER: My link to Wix is not an affiliate link and Wix didn’t pay me to recommend them. Wix doesn’t know me from Adam’s off-ox.)

The National Film Archive of India is going to get a digital library.

Twitter has bought 900 patents from IBM? “Twitter says it has bought 900 patents from IBM and that the companies have entered into a cross-license agreement. Financial terms weren’t disclosed.”

Zooniverse has launched yet another new project: Disk Detective. “Disk Detective is backed by a team of astronomers that need your help to look at data of stars to try and find dusty debris disks – similar to our asteroid field. These disks suggest that these stars are in the early stages of forming planetary systems.” Good morning, Internet…

Wolfram|Alpha Lets You Slice and Dice Jobs and Salaries Data

Wolfram|Alpha announced at the end of September that it had overhauled/updated/spat and polished its data on jobs and salaries in the US. And I’m glad I didn’t cover it then because according to the comments the data was a bit buggy. But hopefully it’s better now.

You can now go to WA and ask questions about employment in various regions of the United States. I can make the query teachers in South Carolina.

Wolfram|Alpha returns data with a count, information on wages, a breakdown of subcategories, and a list of related categories. In this case, I got information on the number of librarians, curators, and archivists in South Carolina, as well as postsecondary teachers.

While this information is interesting, the real fun (as is almost always the case with WA) comes when you start mixing the data together. The query teachers and truck drivers in South Carolina shows data about the two occupations side by side, graphs an employment history in SC over the last several years, and shows each occupation’s presence in the workforce as a percentage. However, while it shows wages for the truck drivers, it doesn’t for the teachers; as there are many subcategories of teachers, the query may be too general.

You can also compare teachers in metro regions (compare teachers in charlotte to teachers in columbia) or specific job information between states and regions (compare truck driver salary in Montana to truck driver salary in Texas).

And remember, when we get right down to it, it’s all just numbers. Wolfram|Alpha is about lots and lots of numbers. And if you can figure out the syntax you can compare those numbers. This query works: Box office of Iron Man 2 versus aggregate salary of all California plumbers. As does this one: cost of ten thousand gallons of gas versus salary of South Carolina truck driver. (WA will even graph that for you.)
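If you would rather script these comparisons than type them one at a time, here is a minimal Python sketch that turns plain-English queries into Wolfram|Alpha links. The base URL and the "i" parameter are my assumption about how WA's input page accepts a query, and the helper names (wa_query_url, compare_query) are made up for illustration; none of this comes from Wolfram's documentation.

```python
from urllib.parse import urlencode

# Assumption: Wolfram|Alpha's input page takes the query text in an "i" parameter.
BASE_URL = "https://www.wolframalpha.com/input"


def wa_query_url(query: str) -> str:
    """Return a Wolfram|Alpha link for a plain-English query (hypothetical helper)."""
    return f"{BASE_URL}?{urlencode({'i': query})}"


def compare_query(first: str, second: str) -> str:
    """Build the 'compare X to Y' phrasing used in the examples above."""
    return f"compare {first} to {second}"


queries = [
    "teachers in South Carolina",
    "teachers and truck drivers in South Carolina",
    compare_query("teachers in charlotte", "teachers in columbia"),
    compare_query("truck driver salary in Montana", "truck driver salary in Texas"),
]

# Print one clickable link per query.
for q in queries:
    print(wa_query_url(q))
```

This only builds the links; whether WA understands a given phrasing is still something you have to find out by trying it.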

I love searching with Wolfram|Alpha, but more than that I love playing with Wolfram|Alpha. To get a sense of the scope of what WA covers at this point, visit the examples page.

Wolfram|Alpha Adds Movie Data

Wolfram|Alpha announced on September 17th that it has added movie box office data to its data engine. You can check it out at

Whenever I hear about a new data set on WA I always check to see if you can access it randomly. And in this case you can: the query random movie got me the result New Wave Hookers (I am not kidding) with basic information as well as a brief cast listing. A few random searches later I found Cabaret, and this is where the box office data started coming in.

Cabaret, according to WA, was released in 1972. There was a brief mention of total box office receipts but clicking on “More History” brought me to … nothing. Total receipts was all the information this movie had. I went to look at something more recent, and picked Dead Again, which was released in 1991. For that movie, WA had statistics about its highest rank at the box office, highest receipts for the weekend, highest number of screens it played on at one time, and highest average receipts per screen for a single weekend. There were also graphs that showed the performance of the movie over time. It looks like most of the recent movies have a good amount of data, though one that I looked at (the Aqua Teen Hunger Force movie that came out a few years ago) had no box office data at all.

If you want to do some comparisons, you can search for several movies at once and get the box office data presented in a table. See the screenshot for a comparison of superhero movie sequels, pitting Iron Man 2 against X2 and the Fantastic Four sequel.

I discovered to my delight that you can also use movie release dates as a time unit, making the query minutes since iron man release date — well, let’s not get nuts and say useful, but how about viable. Just so you know, at this writing it’s been 21024 hours since the release of Iron Man. You can also calculate time between movies, as in minutes between iron man release date and iron man 2 release date.
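For anyone who wants to sanity-check this kind of release-date arithmetic outside of WA, here is a rough Python sketch. The release dates are not from WA's output; I am assuming the US theatrical dates of May 2, 2008 for Iron Man and May 7, 2010 for Iron Man 2, and the function names are just illustrative.

```python
from datetime import datetime, timedelta

# Assumption: US theatrical release dates, not values returned by Wolfram|Alpha.
IRON_MAN_RELEASE = datetime(2008, 5, 2)
IRON_MAN_2_RELEASE = datetime(2010, 5, 7)


def minutes_between(start: datetime, end: datetime) -> int:
    """Whole minutes elapsed from start to end."""
    return int((end - start).total_seconds() // 60)


def hours_between(start: datetime, end: datetime) -> int:
    """Whole hours elapsed from start to end."""
    return int((end - start).total_seconds() // 3600)


def date_after_hours(start: datetime, hours: int) -> datetime:
    """The moment that falls the given number of hours after start."""
    return start + timedelta(hours=hours)


# Minutes between the two release dates.
print(minutes_between(IRON_MAN_RELEASE, IRON_MAN_2_RELEASE))

# Working backwards: which date is 21024 hours after the first release?
print(date_after_hours(IRON_MAN_RELEASE, 21024).date())
```

Under those assumed dates, the second print gives a rough idea of when the 21024-hour figure was current.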

Wolfram|Alpha Launches Widgets

Search engine Wolfram|Alpha has announced Wolfram|Alpha widgets.

Oh boy! I love widgets. What are they? Widgets are little bits of code that you can usually embed somewhere — like on your Web site or Facebook page. Widgets perform calculations, provide information, or other small feats of data crunching. W|A has tons of widgets available at

You can look at featured, top rated, and the newest widgets, but there’s also a list of popular categories which includes Money & Finance, Physics, Weather, and Astronomy. Just poking around for a few minutes I found a lot of interesting widgets, including a genealogy relationship calculator, calculator for heart disease risk, SAT score analyzer, and to my surprise a bunch of tools for Scrabble and crossword puzzles.

When you find an interesting widget you can try it right from within the gallery. If you like it you can customize it (you’ll have to have your own Wolfram|Alpha account to do this) or you can embed it. There are embed options for specific Web sites, but the WordPress one requires that you have another plugin already installed.

But there is a regular line of JavaScript code for embedding widgets as well. If it works properly and you have JavaScript turned on, you should see a widget below.

I have been playing with WA since it launched and I still haven’t found all the cool stuff it can do. Widgets might be a more targeted way to explore its capabilities.