
How Omar Khadr's name appeared in a Google search for 'Canadian soldiers'

The episode is yet another reminder of how even algorithms with the best of intentions can unwittingly fuel the spread of misinformation online.

There's a Russia connection, but not in the way you might think


Earlier this week, Conservative Leader Andrew Scheer tweeted a screenshot of some curious Google search results. A search for the term "Canadian soldiers" returned a photo of former Guantanamo Bay detainee Omar Khadr, who was accused of killing a U.S. soldier in 2002.

Scheer asked that Google take action, and it didn't take long for another user to suggest the whole thing was the work of a Russian troll.

The truth? It's far more mundane.

Still, the episode is yet another reminder of how even algorithms with the best of intentions can unwittingly fuel the spread of misinformation online. And with Canada's federal election just months away, the stakes are even higher when politics are involved.

How did Khadr get in there?

Khadr's name appeared in what Google calls its Knowledge Graph results. These sometimes appear above or beside Google's usual search engine results when the user asks a question, seeks out a piece of general knowledge, or searches for a well-known place or public figure.

Knowledge Graph pulls its data from a variety of sources — one of them being Wikidata, an open repository of information that's hosted by the same organization that hosts Wikipedia. Think of Wikipedia like a finished report, and Wikidata the raw data that's used to write it. Like Wikipedia, anyone can contribute to Wikidata, for better and for worse.
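To make the "raw data" idea concrete, here is a minimal sketch of how a structured record differs from a written article. The item, its labels and the helper functions are hypothetical stand-ins, not Wikidata's actual format or API. Real Wikidata items record statements under property identifiers such as P106 (occupation) and P27 (country of citizenship).

```python
# Hypothetical, simplified stand-in for a Wikidata item: a label plus a
# set of property -> values statements. Plain-text property names are
# used here for readability; real items use identifiers like P106.
item = {
    "label": "Example Person",
    "statements": {
        "country of citizenship": ["Canada"],
        "occupation": ["painter"],
    },
}


def values(item, prop):
    """Return every value recorded for a property, or an empty list."""
    return item["statements"].get(prop, [])


def summarize(item):
    """Build a one-line description purely from the structured statements."""
    country = ", ".join(values(item, "country of citizenship")) or "unknown country"
    jobs = ", ".join(values(item, "occupation")) or "unknown occupation"
    return f"{item['label']}: {jobs} ({country})"


print(summarize(item))
```

Any program that reads the record this way will repeat whatever the statements say, which is why a single changed value can surface in places far from where the edit was made.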

Twitter user Stephen Punwasi pointed out that the data Knowledge Graph used to put Omar Khadr among Canadian soldiers appears to have been pulled from Omar Khadr's Wikidata page — and that a "Russian troll" was the one who did it.

So was this the work of a Russian troll?

That doesn't appear to be the case.

The modifications to Khadr's Wikidata page were made by a user named Ghuron. The user appears to be an active contributor of data to the site, and according to their GitHub account, does happen to live in St. Petersburg, Russia.

But Ghuron's activity doesn't appear to be targeted at data related to any particular person, ideology, country, or political topic. Rather, it resembles an automated cleanup job intended to improve the quality of Wikidata at a rate far faster than any one person could do by hand.

According to discussions between Ghuron and other Wikidata members, Ghuron runs a script that uses machine learning to automatically add and modify large volumes of Wikidata entries (for example, a person's occupation). Basically, it's designed to put data into buckets.

His script makes sure that Street Fighter is properly classified as a video game, that the Faroe Islands get lumped in under the larger "islands" category, or that the right Renaissance artists are properly classified as painters.

From time to time, his script also appears to get things wrong, and other Wikidata users haven't been shy about letting him know.

Is that what happened to Khadr?

Yup! Using Khadr's Wikidata page edit history as a guide, here's a brief timeline that goes back even further than Scheer's tweet:

  • On July 26, 2018, Ghuron's script categorizes Omar Khadr's occupation on Wikidata as "soldier" — part of a larger, automated effort to assign occupations to everyone from the Zodiac Killer to Danish priests. 

  • On Sept. 24, 2018, users begin posting to the discussion section of Omar Khadr's Wikipedia page, asking why Google search results for his name describe him as a Canadian soldier. However, the phrase "Canadian soldier" has never appeared on his Wikipedia page, meaning the phrase was likely pulled from his Wikidata page instead. Google has yet to explicitly confirm this.

  • Later that day, the data is removed from Khadr's Wikidata page. A user on the discussion section of Khadr's Wikipedia page wrote: "It was a Google error, associated only with their search engine and was not fed by Wikipedia. Google rectified the error tonight and Omar Khadr is no longer shown as a Canadian soldier."

  • On Sept. 30, 2018, Ghuron's script categorizes Omar Khadr's occupation on Wikidata as "soldier" once again. It's not clear if Google's Knowledge Graph ignored the change this time.

  • Either way, on Dec. 8, 2018, Ghuron's script then categorizes Khadr's military rank on Wikidata as "soldier." That data is just different enough that it likely found its way back into Knowledge Graph results. Before long, people noticed Khadr among the results for "Canadian soldiers" once again.
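The timeline above can be replayed as a short sketch to show why the December edit mattered. The tuple layout and function names are illustrative assumptions, not Wikidata's actual revision format, and Google has not confirmed how Knowledge Graph reads these fields.

```python
from datetime import date

# Edits as described in the timeline above. The (date, action, property,
# value) layout is a hypothetical simplification for illustration only.
EDITS = [
    (date(2018, 7, 26), "add", "occupation", "soldier"),
    (date(2018, 9, 24), "remove", "occupation", "soldier"),
    (date(2018, 9, 30), "add", "occupation", "soldier"),
    (date(2018, 12, 8), "add", "military rank", "soldier"),
]


def apply_edit(statements, action, prop, value):
    """Add or remove one value under one property, in place."""
    current = statements.setdefault(prop, set())
    if action == "add":
        current.add(value)
    else:
        current.discard(value)


def soldier_properties(statements):
    """List the properties that currently hold the value 'soldier'."""
    return sorted(p for p, vals in statements.items() if "soldier" in vals)


def replay(edits):
    """Apply edits in order and record where 'soldier' appears after each."""
    statements = {}
    history = []
    for when, action, prop, value in edits:
        apply_edit(statements, action, prop, value)
        history.append((when.isoformat(), soldier_properties(statements)))
    return history


for day, props in replay(EDITS):
    print(day, props or "no 'soldier' value")
```

Under these assumptions, a fix that only watched the occupation field would miss the same word arriving through the military rank field, which is consistent with the label resurfacing after Dec. 8.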

What did Google do about it?

Danny Sullivan, the closest thing Google has to a search engine ombudsman, replied to Canadaland's Jesse Brown on Twitter.

"We reviewed, and because it was an issue with the Knowledge Graph, we took action there," Sullivan wrote.

It doesn't appear that Khadr's Wikidata page has been changed — just Knowledge Graph's handling of the data it contains.

CBC News has reached out to Google, and will update this story if we hear more.

Is this normal?

As Sullivan also points out on Twitter, Google doesn't modify search results, at least not unless it's compelled to remove information from its index. Rather, Google is changing its Knowledge Graph results.

Whether that distinction is obvious to most users is less clear, given that Knowledge Graph results appear above or beside the regular search results, especially in situations where political tensions run high.

ABOUT THE AUTHOR

Matthew Braga

Senior Technology Reporter

Matthew Braga was the senior technology reporter for CBC News, where he covered stories about how data is collected, used, and shared.