Can AI solve the internet's fake news problem? A fact-checker investigates.
We're in our misinformation predicament partly because of algorithms. Can they also get us out of it?
You may have noticed: It’s a weird time for facts.
On one hand, despite the hand-wringing over our post-truth world, facts do still exist. On the other, it’s getting really hard to dredge them from the sewers of misinformation, propaganda, and fake news.¹ Whether it’s virus-laden painkillers, 3 million illegal votes cast in the 2016 presidential election, or a new children’s toy called My First Vape, phony dispatches are clogging the internet.
Fact-checkers and journalists try their best to surface facts, but there are just too many lies and too few of us. How often the average citizen falls for fake news is unclear. But there are plenty of opportunities for exposure. The Pew Research Center reported last year that more than two-thirds of American adults get news on social media, where misinformation abounds. We also seek it out. In December, political scientists from Princeton University, Dartmouth College, and the University of Exeter reported that 1 in 4 Americans visited a fake news site around the 2016 election, mostly by clicking through from Facebook.
As partisans, pundits, and even governments weaponize information to exploit our regional, gender, and ethnic differences, big tech companies like Facebook, Google, and Twitter are under pressure to push back. Startups and large firms have launched attempts to deploy algorithms and artificial intelligence to fact-check digital news. Build smart software, the thinking goes, and truth has a shot. “In the old days, there was a news media that filtered out the inaccurate and crazy stuff,” says Bill Adair, a journalism professor at Duke University who directs one such effort, the Duke Tech & Check Cooperative. “But now there is no filter. Consumers need new tools to be able to figure out what’s accurate and what’s not.”
With $1.2 million in funding, including $200,000 from the Facebook Journalism Project, the co-op is supporting the development of virtual fact-checking tools. So far, these include ClaimBuster, which scans digital news stories or speech transcripts and checks them against a database of known facts; a talking-point tracker, which flags politicians’ and pundits’ claims; and Truth Goggles, which makes credible information more palatable to biased readers. Many other groups are trying to build similar tools.
As a journalist and fact-checker, I wish the algorithms the best. We sure could use the help. But I’m skeptical. Not because I’m afraid the robots are after my job, but because I know what they’re up against. I wrote the book on fact-checking (no, really, it’s called The Chicago Guide to Fact-Checking²). I also host the podcast Methods, which explores how journalists, scientists, and other professional truth-finders know what they know. From these experiences, I can tell you that truth is complex and squishy. Human brains can recognize context and nuance, which are both key in verifying information. We can spot sarcasm. We know irony. We understand that syntax can shift even while the basic message remains. And sometimes we still get it wrong.³ Can machines even come close?
The media has churned out hopeful coverage about how AI efforts may save us from bogus headlines. But what’s inside those digital brains? How will algorithms do their work? Artificial intelligence, after all, performs best when following strict rules. So yeah, we can teach computers to play chess or Go. But because facts are slippery, Cathy O’Neil, a data scientist and author of Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, is not an AI optimist. “The concept of a fact-checking algorithm, at least at first blush, is to compare a statement to what is known truth,” she says. “Since there’s no artificial algorithmic model for truth, it’s just not going to work.”
That means computer scientists have to build one. So just how are they constructing their army of virtual fact-checkers? What are their models of truth? And how close are we to entrusting their algorithms to cull fake news? To find out, the editors at Popular Science asked me to try out an automated fact-checker, using a piece of fake news, and compare its process to my own. The results were mixed, but maybe not for the reasons you (or at least I) would have thought.
Chengkai Li is a computer scientist at the University of Texas at Arlington. He is the lead researcher for ClaimBuster, which, as of this writing, was the only publicly available AI fact-checking tool (though it was still a work in progress). Starting in late 2014, Li and his team built ClaimBuster more or less along the lines of other automated fact-checkers in development. First, they created an algorithm, a piece of computer code that solves a problem by following a set of rules. They then taught their code to identify a claim (a statement or phrase asserted as truth in a news story or a political speech) by feeding it lots of sentences and telling it which make claims and which don’t. Because Li’s team originally designed their tool to capture political statements, the words they fed it came from 30 or so of the past U.S. presidential debates, totaling roughly 20,000 claims. “We were aiming at the 2016 election,” Li says. “We were thinking we should use ClaimBuster when the presidential candidates debated.”
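To make that training step concrete, here is a minimal sketch of how labeled sentences can teach a program to score new ones. It is not ClaimBuster's code, and the article does not say which model Li's team used; this uses a toy naive Bayes classifier as a stand-in. The training sentences, the function names (`train`, `claim_score`), and the labels are all invented for illustration.

```python
import math
from collections import Counter

# Hypothetical hand-labeled sentences (True = makes a checkable claim).
# These are invented examples, not ClaimBuster's debate data.
TRAINING = [
    ("unemployment fell to 5 percent last year", True),
    ("the senator voted against the bill in 2013", True),
    ("we cut taxes for 3 million families", True),
    ("thank you all for being here tonight", False),
    ("i look forward to a great debate", False),
    ("let me begin by saying good evening", False),
]

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def train(examples):
    """Count word frequencies per label for a naive Bayes model."""
    word_counts = {True: Counter(), False: Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, label_counts, vocab

def claim_score(text, model):
    """Return an estimate of P(claim | text) between 0 and 1."""
    word_counts, label_counts, vocab = model
    log_prob = {}
    for label in (True, False):
        total = sum(word_counts[label].values())
        lp = math.log(label_counts[label] / sum(label_counts.values()))
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            lp += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        log_prob[label] = lp
    # Turn the two log scores into a probability for the "claim" label.
    top = max(log_prob.values())
    odds = {label: math.exp(lp - top) for label, lp in log_prob.items()}
    return odds[True] / (odds[True] + odds[False])

model = train(TRAINING)
print(claim_score("taxes fell 5 percent", model))
print(claim_score("thank you tonight", model))
```

A model like this only knows the words it has seen, which is one reason a detector trained on debate transcripts may stumble on other kinds of writing.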
Next, the team taught the code to compare claims to a set of known facts. Algorithms don’t have an intrinsic feature to identify facts; humans must provide them. We do this by building what I’ll call truth databases. To work, these databases must contain information that is both high-quality and wide-ranging. Li’s team used several thousand fact-checks (articles and blog posts written by professional fact-checkers and journalists, meant to correct the record on dubious claims) pulled from reputable news sites like PolitiFact, Snopes, factcheck.org, and The Washington Post.
I wanted to see if ClaimBuster could detect fake science news from a known peddler of fact-challenged posts: infowars.com.⁴ I asked Li what he thought. He said while the system would be most successful on political stories, it might work. “I think a page from Infowars sounds interesting,” he said. “Why not give it a shot and let us know what you find out?”
To create a fair fight, my editor and I agreed on two rules: I couldn’t pick the fake news on my own, and I couldn’t test the AI until after I had completed my own fact-check. A longtime fact-checker at Popular Science pulled seven spurious science stories from Infowars, from which my editor and I agreed on one with a politicized topic: climate change.
Because Li hadn’t had the budget to update ClaimBuster’s truth database since late 2016, we chose a piece published before then: “Climate Blockbuster: New NASA Data Shows Polar Ice Has Not Receded Since 1979,” from May 2015.
Climate-change deniers and fake-news writers often misrepresent real research to bolster their claims. In checking the report, I relied on facts available only in that period.
To keep it short, we used the first 300 words of the Infowars account.⁵ For the human portion of the experiment, I checked the selection as I would any article: line by line. I identified fact-based statements (essentially every sentence) and searched for supporting or contradictory evidence from primary sources, such as climate scientists and academic journals. I also followed links in the Infowars story to assess their quality and to see whether they supported the arguments. (A sample of my fact-check is here.)
Take, for example, the story’s first sentence: “NASA has updated its data from satellite readings, revealing that the planet’s polar ice caps have not retreated significantly since 1979, when measurements began.” Online, the words “data from satellite readings” had a hyperlink. To take a look at the data the story referenced, I clicked the link, which led to a defunct University of Illinois website, Cryosphere Today. Dead end. I emailed the school. The head of the university’s Department of Atmospheric Sciences gave me the email address for a researcher who had worked on the site: John Walsh, now chief scientist for the International Arctic Research Center in Alaska, whom I later interviewed by phone.
Walsh told me that the “data from satellite readings” wasn’t directly from NASA. Rather, the National Snow and Ice Data Center in Boulder, Colorado, had cleaned up raw NASA satellite data for Arctic sea ice. From there, the University of Illinois analyzed and published it. When I asked Walsh whether that data had revealed that the polar ice caps hadn’t retreated much since 1979, as Infowars claimed, he said: “I can’t reconcile that statement with what the website used to show.”
In addition to talking to Walsh, I used Google Scholar to find relevant scientific literature and landed on a comprehensive paper on global sea-ice trends in the peer-reviewed Journal of Climate, published by the American Meteorological Society and authored by Claire Parkinson, a senior climate scientist at the NASA Goddard Space Flight Center. I interviewed her too. She walked me through how her research compared with the claims in the Infowars story, showing where the latter distorted the data. While it’s true that global sea-ice data collection started in 1979, around when the relevant satellites launched, over time the measurements show a general global trend toward retreat, Parkinson said. The Infowars story also conflated data for Arctic and Antarctic sea ice; although the size of polar sea ice varies from year to year, Arctic sea ice has shown a consistent trend toward shrinking that outpaces the Antarctic’s trend toward growth, bringing the global totals down significantly. The Infowars author, Steve Watson, conflates Arctic, Antarctic, global, yearly, and average data throughout the article, and may have cherry-picked data from an Antarctic boom year to swell his claim.
In other cases, the Infowars piece linked to poor sources, and misquoted them. Take, for example, a sentence that claims Al Gore warned that the Arctic ice cap might disappear by 2014. The sentence linked to a Daily Mail article (not a primary source) that included a quote allegedly from Gore’s 2007 Nobel Prize lecture. But when I read the speech transcript and watched the video on the Nobel Prize website, I found that the newspaper had heavily edited the quote, cutting out caveats and context. As for the rest of the Infowars story, I followed the same process. All but two sentences were wrong or misleading. (An Infowars spokesman said the author declined to comment.)
With my own work done, I was curious to see how ClaimBuster would perform. The site requires two steps to do a fact-check. In the first, I copied and pasted the 300-word excerpt into a box labeled “Enter Your Own Text,” to identify factual claims made in the copy. Within one second, the AI scored each line on a scale of zero to one; the higher the number, the more likely it contains a claim. The scores ranged from 0.16 to 0.78. Li suggested 0.4 as a threshold for a claim worth further inspection. The AI scored 12 out of 16 sentences at or above that mark.
In total, there were 11 check-worthy claims among 12 sentences, all of which I had also identified. But ClaimBuster missed four. For instance, it gave a low score of 0.16 to a sentence that said climate change “is thought to be due to a combination of natural and, to a much lesser extent, human influence.” This sentence is indeed a claim—a false one. Scientific consensus holds that humans are primarily to blame for recent climate change. False negatives like this, which rate a sentence as not worth checking even when it is, could lead a reader to be duped by a lie.
How could ClaimBuster miss this statement when so much has been written about it in the media and academic journals? Li said his AI likely didn’t catch it because the language is vague. “It doesn’t mention any specific people or groups,” he says. Because the sentence had no hard numbers and cited no identifiable people or institutions, there was “nothing to quantify.” Only a human brain can spot the claim without obvious footholds.
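The triage described above, in which each sentence is scored and only those at or above 0.4 are kept, can be sketched in a few lines, along with a tally of the false negatives a human checker would catch. This is an illustration, not ClaimBuster's code: the function names are invented, the sentence texts are paraphrased placeholders, and only the 0.4 threshold and the 0.16 score for the causes-of-warming sentence come from the reporting. The other pairings of scores and sentences are assumptions.

```python
# Threshold Li suggested; everything else in this sketch is illustrative.
CHECK_WORTHY_THRESHOLD = 0.4

def triage(scored_sentences, threshold=CHECK_WORTHY_THRESHOLD):
    """Split (sentence, score) pairs into flagged and skipped sentences."""
    flagged = [s for s, score in scored_sentences if score >= threshold]
    skipped = [s for s, score in scored_sentences if score < threshold]
    return flagged, skipped

def false_negatives(skipped, human_identified):
    """Sentences the scorer skipped but a human checker marked as claims."""
    return [s for s in skipped if s in human_identified]

# Placeholder sentences. The 0.78 and 0.30 pairings are assumed for the demo.
scored = [
    ("ice caps have not retreated since 1979", 0.78),
    ("warming is mostly natural", 0.16),
    ("a transitional sentence", 0.30),
]
human_claims = {
    "ice caps have not retreated since 1979",
    "warming is mostly natural",
}

flagged, skipped = triage(scored)
missed = false_negatives(skipped, human_claims)
print("flagged:", flagged)
print("missed by the scorer:", missed)
```

Lowering the threshold would catch more of the missed claims, at the cost of sending more non-claims on to the second step.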
Next up, I fed each of the 11 identified claims into a second window, which checks against the system’s truth database. In an ideal case, the machine would match the claim to an existing fact-check and flag it as true or false. In reality, it spit out information that was, for the most part, irrelevant.
Take the article’s first sentence, about the retreat of the polar ice caps. ClaimBuster compared the string of words to all sentences in its database. It searched for matches and synonyms or semantic similarities. Then it ranked hits. The best match came from a PolitiFact story, but the topic concerned nuclear negotiations between the U.S. and Iran, not sea ice or climate change. Li said the system was probably latching onto similar words that don’t have much to do with the topic. Both sentences, for example, contain the words “since,” “has,” “not,” as well as similar words such as “updated” and “advanced.” This gets at a basic problem: The program doesn’t yet weigh more-important words over nonspecific words. For example, it couldn’t tell that the Iran story was irrelevant.
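The weighting problem can be shown with a toy comparison. The sketch below is not how ClaimBuster ranks matches, which the reporting does not detail. It contrasts a plain count of shared words with a standard technique called inverse document frequency (IDF), which gives rarer words more weight. The three database sentences and all function names are invented for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase each word and strip surrounding punctuation."""
    return [w.strip('.,:;"').lower() for w in text.split()]

def overlap_score(claim, candidate):
    """Count shared distinct words, treating every word as equally important."""
    return len(set(tokenize(claim)) & set(tokenize(candidate)))

def idf_weights(corpus):
    """Inverse document frequency: words found in fewer sentences weigh more."""
    doc_freq = Counter()
    for sentence in corpus:
        doc_freq.update(set(tokenize(sentence)))
    return {w: math.log(len(corpus) / df) for w, df in doc_freq.items()}

def weighted_score(claim, candidate, weights):
    """Sum the IDF weights of the words the two sentences share."""
    shared = set(tokenize(claim)) & set(tokenize(candidate))
    return sum(weights.get(w, 0.0) for w in shared)

# Invented stand-ins for a claim and a tiny "truth database".
claim = "Polar ice has not retreated since 1979"
database = [
    "Iran has not advanced its program since the talks began",  # off-topic
    "Arctic sea ice has been shrinking for decades",            # on-topic
    "The budget has not been balanced since 2001",              # off-topic
]

weights = idf_weights(database)
best_plain = max(database, key=lambda s: overlap_score(claim, s))
best_weighted = max(database, key=lambda s: weighted_score(claim, s, weights))
print("plain overlap picks:", best_plain)
print("IDF weighting picks:", best_weighted)
```

In this toy case the plain count favors the Iran sentence because it shares “has,” “not,” and “since,” while the weighted version favors the sea-ice sentence because “ice” is rarer than the filler words. Real systems face far messier text, so this shows only the principle.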
When I tried the sentence about Al Gore, the top hit was more promising: Another link from PolitiFact matched to a sentence in a story that read: “Scientists project that the Arctic will be ice-free in the summer of 2013.” Here, the match was more obvious; the sentences shared words, including “Arctic,” and synonyms such as “disappear” and “ice-free.” But when I dug further, it turned out the PolitiFact story was about a 2009 Huffington Post op-ed by then-senator John Kerry, rather than Al Gore in a 2007 Nobel lecture. When I tested the remaining claims in the story, I faced similar problems.
When I reported these results to Li, he wasn’t surprised. The problem was that ClaimBuster’s truth database didn’t contain a report on this specific piece of fake news, or anything similar. Remember, it’s made up of work from human fact-checkers at places including PolitiFact and The Washington Post. Because the system relies so heavily on information supplied by people, he said, the results were “just another point of evidence that human fact-checkers aren’t enough.”
That doesn’t mean AI fact-checking is all bad. On the plus side, ClaimBuster is way faster than I can ever be. I spent six hours on my fact-check. By comparison, the AI took about 11 minutes. Also consider that I knock off at the end of the day. An AI doesn’t sleep. “It’s like a tireless intern who will sit watching TV for 24 hours and have a good eye for what a factual claim is,” Adair says. As Li’s team tests new AI to improve claim scoring and fact-checking, ClaimBuster is bound to improve, as should others. Adair’s cooperative is also using ClaimBuster to scan the claims of pundits and politicians on cable TV, highlighting the most check-worthy utterings and emailing them to human fact-checkers to confirm.
The trick will be getting the accuracy to match that efficiency. After all, we’re in our current predicament, at least in part, because of algorithms. As of late 2017, Google and Facebook had 1.17 billion and 2.07 billion users, respectively.
That enormous audience gives fake-news makers and propagandists incentive to game the algorithms to spread their material—it might be possible to similarly manipulate an automated fact-checker. And Big Tech’s recent attempts to fix their AI haven’t gone very well. For example, in October 2017, after a mass shooting in Las Vegas left 851 injured and 58 dead, users from the message board 4chan were able to promote a fake story misidentifying the shooter on Facebook. And last fall, Google AdWords placed fake-news headlines on both PolitiFact and Snopes.
Even if there were an AI fact-checker that’s immune to errors and gaming, there would be a larger issue with ClaimBuster and projects like it—and with fake news in general. Political operatives and partisan readers often don’t care if an article is intentionally wrong. As long as it supports their agenda—or just makes them snicker—they’ll share it. According to the 2017 Princeton, Dartmouth, and Exeter study, people who consumed fake news also consumed so-called hard news—and politically knowledgeable consumers were actually more likely to look at the fake stuff. In other words, it’s not like readers don’t know the difference. The media should not underestimate their desire to click on such catnip.
One last wrinkle. As companies roll out an army of AI fact-checkers, partisan readers on both sides might view them as just another mode of spin. President Donald Trump has called trusted legacy news outfits such as The New York Times and CNN “fake news.” Infowars, a site he admires, maintains its own list of fake-news sources, which includes The Washington Post. Infowars has also likened the work of fact-checking sites like Snopes and PolitiFact to censorship.
I asked Li whether my one fact-checked story might have an impact, if it would even make its way into the ClaimBuster truth database. “A perfect automatic tool would capture your data and make it part of the repository,” he said.
He added, “Of course, right now, there is no such tool.”
Footnotes:
1. Fake news is an embattled term. It is used to describe news that is intentionally meant to mislead, for political or economic gain, based on false, misinterpreted, or manipulated facts. But partisans also use it to smear reputable legacy media outlets. Here, we’re using the former definition.
2. The book is part of a family of writing guides from the University of Chicago Press. And yes, the facts in it are valid beyond Chicago.
3. A Popular Science fact-checker spent 15 hours verifying the pages of the Intelligence Issue and caught 34 errors before we went to press.
4. Infowars is a media empire and clearinghouse for conspiracies—from the federal government controlling weather to the idea that Glenn Beck is a CIA operative.
5. We made sure that the rest of the story did not provide evidence or context that would affect our fact-check.
This article was originally published in the Spring 2018 Intelligence issue of Popular Science