Numbers Lit! Data Literacy and Fake News

Posted on: March 17, 2021

Dr. Lesley S. J. Farmer, California State University Long Beach, USA and author of the Routledge book, Fake News in Context

“Wisconsin counted more votes than registered voters.”
“Trump’s inaugural crowd was larger than Obama’s.”
“The US has 6 million confirmed COVID-19 cases.”
“The number of children diagnosed with Autism Spectrum Disorder (ASD) has risen over the past ten years; it’s an epidemic!”

Numbers are supposedly objective. After all, they are calculated by universal rules. Yet the title of Mike Nemeth’s book Lies, Damned Lies and Statistics says it all: numerical data can be manipulated to mislead people just as words and images can be manipulated to fool the public.

More than ever, people need to be data literate and check the accuracy – and context – of data that they read and see, especially in the news. Yet people seldom evaluate and select authoritative data sources.

Why is checking data so important? People often share such data online, especially if it confirms their current beliefs (such as a Trump supporter reading the inaugural figures) or is startling (as in the Wisconsin claim above). Because people tend to believe their friends, they may act on this false or misleading information, and the results can be consequential. For instance, if the data about ASD is coupled with data about vaccinations, parents may decide not to have their children vaccinated at all, which may lead to more disease.

An easy way to discern the quality of data is to follow these four steps.

  1. Look through the entire article or news item. Sometimes a headline or picture does not match the rest of the article. For instance, the rest of the article on ASD goes on to say that the rise in ASD diagnoses reflects greater awareness of the condition and more testing of children at an earlier age, which helps children and families address ASD differences.
  2. Look up to find the source. The claim about Wisconsin’s count was posted on Twitter and went viral. Both the Wisconsin Elections Commission and the New York Times checked the figures and wondered where the Tweeter had found the numbers. It turned out that the Tweeter had used 2018 registered voter numbers and had not realized that in 2020 there were more than half a million more registered voters than in 2018.
  3. Look across to see the data from other news sources about the same topic. For instance, the Trump and Obama inaugural crowds were photographed from different angles, and some pictures were cropped in different ways. The most accurate images (aerial and unedited) comparing the two events clearly showed that Obama drew a larger crowd.
  4. Look inward to your own beliefs. For instance, seeing that there were six million COVID-19 cases, and reading that fewer than five percent of them needed hospitalization, might be interpreted as a reason not to follow masking and social distancing practices. After all, 5 percent of 6 million is only 300,000, and the US population is over 300 million: less than 1 in 1,000. Add the fact that up to seventy percent of adults aged 85 or older were hospitalized, and these figures can seem reassuring, especially if you are a young adult who thinks the pandemic is hype and wants to do as you please. People who interpret the data this way might not read additional articles that give more context, such as more recent data, the fact that many people do not disclose their illness or choose not to be hospitalized, and the fact that the elderly are likely to be in crowded nursing homes.
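
The arithmetic in step 4 can be checked in a few lines. This is a minimal sketch in Python that uses only the rounded figures quoted above. The 5 percent figure is treated as an upper bound and the 300 million population as a lower bound, and the variable names are illustrative.

```python
# Rounded figures quoted in step 4; all are approximations.
confirmed_cases = 6_000_000
hospitalization_percent = 5       # "fewer than five percent", used as an upper bound
us_population = 300_000_000       # "over 300 million", used as a lower bound

# Integer arithmetic keeps the result exact.
hospitalized = confirmed_cases * hospitalization_percent // 100
per_thousand = hospitalized * 1000 / us_population

print(f"Hospitalized (upper bound): {hospitalized:,}")
print(f"Per 1,000 residents (upper bound): {per_thousand:.2f}")
```

With these bounds the result comes to at most about one hospitalization per 1,000 residents, which agrees with the claim. The calculation is sound; what can mislead is the context it leaves out.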

There are lots of ways that data can be skewed. Here are some tips to help you figure out the validity and accuracy of the data.

  • The sample may include fewer than 50 people. It is usually misleading to generalize from so few. Check the size of the group being measured.
  • The group that the data is based on may be skewed or “cherry-picked.” For instance, surveying only Bernie Sanders campaigners in Vermont does not represent all Democratic voters, let alone all American voters. Check the demographics of the population being measured.
  • The unit of measurement can skew results. For instance, presidential election results reported by who won the majority of votes in each state give a much different picture from results reported by who won the majority in each county.
  • Average isn’t always average. For instance, the average (mean) household income is higher than the median income (the midpoint of all household incomes) because the billionaires in the top 1 percent pull the average upward.
  • Data totals may mislead the reader if percentages are not accounted for. For instance, California’s COVID-19 cases constitute a tenth of US COVID-19 cases. It must be dangerous to live in that state, right? First of all, California’s population makes up twelve percent of the US population, so percentage-wise, California doesn’t look so bad. Totals of COVID-related deaths can be just as misleading. The number of such deaths in California is, again, about one-tenth of the US total. If, on the other hand, you look at the number of deaths per 100,000 people (which is more appropriate because it adjusts for population size), California’s death rate is 130 per 100,000, which puts the state in the bottom (better) half of states in terms of proportional death rates (New Jersey’s death rate is twice as high as California’s).
  • Graphs of data can also mislead the viewer. The slices of a pie chart might not add up to 100 percent. Cutting off the bottom of a chart can inflate the appearance of difference. Using 2D figures instead of a single line can exaggerate the difference. Always check that the type of graph is appropriate, and look at the scale or unit of measurement, the starting point of the graph (it should be zero), and the graph labels.
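
Two of the tips above, the gap between mean and median and the difference between totals and rates, are easy to see with a small calculation. The following Python sketch uses made-up household incomes and two hypothetical states. None of the numbers are real data, and the helper name rate_per_100k is illustrative.

```python
from statistics import mean, median

# Made-up household incomes: nine ordinary households and one billionaire.
incomes = [40_000, 45_000, 50_000, 55_000, 60_000,
           65_000, 70_000, 75_000, 80_000, 1_000_000_000]

print(f"Mean income:   {mean(incomes):,.0f}")
print(f"Median income: {median(incomes):,.0f}")


def rate_per_100k(count, population):
    """Return the number of events per 100,000 people."""
    return count / population * 100_000


# Hypothetical states mapped to (deaths, population). Not real figures.
states = {
    "Large state": (50_000, 40_000_000),
    "Small state": (20_000, 9_000_000),
}

for name, (deaths, population) in states.items():
    rate = rate_per_100k(deaths, population)
    print(f"{name}: {deaths:,} deaths, {rate:.0f} per 100,000")
```

A single billionaire drags the mean far above what any of the other nine households earn, while the median stays in the middle of the ordinary incomes. Likewise, the large state reports more deaths in total but has the lower rate per 100,000 people.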

Doing all this math is valuable, and certainly useful for students, but it may be daunting and time-consuming. At the least, you can use a data fact-checker to give you some confidence in the data you encounter and, over time, to help you identify reliable sources of data news. Some good data fact-checking tools are https://www.washingtonpost.com/news/fact-checker/, https://www.wolframalpha.com/, https://factcheck.afp.com/, and Factcheck.org.

In any case, the more data literate you are, the less likely you are to be fooled by falsified or misleading data. You can make better informed decisions with more confidence, which helps not only you but also your family and community at large.