The phrase “data journalism” gives roughly equal weight to the terms “data” and “journalism,” implying that they’re both quite important to the practice.
But why do journalists need data? And why do data need journalists?
Let’s start with the first question: Why do journalists need data?
To answer that question, it is helpful to recognize that the typical journalistic story is anecdotal. It relies on a single account or a few accounts to describe an incident and, sometimes, an issue. That can often make a problem seem much bigger or smaller than it actually is.
For example, one could hear a compelling story about the tragic deaths of members of a youth soccer team in a plane crash and get the sense that air travel is dangerous, even though commercial air travel is, statistically, among the safest modes of travel. Similarly, one might see a story about a child getting shot and believe it to be an isolated incident, even though hundreds of children are injured or killed by guns each year.
In such cases, data can be used to supplement the anecdote with robust evidence by allowing the journalist to point to multiple data points, all collected in some systematic fashion, to help contextualize the issue.
Typical journalism is also episodic. For example, a story will often focus on a specific incident, such as a house fire that claims the lives of those living there. Data can help the journalist move from describing the episode to tackling a broader issue, such as fire codes and preparedness, by focusing on the prevalence of the issue and the groups that are most affected. The incident thus becomes an example that illustrates a broader issue.
Now that we have established that journalists can benefit from working with data, we can move on to the second question: Why do data need journalists?
The main reason is quite simple: data can be an absolute mess. For example, consider the following spreadsheet:
That’s a lot of information. There is so much, in fact, that it becomes very difficult for a person to process that information and make sense of it. Chances are, you’re not seeing many patterns in that dataset just by looking at it. The information overload problem becomes especially acute when you’re talking about thousands of data points (or more!).
Now, consider this representation of those same data:
It is much easier for a person to make sense of this representation of the data. This isn’t just because the data have been transformed into a visual. It’s because a journalist has analyzed those data, found the most interesting insights, and presented those insights in a manner that is easy for a person to digest.
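To make that analysis step a little more concrete, here is a minimal Python sketch of how a journalist might boil raw rows down to a summary. Everything in it is an assumption made for illustration: the fire-incident records are invented, and the field names and the `count_by` helper are hypothetical rather than drawn from any real dataset or tool.

```python
from collections import Counter

# Hypothetical incident records, invented purely for illustration.
# A real dataset would have far more rows and columns.
records = [
    {"year": 2021, "cause": "electrical"},
    {"year": 2021, "cause": "cooking"},
    {"year": 2022, "cause": "electrical"},
    {"year": 2022, "cause": "heating"},
    {"year": 2023, "cause": "cooking"},
    {"year": 2023, "cause": "electrical"},
]


def count_by(rows, field):
    """Count how many rows share each value of `field`."""
    return Counter(row[field] for row in rows)


# Tally incidents by cause, then list the causes from most to least common.
by_cause = count_by(records, "cause")
for cause, total in by_cause.most_common():
    print(f"{cause}: {total}")
```

Even this small tally moves from individual incidents toward a pattern (which cause appears most often), and a pattern like that is what a journalist would then check, contextualize, and present to readers.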
So, to directly answer our question, data need journalists in order to be more easily understood by a general audience. Journalists have a sense of what is important to their audiences and they can thus prioritize information, separating the interesting from the mundane. They also have a sense of their audiences’ knowledge levels and can thus help them make sense of the information by distilling it into a simpler form, such as by breaking down otherwise confusing advanced statistical analyses. Finally, they can help people care about the data by integrating the anecdotes in a way that connects data to core human emotions.
In short, data can add value to journalism and journalism can add value to data. When done properly, they make a formidable partnership for empirically and rigorously addressing questions that matter to a community in a compelling way.