‘The joke is on us if we cannot read data’

Award-winning investigative reporter and database editor Brant Houston talks about how data journalism builds credibility

April 13, 2017 01:09 am | Updated 07:26 am IST - Mumbai

Necessary skills: Brant Houston says that to be a good journalist today, one needs training in how to research on the web and social media, and how to find and use documents.

Brant Houston, 63, is the Knight Chair in Investigative Reporting at the University of Illinois. An award-winning investigative reporter and database editor at various U.S. newspapers, he is a co-founder of the Global Investigative Journalism Network and the Investigative News Network, and the author of Computer-Assisted Reporting: A Practical Guide. In India to conduct a hands-on data journalism workshop organised by the Energy Policy Institute at the University of Chicago (EPIC)-India and the data journalism initiative IndiaSpend, he explains why data journalism is not restricted to nerds and how it builds the credibility of journalists in the era of fake news.

How do you define data journalism?

Basic data analysis is going through documents with columns and rows. Now, it has begun to include technical skills like web scraping and setting up interactivity with databases. Data journalism is both a basis for reporting and an extension of it, which allows you to give a story better context and depth.

How has data journalism impacted the way we do journalism, especially investigations?

Data journalism has increased our credibility. When we rely only on anecdotes, we are only looking at the trees. Data allows us to look at the larger picture, the forest. And this has allowed journalists to go beyond the social sciences and be a bit more organised and scientific in their approach. It has helped many journalists overcome their math phobia, because numbers and spreadsheets can be overwhelming. But the joke is on us if we are unable to read them; how else could we then look at budgets and inflation and report on them?

Good journalism is needed now more than ever before, especially in the face of “fake news”. So we need the skills to ensure more credibility. The digital platform allows us to save a lot of words by visualising data, but the strength lies in the data that has been mined and revealed.

But working with data seems like a thing only nerds do.

Way back in 1999, the Columbia Journalism Review published a cover story titled “We are all nerds now”. On the one hand, our newsrooms may not be even close to that. But on the other hand, the whole world today has gone nerdy because data is everywhere.

When I was once struggling with finding and sorting some data, a senior colleague told me that if I did not sweat it out a little, it would make no sense to receive an explanation from him on how to do it, because I could forget it. So it takes time to figure things out, but it is worth the effort. Even now, I try to learn a new piece of software at least once a year.

How do you find data that may not be easily available?

Everyone thinks that there is a lot of transparency with data in the U.S., but the challenge is finding data at the level of smaller cities and rural areas, as in some other countries in the world. But if there is no data on a given subject, then policies are probably being made without substantial information, deliberation and transparency, and that’s a story. It also paves the way for pushing to create data on the same subject.

Does data journalism take away from the focus of narrative journalism, where a journalist attempts to make a personal connection?

I always felt I was developing the skills to do a much better job with documents. And I never stopped interviewing people. I would look for documents, get the information and then get out in the field. Getting a lot of documents and data allows us to decide whom we should be interviewing, as opposed to talking to anyone we run into, or a spokesperson. Just because documents are online now and the word data is used, people treat it differently.

What is required of a newsroom to make data journalism a part of their work?

I have been doing data journalism for three decades and it is sad that it is still not an integrated part of newsrooms, and this is the case around the world. But to be a good journalist today, one needs training in how to research on the web and social media, and how to find and use documents. Apart from the basic spreadsheet, a journalist should be able to make maps, have a better understanding of statistics, and understand social networks.

So newsrooms ought to be filled with those who can do minimum reporting, those who can compare large data and match different files, and those who can scrape data, clean it and code.

But wouldn’t that overlook the classic journalism skills?

We still need all those skills, but a journalist who knows how to code would be able to get some data off the web much more quickly and manage it efficiently. My vision is that coding will be a second-nature part of a journalist’s bag of skills, like knowing how to drive a car.

Where do you foresee data journalism in India?

India has carved out a great reputation for technology and data, and even though I don’t know if coding is taught in schools, there is a comfort with technology and excitement about it. IndiaSpend has an intense focus on getting and looking at data, and some newsrooms created interesting visualisations for the elections. The other interesting thing in India is the easy acceptance of social media tools for mining data.

Could you illustrate a data journalism project you worked on that is close to your heart?

I was building my own database of unsolved murders of women in Connecticut over a period of five years, when I was working with the Hartford Courant in the early ’90s. Since we did not have a good database of our own articles, I went to our library of clips and made a spreadsheet out of the reports of these murders. Tracking where the women were from and where their bodies were found, I saw a pattern. I got a large set of data from the FBI called Supplementary Homicide Report, and upon cross-referencing, the patterns revealed three serial killers operating in three different parts of the state. This eventually [led to] a task force that tracked the killers.

We profiled five of the women who were murdered. They had been involved in either drugs or prostitution, and were hence thought of as less. But after my reportage, their families called to say that nobody had cared for those missing women. This showed [to] me that we did not have to work with huge amounts of data to create impact.

What makes good data journalism

- Finding meaning in data which could affect people’s lives, like guiding policies. The best data journalism connects us to “why we care”

- Checking the integrity of the data, and finding its flaws, if any

- Revealing the methods of finding the data to establish transparency, accuracy and integrity, and thereby its purpose

The tools

- Using Boolean Logic, which some say is a life skill. It is about using three words: “and”, “or”, “not” in search engines. This narrows down the search.

- Advanced Google search allows you to choose the domain name and the type of data (PDFs, spreadsheets, etc).

- I use spreadsheets all the time, and Google Fusion Tables to manage my data

- Esri is a good source for maps and making them
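
As a rough illustration of the Boolean logic mentioned in the first tool above, the following Python sketch filters a list of headlines using “and”, “or” and “not” conditions. It is not a tool Houston describes: the function name `matches`, its parameters and the sample headlines are all invented for illustration, and search engines apply this logic in their own ways.

```python
def matches(text, all_of=(), any_of=(), none_of=()):
    """Return True if text satisfies the AND / OR / NOT conditions."""
    lowered = text.lower()
    # AND: every term in all_of must appear.
    has_all = all(term.lower() in lowered for term in all_of)
    # OR: at least one term in any_of must appear (skipped if the list is empty).
    has_any = not any_of or any(term.lower() in lowered for term in any_of)
    # NOT: no term in none_of may appear.
    has_none = not any(term.lower() in lowered for term in none_of)
    return has_all and has_any and has_none


# Hypothetical headlines, invented purely for illustration.
headlines = [
    "City budget shows rise in road spending",
    "State budget cuts school funding",
    "City council debates school funding",
    "City budget draft leaked, officials deny",
]

# Equivalent to the search: budget AND (city OR state) NOT leaked
results = [
    h for h in headlines
    if matches(h, all_of=["budget"], any_of=["city", "state"], none_of=["leaked"])
]

for headline in results:
    print(headline)
```

Each added condition narrows the result set, which is the point of the technique: the “and” term removes the headline that is not about a budget, and the “not” term removes the one about the leak.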

World class data journalism

- Internationally, the Panama Papers showed people the power of data and its structures. I’m not sure if such an amount of data could have been handled 10-15 years ago. The Panama Papers had tremendous impact in showing the flow of illegal money.

- The Atlanta Journal-Constitution investigated doctors across the U.S. who had been disciplined for sexually abusing their patients and had returned to the profession. It required scraping across 50 states, and many interviews.

- The continuing body of work by the Organised Crime and Corruption Reporting Project (OCCRP), which uses data tools so that people understand organised crime. They are based in Eastern Europe, but the reporting takes them everywhere.

- Korea Center for Investigative Journalism (KCIJ) reported on the manipulation of Internet public opinion by South Korea’s spy agency before the country’s 2012 presidential election. Some 600 Twitter accounts used by the spy agency were detected.
