American Culture

The new transparency: Newspapers mine public data, and not everyone’s happy about it

Better get used to it, people. As governments increasingly place public information online, news organizations are going to demand access to it and print it — but not always with appropriate context. That must change.

Among the leaders of the data-mining charge appears to be media conglomerate Gannett Co. Inc., owner of 82 U.S. daily newspapers, including USA Today, and 23 television stations. You’ll recall that Gannett-owned The Journal News published an interactive map of addresses of gun-permit holders in the New York state counties of Westchester, Rockland, and Putnam.

The News has been roundly criticized for that act. But there are reasons for criticism beyond the rabid fear-mongering.

The News has a First Amendment right to print public information (lawyers would argue some limits do apply). But any newspaper printing public information, especially when unpopular, has the responsibility to carefully and intelligently construct context for those data. Simply printing that these households have a gun permit does not do a damn thing to advance a public debate about guns, deaths related to guns, and the Second Amendment.

In Wisconsin, a Gannett Wisconsin Media investigative team has begun a continuing series: “What We Pay: Your Tax Dollars and the Salaries They Support.” The series is published in Gannett’s Wisconsin papers.

In week one, the team examined salary information at Wisconsin’s public universities. The stories identify salary disparities among different disciplines (with business and finance profs earning the top salaries). They also show university football coaches and athletics administrators made million-dollar salaries — with one assistant football coach making more than half a million. Shocking, isn’t it, but hardly surprising.

The stories explain why disciplines differ in what salaries they can demand: market value and social value, providing some context for the raw data. But more context — deeper, more instructive context in stories heavily invested with face-to-face reporting — would serve readers better.

In week two, Gannett published the salaries of public school teachers, employees, and administrators in more than 400 Wisconsin school districts. Those are the salary records of more than 250,000 people.

Begin the obvious criticism: If you earn a paycheck, it’s likely you wish to keep that after-taxes figure to yourself. In America, it’s a cultural faux pas to ask someone, “So how much do you make?”

That criticism cannot offset the fact that tax dollars pay public school salaries. That’s public information. Nowadays, salary data exists in easily mined databases. Don’t want the public to know your salary? Consider a career in which taxpayers don’t pay your wages.

(But even then, your salary is in a database somewhere. As a fellow Scrogue points out, websites like Glassdoor provide ranges of salaries for various occupations and positions. Publicly held companies list salaries of senior positions. Like it or not, your salary is at risk of becoming public knowledge.)

The Gannett series ranks district salaries by report card scores, enrollment, average teacher salary, and district administrator salary. The series notes that “salaries vary widely across Wisconsin’s public school system, even among districts of similar size and location.”

In the weeks to come, Gannett plans to publish other city, county, and state public-employee salaries.

Get used to this. It will happen more often. But expect, in fact demand, that news organizations supply the essential qualitative reporting that gives these quantitative data their context. This Wisconsin series is insufficient in that regard. Yes, various explanations are offered for the discrepancies and patterns found, but they are not rich in nuance and lack well-reported depth.

Gannett is a national news organization. If it plans to increase its data-mining activities, it should scale them up — how do Wisconsin public teacher salaries compare with those in other states in which Gannett has a presence? What patterns can it discern? What explanations can it uncover?

News organizations should tell the human stories inherent in data. The Journal News did not do that with its gun-permit map. Gannett’s public salary series only scratches at the surface of what readers need to participate in a much wider debate on how society places value on the people it pays to serve it.

Journalism schools should take notice as well. If they’re teaching data-mining techniques, they should be damn sure first-rate data reporting and journalism ethics courses are required as well.

10 replies »

  1. I don’t know. Can I make the case that this is a valid journalistic function serving the community? Sure. I work in marketing. You’d be amazed what I can make a case for. But that isn’t what this feels like. It feels like another corporate attack on public education and hence a despicable violation of the public trust.

    I can’t read minds and I haven’t read the series. But the last couple of decades have programmed me to question things in a certain way.

    Meanwhile, here’s looking forward to the debate that ensues when Anonymous and/or Wikileaks are able to provide us with the salaries of Gannett executives.

  2. As a former state employee in Wisconsin, I can say this is nothing new. Every year that I worked there (the previous 5 years) the Milwaukee Journal Sentinel published a database of all state employee wages, including overtime (except the court system, and university, I think).

    I’m not going to argue about whether or not it’s necessary, or right, or appropriate. It is what it is. However, I will say that the first time after that that someone snottily asked during a phone call “just how much do you make, anyways?” and I told them…well, they shut up fairly shortly afterwards.

  3. Sam, these corporate attacks: are they mandated from above or made by the journalists?

    • You tell me. I’d love to know what the thinking is here. Like I say, I can see how something like this might be of legitimate community value, were there some sense that there was wastefulness or corruption. But publishing public teacher salaries in the absence of some kind of journalistic purpose?

      We live in an era marked by relentless, ideologically driven partisan attacks on government in general, and on teachers in particular. I have a good friend who’s a former Gannett employee (won a Pulitzer with them, in fact) and nothing he ever had to say would lead me to think that the company’s leadership was … how to put this? … ethically unimpeachable. So I see this and I wonder.

    • Although, to your specific implication, when you get a series like this that spans print and broadcast across the country, that is by definition corporate driven. You didn’t have editors and reporters at all outlets simultaneously and spontaneously cooking up the same idea.

  4. I was talking with my wife a little about this issue this morning and she raised some questions that might have some bearing, at least where this data mining involves teachers’ salaries.

    Does the data released include just base salary, or salary plus benefits? How about what teachers get paid over base salary for covering a class for another teacher who has to go home sick in the middle of the day? Or summer school pay?

    How does the data handle differences between two people of equal experience (years worked as a teacher) but different educational levels? And we’re not talking just BA/MA/PhD, but also the extra credits that every teacher has to take to maintain his/her teaching license and that are probably hidden from view due to privacy laws? And is there enough metadata released into the public domain to understand why two teachers with identical years of experience and credits might still be paid differently (years of experience in one district do not always transfer to years of experience in another – there’s usually a cap)?

    This and more are things that are going to be desperately needed to properly understand the public data without drawing completely erroneous conclusions. Without this information you couldn’t possibly analyze whether female teachers hit a “glass ceiling” that male teachers do not, never mind understanding whether the ceiling is due to gender discrimination or because of losses in experience due to pregnancy and child-rearing.

    In today’s era of journalistic cuts, I’m not exactly hopeful that reporters will be given the time they need to analyze the data they’re reporting in sufficient detail to get it right. And if looking at the abuses of climate data has taught me anything, data without proper analysis is useless at best and a propaganda tool at worst.

    • How about this: Is this bastion of concerned journalism using data that reflects how much money these teachers have to spend out of their own pockets on supplies?