On the Use of Data

Derek Thompson posts on The Atlantic about a study from The Hamilton Project under the title “The Hollowing Out of the Middle Class.” This one bears some digging into, which I promise to do. But Thompson has focused in on a few graphs in the appendix of a 50-page report.

Let’s be honest: not all data are created equal, and most authors use them to support a position. Cherry-picking the charts and graphs that support your belief system is fairly common. There’s a technical term for zeroing in on the information that supports what you already believe: confirmation bias. Is that at work here? Could be. Appendices, like footnotes in corporate annual reports, are often where telling details and outright contradictions live.

In this case the position seems to be the questionable-at-best assumption, held by the Gates/Obama crowd, that the answer to all of life’s problems is a college education. Oh yes, the graphs clearly show that. But to my thinking there are two problems that Thompson doesn’t address but the full report might.

One, we cannot just accept as a given that the kinds of solid, middle-class jobs for high school graduates that my dad held disappeared due to some natural force like erosion. The economy, as anyone who took Econ 101 should know, is the aggregated result of individual choices. The structural changes in the economy between 1970 and 2008 were not a given, or Japan and Germany would face exactly the same issues. Rather, those changes are the result of choices made by people, uncoordinated as they may be. So the argument that the economy requires more college-educated people because of the decisions of college-educated people is somewhat tautological.

The second issue is what’s in the underlying data set. A quick glance at the full report results in a mathematical assault of logarithmic scaling and price deflators. I understand why they do all of this: to normalize data drawn over time, account for inflation, and put disparate rates on an equal footing. The problem is the distribution and composition of the underlying data.
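For anyone who wants to see what that normalization amounts to, here is a minimal sketch. The earnings figures and the price index are made up for illustration; they are not taken from the Hamilton Project report, and the report's actual method may differ.

```python
import math

# Hypothetical nominal annual earnings and a hypothetical price index
# (base year 2008 = 100). Illustrative numbers only, not the report's data.
nominal = {1970: 8_000, 1990: 25_000, 2008: 45_000}
price_index = {1970: 18.0, 1990: 60.0, 2008: 100.0}


def to_real(year, base_year=2008):
    """Restate a year's nominal earnings in base-year dollars."""
    return nominal[year] * price_index[base_year] / price_index[year]


# Deflate every year to 2008 dollars so the values are comparable.
real = {year: to_real(year) for year in nominal}

# Take natural logs: differences between log values approximate
# percentage changes when the changes are small.
log_real = {year: math.log(value) for year, value in real.items()}

for year in sorted(real):
    print(year, round(real[year]), round(log_real[year], 3))
```

None of this says anything about who is in the sample, which is the real question.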

I promise to do some more digging on this front. But for now, what is unclear is whether CEO and senior executive pay, which increased by a factor of 100 in the time period covered, is included, and whether mean dollar values are used. Large values distort means, and it is unclear whether that is happening here.
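To make the point about means concrete, here is a small sketch with invented numbers: nine workers at the same wage and one executive. The figures are hypothetical and have nothing to do with the report's data set.

```python
from statistics import mean, median

# Hypothetical annual earnings: nine typical workers plus one executive.
# Invented numbers for illustration only.
wages = [40_000] * 9 + [10_000_000]

avg = mean(wages)    # pulled far upward by the single large value
mid = median(wages)  # stays at the typical worker's wage

print(f"mean:   {avg:,.0f}")
print(f"median: {mid:,.0f}")
```

One outlier is enough to make the "average" worker look like a millionaire, which is why it matters whether the report uses means or medians.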

In the meantime, though, it’s like Rod Stewart said, every picture tells a story.


