In January, I showed that Wolfers' first analysis was spurious and was a good example of torquing data to fit a narrative. Then I wrote:
Sure, one can spin a compelling narrative and find some numbers that seem to support that narrative. As the saying goes, numbers which are sufficiently tortured using statistical methods will ultimately confess. As a corollary I might add that pretty much any narrative one cares to spin can be supported by some plausible or plausible-sounding data. But I'm also pretty sure that is not how "data journalism" is supposed to work.
As far as narratives go, the ASU Curtain of Distraction is a great one. Too bad the numbers don't follow along.
I asked Wolfers several times, via email and Twitter, for his data, and he ignored me. Ignoring data requests is, I am afraid, pretty common in academia, though journals have cracked down on the practice in peer-reviewed settings. Academics aspiring to be data journalists should continue to meet professional standards, even when publishing in the New York Times. But I digress.
Wolfers is back with a second effort to fit the data to the narrative. In the NYT today he writes of a look across college basketball:
We found that distraction works. On average, college basketball players are about one percentage point less likely to make a free throw when in front of a hostile crowd than when at home. They are not less accurate in neutral arenas, which suggests that what matters is not whether a player is in a familiar arena, or whether they have had to travel, but whether they are trying to shoot in front of an organized student section hellbent on distracting them.

The home court advantage in basketball is very well known, so the overall conclusion of a small home court advantage in free throw percentages confirms what we already know.
But then Wolfers says this:
Our analysis reveals that there appear to be some fan sections that are particularly effective. The best remain the Arizona State fans, at least since their introduction of the Curtain of Distraction. Teams playing in front of the curtain shoot about nine percentage points worse than they do at home. A handful of other teams, including Northwestern, Baylor, Utah, Nebraska and U.C.L.A., are also blessed with effective fans, costing visitors about one point per game on average.

You may not have noticed what Wolfers has done with some statistical sleight of hand, so let me explain. He looks at data for all teams over 5 years, but at ASU over just 2 years. You have to follow a few asterisks to the fine print to discover this; he writes at the bottom: "That said, this [ASU] data sample is significantly smaller than those of other colleges, which grouped five seasons into a single rating." What!?
It is not even worth looking at this analysis unless it compares apples to apples: either compare all teams over the past 2 years, or all teams over the past 5 years. It is fairly obvious to any quant that a short-term dataset of this type will have much higher variability than a longer-term one. Wolfers does present the data for ASU over 5 years, and no such relationship exists. He does not present the data for all teams over 2 years. I can guess what it might show.
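The point about variability is easy to quantify. Here is a minimal sketch using entirely hypothetical numbers (a uniform 70% true free-throw rate and roughly 500 visitor attempts per arena per season, across about 350 Division I programs; none of these figures come from Wolfers' data): even if no arena had any real effect at all, the "worst" arena in a 2-season sample would look noticeably scarier than the "worst" arena in a 5-season sample, purely from sampling noise.

```python
import math
import random

random.seed(0)

P_TRUE = 0.70              # assumed true away free-throw rate, identical everywhere
ATTEMPTS_PER_SEASON = 500  # hypothetical visitor attempts per arena per season
N_ARENAS = 350             # roughly the number of Division I programs

def std_error(seasons):
    """Standard error of an observed free-throw percentage over `seasons` seasons."""
    n = seasons * ATTEMPTS_PER_SEASON
    return math.sqrt(P_TRUE * (1 - P_TRUE) / n)

def worst_arena_gap(seasons):
    """Largest shortfall below P_TRUE across N_ARENAS arenas when every arena's
    true effect is zero -- i.e., the scariest-looking arena is pure noise."""
    n = seasons * ATTEMPTS_PER_SEASON
    worst = min(
        sum(random.random() < P_TRUE for _ in range(n)) / n
        for _ in range(N_ARENAS)
    )
    return P_TRUE - worst

print(f"Std. error of observed pct, 2 seasons: {std_error(2):.4f}")  # ~0.0145
print(f"Std. error of observed pct, 5 seasons: {std_error(5):.4f}")  # ~0.0092
print(f"Worst 'hostile arena' gap, 2 seasons: {worst_arena_gap(2):.3f}")
print(f"Worst 'hostile arena' gap, 5 seasons: {worst_arena_gap(5):.3f}")
```

The standard error of a 2-season sample is sqrt(5/2), or about 1.6 times, larger than that of a 5-season sample, so the extreme observations in the short window are systematically more extreme. That is exactly why mixing a 2-year ASU figure into a table of 5-year figures is not apples to apples.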
Given that the ASU numbers have already been shown to be spurious, mixing them improperly into another dataset does not change that fact. Stuff like this gives data journalism a bad name.