Thursday, May 21, 2015

A Problem with 538's NBA ELO Time Series

UPDATE 22 May: FiveThirtyEight doubles down on this flawed methodology with a follow-up post that makes the erroneous thinking inescapable. Not good.

The ESPN website FiveThirtyEight (disclosure: I wrote a handful of pieces for them in 2014) has just put up a time series of ELO ratings for NBA teams, including franchises that are now defunct. The effort is notable but deeply problematic. Consider that the ELO ratings produce the following:

1949 Chicago Stags: 1577 (pictured above)
2015 New York Knicks: 1256

Raise your hand if you think that the 1949 Stags (a fine team no doubt) would be favored to beat the 2015 New York Knicks!
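
To put a number on that, here is a minimal sketch of what those two ratings would imply if taken at face value. It assumes the standard Elo logistic curve with a 400-point scale and ignores home-court advantage; the function name is my own, and 538's exact variant may differ in its details.

```python
def elo_win_probability(rating_a, rating_b, scale=400.0):
    """Probability that team A beats team B under the standard Elo curve.

    The 400-point scale is the conventional Elo choice, assumed here.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings as quoted from the 538 time series.
stags_1949 = 1577
knicks_2015 = 1256

p = elo_win_probability(stags_1949, knicks_2015)
print(f"Implied chance the 1949 Stags beat the 2015 Knicks: {p:.1%}")
```

Read literally, a 321-point gap makes the Stags roughly an 86 percent favorite, which is exactly the kind of cross-era comparison the ratings cannot support.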

The problem is that ELO is constrained to an average value of 1500 over time, thus removing the signal of any trends, variability or non-stationarity in the league performance.
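
A toy simulation makes the point concrete. This is a sketch under stated assumptions, not 538's model: it uses a plain zero-sum Elo update with an illustrative K of 20, a hypothetical ten-team league, and a made-up "true skill" that improves by the same amount for every team each season. Whatever the winner gains the loser gives up, so the league average cannot move.

```python
import random

K = 20.0  # update step size; an assumed, illustrative value


def expected_score(rating_a, rating_b):
    """Standard Elo expected score for team A against team B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def simulate(n_teams=10, n_seasons=50, improvement=5.0, seed=0):
    """Play round-robin seasons while every team's true skill rises."""
    rng = random.Random(seed)
    ratings = [1500.0] * n_teams
    # Hypothetical true strength of each team, in points per game.
    skill = [rng.gauss(0.0, 3.0) for _ in range(n_teams)]
    league_means = []
    for _ in range(n_seasons):
        for a in range(n_teams):
            for b in range(a + 1, n_teams):
                # The outcome depends only on the skill difference plus noise.
                margin = skill[a] - skill[b] + rng.gauss(0.0, 10.0)
                score_a = 1.0 if margin > 0 else 0.0
                delta = K * (score_a - expected_score(ratings[a], ratings[b]))
                ratings[a] += delta  # the winner's gain ...
                ratings[b] -= delta  # ... is exactly the loser's loss
        league_means.append(sum(ratings) / n_teams)
        # The whole league gets better; the ratings never see it.
        skill = [s + improvement for s in skill]
    return ratings, league_means


final_ratings, league_means = simulate()
print(f"League mean after season 1:  {league_means[0]:.2f}")
print(f"League mean after season 50: {league_means[-1]:.2f}")
```

In this toy league every team is far stronger in season 50 than in season 1, yet the average rating sits at 1500 throughout, because only skill differences enter the update.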

ELO is a tool for contemporary comparison, and is not well suited for presentation as a time series because the baseline changes over time. I'd guess that over the long term the baseline has risen, so that teams are better, on average, than those of the past. But certainly there is variation in the overall competitive level of the league, which ELO does not capture.

So when 538 says that the 1996 Chicago Bulls were "the best ever there ever was," what this really means is that they were relatively the best team ever, in comparison to the competition that they faced in 1996. That is a bit different than saying they are "the best that ever was." Producing a time series of ELO is thus fundamentally flawed.

In other words, an average ELO team in 2015 at 1500 is not at all comparable to an average ELO team at 1500 in 1949. The 2015 New York Knicks, as awful as they were in 2015, would probably beat the 1949 Chicago Stags by about 100 points.

Thus, rather than reporting absolute ELO scores, 538 might have reported anomalies from the long-term average to present a team's relative ranking, removing the need to consider any trend. Not as cool as a long-term ranking, but more accurate.
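
As a rough sketch of what that presentation could look like, the snippet below restates the two ratings quoted earlier as departures from the 1500 average. The function name and the choice of 1500 as the baseline are my assumptions; a more careful version might subtract each season's own league mean instead.

```python
LONG_TERM_AVERAGE = 1500  # assumed baseline, the average ELO is constrained to


def elo_anomaly(rating, baseline=LONG_TERM_AVERAGE):
    """Rating expressed as a departure from the baseline, not an absolute level."""
    return rating - baseline


# Ratings as quoted from the 538 time series.
teams = {
    "1949 Chicago Stags": 1577,
    "2015 New York Knicks": 1256,
}

for name, rating in teams.items():
    print(f"{name}: {elo_anomaly(rating):+d} relative to league average")
```

Framed this way, the numbers say only that the Stags were somewhat above their league's average and the Knicks well below theirs, with no claim about how the two eras compare.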

It is a good first effort, but needs some work!

