Saturday, May 30, 2015

The FA Cup Hurricane Prediction Model

As readers here will know, I have a long-time interest in predictions - how they are made, how they are used and how good they are. Evidence indicates that though we try hard, we are just not very good at making them.

Luckily, in research I did a few years back I appear to have stumbled onto an exception. In a paper of mine on our ability to anticipate US hurricane damages 1-5 years in advance (here in PDF), the time scale of predictions offered by the so-called "catastrophe modelling" companies, I discovered a unique relationship between the final score of the FA Cup and the total hurricane damage in the US later that same year.

I explain:
Indeed, my own research shows a correlation of 0.33 between the total score in the UK Football Association’s (FA’s) annual Cup Championship game and the subsequent hurricane season’s damage, without even controlling for SSTs, ENSO or the Premier League tables. Years in which the FA Cup championship game has a total of three or more goals have an average of 1.8 landfalling hurricanes and USD11.7 billion in damage, whereas championships with a total of one or two goals have had an average of only 1.3 storms and USD6.7 billion in damage.
Of course, anyone with some data and a spreadsheet can mine for relationships. The true test of a prediction is how it performs in real time. Starting in 2006, the companies which provide forecasts (or "medium-term outlooks") of US hurricane damage for 5 years into the future consistently predicted that annual hurricane damage would be above average. These predictions are important because they influence everything from global reinsurance to homeowners insurance.

So how have these sophisticated (and costly) predictions done compared to the FA Cup Prediction model from 2008-2014? The table below shows the results.
You can see that the FA Cup model has been twice as accurate as the catastrophe modeling companies in anticipating US hurricane damage.

OK, this is all fun and games, but there is a serious point here to make as well, and it goes far beyond hurricanes to how we produce and think about scientific research that produces predictions about the future.

From my 2009 paper:
The "Guaranteed Winner Scam" Meets the "Hot Hand Fallacy"

I am sure that no one would believe that there is a causal relationship between FA Cup championship game scores and US hurricane landfalls, yet the existence of a spurious relationship should provide a reason for caution when interpreting far more plausible relationships. Two simple dynamics associated with interpreting predictions help to explain why fundamental uncertainties in hurricane landfalls will inevitably persist.

The first of these dynamics is what might be called the ‘guaranteed winner scam’. It works like this: select 65,536 people and tell them that you have developed a methodology that allows for 100 per cent accurate prediction of the winner of next weekend’s big football game. You split the group of 65,536 into equal halves and send one half a guaranteed prediction of victory for one team, and the other half a guaranteed win on the other team. You have ensured that your prediction will be viewed as correct by 32,768 people. Each week you can proceed in this fashion. By the time eight weeks have gone by there will be 256 people anxiously waiting for your next week’s selection because you have demonstrated remarkable predictive capabilities, having provided them with eight perfect picks. Presumably they will now be ready to pay a handsome price for the predictions you offer in week nine.
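
The arithmetic of the scam can be checked in a few lines. This is only a sketch: the function name `guaranteed_winner_scam` is made up for illustration, and it assumes every game has a winner so that exactly half of the remaining recipients see a correct pick each week.

```python
def guaranteed_winner_scam(recipients: int, weeks: int) -> list[int]:
    """Count the recipients who have seen only correct picks after each week."""
    remaining = recipients
    history = []
    for _ in range(weeks):
        # Half of the remaining group was sent the team that actually won.
        remaining //= 2
        history.append(remaining)
    return history


survivors = guaranteed_winner_scam(65_536, 8)
print(survivors)
# → [32768, 16384, 8192, 4096, 2048, 1024, 512, 256]
```

The first entry is the 32,768 people who see a correct pick after week one, and the last is the 256 people who have seen eight perfect picks.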

Now, instead of predictions of football match winners, think of real-time predictions of hurricane landfall and activity. The diversity of available predictions exceeds the range of observed landfall behaviour. Consider, for example, Jewson et al. (2009), which presents a suite of 20 different models that lead to predictions of 2007–2012 landfall activity ranging from more than 8 per cent below the 1900–2006 mean to 43 per cent above that mean, with 18 values falling in between. Over the next five years it is virtually certain that one or more of these models will have provided a prediction that will be more accurate than the long-term historical baseline (i.e. will be skilful). A broader review of the literature beyond this one paper would show an even wider range of predictions. The user of these predictions has no way of knowing whether the skill was the result of true predictive skill or just chance, given a very wide range of available predictions. And because the scientific community is constantly introducing new methods of prediction, the ‘guaranteed winner scam’ can go on forever with little hope for certainty.
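
To see why a wide spread of forecasts almost guarantees a "winner", here is a rough sketch. It is not taken from Jewson et al. (2009): the 20 forecasts are assumed, purely for illustration, to be evenly spaced between 8 per cent below and 43 per cent above the long-term mean, the candidate outcomes are an arbitrary grid from 50 per cent below to 50 per cent above, and all function names are hypothetical.

```python
def beats_baseline(forecast: float, actual: float, baseline: float = 0.0) -> bool:
    """True if the forecast lands closer to the outcome than the long-term mean does."""
    return abs(forecast - actual) < abs(baseline - actual)


def hypothetical_forecasts(low: float = -8.0, high: float = 43.0, n: int = 20) -> list[float]:
    """Evenly spaced forecasts, in per cent relative to the mean (an assumption)."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def share_with_a_winner(forecasts: list[float], outcomes: list[float]) -> float:
    """Fraction of outcomes for which at least one forecast beats the baseline."""
    wins = sum(any(beats_baseline(f, a) for f in forecasts) for a in outcomes)
    return wins / len(outcomes)


forecasts = hypothetical_forecasts()
# Candidate outcomes from 50 per cent below to 50 per cent above the mean.
outcomes = [float(a) for a in range(-50, 51)]
share = share_with_a_winner(forecasts, outcomes)
print(f"share of outcomes with at least one 'skilful' model: {share:.2f}")
```

Under these assumptions, only outcomes sitting almost exactly on the long-term mean leave every model looking unskilful; anywhere else, some model appears to have called it.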

Complicating the issue is the ‘hot hand fallacy’ which was coined to describe how people misinterpret random sequences, based on how they view the tendency of basketball players to be ‘streak shooters’ or have the ‘hot hand’ (Gilovich et al., 1985). The ‘hot hand fallacy’ holds that the probability in a random process of a ‘hit’ (i.e. a made basket or a successful hurricane landfall forecast) is higher after a ‘hit’ than the baseline probability. In other words, people often see patterns in random signals that they then use, incorrectly, to ascribe information about the future. The ‘hot hand fallacy’ can manifest itself in several ways with respect to hurricane landfall forecasts. First, the wide range of available predictions essentially spanning the range of possibilities means that some predictions for the next years will be shown to have been skillful. Even if the skill is the result of the comprehensive randomness of the ‘guaranteed winner scam’ there will be a tendency for people to gravitate to that particular predictive methodology for future forecasts.
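
A small simulation shows what the fallacy gets wrong. This is a sketch under stated assumptions: the shots are independent coin flips with a 50 per cent hit probability, and the function name `hit_rates`, the sample size and the seed are arbitrary choices made for illustration.

```python
import random


def hit_rates(n_shots: int = 200_000, p_hit: float = 0.5, seed: int = 0) -> tuple[float, float]:
    """Return (overall hit rate, hit rate on shots that follow a hit)."""
    rng = random.Random(seed)
    # Each shot is an independent random draw: no memory, no 'hot hand'.
    shots = [rng.random() < p_hit for _ in range(n_shots)]
    baseline = sum(shots) / n_shots
    # Keep only the shots taken immediately after a hit.
    after_hit = [cur for prev, cur in zip(shots, shots[1:]) if prev]
    return baseline, sum(after_hit) / len(after_hit)


baseline, after_hit = hit_rates()
print(f"baseline hit rate:    {baseline:.3f}")
print(f"hit rate after a hit: {after_hit:.3f}")
```

In a long random sequence the two rates come out nearly identical, even though the sequence contains plenty of streaks that look like a hot hand.
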

Enjoy today's game. Let's hope for fewer than 3 goals and, of course, an Arsenal win!

