Thursday, October 6, 2011

Sometimes a Rare Event is Just Rare

At PTG, I just saw a talk by Katarina Pijetlovic of Tallinn University of Technology, on alleged irregularities in ITF draws for Wimbledon and the US and Australian Opens.  Apparently, in each of the past 12 of these events, Roger Federer and Novak Djokovic were drawn into one half of the draw and Andy Roddick and Rafael Nadal into the other.

In any one event the odds of such an outcome are 1 in 2, and across 12 such events, 1 in 4,096 (0.5 to the 12).  The suspicion, Pijetlovic explains, is that there were some nefarious goings-on in the draw process.  She pointed to a recent analysis by ESPN of tennis draws which showed that the US Open draws were outliers with respect to the low ranking of players faced by the top seeded player in the first round.
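The 1 in 4,096 figure can be checked with a few lines of Python. This is only a sketch under a simple assumption: seeds 1 and 2 sit in fixed, opposite halves, and a fair coin flip decides which half seed 3 lands in, with seed 4 taking the other. The function names are mine, not anything from the talk.

```python
from fractions import Fraction
import random


def streak_probability(p_single, events):
    """Chance that an outcome with probability p_single occurs in every one of `events` independent draws."""
    return p_single ** events


def simulate_single_draw_rate(trials, seed=0):
    """Estimate how often seed 3 lands in seed 1's half when placement is a fair coin flip.

    ASSUMPTION: seed 1 is fixed in the top half, seed 2 in the bottom half,
    and seed 3 goes top or bottom with equal probability (seed 4 takes the other half).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seed3_in_top_half = rng.random() < 0.5
        if seed3_in_top_half:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(streak_probability(Fraction(1, 2), 12))   # exact fraction for 12 straight draws
    print(simulate_single_draw_rate(100_000))       # should hover near 0.5
```

The exact calculation only confirms the arithmetic. It says nothing about whether a 12-event streak is the right thing to be counting in the first place.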

What is going on here? Cheating? Subterfuge?

Pijetlovic expressed doubt because the draws are done in public with a person (usually a former tennis player) pulling names out of a container (often a trophy).  But still, she said . . .

I for one am not convinced by either of these analyses that there is anything here, as they lack a coherent theory.  What benefit would be conferred to the tournament?  Is there collusion across tournaments to pair Federer/Djokovic and Roddick/Nadal?  Why connect the US Open draws with the top 4 seeds across three tournaments?  Why neglect the French Open?

I asked Pijetlovic about it in the Q&A and she said that the statistics were not impossible but were improbable. Fair enough. But there are an awful lot of things that happen in sport. So much so that rare events are sometimes observed, and the joint probability of rare but otherwise unconnected events will be smaller still.
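To put rough numbers on that point, here is a small sketch. The count of 1,000 "patterns someone might go looking for" is entirely made up for illustration; the real number is unknowable, which is rather the problem. The function names are hypothetical too.

```python
def joint_probability(*probs):
    """Probability that several independent events all occur (product of their probabilities)."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def chance_of_at_least_one(p, opportunities):
    """Probability that an event of probability p shows up at least once in `opportunities` independent chances."""
    return 1.0 - (1.0 - p) ** opportunities


if __name__ == "__main__":
    p_streak = 0.5 ** 12  # the 1-in-4,096 figure from the talk

    # ASSUMPTION: 1,000 is an invented count of independent patterns that
    # could have been noticed across sport; it is not a measured quantity.
    HYPOTHETICAL_PATTERNS = 1000

    # Stacking two unconnected 1-in-4,096 events makes the joint figure tiny...
    print(joint_probability(p_streak, p_streak))
    # ...yet the chance that at least one such streak turns up somewhere is not small.
    print(chance_of_at_least_one(p_streak, HYPOTHETICAL_PATTERNS))
```

Under that invented count, a 1-in-4,096 streak turning up somewhere comes out at roughly one chance in five. Multiplying unrelated rare events together afterwards produces an impressively small number without telling you much.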

To observe a rare event (or several) and to conclude that there is more to it requires more than just rareness. Otherwise it is just data mining. Sometimes, the luck of the draw is just the luck of the draw.


  1. Well, I'd need quite a lot more evidence - indeed I'd have to be convinced that the location of those particular 4 players in the draw was even rare. Did Pijetlovic seem to understand the ITF seeding system? Wimbledon used to be berated for not following it - in its grand English way, and because the surface was different, they felt entitled to seed as they felt (which drove players who'd sweated their way up the rankings to a 'safer' position completely nuts). My understanding was that the other tournaments had/have very little latitude?

  2. Is there a math error here? If draws of seeded players were completely random then...

    The odds of ONE pair of players being drawn into the same half is slightly less than 0.5.

    The odds of two different pairs of players being drawn into the same half is just under 0.25.

    The odds of each of those pairs ALSO being drawn into DIFFERENT halves is close to 0.125.

    Throw out the first of the 12 events, and you wind up with a probability of one in two to the 33 or roughly one in 8 BILLION.

    _IF_ this math is correct (and I know nothing of tennis, so it may well not be), then it would appear to PROVE rigging, motive or no. [Whereas one in 4,096 could easily be a coincidence, given the vast number of opportunities for this sort of thing to happen in sports.]

  3. Jason- Thanks ... in the presentation it was explained that the top 2 seeds are placed into opposite halves of the draw, so they cannot play each other before the finals. 3 and 4 are randomly placed in the top or bottom half -- hence 0.50.

  4. I've always heard that the top two seeds were always placed in opposite halves so as to avoid any potential match before the final! When Nadal won his first Roland Garros the final wasn't all that exciting... not after he beat Federer in the semi-final! Ok he was only ranked #3 then, so the lottery placed him in Roger's half, but imagine if they had met up in the early rounds? Matches not everyone watches? I like knowing there is a greater chance of seeing top seeds play each other later in the tournament. I figured they might separate #3 and #4 into opposite halves as well to help with this, and then randomly distribute the rest of the players. Dunno if they do this though...

  5. She either has a fundamental misunderstanding of statistics or is deliberately trying to mislead others who don't have a clue about statistics: "If you wanted to arrive at the result of draws achieved at the Grand Slams from 2008 to 2011, you would have to conduct 131072 draws to get the same result only once. It would take you conducting 17 draws every single day for 359 years."

    Actually, no, no, NO! It's called the LUCK of the draw. I could draw this tomorrow, let alone in 359 years, or I could also conceivably draw for 600 years and still not get it. A probability of 1 out of 50, for example, does not mean you need to do it 50 times in order to get it. Nor does it mean that if you do it 50 times, you _will_ get it.

    Again: If winning the lottery has a probability of 1/100,000,000,000 it does not mean you need to play it 100,000,000,000 times to win. Nor does it mean that if you play it that many times, you WILL win.
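The point in comment 5 can be made concrete. The sketch below assumes every try is independent with the same probability, which is the commenter's framing, not a claim about how tennis draws actually work. The function name is hypothetical.

```python
def chance_within(p, tries):
    """Probability of at least one success in `tries` independent attempts, each with success probability p."""
    return 1.0 - (1.0 - p) ** tries


if __name__ == "__main__":
    p = 1 / 50

    # It can happen on the very first attempt...
    print(chance_within(p, 1))
    # ...and 50 attempts do not guarantee it.
    print(chance_within(p, 50))
    # Even three times as many attempts still leave a real chance of missing.
    print(chance_within(p, 150))

    # The quoted 1-in-131,072 figure behaves the same way over 131,072 draws.
    print(chance_within(1 / 131072, 131072))
```

For a 1-in-N event, N independent tries give a bit under a two-in-three chance of seeing it at least once, so "you would have to conduct N draws" overstates things in both directions.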