Tomorrow, barring the pavement opening up and swallowing him, Chris Froome of the Sky Team will win the Tour de France. And for the next “ten years” we’ll all wonder if he did so without the help of drugs.

There are many reasons to suspect Froome, which we’ve discussed before, but a recent interview with Jonathan Vaughters, boss of a rival team, brings up another: it’s not just that he’s winning but HOW he’s winning.

Froome’s winning margin is likely to be over five minutes, 5:03 to be exact, and it would be 5:23 had he not had a time penalty for blatantly cheating by taking on food when he was not allowed to, a blunder he tried to fob off on a teammate.

That’s a huge margin.

In 1989 and 1990, before synthetic EPO became prevalent in cycling, Greg LeMond won the race by 8 seconds and by 2 minutes and 16 seconds. Between 1991 and 2005, the “doping era,” which included the wins of admitted dopers Riis, Ullrich, Pantani and Armstrong, the average winning margin was 5 minutes and 1 second. While few would argue that cycling has been clean since, most would argue it’s certainly been *clean-er*, and once again we’ve seen winning margins come down. The average winning margin from 2006 through 2012 was 1 minute 40 seconds. (Indeed, if you exclude the performance of Froome’s Sky teammate, Bradley Wiggins, last year, the recent wins have been even narrower, 1’23”.)

Number of stage wins by a Tour winner shows a similar pattern. During the doping era, the Tour winner averaged winning 2.4 stages. Since, it’s been one stage win per year.

So where are we? Froome is going to win by 5’03” and will have won 3 stages when he arrives in Paris. During the dirty years, the typical winner won by 5’01” and won 2.4 stages. During recent years, the winner has won by 1’40” and has won 1 stage. The last cyclist to win by more than five minutes? Lance Armstrong in 2004. The last Tour winner to win more than two stages? Lance Armstrong.

In other words, Froome is winning like winners won in the dirty days. In fact, he’s winning EXACTLY like Armstrong won. Armstrong’s average winning margin during his seven wins was 5’23” and his average number of stage wins was 3, the same numbers as Froome to the second.

Does this prove he’s dirty? No, but it certainly makes it OK to ask the question.


I don’t agree that the winning margin suggests anything about the winner’s use or otherwise of performance enhancing drugs. For that to work, you’d have to be assuming that only the winner was doping, and for many tours it’s now confirmed that other podium finishers were doping.

Also, I looked up the winning margins before disqualifications of doping riders for all tours since 1970. For 1970-1990, which you say is the pre-EPO era, the average winning margin was 6m38.8s. From 1991-2005, the doping era, I found an average winning margin of 5m09s (not sure why that’s different to the 5m01s you give but maybe I copied a number wrongly). Plotting all the winning margins against time, there’s a general trend towards smaller winning margins with no apparent change of trend between the doping and non-doping eras.

It’s right that the riders face these sorts of questions after so many years of scandal and lies, and it’s outrageous that the teams refuse to release relevant data, but I don’t think the winning margins are telling us anything.

Good analysis. Not sure about your conclusion.

First, the 5’01” vs. 5’09” could be my mistake. I used to be a pretty good analyst, but my skills have atrophied over the years.

Second, let’s say there are three possibilities. (Actually there are more than that, but these are the three that seem most likely to me.)

1. Winning margins are essentially a random walk and there is no pattern to the numbers. The low numbers between 2006 and 2012 are simply the luck of the draw. WM and #wins don’t measure anything.

2. Your hypothesis–There is a long term trend toward tighter wins, in which case Froome’s winning margin could mean something since it’s clearly inconsistent with the trend.

3. My hypothesis–WM and # wins are a valid measure of rider dominance. EPO fueled riders win by bigger margins than clean riders, in which case Froome’s winning margin could mean something since it’s inconsistent with clean riders.

So it seems to me under both your hypothesis and my hypothesis it’s a reasonable conclusion to infer that something could be up with Froome. Only under the random walk hypothesis is Froome’s performance not suspicious.

Even though I don’t think your conclusion follows from your analysis, I can still think of two reasons my argument might be bollocks.

1. I can’t think of the underlying causality for my hypothesis. Why should a cheater beat other cheaters by wider margins than a clean rider would beat other clean riders?

2. The riders who won in the “clean era” of 2006-2012 weren’t exactly pristine. Pereiro had a problem with over-dosing on asthma medicine and three of those wins were by Contador, who’s been caught once and accused twice more.

But you’re 1000% right. If instead of releasing two years of selected data to a newspaper like Sky did, they released real data in real time, we wouldn’t have to resort to this sort of inferential analysis of aggregated data.

Thanks for coming in. It’s great to have a real conversation about this stuff.

Well, when I plotted the data for stage wins and winning margin against time, both give the visual impression of a trend towards fewer stage wins and smaller winning margins, but there’s no way Chris Froome appears to be an outlier. I fitted a simple linear equation to the winning margins and got

WM = 514.9 – 9.11*(year-1970)

This indicates a reduction of 9.1 seconds per year in the winning margin on average. Then, looking at the residuals from the fit, the standard deviation about the trend is 3m24s. Chris Froome’s final winning margin was 4m20s, while the trend line would predict 2m03s, so he’s 2m17s above the trend, or 0.67 sigma. In a statistical sense that’s a complete non-result.
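The arithmetic in that paragraph can be reproduced with a short Python sketch. The function names are made up for illustration; the coefficients, the 3m24s residual standard deviation and the 4m20s margin are simply the figures quoted above, taken in seconds.

```python
# Fitted trend quoted above: WM = 514.9 - 9.11 * (year - 1970), in seconds.
INTERCEPT_S = 514.9
SLOPE_S_PER_YEAR = -9.11
BASE_YEAR = 1970


def trend_margin_seconds(year):
    """Winning margin (seconds) predicted by the quoted linear trend."""
    return INTERCEPT_S + SLOPE_S_PER_YEAR * (year - BASE_YEAR)


def sigmas_above_trend(margin_s, year, residual_sd_s):
    """How many residual standard deviations a margin sits above the trend."""
    return (margin_s - trend_margin_seconds(year)) / residual_sd_s


froome_margin_s = 4 * 60 + 20   # 4m20s, the final margin quoted above
residual_sd_s = 3 * 60 + 24     # 3m24s, the quoted spread about the trend

predicted_2013 = trend_margin_seconds(2013)
froome_sigma = sigmas_above_trend(froome_margin_s, 2013, residual_sd_s)

print(f"trend prediction for 2013: {predicted_2013:.0f} s")
print(f"Froome above trend: {froome_margin_s - predicted_2013:.0f} s "
      f"({froome_sigma:.2f} sigma)")
```

The same two functions can be pointed at any other year and margin to see how far that win sits from the trend line.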

Looking at all results since 1970, the two points that are the strongest outliers are 1973 and 1981. Luis Ocaña won by more than 15 minutes in 1973, and in 1981 Bernard Hinault won by 14m34s. These points lie 2.26 and 2.24 standard deviations above the trend line. In comparison, Lance Armstrong’s largest margin by this measure was 1.04 standard deviations above the trend, in 2002.

So, my conclusion is, there is a trend towards smaller winning margins but Chris Froome’s margin this year is in no way inconsistent with that trend. From this data, even if the premise that winning margin is a proxy for doping is correct, you can’t infer anything about whether Froome doped or not.

Interesting, RW. .67 can either be not at all significant or really really significant, depending on things like the value of N. Do you have any idea what kind of N you’d need for Froome’s result to be statistically significant? Some insignificant results are more insignificant than others, and you’re working at a much higher level on this than I am.

Also interesting is to compare the winning margin to the average margin from 2nd to 10th places. If the winning margin was 3 minutes and the average gap between the 10 following riders was also 3 minutes, that would not seem suspicious, but if the average gap between the followers was only 5 seconds, then you’d have to wonder about the winner. I just looked up the data for 1999 when Armstrong’s winning margin was largest – 7m37s, compared to an average gap of 1m56s between each of the next 9 riders. And in 1990, LeMond won by 2m16s and the average gap from 2nd to 10th was 1m19s. This year, Froome won by 4m20s, and the average gap between the next 9 riders was 1m40s. Not sure if there’s much to be inferred from that either though!
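One way to put those three years side by side is to divide the winning margin by the average gap among the next nine riders. This is only a sketch: the ratio is an illustrative choice rather than an established measure, the names are hypothetical, and the inputs are just the figures quoted in this comment.

```python
def to_seconds(minutes, seconds):
    """Convert a margin given as minutes and seconds into seconds."""
    return minutes * 60 + seconds


# (winning margin, average gap between the next 9 riders), as quoted above.
tours = {
    1990: (to_seconds(2, 16), to_seconds(1, 19)),  # LeMond
    1999: (to_seconds(7, 37), to_seconds(1, 56)),  # Armstrong
    2013: (to_seconds(4, 20), to_seconds(1, 40)),  # Froome
}

# Ratio > 1 means the winner was further ahead of 2nd place than the
# chasing riders were, on average, from each other.
ratios = {year: margin / gap for year, (margin, gap) in tours.items()}

for year in sorted(ratios):
    print(year, round(ratios[year], 2))
```

With only three data points this says little on its own; it would take the same ratio for every Tour to see whether any year stands out.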

Also for stage wins – the average number of stage wins for a winning rider from 1970-1990 was 3.1, from 1991-2005 it was 2.4 and since 2006 it has been 1.3. The overall trend is for fewer stage wins for the overall winner, and I see no apparent changes of trend in 1991 or 2005.

So I assumed that the distribution of winning times was a normal distribution, and that looks like a reasonable approximation for the 1970-2013 winning margins: 65% lie within 1 standard deviation of the mean, 93% within 2 standard deviations (you’d expect 68% and 95% for a truly normal distribution). From my understanding of statistics, a value 0.67 standard deviations from the mean couldn’t ever indicate a significant departure from the expected distribution if that distribution is normal. A 5 sigma outlier could be statistically insignificant if N was small but I don’t think there is any value of N for which 0.67 standard deviations could be a statistically significant outlier.
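The 68% and 95% reference figures, and the chance of a single result landing 0.67 standard deviations or more above the trend, follow from the normal distribution and can be checked with Python's standard library. This is a sketch that takes the normality assumption above at face value; the function names are illustrative.

```python
import math


def fraction_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


def upper_tail(z):
    """Probability that a normal value lies at least z standard deviations above the mean."""
    return 0.5 * math.erfc(z / math.sqrt(2))


print(f"within 1 sd: {fraction_within(1):.1%}")
print(f"within 2 sd: {fraction_within(2):.1%}")
print(f"at least 0.67 sd above: {upper_tail(0.67):.1%}")
print(f"at least 5 sd above: {upper_tail(5):.1e}")
```

Under that assumption, roughly one winner in four would be expected to finish at least 0.67 standard deviations above the trend line.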

If the N is sufficiently large I believe that .67 can be significant. Been a long time since I had to know these things, though.

Well, next time I attempt one of these analytical sports posts, I’ll have to rope you in, RW. Nice stuff. Still not quite on board with your conclusion, but love your analysis and your thoughtful tone.