The standard statistical interpretation is that Ohio State isn't as good a team as their record indicates. In any of their narrow victories, they might just as easily have lost, and thus the predictive rankings are correct to rate them poorly. Ohio State supporters generally counter with the argument that Tressel's style of play favors low-scoring, close games, and as such close wins will be the norm. They also claim that the narrow wins result from his team's ability to gear it up when needed and do whatever it takes to win. I call this the "Tressel Factor".
Clearly this isn't always the case; an example that immediately comes to mind is their championship, which would not have been won had a controversial penalty not been called. Now I don't care to argue the merits of the pass interference call; I am merely pointing out that the official's decision of whether or not to throw that flag had nothing to do with any "Tressel Factor"; it was a break that went in Ohio State's direction over which they had no control.
However, even if Tressel isn't able to summon his magic in every single game, it is still worth examining whether or not there is evidence that the Tressel Factor is real. I address this question in two ways.
Year | School | Standard | Med. Likely | Predictive |
---|---|---|---|---|
2003 | Ohio State | 11 | 10 | 21 |
2002 | Ohio State | 1 | 1 | 11 |
2001 | Ohio State | 37 | 37 | 31 |
2000 | Youngstown St | 15 | 15 | 21 |
1999 | Youngstown St | 3 | 2 | 9 |
1998 | Youngstown St | 26 | 25 | 34 |
1997 | Youngstown St | 4 | 4 | 2 |
1996 | Youngstown St | 18 | 17 | 8 |
1995 | Youngstown St | 64 | 63 | 49 |
What we see is that in only 5 of 9 years has Tressel's team had a win/loss ranking that is better than its predictive (score-based) ranking. In effect, Tressel's teams show no indication that they have been systematically underrated by a predictive ranking.
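The count can be checked directly from the table. The sketch below copies the Standard and Predictive columns and counts the seasons in which the win/loss rank is the better (numerically smaller) one. As a rough gauge of how unremarkable 5 of 9 is, it also computes the chance of seeing at least that many such seasons if each season were an independent coin flip; that coin-flip model is an assumption made here for illustration, not part of the ranking method.

```python
from math import comb

# (year, win/loss "Standard" rank, "Predictive" rank), copied from the table
rankings = [
    (2003, 11, 21), (2002, 1, 11), (2001, 37, 31),
    (2000, 15, 21), (1999, 3, 9), (1998, 26, 34),
    (1997, 4, 2), (1996, 18, 8), (1995, 64, 49),
]

# A smaller rank number is better, so the record "beats" the predictive
# ranking when the Standard rank is numerically smaller.
better_by_record = [
    year for year, standard, predictive in rankings if standard < predictive
]
n_years = len(rankings)
n_better = len(better_by_record)

# Assumption for illustration only: each season is an independent 50/50
# coin flip. Probability of at least n_better "record is better" seasons:
p_at_least = sum(comb(n_years, k) for k in range(n_better, n_years + 1)) / 2**n_years

print(n_better, "of", n_years, "seasons; coin-flip probability:", p_at_least)
```

Under that assumption the probability comes out to one half, which is what one would expect if there were no systematic effect at all.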
I haven't shown the data, but there is clear evidence that Tressel's style favors low-scoring games, in that my scoring rating for Tressel's teams has been consistently on the low side of the spectrum. This fact is related to the team's predictive ranking.
Imagine that the average score in a football game is 20 points. Suppose that team A has an average offense but a superb defense that allows only half as many points as average. In other words, if team A plays an average team, they will win on average by a score of 20-10, which implies that they will win 75% of the time and have a predictive rating of +0.667. Now imagine team B, which has average defense but a superb offense that scores twice as many points. Team B would beat an average team by 40-20, meaning that they win 84% of the time and have a predictive rating of +0.987. In other words, team B is the better team; for A to be equally good they would have to win by an average of 20-6. (Details on the game analysis calculations can be found here.)
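The win percentages quoted for teams A and B are consistent with reading the predictive rating as a standard normal score, so that the chance of beating an average team is the normal cumulative distribution evaluated at the rating. That reading is an assumption made here to illustrate the numbers; the actual game analysis calculation is described on the page linked above. A minimal sketch:

```python
from math import erf, sqrt

def win_probability(rating: float) -> float:
    """Standard normal CDF of the rating.

    Assumed link between predictive rating and the chance of beating
    an average team; used here only to check the quoted figures.
    """
    return 0.5 * (1.0 + erf(rating / sqrt(2.0)))

team_a = win_probability(0.667)  # average offense, superb defense (20-10)
team_b = win_probability(0.987)  # superb offense, average defense (40-20)

print(f"Team A: {team_a:.0%}, Team B: {team_b:.0%}")
```

With the ratings from the text, this gives roughly 75% for team A and 84% for team B.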
This is a fundamental problem with defensive football and smashmouth offense. While it may conjure up nostalgic images of how the game "should be played", a team that wins its games 20-10 is inferior to one that wins its games 40-20 because one bad break is more likely to result in a loss.
What I wish to examine is the possibility that my statistical model for evaluating game results could be slightly wrong. As noted in the ranking descriptions, the significance given to a win is roughly equal to the score difference divided by the square root of the total number of points scored. This means that a 17-14 win is more convincing than a 16-15 win to the same degree that a 16-15 win is more convincing than a 15-16 loss. If there really is a significant difference between winning and losing ("the better team just wanted it more and got the job done"), I should be able to improve the accuracy of my win-loss predictions by adjusting this formula slightly to put more weight on whether or not a team won. In fact, I find that this is not the case. Adjusting the game evaluation formula to give more significance to a narrow win, even slightly, measurably diminishes the accuracy of the predictions.
I find this to be somewhat surprising. I have to believe that there is some degree of clutch performance in football; after all, different people react differently when the pressure is on. However, football is a team sport, and it seems that the overall clutch ability of one team is comparable to that of any other team. More to the point, there are a lot of random elements -- bounces of the ball, calls or non-calls, mistakes, etc. -- that dominate the outcome of a close game. Put differently, a team that put together a last-minute drive to win a game was probably just as likely to lose as to win. This shouldn't detract from the excitement of a close game or taint the win; it just means that it doesn't tell you as much about the "character" of the players as common wisdom would indicate.
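To make the 17-14, 16-15, 15-16 comparison concrete, here is a small sketch of the approximate significance measure described above. It takes "the number of points" to mean the combined points scored by both teams, which is an assumption, and the function name `significance` is hypothetical. The actual formula in the rankings is only "roughly" this.

```python
from math import isclose, sqrt

def significance(points_for: int, points_against: int) -> float:
    """Approximate game significance: margin over sqrt(total points).

    Assumes the denominator uses the combined score of both teams.
    """
    return (points_for - points_against) / sqrt(points_for + points_against)

wide_win = significance(17, 14)
narrow_win = significance(16, 15)
narrow_loss = significance(15, 16)

# Each step is a two-point swing at the same total of 31 points,
# so the two gaps should be equal.
gap_win_vs_win = wide_win - narrow_win
gap_win_vs_loss = narrow_win - narrow_loss

print(isclose(gap_win_vs_win, gap_win_vs_loss))
```

Because all three results total 31 points, the measure moves by the same amount at each step. Nothing special happens when the margin crosses zero, and that is exactly the property that putting extra weight on winning would change.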
Note: if you use any of the facts, equations, or mathematical principles introduced here, you must give me credit.