Ranking Tutorial: Basics

OK, so we want to create a ranking system. Where to begin? A good first place is to decide exactly what we want the ratings to mean. In other words, if team A has a rating of 0.40, and team B has a rating of 0.55, which team is better? What is average? How likely is team B to beat team A if they play next week?

There are two common systems. One defines ratings such that a team's rating equals the odds that it would beat an average team. If you see a ranking whose average team score is around 0.5 and whose scale is roughly zero to one, this is probably (more or less) what was done. The drawback is that such ratings are awkward both to compute and to use for predictive purposes.

The other approach, and the one I have adopted, is to define the rating system so that the difference between two ratings is directly related to the odds of one team beating the other. To make this as simple as possible, I have put ratings in units of sigma (one sigma equalling the typical random variation from game to game in the difference between two teams' performances). I should point out that the definition of the rating scale doesn't matter; a correct statistical treatment would produce the same ranking order regardless of the scale used. I have chosen the sigma scale for convenience.

Those of you familiar with elementary statistics will know these equations:

   NP(x) = exp(-x^2/2) / sqrt(2*pi)
   CP(x) = integral(z=-inf,x) NP(z) dz

The first is the normal probability density for an outcome x sigma away from the expectation. This function is symmetric around zero, meaning that an outcome x above expectation occurs exactly as often as one x below expectation. The second equation is the cumulative probability that the outcome falls anywhere below x sigma from the expectation, that is, anywhere from minus infinity up to x. This is often loosely called the error function, but because the error function proper is defined slightly differently, I refer to the cumulative probability as CP(x) to avoid confusion.
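As a rough illustration (a sketch only, not the code behind these ratings), the two functions can be written in Python using the standard library's math.erf. The relation CP(x) = 0.5*(1 + erf(x/sqrt(2))) is the usual link between the cumulative normal probability and the error function, and it shows how the two differ. The Python function names simply mirror the notation above.

```python
import math

def NP(x):
    """Normal probability density at x sigma from expectation."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def CP(x):
    """Cumulative probability: integral of NP(z) from -inf to x.

    Written in terms of the standard error function, which differs
    from CP by a rescaling of its argument and its output range.
    """
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

if __name__ == "__main__":
    for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"x={x:+.1f}  NP={NP(x):.4f}  CP={CP(x):.4f}")
```

Running it prints a short table in which NP is the same at +x and -x, and CP rises from near zero to near one as x increases.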

Given how the scale is defined in sigma, you can quickly see that the odds of team A (rating=a) beating team B (rating=b) equals CP(a-b), while the odds of team B winning equals CP(b-a). (Naturally, CP(x)+CP(-x) equals one.) The odds of a tie are related to NP(a-b), which of course equals NP(b-a).
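To make this concrete, here is a minimal Python sketch of the win probability. The function name win_probability and the sample ratings are my own illustrative choices, and the sketch ignores ties entirely, so the two probabilities add up to one.

```python
import math

def CP(x):
    """Cumulative normal probability, with x in units of sigma."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def win_probability(a, b):
    """Chance that a team rated a beats a team rated b (ties ignored)."""
    return CP(a - b)

# Hypothetical ratings on the sigma scale, chosen only for illustration.
a, b = 1.0, 0.0
p_a = win_probability(a, b)  # team A holds a one-sigma edge
p_b = win_probability(b, a)  # the complementary case
print(f"A wins: {p_a:.4f}  B wins: {p_b:.4f}  total: {p_a + p_b:.4f}")
```

With a one-sigma edge, team A wins roughly 84 percent of the time. Two equally rated teams each come out at exactly one half.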

As an aside, I realize that many computer raters regard their rating schemes as some sort of "black art" and are hesitant to divulge much or any of their formulas. This rating is based on standard and straightforward statistical principles, and thus has no "magic" ingredients. However, since it would be possible to reproduce my predictive rating given the information presented here, I will require that any use of the information on this page give credit to me.

Some definitions of the terms used throughout these information pages:

"X": used to define the variable X in text
X^Y: X to the Yth power
inf: infinity
P(B|A): Probability that B will occur, based on A
sum(i=A,B) C: The sum of C, with variable "i" going from A to B
integral(i=A,B) C: The integral of C, with variable "i" going from A to B
exp, ln, sqrt: exponential, natural log, and square root functions
NP(x) = exp(-0.5*x^2)/sqrt(2*pi)
CP(x) = integral(z=-inf,x) exp(-0.5*z^2)/sqrt(2*pi) dz
ICP(x): The inverse of CP(x) -- if y=CP(x), then x=ICP(y).
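
For readers who want to experiment with ICP, one way to sketch it in Python is with statistics.NormalDist from the standard library (Python 3.8 or later). This is an illustration under that assumption, not the method used for these ratings. The input must lie strictly between 0 and 1.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, sigma 1, matching the sigma scale.
_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def CP(x):
    """Cumulative probability up to x sigma."""
    return _STD_NORMAL.cdf(x)

def ICP(y):
    """Inverse of CP: the x for which CP(x) equals y, with 0 < y < 1."""
    return _STD_NORMAL.inv_cdf(y)

if __name__ == "__main__":
    # Round trip: ICP should undo CP to within floating-point error.
    x = 1.25
    print(f"CP({x}) = {CP(x):.6f}, ICP(CP({x})) = {ICP(CP(x)):.6f}")
    # A 75 percent win probability expressed as a rating gap in sigma.
    print(f"ICP(0.75) = {ICP(0.75):.4f}")
```

One use of ICP is to turn a win probability back into the rating difference, in sigma, that would produce it.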

Note: if you use any of the facts, equations, or mathematical principles on this page, you must give me credit.