Humans, however, evaluate teams very differently. My ratings provide two "human" ranking systems for college sports -- a pseudopoll and a pairwise ranking. The pseudopoll attempts to mimic poll voting tendencies; the pairwise ranking (for baseball and basketball) attempts to mimic tournament selection and seeding tendencies. The two are similar in many ways but different in others. Both, however, deviate strongly from computer rating systems. Generally this is a cause for mutual suspicion: the voters think the programmers are wrong, while the programmers say the voters don't know how to rank teams. There is truth on both sides -- random factors (from human indecision) and systematic factors (such as poll inertia) make polls unfair; on the other hand, a measurement of the "best" season is inherently subjective, and thus one must examine how people (on average) make such judgments. This page describes how human evaluations systematically differ from computer evaluations.
Note that all information on this page was obtained through exhaustive (and exhausting) research and experimentation; any use of this information for any purpose must give me credit.
If an average individual is asked to rank teams in some order, he will go through a two-step process. In some cases this process is intuitive and "seems" like only one step to the individual; in other cases the person will actually do both steps. Step one is to make a provisional ranking based on one's "gut instinct" feelings about each team. Step two is to adjust the ranking as necessary, based on specific data. The information presented on this page is based on attempts to mimic poll and selection results by mimicking this two-step process. This section deals with the first step.
Poll voters seem to come up with their initial feelings based on the team's record, soundness of wins, and the quality of the better teams it has beaten.
The record is straightforward to quantify; I use the percentage of games won. While this seems to give football teams at 11-1 an advantage over those at 10-1, my comparisons of simulated and real polls indicate that 11-1 teams do tend to be ranked higher (all other things being equal), and thus the winning percentage is indeed the correct measurement.
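As a small worked example of the 11-1 versus 10-1 comparison, the sketch below computes the percentage of games won. The function name is hypothetical, and ties are left out for simplicity; that omission is an assumption of the sketch, not something stated above.

```python
def winning_percentage(wins, losses):
    """Fraction of games won (ties are ignored in this sketch)."""
    games = wins + losses
    if games == 0:
        raise ValueError("team has played no games")
    return wins / games


# An 11-1 team edges out a 10-1 team on this measure.
print(round(winning_percentage(11, 1), 4))
print(round(winning_percentage(10, 1), 4))
```

The gap between the two records is small, which fits the idea that the record is only the starting point of a voter's impression.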
To quantify the soundness of a team's wins (its average margin of victory), I use the game probability factors described in the predictive poll description. In short, the value produced for each game is the probability that the team outplayed its opponent in that game. A tie game gives a probability of 0.5; a blowout win gives a value of nearly 1, and a blowout loss gives a value of nearly 0.
The final element of the voter's initial feeling is the quality of the teams a team has beaten. Interestingly, a team that goes 9-2 with two losses to top-10 teams seems to be ranked no better than one whose two losses are to lousy teams; only the wins really matter here. I quantify an opponent's quality as its probability of outplaying the 25th-best team in the same road/home/neutral situation as the game in question. The value that best replicates polls is not the simple average of such probabilities, but rather an average weighted towards a team's stronger opponents. Thus a game against a very weak opponent does not appear to lessen a voter's impression of a team's schedule, as long as there are a couple of wins over ranked opponents.
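The effect of weighting towards stronger opponents can be illustrated with a short sketch. The actual weighting is not given on this page, so the power-law weight used here (each beaten opponent weighted by its own quality raised to a power) is purely an assumption, as are the function name and the sample quality values.

```python
def beaten_opponent_quality(win_qualities, power=2.0):
    """Weighted average quality of the opponents a team has beaten.

    win_qualities: for each win, the opponent's probability of outplaying
    the 25th-best team. Losses are left out, since only wins matter here.
    power: hypothetical exponent; 0 gives the simple average, and larger
    values lean more heavily on the stronger opponents.
    """
    weights = [q ** power for q in win_qualities]
    total = sum(weights)
    if total == 0:
        return 0.0  # no wins, or only wins over zero-quality opponents
    return sum(q * w for q, w in zip(win_qualities, weights)) / total


good_wins = [0.60, 0.55]          # two wins over ranked-caliber teams
with_cupcake = good_wins + [0.05]  # add a win over a very weak team

# The weak opponent barely moves the weighted value...
print(beaten_opponent_quality(good_wins), beaten_opponent_quality(with_cupcake))
# ...but drags the simple average down sharply.
print(beaten_opponent_quality(good_wins, power=0),
      beaten_opponent_quality(with_cupcake, power=0))
```

Any weighting that grows with opponent quality would show the same qualitative behavior; the exponent only controls how quickly weak opponents fade from the picture.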
Although the three components come out of the predictive rating system, they are not combined in a similar fashion at all (no surprise, given that voters are not computers). Instead, the voter appears to start with the team's record and make adjustments according to the "convincingness" of its wins and whether or not it has beaten solid opponents.
A comparison of typical voting patterns with statistics shows two very interesting points:
Obviously no voter puts his voting tendencies in a computer and follows these steps to determine his instinct. However, his impression of a team is based on his impression of the team's performances in various games, with some games seen as more important than others. So, in terms of understanding polls, this is a fair description of the process.
In NCAA selection committees, the "gut feeling" ranking is provided for the committee in the form of RPI rankings. While many committee members downplay the importance of RPIs in the selection process, it is clear from my research that the RPI
The initial ordering of teams from the RPI appears to be subject to a couple of significant modifications. A team finishing below 0.500 in conference play is hurt: a basketball team going 7-9 would effectively lose 0.05 RPI points, and one going 6-10 would lose 0.10 and essentially be eliminated from consideration. A team with a solid record in a tough conference is helped: a 20-10 team in a power conference would effectively gain 0.06 RPI points. Sadly, the team's non-conference schedule strength doesn't appear to matter, so a team that went 12-0 against lousy non-conference opponents and 9-9 in conference would be helped as much as one that went 6-6 against top-notch non-conference opponents and 14-6 in conference play.
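The sub-0.500 conference penalty can be sketched as a simple function. Only two data points are quoted above (7-9 costs 0.05, 6-10 costs 0.10); the straight line through them, the slope value of 0.8, and the function names are assumptions made for illustration. The bonus for a strong record in a tough conference is not modeled, since only one example of it is given.

```python
def conference_penalty(conf_wins, conf_losses, slope=0.8):
    """Effective RPI points lost for a losing conference record.

    slope is a hypothetical constant chosen so that 7-9 gives 0.05 and
    6-10 gives 0.10; records at or above 0.500 carry no penalty.
    """
    games = conf_wins + conf_losses
    if games == 0:
        return 0.0
    conf_pct = conf_wins / games
    return max(0.0, slope * (0.5 - conf_pct))


def adjusted_rpi(rpi, conf_wins, conf_losses):
    """RPI after applying the (assumed linear) conference penalty."""
    return rpi - conference_penalty(conf_wins, conf_losses)


print(round(conference_penalty(7, 9), 3))   # one game under 0.500
print(round(conference_penalty(6, 10), 3))  # two games under 0.500
print(conference_penalty(9, 9))             # break-even: no penalty
```

Whether the penalty keeps growing linearly for worse records is unknown from the text; a 6-10 team is already described as essentially eliminated, so the shape beyond that point matters little in practice.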
The NCAA championship handbook lists quite a few other factors that are supposedly considered by the committees:
Once an individual ranks teams according to his initial feelings (as described in the previous section), he will make a few "common sense" adjustments. While most individuals seem to have very similar initial feelings, it seems that the adjustments will vary wildly from person to person. I have found four types of adjustments that tend to be made. All four are present in voted polls; only two are in the selection results.
Note that all of these factors signify leanings. Each voter also has an opinion regarding his initial ballot, meaning that the need for reranking according to these principles has to be sufficiently strong to overcome his initial feeling.
The pseudopoll is a simulated poll, with 50 "virtual" voters selected. The average voter profile is as described above, but the 50 voters themselves each have their own personal leanings on each issue, which they use to produce their own ballot. Note that ballots are not saved from poll to poll, meaning that there is no "inertia" in the pseudopoll (aside from the fact that whatever caused a team to be ranked high in the previous poll will cause it to be ranked high in the current one). In most sports, each ballot consists of 25 teams, with a first place vote worth 25 points, second worth 24, and so on. In hockey, the ballot only includes 10 teams, with points 10-9-8-...
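The point scheme described above (25-24-23-... for most sports, 10-9-8-... for hockey) can be sketched as a ballot tally. The function name, the tie-breaking rule (alphabetical by team name), and the sample ballots are hypothetical; how each virtual voter produces a ballot is not shown here.

```python
from collections import defaultdict


def tally_poll(ballots, ballot_size=25):
    """Sum poll points over all ballots.

    Each ballot is a list of team names, best first. First place earns
    ballot_size points, second place one fewer, and so on.
    Returns (team, points) pairs, highest point total first.
    """
    points = defaultdict(int)
    for ballot in ballots:
        if len(ballot) > ballot_size:
            raise ValueError("ballot lists more teams than allowed")
        if len(set(ballot)) != len(ballot):
            raise ValueError("ballot lists a team more than once")
        for position, team in enumerate(ballot):
            points[team] += ballot_size - position
    # Ties in points are broken alphabetically (an arbitrary choice).
    return sorted(points.items(), key=lambda item: (-item[1], item[0]))


# Three voters, shortened three-team ballots for readability.
ballots = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]
print(tally_poll(ballots, ballot_size=3))
```

For a hockey poll the same function would be called with ballot_size=10; the default of 25 matches the other sports.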
As mentioned in the predictive rating description, the pseudopoll includes priors to improve early season stability. This is a factor based on preseason guesses of team strength that is incorporated into the team performance calculation but not into the head-to-head or common opponent modifiers.
The pairwise ranking is based on a single "ballot" using the average selection profile for the sport in question.
Note: if you use any of the facts, equations, or mathematical principles on this page, you must give me credit.