This year, ESPN has devised yet another system for rating college basketball teams. They tout it as an improvement over both the RPI system that the Selection Committee uses to help determine seeding and the KenPom efficiency rankings, which remove game pace from the equation. BPI was devised by Dean Oliver, a sharp cookie who, along with Ken Pomeroy, pioneered the new methodology for basketball statistics. Oliver argues that his system is more complete than KenPom because it:
- Factors in diminishing returns for blowouts
- Considers all wins better than losses
- De-weights games with missing key players.
Fair enough. I suppose I could dig deeper to understand all the algorithms of BPI, but I’ll take Dean at his word. What interested me about the explanation of BPI is that Oliver claimed it was 74.4 percent accurate in picking the tourney games from 2007 to 2012. I don’t know why they cut off the analysis at six years. I suppose it has something to do with the fact that 2006 was a gruesome year. Remember George Mason? Anyway, I don’t know that for sure. Oliver and ESPN also said that they couldn’t compare BPI to KenPom because they didn’t have Ken’s pre-tourney data.
Well, I do. So I decided to do a comparison. I stacked BPI up against KenPom and a baseline strategy of picking high seeds, with average margin breaking seed ties. (I didn’t analyze RPI because it’s a notoriously poor predictor of tourney outcomes. I’ll cover this in another post.)
The first thing I realized was that Oliver and company were evaluating tourney outcomes round by round. In other words, that 74.4 percent does not measure the accuracy of filling out your whole bracket in advance of the dance. It takes the outcome of the first round, then re-evaluates the teams. That’s a clever way of claiming a higher accuracy rate. Here’s what I mean: last year, had you filled out your bracket completely, you no doubt would’ve had Duke beating Lehigh, then downing either Notre Dame or Xavier in round two. Of course, you would’ve been wrong in both games. But the way ESPN is calculating the accuracy of BPI, they look at the second-round Xavier/Lehigh matchup and credit their system for getting that game right. Unfortunately, for the typical tourney pool, where you fill out your bracket before the dance and let it ride, you don’t have this luxury of round-by-round reassessment. That alone inflates the BPI accuracy rate.
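To make the difference concrete, here’s a rough sketch of the two scoring methods applied to last year’s Duke pod. The ratings are made-up numbers for illustration only (they are not real BPI or KenPom values), and the helper names are my own.

```python
# Hypothetical ratings (higher = better). These are NOT real BPI or
# KenPom numbers; they only need to rank the teams in a plausible order.
ratings = {"Duke": 90, "Notre Dame": 80, "Xavier": 78, "Lehigh": 70}

# First-round pairings and what actually happened in that pod in 2012.
round_one = [("Duke", "Lehigh"), ("Notre Dame", "Xavier")]
actual_r1 = ["Lehigh", "Xavier"]
actual_r2 = "Xavier"


def pick(a, b):
    """Pick the higher-rated team."""
    return a if ratings[a] >= ratings[b] else b


def prefilled_correct():
    """Fill out the whole pod in advance and live with early misses."""
    picks_r1 = [pick(a, b) for a, b in round_one]
    pick_r2 = pick(*picks_r1)  # second-round pick uses *predicted* winners
    hits = sum(p == w for p, w in zip(picks_r1, actual_r1))
    return hits + (pick_r2 == actual_r2)


def round_by_round_correct():
    """Re-pick each round using the teams that actually advanced."""
    picks_r1 = [pick(a, b) for a, b in round_one]
    hits = sum(p == w for p, w in zip(picks_r1, actual_r1))
    pick_r2 = pick(*actual_r1)  # second-round pick uses *actual* winners
    return hits + (pick_r2 == actual_r2)


print(prefilled_correct(), "of 3 with a pre-filled bracket")
print(round_by_round_correct(), "of 3 when re-picking round by round")
```

Same ratings, same games, but the round-by-round method gets credit for a game the pre-filled bracket never had a chance at.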
All that said, I did look at how BPI compared to KenPom and the higher-seed strategy in round-by-round game prediction. The next thing I realized is that ESPN factored in the play-in games of 2007-10 and the First Four games of 2011-12. Otherwise, there was no way that the percentage would’ve rounded to 74.4 percent. So, they’re evaluating their system against 390 game decisions and saying they got 290 correct. I don’t know how their system sorted out the pre-tourney sham games, so let’s assume they were nine for 12 in those matchups. That means for the 378 real tourney games played between 2007 and 2012, BPI must’ve forecast 281 games correctly.
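If you want to check my math, here’s a quick sketch of the arithmetic. The nine-for-12 figure is purely my assumption, as noted above, and the variable names are just for illustration.

```python
# Six tourneys (2007-2012), 63 main-draw games each.
main_draw_games = 63 * 6
# One play-in game a year in 2007-10, four First Four games in 2011-12.
play_in_games = 1 * 4 + 4 * 2
total_games = main_draw_games + play_in_games

# Which whole-number win totals round to 74.4 percent?
without_play_ins = [k for k in range(main_draw_games + 1)
                    if round(100 * k / main_draw_games, 1) == 74.4]
with_play_ins = [k for k in range(total_games + 1)
                 if round(100 * k / total_games, 1) == 74.4]
print(without_play_ins)  # no win total out of 378 rounds to 74.4
print(with_play_ins)     # only 290 out of 390 does

# ASSUMPTION: BPI went 9-for-12 in the play-in/First Four games.
assumed_play_in_correct = 9
main_draw_correct = with_play_ins[0] - assumed_play_in_correct
print(main_draw_correct, "of", main_draw_games, "main-draw games")
```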
So…the question is: how did KenPom and a simple higher-seed strategy (hereafter called “YourMom” since anyone can pick games by seed) perform over the same time period? The answer is: not as well…but not significantly worse. Both KenPom and YourMom got 276 games right. That works out to a 73.0 percent accuracy rate, a scant 1.4 percentage points below BPI’s advertised 74.4. That’s right: the high-falutin’ BPI system yields prediction results that are just 1.4 points better than a tourney ninny advancing all the higher seeds, and KenPom does no better than the ninny at all. Ouch.
I’m sure, in some way, this could be spun as a small victory for BPI. But let’s remember a couple of things. First, this is the inaugural year of the system. We have no idea whether it was “fitted” to yield more accurate tourney results. While the three extra factors that BPI contemplates seem reasonable on the face of it, the relative weight that ESPN gives each factor is the more important consideration. I’m not saying that this is what’s happened, but you can certainly dial the knobs to fit history more accurately. I do the same thing each year unapologetically, trying to learn lessons from each tourney as it comes. But the most important test of whether this sort of fitting is valid comes with the results of the next dance. In short, a five-game advantage over six tourneys isn’t anything to write home about, particularly when it’s so close to the strategy that any bracket newbie could adopt.
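For the stat-minded, here’s a back-of-the-envelope sketch of just how small a five-game edge is. It uses a plain two-proportion z-test, which is my own choice and not something ESPN or I ran for the numbers above. It also treats the two sets of picks as independent samples, which they aren’t (both systems are picking the same 378 games), so take it as a rough gauge rather than a proper test.

```python
import math

games = 378          # main-draw games, 2007-2012
bpi_correct = 281    # backed out of ESPN's figure, assuming 9-for-12 in play-ins
seed_correct = 276   # higher-seed strategy (KenPom landed on the same total)

p_bpi = bpi_correct / games
p_seed = seed_correct / games

# Pooled two-proportion z-test. ASSUMPTION: independent samples, which
# overstates the noise a bit because the games are shared.
pooled = (bpi_correct + seed_correct) / (2 * games)
std_err = math.sqrt(pooled * (1 - pooled) * (2 / games))
z = (p_bpi - p_seed) / std_err

print(f"gap: {100 * (p_bpi - p_seed):.1f} points, z = {z:.2f}")
print("clears the usual 1.96 bar?", abs(z) > 1.96)
```

Under those assumptions, the z-score lands well short of the usual 1.96 threshold, which squares with calling the gap “not significantly worse.”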
Okay. So I did one other analysis yesterday, and I think it’s more relevant than determining round-by-round prediction accuracy. This analysis compared the accuracy of KenPom to YourMom in filling out your entire bracket and living with the consequences of lost games in previous rounds. I was able to go back nine years for this analysis. What did I find? Using KenPom efficiency data would’ve correctly predicted 376 of the 567 tourney games played between 2004 and 2012. That’s a 66.3 percent accuracy rate. And how would YourMom have done? Amazingly, two games better, for a 66.7 percent accuracy rate. That’s right. Picking by seeds and margin beats out using KenPom.
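Here’s the same arithmetic as a quick sketch, in case you want to plug in your own system’s totals. The dictionary and its keys are just illustrative names.

```python
# Nine tourneys (2004-2012), 63 games each, scored as pre-filled brackets.
total_games = 63 * 9

correct = {
    "KenPom": 376,
    "YourMom": 376 + 2,  # higher seed, average margin breaking ties
}

accuracy = {name: round(100 * wins / total_games, 1)
            for name, wins in correct.items()}

for name, pct in accuracy.items():
    print(f"{name}: {correct[name]} of {total_games} = {pct} percent")
```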
Now, I’m not here to question KenPom. I think his data is absolutely invaluable to analyzing the quality of basketball teams. I crutch on it, as do forward-thinking analysts like Jay Bilas and Jimmy Dykes. But what this shows is that no system can reliably predict basketball outcomes. We all try, of course, and I would argue that all these systems underrate the value of things like coaching, momentum and individual matchups. But at the end of the day, the NCAA tournament is a giant puzzle of probabilities…and the people controlling the outcomes are 20-year-old kids.
So when it comes time to fill out your bracket, cast a cold eye on all these statistical methods. Even mine. Take from them what you believe, use your gut instincts…and create a bracket that’s your own.
Either that, or call your Mom.