Fun With Playoff Odds Modeling

Author’s note: “Five Things I Liked (Or Didn’t Like) This Week” is taking a short break, but will return next Friday for the end of the regular season.
Earlier this week, I did the sabermetric equivalent of eating my vegetables by testing the accuracy of our playoff odds projections. I found that our odds do a pretty good job of beating season-to-date odds (particularly late) and pure randomness (particularly early; everything does pretty well late). It’s good to intermittently check in on the accuracy of our predictions. It’s also helpful to build a baseline as a benchmark against which to measure future changes or updates.
Those are a bunch of solid, workmanlike reasons to write a measured, lengthy article. But boring! Who likes veggies? I want to beat the odds, and I want to flex a little mathematical muscle while doing it. So I goofed around with a computer program and tried to find ways to slice up and recombine our existing numbers into improved odds. It didn’t break the game wide open or anything, but I’m going to talk about my attempts anyway, because it’s September 19, there aren’t many playoff races going on, and you can only write so many articles about whether the Mets will collapse or whether Cal Raleigh will hit 60 dingers.
What if you just penalized extreme values?
I first tried to correct for the fact that early-season projection-based odds (which I’m calling FanGraphs mode for the rest of the article) seem to be too confident and thus prone to large misses. I did so by applying a mean reversion factor that pulled every team’s values toward the league-wide average playoff chances (i.e. the share of teams that made the playoffs that year). That average depends on the playoff format in effect; we have 16-team, 12-team, and 10-team samples in the data, and I adjusted each appropriately. I set the mean reversion factor so that it was strong early in the year and decayed to zero by the end of the season.
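To make the idea concrete, here is a minimal sketch of that kind of mean reversion. It is not the actual program used for this article: the function names, the 30-team league size, the linear decay schedule, and the starting weight of 0.3 are all assumptions chosen for illustration. The article only specifies that the pull is strong early and reaches zero by season's end.

```python
def league_average_odds(playoff_teams: int, league_size: int = 30) -> float:
    """Share of teams that make the playoffs under a given format.

    league_size=30 is an assumption (the current number of MLB teams).
    """
    return playoff_teams / league_size


def reversion_weight(season_fraction: float, initial_weight: float = 0.3) -> float:
    """Weight placed on the league average at a given point in the season.

    Assumption: the weight starts at initial_weight on Opening Day
    (season_fraction = 0.0) and decays linearly to zero at the end of
    the season (season_fraction = 1.0). Both the linear shape and the
    0.3 starting value are hypothetical.
    """
    if not 0.0 <= season_fraction <= 1.0:
        raise ValueError("season_fraction must be between 0 and 1")
    return initial_weight * (1.0 - season_fraction)


def shrink_odds(
    odds: float,
    season_fraction: float,
    playoff_teams: int,
    league_size: int = 30,
    initial_weight: float = 0.3,
) -> float:
    """Pull a team's playoff odds toward the league-wide average."""
    baseline = league_average_odds(playoff_teams, league_size)
    weight = reversion_weight(season_fraction, initial_weight)
    return (1.0 - weight) * odds + weight * baseline


# A team with 90% odds on Opening Day in a 12-team playoff format gets
# pulled toward the 40% league average; at season's end it is untouched.
opening_day = shrink_odds(0.90, season_fraction=0.0, playoff_teams=12)
final_day = shrink_odds(0.90, season_fraction=1.0, playoff_teams=12)
```

One convenient property of a blend like this: if the raw odds across the league sum to the number of playoff spots, the adjusted odds do too, because the baseline is itself the league average.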










