How Well Do Our Playoff Odds Work?

It’s the time of year when folks doubt the playoff odds. With the St. Louis Cardinals going from 71-69 long-shots to postseason clinchers, and the rollercoaster that is the American League Wild Card race, you’ve probably heard the skeptics’ refrains. “You had the Jays at 5%, and now you have them at 50%. Why did you hate them so much?” Or, hey, this tongue-in-cheek interview response that mainly makes me happy Adam Wainwright reads our site:

In that generic statement’s defense, it really does feel that way. In your head, 5% rounds to impossible. When the odds say “impossible” and then the season progresses to a point where outcomes are far less certain, what other impression can you take away than “these odds were wrong”?

I feel the same way from time to time. Just this year, the Cardinals and Blue Jays have been written off and then exploded back into contention. St. Louis bottomed out at 1.3% odds to make the playoffs – in August! It’s not quite negative 400 percent, but it sure feels that way. Can it really be that those odds were accurate, and that we just witnessed a one-in-100 event?

To investigate this question, I did what I often do when I don’t know where to turn: I bothered Sean Dolinar. More specifically, I got a copy of our playoff odds on every day since 2014, the first year when we calculated them using our current method. I left out 2021, as we don’t have a full season of data to use yet, but that still left me with a robust (some would say too robust) amount of data.

Let’s get the headline out of the way first: our odds work! Here’s a graph where I placed each team’s projection on each day of the 2014-20 seasons in groups of .05, then checked how frequently those teams made the playoffs. The x axis is the average odds our projection system gave each bucket — teams with between a 20% and 25% chance of making the playoffs, for example, had an average projection of 22.4%. The y axis is how frequently those teams made the playoffs at year’s end — 22% for that same bucket. Without further ado:

That’s pretty neat. The general pattern conforms to what you’d expect if the system were an unbiased estimator. Teams with low odds of making the playoffs rarely do. Teams with high odds almost always do. Cut it down to every one percent bucket for added granularity (but smaller samples), and the pattern remains:
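For anyone who wants to try this kind of check at home, here is a minimal sketch of the bucketing described above. It is not our production code: the function name, the record layout, and the sample team-days are all invented for illustration, and the only thing taken from the article is the idea of grouping projections by width (.05 or .01) and comparing average odds to observed playoff frequency.

```python
from collections import defaultdict


def calibration_table(records, width=0.05):
    """Group (predicted_odds, made_playoffs) pairs into buckets of the given width.

    Returns rows of (bucket_floor, count, mean_predicted, observed_rate).
    """
    buckets = defaultdict(list)
    for predicted, made_playoffs in records:
        # Integer bucket index; the tiny epsilon guards against float error at bucket edges.
        index = int(predicted / width + 1e-9)
        buckets[index].append((predicted, made_playoffs))

    rows = []
    for index in sorted(buckets):
        members = buckets[index]
        count = len(members)
        mean_predicted = sum(p for p, _ in members) / count
        observed_rate = sum(m for _, m in members) / count
        rows.append((round(index * width, 2), count, mean_predicted, observed_rate))
    return rows


# Hypothetical team-days (odds on a given day, 1 if that team made the playoffs).
sample = [
    (0.21, 0), (0.22, 0), (0.23, 1), (0.24, 0),
    (0.81, 1), (0.84, 1), (0.83, 0),
    (0.02, 0),
]

rows = calibration_table(sample)
for row in rows:
    print(row)
```

A well-calibrated system is one where the last two columns of each row land close to each other once the sample is large enough.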

If you hunt for it, you can find some soft spots. We seem to under-forecast teams in the 5-15% odds range, over-forecast teams in the middle of the pack, and under-forecast the very best teams. It’s marginal, though; we think that low bucket should have a 10% chance of making the playoffs, and they check in at 14%. Meanwhile, the middle of the distribution (35%-65%) “should” make the playoffs half the time, but they check in at 46%.
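Those pooled figures can be approximately recovered from the raw five-percent table at the end of this piece. The sketch below weights each bucket by its count; the helper name is my own, and because the raw table still includes clinched and eliminated teams, treat the results as a rough check rather than an exact reproduction.

```python
# Rows copied from the "Raw Data, by 5 Percent" table:
# (bucket floor, count, our odds, observed playoff rate).
ROWS = [
    (0.05, 2201, 0.074, 0.115),
    (0.10, 1694, 0.125, 0.168),
    (0.35, 888, 0.375, 0.392),
    (0.40, 808, 0.424, 0.376),
    (0.45, 780, 0.475, 0.437),
    (0.50, 730, 0.525, 0.468),
    (0.55, 722, 0.575, 0.517),
    (0.60, 741, 0.625, 0.576),
]


def pooled(rows, low, high):
    """Count-weighted predicted and observed rates for bucket floors in [low, high)."""
    chosen = [row for row in rows if low <= row[0] < high]
    total = sum(count for _, count, _, _ in chosen)
    predicted = sum(count * ours for _, count, ours, _ in chosen) / total
    observed = sum(count * seen for _, count, _, seen in chosen) / total
    return predicted, observed


low_pred, low_obs = pooled(ROWS, 0.05, 0.15)   # the 5-15% range
mid_pred, mid_obs = pooled(ROWS, 0.35, 0.65)   # the 35-65% range

print(f"5-15%:  predicted {low_pred:.3f}, observed {low_obs:.3f}")
print(f"35-65%: predicted {mid_pred:.3f}, observed {mid_obs:.3f}")
```

The low range comes out at roughly 10% predicted against 14% observed, and the middle range at roughly 49% predicted against 46% observed.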

Can you tell the difference between a 10% and 14% chance? I certainly can’t. Meanwhile, the overall shape of the distribution makes sense. That said, I wanted to get slightly deeper than “our odds generally conform to reality.” First things first: I took our season-to-date odds model and calculated all the same values. For our purposes, that’s a “naive” model — it absolutely believes what has happened in the season so far. You can get more naive, as well — I also ran the numbers on our coin flip mode, which makes each game a toss-up.

I compared the three models based on mean absolute error. For those who haven’t used it, it measures model accuracy as the average distance between each prediction and what actually happened (1 if the team made the playoffs, 0 if it didn’t), so lower is better. It’s particularly useful in binary models, because it does a good job of punishing models that don’t make specific predictions. Take a hypothetical model that’s trying to make the same prediction that we are: whether or not a team will make the playoffs. From 2014-20, 76 teams made the playoffs: 10 per year for six years, and 16 in the manic 2020 season. That works out to 36.2% of all team seasons. If you give every team a 36.2% chance of making the playoffs, you’ll be “right” all the time, in the sense that 36.2% of the teams you gave a 36.2% chance to reach the playoffs will do just that.

Obviously, your model would be awful. But in a superficial sense, it’s perfect. When it spits out a prediction of 36.2%, it means it exactly; 36.2% of teams in that bucket will make the postseason. Mean absolute error penalizes this model on a relative basis. At the start of this year, for example, we gave the Dodgers a 98.4% chance of making the playoffs and the Orioles a 0.0% chance. That’s an average error of 0.8 percentage points, as compared to the silly model’s average 50 percentage point error.
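Here is that arithmetic as a small sketch. The function and variable names are mine, and the outcomes (Dodgers in, Orioles out) are an assumption, though it is the one implied by the 0.8-point figure above. The 30 teams per season are what turn 76 playoff teams into a 36.2% base rate.

```python
def mean_absolute_error(predictions, outcomes):
    """Average absolute gap between predicted odds and 0/1 outcomes."""
    return sum(abs(p - o) for p, o in zip(predictions, outcomes)) / len(predictions)


# Base rate: 10 playoff teams in each of six seasons, plus 16 in 2020.
playoff_teams = 10 * 6 + 16
team_seasons = 30 * 7
base_rate = playoff_teams / team_seasons

# Assumed outcomes: the Dodgers made the playoffs (1), the Orioles did not (0).
outcomes = [1, 0]
preseason_odds = [0.984, 0.000]
silly_odds = [base_rate, base_rate]

mae_preseason = mean_absolute_error(preseason_odds, outcomes)
mae_silly = mean_absolute_error(silly_odds, outcomes)

print(f"base rate:       {base_rate:.3f}")
print(f"preseason odds:  {mae_preseason:.3f}")
print(f"constant model:  {mae_silly:.3f}")
```

The constant model lands on exactly 0.5 for any pair with one playoff team and one non-playoff team, no matter what the base rate is, because its two errors always sum to one.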

Why tell you this? Because we can use mean absolute error to check our model’s relative strong and weak points. Across the entire data set, it has an MAE of 0.223. The season-to-date odds, meanwhile, have an MAE of 0.245. In other words, the FanGraphs projection model does a better job of predicting the playoffs than looking at a team’s record and remaining games. Coin flip odds had an MAE of 0.275. In a general sense, that should give you an idea of a minimum bound for projections.

We can slice it up further. Odds don’t work the same in May and September. In May, a team’s future play is of utmost importance. In September, math takes over. Six back with nine games left to play? It doesn’t take a great model to tell you that’s a hard mountain to scale. Let’s see how our three ways of projecting the season do as time changes.
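To put a rough number on that six-back, nine-to-play mountain, here is a back-of-the-envelope sketch. It is not how any of our modes work: it assumes every remaining game is an independent coin flip, that the two teams never play each other, and that only those two teams matter. Under those assumptions the gap the trailing team closes follows a binomial distribution, so the answer can be computed exactly.

```python
from math import comb


def catch_up_probability(deficit, games_left):
    """Chance a trailing team ties or passes the leader under coin-flip assumptions.

    Assumes both teams play `games_left` independent 50/50 games
    and never face each other.
    """
    # Trailer wins X games and leader wins Y. The trailer catches up when
    # X - Y >= deficit, i.e. X + (games_left - Y) >= deficit + games_left,
    # and X + (games_left - Y) is Binomial(2 * games_left, 0.5).
    trials = 2 * games_left
    needed = deficit + games_left
    favorable = sum(comb(trials, k) for k in range(needed, trials + 1))
    return favorable / 2 ** trials


six_back_nine_left = catch_up_probability(6, 9)
print(f"{six_back_nine_left:.4%}")
```

That works out to 988 favorable outcomes in 262,144, or a bit under 0.4%, just to pull even.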

I broke up each of the three models by month, then looked at the mean absolute error in each month. For this section, as well as the above overall numbers, I also removed data points where a team had mathematically clinched either making the postseason or missing it. If a team has an undecided fate, our model opining on it is useful data. After the outcome is determined, we shouldn’t give models credit for knowing that outcome. Here’s that data:

Projection Systems by MAE
Month          FG Projection   Season-to-Date   Coin Flip
March/April    0.289           0.334            0.412
May            0.258           0.291            0.338
June           0.222           0.244            0.265
July           0.212           0.222            0.233
August         0.186           0.197            0.203
Sep/Oct        0.121           0.130            0.131

This shape makes a ton of sense. Early in the season, projections have a commanding information advantage. They know who is on each team (season-to-date mode uses last year’s stats as an input early in the year) and how good they’ve been in their career. As the season wears on, season-to-date mode gets more information about the current team, and team records start to matter. Down 15 games in late June? Both models are going to give you slim odds of making the playoffs.
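One way to see that shrinking advantage is to subtract the columns of the monthly MAE table from one another. The sketch below does only that; the numbers are copied from the table, and the function name is my own.

```python
# Rows copied from the "Projection Systems by MAE" table:
# (month, FG projection, season-to-date, coin flip).
MAE_BY_MONTH = [
    ("March/April", 0.289, 0.334, 0.412),
    ("May", 0.258, 0.291, 0.338),
    ("June", 0.222, 0.244, 0.265),
    ("July", 0.212, 0.222, 0.233),
    ("August", 0.186, 0.197, 0.203),
    ("Sep/Oct", 0.121, 0.130, 0.131),
]


def projection_edge(rows):
    """MAE advantage of the FG projection over each naive mode (positive = projection better)."""
    return [
        (month, round(season_to_date - fg, 3), round(coin - fg, 3))
        for month, fg, season_to_date, coin in rows
    ]


edges = projection_edge(MAE_BY_MONTH)
for month, over_season_to_date, over_coin in edges:
    print(f"{month:<12} vs season-to-date: {over_season_to_date:.3f}   vs coin flip: {over_coin:.3f}")
```

The edge over season-to-date mode falls from 0.045 in March/April to 0.009 in September and October, and the edge over coin flip mode falls from 0.123 to 0.010.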

I ran a few other tests, but nothing changed the overall conclusions I’d already reached. The projection-based model does better than the season-to-date model, and both beat guessing. The errors decrease over time. Distributionally, there’s no section with a clear bias.

Naturally, that left me looking for fun oddities. The team with the lowest playoff odds to cash in? That’d be the 2015 Texas Rangers, who we clocked as low as 0.5% early in the year. They were 57-57 on August 14, five games back in the AL West, before peeling off a 31-17 run to clip the Astros in a weak division.

Spare a thought, too, for the 2018 Oakland A’s, who we pegged at 2.9% after a 5-10 start. They, too, went on quite a streak; 36-36 on June 18, they went 37-13 in their next 50 games. By then, we had them at 77% to make the playoffs, but we certainly had a low estimation of their strength starting out. Given how many players they retained from a team that finished last in the AL West the year before, I understand the model’s miss, but it’s good to remember that even something that incorporates projection systems will miss badly from time to time.

The worst miss in September? It’ll be this year’s Cardinals when the season is in the books, but for now it’s the 2019 Brewers, who were five games out of the second Wild Card on September 5. They went 18-5 to end the season, a rough mirror of this season’s St. Louis run. Of course, plenty of teams that start September five games out of the playoffs simply don’t make it, which explains why our system made them only 5.6% to secure a playoff berth — and why teams in that rough bucket have actually made the playoffs only 7% of the time in our sample.

On the other side of things, the 2015 Nationals started out 27-18, and we pegged them at 98.3% likely to make the postseason. They posted a losing record the rest of the way, a desultory 56-61, while the Mets caught and passed them. The only other team with similarly strong odds that went on to miss was Cleveland in 2019 — they looked like the class of that division early in the season, and won 93 games, fairly close to our 97-win projection. The Twins were simply better — they won 101 games — and 93 wins just missed the Wild Card cutoff. It’s hard making projections sometimes!

Does all of this mean that the odds work? That’s up to you to decide. I can tell you that they work better than looking at a team’s season-to-date play, and far better than flipping a coin, but I don’t know what your minimum accuracy to deem a model successful is. For my part, I’m happy with them. They generally point in the right direction, they tell a consistent story, and they’re especially useful early in the season, though all models have higher error bars that early on.

That doesn’t mean they can’t be improved. Our method, while it involves several projection systems, is reasonably straightforward. There aren’t many moving parts — mean projections and playing time mushed up against the remaining schedule and the current standings. I’ve always thought it would be helpful to add some type of depth layer on top of that, and I still hope we’ll implement it at some point — thanks, Morris Greenberg, for an interesting solution to this problem that I still can’t quite wrap my head around.

For the moment, though, I’m satisfied. Do our odds miss the mark from time to time? Of course they do. For a low-input model, the fact that it handily outperforms naive projections early in the season and keeps ahead of them as we get down to crunch time says enough for me. And hey, sometimes the ones that look like errors end up okay. That Blue Jays projection — 4.6% on August 27 — looked pretty bad, until it looked good. It might yet look bad! The outcome of future games is unpredictable — one of the great selling points of sports. A low-probability run doesn’t mean probability is broken — it means something incredible is happening. And really, isn’t that the best takeaway of all (says the writer who thinks the odds work, of course)?

One last thing: want some big tables of all the data I used in constructing this? Here are the raw tables of the five-percent and one-percent buckets used to construct the above graphs, unedited for teams that have clinched a postseason berth or been eliminated from contention:

Raw Data, by 5 Percent
Odds Count Our Odds Observed
0 15975 0.006 0.008
0.05 2201 0.074 0.115
0.10 1694 0.125 0.168
0.15 1577 0.175 0.183
0.20 1372 0.224 0.221
0.25 1093 0.273 0.269
0.30 956 0.325 0.326
0.35 888 0.375 0.392
0.40 808 0.424 0.376
0.45 780 0.475 0.437
0.50 730 0.525 0.468
0.55 722 0.575 0.517
0.60 741 0.625 0.576
0.65 714 0.675 0.653
0.70 635 0.725 0.753
0.75 715 0.776 0.787
0.80 738 0.826 0.837
0.85 920 0.876 0.855
0.90 1211 0.927 0.895
0.95 3374 0.986 0.993
1.00 2806 1.000 1.000

Raw Data, by 1 Percent
Odds Count Our Odds Observed
0 12880 0.001 0.001
0.01 1088 0.015 0.024
0.02 772 0.025 0.014
0.03 623 0.035 0.043
0.04 612 0.045 0.070
0.05 473 0.055 0.085
0.06 457 0.065 0.101
0.07 457 0.075 0.129
0.08 424 0.085 0.130
0.09 390 0.095 0.138
0.10 339 0.105 0.142
0.11 328 0.115 0.174
0.12 349 0.125 0.192
0.13 335 0.135 0.176
0.14 343 0.145 0.155
0.15 302 0.155 0.189
0.16 317 0.165 0.164
0.17 330 0.175 0.182
0.18 326 0.185 0.199
0.19 302 0.195 0.182
0.20 304 0.205 0.184
0.21 263 0.215 0.183
0.22 309 0.225 0.214
0.23 236 0.235 0.250
0.24 260 0.245 0.285
0.25 246 0.255 0.199
0.26 247 0.265 0.304
0.27 204 0.275 0.240
0.28 204 0.285 0.309
0.29 192 0.295 0.302
0.30 198 0.305 0.333
0.31 216 0.315 0.347
0.32 171 0.325 0.263
0.33 191 0.335 0.288
0.34 180 0.345 0.394
0.35 175 0.355 0.389
0.36 179 0.365 0.397
0.37 173 0.375 0.387
0.38 170 0.385 0.424
0.39 191 0.395 0.366
0.40 166 0.405 0.337
0.41 179 0.415 0.346
0.42 173 0.425 0.416
0.43 135 0.435 0.333
0.44 155 0.445 0.445
0.45 165 0.455 0.406
0.46 156 0.465 0.404
0.47 137 0.475 0.460
0.48 154 0.485 0.487
0.49 168 0.495 0.435
0.50 162 0.505 0.426
0.51 127 0.515 0.496
0.52 151 0.525 0.444
0.53 151 0.535 0.497
0.54 139 0.545 0.489
0.55 136 0.555 0.441
0.56 143 0.565 0.517
0.57 162 0.575 0.562
0.58 152 0.584 0.467
0.59 129 0.595 0.597
0.60 158 0.605 0.532
0.61 144 0.615 0.618
0.62 148 0.625 0.527
0.63 140 0.635 0.600
0.64 151 0.645 0.609
0.65 143 0.655 0.713
0.66 149 0.665 0.597
0.67 124 0.675 0.621
0.68 157 0.685 0.618
0.69 141 0.695 0.716
0.70 125 0.705 0.744
0.71 121 0.715 0.736
0.72 134 0.725 0.687
0.73 123 0.735 0.756
0.74 132 0.745 0.841
0.75 136 0.755 0.735
0.76 134 0.765 0.813
0.77 127 0.775 0.819
0.78 158 0.785 0.810
0.79 160 0.795 0.763
0.80 141 0.805 0.823
0.81 131 0.815 0.855
0.82 155 0.825 0.819
0.83 152 0.835 0.836
0.84 159 0.845 0.855
0.85 177 0.855 0.847
0.86 174 0.865 0.833
0.87 168 0.875 0.845
0.88 193 0.885 0.886
0.89 208 0.895 0.861
0.90 198 0.905 0.848
0.91 239 0.915 0.883
0.92 208 0.925 0.899
0.93 263 0.935 0.916
0.94 303 0.945 0.914
0.95 343 0.955 0.968
0.96 355 0.965 0.994
0.97 365 0.975 0.978
0.98 503 0.985 0.998
0.99 1808 0.998 1.000
1.00 2806 1.000 1.000





Ben is a writer at FanGraphs. He can be found on Twitter @_Ben_Clemens.

HamelinROY

It would be interesting to see the deviation per category per month, if that makes any sense. Basically if a team is projected to have a 50% chance of making the playoffs, how much are you off at opening day. How much are you off at July, etc.

That said, great work. It would also be interesting to compare your odds versus other sites, but that might be harder to get the data that you need

dodgerbleu

I would love this. Showing us deviation from daily projections shouldn’t be off at all. They adjust every single day. Kind of stacks the deck in favor of the projections. Monthly would be fascinating though.