Yes, the Playoffs Are Still a Crapshoot

We live in an era where every team, to some degree or another, embraces modern analytics when assessing itself and the rest of the league. Wanting to know about the numbers that drive baseball has filtered into fandom as well, which is why you’re here on this very website! But despite all the progress the stathead crowd has made over the last quarter-century, when it comes to the playoffs and playoff results, many fans seem more inclined to defenestrate the numbers and attribute the losses to all sorts of causative elements beyond a surplus or dearth of players just happening to have particularly good games that week.

In the worst case, failing to win two of three games or three of five is attributed to some kind of character flaw. At best, the loss is because of some fundamental flaw in a team’s construction, typically something that sabermetrics is to blame for, no matter whether the team is sabermetrically inclined or not. Here’s one example of very common thinking along these lines just from the last couple of days. It’s from a fan on Reddit, and I’m not attributing it by name, mainly because I don’t want to risk social media pile-ons:

The postseason is the real season from now on, so the focus shouldn’t be on sabermetrics as much as it has been. With the playoffs being expanded, 111 wins doesn’t mean squat. Shift some of the focus to bringing in guys who play with fire and aggressiveness and will situationally hit rather than live and die by the long ball. If it costs us some wins during the regular season, so what?? It’s the little things that win the most important games.

With three of MLB’s four 100-win teams already out of the playoffs and the 99-win team pushed to the brink but ultimately surviving, these types of recriminations will be common this season. By this logic, the Dodgers didn’t lose a 60/40 matchup (the ZiPS projection for the series) because they were simply outplayed over four games, but because something was broken in how the team was built. Depending on who you listen to, you can hear the same type of grumbling about the Mets and Braves.

Since questions should be explored rather than dismissed, let’s look at playoff overperformers and underperformers over the Wild Card era (starting in 1995) and examine whether there really are consistent patterns behind which teams overperform or underperform in the postseason. And since this is illustrative more than anything, I’m trying to keep it as simple as possible, within reason.

The best place to start is the 950 playoff games played since the 1995 postseason [originally said 1,900 erroneously -DS]. (This doesn’t include Game 5 of the Yankees-Guardians ALDS since I was crunching the numbers while that game was being played.) Using Pythagorean wins plus home-field advantage and projecting games on that basis, here is how each team has over- or underperformed in the playoffs:

Playoff Performance vs. Pythagorean Record (1995-2022)
Team | Expected Wins | Actual Wins | Difference
New York Yankees | 105.1 | 119 | 13.9
San Francisco Giants | 42.3 | 50 | 7.7
Kansas City Royals | 14.4 | 22 | 7.6
Boston Red Sox | 61.5 | 68 | 6.5
Florida Marlins | 17.6 | 24 | 6.4
Philadelphia Phillies | 26.2 | 32 | 5.8
Cleveland Guardians | 47.3 | 51 | 3.7
New York Mets | 26.2 | 28 | 1.8
Chicago White Sox | 12.8 | 14 | 1.2
Washington Nationals | 17.9 | 19 | 1.1
Houston Astros | 62.0 | 63 | 1.0
St. Louis Cardinals | 74.1 | 75 | 0.9
Detroit Tigers | 24.7 | 25 | 0.3
Pittsburgh Pirates | 3.7 | 3 | -0.7
Atlanta Braves | 72.8 | 72 | -0.8
San Diego Padres | 16.0 | 15 | -1.0
Colorado Rockies | 11.3 | 10 | -1.3
Baltimore Orioles | 16.4 | 15 | -1.4
Milwaukee Brewers | 15.1 | 13 | -2.1
Arizona Diamondbacks | 20.2 | 18 | -2.2
Tampa Bay Rays | 30.8 | 28 | -2.8
Seattle Mariners | 19.9 | 17 | -2.9
Toronto Blue Jays | 13.1 | 10 | -3.1
Los Angeles Angels | 24.6 | 21 | -3.6
Cincinnati Reds | 8.9 | 5 | -3.9
Texas Rangers | 25.2 | 21 | -4.2
Chicago Cubs | 31.2 | 25 | -6.2
Oakland A’s | 24.6 | 18 | -6.6
Los Angeles Dodgers | 68.8 | 62 | -6.8
Minnesota Twins | 15.4 | 6 | -9.4

The Dodgers finish near the bottom, though part of that is they’ve played in so many games; they’ve won about seven fewer games in the playoffs since 1995 based on what you’d expect from their Pythagorean record, the Pythagorean record of their opponents, and whether they had home field advantage in any particular game. But one thing worth noting is that Los Angeles’ deficit is almost entirely in games played before Andrew Friedman took over after the 2014 season and moved the team in a more sabermetric-oriented direction. From 2015 to ’22, the Dodgers won 47 playoff games against 47.9 expected wins.
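For readers who want to see the mechanics, here is a minimal sketch of how an expected-wins total like the ones above could be built up game by game. It is not the exact code behind the table. The Pythagorean exponent of 1.83, the log5 method for combining two teams' winning percentages, and the 54% home-field edge are all assumptions chosen for illustration, as are the function names.

```python
def pythagorean_pct(runs_scored: float, runs_allowed: float, exponent: float = 1.83) -> float:
    """Pythagorean winning percentage. The 1.83 exponent is an assumption."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def log5(p_a: float, p_b: float) -> float:
    """Chance that team A beats team B on a neutral field (log5 method, assumed here)."""
    return (p_a * (1 - p_b)) / (p_a * (1 - p_b) + p_b * (1 - p_a))


def home_win_prob(p_home: float, p_away: float, home_edge: float = 0.54) -> float:
    """Chance the home team wins. The 0.54 home edge is an illustrative assumption."""
    neutral = log5(p_home, p_away)
    # Apply the home edge as a multiplier on the neutral-site odds.
    odds = (neutral / (1 - neutral)) * (home_edge / (1 - home_edge))
    return odds / (1 + odds)


def expected_wins(games) -> float:
    """Sum one team's win probabilities over (team_pct, opp_pct, is_home) tuples."""
    total = 0.0
    for team_pct, opp_pct, is_home in games:
        if is_home:
            total += home_win_prob(team_pct, opp_pct)
        else:
            total += 1 - home_win_prob(opp_pct, team_pct)
    return total


# Hypothetical run totals, not real team data.
favorite = pythagorean_pct(850, 650)
underdog = pythagorean_pct(720, 690)
series = [(favorite, underdog, True), (favorite, underdog, True), (favorite, underdog, False)]
print(round(expected_wins(series), 2))
```

Subtracting a total like this from a team's actual playoff wins gives the kind of difference shown in the last column of the table.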

Since eyeballing lists is a poor method of dimensionality reduction, I ran a series of logistic regressions (winning or losing a game is binary data) using a few dozen different variables. If these variables have relevance, then including them along with the Pythagorean projections ought to make them better at sorting wins and losses. I started off with an obvious one: bullpen strength. I want to avoid getting too far into the weeds talking about chi-squares and Wald tests, so here’s a method with a visual element, the ROC curve. Simply put, it evaluates how well we identify wins and losses ahead of time. The larger the area under the blue curves below, the better the model.

On the top is the model using only the Pythagorean winning percentage, and on the bottom, the model uses Pythagorean winning percentage and the relative FIP- of the team’s bullpens. The dashed line represents the results for a random number generator in picking wins and losses, with the blue line indicating the output of the model. In other words, we get a very slight improvement by knowing the bullpen strength in addition to team quality.
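If the "area under the curve" idea feels abstract, it has a simple interpretation: it is the chance that a randomly chosen win was given a higher predicted probability than a randomly chosen loss, so 0.5 matches the random-guess line and 1.0 is a perfect sort. The sketch below computes that directly in plain Python. The probabilities and outcomes are made-up toy values, and the function names are my own for illustration, not anything from the actual analysis.

```python
def roc_auc(probs, outcomes):
    """Area under the ROC curve by pairwise comparison; ties count as half."""
    wins = [p for p, y in zip(probs, outcomes) if y == 1]
    losses = [p for p, y in zip(probs, outcomes) if y == 0]
    if not wins or not losses:
        raise ValueError("need at least one win and one loss")
    score = 0.0
    for w in wins:
        for l in losses:
            if w > l:
                score += 1.0
            elif w == l:
                score += 0.5
    return score / (len(wins) * len(losses))


def pick_accuracy(probs, outcomes):
    """Share of correct picks when p > 0.5 means 'pick a win', otherwise 'pick a loss'."""
    correct = sum(1 for p, y in zip(probs, outcomes) if (p > 0.5) == (y == 1))
    return correct / len(outcomes)


# Toy example: predicted win probabilities and what actually happened (1 = win).
toy_probs = [0.58, 0.55, 0.52, 0.49, 0.47, 0.44]
toy_outcomes = [1, 0, 1, 1, 0, 0]
print(roc_auc(toy_probs, toy_outcomes), pick_accuracy(toy_probs, toy_outcomes))
```

Comparing two models then comes down to running both sets of predictions through the same function and seeing which area is larger.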

To put it into an easy number, if you picked the 1,900 wins-losses using just Pythag and home-field advantage, you’d have correctly picked the winner 1,018 times (53.6%). If you used the model that also included the relative quality of the team bullpens, you would have instead picked the winner 1,026 times. That translates to being correct one more time about every three years.
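The arithmetic behind those figures is quick to check. The snippet below uses only the counts quoted above plus the 28 postseasons from 1995 through 2022; the variable names are just for illustration.

```python
team_games = 1900        # win-loss outcomes cited above
baseline_correct = 1018  # Pythagorean record plus home-field advantage
bullpen_correct = 1026   # same model plus relative bullpen FIP-
postseasons = 2022 - 1995 + 1  # 28 postseasons, 1995 through 2022

baseline_rate = baseline_correct / team_games
bullpen_rate = bullpen_correct / team_games
extra_picks = bullpen_correct - baseline_correct
years_per_extra_pick = postseasons / extra_picks

print(f"{baseline_rate:.1%}")   # 53.6%
print(f"{bullpen_rate:.1%}")    # 54.0%
print(extra_picks)              # 8
print(years_per_extra_pick)     # 3.5
```

Eight extra correct picks spread over 28 postseasons works out to one every three and a half years or so.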

Another common “explanation” of why good teams underperform is that they’re too reliant on home runs. So I calculated a home run propensity for each team, starting with park-neutral seasonal stats and then calculating runs created both with and without every at-bat that resulted in a home run. The least reliant on home runs was the 2007 Angels (22.1% of runs), and the most was the 2020 Dodgers at 53.4%. I then included the relative dependence on home runs for both teams in each game. With this knowledge, you’d improve from picking 1,018 winners to… 1,085.
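To make the "with and without home runs" idea concrete, here is a rough sketch. It assumes the basic runs created formula, (H + BB) × TB / (AB + BB), which may well differ from the version used for the numbers above, and it skips the park-neutral adjustment entirely. The `BattingLine` class, the function names, and the stat line are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    at_bats: int
    hits: int
    walks: int
    total_bases: int
    home_runs: int


def runs_created(line: BattingLine) -> float:
    """Basic runs created: (H + BB) * TB / (AB + BB). An assumed formula."""
    return (line.hits + line.walks) * line.total_bases / (line.at_bats + line.walks)


def without_home_runs(line: BattingLine) -> BattingLine:
    """Drop every at-bat that ended in a home run (one hit and four total bases each)."""
    hr = line.home_runs
    return BattingLine(
        at_bats=line.at_bats - hr,
        hits=line.hits - hr,
        walks=line.walks,
        total_bases=line.total_bases - 4 * hr,
        home_runs=0,
    )


def home_run_share(line: BattingLine) -> float:
    """Fraction of runs created that disappears when home run at-bats are removed."""
    return 1 - runs_created(without_home_runs(line)) / runs_created(line)


# Hypothetical team season, not a real stat line.
team = BattingLine(at_bats=5500, hits=1400, walks=550, total_bases=2300, home_runs=200)
print(f"{home_run_share(team):.1%}")
```

The relative dependence for a given game would then be the difference (or ratio) between the two teams' shares, which is the kind of variable added to the model.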

While the model improves by knowing which team relies more on home runs to generate runs, it didn’t confirm the hypothesis for one crucial reason: the effect was in the opposite direction. Teams with a greater reliance on home runs have overperformed in the postseason, not underperformed. For relative three true outcomes (walks, home runs, strikeouts) performance, you only improve from 1,018 correct to 1,056, but again in the direction of sabermetric-friendly teams, with high TTO teams overperforming in the playoffs slightly.

Contact rate is another post-hoc explanation for why certain teams win, but that was rejected from the model at the 95% level of significance. The same goes for attempts to include momentum (second-half performance, September record). Team speed doesn’t make the threshold. Neither does clutch performance during the regular season or postseason performance history of the players involved. In layman’s terms, knowing which teams have these characteristics has little to no value in telling us why certain teams won playoff games over the last three decades, apart from their effect on their Pythagorean record. Surprisingly, what was (barely) useful enough to make it as a variable was team age, but this was an advantage on the side of young players; young teams, and consequently, teams with less playoff experience very slightly overperformed over the last 28 postseasons.

In the end, I looked at about 60 team-related variables, and they all did almost nothing to explain better which teams won playoff games. ZiPS does a somewhat better job, but only because it “cheats” since it knows who is starting, who is healthy, and who is in the lineup, something which isn’t necessarily a team construction issue.

The Dodgers, Mets, and Braves lost because… well, they lost. That’s it. Enjoy the games, but sometimes a cigar is just a cigar, and that’s perfectly fine.

Dan Szymborski is a senior writer for FanGraphs and the developer of the ZiPS projection system. He was a writer for from 2010-2018, a regular guest on a number of radio shows and podcasts, and a voting BBWAA member. He also maintains a terrible Twitter account at @DSzymborski.

1 year ago

Great article, Dan. I understand that people are annoyed when their team loses and have trouble grasping randomness in these things. But it’s frustrating that we have to Keep. Having. This. Discussion.

1 year ago
Reply to  aglossman

Baseball is a situation where there’s both lots of randomness and a whole ton of data that allows us to say with some confidence that, yes, something unusual did just happen and we shouldn’t completely change our baseline belief. But you still can’t ever rule out the possibility that our analytical tools can be improved and that we missed something that could have improved our prediction. The discussion can’t be avoided. “How much should this unlikely event cause us to update our priors?” is a key question of life.

A few years ago, people were really tired of having the “hot hand” discussion, and yet that still goes on, with occasional interesting contributions to the literature.

Can’t be avoided even more in other areas of life, of course, which lack the type of regulated repeated games of baseball. But I understand that people are even more tired of the discussions when it comes to public policy or politics.

Last edited 1 year ago by JohnThacker
1 year ago
Reply to  aglossman

I don’t think it’s the randomness per se that has people frustrated. It’s that the counterbalances added to the format to compensate for adding more (i.e., less successful) teams into the mix seem to have done nothing to alter the randomness. Teams don’t even seem to have played or composed their roster differently for it. The bye just ends up putting top seeds where they would’ve been under previous formats. It’s just one year but it certainly looks like the new format will undermine regular season success.

1 year ago
Reply to  CHINarwhal

This all presupposes that maintaining the integrity of the regular season is inconsistent with the goal of the new postseason format. If you think that the postseason is meant to reward the best team based on their regular season performance, that is probably inaccurate and the data doesn’t bear it out. If you’re cynical like me and believe that the postseason is meant to be watched on TV, then there is no inconsistency in injecting randomness into it, because that means upsets, Cinderella series, and all sorts of other very watchable things. My horse was eliminated from this race a while back and I have to admit that teams that I hate (hate) have played some of the best, most tense games of baseball I’ve ever seen.

It’s too soon to tell, but if the postseason format retains viewers it will stick around. My point though is that it doesn’t have anything to do with rewarding the teams that play the best or most consistently over the course of the regular season. If that were the case we wouldn’t need a postseason, we would just end the regular season and crown the Dodgers or something.

1 year ago
Reply to  CHINarwhal

It would seem to me that if it didn’t reduce randomness then those additional teams potentially deserved to be there.

I’m not saying that means further expansion should be on the table, because it shouldn’t. But it does mean that in this case 12 teams doesn’t undermine the competitiveness of the playoffs.

1 year ago
Reply to  CHINarwhal

That would be because playoff format changes aren’t made to “address randomness” but to generate more revenue. In fact, the “randomness” is more of a feature, not a bug.

The Playoffs would be meaningless if you could predict the winners regularly just by the numbers.

Last edited 1 year ago by fjtorres
1 year ago
Reply to  aglossman

I think the questions are good. The Reddit thing posited a theory based upon limited observations. He lacked the tools to explore it further. That it was disproven does not make the question stupid. Personally, I read this expecting the conclusion Dan reached, but I read the post because I was open to the possibility that there could be a correlation; that the Reddit thing could be right. Why else would I waste my time reading it? I note that you also read this.

1 year ago
Reply to  bglick4

Eh, I’m at work. What else am I supposed to do?