Yes, the Playoffs Are Still a Crapshoot

by Dan Szymborski
October 19, 2022

Photo credit: Orlando Ramirez-USA TODAY Sports

We live in an era where every team, to some degree or another, embraces modern analytics when assessing itself and the rest of the league. Wanting to know about the numbers that drive baseball has filtered into fandom as well, which is why you’re here on this very website! But despite all the progress the stathead crowd has made over the last quarter-century, when it comes to the playoffs and playoff results, many fans seem more inclined to defenestrate the numbers and attribute the losses to all sorts of causative elements beyond a surplus or dearth of players just happening to have particularly good games that week. In the worst case, failing to win two of three games or three of five is attributed to some kind of character flaw. At best, the loss is blamed on some fundamental flaw in a team’s construction, typically something that sabermetrics is to blame for, no matter whether the team is sabermetrically inclined or not.

Here’s one example of very common thinking along these lines just from the last couple of days. It’s from a fan on Reddit, and I’m not attributing it to a specific user, mainly because I don’t want to risk social media pile-ons:

“The postseason is the real season from now on, so the focus shouldn’t be on sabermetrics as much as it has been. With the playoffs being expanded, 111 wins doesn’t mean squat. Shift some of the focus to bringing in guys who play with fire and aggressiveness and will situationally hit rather than live and die by the long ball. If it costs us some wins during the regular season, so what?? It’s the little things that win the most important games.”

With three of MLB’s four 100-win teams already out of the playoffs and the 99-win team pushed to the brink but ultimately surviving, these types of recriminations will be common this season.
The Dodgers didn’t lose a 60/40 matchup (the ZiPS projection for the series) because they were simply outplayed over four games, but because something was broken in how the team was built. Depending on who you listen to, you can hear the same type of grumbling about the Mets and Braves.

Since questions should be explored rather than dismissed, let’s look at playoff overperformers and underperformers over the Wild Card era (starting in 1995) and examine if there really are consistent patterns behind which teams are overperforming or underperforming in the postseason. And since this is illustrative more than anything, I’m trying to keep it as simple as possible, within reason.

The best place to start is the 950 playoff games played since the 1995 postseason [originally said 1,900 erroneously -DS]. (This doesn’t include Game 5 of the Yankees-Guardians ALDS since I was crunching the numbers while that game was being played.) Using Pythagorean wins plus home-field advantage and projecting games on that basis, here is how each team has over or underperformed in the playoffs:

Playoff Performance vs. Pythagorean Record (1995-2022)

| Team | Expected Wins | Actual Wins | Difference |
|---|---|---|---|
| New York Yankees | 105.1 | 119 | 13.9 |
| San Francisco Giants | 42.3 | 50 | 7.7 |
| Kansas City Royals | 14.4 | 22 | 7.6 |
| Boston Red Sox | 61.5 | 68 | 6.5 |
| Florida Marlins | 17.6 | 24 | 6.4 |
| Philadelphia Phillies | 26.2 | 32 | 5.8 |
| Cleveland Guardians | 47.3 | 51 | 3.7 |
| New York Mets | 26.2 | 28 | 1.8 |
| Chicago White Sox | 12.8 | 14 | 1.2 |
| Washington Nationals | 17.9 | 19 | 1.1 |
| Houston Astros | 62.0 | 63 | 1.0 |
| St. Louis Cardinals | 74.1 | 75 | 0.9 |
| Detroit Tigers | 24.7 | 25 | 0.3 |
| Pittsburgh Pirates | 3.7 | 3 | -0.7 |
| Atlanta Braves | 72.8 | 72 | -0.8 |
| San Diego Padres | 16.0 | 15 | -1.0 |
| Colorado Rockies | 11.3 | 10 | -1.3 |
| Baltimore Orioles | 16.4 | 15 | -1.4 |
| Milwaukee Brewers | 15.1 | 13 | -2.1 |
| Arizona Diamondbacks | 20.2 | 18 | -2.2 |
| Tampa Bay Rays | 30.8 | 28 | -2.8 |
| Seattle Mariners | 19.9 | 17 | -2.9 |
| Toronto Blue Jays | 13.1 | 10 | -3.1 |
| Los Angeles Angels | 24.6 | 21 | -3.6 |
| Cincinnati Reds | 8.9 | 5 | -3.9 |
| Texas Rangers | 25.2 | 21 | -4.2 |
| Chicago Cubs | 31.2 | 25 | -6.2 |
| Oakland A’s | 24.6 | 18 | -6.6 |
| Los Angeles Dodgers | 68.8 | 62 | -6.8 |
| Minnesota Twins | 15.4 | 6 | -9.4 |

The Dodgers finish near the bottom, though part of that is they’ve played in so many games; they’ve won about seven fewer games in the playoffs since 1995 based on what you’d expect from their Pythagorean record, the Pythagorean record of their opponents, and whether they had home field advantage in any particular game. But one thing worth noting is that Los Angeles’ deficit is almost entirely in games played before Andrew Friedman took over after the 2014 season and moved the team in a more sabermetric-oriented direction. From 2015 to ’22, the Dodgers won 47 playoff games against 47.9 expected wins.

Since eyeballing lists is a poor method of dimensionality reduction, I ran a series of logistic regressions (winning or losing a game is binary data) using a few dozen different variables. If these variables have relevance, then including them along with the Pythagorean projections ought to make them better at sorting wins and losses.

I started off with an obvious one: bullpen strength. I want to avoid getting too far into the weeds talking about chi-squares and Wald tests, so here’s a method with a visual element, the ROC curve. Simply put, it evaluates how well we identify wins and losses ahead of time. The larger the area under the blue curves below, the better the model.
On the top is the model using only the Pythagorean winning percentage, and on the bottom, the model uses Pythagorean winning percentage and the relative FIP- of the team’s bullpens. The dashed line represents the results for a random number generator in picking wins and losses, with the blue line indicating the output of the model.

In other words, we get a very slight improvement by knowing the bullpen strength in addition to team quality. To put it into an easy number, if you picked the 1,900 win-loss outcomes (each of the 950 games counted once for each team) using just Pythag and home-field advantage, you’d have correctly picked the winner 1,018 times (53.6%). If you used the model that also included the relative quality of the team bullpens, you would have instead picked the winner 1,026 times. That is eight more correct picks across 28 postseasons, or roughly one more every three or four years.

Another common “explanation” of why good teams underperform is that they’re too reliant on home runs. So I calculated a home run propensity for each team, starting with park-neutral seasonal stats and then calculating runs created both with and without every at-bat that resulted in a home run. The least reliant on home runs was the 2007 Angels (22.1% of runs), and the most was the 2020 Dodgers at 53.4%. I then included the relative dependence on home runs for both teams in each game. With this knowledge, you’d improve from picking 1,018 winners to… 1,085. While the model improves by knowing which team relies more on home runs to generate runs, it didn’t confirm the hypothesis for one crucial reason: the effect was in the opposite direction. Teams with a greater reliance on home runs have overperformed in the postseason, not underperformed.

For relative three true outcomes (walks, home runs, strikeouts) performance, you only improve from 1,018 correct to 1,056, but again in the direction of sabermetric-friendly teams, with high TTO teams overperforming in the playoffs slightly.
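
The comparison used above (fit a logistic regression on the Pythagorean projection alone, refit with one extra variable, then see how many more winners it picks and how much the area under the ROC curve moves) can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the code or data behind the numbers in this article: the games are synthetic, the feature names `pythag_edge` and `bullpen_edge` are hypothetical, the effect sizes 0.15 and 0.03 are made up, and the fit uses plain gradient ascent rather than whatever statistical package was actually used.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    """Win probability for one game; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic(rows, outcomes, steps=100, rate=1.0):
    """Fit logistic-regression weights by plain gradient ascent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, outcomes):
            err = y - predict(weights, x)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w + rate * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def correct_picks(weights, rows, outcomes):
    """Count outcomes where the model's favorite (p >= 0.5) actually won."""
    return sum((predict(weights, x) >= 0.5) == (y == 1)
               for x, y in zip(rows, outcomes))


def auc(weights, rows, outcomes):
    """Area under the ROC curve via pairwise win-vs-loss comparisons."""
    scores = [predict(weights, x) for x in rows]
    wins = [s for s, y in zip(scores, outcomes) if y == 1]
    losses = [s for s, y in zip(scores, outcomes) if y == 0]
    better = sum(1.0 if w > l else 0.5 if w == l else 0.0
                 for w in wins for l in losses)
    return better / (len(wins) * len(losses))


def simulate_games(n=1900, seed=0):
    """SYNTHETIC outcomes: made-up edges and effect sizes, not real games."""
    rng = random.Random(seed)
    rows, outcomes = [], []
    for _ in range(n):
        pythag_edge = rng.gauss(0.0, 1.0)   # hypothetical standardized edge
        bullpen_edge = rng.gauss(0.0, 1.0)  # hypothetical standardized edge
        p_win = sigmoid(0.15 * pythag_edge + 0.03 * bullpen_edge)  # assumed
        rows.append([pythag_edge, bullpen_edge])
        outcomes.append(1 if rng.random() < p_win else 0)
    return rows, outcomes


def compare_models(rows, outcomes):
    """Baseline (Pythag edge only) versus the model with the extra variable."""
    base_rows = [[r[0]] for r in rows]
    base = fit_logistic(base_rows, outcomes)
    full = fit_logistic(rows, outcomes)
    return {
        "baseline_correct": correct_picks(base, base_rows, outcomes),
        "full_correct": correct_picks(full, rows, outcomes),
        "baseline_auc": auc(base, base_rows, outcomes),
        "full_auc": auc(full, rows, outcomes),
    }


if __name__ == "__main__":
    rows, outcomes = simulate_games()
    for name, value in compare_models(rows, outcomes).items():
        print(name, round(value, 3))
```

Because the extra variable's assumed effect is small, the two models in this sketch should land close together on both measures, which is the same kind of marginal gain described for bullpen strength. Swapping in a real per-game dataset would only require replacing `simulate_games` with a loader that returns the same two lists.
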
Contact rate is another post-hoc explanation for why certain teams win, but it failed to reach significance at the 95% level and was dropped from the model. The same goes for attempts to include momentum (second-half performance, September record). Team speed doesn’t make the threshold. Neither does clutch performance during the regular season or the postseason performance history of the players involved. In layman’s terms, knowing which teams have these characteristics has little to no value in telling us why certain teams won playoff games over the last three decades, apart from their effect on their Pythagorean record.

Surprisingly, what was (barely) useful enough to make it as a variable was team age, but this was an advantage on the side of young players; young teams, and consequently teams with less playoff experience, very slightly overperformed over the last 28 postseasons.

In the end, I looked at about 60 team-related variables, and they all did almost nothing to better explain which teams won playoff games. ZiPS does a somewhat better job, but only because it “cheats”: it knows who is starting, who is healthy, and who is in the lineup, something which isn’t necessarily a team construction issue.

The Dodgers, Mets, and Braves lost because… well, they lost. That’s it. Enjoy the games, but sometimes a cigar is just a cigar, and that’s perfectly fine.