Yes, March Projections Matter in September
Back in March, projections were all we had to go on for the 2021 season; we’re now on the cusp of September, with just over 80% of the games already played. To the consternation of a percentage of fans, projections on sites like this one look unreasonably grumpy toward teams like the Giants, who have played above their projections in 2021. It’s undeniable that the projections have been off for some teams, which is something you should expect. But are these computers actually wrong when predicting middling play from these high-achieving surprises going forward?
The Giants, in particular, have already outperformed their original preseason projections by somewhere in the neighborhood of 10 wins. With a month to go in the season, it wouldn’t be a total surprise if San Francisco ended up with 25 more wins than the predictions. Yet our assembled projections only list the Giants with an expected roster strength of .512 over the rest of the season. Using ZiPS alone is sunnier but also well off the team’s seasonal pace through the end of August.
Team | W | L | GB | Pct | Div% | WC% | Playoff% | WS Win% |
---|---|---|---|---|---|---|---|---|
San Francisco Giants | 100 | 62 | — | .617 | 52.8% | 47.2% | 100.0% | 13.6% |
Los Angeles Dodgers | 100 | 62 | — | .617 | 47.2% | 52.8% | 100.0% | 12.8% |
San Diego Padres | 84 | 78 | 16 | .519 | 0.0% | 12.6% | 12.6% | 0.3% |
Colorado Rockies | 73 | 89 | 27 | .451 | 0.0% | 0.0% | 0.0% | 0.0% |
Arizona Diamondbacks | 73 | 89 | 27 | .451 | 0.0% | 0.0% | 0.0% | 0.0% |
Yes, ZiPS projects the Giants as the slight favorite to win the NL West now, but that doesn’t take a lot of mathematical courage given that they’re already in first place with most of the season done. With the help of a little math, you can see that the computer projects San Francisco to go just 16–16 the rest of the season; to be more precise, ZiPS thinks the Giants have a roster strength that indicates a .521 winning percentage with their average opponent for the rest of the season having a .520 roster strength, resulting in a mean projection around the .500 mark. That’s up quite a bit from the preseason, when ZiPS thought the Giants were a .467 team with a .502 schedule, ending up with a 75–87 preseason estimate. But it almost feels a bit disrespectful to the team leading the league in wins for most of the season.
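For the curious, here’s roughly how two strength numbers like those turn into a rest-of-season expectation. This is a minimal sketch using the log5 method, a standard way of combining two winning percentages; it isn’t necessarily the exact formula ZiPS uses internally, but it reproduces the arithmetic above.

```python
# A minimal log5 sketch (Bill James): probability that a team of strength `a`
# beats a team of strength `b`. Not necessarily ZiPS's exact internal method.
def log5(a: float, b: float) -> float:
    return (a - a * b) / (a + b - 2 * a * b)

p_win = log5(0.521, 0.520)                  # Giants' strength vs. average opponent
print(round(p_win, 3), round(p_win * 32))   # ~0.501, which works out to about 16-16
```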
So why are projections so grumpy? In a nutshell, it’s because they’re assembled based on empirical looks at what baseball’s history tells us, not (hopefully) the aesthetic valuations of the developer. And in the baseball history we have, season-to-date performance does not become this all-powerful crystal ball when it comes to projecting how the rest of the season goes.
ZiPS has generated win projections for teams since 2005; not including 2020, that gives us 15 years of projected standings versus reality, making up 450 teams. To demonstrate the point, I went back and got the preseason projections for every team, the actual record of each team through the end of August of the season in question, and the September record of each of those teams. One-month samples are going to be volatile, but if preseason projections have little predictive value compared to actual performance, the April-to-August results should still crush them when it comes to predicting the future.
In reality, they don’t. Of the 450 teams, seasonal performance was the better predictor of September performance by a margin of only 229 to 221. Using a more robust method than a simple head-to-head matchup, seasonal performance had an RMSE (root mean square error) of 0.107, compared to 0.112 for the original projections. A model of September winning percentage using just these two numbers would consist of a mix of 56% actual performance and 44% preseason projection. Using only teams from 2010 to ’19 (2010 is when I started using a probabilistic playing time estimate, which I feel is superior), ZiPS beats actual performance in predicting September, 153 to 147, with the RMSE gap shrinking to 0.106 versus 0.108.
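Here’s a sketch of how a comparison like that can be run. The DataFrame, its column names (preseason_wpct, apr_aug_wpct, sept_wpct), and the blend-fitting step are illustrative assumptions, not the actual files or code behind the numbers above.

```python
import numpy as np
import pandas as pd
from scipy.optimize import minimize_scalar

def rmse(pred, actual) -> float:
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(actual)) ** 2)))

def compare(df: pd.DataFrame) -> None:
    """df: one row per team-season with preseason_wpct, apr_aug_wpct, sept_wpct."""
    sept = df["sept_wpct"]
    # Head-to-head: which estimate lands closer to the September record, team by team?
    actual_closer = (df["apr_aug_wpct"] - sept).abs() < (df["preseason_wpct"] - sept).abs()
    print(f"actual closer: {actual_closer.sum()}, preseason closer: {(~actual_closer).sum()}")
    print("RMSE, April-August record: ", rmse(df["apr_aug_wpct"], sept))
    print("RMSE, preseason projection:", rmse(df["preseason_wpct"], sept))
    # Single best-fit blend weight, e.g. the ~56% actual / 44% preseason mix above.
    loss = lambda w: rmse(w * df["apr_aug_wpct"] + (1 - w) * df["preseason_wpct"], sept)
    best = minimize_scalar(loss, bounds=(0, 1), method="bounded")
    print("best weight on actual record:", round(best.x, 2))
```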
Nor did the preseason projections have trouble with the outliers, the teams with the best and worst winning percentages through the end of August:
Team | September WPct | April-August WPct | Preseason WPct |
---|---|---|---|
2017 Dodgers | .433 | .689 | .574 |
2018 Red Sox | .577 | .684 | .556 |
2011 Phillies | .533 | .652 | .586 |
2019 Yankees | .560 | .650 | .605 |
2015 Cardinals | .484 | .649 | .531 |
2016 Cubs | .621 | .644 | .580 |
2019 Astros | .760 | .642 | .599 |
2019 Dodgers | .750 | .638 | .574 |
2018 Yankees | .556 | .630 | .568 |
2010 Yankees | .433 | .621 | .593 |
2015 Royals | .469 | .615 | .500 |
2011 Red Sox | .259 | .615 | .574 |
2013 Braves | .481 | .615 | .562 |
2019 Twins | .667 | .615 | .512 |
2010 Rays | .500 | .614 | .562 |

Team | September WPct | April-August WPct | Preseason WPct |
---|---|---|---|
2018 Orioles | .259 | .296 | .475 |
2019 Tigers | .259 | .299 | .420 |
2012 Astros | .500 | .303 | .364 |
2018 Royals | .536 | .321 | .426 |
2013 Astros | .259 | .326 | .352 |
2019 Orioles | .333 | .333 | .364 |
2010 Pirates | .433 | .333 | .444 |
2011 Astros | .360 | .343 | .426 |
2019 Royals | .440 | .350 | .420 |
2019 Marlins | .333 | .356 | .346 |
2013 Marlins | .464 | .366 | .401 |
2016 Twins | .345 | .368 | .475 |
2010 Orioles | .567 | .371 | .457 |
2016 Braves | .643 | .376 | .426 |
2017 Phillies | .552 | .376 | .414 |
For each of the teams above, I compared whether the ZiPS preseason projection or the season-to-date winning percentage was closer to the September results. As you can see, ZiPS did just fine with the teams that were absolutely crushing it or absolutely flailing, and it did a bit worse when we’re talking about the teams most defying their expectations, losing 8–12 against the overachievers and 7–13 against the underachievers.
The reason this happens is that the biggest piece of information that the seasonal stats have “access to” that ZiPS does not isn’t how the players played, but who the players are. If the projected Braves rotation in March suddenly decides after Opening Day to give up the whole baseball thing and travel the country in a giant bus, solving mysteries and setting right what once went wrong, the seasonal performance “knows” that the Braves no longer have those pitchers, but the projections do not. The teams that miss their expectations by the most are far more likely to have had significant differences between projected and actual playing time.
Once you tell ZiPS who actually played, with no update based on the in-season performance of individual players, it turns the tables on the April-to-August results. Preseason ZiPS with known playing time through August comes closer to September performance from 2010 to ’19 by a margin of 166–134, and the RMSE drops from 0.108 to 0.097. The most accurate mix for projecting September over that period would then be 62% ZiPS and 38% April-to-August record.
The advantage of preseason projections even continues into the following season. Taking away, once again, the knowledge of who actually played, ZiPS still did a better job of predicting the next season’s winning percentage than the most recent season’s actual winning percentage did, 157–143, with an RMSE of 0.071 compared to 0.078 for the previous year’s record. To sum this up without any math talk: based on projections and results from 2010 to ’20, you would expect a 2021 preseason team projection to do a better job of projecting 2022 than the team’s actual 2021 performance does.
(Let me take a bit of an intermezzo here to note that I tested ZiPS because I have access to ZiPS; it isn’t a magical unicorn, and I expect that projections from Steamer, THE BAT, and so on will display this behavior as well. I just couldn’t test that.)
So why is this true? What you see here is the fundamental difference between Bayesian inference and frequentist inference. Bayesian inference means that we treat the probabilities themselves as unknowns; the frequentist approach would use the actual results as the underlying probability. The latter, using April-to-August as our expectation moving forward, assumes that the Giants have an underlying 64.6% chance of winning games because we observed them winning 64.6% of games. That approach fails to work historically because one of the first things you learn when projecting anything in baseball is that you never actually get to know what the right answer was.
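To make that distinction concrete, here’s a toy Beta-binomial sketch of the Bayesian view. The prior strength of 250 pseudo-games and the assumed 84-46 record are illustrative choices, not ZiPS’s actual machinery; the point is only that the posterior lands between the preseason estimate and the observed record rather than on top of the observed record.

```python
# Toy Bayesian update: Beta prior centered on the preseason estimate (.467),
# with an illustrative prior strength of 250 pseudo-games, updated on an
# assumed 84-46 (.646) record. A frequentist read would just carry .646 forward.
prior_mean, prior_games = 0.467, 250
alpha0, beta0 = prior_mean * prior_games, (1 - prior_mean) * prior_games

wins, losses = 84, 46
posterior_mean = (alpha0 + wins) / (alpha0 + wins + beta0 + losses)
print(round(posterior_mean, 3))  # ~0.528: nudged well above .467 but far short of .646
```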
To see this, look at the hypergeometric distribution (not binomial, since there are exactly 2,430 wins in a full season) of wins for a league in which every game is a coin flip.
Wins | Probability | 1-in | Cumulative Probability |
---|---|---|---|
58 | 0.01% | 14,173 | 0% |
59 | 0.01% | 7,888 | 0% |
60 | 0.02% | 4,511 | 0% |
61 | 0.04% | 2,651 | 0% |
62 | 0.06% | 1,601 | 0% |
63 | 0.10% | 993 | 0% |
64 | 0.16% | 632 | 0% |
65 | 0.24% | 414 | 1% |
66 | 0.36% | 278 | 1% |
67 | 0.52% | 191 | 2% |
68 | 0.74% | 135 | 2% |
69 | 1.02% | 98 | 3% |
70 | 1.36% | 73 | 5% |
71 | 1.78% | 56 | 6% |
72 | 2.27% | 44 | 9% |
73 | 2.82% | 35 | 12% |
74 | 3.41% | 29 | 15% |
75 | 4.03% | 25 | 19% |
76 | 4.63% | 22 | 24% |
77 | 5.20% | 19 | 29% |
78 | 5.68% | 18 | 35% |
79 | 6.05% | 17 | 41% |
80 | 6.29% | 16 | 47% |
81 | 6.37% | 16 | 53% |
82 | 6.29% | 16 | 60% |
83 | 6.05% | 17 | 66% |
84 | 5.68% | 18 | 71% |
85 | 5.20% | 19 | 76% |
86 | 4.63% | 22 | 81% |
87 | 4.03% | 25 | 85% |
88 | 3.41% | 29 | 89% |
89 | 2.82% | 35 | 91% |
90 | 2.27% | 44 | 94% |
91 | 1.78% | 56 | 95% |
92 | 1.36% | 73 | 97% |
93 | 1.02% | 98 | 98% |
94 | 0.74% | 135 | 99% |
95 | 0.52% | 191 | 99% |
96 | 0.36% | 278 | 99% |
97 | 0.24% | 414 | 100% |
98 | 0.16% | 632 | 100% |
99 | 0.10% | 993 | 100% |
100 | 0.06% | 1,601 | 100% |
101 | 0.04% | 2,651 | 100% |
102 | 0.02% | 4,511 | 100% |
103 | 0.01% | 7,888 | 100% |
104 | 0.01% | 14,173 | 100% |
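As a quick check, that table can be reproduced in a few lines by treating one team’s 162 games as a draw from a league pool of 4,860 team-games containing exactly 2,430 wins. This is a sketch of that calculation, not the code that generated the table.

```python
from scipy.stats import hypergeom

# Coin-flip league: 2,430 wins spread over 4,860 team-games, with one team's
# 162 games drawn from that pool. scipy's parameters: M = population size,
# n = successes in the population, N = number of draws.
dist = hypergeom(M=4860, n=2430, N=162)

print(round(dist.pmf(81) * 100, 2))   # ~6.37% chance of exactly 81 wins
tail = dist.cdf(73) + dist.sf(88)     # P(73 or fewer wins) + P(89 or more wins)
print(round(30 * tail, 1))            # ~7 such teams per 30-team season
```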
In other words, with perfect knowledge that every team is precisely a .500 team in an abstract sense, you’d still expect, on average, about seven teams a season to win 89 or more games or 73 or fewer through nothing but sheer luck. That the season itself is a small sample is one of the frustrating things about making projections; you’re guessing at an answer you never actually get to check. If Joe Schmoe hit .300 after getting a mean projection of .270, there’s no divine intervention that tells you whether he was “actually” a .300 hitter, a lucky .270 hitter, a really lucky .250 hitter, or even an unlucky .330 hitter. An updated projection will move toward .300 from .270 (ignoring all other factors), simply because it’s more likely that a .300 hitter hits .300 than a .250 hitter does, but not all the way. With that in mind, here’s how each team’s cumulative preseason ZiPS win projection compares to its actual win total:
Team | Preseason ZiPS Wins | Actual Wins | Difference |
---|---|---|---|
Cubs | 1310 | 1261 | -49 |
Tigers | 1260 | 1219 | -41 |
Orioles | 1156 | 1125 | -31 |
Mariners | 1195 | 1166 | -29 |
Rockies | 1205 | 1177 | -28 |
Giants | 1258 | 1232 | -26 |
Diamondbacks | 1238 | 1213 | -25 |
Red Sox | 1388 | 1363 | -25 |
Reds | 1205 | 1183 | -22 |
Padres | 1193 | 1173 | -20 |
Nationals | 1266 | 1248 | -18 |
Mets | 1263 | 1246 | -17 |
Phillies | 1279 | 1262 | -17 |
Pirates | 1157 | 1142 | -15 |
Royals | 1123 | 1111 | -12 |
Blue Jays | 1241 | 1237 | -4 |
Indians | 1299 | 1303 | 4 |
Rays | 1261 | 1275 | 14 |
Braves | 1274 | 1289 | 15 |
Twins | 1219 | 1234 | 15 |
Marlins | 1117 | 1141 | 24 |
Brewers | 1238 | 1262 | 24 |
White Sox | 1180 | 1207 | 27 |
Athletics | 1253 | 1282 | 29 |
Cardinals | 1336 | 1367 | 31 |
Dodgers | 1350 | 1382 | 32 |
Rangers | 1231 | 1265 | 34 |
Angels | 1287 | 1323 | 36 |
Astros | 1181 | 1222 | 41 |
Yankees | 1389 | 1432 | 43 |
One of the fundamentally wrong beliefs about projections is that certain teams or types of players regularly outperform or underperform their projections. But bias is generally among the first things you wring out when modeling baseball, and one of the places where you get the most improvement. Instead, the issue for ZiPS (and, I believe, all other prominent projection systems) is accuracy rather than skew. For the 30 teams in baseball from 2005 to ’20, the year-to-year r² of one season’s winning percentage error to the next is 0.000215, meaning that the errors themselves are distributed just about randomly. There are going to be a lot of projections that miss by a mile, because projecting the future is hard, but there’s no reason to believe that projection systems are biased against certain types of teams.
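For reference, that year-to-year check can be expressed in a few lines; the errors DataFrame with team / season / error columns is a hypothetical stand-in for the real projection files.

```python
import pandas as pd

def year_to_year_r2(errors: pd.DataFrame) -> float:
    """errors: one row per team-season with columns team, season, error
    (actual minus projected winning percentage)."""
    df = errors.sort_values(["team", "season"]).copy()
    df["next_error"] = df.groupby("team")["error"].shift(-1)
    df = df.dropna(subset=["next_error"])
    r = df["error"].corr(df["next_error"])  # Pearson correlation of consecutive errors
    return r ** 2                           # ~0.0002 means essentially no carryover
```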
(Yes, from a certain point of view, me developing a projection system that has been too positive about the Rockies is a monumental, glorious self-own.)
ZiPS doesn’t hate the Giants or the Red Sox, and neither does their creator. It just has the advantage of knowing that, across baseball history, projections are better when you don’t weight recent results too heavily. If San Francisco’s season ends with a championship, the players are just as entitled to beer-and-champagne showers as the Dodgers were last season. But that doesn’t mean any of this should have been expected, or that the team’s winning ways will continue at the same clip when we look at future accomplishments rather than current ones.
All models are flawed, but some are useful. Projection systems are handy and remain so, even this late in the season, with no updates included whatsoever. Ignore them at your peril.
Dan Szymborski is a senior writer for FanGraphs and the developer of the ZiPS projection system. He was a writer for ESPN.com from 2010-2018, a regular guest on a number of radio shows and podcasts, and a voting BBWAA member. He also maintains a terrible Twitter account at @DSzymborski.
I often think about the 2006 Cardinals, whom I’ve heard people call the “worst team to win a World Series” since they went 83-78. Preseason, they were +700 to win the World Series (among the favorites), and the over/under on their win total was 94 (second most in baseball).
A great example of projections being better than in-season results.
It’s complicated. There’s an argument that it was the worst season ever for a World Series winner, and that “worst team to win a World Series” is just shorthand for that. Obviously they weren’t the worst team when the playoffs started.
That’s true, and a good point. I guess what I mean is that they may have gone 83-78, but their true talent was much higher than 83 wins.