Yes, March Projections Matter in September

Back in March, all we had to look at for the 2021 season was projection; we're now on the cusp of September, with just over 80% of the games already played. To the consternation of some fans, projections on sites like this one look unreasonably grumpy about teams like the Giants, who have played well above their preseason numbers in 2021. It's undeniable that the projections have been off for some teams; that's something you should expect. But are these computers actually wrong when they predict middling play from these high-achieving surprises going forward?

The Giants, in particular, have already outperformed their original preseason projections by somewhere in the neighborhood of 10 wins. With a month to go in the season, it wouldn't be a total surprise if San Francisco ended up with 25 more wins than predicted. Yet our assembled projections list the Giants with an expected roster strength of just .512 over the rest of the season. Using ZiPS alone is sunnier but still well off the team's seasonal pace through the end of August.

ZiPS Projected Standings – NL West
Team W L GB Pct Div% WC% Playoff% WS Win%
San Francisco Giants 100 62 — .617 52.8% 47.2% 100.0% 13.6%
Los Angeles Dodgers 100 62 — .617 47.2% 52.8% 100.0% 12.8%
San Diego Padres 84 78 16 .519 0.0% 12.6% 12.6% 0.3%
Colorado Rockies 73 89 27 .451 0.0% 0.0% 0.0% 0.0%
Arizona Diamondbacks 73 89 27 .451 0.0% 0.0% 0.0% 0.0%

Yes, ZiPS now projects the Giants as the slight favorite to win the NL West, but that doesn't take a lot of mathematical courage given that they're already in first place with most of the season done. With a little math, you can see that the computer projects San Francisco to go just 16–16 the rest of the way; to be more precise, ZiPS thinks the Giants have a roster strength that indicates a .521 winning percentage, with their average remaining opponent at a .520 roster strength, resulting in a mean projection right around .500. That's up quite a bit from the preseason, when ZiPS thought the Giants were a .467 team with a .502 schedule, for a 75–87 preseason estimate. But it still feels a bit disrespectful to the team that has led the league in wins for most of the season.
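ZiPS's internal machinery isn't shown here, but the step from two roster strengths to a game-level expectation can be approximated with Bill James's log5 formula; this is a sketch under that assumption, not the actual ZiPS code:

```python
def log5(a, b):
    """Bill James's log5: expected winning percentage of a team with
    underlying strength `a` against an opponent of strength `b`."""
    return (a - a * b) / (a + b - 2 * a * b)

# A .521 roster facing an average .520 opponent is essentially a coin
# flip, which is why ZiPS pencils in roughly 16-16 the rest of the way
rest_of_season = log5(0.521, 0.520)    # ~.501

# The same arithmetic in the preseason: a .467 team against a .502
# schedule works out to about 75 wins over 162 games
preseason_wins = 162 * log5(0.467, 0.502)   # ~75
```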

So why are projections so grumpy? In a nutshell, it’s because they’re assembled based on empirical looks at what baseball’s history tells us, not (hopefully) the aesthetic valuations of the developer. And in the baseball history we have, seasonal projections do not become this all-powerful crystal ball when it comes to projecting how the rest of the season goes.

ZiPS has generated win projections for teams since 2005; not including 2020, that gives us 15 years of projected standings versus reality, covering 450 team-seasons. To demonstrate the point, I went back and collected the preseason projection for every team, each team's actual record through the end of August of the season in question, and each team's September record. One-month records are going to be volatile, but if preseason projections really have little predictive value compared to actual performance, the April-to-August record should still crush them at predicting the future.

In reality, it doesn't. Of the 450 teams, seasonal performance was the better predictor of September performance by a margin of just 229 to 221. Using a more robust method than a simple head-to-head count, seasonal performance had an RMSE (root mean square error) of 0.107, compared to 0.112 for the original projections. A model of September winning percentage built from just these two numbers would consist of a mix of 56% actual record and 44% preseason projection. Using only teams from 2010 to '19 (2010 is when I started using a probabilistic playing time estimate, which I feel is superior), ZiPS beats the actual record in predicting September, 153 to 147, with the RMSE gap shrinking to 0.106 versus 0.108.
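The mechanics of these comparisons are simple to sketch. The numbers below are simulated stand-ins (the real study used the actual 450 team-seasons, which I don't have here), but the three measurements in question, RMSE, the head-to-head count, and the best blend weight, are computed the same way:

```python
import random
from math import sqrt

random.seed(1)

# Simulated stand-in for the 450 team-seasons: each team has an
# unobserved true talent, and the three numbers we observe are noisy
# readings of it, with noise scaled roughly to the games involved.
n = 450
talent    = [random.gauss(0.500, 0.050) for _ in range(n)]
preseason = [t + random.gauss(0, 0.040) for t in talent]  # projection error
apr_aug   = [t + random.gauss(0, 0.045) for t in talent]  # ~130 games of luck
september = [t + random.gauss(0, 0.095) for t in talent]  # ~28 games of luck

def rmse(preds, actuals):
    return sqrt(sum((p - a) ** 2 for p, a in zip(preds, actuals)) / len(preds))

# Head-to-head: for how many teams is the season-to-date record the
# closer predictor of September?
season_wins = sum(abs(s - t) < abs(p - t)
                  for p, s, t in zip(preseason, apr_aug, september))

# Grid-search the weight on the season-to-date record that minimizes
# RMSE against September (the article's 56/44 and 62/38 mixes are the
# real-data versions of this number)
best_w = min((w / 100 for w in range(101)),
             key=lambda w: rmse([w * s + (1 - w) * p
                                 for p, s in zip(preseason, apr_aug)],
                                september))
```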

Nor did the preseason projections have trouble with the outliers, the teams with the best and worst winning percentages through the end of August:

ZiPS vs. Actual, 2010–19 (Best Teams Through August)
Team September WPct April-August WPct Preseason WPct
2017 Dodgers .433 .689 .574
2018 Red Sox .577 .684 .556
2011 Phillies .533 .652 .586
2019 Yankees .560 .650 .605
2015 Cardinals .484 .649 .531
2016 Cubs .621 .644 .580
2019 Astros .760 .642 .599
2019 Dodgers .750 .638 .574
2018 Yankees .556 .630 .568
2010 Yankees .433 .621 .593
2015 Royals .469 .615 .500
2011 Red Sox .259 .615 .574
2013 Braves .481 .615 .562
2019 Twins .667 .615 .512
2010 Rays .500 .614 .562

ZiPS vs. Actual, 2010–19 (Worst Teams Through August)
Team September WPct April-August WPct Preseason WPct
2018 Orioles .259 .296 .475
2019 Tigers .259 .299 .420
2012 Astros .500 .303 .364
2018 Royals .536 .321 .426
2013 Astros .259 .326 .352
2019 Orioles .333 .333 .364
2010 Pirates .433 .333 .444
2011 Astros .360 .343 .426
2019 Royals .440 .350 .420
2019 Marlins .333 .356 .346
2013 Marlins .464 .366 .401
2016 Twins .345 .368 .475
2010 Orioles .567 .371 .457
2016 Braves .643 .376 .426
2017 Phillies .552 .376 .414

I highlighted whether the ZiPS preseason projection or the season-to-date winning percentage was closer to the September results. As you can see, ZiPS did just fine with the teams that were absolutely crushing it or absolutely flailing, and it fares only a bit worse with the teams most defying their expectations, losing 8–12 against the overachievers and 7–13 against the underachievers.

The reason this happens is that the biggest piece of information the seasonal stats have "access to" that ZiPS does not isn't how the players played, but who the players are. If the projected Braves rotation from March suddenly decided after Opening Day to give up the whole baseball thing and travel the country in a giant bus, solving mysteries and setting right what once went wrong, the seasonal performance "knows" that the Braves no longer had those pitchers, but the projections do not. The teams that miss their expectations by the most are far more likely to have had significant differences between projected and actual playing time.

Once you tell ZiPS who actually played, still with no updates based on the in-season performance of individual players, it turns the tables on the April-August results. Preseason ZiPS with known playing time through August comes closer to September performance from 2010 to '19 by a margin of 166–134, and the RMSE drops from 0.108 to 0.097. The most accurate mix for projecting September over that period would then be 62% ZiPS and 38% April-to-August record.

The advantage of preseason projections even continues into the following season. Again setting aside the knowledge of who actually played, ZiPS still did a better job than the most recent season's winning percentage at predicting teams' next-season winning percentage, 157–143, with an RMSE of 0.071 compared to 0.078 for the previous year's record. To sum this up without any math talk: Based on projections and results from 2010 to '20, you would expect a 2021 preseason team projection to do a better job of projecting 2022 than the team's actual 2021 performance does.

(Let me take a bit of an intermezzo here to note that I tested ZiPS because I have access to ZiPS; it isn't a magical unicorn, and I expect that projections from Steamer, THE BAT, etc. will display the same behavior. I just couldn't test that.)

So why is this true? What you see here is the fundamental difference between Bayesian inference and frequentist inference. Bayesian inference means that we treat the probabilities themselves as unknowns; the frequentist approach would use the actual results as the underlying probability. The latter, using April-to-August as our expectation moving forward, assumes that the Giants have an underlying 64.6% chance of winning games because we observed them winning 64.6% of games. That approach fails to work historically because one of the first things you learn when projecting anything in baseball is that you never actually get to know what the right answer was.
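A toy version of the Bayesian approach makes the contrast concrete. This is not ZiPS's actual machinery, and the Beta(50, 50) prior here is a hypothetical choice, but it shows why the updated estimate lands between the prior and the observed record rather than at the record itself:

```python
# Conjugate beta-binomial updating of a team's true-talent winning
# percentage. Beta(50, 50) is a hypothetical prior: mean .500, with
# uncertainty worth roughly 100 "pseudo-games."
prior_a, prior_b = 50.0, 50.0

wins, losses = 84, 46                 # a .646 record through 130 games

# Posterior after observing the games: just add wins and losses to
# the prior's pseudo-counts
post_a, post_b = prior_a + wins, prior_b + losses
posterior_mean = post_a / (post_a + post_b)

observed = wins / (wins + losses)     # the frequentist estimate: .646
# posterior_mean comes out around .583: pulled toward the hot start,
# but nowhere near all the way there
```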

To see this, look at the hypergeometric distribution of wins for a league in which every game is a coin flip (hypergeometric rather than binomial because a full season contains exactly 2,430 wins league-wide, so one team's win total isn't quite independent of everyone else's).

Hypergeometric Win Distribution, 162 Coin Flips
Wins Probability 1-in Cumulative Probability
58 0.01% 14,173 0%
59 0.01% 7,888 0%
60 0.02% 4,511 0%
61 0.04% 2,651 0%
62 0.06% 1,601 0%
63 0.10% 993 0%
64 0.16% 632 0%
65 0.24% 414 1%
66 0.36% 278 1%
67 0.52% 191 2%
68 0.74% 135 2%
69 1.02% 98 3%
70 1.36% 73 5%
71 1.78% 56 6%
72 2.27% 44 9%
73 2.82% 35 12%
74 3.41% 29 15%
75 4.03% 25 19%
76 4.63% 22 24%
77 5.20% 19 29%
78 5.68% 18 35%
79 6.05% 17 41%
80 6.29% 16 47%
81 6.37% 16 53%
82 6.29% 16 60%
83 6.05% 17 66%
84 5.68% 18 71%
85 5.20% 19 76%
86 4.63% 22 81%
87 4.03% 25 85%
88 3.41% 29 89%
89 2.82% 35 91%
90 2.27% 44 94%
91 1.78% 56 95%
92 1.36% 73 97%
93 1.02% 98 98%
94 0.74% 135 99%
95 0.52% 191 99%
96 0.36% 278 99%
97 0.24% 414 100%
98 0.16% 632 100%
99 0.10% 993 100%
100 0.06% 1,601 100%
101 0.04% 2,651 100%
102 0.02% 4,511 100%
103 0.01% 7,888 100%
104 0.01% 14,173 100%

In other words, even with perfect knowledge that every team is precisely a .500 team in an abstract sense, you'd still expect, on average, about seven teams a season to win 89 or more games, or 73 or fewer, through nothing but sheer luck. That the season itself is a small sample is one of the frustrating things about making projections; you're guessing at an answer you never get to check. If Joe Schmoe hit .300 after getting a mean projection of .270, there's no divine intervention that tells you whether he was "actually" a .300 hitter, a lucky .270 hitter, a really lucky .250 hitter, or even an unlucky .330 hitter. An updated projection will move toward .300 from .270 (ignoring all other factors), simply because a .300 hitter is more likely to hit .300 than a .250 hitter is, but it won't move all the way.
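The distribution in the table above can be reproduced directly from the hypergeometric pmf; here is a minimal sketch using only the standard library (log-gamma keeps the enormous binomial coefficients from overflowing):

```python
from math import exp, lgamma

def log_comb(n, k):
    """log of n-choose-k via log-gamma, avoiding huge integers."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def pmf(k, games=162, league_wins=2430):
    """P(a coin-flip team wins k of its games), where wins are drawn
    without replacement from a league pool of exactly 2,430 wins
    among 4,860 decisions."""
    decisions = 2 * league_wins
    return exp(log_comb(league_wins, k)
               + log_comb(decisions - league_wins, games - k)
               - log_comb(decisions, games))

p_81 = pmf(81)   # the modal outcome, ~6.4% (6.37% in the table)

# Expected number of teams per season landing in the lucky/unlucky
# tails: 89+ wins or 73 or fewer
tail = sum(pmf(k) for k in range(89, 163)) + \
       sum(pmf(k) for k in range(0, 74))
extreme_teams = 30 * tail   # ~7 teams a season
```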

ZiPS Projected Wins vs. Actual, 2005–20
Team Preseason ZiPS Wins Actual Wins Difference
Cubs 1310 1261 -49
Tigers 1260 1219 -41
Orioles 1156 1125 -31
Mariners 1195 1166 -29
Rockies 1205 1177 -28
Giants 1258 1232 -26
Diamondbacks 1238 1213 -25
Red Sox 1388 1363 -25
Reds 1205 1183 -22
Padres 1193 1173 -20
Nationals 1266 1248 -18
Mets 1263 1246 -17
Phillies 1279 1262 -17
Pirates 1157 1142 -15
Royals 1123 1111 -12
Blue Jays 1241 1237 -4
Indians 1299 1303 4
Rays 1261 1275 14
Braves 1274 1289 15
Twins 1219 1234 15
Marlins 1117 1141 24
Brewers 1238 1262 24
White Sox 1180 1207 27
Athletics 1253 1282 29
Cardinals 1336 1367 31
Dodgers 1350 1382 32
Rangers 1231 1265 34
Angels 1287 1323 36
Astros 1181 1222 41
Yankees 1389 1432 43

One of the fundamentally wrong beliefs about projections is that certain teams or types of players regularly outperform or underperform their projections. But bias is generally one of the first things to wring out when modeling baseball, and one of the places where you get the most improvement. Instead, the issue for ZiPS, and, I believe, for every other prominent projection system, is accuracy rather than skew. For the 30 teams in baseball from 2005 to '20, the year-to-year r^2 of one season's winning-percentage error to the next is 0.000215, meaning that the errors themselves are distributed essentially at random. There are going to be a lot of projections that miss by a mile, because projecting the future is hard, but there's no reason to believe that projection systems are biased against certain types of teams.

(Yes, from a certain point of view, me developing a projection system that has been too positive about the Rockies is a monumental, glorious self-own.)

ZiPS doesn't hate the Giants or the Red Sox, and neither does their creator. It just has the advantage of knowing that, in baseball history, projections are better when they don't weight recent results too heavily. If San Francisco's season ends with a championship, the players are just as entitled to their beer-and-champagne showers as the Dodgers were last season. But that doesn't mean any of this should have been expected, or that the team's winning ways will continue at the same clip once we turn from current accomplishments to future ones.

All models are flawed, but some are useful. Projection systems are handy and remain so, even this late in the season, with no updates included whatsoever. Ignore them at your peril.





Dan Szymborski is a senior writer for FanGraphs and the developer of the ZiPS projection system. He was a writer for ESPN.com from 2010-2018, a regular guest on a number of radio shows and podcasts, and a voting BBWAA member. He also maintains a terrible Twitter account at @DSzymborski.
