Yes, March Projections Matter in September

Back in March, all we had to look at for the 2021 season was projection; we're now on the cusp of September, with just over 80% of the games already played. To the consternation of some fans, projections on sites like this one look unreasonably grumpy about teams like the Giants, who have played well above their preseason numbers in 2021. It's undeniable that the projections have been off for some teams; that's something you should expect. But are these computers actually wrong when they predict middling play from these high-achieving surprises going forward?

The Giants, in particular, have already outperformed their original preseason projections by somewhere in the neighborhood of 10 wins. With a month to go in the season, it wouldn't be a total surprise if San Francisco ended up with 25 more wins than predicted. Yet our assembled projections list the Giants with an expected roster strength of just .512 over the rest of the season. Using ZiPS alone is sunnier but still well off the team's seasonal pace through the end of August.

ZiPS Projected Standings – NL West
Team W L GB Pct Div% WC% Playoff% WS Win%
San Francisco Giants 100 62 — .617 52.8% 47.2% 100.0% 13.6%
Los Angeles Dodgers 100 62 — .617 47.2% 52.8% 100.0% 12.8%
San Diego Padres 84 78 16 .519 0.0% 12.6% 12.6% 0.3%
Colorado Rockies 73 89 27 .451 0.0% 0.0% 0.0% 0.0%
Arizona Diamondbacks 73 89 27 .451 0.0% 0.0% 0.0% 0.0%

Yes, ZiPS now projects the Giants as the slight favorite to win the NL West, but that doesn't take a lot of mathematical courage given that they're already in first place with most of the season done. With a little math, you can see that the computer projects San Francisco to go just 16–16 the rest of the way; to be more precise, ZiPS thinks the Giants have a roster strength that indicates a .521 winning percentage, with their average remaining opponent at a .520 roster strength, resulting in a mean projection right around .500. That's up quite a bit from the preseason, when ZiPS thought the Giants were a .467 team with a .502 schedule, for a 75–87 preseason estimate. But it still feels a bit disrespectful to the team that has led the league in wins for most of the season.
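ZiPS's internal machinery isn't shown here, but the step from two roster strengths to a game-level expectation can be approximated with Bill James's log5 formula; this is a sketch under that assumption, not the actual ZiPS code:

```python
def log5(a, b):
    """Bill James's log5: expected winning percentage of a team with
    underlying strength `a` against an opponent of strength `b`."""
    return (a - a * b) / (a + b - 2 * a * b)

# A .521 roster facing an average .520 opponent is essentially a coin
# flip, which is why ZiPS pencils in roughly 16-16 the rest of the way
rest_of_season = log5(0.521, 0.520)    # ~.501

# The same arithmetic in the preseason: a .467 team against a .502
# schedule works out to about 75 wins over 162 games
preseason_wins = 162 * log5(0.467, 0.502)   # ~75
```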

So why are projections so grumpy? In a nutshell, it’s because they’re assembled based on empirical looks at what baseball’s history tells us, not (hopefully) the aesthetic valuations of the developer. And in the baseball history we have, seasonal projections do not become this all-powerful crystal ball when it comes to projecting how the rest of the season goes.

ZiPS has generated win projections for teams since 2005; not including 2020, that gives us 15 years of projected standings versus reality, covering 450 team-seasons. To demonstrate the point, I went back and collected the preseason projection for every team, each team's actual record through the end of August of the season in question, and each team's September record. One-month records are going to be volatile, but if preseason projections really have little predictive value compared to actual performance, the April-to-August record should still crush them at predicting the future.

In reality, it doesn't. Of the 450 teams, seasonal performance was the better predictor of September performance by a margin of just 229 to 221. Using a more robust method than a simple head-to-head count, seasonal performance had an RMSE (root mean square error) of 0.107, compared to 0.112 for the original projections. A model of September winning percentage built from just these two numbers would consist of a mix of 56% actual record and 44% preseason projection. Using only teams from 2010 to '19 (2010 is when I started using a probabilistic playing time estimate, which I feel is superior), ZiPS beats the actual record in predicting September, 153 to 147, with the RMSE gap shrinking to 0.106 versus 0.108.
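The mechanics of these comparisons are simple to sketch. The numbers below are simulated stand-ins (the real study used the actual 450 team-seasons, which I don't have here), but the three measurements in question, RMSE, the head-to-head count, and the best blend weight, are computed the same way:

```python
import random
from math import sqrt

random.seed(1)

# Simulated stand-in for the 450 team-seasons: each team has an
# unobserved true talent, and the three numbers we observe are noisy
# readings of it, with noise scaled roughly to the games involved.
n = 450
talent    = [random.gauss(0.500, 0.050) for _ in range(n)]
preseason = [t + random.gauss(0, 0.040) for t in talent]  # projection error
apr_aug   = [t + random.gauss(0, 0.045) for t in talent]  # ~130 games of luck
september = [t + random.gauss(0, 0.095) for t in talent]  # ~28 games of luck

def rmse(preds, actuals):
    return sqrt(sum((p - a) ** 2 for p, a in zip(preds, actuals)) / len(preds))

# Head-to-head: for how many teams is the season-to-date record the
# closer predictor of September?
season_wins = sum(abs(s - t) < abs(p - t)
                  for p, s, t in zip(preseason, apr_aug, september))

# Grid-search the weight on the season-to-date record that minimizes
# RMSE against September (the article's 56/44 and 62/38 mixes are the
# real-data versions of this number)
best_w = min((w / 100 for w in range(101)),
             key=lambda w: rmse([w * s + (1 - w) * p
                                 for p, s in zip(preseason, apr_aug)],
                                september))
```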

Nor did the preseason projections have trouble with the outliers, the teams with the best and worst winning percentages through the end of August:

ZiPS vs. Actual, 2010–19 (Best Teams Through August)
Team September WPct April-August WPct Preseason WPct
2017 Dodgers .433 .689 .574
2018 Red Sox .577 .684 .556
2011 Phillies .533 .652 .586
2019 Yankees .560 .650 .605
2015 Cardinals .484 .649 .531
2016 Cubs .621 .644 .580
2019 Astros .760 .642 .599
2019 Dodgers .750 .638 .574
2018 Yankees .556 .630 .568
2010 Yankees .433 .621 .593
2015 Royals .469 .615 .500
2011 Red Sox .259 .615 .574
2013 Braves .481 .615 .562
2019 Twins .667 .615 .512
2010 Rays .500 .614 .562

ZiPS vs. Actual, 2010–19 (Worst Teams Through August)
Team September WPct April-August WPct Preseason WPct
2018 Orioles .259 .296 .475
2019 Tigers .259 .299 .420
2012 Astros .500 .303 .364
2018 Royals .536 .321 .426
2013 Astros .259 .326 .352
2019 Orioles .333 .333 .364
2010 Pirates .433 .333 .444
2011 Astros .360 .343 .426
2019 Royals .440 .350 .420
2019 Marlins .333 .356 .346
2013 Marlins .464 .366 .401
2016 Twins .345 .368 .475
2010 Orioles .567 .371 .457
2016 Braves .643 .376 .426
2017 Phillies .552 .376 .414

I highlighted whether the ZiPS preseason projection or the season-to-date winning percentage was closer to the September results. As you can see, ZiPS did just fine with the teams that were absolutely crushing it or absolutely flailing, and it fares only a bit worse with the teams most defying their expectations, losing 8–12 against the overachievers and 7–13 against the underachievers.

The reason this happens is that the biggest piece of information the seasonal stats have "access to" that ZiPS does not isn't how the players played, but who the players are. If the projected Braves rotation from March suddenly decided after Opening Day to give up the whole baseball thing and travel the country in a giant bus, solving mysteries and setting right what once went wrong, the seasonal performance "knows" that the Braves no longer had those pitchers, but the projections do not. The teams that miss their expectations by the most are far more likely to have had significant differences between projected and actual playing time.

Once you tell ZiPS who actually played, still with no updates based on the in-season performance of individual players, it turns the tables on the April-August results. Preseason ZiPS with known playing time through August comes closer to September performance from 2010 to '19 by a margin of 166–134, and the RMSE drops from 0.108 to 0.097. The most accurate mix for projecting September over that period would then be 62% ZiPS and 38% April-to-August record.

The advantage of preseason projections even continues into the following season. Again setting aside the knowledge of who actually played, ZiPS still did a better job than the most recent season's winning percentage at predicting teams' next-season winning percentage, 157–143, with an RMSE of 0.071 compared to 0.078 for the previous year's record. To sum this up without any math talk: Based on projections and results from 2010 to '20, you would expect a 2021 preseason team projection to do a better job of projecting 2022 than the team's actual 2021 performance does.

(Let me take a bit of an intermezzo here to note that I tested ZiPS because I have access to ZiPS; it isn't a magical unicorn, and I expect that projections from Steamer, THE BAT, etc. will display the same behavior. I just couldn't test that.)

So why is this true? What you see here is the fundamental difference between Bayesian inference and frequentist inference. Bayesian inference means that we treat the probabilities themselves as unknowns; the frequentist approach would use the actual results as the underlying probability. The latter, using April-to-August as our expectation moving forward, assumes that the Giants have an underlying 64.6% chance of winning games because we observed them winning 64.6% of games. That approach fails to work historically because one of the first things you learn when projecting anything in baseball is that you never actually get to know what the right answer was.
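A toy version of the Bayesian approach makes the contrast concrete. This is not ZiPS's actual machinery, and the Beta(50, 50) prior here is a hypothetical choice, but it shows why the updated estimate lands between the prior and the observed record rather than at the record itself:

```python
# Conjugate beta-binomial updating of a team's true-talent winning
# percentage. Beta(50, 50) is a hypothetical prior: mean .500, with
# uncertainty worth roughly 100 "pseudo-games."
prior_a, prior_b = 50.0, 50.0

wins, losses = 84, 46                 # a .646 record through 130 games

# Posterior after observing the games: just add wins and losses to
# the prior's pseudo-counts
post_a, post_b = prior_a + wins, prior_b + losses
posterior_mean = post_a / (post_a + post_b)

observed = wins / (wins + losses)     # the frequentist estimate: .646
# posterior_mean comes out around .583: pulled toward the hot start,
# but nowhere near all the way there
```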

To see this, look at the hypergeometric distribution of wins for a league in which every game is a coin flip (hypergeometric rather than binomial because a full season contains exactly 2,430 wins league-wide, so one team's win total isn't quite independent of everyone else's).

Hypergeometric Win Distribution, 162 Coin Flips
Wins Probability 1-in Cumulative Probability
58 0.01% 14,173 0%
59 0.01% 7,888 0%
60 0.02% 4,511 0%
61 0.04% 2,651 0%
62 0.06% 1,601 0%
63 0.10% 993 0%
64 0.16% 632 0%
65 0.24% 414 1%
66 0.36% 278 1%
67 0.52% 191 2%
68 0.74% 135 2%
69 1.02% 98 3%
70 1.36% 73 5%
71 1.78% 56 6%
72 2.27% 44 9%
73 2.82% 35 12%
74 3.41% 29 15%
75 4.03% 25 19%
76 4.63% 22 24%
77 5.20% 19 29%
78 5.68% 18 35%
79 6.05% 17 41%
80 6.29% 16 47%
81 6.37% 16 53%
82 6.29% 16 60%
83 6.05% 17 66%
84 5.68% 18 71%
85 5.20% 19 76%
86 4.63% 22 81%
87 4.03% 25 85%
88 3.41% 29 89%
89 2.82% 35 91%
90 2.27% 44 94%
91 1.78% 56 95%
92 1.36% 73 97%
93 1.02% 98 98%
94 0.74% 135 99%
95 0.52% 191 99%
96 0.36% 278 99%
97 0.24% 414 100%
98 0.16% 632 100%
99 0.10% 993 100%
100 0.06% 1,601 100%
101 0.04% 2,651 100%
102 0.02% 4,511 100%
103 0.01% 7,888 100%
104 0.01% 14,173 100%

In other words, even with perfect knowledge that every team is precisely a .500 team in an abstract sense, you'd still expect, on average, about seven teams a season to win 89 or more games, or 73 or fewer, through nothing but sheer luck. That the season itself is a small sample is one of the frustrating things about making projections; you're guessing at an answer you never get to check. If Joe Schmoe hit .300 after getting a mean projection of .270, there's no divine intervention that tells you whether he was "actually" a .300 hitter, a lucky .270 hitter, a really lucky .250 hitter, or even an unlucky .330 hitter. An updated projection will move toward .300 from .270 (ignoring all other factors), simply because a .300 hitter is more likely to hit .300 than a .250 hitter is, but it won't move all the way.
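The distribution in the table above can be reproduced directly from the hypergeometric pmf; here is a minimal sketch using only the standard library (log-gamma keeps the enormous binomial coefficients from overflowing):

```python
from math import exp, lgamma

def log_comb(n, k):
    """log of n-choose-k via log-gamma, avoiding huge integers."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def pmf(k, games=162, league_wins=2430):
    """P(a coin-flip team wins k of its games), where wins are drawn
    without replacement from a league pool of exactly 2,430 wins
    among 4,860 decisions."""
    decisions = 2 * league_wins
    return exp(log_comb(league_wins, k)
               + log_comb(decisions - league_wins, games - k)
               - log_comb(decisions, games))

p_81 = pmf(81)   # the modal outcome, ~6.4% (6.37% in the table)

# Expected number of teams per season landing in the lucky/unlucky
# tails: 89+ wins or 73 or fewer
tail = sum(pmf(k) for k in range(89, 163)) + \
       sum(pmf(k) for k in range(0, 74))
extreme_teams = 30 * tail   # ~7 teams a season
```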

ZiPS Projected Wins vs. Actual, 2005–20
Team Preseason ZiPS Wins Actual Wins Difference
Cubs 1310 1261 -49
Tigers 1260 1219 -41
Orioles 1156 1125 -31
Mariners 1195 1166 -29
Rockies 1205 1177 -28
Giants 1258 1232 -26
Diamondbacks 1238 1213 -25
Red Sox 1388 1363 -25
Reds 1205 1183 -22
Padres 1193 1173 -20
Nationals 1266 1248 -18
Mets 1263 1246 -17
Phillies 1279 1262 -17
Pirates 1157 1142 -15
Royals 1123 1111 -12
Blue Jays 1241 1237 -4
Indians 1299 1303 4
Rays 1261 1275 14
Braves 1274 1289 15
Twins 1219 1234 15
Marlins 1117 1141 24
Brewers 1238 1262 24
White Sox 1180 1207 27
Athletics 1253 1282 29
Cardinals 1336 1367 31
Dodgers 1350 1382 32
Rangers 1231 1265 34
Angels 1287 1323 36
Astros 1181 1222 41
Yankees 1389 1432 43

One of the fundamentally wrong beliefs about projections is that certain teams or types of players regularly outperform or underperform their projections. But bias is generally one of the first things to wring out when modeling baseball, and one of the places where you get the most improvement. Instead, the issue for ZiPS, and, I believe, for every other prominent projection system, is accuracy rather than skew. For the 30 teams in baseball from 2005 to '20, the year-to-year r^2 of one season's winning-percentage error to the next is 0.000215, meaning that the errors themselves are distributed essentially at random. There are going to be a lot of projections that miss by a mile, because projecting the future is hard, but there's no reason to believe that projection systems are biased against certain types of teams.

(Yes, from a certain point of view, me developing a projection system that has been too positive about the Rockies is a monumental, glorious self-own.)

ZiPS doesn't hate the Giants or the Red Sox, and neither does their creator. It just has the advantage of knowing that, in baseball history, projections are better when they don't weight recent results too heavily. If San Francisco's season ends with a championship, the players are just as entitled to their beer-and-champagne showers as the Dodgers were last season. But that doesn't mean any of this should have been expected, or that the team's winning ways will continue at the same clip once we turn from current accomplishments to future ones.

All models are flawed, but some are useful. Projection systems are handy and remain so, even this late in the season, with no updates included whatsoever. Ignore them at your peril.





Dan Szymborski is a senior writer for FanGraphs and the developer of the ZiPS projection system. He was a writer for ESPN.com from 2010-2018, a regular guest on a number of radio shows and podcasts, and a voting BBWAA member. He also maintains a terrible Twitter account at @DSzymborski.
