One of the things you have to get used to when you work with projections is being wrong. Like, All. Of. The. Time. While I’d like to believe that the projections are accurate and it’s just real life that mucked things up, that isn’t quite how they work. There are always events you didn’t see coming, assumptions you made erroneously, and just plain old irreducible error, all of which are going to thwart you.
On a basic level, you’re supposed to be wrong. Imagine a world in which you knew, for a fact, that every team was a coin flip to win every game. With this perfect knowledge, you’d still expect nearly a quarter of the league to win either 73 or fewer games, or 89 or more, through nothing but luck. For the math-inclined, this is a hypergeometric distribution, not a binomial one; the coin flips are not independent, because the win totals still add up to 2,430 and one team’s win is invariably another team’s loss. Here’s a quick table for some of the win totals, showing the probability of a team winning exactly X games and the share of teams you’d expect to win up to X games:
| Wins | Probability | 1-in-X Chance of Occurring | Cumulative |
| --- | --- | --- | --- |
As an example, you’d expect 3.4% of those coin flip teams to win exactly 74 games, with 15% of all teams winning up to 74 games.
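The thought experiment above can be checked with a quick simulation. The pairing scheme below (shuffle the 30 teams each "round" and flip a fair coin for every pairing) is my own simplification, not a real MLB schedule, but it keeps every team at 162 games and the league at exactly 2,430 total wins:

```python
import random
from math import comb

# Each team's marginal record is Binomial(162, 0.5), so the table's
# per-team numbers can be checked exactly:
p74 = comb(162, 74) / 2**162                           # P(exactly 74 wins)
cum74 = sum(comb(162, k) for k in range(75)) / 2**162  # P(74 wins or fewer)
print(f"exactly 74: {p74:.1%}, up to 74: {cum74:.1%}")

def simulate_coinflip_season(n_teams=30, n_rounds=162, rng=None):
    """One season where every game is a fair coin flip. Each round pairs
    all 30 teams at random, so wins always sum to 30 * 162 / 2 = 2,430."""
    rng = rng or random.Random()
    wins = [0] * n_teams
    teams = list(range(n_teams))
    for _ in range(n_rounds):
        rng.shuffle(teams)
        for i in range(0, n_teams, 2):
            winner = teams[i] if rng.random() < 0.5 else teams[i + 1]
            wins[winner] += 1
    return wins

rng = random.Random(42)
seasons = [simulate_coinflip_season(rng=rng) for _ in range(1000)]
extreme = sum(1 for s in seasons for w in s if w <= 73 or w >= 89)
frac = extreme / (1000 * 30)
print(f"teams outside 74-88 wins: {frac:.1%}")  # close to a quarter
```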
But we don’t have anywhere near perfect knowledge about how good a team will be. We’re not even in the same zip code as “near perfect”; we just hope to be on the right continent. As a result, our error bars are going to be significantly larger than even the rather erroneous results you still get with omniscient projections.
One of the questions I get a lot is whether ZiPS overrates or underrates a particular type of team. For example, does ZiPS miss more on younger teams? Older teams? Teams with more speed? Teams with good bullpens? Unfortunately, it’s nowhere near that simple. Attributes like that are calibration errors and, compared to other sources of error, they’re relatively easy to identify and iron out. I’d be giddy if there were that much delicious, low-hanging fruit to pluck.
That’s not to say there aren’t teams that ZiPS has missed on more often than others. Let’s start with a table of yearly team win misses across ZiPS’ history:
ZiPS has missed on teams, but that’s not the same thing as those misses being predictive. ZiPS has overrated the Cubs by 3.4 wins a year on average, but that’s not as exciting as it sounds: given the number of seasons for which ZiPS has made projections, you’d expect a miss of that size for about 2.3% of teams. ZiPS has overrated nine teams by at least a win per season (2005-2019), but you’d expect to miss by at least a win per season on roughly 30% of the league, which in a 30-team league works out to … nine teams.
Put a different way, if you simply look at the correlation between an organization’s projection error one year and the next, the year-to-year correlation is -0.004. In other words, year-to-year errors for the same organization are essentially random.
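That year-to-year check is just a Pearson correlation across consecutive-season error pairs. A minimal sketch, using random stand-in numbers rather than the real ZiPS errors:

```python
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Stand-in data: 30 organizations, 15 seasons of projection errors drawn
# at random (NOT the real ZiPS misses, just noise for illustration).
rng = random.Random(0)
errors = {org: [rng.gauss(0, 8) for _ in range(15)] for org in range(30)}

# Pair each organization's error in year t with its error in year t+1
pairs = [(e[t], e[t + 1]) for e in errors.values() for t in range(14)]
r = pearson_r([a for a, _ in pairs], [b for _, b in pairs])
print(f"year-to-year error correlation: {r:.3f}")  # hovers near zero
```

If errors were predictive for certain organizations, this correlation would come out meaningfully positive instead of bouncing around zero.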
Errors in a team’s projection come primarily from two sources. The first is missing on the projections of the individual players. When your projections see a Marcus Semien as a three-win player instead of a seven-win player, you’ve already “denied” the A’s four wins. Get enough of those big misses in one direction or the other, and you’re going to miss badly on the team.
The other major source of error comes from players’ actual playing time. We know, of course, that Mike Trout would get a lot more playing time than Michael Hermosillo if teams played an infinite number of seasons, but to the chagrin of anyone who tries to model baseball, they only actually play one season per year. (And sometimes less…)
Injuries and trades can eviscerate playing time projections very quickly. Imagine if the Red Sox, instead of trading Mookie Betts during the winter, decided to send him to Los Angeles after Opening Day. Suddenly, you’ve carved out a five-win swing for two franchises relative to your original assumptions. A 200-inning workhorse can become a zero-inning injured-list resident after nothing but a twinge in the forearm and an awkward discussion with James Andrews or Neal ElAttrache.
So, how much did ZiPS miss on teams in 2019? Let’s start with the final preseason projections and the actual win totals. There was a game unplayed between the White Sox and Tigers, so I’m arbitrarily adding a half-win to both teams; we’re making hamburgers here, not performing heart surgery:
| Team | 2019 Wins | ZiPS Projected Wins | Miss |
| --- | --- | --- | --- |
The typical miss was 7.8 wins. That’s about an average result; some years ZiPS does better, some years ZiPS does worse. But the important question — the one I need to answer when doing a season’s post-mortem — is whether the error was due to getting the players wrong or getting the playing time assumptions wrong, as the latter error can sometimes be a Very Big Deal.
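The “typical miss” here is simply the mean absolute error of the win projections. A minimal sketch (the five win totals below are made up for illustration, not rows from the real 2019 table):

```python
def mean_absolute_miss(projected, actual):
    """Average absolute difference between projected and actual wins."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(projected)

# Hypothetical five-team league, not the actual 2019 results
projected = [98, 85, 72, 91, 66]
actual = [103, 78, 84, 89, 60]
print(mean_absolute_miss(projected, actual))  # → 6.4
```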
The largest team error in ZiPS history that was based purely on playing-time assumptions belongs to the 2014 Texas Rangers. Coming off their fourth straight 90-win season, a run that featured two World Series appearances, ZiPS projected the team to go 88-74 in 2014. An 88-win season would have been the team’s worst showing since 2009, but the bottom dropped out, and quickly; the Rangers finished at 67-95. The culprit? A stunning number of injuries. The Rangers combined for 2,116 days on the disabled list, which was, according to Jeff Zimmerman’s quite in-depth research at the time, the most days lost to injury for any team from 2002 to 2014, and nearly 100 more days lost than the next-worst team.
Knowing who actually got the playing time in Texas that season, and without changing a single projection for an individual player, ZiPS would have projected the Rangers to go 65-97, a 23-win difference. In other words, Texas underperformed expectations so badly because it lost so much time to injury, even though ZiPS actually underrated its individual players slightly.
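ZiPS’s real aggregation is far more involved than this, but a toy model of team wins as a replacement-level baseline plus playing-time-weighted WAR shows why reshuffling playing time alone can move a projection so much. Everything here is hypothetical: the ~48-win baseline is a common rule of thumb rather than a ZiPS figure, and both players are invented:

```python
def toy_team_wins(baseline, war_per_600, playing_time):
    """baseline: wins expected from an all-replacement roster (~48 here, a
    rule of thumb, not a ZiPS number). war_per_600: projected WAR per 600
    PA by player. playing_time: assumed plate appearances by player."""
    return baseline + sum(
        war_per_600[p] * pa / 600 for p, pa in playing_time.items()
    )

# Hypothetical roster slice: a star projected for a full season loses
# most of his plate appearances to a replacement-level fill-in.
rates = {"star": 7.0, "fill-in": 0.2}
projected_pt = {"star": 600, "fill-in": 50}
actual_pt = {"star": 150, "fill-in": 500}

before = toy_team_wins(48, rates, projected_pt)
after = toy_team_wins(48, rates, actual_pt)
print(round(before - after, 2))  # the playing-time swap alone costs ~5 wins
```

Neither player’s projected talent changed between the two runs; only the allocation of plate appearances did, which is exactly the reconfiguration exercise described for the 2014 Rangers.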
So, let’s reshuffle the 2019 projections, this time with 20/20 hindsight about who actually played:
| Team | 2019 Wins | ZiPS Projected Wins | Reconfigured Projected Wins | New Miss |
| --- | --- | --- | --- | --- |
Unsurprisingly, this knowledge improves the accuracy of the projections greatly. Knowing who actually played dropped the average error from 7.8 wins to 4.8 wins and improved the win projections for 25 of the 30 teams. The biggest outlier among the five teams whose projections didn’t improve was the Yankees, a team ZiPS projected for 98 wins and that actually won 103. But that projection looks more accurate than it truly was; knowing who actually played for the Yankees would have dropped the preseason projection to 90 wins.
On the flip side, ZiPS overrated the Tigers, Padres, and Red Sox by an average of 14 wins in 2019, an error that is cut nearly in half once you know who actually took the field. I under-projected the plate appearances of Fernando Tatis Jr. and Luis Urías by about 40%, failed to anticipate players like Gordon Beckham getting so much playing time, and greatly overestimated the health of Boston’s non-Eduardo Rodriguez pitchers and Dustin Pedroia.
More data means I can better model these playing time assumptions, but things like future injuries and eventual trades fall into that irreducible error category. This is part of the reason I’ve learned not to agonize over every missed projection or pull out my hair; I’d be even balder thanks to the unstoppable forces of math and fate.
Dan Szymborski is a senior writer for FanGraphs and the developer of the ZiPS projection system. He was a writer for ESPN.com from 2010-2018, a regular guest on a number of radio shows and podcasts, and a voting BBWAA member. He also maintains a terrible Twitter account at @DSzymborski.