The 2021 ZiPS Projections: An Introduction
The first ZiPS team projection for 2021 goes live on Wednesday, and as usual, this is a good place to give reminders about what ZiPS is, what ZiPS is trying to do, and — perhaps most importantly — what ZiPS is not.
ZiPS is a computer projection system that I developed in 2002–04 and that officially went live for the ’04 season. The origin of ZiPS is similar to that of Tom Tango’s Marcel the Monkey, coming from discussions I had with Chris Dial, one of my best friends and a fellow stat nerd, in the late 1990s (my first interaction with Chris involved me being called an expletive!). ZiPS moved quickly beyond its inception as a fairly simple projection system: It now does a lot more and uses a lot more data than I ever envisioned 20 years ago. At its core, however, it’s still doing two basic tasks: estimating what the baseline expectation for a player is at the moment I hit the button; and then estimating where that player may be going using large cohorts of relatively similar players.
ZiPS uses multi-year statistics, with more recent seasons weighted more heavily; in the beginning, all the statistics received the same yearly weighting, but eventually, this became more varied based on additional research. Research is a big part of ZiPS, and every year, I run literally hundreds of studies on various aspects of the system to determine their predictive value and better calibrate the player baselines. What started with the data available in 2002 has expanded considerably: Basic hit, velocity, and pitch data began playing a larger role starting in ’13; and data derived from StatCast has been included in recent years as I got a handle on the predictive value and impact of those numbers on existing models. I believe in cautious, conservative design, so data is only included once I have confidence in improved accuracy; there are always builds of ZiPS that are still a couple of years away. Additional internal ZiPS tools like zBABIP, zHR, zBB, and zSO are used to better establish baseline expectations for players. These stats work similarly to the various flavors of “x” stats, with the z standing for something I’d wager you’ve already figured out!
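To make the idea of recency-weighted, multi-year baselines concrete, here is a minimal sketch. It is not the ZiPS code: the 5/4/3 weights, the league rate, and the regression amount are illustrative assumptions, and the function name weighted_baseline is hypothetical.

```python
def weighted_baseline(seasons, weights, league_rate, regression_opps):
    """Estimate a baseline rate from several seasons of data.

    seasons: list of (successes, opportunities), most recent season first.
    weights: one weight per season; larger weights count a season more.
    league_rate, regression_opps: assumed values that pull the estimate
    toward league average by adding league-average opportunities.
    """
    weighted_successes = sum(w * s for w, (s, _) in zip(weights, seasons))
    weighted_opps = sum(w * n for w, (_, n) in zip(weights, seasons))
    return (weighted_successes + league_rate * regression_opps) / (
        weighted_opps + regression_opps
    )


# Hypothetical hitter: hits and at-bats for the last three seasons.
seasons = [(150, 500), (140, 500), (130, 500)]
baseline = weighted_baseline(
    seasons, weights=(5, 4, 3), league_rate=0.250, regression_opps=1200
)
print(round(baseline, 3))
```

Because the most recent season carries the largest weight, an improving hitter gets a baseline above his simple three-year average before the regression step pulls it back toward the league.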
When estimating a player’s future production, ZiPS compares their baseline performance, both in quality and shape, to the baseline of every player in its database at every point in their career. This database consists of every major leaguer since the Deadball era — the game was so different prior to then that I’ve found pre-Deadball comps make projections less accurate — and every minor league translation since the late 1960s. Using cluster analysis techniques (Mahalanobis distance is one of my favorite tools), ZiPS assembles a cohort of fairly similar players across history for player comparisons, something you see in the most similar comps list. Non-statistical factors include age, position, handedness, and, to a lesser extent, height and weight compared to the average height and weight of the era (unfortunately, this data is not very good). ZiPS then generates a probable aging curve — both midpoint projections and range — on the fly for each player. This method has been used by PECOTA and by the Elias Baseball Analyst in the late 1980s, and I think it is the best approach. After all, there is little experimental data in baseball; the only way we know how plodding sluggers age is from observing how plodding sluggers age.
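For readers unfamiliar with Mahalanobis distance, the sketch below shows how it can rank comparable players. It is a simplified illustration rather than the ZiPS implementation: it uses only two made-up features (walk rate and strikeout rate), and the player names and numbers are hypothetical.

```python
from math import sqrt


def covariance_2d(points):
    """Sample variances and covariance of a list of (x, y) points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_2d(a, b, cov):
    """Distance between a and b, scaled by the pool's covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Uses the closed-form inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(max(d2, 0.0))


def most_similar(target, pool, k=3):
    """Return the names of the k players in pool closest to target."""
    cov = covariance_2d(list(pool.values()))
    return sorted(pool, key=lambda name: mahalanobis_2d(target, pool[name], cov))[:k]


# Hypothetical (walk rate, strikeout rate) baselines.
pool = {
    "Player A": (0.08, 0.20),
    "Player B": (0.12, 0.25),
    "Player C": (0.05, 0.15),
    "Player D": (0.10, 0.30),
    "Player E": (0.15, 0.18),
}
print(most_similar((0.11, 0.24), pool, k=2))
```

Unlike plain Euclidean distance, this scaling keeps a feature with a wide spread from dominating the comparison and accounts for features that move together.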
One of the tenets of projections I follow is that no matter what the projection says, that’s the ZiPS projection. Even if inserting my opinion would improve a specific projection, I’m philosophically opposed to doing so. ZiPS is most useful when people know that it’s purely data-based, not some unknown mix of data and my opinion. Over the years, I like to think I’ve taken a clever approach to turning more things into data — for example, ZiPS’ use of basic injury information — but some things just aren’t in the model. ZiPS doesn’t know if a pitcher wasn’t allowed to throw his slider coming back from injury, or if a left fielder suffered a family tragedy in July. I consider these things outside a projection system’s purview, even though they can affect on-field performance.
It’s also important to remember that the bottom-line projection is, in layman’s terms, only a midpoint. You don’t expect every player to hit that midpoint; 10% of players are “supposed” to fail to meet their 10th percentile projection and 10% of players are supposed to pass the 90th percentile forecast. This point can create a surprising amount of confusion. ZiPS gave .300 BA projections to three players in 2020: Luis Arraez, Jose Altuve (yikes!), and Christian Yelich (double yikes!). But that’s not the same thing as ZiPS thinking there would only be three .300 hitters. ZiPS actually projected that there would be 41 .300 hitters, which is more than typical given a very short season, and 39 hitters with at least 100 plate appearances in fact did so.
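The gap between "three players projected at .300" and "41 expected .300 hitters" comes from summing probabilities rather than counting midpoints. The sketch below illustrates that arithmetic under a loud assumption: each player's outcome is treated as normally distributed around his midpoint, with a made-up standard deviation. ZiPS's actual distributions are not described here, and the function names are hypothetical.

```python
from statistics import NormalDist


def prob_at_least(threshold, midpoint, sd):
    """Chance a player finishes at or above threshold (normal assumption)."""
    return 1.0 - NormalDist(mu=midpoint, sigma=sd).cdf(threshold)


def expected_count(projections, threshold):
    """Expected number of players at or above threshold.

    projections: list of (midpoint, sd) pairs, one per player.
    """
    return sum(prob_at_least(threshold, mid, sd) for mid, sd in projections)


# Ten hypothetical hitters, each with a .280 midpoint and an assumed .030 spread.
hitters = [(0.280, 0.030)] * 10
midpoints_at_300 = sum(1 for mid, _ in hitters if mid >= 0.300)
print(midpoints_at_300)                          # no midpoint reaches .300
print(round(expected_count(hitters, 0.300), 1))  # yet some .300 seasons are expected
```

None of these ten hitters projects to hit .300, but each has roughly a one-in-four chance of doing so under these assumptions, so a handful of .300 seasons is the expected result.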
Another crucial thing to bear in mind is that the basic ZiPS projections are not playing-time predictors. By design, ZiPS has no idea who will actually play in the majors in 2021. ZiPS is essentially projecting equivalent production; a batter with a .240 projection may “actually” have a .260 Triple-A projection or a .290 Double-A projection. How a Julio Rodriguez would hit in the majors full-time in 2021 is a far more interesting use of a projection system than it telling me that he won’t play in the majors, or at least play only a little bit. For the depth charts that go live in every article, I use the FanGraphs Depth Charts to determine the playing time for individual players. Since we’re talking about team construction, I can’t leave ZiPS to its own devices for an application like this. It’s the same reason I use modified depth charts for team projections in-season. There’s a probabilistic element in the ZiPS depth charts: sometimes Joe Schmo will play a full season, sometimes he’ll miss playing time and Buck Schmuck has to step in. But the basic concept is very straightforward.
The to-do list never shrinks. One of the things still on the drawing board is better run/RBI projections. ZiPS wasn’t originally designed as a fantasy baseball tool (fantasy baseball analysts have been making fantasy-targeted projections for a long time), but given that ZiPS is frequently used by fantasy players, more sophisticated models are in the works. Saves, on the other hand, are a particularly difficult issue. As of now, the only thing I tell ZiPS about a player’s role is if it is going to change, which determines if ZiPS sees future Mike Moustakas as a second or third baseman. I’ve tried a lot of shortcuts, like trying to model the manager’s decision about who the closer would be using both statistics and things like age, salary, and history. While that model generally does a good job projecting who will be the closer, the misses are gigantic and render save projections ineffective; a managerial decision can turn a 35-save pitcher into a five-save pitcher. I’m still figuring out how to approach this problem.
What’s new in 2021? Not as much as I originally expected. I like to have a lot of testing data before introducing major updates and not just rely on things like cross-validation. In the end, I want to test projections and the various components of projections against players that ZiPS has never “seen” before, not randomized subsets of players already “in the cookie dough.” There has been additional refinement of my models for BABIP and for the various z measures noted above, but nothing earth-shattering.
The projections for 2021 aren’t so much about what is new as about what is missing: most of 2020. I hardly need to go over the tale of why data is missing, but the fact is that it creates a new challenge, a new bit of uncertainty for a projection system. With little in history as a guide, I’ve looked extensively at the abbreviated 1981 and 1994/1995 seasons and found that the best approach is to simply project out the missing games as if players were going to come back in the fall and finish off the schedule. This has the effect of 2020’s performance being weighted much more lightly than the recent numbers of any season I’ve ever used in a projection.
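One way to read "projecting out the missing games" is to keep a player's actual 2020 line and fill the unplayed portion of the schedule at his projected rate. The sketch below follows that reading; it is an interpretation, not the ZiPS code, and the function name, the at-bat totals, and the rates are all hypothetical.

```python
def fill_out_season(actual_hits, actual_ab, projected_avg, full_season_ab):
    """Blend an actual partial season with a projected remainder.

    The missing at-bats are filled in at projected_avg, so the shorter the
    real sample, the less it moves the full-season-equivalent line.
    """
    missing_ab = max(full_season_ab - actual_ab, 0)
    filled_hits = actual_hits + projected_avg * missing_ab
    return filled_hits / (actual_ab + missing_ab)


# Hypothetical hitter: 41 hits in 200 AB (.205) in the short season,
# with a .290 projection applied to the 340 at-bats that were never played.
blended = fill_out_season(
    actual_hits=41, actual_ab=200, projected_avg=0.290, full_season_ab=540
)
print(round(blended, 3))
```

The blended line lands between the actual .205 and the projected .290, which is the sense in which a short 2020 ends up weighted more lightly than a normal most-recent season.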
The layoff factor is also a large one for minor leaguers. The good news is that non-injury layoffs are much less severe in terms of expectations than layoffs resulting from injury. And in this case, most players are in the same boat, and even those who did get to play in the majors only got two months of baseball, a much smaller helping than usual.
Playoff stats are included in the mix, as they have been since, if I’m not mistaken, the 2012 season. They usually add a skosh of accuracy (these are real, important games against the toughest of competition) and are hopefully even more useful in a season in which so many games are missing. The Tampa Bay Rays, after all, extended their season by a whole third by virtue of playing in 20 playoff games.
Have any questions, suggestions, or concerns about ZiPS? I’ll try to reply to as many as I can reasonably address in the comments below. If the projections have been valuable to you now or in the past, I would also strongly urge you to consider becoming a member of FanGraphs, should you currently have the ability to do so. It’s with your support that I have been able to keep so much of this work available to the public for free for such a long time.
Dan Szymborski is a senior writer for FanGraphs and the developer of the ZiPS projection system. He was a writer for ESPN.com from 2010-2018, a regular guest on a number of radio shows and podcasts, and a voting BBWAA member. He also maintains a terrible Twitter account at @DSzymborski.
Are playoff games equally weighted as regular season games? Do Dodgers or Rays players have any noticeable difference (outside of skill) in their projections from playing more games?
This was a question I had. Look at someone like Randy Arozarena: he had more PA in the postseason than in the regular season. You would think that those postseason PA would at least have some predictive value for his success going forward, even if it’s just a little.
Weighted equally, though there are greater adjustments than usual due to opponent quality.
In a related question, is the difficulty of the postseason taken into consideration? After all, for example, you would not expect a hitter with a .300 BA to have a .300 BA in the postseason, all things being equal.