The 2022 ZiPS Projections Are Coming!
The first ZiPS team projections will hit the site this coming Monday, and as I typically do, I’m going to use this space to talk a little bit about my philosophy behind ZiPS, my goals, and the new things I’ve worked on before they go live.
ZiPS is a computer projection system I initially developed in 2002–04 and which officially went live for the ’04 season. The origin of ZiPS is similar to Tom Tango’s Marcel the Monkey, coming from discussions I had with Chris Dial, one of my best friends (my first interaction with Chris involved me being called an expletive!) and a fellow stat nerd, in the late 1990s. ZiPS quickly grew beyond its origins as a reasonably simple projection system, and now does a lot more and uses a lot more data than I ever envisioned 20 years ago. At its core, however, it’s still doing two primary tasks: estimating what the baseline expectation for a player is at the moment I hit the button, and then estimating where that player may be going using large cohorts of relatively similar players.
ZiPS uses multi-year statistics, with more recent seasons weighted more heavily; in the beginning, all the statistics received the same yearly weighting, but eventually, this became more varied based on additional research. And research is a big part of ZiPS. Every year, I run hundreds of studies on various aspects of the system to determine their predictive value and better calibrate the player baselines. What started with the data available in 2002 has expanded considerably: Basic hit, velocity, and pitch data began playing a larger role starting in ’13, while data derived from Statcast has been included in recent years as I’ve gotten a handle on the predictive value and impact of those numbers on existing models. I believe in cautious, conservative design, so data is only included once I have confidence in improved accuracy; there are always builds of ZiPS that are still a couple of years away. Additional internal ZiPS tools like zBABIP, zHR, zBB, and zSO are used to better establish baseline expectations for players. These stats work similarly to the various flavors of “x” stats, with the z standing for something I’d wager you’ve already figured out.
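To make the idea of recency weighting concrete, here is a minimal Python sketch of a multi-year weighted rate. It is an illustration only, not ZiPS code: the function name `weighted_rate`, the sample player, and the 5/4/3 weights (borrowed from the simple scheme Marcel is known for) are all assumptions, and the real ZiPS weights vary by statistic and are not given here.

```python
from typing import Sequence


def weighted_rate(
    events: Sequence[float],
    opportunities: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Return a recency-weighted rate (events per opportunity).

    Index 0 is the most recent season. Each season's events and
    opportunities are scaled by that season's weight before summing,
    so heavier weights pull the result toward recent performance.
    """
    if not (len(events) == len(opportunities) == len(weights)):
        raise ValueError("all inputs must cover the same number of seasons")

    weighted_events = sum(w * e for w, e in zip(weights, events))
    weighted_opps = sum(w * o for w, o in zip(weights, opportunities))
    if weighted_opps == 0:
        raise ValueError("no weighted opportunities to compute a rate from")

    return weighted_events / weighted_opps


# Hypothetical hitter: home runs and plate appearances, newest season first.
home_runs = [30, 22, 25]
plate_appearances = [600, 550, 620]

# Illustrative weights only (assumed, Marcel-style), not the ZiPS weights.
recency_weights = [5, 4, 3]

recent_heavy = weighted_rate(home_runs, plate_appearances, recency_weights)
equal_weight = weighted_rate(home_runs, plate_appearances, [1, 1, 1])

print(f"Recency-weighted HR rate: {recent_heavy:.4f}")
print(f"Equal-weighted HR rate:   {equal_weight:.4f}")
```

Because this made-up player's most recent season was his best, the recency-weighted rate comes out higher than the equal-weighted one. A full projection would still need the steps described above, such as adjusting the baseline and comparing against cohorts of similar players, none of which this sketch attempts.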