# Hitter zStats Entering the Homestretch, Part 1 (Validation)

One of the strange things about projecting baseball players is that even results themselves are small samples. Full seasons result in specific numbers that have minimal predictive value, such as BABIP for pitchers. The predictive value isn’t literally zero — individual seasons form much of the basis of projections, whether math-y ones like ZiPS or simply our personal opinions on how good a player is — but we have to develop tools that improve our ability to explain some of these stats. It’s not enough to know that the number of home runs allowed by a pitcher is volatile; we need to know how and why pitchers allow homers beyond a general sense of pitching poorly or being Jordan Lyles.

Data like that which Statcast provides gives us the ability to get at what’s more elemental, such as exit velocities and launch angles, things that are in themselves more predictive than their end products (the number of homers). Statcast has its own implementation of this kind of exercise in its various “x” stats. ZiPS uses slightly different models with a similar purpose, which I’ve dubbed zStats. (I’m going to make you guess what the z stands for!) The differences in the models can be significant. For example, when talking about grounders, balls hit directly toward the second base bag became singles 48.7% of the time from 2012 to ’19, with 51.0% outs and 0.2% doubles. But grounders hit 16 degrees to the “left” of the bag only became hits 10.6% of the time over the same stretch, and toward the second base side, it was 9.8%. ZiPS also uses data like sprint speed when calculating hitter BABIP, because how fast a player is has an effect on BABIP and extra-base hits.

ZiPS doesn’t discard actual stats; the models all improve from knowing the actual numbers in addition to the zStats. You can read more on how zStats relate to actual stats here. For those curious about the r-squared values between zStats and real stats for the offensive components, they are 0.59 for zBABIP, 0.86 for strikeouts, 0.83 for walks, and 0.78 for homers. Those relationships are what make these stats useful for predicting the future. If you can explain 78% of the variance in home run rate between hitters with no information about how many homers they actually hit, you’ve answered a lot of the riddle. All of these numbers correlate better with future numbers than the actual numbers do, though a model that uses both zStats and actual ones, as the full model of ZiPS does, is superior to either by itself.
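For readers who want to see what an r-squared figure like those measures, here is a minimal sketch that computes the squared Pearson correlation between two lists. It is not the ZiPS code; the function name `r_squared` and the toy inputs are assumptions made up for illustration, and the toy numbers are not baseball data.

```python
from math import fsum

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x = fsum(xs) / len(xs)
    mean_y = fsum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either sequence is constant.
    return sxy * sxy / (sxx * syy)

# Toy numbers only, NOT baseball data: stand-ins for a zStat and the actual stat.
toy_z = [1.0, 2.0, 3.0, 4.0]
toy_actual = [1.0, 3.0, 2.0, 4.0]
print(f"r-squared: {r_squared(toy_z, toy_actual):.2f}")
```

An r-squared of 0.78 would mean that 78% of the hitter-to-hitter variance in one list is accounted for by a straight-line relationship with the other.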

And why is this important and not just number-spinning? Knowing that changes in walk rates, home run rates, and strikeout rates stabilized far quicker than other stats was an important step forward in player valuation. That’s something that’s useful whether you work for a front office, are a hardcore fan, want to make some fantasy league moves, or are just a regular fan who is rooting for your faves. If we improve our knowledge of the basic molecular structure of a walk or a strikeout, then we can find players who are improving or struggling even more quickly, and provide better answers on why a walk rate or a strikeout rate has changed. This is useful data for me in particular because I obviously do a lot of work with projections, but I’m hoping this type of information is interesting to readers beyond that.

As with any model, the proof of the pudding is in the eating, and there are always some people who question the value of data such as these. So for this run, I’m pitting the zStats from June against the results of the last two months, all new data that obviously could not have been used in the model without a time machine, to see how the zStats did compared to reality. I’m not going to do a whole post for this every time, but based on the feedback from the last post in June, this is something people really wanted to see.

Starting with zBABIP, let’s look at how the numbers have shaken out for the leaders and trailers from back in June. I didn’t include players with fewer than 100 plate appearances over the last two months.

zBABIP Overachievers (As of 6/8)

| Name | BABIP (6/8) | zBABIP (6/8) | BABIP Since |
| --- | --- | --- | --- |
| Luis Arraez | .417 | .358 | .368 |
| Spencer Steer | .337 | .278 | .269 |
| Matt Chapman | .359 | .295 | .318 |
| TJ Friedl | .398 | .311 | .270 |
| Anthony Santander | .306 | .247 | .281 |
| Brandon Belt | .453 | .333 | .313 |
| Corbin Carroll | .342 | .284 | .271 |
| Orlando Arcia | .374 | .291 | .318 |
| Lane Thomas | .350 | .299 | .371 |
| Isaac Paredes | .287 | .224 | .224 |
| Randal Grichuk | .414 | .321 | .312 |
| Nick Castellanos | .400 | .355 | .270 |
| Elias Díaz | .350 | .298 | .297 |
| Austin Hays | .379 | .330 | .331 |
| Joey Meneses | .374 | .335 | .276 |
| Dominic Smith | .318 | .280 | .291 |
| Nick Pratto | .466 | .375 | .292 |
| James Outman | .324 | .270 | .398 |
| Geraldo Perdomo | .330 | .271 | .299 |
| Owen Miller | .350 | .304 | .269 |
| Alex Bregman | .258 | .229 | .240 |
| Mickey Moniak | .394 | .223 | .438 |

zBABIP Underachievers (As of 6/8)

| Name | BABIP (6/8) | zBABIP (6/8) | BABIP Since |
| --- | --- | --- | --- |
| Keibert Ruiz | .222 | .301 | .268 |
| Julio Rodriguez | .304 | .376 | .345 |
| Pete Alonso | .199 | .269 | .206 |
| Miguel Rojas | .243 | .348 | .237 |
| Tony Kemp | .165 | .242 | .298 |
| Anthony Volpe | .241 | .314 | .288 |
| Jean Segura | .230 | .289 | .283 |
| Jake McCarthy | .215 | .330 | .398 |
| Brendan Donovan | .260 | .323 | .364 |
| Gleyber Torres | .258 | .307 | .301 |
| Jake Cronenworth | .257 | .314 | .275 |
| Kyle Schwarber | .174 | .238 | .218 |
| Bo Bichette | .359 | .396 | .380 |
| George Springer | .268 | .308 | .313 |
| Willson Contreras | .252 | .301 | .422 |
| Mookie Betts | .254 | .295 | .314 |
| Bobby Witt Jr. | .265 | .302 | .340 |
| Justin Turner | .285 | .325 | .341 |
| Michael Harris II | .225 | .300 | .374 |

For the overachievers, when it came to predicting BABIP for the middle third of the season, zBABIP was closer than actual BABIP for 19 of the 23 players projected, with a mean absolute error (MAE) of 42 points of BABIP versus 72 points. For the underachievers, zBABIP was closer on 16 of 20 with an MAE of 43 points versus 65 points.
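To make the scoring concrete, here is a rough sketch of the head-to-head and MAE comparison using only the first five rows of the overachievers table, so its output will not match the full-group figures quoted above. The names `ROWS` and `compare` are made up for this example, and counting a tie as a non-win for zBABIP is an assumption on my part.

```python
# Each row: (name, BABIP through 6/8, zBABIP through 6/8, BABIP since),
# expressed in points of BABIP (.417 -> 417) so the arithmetic stays in integers.
# These are the first five rows of the overachievers table, not the full group.
ROWS = [
    ("Luis Arraez", 417, 358, 368),
    ("Spencer Steer", 337, 278, 269),
    ("Matt Chapman", 359, 295, 318),
    ("TJ Friedl", 398, 311, 270),
    ("Anthony Santander", 306, 247, 281),
]

def compare(rows):
    """Return (zBABIP wins, zBABIP MAE, actual BABIP MAE), errors in points."""
    z_errors = [abs(z - since) for _, _, z, since in rows]
    actual_errors = [abs(actual - since) for _, actual, _, since in rows]
    # A "win" means zBABIP landed strictly closer to the later BABIP
    # (assumption: ties count for neither side).
    z_wins = sum(z < a for z, a in zip(z_errors, actual_errors))
    n = len(rows)
    return z_wins, sum(z_errors) / n, sum(actual_errors) / n

z_wins, z_mae, actual_mae = compare(ROWS)
print(f"zBABIP closer on {z_wins} of {len(ROWS)}")
print(f"MAE: zBABIP {z_mae:.1f} points, actual BABIP {actual_mae:.1f} points")
```

On this five-player subset, zBABIP is closer for everyone except Anthony Santander.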

On to homers.

zHR Overachievers (As of 6/8)

| Name | HR | zHR | zHR Diff | HR% (6/8) | zHR% (6/8) | HR% Since |
| --- | --- | --- | --- | --- | --- | --- |
| Pete Alonso | 22 | 14.9 | 7.1 | 8.4% | 5.7% | 5.1% |
| Bo Bichette | 14 | 9.2 | 4.8 | 4.9% | 3.2% | 1.6% |
| J.D. Martinez | 15 | 10.3 | 4.7 | 7.4% | 5.1% | 5.6% |
| Francisco Alvarez | 9 | 4.3 | 4.7 | 6.6% | 3.1% | 7.6% |
| Bryan De La Cruz | 8 | 3.4 | 4.6 | 3.4% | 1.4% | 3.3% |
| Isaac Paredes | 9 | 4.4 | 4.6 | 4.3% | 2.1% | 6.9% |
| Yandy Díaz | 12 | 7.5 | 4.5 | 5.2% | 3.3% | 2.2% |
| Harold Ramírez | 8 | 3.5 | 4.5 | 4.3% | 1.9% | 0.9% |
| Max Muncy | 18 | 13.7 | 4.3 | 7.6% | 5.8% | 6.2% |
| Nolan Gorman | 14 | 9.9 | 4.1 | 6.3% | 4.5% | 6.1% |
| Jose Siri | 11 | 6.9 | 4.1 | 8.4% | 5.3% | 7.0% |
| Shohei Ohtani | 16 | 12.2 | 3.8 | 5.8% | 4.4% | 10.6% |
| Christian Walker | 12 | 8.3 | 3.7 | 4.9% | 3.4% | 5.0% |
| Josh Lowe | 11 | 7.3 | 3.7 | 5.6% | 3.7% | 2.1% |
| Taylor Walls | 7 | 3.3 | 3.7 | 3.9% | 1.8% | 0.0% |
| Christopher Morel | 9 | 5.6 | 3.4 | 10.6% | 6.6% | 4.1% |
| Jarred Kelenic | 10 | 6.6 | 3.4 | 4.3% | 2.8% | 0.8% |
| Luis Robert Jr. | 13 | 9.8 | 3.2 | 5.1% | 3.9% | 7.9% |
| Adam Frazier | 6 | 2.8 | 3.2 | 2.7% | 1.3% | 5.5% |
| Luke Raley | 11 | 7.9 | 3.1 | 6.8% | 4.9% | 2.7% |
| Gleyber Torres | 9 | 6.0 | 3.0 | 3.4% | 2.2% | 4.3% |
| Will Smith | 9 | 6.0 | 3.0 | 4.8% | 3.2% | 2.7% |
| Blake Sabol | 7 | 4.1 | 2.9 | 4.5% | 2.6% | 3.6% |

zHR Underachievers (As of 6/8)

| Name | HR | zHR | zHR Diff | HR% (6/8) | zHR% (6/8) | HR% Since |
| --- | --- | --- | --- | --- | --- | --- |
| Matt Chapman | 8 | 15.9 | -7.9 | 3.1% | 6.1% | 3.4% |
| Trent Grisham | 5 | 11.0 | -6.0 | 2.3% | 5.0% | 3.1% |
| Spencer Torkelson | 5 | 10.1 | -5.1 | 2.0% | 4.0% | 4.5% |
| Dansby Swanson | 6 | 10.8 | -4.8 | 2.2% | 4.0% | 6.8% |
| MJ Melendez | 5 | 9.3 | -4.3 | 2.1% | 4.0% | 2.1% |
| José Abreu | 1 | 5.2 | -4.2 | 0.4% | 2.1% | 4.4% |
| Vladimir Guerrero Jr. | 9 | 12.7 | -3.7 | 3.3% | 4.7% | 4.2% |
| Michael Massey | 4 | 7.6 | -3.6 | 2.2% | 4.2% | 4.5% |
| Cal Raleigh | 8 | 11.5 | -3.5 | 3.9% | 5.6% | 6.0% |
| Triston Casas | 6 | 9.5 | -3.5 | 3.0% | 4.7% | 6.7% |
| José Ramírez | 6 | 9.2 | -3.2 | 2.3% | 3.5% | 5.4% |
| Andrew Benintendi | 0 | 3.2 | -3.2 | 0.0% | 1.3% | 1.0% |
| Paul Goldschmidt | 10 | 13.0 | -3.0 | 3.7% | 4.8% | 3.7% |
| Josh Bell | 4 | 7.0 | -3.0 | 1.7% | 3.0% | 4.2% |
| Austin Hays | 6 | 9.0 | -3.0 | 2.7% | 4.1% | 1.7% |
| Dominic Smith | 1 | 3.9 | -2.9 | 0.4% | 1.6% | 2.2% |
| Alex Verdugo | 5 | 7.8 | -2.8 | 1.9% | 3.0% | 1.8% |
| Ronald Acuña Jr. | 12 | 14.7 | -2.7 | 4.2% | 5.2% | 5.8% |
| Kyle Isbel | 1 | 3.6 | -2.6 | 1.0% | 3.8% | 2.9% |
| Joey Meneses | 2 | 4.5 | -2.5 | 0.8% | 1.8% | 3.6% |
| Bryan Reynolds | 7 | 9.5 | -2.5 | 2.8% | 3.8% | 4.2% |
| Willson Contreras | 7 | 9.5 | -2.5 | 3.0% | 4.0% | 2.9% |
| Jurickson Profar | 5 | 7.4 | -2.4 | 2.0% | 3.0% | 1.1% |

Closer victories for ZiPS here, which isn’t surprising because home run rate is simply a more predictive stat than BABIP! zHR rate was closer than the actual rate on 12 of the 23 overachievers and 15 of the 23 underachievers. The MAEs were also two closer wins, with 2.1% versus 2.6% on the first group and 1.3% versus 1.7% on the second. Ohtani almost singlehandedly won it for the first group with his homer surge!
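If you want to check how the count and rate columns in the homer tables line up, here is a small sketch. It assumes the rate columns are per plate appearance, which is my inference from the rounded table values and not something spelled out above; the function names `implied_pa` and `z_rate_pct` are made up for this example.

```python
def implied_pa(count, rate_pct):
    """Plate appearances implied by a counting stat and its rate in percent.

    Assumes the rate is per plate appearance. Fails with ZeroDivisionError
    when the rate is 0.0% (a hitter with no homers, for instance).
    """
    return count / (rate_pct / 100)

def z_rate_pct(count, rate_pct, z_count):
    """Express a z-count as a rate over the same implied denominator."""
    return 100 * z_count / implied_pa(count, rate_pct)

# Pete Alonso through 6/8: 22 HR at an 8.4% rate, with 14.9 zHR.
alonso_pa = implied_pa(22, 8.4)
alonso_zhr_pct = z_rate_pct(22, 8.4, 14.9)
print(f"implied PA: {alonso_pa:.0f}, zHR%: {alonso_zhr_pct:.1f}%")
```

Because the inputs are already rounded, the implied plate appearance total is only approximate, but the recomputed zHR% lands on the 5.7% listed for Alonso.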

zBB Overachievers (As of 6/8)

| Name | BB | zBB | zBB Diff | BB% (6/8) | zBB% (6/8) | BB% Since |
| --- | --- | --- | --- | --- | --- | --- |
| Jake Cronenworth | 32 | 19.6 | 12.4 | 12.5% | 7.6% | 5.5% |
| Ryan Noda | 42 | 31.5 | 10.5 | 19.8% | 14.9% | 14.0% |
| Ian Happ | 45 | 35.0 | 10.0 | 17.0% | 13.2% | 15.4% |
| Byron Buxton | 26 | 17.5 | 8.5 | 12.3% | 8.3% | 6.7% |
| Joey Gallo | 25 | 17.2 | 7.8 | 15.2% | 10.4% | 10.4% |
| Juan Soto | 56 | 48.4 | 7.6 | 20.9% | 18.1% | 19.2% |
| Myles Straw | 21 | 14.0 | 7.0 | 9.4% | 6.3% | 6.9% |
| Hunter Renfroe | 17 | 10.1 | 6.9 | 6.9% | 4.1% | 8.0% |
| Willson Contreras | 24 | 17.1 | 6.9 | 10.1% | 7.2% | 9.4% |
| Nico Hoerner | 17 | 10.3 | 6.7 | 6.9% | 4.2% | 5.6% |
| Matt Olson | 44 | 37.6 | 6.4 | 15.7% | 13.4% | 11.3% |
| José Abreu | 17 | 10.6 | 6.4 | 6.7% | 4.2% | 6.8% |
| Kyle Schwarber | 44 | 37.7 | 6.3 | 16.6% | 14.2% | 15.4% |
| Andrew McCutchen | 35 | 29.1 | 5.9 | 15.5% | 12.9% | 17.1% |
| Adley Rutschman | 45 | 39.3 | 5.7 | 16.5% | 14.4% | 9.6% |
| Josh Bell | 30 | 24.3 | 5.7 | 13.0% | 10.5% | 9.0% |
| Alejandro Kirk | 20 | 14.7 | 5.3 | 11.3% | 8.3% | 7.0% |
| Will Smith | 28 | 22.8 | 5.2 | 14.8% | 12.1% | 12.6% |
| Miguel Cabrera | 12 | 6.8 | 5.2 | 10.7% | 6.1% | 10.0% |
| MJ Melendez | 26 | 21.0 | 5.0 | 11.1% | 8.9% | 7.7% |
| Jean Segura | 15 | 10.1 | 4.9 | 7.4% | 5.0% | 5.6% |

zBB Underachievers (As of 6/8)

| Name | BB | zBB | zBB Diff | BB% (6/8) | zBB% (6/8) | BB% Since |
| --- | --- | --- | --- | --- | --- | --- |
| Esteury Ruiz | 10 | 20.0 | -10.0 | 3.7% | 7.3% | 3.8% |
| Luis García | 13 | 21.5 | -8.5 | 5.6% | 9.3% | 4.5% |
| Mike Yastrzemski | 13 | 20.8 | -7.8 | 7.6% | 12.2% | 13.8% |
| Connor Joe | 21 | 28.7 | -7.7 | 10.7% | 14.6% | 7.8% |
| Francisco Alvarez | 7 | 14.5 | -7.5 | 5.1% | 10.6% | 8.3% |
| Corey Julks | 4 | 10.6 | -6.6 | 2.4% | 6.3% | 12.4% |
| Bo Bichette | 13 | 19.5 | -6.5 | 4.5% | 6.8% | 3.7% |
| J.D. Martinez | 10 | 16.5 | -6.5 | 4.9% | 8.1% | 9.6% |
| Bryan Reynolds | 21 | 27.5 | -6.5 | 8.3% | 10.9% | 7.3% |
| Ozzie Albies | 14 | 20.1 | -6.1 | 5.4% | 7.8% | 8.7% |
| TJ Friedl | 10 | 16.1 | -6.1 | 6.3% | 10.2% | 8.7% |
| Rafael Devers | 16 | 21.6 | -5.6 | 6.2% | 8.4% | 10.8% |
| Michael A. Taylor | 9 | 14.6 | -5.6 | 5.3% | 8.6% | 4.8% |
| Emmanuel Rivera | 5 | 10.1 | -5.1 | 5.1% | 10.2% | 7.4% |
| Jorge Soler | 24 | 28.9 | -4.9 | 9.8% | 11.7% | 12.7% |
| Leody Taveras | 13 | 17.9 | -4.9 | 7.1% | 9.8% | 4.6% |
| Taylor Ward | 18 | 22.9 | -4.9 | 7.3% | 9.3% | 12.5% |
| Bobby Witt Jr. | 11 | 15.6 | -4.6 | 4.2% | 5.9% | 6.1% |
| Ronald Acuña Jr. | 29 | 33.3 | -4.3 | 10.2% | 11.8% | 12.9% |

Unusually, ZiPS did significantly better at pegging overachievers than underachievers. You don’t see it in the basic win totals (13 of 21 and 11 of 19), but zBB had a significant advantage in MAE for the overachievers (2.0% versus 3.0%) and was very close for the underachievers (2.9% versus 3.1%). zBB shouldn’t be kicking butt here; the value of zBB is in addition to actual walk rate, not instead of it. zBB and actual BB are a big gain when used in tandem.

zSO Overachievers (As of 6/8)

| Name | SO | zSO | zSO Diff | SO% (6/8) | zSO% (6/8) | SO% Since |
| --- | --- | --- | --- | --- | --- | --- |
| Masataka Yoshida | 24 | 38.6 | -14.6 | 9.9% | 15.9% | 13.8% |
| Brent Rooker | 60 | 74.1 | -14.1 | 26.9% | 33.2% | 39.2% |
| Esteury Ruiz | 47 | 61.0 | -14.0 | 17.2% | 22.3% | 21.9% |
| Andrew McCutchen | 44 | 57.3 | -13.3 | 19.5% | 25.4% | 22.4% |
| Max Muncy | 68 | 81.1 | -13.1 | 28.6% | 34.1% | 24.7% |
| Paul Goldschmidt | 57 | 69.9 | -12.9 | 20.9% | 25.6% | 22.1% |
| Randy Arozarena | 60 | 72.8 | -12.8 | 22.9% | 27.8% | 25.9% |
| Harold Ramírez | 36 | 48.5 | -12.5 | 19.3% | 25.9% | 17.7% |
| Pete Alonso | 51 | 63.3 | -12.3 | 19.5% | 24.3% | 22.3% |
| Javier Baez | 49 | 60.6 | -11.6 | 20.5% | 25.4% | 23.2% |
| Gleyber Torres | 33 | 44.1 | -11.1 | 12.3% | 16.5% | 14.4% |
| Bryan Reynolds | 47 | 58.0 | -11.0 | 18.7% | 23.0% | 23.0% |
| Andrew Vaughn | 47 | 57.7 | -10.7 | 18.1% | 22.3% | 23.6% |
| Jose Siri | 43 | 53.4 | -10.4 | 32.8% | 40.8% | 39.4% |
| Harrison Bader | 13 | 23.1 | -10.1 | 13.7% | 24.3% | 17.1% |
| C.J. Cron | 35 | 45.0 | -10.0 | 23.6% | 30.4% | 20.2% |
| Christian Walker | 47 | 56.9 | -9.9 | 19.3% | 23.3% | 18.8% |
| Adam Frazier | 23 | 32.4 | -9.4 | 10.4% | 14.7% | 15.0% |
| Starling Marte | 40 | 49.4 | -9.4 | 17.8% | 22.0% | 24.0% |
| Nick Castellanos | 63 | 72.3 | -9.3 | 24.0% | 27.6% | 29.8% |
| José Ramírez | 22 | 31.2 | -9.2 | 8.4% | 12.0% | 11.7% |
| Andrés Giménez | 37 | 46.2 | -9.2 | 15.9% | 19.9% | 20.8% |
| Will Smith | 19 | 28.2 | -9.2 | 10.1% | 14.9% | 21.3% |
| William Contreras | 41 | 50.1 | -9.1 | 20.4% | 24.9% | 17.5% |

zSO Underachievers (As of 6/8)

| Name | SO | zSO | zSO Diff | SO% (6/8) | zSO% (6/8) | SO% Since |
| --- | --- | --- | --- | --- | --- | --- |
| Jake Cronenworth | 60 | 41.1 | 18.9 | 23.3% | 16.0% | 13.6% |
| Brandon Marsh | 64 | 49.4 | 14.6 | 31.1% | 24.0% | 28.3% |
| Thairo Estrada | 50 | 35.6 | 14.4 | 22.5% | 16.0% | 27.1% |
| Lane Thomas | 64 | 51.4 | 12.6 | 25.0% | 20.1% | 27.5% |
| Taylor Ward | 54 | 41.6 | 12.4 | 22.0% | 16.9% | 15.5% |
| Jarred Kelenic | 76 | 63.6 | 12.4 | 32.6% | 27.3% | 32.6% |
| Jarren Duran | 52 | 40.2 | 11.8 | 29.1% | 22.5% | 19.5% |
| DJ LeMahieu | 59 | 47.4 | 11.6 | 26.7% | 21.4% | 16.8% |
| Brandon Belt | 63 | 51.5 | 11.5 | 36.4% | 29.8% | 31.2% |
| Nathaniel Lowe | 57 | 46.0 | 11.0 | 20.7% | 16.7% | 23.2% |
| Myles Straw | 44 | 33.6 | 10.4 | 19.7% | 15.1% | 20.1% |
| Josh Jung | 67 | 57.2 | 9.8 | 27.0% | 23.1% | 31.9% |
| Anthony Volpe | 74 | 64.3 | 9.7 | 30.6% | 26.6% | 23.9% |
| Brice Turang | 48 | 38.4 | 9.6 | 27.1% | 21.7% | 13.9% |
| Seiya Suzuki | 53 | 44.2 | 8.8 | 26.1% | 21.8% | 23.8% |
| Ha-Seong Kim 김하성 | 53 | 44.5 | 8.5 | 24.7% | 20.7% | 14.6% |
| Byron Buxton | 61 | 52.6 | 8.4 | 28.8% | 24.8% | 35.6% |

More minor wins for zSO, both in the simple head-to-head matchups (15 of 24 and 9 of 17) and in MAE (3.5% versus 4.3% and 5.3% versus 5.8%). I was especially interested in Jake Cronenworth’s results because he was a weird outlier in these numbers, as both the biggest walk overachiever and the biggest strikeout underachiever. Both of those numbers did move considerably toward what ZiPS expected. If they hadn’t, I’d really have to dig further into his approach to see why his results were so atypical given his plate discipline numbers.

Tomorrow, I’ll have the updated hitter numbers through August 8.

Dan Szymborski is a senior writer for FanGraphs and the developer of the ZiPS projection system. He was a writer for ESPN.com from 2010-2018, a regular guest on a number of radio shows and podcasts, and a voting BBWAA member. He also maintains a terrible Twitter account at @DSzymborski.