Hitter zStats Entering the Homestretch, Part 1 (Validation)

One of the strange things about projecting baseball players is that even results themselves are small samples. Full seasons result in specific numbers that have minimal predictive value, such as BABIP for pitchers. The predictive value isn’t literally zero — individual seasons form much of the basis of projections, whether math-y ones like ZiPS or simply our personal opinions on how good a player is — but we have to develop tools that improve our ability to explain some of these stats. It’s not enough to know that the number of home runs allowed by a pitcher is volatile; we need to know how and why pitchers allow homers beyond a general sense of pitching poorly or being Jordan Lyles.
Data like that which Statcast provides gives us the ability to get at what’s more elemental, such as exit velocities and launch angles, things that are in themselves more predictive than their end products (the number of homers). Statcast has its own implementation of this kind of exercise in its various “x” stats. ZiPS uses slightly different models with a similar purpose, which I’ve dubbed zStats. (I’m going to make you guess what the z stands for!) The differences in the models can be significant. For example, when talking about grounders, balls hit directly toward the second base bag became singles 48.7% of the time from 2012 to ’19, with 51.0% outs and 0.2% doubles. But grounders hit 16 degrees to the “left” of the bag only became hits 10.6% of the time over the same stretch, and toward the second base side, it was 9.8%. ZiPS uses data like sprint speed when calculating hitter BABIP, because how fast a player is has an effect on BABIP and extra-base hits.
ZiPS doesn’t discard actual stats; the models all improve from knowing the actual numbers in addition to the zStats. You can read more on how zStats relate to actual stats here. For those curious about the r-squared values between zStats and real stats for the offensive components, they are 0.59 for zBABIP, 0.86 for strikeouts, 0.83 for walks, and 0.78 for homers. Those relationships are what make these stats useful for predicting the future. If you can explain 78% of the variance in home run rate between hitters with no information about how many homers they actually hit, you’ve answered a lot of the riddle. All of these numbers correlate better with future numbers than the actual numbers do, though a model that uses both zStats and actual ones, as the full model of ZiPS does, is superior to either one alone.
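To make the "explains 78% of the variance" reading concrete, here is a minimal sketch of how an r-squared between a zStat and the matching real stat could be computed. It assumes a simple one-variable linear relationship, where r-squared is the squared Pearson correlation. The home run rates below are hypothetical numbers made up for illustration, not real player data, and the function name `r_squared` is my own.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance ** 2 / (var_x * var_y)


# HYPOTHETICAL home run rates (%), invented for this sketch only.
z_hr_rate = [1.5, 2.4, 3.1, 4.0, 4.8, 5.5, 6.3]
actual_hr_rate = [2.1, 1.9, 3.6, 3.4, 5.6, 4.9, 6.8]

r2 = r_squared(z_hr_rate, actual_hr_rate)
print(f"Share of variance in actual HR% explained by zHR%: {r2:.2f}")
```

A value of 1.0 would mean the zStat lines up perfectly with the real stat, and a value near zero would mean it tells you almost nothing about it.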
And why is this important and not just number-spinning? Knowing that changes in walk rates, home run rates, and strikeout rates stabilized far quicker than other stats was an important step forward in player valuation. That’s something that’s useful whether you work for a front office, are a hardcore fan, want to make some fantasy league moves, or are even just a regular fan rooting for your faves. If we improve our knowledge of the basic molecular structure of a walk or a strikeout, then we can find players who are improving or struggling even more quickly, and provide better answers on why a walk rate or a strikeout rate has changed. This is useful data for me in particular because I obviously do a lot of work with projections, but I’m hoping this type of information is interesting to readers beyond that.
As with any model, the proof of the pudding is in the eating, and there are always some people who question the value of data such as these. So for this run, I’m pitting the zStats from June against the last two months of play, all new data that obviously could not have been used in the model without a time machine, to see how the zStats did compared to reality. I’m not going to do a whole post for this every time, but based on the feedback from the last post in June, people really wanted to see these results.
Starting with zBABIP, let’s look at how the numbers have shaken out for the leaders and trailers from back in June. I didn’t include players with fewer than 100 plate appearances over the last two months.
| Name | BABIP (6/8) | zBABIP (6/8) | BABIP Since |
|---|---|---|---|
| Luis Arraez | .417 | .358 | .368 | 
| Spencer Steer | .337 | .278 | .269 | 
| Matt Chapman | .359 | .295 | .318 | 
| Thairo Estrada | .374 | .306 | .266 | 
| TJ Friedl | .398 | .311 | .270 | 
| Anthony Santander | .306 | .247 | .281 | 
| Brandon Belt | .453 | .333 | .313 | 
| Corbin Carroll | .342 | .284 | .271 | 
| Orlando Arcia | .374 | .291 | .318 | 
| Lane Thomas | .350 | .299 | .371 | 
| Isaac Paredes | .287 | .224 | .224 | 
| Randal Grichuk | .414 | .321 | .312 | 
| Nick Castellanos | .400 | .355 | .270 | 
| Elias Díaz | .350 | .298 | .297 | 
| Austin Hays | .379 | .330 | .331 | 
| Joey Meneses | .374 | .335 | .276 | 
| Dominic Smith | .318 | .280 | .291 | 
| Nick Pratto | .466 | .375 | .292 | 
| James Outman | .324 | .270 | .398 | 
| Geraldo Perdomo | .330 | .271 | .299 | 
| Owen Miller | .350 | .304 | .269 | 
| Alex Bregman | .258 | .229 | .240 | 
| Mickey Moniak | .394 | .223 | .438 |

| Name | BABIP (6/8) | zBABIP (6/8) | BABIP Since |
|---|---|---|---|
| Keibert Ruiz | .222 | .301 | .268 | 
| Adam Frazier | .240 | .315 | .247 | 
| Julio Rodriguez | .304 | .376 | .345 | 
| Pete Alonso | .199 | .269 | .206 | 
| Miguel Rojas | .243 | .348 | .237 | 
| Tony Kemp | .165 | .242 | .298 | 
| Anthony Volpe | .241 | .314 | .288 | 
| Jean Segura | .230 | .289 | .283 | 
| Jake McCarthy | .215 | .330 | .398 | 
| Brendan Donovan | .260 | .323 | .364 | 
| Gleyber Torres | .258 | .307 | .301 | 
| Jake Cronenworth | .257 | .314 | .275 | 
| Kyle Schwarber | .174 | .238 | .218 | 
| Bo Bichette | .359 | .396 | .380 | 
| George Springer | .268 | .308 | .313 | 
| Willson Contreras | .252 | .301 | .422 | 
| Mookie Betts | .254 | .295 | .314 | 
| Bobby Witt Jr. | .265 | .302 | .340 | 
| Justin Turner | .285 | .325 | .341 | 
| Michael Harris II | .225 | .300 | .374 | 
For the overachievers, when it came to predicting BABIP for the middle third of the season, zBABIP was closer than actual BABIP for 19 of the 23 players projected, with a mean absolute error (MAE) of 42 points of BABIP versus 72 points. For the underachievers, zBABIP was closer on 16 of 20, with an MAE of 43 points versus 65 points.
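For anyone who wants to check the scorekeeping, here is a minimal sketch of the two comparisons used above: the head-to-head count and the mean absolute error. It uses the 23 overachiever rows from the first table, expressed in points of BABIP. The function name `evaluate` and the choice to count only strict wins are my own assumptions, not necessarily how ZiPS tallies it.

```python
# (name, BABIP through 6/8, zBABIP through 6/8, BABIP since), in points of
# BABIP, copied from the overachievers table above.
OVERACHIEVERS = [
    ("Luis Arraez", 417, 358, 368), ("Spencer Steer", 337, 278, 269),
    ("Matt Chapman", 359, 295, 318), ("Thairo Estrada", 374, 306, 266),
    ("TJ Friedl", 398, 311, 270), ("Anthony Santander", 306, 247, 281),
    ("Brandon Belt", 453, 333, 313), ("Corbin Carroll", 342, 284, 271),
    ("Orlando Arcia", 374, 291, 318), ("Lane Thomas", 350, 299, 371),
    ("Isaac Paredes", 287, 224, 224), ("Randal Grichuk", 414, 321, 312),
    ("Nick Castellanos", 400, 355, 270), ("Elias Díaz", 350, 298, 297),
    ("Austin Hays", 379, 330, 331), ("Joey Meneses", 374, 335, 276),
    ("Dominic Smith", 318, 280, 291), ("Nick Pratto", 466, 375, 292),
    ("James Outman", 324, 270, 398), ("Geraldo Perdomo", 330, 271, 299),
    ("Owen Miller", 350, 304, 269), ("Alex Bregman", 258, 229, 240),
    ("Mickey Moniak", 394, 223, 438),
]


def evaluate(rows):
    """Return (times zBABIP was strictly closer, MAE of BABIP, MAE of zBABIP)."""
    actual_errors = [abs(babip - since) for _, babip, _, since in rows]
    z_errors = [abs(zbabip - since) for _, _, zbabip, since in rows]
    z_wins = sum(z < a for z, a in zip(z_errors, actual_errors))
    n = len(rows)
    return z_wins, sum(actual_errors) / n, sum(z_errors) / n


z_wins, mae_actual, mae_z = evaluate(OVERACHIEVERS)
print(f"zBABIP closer for {z_wins} of {len(OVERACHIEVERS)} hitters")
print(f"MAE: {mae_z:.0f} points (zBABIP) versus {mae_actual:.0f} points (BABIP)")
```

Run on the table as printed, this lands on the same figures quoted above: 19 of 23, and 42 points versus 72.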
On to homers.
| Name | HR | zHR | zHR Diff | HR% (6/8) | zHR% (6/8) | HR% Since | 
|---|---|---|---|---|---|---|
| Pete Alonso | 22 | 14.9 | 7.1 | 8.4% | 5.7% | 5.1% | 
| Bo Bichette | 14 | 9.2 | 4.8 | 4.9% | 3.2% | 1.6% | 
| J.D. Martinez | 15 | 10.3 | 4.7 | 7.4% | 5.1% | 5.6% | 
| Francisco Alvarez | 9 | 4.3 | 4.7 | 6.6% | 3.1% | 7.6% | 
| Bryan De La Cruz | 8 | 3.4 | 4.6 | 3.4% | 1.4% | 3.3% | 
| Isaac Paredes | 9 | 4.4 | 4.6 | 4.3% | 2.1% | 6.9% | 
| Yandy Díaz | 12 | 7.5 | 4.5 | 5.2% | 3.3% | 2.2% | 
| Harold Ramírez | 8 | 3.5 | 4.5 | 4.3% | 1.9% | 0.9% | 
| Max Muncy | 18 | 13.7 | 4.3 | 7.6% | 5.8% | 6.2% | 
| Nolan Gorman | 14 | 9.9 | 4.1 | 6.3% | 4.5% | 6.1% | 
| Jose Siri | 11 | 6.9 | 4.1 | 8.4% | 5.3% | 7.0% | 
| Shohei Ohtani | 16 | 12.2 | 3.8 | 5.8% | 4.4% | 10.6% | 
| Christian Walker | 12 | 8.3 | 3.7 | 4.9% | 3.4% | 5.0% | 
| Josh Lowe | 11 | 7.3 | 3.7 | 5.6% | 3.7% | 2.1% | 
| Taylor Walls | 7 | 3.3 | 3.7 | 3.9% | 1.8% | 0.0% | 
| Christopher Morel | 9 | 5.6 | 3.4 | 10.6% | 6.6% | 4.1% | 
| Jarred Kelenic | 10 | 6.6 | 3.4 | 4.3% | 2.8% | 0.8% | 
| Luis Robert Jr. | 13 | 9.8 | 3.2 | 5.1% | 3.9% | 7.9% | 
| Adam Frazier | 6 | 2.8 | 3.2 | 2.7% | 1.3% | 5.5% | 
| Luke Raley | 11 | 7.9 | 3.1 | 6.8% | 4.9% | 2.7% | 
| Gleyber Torres | 9 | 6.0 | 3.0 | 3.4% | 2.2% | 4.3% | 
| Will Smith | 9 | 6.0 | 3.0 | 4.8% | 3.2% | 2.7% | 
| Blake Sabol | 7 | 4.1 | 2.9 | 4.5% | 2.6% | 3.6% |

| Name | HR | zHR | zHR Diff | HR% (6/8) | zHR% (6/8) | HR% Since |
|---|---|---|---|---|---|---|
| Matt Chapman | 8 | 15.9 | -7.9 | 3.1% | 6.1% | 3.4% | 
| Trent Grisham | 5 | 11.0 | -6.0 | 2.3% | 5.0% | 3.1% | 
| Spencer Torkelson | 5 | 10.1 | -5.1 | 2.0% | 4.0% | 4.5% | 
| Dansby Swanson | 6 | 10.8 | -4.8 | 2.2% | 4.0% | 6.8% | 
| MJ Melendez | 5 | 9.3 | -4.3 | 2.1% | 4.0% | 2.1% | 
| José Abreu | 1 | 5.2 | -4.2 | 0.4% | 2.1% | 4.4% | 
| Vladimir Guerrero Jr. | 9 | 12.7 | -3.7 | 3.3% | 4.7% | 4.2% | 
| Michael Massey | 4 | 7.6 | -3.6 | 2.2% | 4.2% | 4.5% | 
| Cal Raleigh | 8 | 11.5 | -3.5 | 3.9% | 5.6% | 6.0% | 
| Triston Casas | 6 | 9.5 | -3.5 | 3.0% | 4.7% | 6.7% | 
| José Ramírez | 6 | 9.2 | -3.2 | 2.3% | 3.5% | 5.4% | 
| Andrew Benintendi | 0 | 3.2 | -3.2 | 0.0% | 1.3% | 1.0% | 
| Paul Goldschmidt | 10 | 13.0 | -3.0 | 3.7% | 4.8% | 3.7% | 
| Josh Bell | 4 | 7.0 | -3.0 | 1.7% | 3.0% | 4.2% | 
| Austin Hays | 6 | 9.0 | -3.0 | 2.7% | 4.1% | 1.7% | 
| Dominic Smith | 1 | 3.9 | -2.9 | 0.4% | 1.6% | 2.2% | 
| Alex Verdugo | 5 | 7.8 | -2.8 | 1.9% | 3.0% | 1.8% | 
| Ronald Acuña Jr. | 12 | 14.7 | -2.7 | 4.2% | 5.2% | 5.8% | 
| Kyle Isbel | 1 | 3.6 | -2.6 | 1.0% | 3.8% | 2.9% | 
| Joey Meneses | 2 | 4.5 | -2.5 | 0.8% | 1.8% | 3.6% | 
| Bryan Reynolds | 7 | 9.5 | -2.5 | 2.8% | 3.8% | 4.2% | 
| Willson Contreras | 7 | 9.5 | -2.5 | 3.0% | 4.0% | 2.9% | 
| Jurickson Profar | 5 | 7.4 | -2.4 | 2.0% | 3.0% | 1.1% | 
Closer victories for ZiPS here, which isn’t surprising because home run rate is simply a more predictive stat than BABIP! zHR rate was closer than actual on 12 of the 23 overachievers and 15 of 23 underachievers. The MAEs were also narrower wins, with 2.1% versus 2.6% on the first group and 1.3% versus 1.7% on the second. Ohtani almost singlehandedly won it for the first group with his homer surge!
| Name | BB | zBB | zBB Diff | BB% (6/8) | zBB% (6/8) | BB% Since | 
|---|---|---|---|---|---|---|
| Jake Cronenworth | 32 | 19.6 | 12.4 | 12.5% | 7.6% | 5.5% | 
| Ryan Noda | 42 | 31.5 | 10.5 | 19.8% | 14.9% | 14.0% | 
| Ian Happ | 45 | 35.0 | 10.0 | 17.0% | 13.2% | 15.4% | 
| Byron Buxton | 26 | 17.5 | 8.5 | 12.3% | 8.3% | 6.7% | 
| Joey Gallo | 25 | 17.2 | 7.8 | 15.2% | 10.4% | 10.4% | 
| Juan Soto | 56 | 48.4 | 7.6 | 20.9% | 18.1% | 19.2% | 
| Myles Straw | 21 | 14.0 | 7.0 | 9.4% | 6.3% | 6.9% | 
| Hunter Renfroe | 17 | 10.1 | 6.9 | 6.9% | 4.1% | 8.0% | 
| Willson Contreras | 24 | 17.1 | 6.9 | 10.1% | 7.2% | 9.4% | 
| Nico Hoerner | 17 | 10.3 | 6.7 | 6.9% | 4.2% | 5.6% | 
| Matt Olson | 44 | 37.6 | 6.4 | 15.7% | 13.4% | 11.3% | 
| José Abreu | 17 | 10.6 | 6.4 | 6.7% | 4.2% | 6.8% | 
| Kyle Schwarber | 44 | 37.7 | 6.3 | 16.6% | 14.2% | 15.4% | 
| Andrew McCutchen | 35 | 29.1 | 5.9 | 15.5% | 12.9% | 17.1% | 
| Adley Rutschman | 45 | 39.3 | 5.7 | 16.5% | 14.4% | 9.6% | 
| Josh Bell | 30 | 24.3 | 5.7 | 13.0% | 10.5% | 9.0% | 
| Alejandro Kirk | 20 | 14.7 | 5.3 | 11.3% | 8.3% | 7.0% | 
| Will Smith | 28 | 22.8 | 5.2 | 14.8% | 12.1% | 12.6% | 
| Miguel Cabrera | 12 | 6.8 | 5.2 | 10.7% | 6.1% | 10.0% | 
| MJ Melendez | 26 | 21.0 | 5.0 | 11.1% | 8.9% | 7.7% | 
| Jean Segura | 15 | 10.1 | 4.9 | 7.4% | 5.0% | 5.6% |

| Name | BB | zBB | zBB Diff | BB% (6/8) | zBB% (6/8) | BB% Since |
|---|---|---|---|---|---|---|
| Esteury Ruiz | 10 | 20.0 | -10.0 | 3.7% | 7.3% | 3.8% | 
| Luis García | 13 | 21.5 | -8.5 | 5.6% | 9.3% | 4.5% | 
| Mike Yastrzemski | 13 | 20.8 | -7.8 | 7.6% | 12.2% | 13.8% | 
| Connor Joe | 21 | 28.7 | -7.7 | 10.7% | 14.6% | 7.8% | 
| Francisco Alvarez | 7 | 14.5 | -7.5 | 5.1% | 10.6% | 8.3% | 
| Corey Julks | 4 | 10.6 | -6.6 | 2.4% | 6.3% | 12.4% | 
| Bo Bichette | 13 | 19.5 | -6.5 | 4.5% | 6.8% | 3.7% | 
| J.D. Martinez | 10 | 16.5 | -6.5 | 4.9% | 8.1% | 9.6% | 
| Bryan Reynolds | 21 | 27.5 | -6.5 | 8.3% | 10.9% | 7.3% | 
| Ozzie Albies | 14 | 20.1 | -6.1 | 5.4% | 7.8% | 8.7% | 
| TJ Friedl | 10 | 16.1 | -6.1 | 6.3% | 10.2% | 8.7% | 
| Rafael Devers | 16 | 21.6 | -5.6 | 6.2% | 8.4% | 10.8% | 
| Michael A. Taylor | 9 | 14.6 | -5.6 | 5.3% | 8.6% | 4.8% | 
| Emmanuel Rivera | 5 | 10.1 | -5.1 | 5.1% | 10.2% | 7.4% | 
| Jorge Soler | 24 | 28.9 | -4.9 | 9.8% | 11.7% | 12.7% | 
| Leody Taveras | 13 | 17.9 | -4.9 | 7.1% | 9.8% | 4.6% | 
| Taylor Ward | 18 | 22.9 | -4.9 | 7.3% | 9.3% | 12.5% | 
| Bobby Witt Jr. | 11 | 15.6 | -4.6 | 4.2% | 5.9% | 6.1% | 
| Ronald Acuña Jr. | 29 | 33.3 | -4.3 | 10.2% | 11.8% | 12.9% | 
Unusually, ZiPS did significantly better at pegging overachievers than underachievers. You don’t see it in the basic win totals (13 of 21 and 11 of 19), but zBB had a significant advantage in MAE for the overachievers (2.0% versus 3.0%) and was very close for the underachievers (2.9% versus 3.1%). zBB shouldn’t be kicking butt here; the value of zBB is in addition to actual walk rate, not instead of it. zBB and actual BB are a big gain when used in tandem.
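The "in tandem" point can be illustrated with a rough sketch using the 19 walk underachievers from the table above. The equal-weight blend here is purely an assumption for illustration; it is not how the full ZiPS model combines zBB with actual walk rate, and the names `mae` and `WEIGHT_ON_Z` are hypothetical.

```python
# (name, BB% through 6/8, zBB% through 6/8, BB% since), copied from the
# walk underachievers table above.
ROWS = [
    ("Esteury Ruiz", 3.7, 7.3, 3.8), ("Luis García", 5.6, 9.3, 4.5),
    ("Mike Yastrzemski", 7.6, 12.2, 13.8), ("Connor Joe", 10.7, 14.6, 7.8),
    ("Francisco Alvarez", 5.1, 10.6, 8.3), ("Corey Julks", 2.4, 6.3, 12.4),
    ("Bo Bichette", 4.5, 6.8, 3.7), ("J.D. Martinez", 4.9, 8.1, 9.6),
    ("Bryan Reynolds", 8.3, 10.9, 7.3), ("Ozzie Albies", 5.4, 7.8, 8.7),
    ("TJ Friedl", 6.3, 10.2, 8.7), ("Rafael Devers", 6.2, 8.4, 10.8),
    ("Michael A. Taylor", 5.3, 8.6, 4.8), ("Emmanuel Rivera", 5.1, 10.2, 7.4),
    ("Jorge Soler", 9.8, 11.7, 12.7), ("Leody Taveras", 7.1, 9.8, 4.6),
    ("Taylor Ward", 7.3, 9.3, 12.5), ("Bobby Witt Jr.", 4.2, 5.9, 6.1),
    ("Ronald Acuña Jr.", 10.2, 11.8, 12.9),
]


def mae(predictions, outcomes):
    """Mean absolute error, in percentage points of walk rate."""
    return sum(abs(p - o) for p, o in zip(predictions, outcomes)) / len(outcomes)


# ASSUMPTION: a 50/50 blend, chosen only to illustrate the idea.
WEIGHT_ON_Z = 0.5

actual = [row[1] for row in ROWS]
z = [row[2] for row in ROWS]
since = [row[3] for row in ROWS]
blended = [WEIGHT_ON_Z * zb + (1 - WEIGHT_ON_Z) * bb for bb, zb in zip(actual, z)]

mae_actual = mae(actual, since)
mae_z = mae(z, since)
mae_blend = mae(blended, since)
print(f"BB% alone:  {mae_actual:.1f}")
print(f"zBB% alone: {mae_z:.1f}")
print(f"50/50 mix:  {mae_blend:.1f}")
```

On these 19 rows, even this crude mix comes out at roughly 2.7 points, against 2.9 for zBB% alone and 3.1 for BB% alone. The two estimates often miss in opposite directions, so averaging them cancels some of the error.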
| Name | SO | zSO | zSO Diff | SO% (6/8) | zSO% (6/8) | SO% Since | 
|---|---|---|---|---|---|---|
| Masataka Yoshida | 24 | 38.6 | -14.6 | 9.9% | 15.9% | 13.8% | 
| Brent Rooker | 60 | 74.1 | -14.1 | 26.9% | 33.2% | 39.2% | 
| Esteury Ruiz | 47 | 61.0 | -14.0 | 17.2% | 22.3% | 21.9% | 
| Andrew McCutchen | 44 | 57.3 | -13.3 | 19.5% | 25.4% | 22.4% | 
| Max Muncy | 68 | 81.1 | -13.1 | 28.6% | 34.1% | 24.7% | 
| Paul Goldschmidt | 57 | 69.9 | -12.9 | 20.9% | 25.6% | 22.1% | 
| Randy Arozarena | 60 | 72.8 | -12.8 | 22.9% | 27.8% | 25.9% | 
| Harold Ramírez | 36 | 48.5 | -12.5 | 19.3% | 25.9% | 17.7% | 
| Pete Alonso | 51 | 63.3 | -12.3 | 19.5% | 24.3% | 22.3% | 
| Javier Baez | 49 | 60.6 | -11.6 | 20.5% | 25.4% | 23.2% | 
| Gleyber Torres | 33 | 44.1 | -11.1 | 12.3% | 16.5% | 14.4% | 
| Bryan Reynolds | 47 | 58.0 | -11.0 | 18.7% | 23.0% | 23.0% | 
| Andrew Vaughn | 47 | 57.7 | -10.7 | 18.1% | 22.3% | 23.6% | 
| Jose Siri | 43 | 53.4 | -10.4 | 32.8% | 40.8% | 39.4% | 
| Harrison Bader | 13 | 23.1 | -10.1 | 13.7% | 24.3% | 17.1% | 
| C.J. Cron | 35 | 45.0 | -10.0 | 23.6% | 30.4% | 20.2% | 
| Christian Walker | 47 | 56.9 | -9.9 | 19.3% | 23.3% | 18.8% | 
| Adam Frazier | 23 | 32.4 | -9.4 | 10.4% | 14.7% | 15.0% | 
| Starling Marte | 40 | 49.4 | -9.4 | 17.8% | 22.0% | 24.0% | 
| Nick Castellanos | 63 | 72.3 | -9.3 | 24.0% | 27.6% | 29.8% | 
| José Ramírez | 22 | 31.2 | -9.2 | 8.4% | 12.0% | 11.7% | 
| Andrés Giménez | 37 | 46.2 | -9.2 | 15.9% | 19.9% | 20.8% | 
| Will Smith | 19 | 28.2 | -9.2 | 10.1% | 14.9% | 21.3% | 
| William Contreras | 41 | 50.1 | -9.1 | 20.4% | 24.9% | 17.5% |

| Name | SO | zSO | zSO Diff | SO% (6/8) | zSO% (6/8) | SO% Since |
|---|---|---|---|---|---|---|
| Jake Cronenworth | 60 | 41.1 | 18.9 | 23.3% | 16.0% | 13.6% | 
| Brandon Marsh | 64 | 49.4 | 14.6 | 31.1% | 24.0% | 28.3% | 
| Thairo Estrada | 50 | 35.6 | 14.4 | 22.5% | 16.0% | 27.1% | 
| Lane Thomas | 64 | 51.4 | 12.6 | 25.0% | 20.1% | 27.5% | 
| Taylor Ward | 54 | 41.6 | 12.4 | 22.0% | 16.9% | 15.5% | 
| Jarred Kelenic | 76 | 63.6 | 12.4 | 32.6% | 27.3% | 32.6% | 
| Jarren Duran | 52 | 40.2 | 11.8 | 29.1% | 22.5% | 19.5% | 
| DJ LeMahieu | 59 | 47.4 | 11.6 | 26.7% | 21.4% | 16.8% | 
| Brandon Belt | 63 | 51.5 | 11.5 | 36.4% | 29.8% | 31.2% | 
| Nathaniel Lowe | 57 | 46.0 | 11.0 | 20.7% | 16.7% | 23.2% | 
| Myles Straw | 44 | 33.6 | 10.4 | 19.7% | 15.1% | 20.1% | 
| Josh Jung | 67 | 57.2 | 9.8 | 27.0% | 23.1% | 31.9% | 
| Anthony Volpe | 74 | 64.3 | 9.7 | 30.6% | 26.6% | 23.9% | 
| Brice Turang | 48 | 38.4 | 9.6 | 27.1% | 21.7% | 13.9% | 
| Seiya Suzuki | 53 | 44.2 | 8.8 | 26.1% | 21.8% | 23.8% | 
| Ha-Seong Kim 김하성 | 53 | 44.5 | 8.5 | 24.7% | 20.7% | 14.6% | 
| Byron Buxton | 61 | 52.6 | 8.4 | 28.8% | 24.8% | 35.6% | 
More minor wins for zSO, both in the simple head-to-head matchups (15 of 24 and 9 of 17) and in MAE (3.5% versus 4.3% and 5.3% versus 5.8%). I was especially interested in Jake Cronenworth’s results because he was a weird outlier in these numbers as both the biggest walk overachiever and the biggest strikeout underachiever. And both of these numbers did rebound considerably toward what ZiPS expected. Had they not, I’d really have had to dig further into his approach to see why his results were so atypical given his plate discipline numbers.
Tomorrow, I’ll have the updated hitter numbers through August 8.
Dan Szymborski is a senior writer for FanGraphs and the developer of the ZiPS projection system. He was a writer for ESPN.com from 2010-2018, a regular guest on a number of radio shows and podcasts, and a voting BBWAA member. He also maintains a terrible Twitter account at @DSzymborski.