April Hitting Stats Mean Nothing… Except When They Kinda Do
by Dan Szymborski
April 12, 2021

As part of my exhausting shtick, I like to respond “April!” to questions in my chats involving player performances in the season’s early going. This is effective shorthand when someone wants to know if, say, George Springer is a bust because he’s put up a .480 OPS in his first two weeks in the majors. It’s also dead wrong. April stats, in their proper context, are meaningful.

“But Dan, a few weeks of baseball is a tiny sample!” That’s correct, but you have to take into consideration the underlying reasons projections can prove to be inaccurate. It’s not just that things change, though they do (pitcher X learns a sweet knuckle-curve, or batter Y realizes that not hitting everything into the ground might be good); it’s that it’s challenging to gauge where players stand in the first place.

Players’ stats themselves aren’t even perfect at this. Tim Anderson hit .322 in 2020, but that doesn’t actually mean his mean batting average projection should have been .322. We don’t actually know if a theoretical player was “truly” a .322 hitter, a .312 hitter who got lucky, a very unlucky .342 hitter, or a .252 hitter who made a deal with a supernatural or extraterrestrial entity. A .300 hitter isn’t observed; they’re inferred.

The way most, if not all, in-season projections (or any projections, really) function is by applying what we call Bayesian inference. We won’t get into a full-blown math class, but in essence, it simply means that we update our hypotheses to take new data into account. And for players, data comes in all the time: every pitch or swing of the bat is new information about a player. It’s valuable information, too, as only the last handful of seasons have much predictive value and recent performance is the most useful.

We only have about two weeks of baseball in the hopper, but projections for some players have already shifted by considerable margins.
I’m using ZiPS here (it’s very convenient to use the projection system I own and run), but this will likely hold true for other projections that update in-season, though the magnitude of the shift may vary:

Largest Positive 2021 OPS Projection Changes

| Name | RoS BA | RoS OBP | RoS SLG | RoS OPS | Preseason OPS | Difference |
|---|---|---|---|---|---|---|
| Byron Buxton | .268 | .312 | .532 | .844 | .796 | 0.048 |
| Yermín Mercedes | .260 | .318 | .437 | .756 | .712 | 0.044 |
| J.D. Martinez | .274 | .347 | .516 | .863 | .826 | 0.037 |
| Tyler Naquin | .254 | .295 | .446 | .741 | .709 | 0.032 |
| Chris Owings | .258 | .311 | .441 | .751 | .720 | 0.031 |
| Jonathan Lucroy | .258 | .318 | .382 | .699 | .673 | 0.026 |
| Cedric Mullins II | .256 | .307 | .404 | .711 | .686 | 0.025 |
| Nelson Cruz | .284 | .365 | .567 | .933 | .911 | 0.022 |
| Jared Walsh | .250 | .311 | .497 | .809 | .787 | 0.022 |
| Omar Narváez | .254 | .344 | .414 | .758 | .736 | 0.022 |
| Pablo Sandoval | .239 | .295 | .404 | .700 | .678 | 0.022 |
| Phillip Evans | .263 | .331 | .411 | .741 | .720 | 0.021 |
| Tucker Barnhart | .244 | .324 | .395 | .719 | .699 | 0.020 |
| Akil Baddoo | .210 | .277 | .353 | .630 | .610 | 0.020 |
| Roberto Pérez | .205 | .296 | .356 | .652 | .633 | 0.019 |
| Wilson Ramos | .270 | .323 | .421 | .744 | .725 | 0.019 |
| Evan Longoria | .251 | .303 | .408 | .712 | .693 | 0.019 |
| Zach McKinstry | .231 | .291 | .387 | .678 | .659 | 0.019 |
| Mike Trout | .289 | .428 | .610 | 1.038 | 1.020 | 0.018 |
| Will Smith | .243 | .342 | .494 | .836 | .818 | 0.018 |

Largest Negative 2021 OPS Projection Changes

| Name | RoS BA | RoS OBP | RoS SLG | RoS OPS | Preseason OPS | Difference |
|---|---|---|---|---|---|---|
| Marcell Ozuna | .280 | .350 | .517 | .867 | .891 | -0.024 |
| Miguel Sanó | .225 | .319 | .510 | .829 | .852 | -0.023 |
| Martín Maldonado | .206 | .286 | .347 | .633 | .656 | -0.023 |
| Rowdy Tellez | .253 | .317 | .484 | .801 | .824 | -0.023 |
| Tommy Pham | .251 | .346 | .402 | .748 | .769 | -0.021 |
| Joey Votto | .236 | .338 | .372 | .710 | .731 | -0.021 |
| Jorge Polanco | .272 | .326 | .427 | .754 | .773 | -0.019 |
| Joc Pederson | .235 | .323 | .455 | .778 | .797 | -0.019 |
| Matt Chapman | .240 | .318 | .507 | .825 | .843 | -0.018 |
| Roman Quinn | .216 | .279 | .337 | .616 | .634 | -0.018 |
| Ozzie Albies | .280 | .328 | .496 | .824 | .841 | -0.017 |
| Sean Murphy | .234 | .304 | .416 | .720 | .737 | -0.017 |
| C.J. Cron | .269 | .340 | .526 | .866 | .883 | -0.017 |
| Elvis Andrus | .243 | .285 | .363 | .648 | .665 | -0.017 |
| Mitch Moreland | .231 | .305 | .418 | .723 | .740 | -0.017 |
| Anthony Alford | .211 | .280 | .350 | .630 | .647 | -0.017 |
| Gleyber Torres | .281 | .362 | .513 | .874 | .890 | -0.016 |
| Aaron Hicks | .233 | .355 | .419 | .774 | .790 | -0.016 |
| Alejandro Kirk | .254 | .323 | .417 | .740 | .756 | -0.016 |
| Jake Cave | .251 | .310 | .437 | .747 | .763 | -0.016 |

Probably the most fun name in the improvement category is Yermín Mercedes of the Chicago White Sox. A solid hitter in the minors but one with a poor defensive reputation behind the plate, Mercedes wasn’t considered a top prospect (he was 15th on Eric Longenhagen’s White Sox list this year). But the injury to Eloy Jiménez opened up an opportunity for someone to seize the DH job, and Mercedes has. Starting seven of the team’s nine games at DH, he’s gotten off to a blistering start, going 15-for-28 with five extra-base hits.

It’s easy for people to dismiss comic book numbers, and a 298 wRC+ definitely fits into that category. But it’s those outlandish performances that are most likely to suggest a performance improvement. A .300 batter hitting .320 for a week seems “normal,” but it’s also less likely to represent a change in abilities than the same player hitting .800 for a week.

In this case, ZiPS and Steamer both give Mercedes a wRC+ boost of 13 points based on just these seven games of data. It sounds like a lot, but that’s a product of our uncertainty in making projections. Coming into the season, ZiPS was only 50% sure that Mercedes would have a wRC+ between 77 and 102. Think of a hurricane being tracked in the Caribbean, where even the slightest change in direction or timing can be the difference between the storm running smack-dab into the Atlantic coast or churning (mostly) harmlessly out into empty waters. In Mercedes’ case, with only a single plate appearance a year ago, his seven games in 2021 make up roughly 10% of his rest-of-season projection.
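
To make the updating idea concrete, here is a minimal sketch of Bayesian updating for batting average using a simple beta-binomial model. This is not how ZiPS works internally; the function names and the "prior at-bats" figures (1,000 for an established hitter, 250 for a player with little track record) are assumptions chosen purely for illustration.

```python
def updated_average(prior_avg, prior_ab, hits, at_bats):
    """Posterior mean batting average under a beta-binomial model.

    prior_ab is a hypothetical "equivalent at-bats" measure of how much
    we trust the prior; it is an illustrative assumption, not a ZiPS weight.
    """
    alpha = prior_avg * prior_ab + hits                    # prior hits + observed hits
    beta = (1 - prior_avg) * prior_ab + (at_bats - hits)   # prior outs + observed outs
    return alpha / (alpha + beta)


def new_data_weight(prior_ab, at_bats):
    """Share of the updated estimate that comes from the new sample."""
    return at_bats / (prior_ab + at_bats)


# Established .300 hitter (assumed prior worth 1,000 at-bats), one 25 AB week.
normal_week = updated_average(0.300, 1000, hits=8, at_bats=25)   # a .320 week
hot_week = updated_average(0.300, 1000, hits=20, at_bats=25)     # an .800 week

# Same hot week, but for a player we know little about (assumed 250 AB prior).
uncertain_hot_week = updated_average(0.300, 250, hits=20, at_bats=25)

print(f"{normal_week:.3f} {hot_week:.3f} {uncertain_hot_week:.3f}")
print(f"{new_data_weight(250, 25):.1%}")
```

In this toy setup, the .320 week barely moves the estimate, the .800 week moves it by about 12 points of batting average, and the same .800 week moves the less-established player by about 45 points. With the assumed 250 at-bat prior, the 25 new at-bats carry about 9% of the weight, which is the same flavor of result as the share described for Mercedes, though the inputs here are made up.
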

While the weights ZiPS uses are derived from baseball history, sometimes recent examples are the most effective illustrations. So let’s go back to 2019, the last season of normal length, and look at how the players with the biggest changes in their projections at this point in that season fared the rest of the year.

I looked at the preseason OPS projections and the RoS OPS projections through April 8, 2019 and compared them to the actual results for the 233 players who qualified for our splits leaderboards. From April 9 to the end of the season, using the updated OPS projection rather than the preseason one improved the average absolute error from 68 points of OPS to 62 points. The in-season model “expects” to come closer to the actual result than the preseason projection about 57% of the time. In 2019, that was 55.8% (130 of 233).

But the larger question is: does the model actually work well for the outliers? Is it too generous to players like Mercedes and too miserly to players like Marcell Ozuna? Again, let’s look back at 2019’s largest projection changes at a similar point in the season.

ZiPS Projection Outliers Through April 8, 2019 (Overachievers)

| Player | April 8 OPS | Preseason | RoS Projection | Difference | Actual RoS |
|---|---|---|---|---|---|
| Tim Beckham | 1.314 | .679 | .722 | 0.043 | .665 |
| Cody Bellinger | 1.468 | .889 | .923 | 0.033 | .995 |
| Austin Barnes | 1.328 | .699 | .732 | 0.033 | .559 |
| Alex Avila | 1.324 | .688 | .720 | 0.031 | .716 |
| Kolten Wong | 1.235 | .732 | .762 | 0.030 | .750 |
| Anthony Rendon | 1.412 | .843 | .869 | 0.026 | .983 |
| Domingo Santana | 1.071 | .759 | .783 | 0.024 | .732 |
| Rhys Hoskins | 1.446 | .851 | .875 | 0.024 | .783 |
| Trey Mancini | 1.252 | .762 | .785 | 0.023 | .874 |
| Josh Phegley | .942 | .642 | .665 | 0.023 | .674 |
| Mike Trout | 1.508 | 1.023 | 1.045 | 0.022 | 1.052 |
| Tim Anderson | 1.326 | .685 | .706 | 0.021 | .837 |
| Christian Yelich | 1.340 | .901 | .922 | 0.021 | 1.078 |
| Clint Frazier | 1.292 | .767 | .787 | 0.020 | .756 |
| Dansby Swanson | 1.169 | .699 | .719 | 0.020 | .719 |
| Martin Prado | 1.141 | .647 | .667 | 0.020 | .504 |
| Brian Goodwin | 1.092 | .664 | .684 | 0.020 | .778 |
| Jorge Polanco | 1.116 | .717 | .737 | 0.019 | .826 |
| Adam Jones | 1.105 | .742 | .761 | 0.019 | .690 |
| Edwin Encarnacion | 1.142 | .782 | .801 | 0.019 | .850 |

ZiPS Projection Outliers Through April 8, 2019 (Underachievers)

| Player | April 8 OPS | Preseason | RoS Projection | Difference | Actual RoS |
|---|---|---|---|---|---|
| Jesse Winker | .157 | .800 | .773 | -0.028 | .881 |
| Jurickson Profar | .350 | .726 | .700 | -0.026 | .754 |
| Mike Zunino | .198 | .682 | .658 | -0.024 | .586 |
| Chris Davis | .125 | .673 | .650 | -0.024 | .649 |
| Roberto Perez | .249 | .600 | .577 | -0.022 | .798 |
| Yasiel Puig | .354 | .825 | .803 | -0.022 | .809 |
| Garrett Hampson | .151 | .752 | .730 | -0.022 | .736 |
| Jesus Aguilar | .399 | .825 | .805 | -0.020 | .748 |
| Eduardo Escobar | .439 | .798 | .779 | -0.019 | .857 |
| Josh Donaldson | .554 | .879 | .861 | -0.019 | .923 |
| Kevin Pillar | .303 | .700 | .682 | -0.018 | .745 |
| Jackie Bradley Jr. | .416 | .754 | .736 | -0.018 | .764 |
| Ian Desmond | .369 | .725 | .708 | -0.018 | .826 |
| Grayson Greiner | .321 | .602 | .585 | -0.017 | .594 |
| Danny Jansen | .379 | .717 | .700 | -0.017 | .662 |
| Brandon Nimmo | .395 | .761 | .745 | -0.017 | .847 |
| Yolmer Sanchez | .122 | .678 | .661 | -0.017 | .664 |
| Brian Dozier | .340 | .787 | .772 | -0.016 | .801 |
| Yadier Molina | .458 | .709 | .693 | -0.015 | .737 |
| Brian Anderson | .422 | .743 | .729 | -0.014 | .846 |

In this case, ZiPS RoS did a little better than expected with the overachievers (closer on 13 of 20) and worse than expected with the underachievers (seven of 20). Rather than hitting on the 22-23 players the model expected, it only hit on 20 of the 40 outliers. However, if we expand our search to the top 20 positive and negative outliers from 2004-19, the RoS model was closer to what actually happened for 362 of the 640 players (56.6%).

In other words, yes, these big leaps forward and backward for individual players have real meaning. Yermín Mercedes will not finish the season hitting .500. He’s unlikely to get MVP votes or make people forget about Eloy Jiménez. But if you’re not a little more optimistic about him than you were two weeks ago, even after just seven games, you’re doing it wrong.
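
For readers who want to run this kind of check themselves, the "which projection was closer" tally and the average absolute error both reduce to a few lines. The following is a minimal sketch using five rows copied from the 2019 tables; the `Row` class and the function names are hypothetical, and this is only an illustration of the comparison, not the actual ZiPS evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Row:
    player: str
    preseason: float  # preseason OPS projection
    ros: float        # updated rest-of-season OPS projection
    actual: float     # actual rest-of-season OPS


# Five rows taken from the 2019 outlier tables.
ROWS = [
    Row("Cody Bellinger", 0.889, 0.923, 0.995),
    Row("Austin Barnes", 0.699, 0.732, 0.559),
    Row("Mike Trout", 1.023, 1.045, 1.052),
    Row("Jesse Winker", 0.800, 0.773, 0.881),
    Row("Mike Zunino", 0.682, 0.658, 0.586),
]


def mean_abs_error(rows, attr):
    """Average absolute miss of the chosen projection ('preseason' or 'ros')."""
    return sum(abs(getattr(r, attr) - r.actual) for r in rows) / len(rows)


def ros_closer_count(rows):
    """Number of players for whom the updated projection missed by less."""
    return sum(abs(r.ros - r.actual) < abs(r.preseason - r.actual) for r in rows)


print(ros_closer_count(ROWS), "of", len(ROWS))
print(round(mean_abs_error(ROWS, "preseason"), 4), round(mean_abs_error(ROWS, "ros"), 4))
```

Ties count as a miss for the updated projection in this sketch (a strict `<` comparison), which is a design choice rather than anything stated about the original study.
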