Musings on Minor League Home Runs
by Ben Clemens
July 16, 2019

One of the most significant stories of the past few years of baseball has been the changing composition of the baseball. I mean story in the grand narrative sense, but I also mean it in the sense of literal stories. The slice-open-a-baseball-and-catalog-its-contents article has gotten very popular over the last few years, as have studies of drag and bounciness. Physics is having a moment in baseball analysis.

One of the frustrating parts of breaking down the new ball’s effect on offense is that there was no clean way to isolate which hitters were helped most. Many batters adapted their swings to the new ball as the ball kept changing. The ball undoubtedly led to more home runs, as an independent panel reported in 2018. Which batters, though, were the biggest beneficiaries of the new ball? The launch angle revolution might increase batters’ home run rates, but surely a lot of its popularity comes down to the fact that home runs started flying out of the park at elevated rates as the ball changed. Separating cause from effect as swings and baseballs change over years’ worth of games is difficult. Our own Jeff Sullivan took a shot at it in 2016, but didn’t find much evidence for one group over another.

Luckily, this season provides a convenient, natural experiment. To much fanfare, Triple-A switched over to using the major league ball this year. Home runs predictably skyrocketed, driven by the new ball. This creates an opportunity for all kinds of research, such as this attempt by Baseball America to test out whether the ball changes fastball velocity. I thought I’d take the occasion of this new ball to investigate something I’ve always wanted to know about the home run surge: what types of hitters does the new ball help most?

The conventional wisdom around the home run surge is that the hitters who benefit the most are smaller-stature contact hitters.
Call it the Altuve effect: the Astros second baseman had never put up an HR/FB% above 5.1% until the baseball changed, and he promptly developed 20-homer power. The likes of Giancarlo Stanton and Aaron Judge hit the ball so far over the wall, the theory goes, that they don’t need the help, whereas Altuve can use every inch of carry he gets. That’s a great theory, but it’s tremendously tough to verify, because all of these players kept changing their swings while the ball changed. Good luck figuring out cause and effect through all that noise.

In the case of Triple-A, though, it’s not nearly that complex. Rather than a baseball that slowly got more lively over time, we have a state change; the minor league home run rate didn’t budge for years before exploding in 2019. 5.6% of batted balls are turning into home runs in the PCL this year, up from 3.6% in 2018. It’s the same story in the International League, where that rate has spiked from 3.2% to 5.1%. The change is sudden and sharp, which means we can look to see who was affected the most, ideally with less noise.

The bottom of this article contains a more detailed explanation of my methodology, but let me go over it quickly here. I took the batting lines of every hitter who appeared in both 2018 and 2019 on the same team and looked for changes in the rate of fly balls they turned into home runs. To get an idea of what we might expect from each hitter in 2019 in the absence of a new ball, I regressed their 2018 HR/FB to the mean and compared that to what they’ve done in 2019. In all cases where I divided players into groups, each group has an equal number of players, and the statistics in that group are weighted by 2019 fly balls. Onto the data!

First things first: HR/FB rate has ticked up across our group of Triple-A repeaters, from 10.2% in 2018 to 17.3% in 2019.
While the group is a year older, and that could help home run rates, there’s not really any evidence that players who repeat Triple-A see their HR/FB tick up in the second year. Returning minor leaguers from 2016 to 2017 saw their HR/FB increase by about 1%, while the same classification of players saw their HR/FB fall by just over 1% from 2017 to 2018.

This is a good result, because it’s what we expected. The home run surge isn’t due to an influx of power hitters. It’s due to the same hitters hitting more home runs. This is a pretty obvious conclusion, but it’s still good to check.

Next, let’s make a simple cut down the middle. How did the bottom half of 2018 home run producers do relative to the top half?

2018-2019 Home Run Changes, 2 Groups

| Group | 2018 rHR/FB | 2019 HR/FB | HR/FB Increase | StDev | Fly Balls | Weighted Age |
|---|---|---|---|---|---|---|
| Low HR | 6.80% | 13.70% | 6.90% | 0.67% | 2615 | 25.9 |
| High HR | 13.60% | 21.50% | 7.90% | 0.83% | 2475 | 25.9 |

Can’t say much about that data. The top half increased their power by slightly more, but not by enough that it looks significant. The samples are large enough that we wouldn’t need a huge difference to feel confident that one group or the other was being helped out more, but honestly, there’s just not much to say. Let’s try this again, only with a bit more granularity this time. Instead of two groups, let’s chop things up into four. Any better?

2018-2019 Home Run Changes, 4 Groups

| Quartile | 2018 rHR/FB | 2019 HR/FB | HR/FB Increase | StDev | Fly Balls | Weighted Age |
|---|---|---|---|---|---|---|
| First | 5.30% | 11.80% | 6.50% | 0.88% | 1336 | 26.6 |
| Second | 8.40% | 15.60% | 7.20% | 1.02% | 1279 | 25.2 |
| Third | 11.30% | 17.90% | 6.60% | 1.09% | 1226 | 26.1 |
| Fourth | 15.80% | 25.10% | 9.30% | 1.23% | 1249 | 25.7 |

Hmm. This almost looks like noise. You could argue that there’s a generally increasing pattern, that the hitters who were already hitting more home runs got a little more boost from the new ball, but it’s nowhere near conclusive.
In my old life, I spent a lot of time synthesizing data and trying to determine whether something was signal or noise, and if I saw this data, I’d probably call it quits there. That said, hey, we have the database. Let’s do some more slicing! Four slices was no good; it didn’t look like any quartile was particularly advantaged. How about with 10?

2018-2019 Home Run Changes, 10 Groups

| Decile | 2018 rHR/FB | 2019 HR/FB | HR/FB Increase | StDev | Fly Balls | Weighted Age |
|---|---|---|---|---|---|---|
| First | 3.97% | 7.85% | 3.88% | 1.21% | 497 | 27.3 |
| Second | 5.86% | 13.94% | 8.08% | 1.48% | 545 | 26.5 |
| Third | 6.71% | 13.40% | 6.69% | 1.57% | 470 | 24.8 |
| Fourth | 7.70% | 14.99% | 7.29% | 1.65% | 467 | 25.0 |
| Fifth | 9.18% | 16.38% | 7.20% | 1.53% | 586 | 25.8 |
| Sixth | 10.38% | 18.28% | 7.90% | 1.79% | 465 | 25.6 |
| Seventh | 11.38% | 15.67% | 4.29% | 1.60% | 517 | 26.4 |
| Eighth | 12.65% | 22.25% | 9.60% | 1.91% | 472 | 26.7 |
| Ninth | 14.43% | 25.12% | 10.69% | 1.76% | 605 | 25.7 |
| Tenth | 18.61% | 26.39% | 7.78% | 2.04% | 466 | 24.8 |

Is this... is this a result?? Weirdness in groups seven, eight, and nine aside, one thing sticks out: players with the very lowest HR/FB rates in 2018 got the least help from the new ball. That could well be real. This group contains the Pete Kozmas, Magneuris Sierras, and Myles Straws of the world, guys whose power is best described as sub-warning-track. Kozma has a career 2.4% HR/FB in the major leagues; Straw’s best power performance is two home runs in 600 PA across three levels last year. Anecdotes are obviously not the right way to come to conclusions, but come on, no baseball could help Pete Kozma hit home runs, right?

A word of caution: we’re getting into small sample size territory here, though HR/FB stabilizes reasonably quickly, so the standard deviations aren’t unworkably large or anything. Still, we’re slicing this into 14 players a group, meaning one player could tilt things a decent amount. Just as an example, the lowest decile features 67 fly balls by career minor leaguer Sean Kazmar. Kazmar hit only a single home run in 123 fly balls last year, by far the lowest power production of his career.
This year, he’s hit a more reasonable nine home runs. Exclude him from that group, and their improvement craters by a full 1%. The point is, this broken-down analysis isn’t foolproof; we should be careful in drawing conclusions.

I’m hunting for an exciting answer as much as the next person. I was hoping for some obvious shape, for a giant bulge in home run rate increase for guys with warning track power. Maybe some kind of mixed model that does a cleaner job of isolating year-one talent could find a better relationship between pre-existing power and the benefit of the new ball, but I’m not holding my breath. Maybe more precise batted ball data (angle and distance from TrackMan, say) could find players whose batted ball mix is more likely to be helped by a few extra feet of carry. The pure home run data, however, just isn’t enough to say anything with any certainty.

Viewed in one light, this is a frustrating result. I want there to be a pattern! The theory that middling-power players are helped the most makes so much sense that I want it to be true. The lack of anything tremendously obvious aside from the lowest-power few players legitimately having no home run power means that there probably aren’t any easily-found takeaways. Stupid new ball, with its lack of slam-dunk winners and losers.

In another light, though, what a wonderful takeaway. Minor league data always feels remote and inaccessible. I had no idea what I was going to find: a power surge driven only by slap hitters, or vast swaths of the league staying constant while a select few hitters benefited. It turns out, though, that wasn’t the case. Minor league hitters improved in a predictable and reasonably uniform fashion with the new ball. The data might be harder to find, but it’s telling us things we expect about baseball. As someone who interacts with the minors mostly indirectly, that’s very gratifying to me.
Methodology

I’m pretty proud of the methodology involved in doing this study, because I had to invent it as there wasn’t an obvious template to copy. But that also means that there’s some chance I goofed up somewhere. For all the good ideas and clever little statistical workarounds, I give full credit to Sean Dolinar, who helped me out immensely. For everything that’s messed up, that’s on me. Let’s get down to business.

First, I took the full population of all non-pitcher batters in Triple-A in 2018 and 2019. I then filtered out everyone who didn’t appear on the same team in both seasons. I wanted to get rid of park effects; this obviously doesn’t completely solve them, but having the same park both years helps. I also threw out the Triple-A affiliates of the Mets, A’s, and Brewers, as all three moved to parks with tremendously different elevations. I also threw out everyone with five or fewer fly balls in 2018, because you have to draw a line somewhere, and my projection of their HR/FB was going to be basically league average (due to the regression), which felt wrong for the purposes of this analysis.

To create regressed HR/FB data for every player in 2018, I used the inverse variance formula. I worked out the variance of the population as a whole, then worked out each player’s individual variance based on binomial variance and the number of fly balls they produced. Rather than use each player’s HR/FB% to produce the variance, I used the overall sample probability. This avoided problems with players who had observed 0% HR/FB. I’m not confident this is right, but it seemed like the best option.

When it comes to what mean was correct to regress back to, I looked at a few options, but settled on regressing back to the mean of players who repeated Triple-A from 2018 to 2019. There’s useful information in that; I don’t think it would be appropriate to regress to the overall Triple-A population, as returning to the level in 2019 rather than graduating to the majors is an important distinction.
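For readers who want to see the regression step in concrete terms, here is a minimal Python sketch of inverse-variance weighting as described above. It is an illustration under stated assumptions, not the code used for the study: the function name `regress_hr_fb` and the input layout are hypothetical, the input is assumed to be already filtered to Triple-A repeaters, the "population variance" is taken to be the variance of observed HR/FB rates across those players, and the mean regressed toward is their pooled HR/FB.

```python
from statistics import pvariance


def regress_hr_fb(players):
    """Regress each player's 2018 HR/FB toward the group mean.

    players: list of (home_runs, fly_balls) tuples, assumed to be
    already filtered to repeaters with enough fly balls (hypothetical layout).
    """
    total_hr = sum(hr for hr, _ in players)
    total_fb = sum(fb for _, fb in players)
    # Overall sample probability; used as the mean to regress toward
    # and in every player's binomial variance (avoids zero variance
    # for players with an observed 0% HR/FB).
    p = total_hr / total_fb

    rates = [hr / fb for hr, fb in players]
    # Assumption: population variance = variance of observed rates.
    pop_var = pvariance(rates)
    if pop_var == 0:
        # Every player has the same rate, so there is nothing to regress.
        return list(rates)

    regressed = []
    for (_, fb), rate in zip(players, rates):
        player_var = p * (1 - p) / fb   # binomial variance of a rate
        w_obs = 1 / player_var          # weight on the observed rate
        w_mean = 1 / pop_var            # weight on the group mean
        regressed.append((w_obs * rate + w_mean * p) / (w_obs + w_mean))
    return regressed


if __name__ == "__main__":
    sample = [(0, 10), (5, 50), (20, 100)]  # made-up numbers
    for (hr, fb), r in zip(sample, regress_hr_fb(sample)):
        print(f"{hr}/{fb} observed -> {r:.3f} regressed")
```

The fewer fly balls a player has, the larger his binomial variance and the harder his rate is pulled toward the group mean, which is why very small samples end up looking close to average.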
To create the weighted HR/FB rate changes, I took the difference between each member of the group’s 2019 HR/FB and 2018 regressed HR/FB, then took a weighted average based on the number of fly balls in 2019.

Finally, there’s a question of whether the 2019 data could or should be regressed. I pretty firmly believe it shouldn’t be: we’re looking for changes among sub-populations that were affected differently by the ball change. Regressing those changes to a uniform mean would obscure that. Taking weighted averages of HR/FB changes addresses this issue somewhat: if someone has, say, 5 fly balls in 2019, they’ll be a reasonably small weight in the overall sample.

If you have any other methodological questions or concerns, feel free to let me know. I’m new to this amount of experimental design, and though I feel this was set up well, there’s always room for improvement.
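The grouping and weighted-average steps described above can be sketched in a few lines of Python. This is a rough illustration only: the `Hitter` class, its field names, and both function names are hypothetical, and the way groups are cut when the player count does not divide evenly is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    regressed_2018: float  # regressed 2018 HR/FB, as a fraction
    hr_2019: int           # 2019 home runs
    fb_2019: int           # 2019 fly balls


def split_into_groups(hitters, n_groups):
    """Sort by regressed 2018 HR/FB and cut into near-equal groups."""
    ordered = sorted(hitters, key=lambda h: h.regressed_2018)
    size = len(ordered)
    return [
        ordered[i * size // n_groups:(i + 1) * size // n_groups]
        for i in range(n_groups)
    ]


def weighted_hr_fb_change(group):
    """Average (2019 HR/FB - regressed 2018 HR/FB), weighted by 2019 fly balls."""
    # Hitters with no 2019 fly balls carry zero weight, so skip them.
    active = [h for h in group if h.fb_2019 > 0]
    total_fb = sum(h.fb_2019 for h in active)
    if total_fb == 0:
        raise ValueError("group has no 2019 fly balls")
    weighted_sum = sum(
        (h.hr_2019 / h.fb_2019 - h.regressed_2018) * h.fb_2019
        for h in active
    )
    return weighted_sum / total_fb


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    hitters = [
        Hitter(0.10, 2, 10),
        Hitter(0.05, 3, 30),
        Hitter(0.15, 6, 24),
        Hitter(0.20, 9, 30),
    ]
    for i, group in enumerate(split_into_groups(hitters, 2), start=1):
        print(f"group {i}: {weighted_hr_fb_change(group):+.3f}")
```

Because the weights are 2019 fly balls, a hitter with only a handful of fly balls barely moves his group’s number, which is the property the unregressed 2019 data relies on.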