# For Your Enjoyment, a Home Run Rate Refresher

Here’s a question for you: does Mike Trout hit more home runs against bad pitchers? The answer is yes, of course, but we can parse the question a little differently to make it more interesting. How about this one: does Mike Trout hit more home runs *per fly ball* against pitchers who are home run-prone? That at least has some intrigue.

Here’s one way you might do this study. Take every pitcher in baseball and group them into quartiles based on their home run per fly ball rate. I’m using line drives and non-pop-up fly balls to make a slightly different rate, but the idea is the same. With the pitchers bucketed like so, simply observe Trout’s home run rate against each quartile:

Stat | Quartile 1 | Quartile 2 | Quartile 3 | Quartile 4 |
---|---|---|---|---|
HR/Air% | 13.33% | 16.44% | 26.76% | 20.00% |
Batted Balls | 45 | 73 | 71 | 30 |

But before Tom Tango pulls his hair out, let me add something important: This is a bad way to do this study. There’s a big problem here. Trout’s home runs and the pitchers’ home run rates aren’t independent of each other. If Trout tags a guy for a few home runs, that pitcher’s home run rate goes up. If Trout doesn’t hit any out against a pitcher, that pitcher will tend towards the stingiest quartile. Even if Trout’s home runs were randomly distributed across pitchers, this data would tend towards the same rising shape.
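To see the contamination concretely, here's a minimal simulation sketch. Everything in it is hypothetical: the function name, the pitcher count, the air-ball counts, and the shared 15% rate are assumptions picked for illustration, not figures from the study. Every pitcher has the same true rate and the batter's home runs land at random, yet sorting pitchers by a rate that includes the batter's own results should still produce a rising pattern.

```python
import random


def simulate_contaminated_quartiles(n_pitchers=2000, other_air=50,
                                    batter_air=4, true_rate=0.15, seed=0):
    """Return the batter's HR rate against each pitcher quartile (low to high).

    All pitchers share the same true HR-per-air-ball rate (an assumption),
    so any trend across quartiles comes purely from the grouping method.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_pitchers):
        # Home runs the pitcher allows to everyone else, and to our batter.
        hr_other = sum(rng.random() < true_rate for _ in range(other_air))
        hr_batter = sum(rng.random() < true_rate for _ in range(batter_air))
        # The contaminated rate: the batter's own results are included.
        pitcher_rate = (hr_other + hr_batter) / (other_air + batter_air)
        rows.append((pitcher_rate, hr_batter))

    # Rank pitchers by observed rate and cut into four equal groups.
    rows.sort(key=lambda row: row[0])
    size = n_pitchers // 4
    rates = []
    for q in range(4):
        chunk = rows[q * size:(q + 1) * size]
        batter_hr = sum(hr for _, hr in chunk)
        rates.append(batter_hr / (len(chunk) * batter_air))
    return rates


quartile_rates = simulate_contaminated_quartiles()
```

If the argument in the text holds, `quartile_rates` climbs from the first entry to the last even though no pitcher is actually any more homer-prone than another.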

So if we want to get an uncontaminated look at this, we need to use a sample that isn’t influenced by Trout’s own exploits. There are two options: strip out Trout’s at-bats, or use a previous season’s data to group the pitchers. For now, I’m simply going to use 2018 home run rates, though with it being the offseason and all, I’ll likely go back and try the other method at some point as well.
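Here's a rough sketch of the previous-season option. The function names and input shapes are my own assumptions for illustration, not the code behind the study: pitchers are bucketed on last year's rates only, and any pitcher without a prior-season rate is simply left out of the tally.

```python
import bisect
import statistics


def assign_quartiles(prior_rates):
    """Map pitcher -> quartile (1 = stingiest, 4 = most homer-prone).

    prior_rates: dict of pitcher -> previous-season HR-per-air-ball rate.
    """
    cuts = statistics.quantiles(list(prior_rates.values()), n=4)
    return {pitcher: bisect.bisect_right(cuts, rate) + 1
            for pitcher, rate in prior_rates.items()}


def hr_by_quartile(air_balls, quartile_of):
    """Tally (home_runs, air_balls) per quartile for one batter.

    air_balls: iterable of (pitcher, is_home_run) from the current season.
    Pitchers missing from quartile_of (no prior-season data) are skipped.
    """
    tallies = {q: [0, 0] for q in (1, 2, 3, 4)}
    for pitcher, is_hr in air_balls:
        q = quartile_of.get(pitcher)
        if q is None:
            continue
        tallies[q][0] += int(is_hr)
        tallies[q][1] += 1
    return {q: tuple(t) for q, t in tallies.items()}
```

Because the grouping never sees this season's results, a batter's own home runs can't push a pitcher into a different bucket.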

Okay, so now we’ve got a method. One quick note: I excluded all pitchers who didn’t throw in the majors in 2018 from the sample. You could try to estimate their rate from minor league or previous major league numbers, but I deemed it cleaner to simply ignore them altogether. How does Trout do against each group as determined this way?

Stat | Quartile 1 | Quartile 2 | Quartile 3 | Quartile 4 |
---|---|---|---|---|
HR/Air% | 19.23% | 16.00% | 20.97% | 16.00% |
Batted Balls | 26 | 75 | 62 | 25 |

No more signal now! That looks more or less like random noise to me. But Trout is just one player, and the sample size is small. Let’s answer a bigger question: do *batters* hit more home runs per fly ball against bad pitchers? I repeated the study with every batter in baseball who hit a fly ball this year.

A few notes on my methodology here: I wanted to get an equal sample of each group. If Aaron Judge, say, faced mainly pitchers from one quartile, that quartile might show higher power numbers due to Judge. So I weighted each batter’s data by the smallest sample they had. If a batter faced 8, 13, 25, and 32 pitchers across the four groups, for example, they’d get a weight of 8 in each quartile. In this way, every quartile carries the same weight from each batter.
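The weighting can be sketched like this. I'm assuming here that each batter's sample in a quartile is counted in air balls, and the function name and input layout are hypothetical. Each batter's rate in every quartile gets the same weight, namely the smallest of that batter's four samples, so a batter with no sample in some quartile drops out entirely.

```python
def weighted_quartile_rates(batters):
    """Combine batters into one HR rate per quartile with equal weights.

    batters: dict of name -> list of four (home_runs, air_balls) tuples,
             one per pitcher quartile.
    Returns (rates, total_weight), where rates has four entries.
    """
    totals = [0.0, 0.0, 0.0, 0.0]
    total_weight = 0
    for lines in batters.values():
        # Weight by the batter's smallest quartile sample.
        weight = min(n for _, n in lines)
        if weight == 0:
            continue  # no sample in at least one quartile
        total_weight += weight
        for q, (hr, n) in enumerate(lines):
            totals[q] += weight * hr / n
    if total_weight == 0:
        raise ValueError("no batter has a sample in every quartile")
    return [t / total_weight for t in totals], total_weight
```

With this setup the total weight is identical in all four quartiles, which is why a single per-quartile sample size can be quoted for the whole study.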

With that done, we can look at the data, but boy, there’s nothing to see there:

Stat | Quartile 1 | Quartile 2 | Quartile 3 | Quartile 4 |
---|---|---|---|---|
HR/Air% | 10.92% | 11.47% | 12.08% | 12.90% |
Binary Std. Dev. | 0.39% | 0.40% | 0.41% | 0.42% |

Batters didn’t really hit more home runs in 2019 against pitchers who were more homer-prone in 2018. There’s a general upward tilt, and the sample size is large enough (6,472 air balls per quartile) that we can be somewhat sure that the slope isn’t just sampling noise, but the size of the effect is minuscule. The edge to facing a pitcher who’s particularly homer-prone, as compared to particularly stingy, is only about two percentage points (12.90% versus 10.92%).
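As a quick check on those error bars, here's a small sketch that assumes the "Binary Std. Dev." row is the usual binomial standard error, sqrt(p(1 − p)/n), with n equal to the 6,472 air balls per quartile. That reading is my assumption, as are the helper names, and the weighting scheme means treating each air ball as an independent trial is only an approximation.

```python
import math


def binomial_se(rate, n):
    """Standard error of a proportion observed over n trials."""
    return math.sqrt(rate * (1 - rate) / n)


def z_for_difference(rate_a, rate_b, n):
    """z-score for rate_b minus rate_a, both measured over n trials."""
    combined_se = math.hypot(binomial_se(rate_a, n), binomial_se(rate_b, n))
    return (rate_b - rate_a) / combined_se


rates = [0.1092, 0.1147, 0.1208, 0.1290]  # HR/Air% by quartile, from the table
n = 6472                                  # air balls per quartile

errors = [binomial_se(r, n) for r in rates]
z = z_for_difference(rates[0], rates[3], n)
```

Under that assumption the four errors round to 0.39%, 0.40%, 0.41%, and 0.42%, matching the table, and the gap between the first and fourth quartiles works out to roughly 3.5 standard errors: real, but small.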

Next, let’s repeat the study, only in reverse. Rather than look at each hitter against four quartiles of pitchers, we’ll look at each pitcher against four quartiles of batters. As before, we’ll group batters based on their 2018 rates and throw out batters who didn’t record any fly balls in 2018. For this run, I also got rid of pitchers batting, because otherwise the entirety of the least-powerful batters was made up of pitchers, and that’s not really what we’re testing here.

What do the data show? Well, they show a difference, that’s what:

Stat | Quartile 1 | Quartile 2 | Quartile 3 | Quartile 4 |
---|---|---|---|---|
HR/Air% | 7.70% | 10.96% | 13.13% | 15.68% |
Binary Std. Dev. | 0.29% | 0.34% | 0.37% | 0.40% |

If you face the bottom tier of batters, you don’t give up many home runs, even if they get the ball in the air. Face the best guys, and balls leave the yard at more than double the rate of the weaklings.

That’s really logical, and I’m not trying to present something new here. The point, instead, is that batters have the most to say about home runs. You probably know this intuitively. If you look at Joey Gallo’s stats, you don’t say “oh boy, high HR/FB%, he’s probably due for some regression.” You say “Yup, he’s big.” If you look at a pitcher’s numbers, you might see that same HR/FB% and intuitively assume he’s been getting unlucky.

This fact, that the most- and least-homer-prone pitchers aren’t all that different in terms of home run rate, is the reason that people cite xFIP as often as they do. Expecting each pitcher to have a league average rate of home runs per fly ball isn’t exactly right — there’s slope, after all — but it’s closer to right than wrong.
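For reference, the swap xFIP makes can be sketched in a few lines. The formula is the standard published xFIP definition as I understand it; the function and argument names are hypothetical, and the league HR/FB rate and FIP constant change by season, so they're passed in rather than hard-coded.

```python
def xfip(fly_balls, walks, hit_by_pitch, strikeouts, innings,
         league_hr_per_fb, fip_constant):
    """xFIP: FIP with actual home runs replaced by an expected total.

    innings should be a true decimal value (200 1/3 innings is 200.333...,
    not the box-score shorthand 200.1).
    """
    # Expected home runs if fly balls left the yard at the league rate.
    expected_hr = fly_balls * league_hr_per_fb
    core = (13 * expected_hr
            + 3 * (walks + hit_by_pitch)
            - 2 * strikeouts)
    return core / innings + fip_constant


# Made-up inputs, purely to show the call shape.
example = xfip(fly_balls=200, walks=50, hit_by_pitch=5, strikeouts=200,
               innings=200, league_hr_per_fb=0.15, fip_constant=3.2)
```

The only place a pitcher's own home run total could enter is `expected_hr`, and xFIP deliberately keeps it out.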

That’s all I’ve got for you today. Probably you already knew it, even. But every so often it can be useful to revisit the basics, to make sure that assumptions like “home runs even out for pitchers” still make sense. Luckily for us (and for them), they do.

Ben is a writer at FanGraphs. He can be found on Twitter @_Ben_Clemens.

Wow. This seems like really important data.

It seems to imply that for most (all?) pitchers, what matters is FB rate, not really HR/FB. Guys with high FB rates are going to give up lots of home runs, almost without exception.

It means someone like Caleb Smith, with a 52.5% flyball rate and a 15.6% HR/FB rate is probably not going to allow fewer home runs this year unless the flyball rate comes down.

And someone like John Means (50.3% flyball and 9.7% HR/FB) is probably going to allow a lot more.

It means we can mostly ignore HR/FB rate independent of flyball rate. That seems really useful, actually.

EDIT: Though I guess there isn’t an easy way to account for the fact that certain pitchers may face big HR/FB guys more often because of their division or coincidence or whatnot.

Yeah, there is, but it’ll just be a little painful. Basically you could go through and create deviations from a normalized expected HR rate: look at how each hitter did in PAs against everyone else, use those rates to develop an ‘expected’ HR/FB allowed by that pitcher, and then see what deviations from expected we see and whether those persist. Then you get into a bit of a matryoshka doll issue (what if those HITTERS faced a particularly easy slate, or whatever?), but that’s at least a first-order way of looking at it. I’m goofing around with that at the moment, but it’s a decent amount of calculation, so I’m going to have to automate it.
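A first-order version of that idea might look something like the sketch below. It's only an illustration under my own assumptions: the function name and the `(pitcher, hitter, is_home_run)` input format are made up, and it ignores the nested question of who those hitters faced.

```python
from collections import defaultdict


def hr_deviation_by_pitcher(air_balls):
    """Actual minus expected HR per air ball for each pitcher.

    air_balls: iterable of (pitcher, hitter, is_home_run).
    A hitter's expected rate against a pitcher is that hitter's rate
    against every OTHER pitcher, so the matchup never grades itself.
    """
    hitter_totals = defaultdict(lambda: [0, 0])  # hitter -> [hr, n]
    pair_totals = defaultdict(lambda: [0, 0])    # (pitcher, hitter) -> [hr, n]
    for pitcher, hitter, is_hr in air_balls:
        for tally in (hitter_totals[hitter], pair_totals[(pitcher, hitter)]):
            tally[0] += int(is_hr)
            tally[1] += 1

    actual = defaultdict(float)
    expected = defaultdict(float)
    counted = defaultdict(int)
    for (pitcher, hitter), (hr, n) in pair_totals.items():
        other_hr = hitter_totals[hitter][0] - hr
        other_n = hitter_totals[hitter][1] - n
        if other_n == 0:
            continue  # hitter has no air balls against anyone else
        actual[pitcher] += hr
        expected[pitcher] += n * other_hr / other_n
        counted[pitcher] += n

    return {p: (actual[p] - expected[p]) / counted[p] for p in counted}
```

A positive value means the pitcher allowed more home runs than the hitters he faced would suggest. Whether those deviations persist from year to year is the part that would still need checking.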

xFIP!

Also, Jeff Zimmerman took a look at this and found similar results with the happy fun ball:

https://fantasy.fangraphs.com/juiced-baseball-pitcher-evaluation-changes/

That seems like common sense to me. Some FB end up as HR, not many GB do. HR are the rarest outcome (outside of 3B) that pitchers allow. Putting stock in the flukiest of outcomes is not smart, yet we do it a lot. We don’t pretend that 3B are particularly important. Most tools regress BABIP towards the mean, but we treat HR allowed like an important factor, which is just silly. Batted ball distributions vary quite a bit year over year, so I am not too sure what the predictive value of any of this is anyway, other than that when it naturally varies we will call it an adjustment one way or the other. Nobody is damned to their batted ball rates from the past season. Sure, some guys are extreme FB or GB pitchers, but those guys should already be known as HR prone or averse, respectively. Don’t forget that all of this work is tied to juiced balls. We are not unraveling the mysteries of the game as much as we are exploring the effects of juiced baseballs on the game.

Ben doesn’t actually tell us the values that define these quartiles. Just looking over the data from 2019, I see, as I thought, that the variation in HR/FB is much greater for hitters than it is for pitchers. For qualified hitters, it ranges from 1.5% to 32.8%, with 95% of hitters within a factor of four of each other; whereas for pitchers, the range is 9.3% to 22.8%, with 95% of pitcher rates within a factor of two of each other. The average HR/FB rate for hitters (individually taken, i.e., not averaging all HR/all FB) was 17.0%, with a SD of 6.3%. For pitchers, it’s 14.3%, with a SD of 3.0%.

Surely the lesser variation for pitchers contributes to not seeing a significant difference for batters facing pitchers across the entire HR/FB range. Batters do not see as much difference among pitchers as pitchers do among hitters.

I feel iffy linking to a Twitter thread, but Tango and I happened to talk about this over the weekend, where I looked at the 2018 rates of each quartile:

https://twitter.com/_Ben_Clemens/status/1221144116974632960