For Your Begrudging Enjoyment, a Batted Ball Refresher

February 19, 2020

Earlier this offseason, I wrote a few articles about whether pitchers or batters had more influence over different events. There’s nothing groundbreaking about my conclusions — in fact, they specifically reinforce prior studies. Despite that, however, I think there’s value in these refreshers.

Concepts like “batters control home runs” and “pitcher groundball rate matters” are implicit in many of the statistics that you see on this site and certainly in many of the articles that you read here. When we cite xFIP or talk about what a pitcher can do to control his groundball rate, we’re drawing on these concepts.

You don’t need to know these basic concepts to accept the conclusions, but it certainly helps. Appealing to authority (hey, these stats are good because smart people made them) is a pretty bad way to convince someone, and understanding the reason behind a metric is the quickest way to accept its conclusions.

In that spirit, I thought I’d round out the series by looking at a few more common events and working out whether pitchers or batters do more to influence them. Today I’ll be looking at line drive rate and also popup rate, both as a share of all batted balls and as the percentage of fly balls that become harmless popups. Later this week, I’ll cover walks and strikeouts. Then we can move on to more pressing matters, like, I don’t know, José Altuve tattoo investigations or what would happen if Mike Trout knew what was coming.

Before looking at line drive rate, I had a rough idea of what to expect. There are plenty of hitters I think of as line drive machines — peak Joey Votto, Miguel Cabrera, even Nick Castellanos. I had trouble placing a pitcher in the same category, unless you count “your favorite team’s fifth starter.”

A quick refresher on the methodology: I used 2018 line drive rates to divide pitchers and batters into quartiles. I defined a line drive as any batted ball hit between 8 and 25 degrees of launch angle. Then I looked at how each pitcher did against each quartile of batters and vice versa. I weighted each player’s results by the smallest of his four samples, one per opposing quartile. For example, Marcus Semien hit 98 batted balls against low-line-drive-rate pitchers, 135 against medium-low, 162 against medium-high, and 77 against high-line-drive-rate pitchers in 2019, so his weight in our sample is 77 batted balls.
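The quartile split and the weighting can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the tables: the function names and the rank-based quartile cut are assumptions, and the only real numbers are Semien's four batted-ball counts from the paragraph above.

```python
def assign_quartiles(rates):
    """Map each player to a quartile (1 = lowest rate, 4 = highest).

    `rates` is a dict of player -> prior-year rate. Splitting by rank
    into four equal-sized groups is an assumption; the article does not
    say how ties or uneven group sizes were handled.
    """
    ranked = sorted(rates, key=rates.get)
    n = len(ranked)
    return {player: (4 * i) // n + 1 for i, player in enumerate(ranked)}


def matchup_weight(batted_balls_by_quartile):
    """Weight a player by the smallest of his four quartile samples."""
    return min(batted_balls_by_quartile)


# Marcus Semien's 2019 batted balls against each pitcher quartile.
semien = [98, 135, 162, 77]
print(matchup_weight(semien))  # 77
```

Taking the minimum keeps a player from dominating the results just because he saw one quartile far more often than the others.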

You can see the results below. The first line is what we would expect to see if pitchers held their exact line drive rates from 2018 to 2019. In other words, rather than use what actually happened in 2019, I took each pitcher’s 2018 rate. The batter-pitcher matchups are from 2019, but the results are merely the 2018 rates pasted onto the 2019 matchups. The second line is what actually happened in those matchups:

Batters vs. Different LD% Pitchers

Year	Quartile 1	Quartile 2	Quartile 3	Quartile 4
“2018”	20.2%	23.9%	26.5%	31.5%
2019	25.5%	26.5%	26.2%	26.6%

Hm. What looks in year one like a clear differentiation in pitchers turns into soup in year two. The pitchers who got absolutely lit up in year one, allowing 31.5% line drives, were only about one percentage point worse in year two than the absolute best year one group. In other words, pitchers don’t seem to exhibit much skill when it comes to preventing line drives.
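For readers who want to see how the two rows of the table above could be built, here is a rough sketch. The record layout, the function name, and the sample data are all hypothetical, and the sketch counts every batted ball equally rather than applying the smallest-sample weighting described earlier.

```python
from collections import defaultdict


def expected_and_actual(batted_balls, prior_rate, quartile):
    """Return {quartile: (expected, actual)} line drive rates.

    batted_balls: iterable of (pitcher, is_line_drive) from year two.
    prior_rate:   pitcher -> year-one line drive rate.
    quartile:     pitcher -> year-one quartile (1 to 4).
    """
    expected_sum = defaultdict(float)
    actual_sum = defaultdict(int)
    count = defaultdict(int)
    for pitcher, is_line_drive in batted_balls:
        q = quartile[pitcher]
        expected_sum[q] += prior_rate[pitcher]  # year-one rate pasted on
        actual_sum[q] += int(is_line_drive)     # what really happened
        count[q] += 1
    return {q: (expected_sum[q] / count[q], actual_sum[q] / count[q])
            for q in count}


# Made-up data: two pitchers, four year-two batted balls.
prior_rate = {"P1": 0.20, "P2": 0.30}
quartile = {"P1": 1, "P2": 4}
balls = [("P1", True), ("P1", False), ("P2", False), ("P2", False)]
print(expected_and_actual(balls, prior_rate, quartile))
```

The first number in each pair corresponds to the “2018” row and the second to the 2019 row.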

Let’s do the same experiment, only across the quartiles of batters. Surely this data will show the batters control line drive rate, right? …Right?

Pitchers vs. Different LD% Batters

Year	Quartile 1	Quartile 2	Quartile 3	Quartile 4
“2018”	16.9%	22.5%	25.5%	29.9%
2019	23.2%	24.8%	24.8%	26.6%

Well, okay, not so much. There’s definitely more here: the guys who were the best line drive hitters in 2018 were still the best in 2019 by a clear margin, and the same goes for the worst. What’s with the extremely low rate for quartile one in “2018”? My guess is that it’s selection bias. Hitters who weren’t hitting any line drives purely because of bad luck tended to lose playing time, which made them appear worse than their true talent; if they’d kept playing, I imagine they would have recorded a higher line drive rate. The same effect holds, to a smaller extent, for highest-quartile pitchers.

Leaving that aside, however, two things still seem clear. First, batters have more to say about line drives than pitchers do. Second, line drives are far more random from year to year than grounders. There aren’t really line drive batters in the same way that there are fly ball or grounder batters.

To make that clearer, take a look at this grid. This is the average result of a matchup between each group of batters and pitchers. The batters are down the left side, and the pitchers across the top. You can see that pitcher tiers explain much less of the variation than batter tiers do:

Batter/Pitcher Result Grid, LD%

Quartile	1	2	3	4
1	22.8%	24.8%	23.6%	25.8%
2	25.8%	25.8%	25.6%	26.4%
3	25.2%	26.1%	26.3%	26.3%
4	26.7%	28.3%	27.9%	27.2%
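One rough way to check the grid against that claim is to average each row and each column and compare the spreads. The sketch below uses the grid values exactly as printed; the helper name and the choice of simple unweighted averages are assumptions made for illustration.

```python
# LD% grid from the table above: rows are batter quartiles 1-4,
# columns are pitcher quartiles 1-4.
LD_GRID = [
    [22.8, 24.8, 23.6, 25.8],
    [25.8, 25.8, 25.6, 26.4],
    [25.2, 26.1, 26.3, 26.3],
    [26.7, 28.3, 27.9, 27.2],
]


def tier_spreads(grid):
    """Return (batter_spread, pitcher_spread) in percentage points.

    Each spread is the gap between the highest and lowest tier average:
    row averages for batters, column averages for pitchers.
    """
    row_means = [sum(row) / len(row) for row in grid]
    col_means = [sum(col) / len(col) for col in zip(*grid)]
    return (max(row_means) - min(row_means),
            max(col_means) - min(col_means))


batter_spread, pitcher_spread = tier_spreads(LD_GRID)
print(f"batter tiers: {batter_spread:.1f} points")
print(f"pitcher tiers: {pitcher_spread:.1f} points")
```

By this measure the batter tiers span about 3.3 percentage points and the pitcher tiers about 1.3.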

As a side note, these aren’t the true league rates; due to the weighting I used, the average won’t be perfectly right. Additionally, my definition is slightly different from the numbers on our website, as it’s defined strictly by angle rather than manually. Line drives are hard to classify, so I don’t think that makes my data any more or less meaningful; I just thought it was worth pointing out.

So line drives are slightly a batter thing, but mostly a nobody thing. What about popups, which are as bad for hitters as line drives are good? It’s worth noting that there are two ways of looking at popups, so we’ll cover them both today. The first is popups as a percentage of all batted balls. I’d expect pitchers and batters to both demonstrate skill in limiting or allowing popups in the same way as groundball rate, because specific pitches can either induce a lot of grounders or a lot of fly balls.

Batters vs. Different PU% Pitchers

Year	Quartile 1	Quartile 2	Quartile 3	Quartile 4
“2018”	4.5%	7.5%	10.1%	14.7%
2019	6.5%	8.6%	9.7%	12.1%

At first glance, pitcher popup rates in year one tell you a lot about year two — more than half of the variation in the first year carries over. How about batters?

Pitchers vs. Different PU% Batters

Year	Quartile 1	Quartile 2	Quartile 3	Quartile 4
“2018”	4.6%	7.4%	10.1%	13.8%
2019	7.1%	7.8%	10.6%	12.7%

Even stronger! If you see a batter who hits a lot of popups in year one, you can count on a lot of popups in year two. The same goes for pitchers. The grid makes that clear:

Batter/Pitcher Result Grid, PU%

Quartile	1	2	3	4
1	4.9%	6.4%	7.0%	8.5%
2	5.7%	7.9%	8.6%	11.6%
3	7.5%	9.6%	10.6%	13.5%
4	8.9%	11.2%	13.5%	15.4%

Next, let’s consider something slightly different. FanGraphs doesn’t report popups as a proportion of all batted balls. We report something called IFFB%, which is the percentage of fly balls that are infield popups. I decided to test something similar, though meaningfully distinct. I defined popups, both here and in the above study, as balls hit with a launch angle of 50 degrees or higher. This is broader than infield fly balls, because it adds towering flies that reach the outfield grass.

As a result, the below rates won’t match up perfectly with IFFB%. Still, they’re pretty close, and I value choosing an objective cutoff rather than relying on the subjective definitions, so let’s stick with it. Last year, I did a quick study and concluded that popups as a proportion of fly balls were largely random, at least from the pitcher side. Do we get that result again using quartile divisions?

Batters vs. Different PU/FB% Pitchers

Year	Quartile 1	Quartile 2	Quartile 3	Quartile 4
“2018”	17.9%	25.3%	30.6%	38.4%
2019	25.6%	27.5%	29.0%	30.0%

Well, kind of. There’s still some variation in year two, but the distribution gets squashed. There’s a pattern, sure, enough that you might shade your estimation of a pitcher’s results slightly, but the vast majority of it is noise. I should also note that the percentages might seem high, but that’s because there are a lot of fly balls hit 50 degrees or higher that leave the infield. But those batted balls have produced a wOBA of 0.013, so they track with infield fly balls really well — they’re essentially valueless contact.
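The two popup measures in this article differ only in their denominators, which a short sketch can make concrete. The 50-degree popup cutoff and the 8-to-25-degree line drive band come from the article; treating everything above 25 degrees as a fly ball is an assumption, as are the function names and the sample launch angles.

```python
def classify(launch_angle):
    """Bucket a batted ball by launch angle in degrees."""
    if launch_angle >= 50:
        return "popup"        # the article's cutoff
    if launch_angle > 25:
        return "fly ball"     # assumed: everything above the line drive band
    if launch_angle >= 8:
        return "line drive"   # the article's 8-to-25-degree band
    return "ground ball"


def popup_rates(launch_angles):
    """Return (PU%, PU/FB%) as fractions.

    PU% is popups over all batted balls. PU/FB% is popups over all balls
    hit in the air above the line drive band, popups included.
    """
    kinds = [classify(a) for a in launch_angles]
    popups = kinds.count("popup")
    flies = popups + kinds.count("fly ball")
    pu_pct = popups / len(kinds) if kinds else 0.0
    pu_fb_pct = popups / flies if flies else 0.0
    return pu_pct, pu_fb_pct


# Ten made-up launch angles, for illustration only.
sample = [-5, 3, 12, 18, 22, 30, 35, 41, 55, 62]
print(popup_rates(sample))
```

Because the second rate divides by a much smaller pool of batted balls, it is always at least as large as the first, and it rests on a smaller sample.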

Do hitters display the same pattern? Uh, no. Theirs is altogether weirder:

Pitchers vs. Different PU/FB% Batters

Year	Quartile 1	Quartile 2	Quartile 3	Quartile 4
“2018”	17.3%	25.0%	30.4%	37.5%
2019	22.7%	23.4%	29.6%	31.5%

What the heck does this mean? To be honest, I’m not quite sure. That big gap between the second and third quartile is wild. It’s reasonable to say that the shape remains; some batters are good at avoiding popups and some aren’t. Given the distribution of the quartiles, further research is required. But it’s not surprising that popups as a percentage of all fly balls is the weirdest one; it’s a smaller sample than popups or line drives as a percentage of all balls in play.

Breaking it down into the same grid as above, you can see that both sides have a bit to say about popup rate, but there’s not much differentiation overall:

Batter/Pitcher Result Grid, PU/FB%

Quartile	1	2	3	4
1	21.5%	23.0%	23.9%	23.5%
2	22.7%	24.3%	26.8%	29.4%
3	27.7%	29.7%	32.8%	31.6%
4	31.5%	32.6%	33.4%	34.4%

That slightly weird result aside, there are some general takeaways. Can batters consistently hit line drives at a below- or above-average rate? Eh, a little. Can pitchers allow the same? Nope! Are popups a good indicator of what will happen in year two, particularly as a percentage of fly balls? Not so much for a pitcher, and maybe — but not definitely — for a batter.
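A crude way to put numbers on those takeaways is to ask how much of the year-one gap between the lowest and highest quartiles is still there in year two. The sketch below uses the figures from the six quartile tables; the measure itself, which looks only at the two end quartiles, is an assumption chosen for simplicity rather than anything used to build the tables.

```python
# (year-one Q1, year-one Q4, year-two Q1, year-two Q4), in percent,
# copied from the six quartile tables in this article.
TABLES = {
    "LD%, pitchers":    (20.2, 31.5, 25.5, 26.6),
    "LD%, batters":     (16.9, 29.9, 23.2, 26.6),
    "PU%, pitchers":    (4.5, 14.7, 6.5, 12.1),
    "PU%, batters":     (4.6, 13.8, 7.1, 12.7),
    "PU/FB%, pitchers": (17.9, 38.4, 25.6, 30.0),
    "PU/FB%, batters":  (17.3, 37.5, 22.7, 31.5),
}


def spread_retained(y1_q1, y1_q4, y2_q1, y2_q4):
    """Share of the year-one Q1-to-Q4 gap that survives into year two."""
    return (y2_q4 - y2_q1) / (y1_q4 - y1_q1)


for label, row in TABLES.items():
    print(f"{label:<18}{spread_retained(*row):.0%}")
```

On these numbers, pitchers keep roughly a tenth of their line drive spread and batters about a quarter, while both sides keep more than half of their spread in popups per batted ball.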

Next time we’ll cover strikeouts and walks. But until then, remember: it might feel like a pitcher has a great skill for turning every fly ball into a popup, or that he gives up only line drives. But like home runs, these two events are the domain of batters. It can feel like there’s a pattern, but in the data, they largely look like noise.

7 Comments


Joser (Member since 2021)

5 years ago

“There aren’t really line drive batters in the same way that there are fly ball or grounder batters.”

Well, you wouldn’t really expect there to be, would you? GB are anything with a Launch Angle less than 10°; FB are anything with an LA greater than 25° (pop-ups being the subset greater than 50°). LD are the relatively tiny set between 10° and 25°, which is a pretty difficult window to target even for the best professional hitters… and all the more so now that so many are trying to hit everything in the air.

BTW, I can’t find those launch angle definitions anywhere on Fangraphs: they came from Baseball Savant. It might be nice some offseason for FG to do the inglorious but necessary work to update the Glossary section of FG — the most relevant page I found gives no numeric cut-offs but says

“Second, batted ball classification is tricky. What’s the difference between a fly ball and a line drive? At what angle does one become the other? While BIS has a great team scouting each major league game, video data offers only a certain level of detail. Even the most diligent stringer can’t get it right 100% of the time because they just don’t always have the proper angle to distinguish between a fly ball and line drive. When StatCast becomes fully operational, this problem should disappear because we will be able to use a simple numeric cut point.”

Ben Clemens (FanGraphs Staff)

Reply to Joser

A glossary update is actually on my to-do list. Not saying it’s imminent or anything but I’ve been working on it when I have some free time.

Reply to Ben Clemens

Thanks Ben, that would be awesome. I kind of felt you were laying the groundwork for that with these articles, and it would be great if they got put into the glossary in some form (at least until new developments overtake them again, as the switch from BIS to Statcast has done to what’s there now).

Because Fangraphs has to keep up with the relentless output of the weighted random number generator* that is baseball, articles quickly slide off the front page and disappear, never to be found again except occasionally when you’re searching on a particular player. That’s completely appropriate in many cases: FG’s revenue is driven by new content, after all, and really who cares right now about some fun statistical oddity that somebody noticed a couple of seasons ago (yeah I sometimes do, and I wish there was an easier way to find those articles, but that’s for another day).

But in addition to topical news and commentary, Fangraphs is also about analysis, and that analysis needs a strong foundation set out in one place so that everybody who wants to argue about the analysis has a common basis to do so. That requires a library of definitions and “accepted wisdom” (which itself needs to be revised as we learn new things, of course) that lives somewhere permanent… which is kind of antithetical to the site design, devoted as it is to the completely necessary pursuit of whatever just happened.

The “glossary,” when it was introduced, was clearly intended to be the solution to that (though I’d argue the name is a bit misleading). Unfortunately, in a lot of places it — like Baseball — has been overtaken by events, as they say. The conventional wisdom has been revised, in some cases multiple times, and it needs to keep up.

Like I said in my earlier comment, that kind of backfilling is vital but unglamorous. Kudos for taking it on.

*obligatory XKCD reference.

Ryan DC (Member since 2016)

Gonna make another pitch for a pitch classification explainer/update (maybe from Michael Augustine?)
