What (New) Statcast Data Tell Us About Pitcher BABIP
For the past few days, I’d been searching for a baseball topic to write about. It usually takes less time, but we’re in that calm (if not monotonous) period between the All-Star break and the trade deadline. Ideas are scarcer. Maybe I’d settle on an article with a simple premise?
So I committed myself to tackling pitcher BABIP. (Good going, Justin!)
The notion that pitchers have no control over what happens to a ball in play ushered in a golden age of baseball research, and findings from back then still influence how we view the game today. But over time, we realized that exceptions do exist; for example, Clayton Kershaw consistently allows a below-average BABIP, most likely because he’s a phenomenal pitcher. In addition, certain pitchers have a knack for inducing weak contact in the form of pop-ups or grounders. Exactly how those batted balls impacted BABIP remained a mystery, but you could no longer brush off the metric as total noise.
Years later, Statcast data became available for public use. Even so, research on pitcher BABIP remained few and far between; it’s a daunting subject! I did use two articles as inspiration, however. The first is from FanGraphs user rplunkett97 on our community research page. Dating back to 2017, it mainly discusses a linear model with several variables (BB/9, GB%, Team UZR, and more) used to produce an expected BABIP for each pitcher. The second is courtesy of Alex Chamberlain, from the same year, who used a mixture of Hard-hit and Barrel rate to create his own version of xBABIP.
We’re still in the Statcast era, but there have been a couple of changes since then that could help us better evaluate pitcher BABIP. Statcast is now provided by Hawk-Eye, which does a better job of recording batted balls that in previous seasons might have flown under the radar or been mislabeled. As evidence, below is the yearly rate of batted balls that are either above 40 degrees or below -20:
Hitters and pitchers aren’t producing or allowing more of these balls; Hawk-Eye is just more adept at measuring extreme angles.
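For anyone who wants to reproduce that kind of yearly rate, here is a minimal sketch. It assumes you already have batted-ball rows as `(season, launch_angle)` pairs; the function name `extreme_angle_rate_by_year` and the sample rows are made up for illustration and are not the data behind the chart.

```python
from collections import defaultdict


def extreme_angle_rate_by_year(batted_balls, low=-20.0, high=40.0):
    """Return {season: share of batted balls hit below `low` or above `high` degrees}."""
    totals = defaultdict(int)
    extreme = defaultdict(int)
    for season, launch_angle in batted_balls:
        totals[season] += 1
        if launch_angle < low or launch_angle > high:
            extreme[season] += 1
    return {season: extreme[season] / totals[season] for season in sorted(totals)}


# Hypothetical rows: (season, launch angle in degrees)
sample = [
    (2019, -25.0), (2019, 12.0), (2019, 30.0), (2019, 8.0),
    (2020, 45.0), (2020, -30.0), (2020, 15.0), (2020, 2.0),
]
rates = extreme_angle_rate_by_year(sample)
print(rates)
```

The thresholds are keyword arguments so the same function can be rerun with other cutoffs, such as MLB.com's 50-degree pop-up line.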
Those launch angle ranges were part of my article about Cristian Javier and Framber Valdez as examples of pitchers who have control over how much extreme contact they allow. Moreover, such contact pertains to BABIP in important ways. This isn’t a new concept, but it’s helpful to remind ourselves that not all groundballs are created equal:
LA Range | No. of BBE | BABIP |
---|---|---|
-5 to 10 | 14,889 | .421 |
-20 to -5 | 9,968 | .160 |
Below -20 | 8,326 | .116 |
The lower the launch angle, the lower the BABIP allowed. Some of the super-negative angles do produce hits depending on how they were struck and the batter’s speed (see: the Rays), but I imagine they’re offset by the fact that low angles limit exit velocity. Also, what’s surprising is how drastically BABIP rises once a batted ball is even slightly elevated. It’s possible a fair amount of noise within BABIP is due to pitchers’ inability to control the rate of bad grounders they allow.
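A table like the one above can be built by bucketing batted balls and dividing hits by balls in play. The sketch below is a simplification under stated assumptions: each event is a hypothetical `(launch_angle, outcome)` pair, home runs are dropped, every other batted ball counts as a ball in play (real BABIP also excludes sacrifice bunts), and the bucket edges are treated as half-open intervals, which the table itself does not specify.

```python
# (label, lower bound inclusive, upper bound exclusive); boundary handling is an assumption
BUCKETS = [
    ("Below -20", float("-inf"), -20.0),
    ("-20 to -5", -20.0, -5.0),
    ("-5 to 10", -5.0, 10.0),
]


def babip_by_bucket(events, buckets=BUCKETS):
    """Return {label: hits / balls in play} for each launch-angle bucket."""
    counts = {label: [0, 0] for label, _, _ in buckets}  # [hits, balls in play]
    for launch_angle, outcome in events:
        if outcome == "home_run":  # home runs are not balls in play
            continue
        for label, lo, hi in buckets:
            if lo <= launch_angle < hi:
                counts[label][1] += 1
                if outcome == "hit":
                    counts[label][0] += 1
                break
    return {label: (h / n if n else None) for label, (h, n) in counts.items()}


# Made-up events, not real Statcast rows
events = [
    (-30.0, "out"), (-25.0, "out"), (-22.0, "hit"), (-21.0, "out"),
    (-10.0, "out"), (-8.0, "hit"),
    (0.0, "hit"), (5.0, "out"), (9.0, "hit"),
    (35.0, "home_run"),
]
result = babip_by_bucket(events)
print(result)
```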
As for batted balls over 40 degrees, they rarely become hits; when they do, they’re mostly home runs that don’t count as balls in play. And since pitchers can choose to allow them (or not), including high fly balls and pop-ups in a BABIP model is a no-brainer.
But when Alex Chamberlain examined pop-up rate (PU%) in his article, he found that it “correlates very weakly with BABIP, producing an R^2 of 0.10.” My semi-educated guess is that the pre-Statcast definition of a pop-up was much narrower and thus did little to separate pitchers on a rate basis. Even MLB.com defines a pop-up as a ball hit 50 degrees or higher, which, if we’re basing it off the relationship between BABIP and launch angle, is also a bit stingy. Though the exact number could be debated, a lower limit of 40 degrees seems ideal.
In the same spirit, I used exact launch angles to adjust the definition of a line drive. According to MLB.com, line drives are balls within the 10–25 degree range, but I found that shifting each end back by five degrees best captured the relationship between BABIP and launch angle. Plus, I’ve always found it weird that a nine-degree rocket would count as a groundball, so this change is a fantastic reconciliation.
Lastly, we now have access to Outs Above Average (OAA) data by pitcher, meaning we can see who was harmed or aided by his teammates. I know the point of BABIP is that we shouldn’t hold a pitcher accountable for the defense behind him, but I wanted to see how much of the variance in BABIP a select few variables can explain. Speaking of variables, here are the ones I chose:
- Percent of BBE below -20 degrees (GB%)
- Percent of BBE between 5 and 20 degrees (LD%)
- Percent of BBE above 40 degrees (PU%)
- Outs Above Average when pitching
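The three rate stats in that list can be sketched as simple counts over a pitcher's launch angles. This is only an illustration: the function name `batted_ball_rates` is hypothetical, and treating the 5 and 20 degree endpoints as inclusive is an assumption, since the list doesn't say how boundaries are handled.

```python
def batted_ball_rates(launch_angles):
    """Return the adjusted GB%, LD%, and PU% for one pitcher's batted balls."""
    n = len(launch_angles)
    if n == 0:
        raise ValueError("need at least one batted ball")
    gb = sum(1 for a in launch_angles if a < -20)       # below -20 degrees
    ld = sum(1 for a in launch_angles if 5 <= a <= 20)  # 5 to 20 degrees (endpoints assumed inclusive)
    pu = sum(1 for a in launch_angles if a > 40)        # above 40 degrees
    return {"GB%": gb / n, "LD%": ld / n, "PU%": pu / n}


# Made-up launch angles for a single pitcher
angles = [-30, -25, 0, 5, 10, 20, 30, 45, 50, 60]
rates = batted_ball_rates(angles)
print(rates)
```

Note that the three buckets don't cover every batted ball, so the rates won't sum to one.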
I then fed these four variables into a linear model, which did the hard work of measuring how they related to each pitcher’s BABIP in 2021 as of July 19. Here are the long-awaited results:
The sample consists of 138 of 148 pitchers this season with at least 50 innings pitched; the remaining 10 weren’t included in a database I use to translate MLB IDs into FanGraphs ones. I thought of calculating their xBABIPs manually but figured they wouldn’t make much of a difference on the results.
Regardless, it’s cool to see that four variables alone can account for nearly half the variance in BABIP among pitchers with at most 110–120 innings under their belts. Even if you toss out the defensive component, you still end up with a robust adjusted R^2 of 0.329. Individually, Statcast-adjusted line drive rate has the strongest correlation with BABIP (R^2 = 0.211), and groundball rate has the weakest (R^2 = 0.002, basically nonexistent). Within the context of the model, however, I suspect groundball rate works in tandem with the defensive data, since removing it worsens both the R^2 value and the RMSE.
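For readers curious what fitting such a model involves, here is a rough, standard-library-only sketch of ordinary least squares with R^2, adjusted R^2, and RMSE. It is not the code or software behind the results above, and every number in it is invented; the helper names (`fit_ols`, `predict`, `fit_stats`) are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, rows):
    """Apply fitted coefficients to feature rows."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]


def fit_stats(y, y_hat, n_predictors):
    """Return (R^2, adjusted R^2, RMSE) for observed y and fitted y_hat."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = math.sqrt(ss_res / n)
    return r2, adj_r2, rmse


# Invented rows: (GB%, LD%, PU%, OAA) per pitcher, plus invented BABIPs
rows = [
    (0.20, 0.25, 0.15, 3), (0.25, 0.22, 0.18, -2), (0.18, 0.28, 0.12, 0),
    (0.30, 0.20, 0.20, 5), (0.22, 0.26, 0.14, -4), (0.27, 0.24, 0.10, 1),
    (0.15, 0.30, 0.22, -1),
]
babip = [0.285, 0.301, 0.262, 0.310, 0.279, 0.295, 0.270]

coefs = fit_ols(rows, babip)
r2, adj_r2, rmse = fit_stats(babip, predict(coefs, rows), n_predictors=4)
print(coefs, r2, adj_r2, rmse)
```

Adjusted R^2 penalizes each added predictor, which is why it is the fairer number to compare when a variable such as OAA is dropped from the model.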
What about predicting future BABIP? This is a question worth answering. How, for example, does Kershaw author season after season of low BABIP? Can our model provide an answer? My instincts said no, given that two of the variables (LD%, OAA) are highly volatile year-to-year, and a third (GB%) correlates weakly with same-season BABIP.
That leaves us with PU%. To start, I compiled pitchers with both (a) at least 30 innings pitched in 2020 and (b) at least 50 innings pitched in 2021; that returned a sample of 104. Then I looked at how a pitcher’s PU% in 2020 correlated with his BABIP in 2021 and how that compared to his BABIP the previous year. The results:
- 2020 PU% to 2021 BABIP: R^2 = 0.051
- 2020 BABIP to 2021 BABIP: R^2 = 0.034
Womp, womp.
Yep, the rate that’s not only sticky year-to-year but also quite indicative of same-season BABIP is barely better at estimating future BABIP. There’s the caveat that we’re dealing with micro samples here, but even with more innings, it’s doubtful PU% will reach a point of moderate correlation. Unfortunately, this is a dead end.
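The year-to-year check described above boils down to pairing up pitchers who clear both innings thresholds and squaring a Pearson correlation. Here is a hedged sketch; the dictionary layout, the field names (`ip`, `pu`, `babip`), and the pitcher IDs are all hypothetical, and the numbers are made up.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def paired_sample(prev, curr, min_ip_prev=30, min_ip_curr=50):
    """Return sorted pitcher IDs meeting the innings minimum in both seasons."""
    return sorted(
        pid for pid in prev
        if pid in curr
        and prev[pid]["ip"] >= min_ip_prev
        and curr[pid]["ip"] >= min_ip_curr
    )


# Invented season lines keyed by a made-up pitcher ID
s2020 = {
    "a": {"ip": 40, "pu": 0.20, "babip": 0.280},
    "b": {"ip": 20, "pu": 0.15, "babip": 0.310},
    "c": {"ip": 55, "pu": 0.12, "babip": 0.300},
    "d": {"ip": 35, "pu": 0.18, "babip": 0.265},
}
s2021 = {
    "a": {"ip": 60, "babip": 0.290},
    "b": {"ip": 70, "babip": 0.305},
    "c": {"ip": 90, "babip": 0.285},
    "d": {"ip": 45, "babip": 0.270},
}

ids = paired_sample(s2020, s2021)
pu_to_babip = r_squared([s2020[p]["pu"] for p in ids], [s2021[p]["babip"] for p in ids])
babip_to_babip = r_squared([s2020[p]["babip"] for p in ids], [s2021[p]["babip"] for p in ids])
print(ids, pu_to_babip, babip_to_babip)
```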
Still, the descriptive power of xBABIP provides us with a few insights. Take John Means as an example. He has a microscopic .201 BABIP since last season, and while he did possess an abnormally low line drive rate in 2020, it’s rebounded in 2021 without a corresponding change to BABIP. His secret? Means has gained a few ticks of ride and velocity on his fastball, allowing him to induce pop-ups (and whiffs) up in the zone. Although there’s virtually no chance that his current BABIP of .192 lasts for an entire season, the regression-friendly xBABIP nevertheless puts it at .241. Barring a drastic decline, Means looks like he could outpitch his peripherals for years to come.
I ideally would have multiple full seasons to work with, but I don’t feel comfortable using years prior to 2020 since there’s been a change in tracking systems. Predicting BABIP is still an extremely tall task, but more accurate and granular Statcast data allow us to break down the metric’s components within a season. As more batted balls are compiled under the gaze of Hawk-Eye, it’ll be interesting to see if the details of xBABIP change. For now, this will do. (P.S. BABIP data for the aforementioned 138 pitchers can be found here.)
All statistics are current through games of July 19.
Justin is an undergraduate student at Washington University in St. Louis studying statistics and writing.
This progressive narrative of science and data unraveling its own flawed ideas is crazy. So, people who know nothing state that BABIP is all luck. Then, those people take credit for unraveling the mystery that they were wrong. Thank you Statcast for this valuable gift! I like that the corporate sponsor needs to get credit for an entire revolution of junk. I am sure that the new technology is super reliable although they will need a sponsor. All this kind of analysis leads to is digging real deep in arbitrary metrics in search of generalizations that don’t hold. This is such a weird tone but typical of Fangraphs.
Imagine what might happen if analysts abandon linear models.
Always thus.
Old people: things fall down
Newton: actually, things are attracted to each other by a force based on their relative mass
Einstein: actually, gravitational effects are the result of spacetime curvature, not forces
The latter is a more detailed understanding that explains more phenomena, but Einstein doesn’t make Newtonian physics irrelevant within non-relativistic motion any more than ‘things fall down’ is inaccurate when the frame of reference is on Earth.