Expected BABIP for Pitchers

May 22, 2008

Recently on FanGraphs, we’ve been referring to a stat called xBABIP, or Expected Batting Average on Balls in Play, to help justify a pitcher’s current BABIP. There have been a few questions about what this stat means, so I thought it’d be as good a time as any to explain the ins and outs of this particular metric.

The initial concept of BABIP is that pitchers do not have control over what happens to balls once they are hit into the field of play.

BABIP typically fluctuates from year to year around a baseline of about .300. If a pitcher has a particularly high or low BABIP, we may say he’s been unlucky or lucky, respectively. Things are of course not quite this simple, but for the most part the rule holds true.

Enter batted-ball data: we know how many line drives, fly balls, and ground balls a pitcher allows into play. Line drives fall for hits most often, and ground balls fall for hits more often than fly balls. The mix of batted balls a pitcher allows into play is going to affect his overall BABIP.

BABIP by Type (2007):
Fly Balls – .15
Ground Balls – .24
Line Drives – .73

Ideally, the formula is going to look something like this to find out a player’s expected BABIP:
expected BABIP = .15 * FB% + .24 * GB% + .73 * LD%
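
As a rough illustration, the weighted sum above can be sketched in a few lines of Python. The function name and the sample pitcher are made up for the example; only the three hit rates come from the 2007 figures listed above, and the shares are assumed to be fractions of balls in play (0.37 rather than 37).

```python
# Hit rates by batted-ball type, taken from the 2007 figures quoted above.
BABIP_BY_TYPE = {"fb": 0.15, "gb": 0.24, "ld": 0.73}


def expected_babip(fb_pct, gb_pct, ld_pct):
    """Weight each batted-ball share (a fraction such as 0.37) by its hit rate."""
    return (BABIP_BY_TYPE["fb"] * fb_pct
            + BABIP_BY_TYPE["gb"] * gb_pct
            + BABIP_BY_TYPE["ld"] * ld_pct)


# Hypothetical pitcher: 37% fly balls, 44% ground balls, 19% line drives.
print(f"{expected_babip(0.37, 0.44, 0.19):.3f}")
```

With that made-up batted-ball mix, the result lands right around the usual .300 baseline.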

For more accuracy, you could remove home runs from the batted ball percentages at a rate of 92% from fly balls and 8% from line drives. You could even account for infield fly balls and remove them from total fly balls, but the formula above will get you pretty far.
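
Here is one way the home run adjustment might look in Python. It assumes you start from raw batted-ball counts rather than percentages, and the function name and sample counts are hypothetical. It only recomputes the batted-ball shares; whether the same hit rates should then be applied to the adjusted shares is left open here.

```python
def shares_without_home_runs(fb, gb, ld, hr):
    """Return (FB%, GB%, LD%) as fractions after taking home runs out.

    Home runs are removed at the rates given in the text:
    92% from fly balls and 8% from line drives.
    """
    fb_in_play = fb - 0.92 * hr
    ld_in_play = ld - 0.08 * hr
    total = fb_in_play + gb + ld_in_play
    return fb_in_play / total, gb / total, ld_in_play / total


# Hypothetical season: 200 fly balls, 220 ground balls, 100 line drives, 25 home runs.
fb_pct, gb_pct, ld_pct = shares_without_home_runs(200, 220, 100, 25)
print(f"FB% {fb_pct:.3f}  GB% {gb_pct:.3f}  LD% {ld_pct:.3f}")
```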

A couple of years ago, Dave Studeman calculated that adding .12 to LD% was good enough for a ballpark estimate of a player’s expected BABIP. This is what you’ll often see writers on FanGraphs refer to as xBABIP.
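
The shortcut is simple enough to do in your head, but for completeness here is a small Python sketch. The function name is made up for the example, and LD% is assumed to be a fraction (0.22 rather than 22).

```python
def quick_xbabip(ld_pct):
    """Ballpark xBABIP: line drive rate (as a fraction) plus .12."""
    return ld_pct + 0.12


# A pitcher with a 22% line drive rate.
print(f"{quick_xbabip(0.22):.3f}")
```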

The best way to use this statistic is to validate a pitcher’s current BABIP. For instance, a pitcher might have a high line drive percentage and a high BABIP. This would give the pitcher a high xBABIP as well, and you could say: “Yes, his high line drive percentage is responsible for his high BABIP.”

While this is useful for looking at past performance, the difference between xBABIP and BABIP should not be used to evaluate future performance. This is because LD% and BABIP are somewhat independent of each other. While there is some correlation between LD% and BABIP, it isn’t enough to suggest that they will always track each other.

LD% is itself highly variable. It would be difficult to say that a pitcher with a BABIP of .300 and a LD% of 22% (xBABIP of .340) should do considerably worse going forward, because you really don’t know what his LD% is going to be the rest of the season. His xBABIP of .340 describes what his BABIP was expected to be given the line drives he has already allowed; it is not his expected BABIP for the future. Typically, a pitcher’s expected BABIP in the future will be around the original baseline of .300.

8 Comments

Brett

17 years ago

Cool.

So one of the main points Voros originally made is that BABIP correlates very poorly from year to year. But we see that there is a relationship between BABIP and hit type, specifically LD%, as its coefficient makes it the dominant term in your equation. Does this mean that LD% doesn’t correlate well year to year?
