BABIP Park Factors and the Batted Ball Connection

by Steve Staude

March 27, 2013

Some of you may recall that before being promoted from a FanGraphs Community Research writer to an actual FanGraphs writer, my primary focus was on the relationship between batted ball types (infield fly balls, in particular) and BABIP for pitchers. At the time, I’d been leaving park factors out of the equation in a [vain] attempt to keep things simple, but now I want to give them a bit of attention.

Now, Guts! is a great resource on FanGraphs, but it does leave out BABIP, HR/FB, and — I believe — something else I’d like to talk to you about for a second. If you’re a big fan of the batted ball stats here, this bit of information might be completely earth-shattering, leaving you sobbing in a heap on the floor, pondering how your life will never be the same again: IFFB% may not mean what you think it does. Now, FB%, for example — that’s defined as fly balls divided by batted balls, right? Many of us might therefore assume that IFFB% equals infield fly balls divided by batted balls… but it doesn’t. IFFB% is actually infield fly balls divided by fly balls. This means that IFFB% doesn’t tell you much about a player unless you have the context of his FB% to go with it. It also means that IFFB% * FB% equals what you probably thought IFFB% was, which is IFFB/(Batted Balls).

Hopefully you’ll be able to read this clearly through your tears: I’m going to introduce a new, not-officially-FanGraphs-sanctioned term here: IFFB% * FB% = PU%. PU%, or popup percentage — again, it’s what you probably thought IFFB% meant — is the percentage of batted balls that are infield flies. This leaves OFFB%, or outfield fly balls, as the remainder of FB% (i.e., FB% = PU%+OFFB%).
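For anyone who wants to see the arithmetic, here is a minimal sketch of that split. The function name and the sample rates are made up for illustration; only the definitions (IFFB% = infield flies per fly ball, PU% = IFFB% * FB%, FB% = PU% + OFFB%) come from the text above.

```python
def popup_rates(fb_pct, iffb_pct):
    """Split FB% into PU% and OFFB%, all as fractions of batted balls.

    fb_pct   -- fly balls / batted balls
    iffb_pct -- infield fly balls / fly balls (the FanGraphs IFFB% definition)
    """
    pu_pct = iffb_pct * fb_pct   # infield flies per batted ball
    offb_pct = fb_pct - pu_pct   # whatever is left of FB% is outfield flies
    return pu_pct, offb_pct


# Hypothetical pitcher: 35% fly balls, 10% of which are infield flies.
pu, offb = popup_rates(fb_pct=0.35, iffb_pct=0.10)
print(f"PU% = {pu:.1%}, OFFB% = {offb:.1%}")
```

Note that two pitchers with the same IFFB% can have very different PU% figures if their FB% differs, which is the whole reason IFFB% needs that context.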

So, without further ado, here’s a sortable list of the park factors I came up with for the 2009-2012 seasons, with the exceptions noted at the bottom:

Team	BABIP	GB/FB	LD%	GB%	FB%	IFFB%	PU%	OFFB%	HR/FB
Angels	98.4	98.0	99.3	99.2	101.3	95.4	96.5	101.8	93.8
Astros	99.4	99.8	98.7	100.3	100.4	98.4	98.9	100.6	103.2
Athletics	98.0	98.9	101.2	99.0	100.5	102.7	103.2	100.2	91.5
Blue Jays	99.7	99.6	100.0	99.7	100.3	98.6	98.9	100.4	105.9
Braves	100.6	100.7	101.8	99.9	99.2	95.8	95.0	99.6	97.7
Brewers	99.5	96.0	98.7	98.4	102.8	98.2	100.8	103.0	109.3
Cardinals	99.3	102.8	101.4	100.7	98.1	102.1	100.1	98.0	91.8
Cubs	100.9	98.4	99.6	99.4	101.0	99.2	100.2	101.1	99.5
Diamondbacks	102.3	100.4	101.5	99.8	99.4	97.5	96.9	99.7	104.1
Dodgers	98.8	100.4	97.8	100.8	100.2	109.5	109.4	99.3	98.9
Giants	99.8	106.6	100.4	102.7	96.6	101.3	97.9	96.5	90.5
Indians	98.9	104.5	101.0	101.7	97.5	101.9	99.3	97.3	96.3
Mariners	98.2	99.4	101.1	99.4	100.2	103.9	104.1	99.8	90.4
Marlins **
Mets ***	98.3	96.6	97.2	99.2	102.5	107.0	110.0	101.8	92.6
Nationals	99.5	99.4	99.3	99.9	100.5	98.9	99.4	100.6	99.3
Orioles	101.4	99.8	98.6	100.1	100.6	95.2	95.7	101.1	108.9
Padres	96.6	102.4	97.2	101.7	99.4	96.4	95.9	99.8	89.6
Phillies	99.5	100.4	100.8	100.1	99.6	97.7	97.3	99.8	103.4
Pirates	98.2	103.4	100.4	101.3	98.2	96.2	94.5	98.6	90.6
Rangers	102.2	97.7	103.8	98.0	100.3	96.6	96.7	100.6	109.9
Rays	98.3	96.5	99.1	98.4	102.3	112.2	115.1	101.0	93.6
Red Sox	104.3	100.7	101.1	100.1	99.4	103.7	103.1	99.0	97.0
Reds	100.2	99.6	100.5	99.7	100.1	105.3	105.5	99.6	115.5
Rockies	105.5	103.8	104.1	100.5	97.2	90.6	88.3	98.2	115.5
Royals	101.8	103.4	97.7	102.1	98.7	91.1	90.0	99.7	91.3
Tigers	99.8	98.7	97.9	99.9	101.2	104.2	105.5	100.8	96.3
Twins *	101.1	103.4	102.9	100.9	97.4	102.8	100.0	101.0	94.5
White Sox	99.7	95.2	99.7	97.8	102.8	97.3	99.9	103.2	113.6
Yankees	98.9	99.0	98.2	99.9	101.0	102.9	103.9	100.7	112.7

* Twins’ factors based on 2010-2012 data only

** Marlins Park excluded due to 2012 being first year (insufficient sample size)

*** Citi Field’s walls were moved closer in 2012

Park factors are halved (based on the assumption that a player will play half of their games there).

If you hadn’t heard, the Padres and Mariners will be moving the fences in a bit this year, by the way. My apologies if I neglected to mention any significant park dimension changes that happened between 2009 and 2012.

If you’re like me, you might find this table interesting; if you’re a normal person, skip right ahead:

Correlations Between Park Factors

	BABIP	GB/FB	LD%	GB%	FB%	IFFB%	PU%	OFFB%
GB/FB	0.198
LD%	0.544	0.296
GB%	0.003	0.922	-0.087
FB%	-0.322	-0.966	-0.527	-0.799
IFFB%	-0.389	-0.249	-0.228	-0.165	0.274
PU%	-0.432	-0.501	-0.357	-0.378	0.534	0.959
OFFB%	-0.167	-0.867	-0.366	-0.753	0.863	0.009	0.258
HR/FB	0.450	-0.342	0.205	-0.435	0.257	-0.239	-0.137	0.306

An obligatory refresher for those who haven’t taken statistics in a while (or ever): correlation coefficients (“r”) range between -1 and 1. A correlation of “0” means the two factors being compared have no apparent connection, whereas “1” indicates the two factors move together in a perfectly linear way, and “-1” means they move perfectly linearly in opposite directions.
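If you would rather see "r" computed than described, here is a small sketch of the standard Pearson formula in plain Python. The function name is my own, and the five parks are just a subset of the table above used for illustration, so the result will not match the 29-park figure in the correlation table.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two series around their means...
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ...scaled by how much each series moves on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# LD% and BABIP factors for Rockies, Rangers, Padres, Dodgers, Royals
# (copied from the park factor table; a subset, for illustration only).
ld_factors = [104.1, 103.8, 97.2, 97.8, 97.7]
babip_factors = [105.5, 102.2, 96.6, 98.8, 101.8]
print(round(pearson_r(ld_factors, babip_factors), 3))
```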

I bolded the connections I thought were the most interesting. Now, to discuss them more in-depth:

High LD% factor = high BABIP factor

This should come as no surprise to those of you who read my first Community article, in which I pointed out LD% and [what I’m now calling] PU% as the two main factors for explaining pitcher BABIPs. LD% for a pitcher is hard to predict from year-to-year, and park factors aren’t entirely consistent on a yearly basis either, but many of the line drive park factors do make a lot of sense, and you can reasonably expect the factors behind them to exert their influence yearly.

Specifically, let’s look at the top two parks in terms of high LD% — Colorado and Texas. What do they have in common? Well, the most obvious is thin air; Colorado due to its altitude, and Texas presumably due to heat and perhaps dryness. Thin air, of course, offers less resistance to a batted ball, but it also should theoretically allow for less break on pitches. Most of the stadia (that’s fancy talk for “stadiums”) at the low end of the list also make sense, having thick marine air. KC is an exception… but then again, its BABIP factor isn’t in-line with those of its surrounding teams on the LD% list. This could have something to do with scorer’s bias issues, such as the one discussed here. Another possible contributor to LD% differences is the batter’s eye in each stadium.

High PU% = low BABIP

This shouldn’t be a shocker to those of you who’ve read my previous work. Popups are pretty close to automatic outs. Let’s talk about how stadium characteristics might influence PU%. The first thing that comes to mind is that a greater amount of foul territory should lead to a higher PU%; that’s because a foul IFFB is only recorded if caught.

Another possible factor, judging by the Rays’ home field being firmly at the top of the list, is the dome factor. You might think the whitish background of the dome against a popup might not be so conducive to catching it, but perhaps the lack of sun and wind helps to make up for that. And it’s not like fielders in non-domed parks never have to deal with whitish backgrounds — clouds and haze are a thing, after all.

High HR/FB equals high BABIP, high OFFB%, and low PU%?

It’s worth reminding you at this point that home runs are excluded from consideration in BABIP, but not in batted ball stats. That’s one reason why fly ball pitchers tend to have lower BABIPs — they may allow more HR, but those don’t count as a knock against their BABIPs. The other reason is that fly balls, especially popups, make for easier putouts.

So, if HR aren’t part of BABIP, why would HR/FB have an apparent strong-ish connection to BABIP? The most obvious explanation is that a high HR/FB is a sign of harder contact being made, for whatever reason, which you might expect to lead to a higher BABIP. Of course, you would also expect a higher HR/FB in small stadia, where perhaps more balls are bouncing uncatchably off of outfield walls and dropping for hits.

Now, PU% and OFFB% generally move together, both moving against BABIP, whereas HR/FB moves together with BABIP. That’s why I found it interesting that HR/FB splits the two, correlating positively with OFFB% (0.306) and negatively with PU% (-0.137). I think that’s easy to explain in the context of an individual pitcher, but maybe not so much in the context of park factors. I’d like to hear your theories on it.

Oh, but before we make too much of this, I should tell you that the HR/FB factor appears to be the most prone to fluctuation of the bunch.

Putting it all together, kind of

As I am wont to do, I’ve regressed some of the various park factors to see how they might be able to explain each park’s BABIP factor:

BABIP = 0.48*LD% + 0.37*GB% – 0.05*PU% + 0.11*OFFB% + 0.09*HR/FB

The formula itself is, since it only applies to park factors, as useful as a poopie-flavored lollipop (Patches O’Houlihan) but it does have a 0.696 correlation to a stadium’s BABIP factor, meaning it can explain nearly half of the differences in BABIP factors (with a 0.484 R-squared). The park it has the hardest time explaining — by far — is Fenway, no doubt largely thanks to The Green Monster’s extreme BABIP-boosting ways. Take Boston out of the mix, and the correlation shoots to 0.772 (0.596 R-squared). Remove the second-biggest outlier, Kauffman Stadium in KC (with its suspiciously-low LD% factor) and the correlation goes to 0.816, explaining 2/3 of the differences. The exclusion of these outliers lends itself to the creation of a formula not tainted by them, which you probably don’t care about, yet here it is anyway:

BABIP = 0.52*LD% + 0.28*GB% – 0.03*PU% + 0.12*OFFB% + 0.11*HR/FB

That one achieves a 0.829 correlation to the remaining BABIP park factors (0.687 R-squared).

That can be whittled down to:

BABIP = 0.552*LD% + 0.320*GB% + 0.124*HR/FB

…which has a 0.821 correlation to BABIP factor, but if you remove any of those three factors, the correlation takes a major hit (though PU% and OFFB% together can mostly compensate for the loss of GB%).
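As a rough illustration of what the first (all-parks) formula is doing, here is a sketch that plugs two rows of the park factor table into it. The coefficients and park factors are copied from the article; the function and variable names are my own, and I am assuming the formula has no intercept, since it is written without one and its coefficients sum to 1.00.

```python
# Coefficients from the all-parks formula (assumed to have no intercept).
COEFFS = {"LD%": 0.48, "GB%": 0.37, "PU%": -0.05, "OFFB%": 0.11, "HR/FB": 0.09}

# Park factors copied from the table above.
PARKS = {
    "Rockies": {"BABIP": 105.5, "LD%": 104.1, "GB%": 100.5,
                "PU%": 88.3, "OFFB%": 98.2, "HR/FB": 115.5},
    "Red Sox": {"BABIP": 104.3, "LD%": 101.1, "GB%": 100.1,
                "PU%": 103.1, "OFFB%": 99.0, "HR/FB": 97.0},
}


def predicted_babip_factor(factors):
    """Weighted sum of a park's batted ball factors using COEFFS."""
    return sum(coef * factors[name] for name, coef in COEFFS.items())


def r_squared(r):
    """Share of variance explained, given a correlation coefficient."""
    return r ** 2


for park, factors in PARKS.items():
    pred = predicted_babip_factor(factors)
    miss = factors["BABIP"] - pred
    print(f"{park}: predicted {pred:.1f}, actual {factors['BABIP']:.1f}, miss {miss:+.1f}")

print(f"r = 0.696 -> R-squared = {r_squared(0.696):.3f}")
```

The formula lands at about 103.9 for Coors (actual 105.5) but only about 100.0 for Fenway (actual 104.3), which fits with Boston being the park it has the hardest time explaining.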

I haven’t talked about what might contribute to a park’s GB% factor… well, groundskeeping might have a bit to do with it, but my guess is that it’s mainly due to less foul territory, and therefore fewer easy foul ball outs.

Methodology

For those who are curious, the formula I used to calculate each factor was:

(Home Pitching + Home Batting) / (Away Pitching + Away Batting) * 100

… which is a pretty standard park factor formula. I then halved it like so: 0.5 + Factor/2 (with the factor expressed as a ratio, i.e., before multiplying by 100) … this is based on the assumption that the player plays half their games away at a neutral-factor stadium. When you consider that some teams play in divisions full of non-neutral opponent stadiums (e.g., Texas faces a bunch of pitcher’s parks), that’s probably not such a safe assumption to make, buuut it’s how park factors are done, and it’s a topic for a different conversation.
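Here is a minimal sketch of that calculation, assuming the four inputs are rate stats (say, BABIP) for the team's pitchers and batters at home and on the road. The function name and the sample numbers are hypothetical; they are not taken from the actual 2009-2012 data.

```python
def park_factor(home_pitching, home_batting, away_pitching, away_batting):
    """Return (raw, halved) park factors on the 100 = neutral scale.

    raw    -- (Home Pitching + Home Batting) / (Away Pitching + Away Batting) * 100
    halved -- raw factor pulled halfway toward neutral, i.e. 0.5 + ratio / 2,
              on the assumption that half of a player's games are played in
              neutral parks
    """
    ratio = (home_pitching + home_batting) / (away_pitching + away_batting)
    halved_ratio = 0.5 + ratio / 2
    return ratio * 100, halved_ratio * 100


# Hypothetical team: BABIP allowed and BABIP hit, at home and away.
raw, halved = park_factor(
    home_pitching=0.310, home_batting=0.312,
    away_pitching=0.295, away_batting=0.297,
)
print(f"raw factor {raw:.1f}, halved factor {halved:.1f}")
```

Because the same pitchers and batters appear in both the numerator and the denominator, the team's own quality should largely wash out of the ratio, leaving mostly the park (plus whatever bias comes from the road schedule).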

45 Comments

RMD

12 years ago

Does anyone know why Wrigley’s HR/FB is so much lower than Comiskey’s? Are the fences that different? Do they get more wind on the South Side?

Steve Staude

Reply to RMD

Yeah, that’s an odd one. Wind direction could have something to do with it — the two parks are oriented differently.

Another possibility is that the Cubs play more games in HR/FB-happy parks than the White Sox do, making their park look less homer-friendly by comparison. It’s perhaps the biggest potential flaw of park factors in general.

studes

Reply to Steve Staude

No magic. It’s the fences. See:

http://www.hardballtimes.com/main/article/home-run-park-factor-a-new-approach/

Reply to studes

Thanks, Studes. Hmm, well, the parks are similar in size (though Wrigley is deeper at the corners, shallower at the power alleys). Wall heights are different, though (maybe 12′ at Wrigley, including the fence, vs. 8′ at US Cellular, from the pictures I’m seeing). http://www.andrewclem.com/Baseball/Dimensions.html

I’m definitely a fan of Greg Rybarczyk’s idea of trying out a “calibrated hitting machine” in all the parks.

ralph

Are pitcher PAs excluded from this analysis? I don’t know how big of an impact they would make, but I’d feel pretty confident that HR/FB will be lower for pitchers hitting than for non-pitchers hitting.

Reply to ralph

They’re not excluded, but since the formula to calculate the park factor is all relative (Home Pitching HR/FB + Home Batting HR/FB)/(Away Pitching HR/FB + Away Batting HR/FB), it cancels out.
