An Unsolicited Follow-Up Study of Pull%

by Steve Staude

April 2, 2013

I’m always looking for new angles to unlock the mysteries of BABIP, so I was intrigued by Jeff Sullivan’s exploration of pull rates against pitchers. So I grabbed the data from baseball-reference.com, and set to work subjecting it to my usual rigmarole of correlations and multiple regressions. You know how they say if your only tool is a hammer, everything looks like a nail to you? Well, plug your ears — there’s about to be a lot of wild, uncontrolled pounding going on in here…

I’ll cut right to the chase — did I find anything interesting relating to pitchers’ overall effectiveness when it comes to their Pull%, Middle%, and Opposite%, as I’m calling them? Well, I found one decent connection that will seem obvious and stupid after you think about it, and a slight but kind of interesting connection. I’ll provide you with some correlation tables that have left few stones unturned. But, mainly, the research might help to set some things straight about how important this stuff actually is for pitchers.

My sample was composed of the career numbers of active pitchers, with a semi-arbitrary minimum of 623 batted balls per pitcher (long story, not important), which gave n=244. Remember that an “AB” in the context of this study refers only to ABs that resulted in batted balls hit against each pitcher, including home runs (but not uncaught foul balls). That means if I’m talking about a batting average, it refers to the batting average on batted balls, not the pitcher’s overall batting average against. It also means strikeouts are out of the equation here. So are walks, so don’t you worry about on-base rates. And of course, it’s also going to mean that AVG and SLG, like Pablo Sandoval, are going to be a lot bigger than you’re used to seeing in baseball.
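To make the batted-ball-only definitions concrete, here is a minimal Python sketch. The class name, field names, and the sample totals are all hypothetical, and the BABIP formula ignores sacrifice flies for simplicity, which is an assumption rather than something stated in the study.

```python
from dataclasses import dataclass


@dataclass
class BattedBallLine:
    """Batted-ball totals against one pitcher (hypothetical field names)."""
    at_bats: int      # ABs that ended in a batted ball, home runs included
    hits: int         # all hits, home runs included
    home_runs: int
    total_bases: int


def avg(line: BattedBallLine) -> float:
    # Batting average on batted balls only: no strikeouts in the denominator.
    return line.hits / line.at_bats


def slg(line: BattedBallLine) -> float:
    # Slugging on batted balls only.
    return line.total_bases / line.at_bats


def babip(line: BattedBallLine) -> float:
    # Home runs are removed from both hits and at-bats.
    # Assumption: sacrifice flies are ignored in this sketch.
    return (line.hits - line.home_runs) / (line.at_bats - line.home_runs)


# Made-up totals, for illustration only.
example = BattedBallLine(at_bats=1000, hits=320, home_runs=30, total_bases=500)
print(round(avg(example), 3), round(slg(example), 3), round(babip(example), 3))
```

Because home runs count as hits in AVG and SLG but drop out of BABIP, AVG will always sit at or above BABIP for any pitcher who has allowed a home run.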

Some Context

As Jeff said, hits that are pulled get hit a lot harder than those that aren’t. Jeff explained that this has to do with the bat speed being greater towards the end of the swing, but it probably also has to do with bat angles causing more fly balls to the opposite field, as explained here. As you probably know by now, fly balls have a lower batting average on balls in play, generally being easier putouts (other than home runs, which are removed from the equation for the purposes of BABIP). I believe there’s even more to it than bat speed and angle, which I’ll get to near the end. Anyway, here are the results from my sample:

Hit Direction	BABIP	AVG	SLG
Pulled	0.361	0.408	0.726
Middle	0.265	0.285	0.420
Opposite	0.296	0.323	0.507

So it seems balls hit up the middle of the field have the worst results for hitters. This probably has to do with both defensive alignments and the depth of center fields relative to left and right fields.

It’s Correlation Time

So, yes, it matters where the individual ball goes. Does that mean you can say “X is a ‘pull pitcher,’ so his BABIP should be higher”? How much of a difference does hit location really make to a pitcher’s overall results, in the grand scheme of things? To take a deeper look at that, let’s look at the correlation table for my whole sample:

	BABIP	AVG	SLG	ISO	Pull%	Mid%	Opp%
BABIP
AVG	0.939
SLG	0.398	0.679
ISO	-0.025	0.304	0.906
Pull%	-0.147	-0.039	0.229	0.320
Mid%	0.218	0.070	-0.271	-0.392	-0.508
Opp%	-0.043	-0.022	0.003	0.016	-0.610	-0.373
ROE/AB	0.181	0.083	-0.176	-0.276	-0.028	0.189	-0.143

Before we make too much of that, let’s also look at one where the minimum batted balls for inclusion have been raised to 2000 (n=114):

	BABIP	AVG	SLG	ISO	Pull	Mid	Opp
BABIP
AVG	0.911
SLG	0.342	0.686
ISO	-0.019	0.375	0.932
Pull	-0.008	0.132	0.291	0.304
Mid	0.078	-0.157	-0.439	-0.482	-0.494
Opp	-0.067	0.015	0.126	0.153	-0.547	-0.457
ROE/AB	0.117	0.006	-0.236	-0.304	-0.038	0.196	-0.151

Part of the effect of raising the minimum is the exclusion of one Jeremy Hellickson, the outlier among outliers. But I think what the data suggests, overall, is that the best conclusion that can be drawn from pitchers’ Pull/Mid/Opp breakdowns is this: some pitchers have more balls hit up the middle against them, and these pitchers tend to allow lower ISO rates (ISO, or isolated power, is SLG minus AVG). I think the main reason for this is that home runs are harder to hit to center field, due to the distances involved.

As for BABIP, I’m sorry to say it looks like a dead end, so far. It seems to have no direct connection at all to Pull/Mid/Opp. One thing I wanted to point out, though, is that BABIP is correlated with SLG but not ISO — this is due to BABIP’s only connection to SLG being the AVG component of SLG, which is removed in ISO. Yet ISO is still correlated with AVG. That’s because home runs are a part of both AVG and ISO but not BABIP. Just something to keep in mind when you’re analyzing the trends here.
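For readers who want to reproduce this kind of table, here is a rough Python sketch of a Pearson correlation computed over pitchers who clear a minimum batted-ball threshold, with ISO derived as SLG minus AVG. The dictionary keys, the helper names, and every number in the sample list are made up for illustration; this is not the actual data or tooling behind the tables above.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


def correlate(pitchers, stat_a, stat_b, min_batted_balls):
    """Correlate two stats over pitchers meeting the batted-ball minimum."""
    sample = [p for p in pitchers if p["batted_balls"] >= min_batted_balls]
    r = pearson([p[stat_a] for p in sample], [p[stat_b] for p in sample])
    return r, len(sample)


# Hypothetical pitchers; none of these values come from the study.
pitchers = [
    {"batted_balls": 700, "mid_pct": 0.33, "avg": 0.330, "slg": 0.520},
    {"batted_balls": 1500, "mid_pct": 0.36, "avg": 0.320, "slg": 0.480},
    {"batted_balls": 2400, "mid_pct": 0.31, "avg": 0.325, "slg": 0.540},
    {"batted_balls": 3100, "mid_pct": 0.34, "avg": 0.318, "slg": 0.500},
]
for p in pitchers:
    p["iso"] = p["slg"] - p["avg"]  # ISO strips the AVG component out of SLG

r, n = correlate(pitchers, "mid_pct", "iso", min_batted_balls=1000)
print(f"Mid% vs ISO: r={r:.3f} over n={n} pitchers")
```

Raising `min_batted_balls` shrinks the sample, which is why a single outlier can move the coefficients noticeably between the two tables.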

I decided to test out ROE (reached on error) data since it came with the rest of the data. It’s a weak-ish connection, but a pretty consistent one through the different minimum AB samples I ran correlations on (I tried a few more): pitchers who have more batters reach base on error tend to allow lower ISOs. I think the explanation is probably a pretty simple one: more ground balls = fewer extra base hits, plus more errors. I won’t go into that anymore since it’s quite a tangent.

I ran multiple regressions to see if there might be some hidden interactions between Pull/Mid/Opp, but it appears there aren’t.

Righty Batters vs. Lefty Batters

I was curious whether a pitcher’s Pull/Mid/Opp rates had any consistency between righty and lefty batters faced. I also wanted to find out whether each hit location type got different results depending on the batter’s handedness. The results:

	Rate	BABIP	AVG	SLG
Righty Pull vs. Lefty Pull	0.215	-0.012	0.063	0.248
Righty Mid vs. Lefty Mid	0.359	0.168	0.140	0.010
Righty Opp vs. Lefty Opp	0.491	0.372	0.394	0.375
Righty Pull vs. Lefty Opp	-0.227	0.045	-0.066	-0.210
Righty Opp vs. Lefty Pull	-0.161	-0.011	-0.084	-0.195

Conclusions: by far, the strongest connection here is between pitchers’ opposite field rates and results against righty and lefty opposition. To clarify, the results (BABIP, AVG, and SLG) are not at all influenced by the rates (e.g. Pulled Batted Balls by Righties/Total Batted Balls by Righties) in my calculations.

What’s Going on Here?

I’ve taken the liberty of running correlations between Pull/Mid/Opp and pretty much all of the rate stats FanGraphs has for pitchers, to see what odd connections might pop up. It’s a little sloppy, since I’m matching up 2003-2012 FanGraphs data (well, 2007-2012 in the case of PitchF/X data) with career numbers, some of which extend beyond 2003. I grovel at your feet for forgiveness for that, my masters. I’m only giving you correlations, positive or negative, whose size reaches the 0.3 cutoff that is generally regarded as better than “weak.” Here you go:

	Correlation to Pull%
Fastball%	-0.472
wFB/C	-0.441
FBv	-0.437
vFS (pfx)	-0.430
vFA (pfx)	-0.425
FA% (pfx)	-0.415
wFA/C (pfx)	-0.407
HR/FB	0.397
wFB	-0.383
FIP	0.367
CH% (pfx)	0.367
HR/9	0.356
wFA (pfx)	-0.355
CH%	0.350
vCH (pfx)	-0.321
FIP-	0.318
BUH%	0.313
CHv	-0.312

Main conclusion: throwing hard prevents batters from pulling the ball. Not a shocker. Changeups get pulled.
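The screening behind these lists (keep anything whose correlation is at least 0.3 in size, then order by strength regardless of sign) can be sketched in a few lines of Python. The function name is hypothetical, three of the input values are copied from the Pull% list above, and `placeholder_stat` is a made-up entry included only to show something falling below the cutoff.

```python
def strong_correlations(correlations, cutoff=0.3):
    """Keep stats with |r| >= cutoff, strongest first regardless of sign."""
    kept = [(stat, r) for stat, r in correlations.items() if abs(r) >= cutoff]
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)


pull_correlations = {
    "CH%": 0.350,
    "HR/FB": 0.397,
    "placeholder_stat": 0.120,  # hypothetical value, not from the study
    "Fastball%": -0.472,
}

for stat, r in strong_correlations(pull_correlations):
    print(f"{stat}\t{r:.3f}")
```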

	Correlation to Middle%
PU%	-0.532
IFFB%	-0.522
GB%	0.482
FB%	-0.472
GB/FB	0.456
Z-Contact%	0.431
wFS/C (pfx)	0.354
Z-Contact% (pfx)	0.348
SI% (pfx)	0.333
SwStr%	-0.317
HR/9	-0.305
FA-Z (pfx)	-0.304

You won’t find popup percentage, “PU%,” in FanGraphs’ glossaries; I introduced it last week, defining it as IFFB% * FB%, or IFFB/Batted Balls. Pitchers with a lot of infield flies tend not to be up-the-middle types, for whatever reason. Part of that (probably most of it) has to do with all of the foul popups that become “batted balls” when they’re caught. That’s kind of a cause for concern: infield flies are undoubtedly making Pull and Opposite results look better than they really are. I say that because the fact that they were popups was the only thing that made them “in play”; they’d be non-factors otherwise, like foul liners and grounders.
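The PU% definition is simple arithmetic, and a short Python sketch shows why the two forms agree: IFFB% is infield flies per fly ball and FB% is fly balls per batted ball, so the fly ball count cancels. The function name and the counts below are hypothetical.

```python
def popup_rate(infield_flies, fly_balls, batted_balls):
    """PU% as IFFB% * FB%, which reduces to IFFB / batted balls."""
    iffb_pct = infield_flies / fly_balls   # infield flies per fly ball
    fb_pct = fly_balls / batted_balls      # fly balls per batted ball
    return iffb_pct * fb_pct


# Made-up counts: 40 infield flies among 350 fly balls in 1000 batted balls.
pu = popup_rate(infield_flies=40, fly_balls=350, batted_balls=1000)
print(round(pu, 3), round(40 / 1000, 3))
```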

I think we should also conclude that beyond the obvious distance-to-the-centerfield-wall issue, the lower home run rates for “Middle” pitchers have to do with their higher ground ball rates.

Z-Contact%, by the way, goes hand-in-hand with PU%, as I pointed out here.

Moving on:

	Correlation to Opposite%
FA% (pfx)	0.584
wFB/C	0.526
wFA/C (pfx)	0.495
K%	0.492
SI% (pfx)	-0.487
K/9	0.484
Z-Contact%	-0.477
GB%	-0.471
wFB	0.460
PU%	0.453
GB/FB	-0.448
IFFB%	0.436
HR/FB	-0.434
Relieving	0.433
wFA (pfx)	0.429
FB%	0.425
FA-Z (pfx)	0.424
inLI	0.424
Fastball%	0.406
pLI	0.405
Z-Swing%	0.400
Z-Contact% (pfx)	-0.398
FIP-	-0.391
LOB%	0.388
exLI	0.384
gmLI	0.382
H/9	-0.374
Outside%*	-0.373
FIP	-0.370
SIERA	-0.363
LD%	0.362
Zone%	0.362
Swing%	0.354
ERA-	-0.353
BUH%	-0.349
FBv	0.343
ERA	-0.330
tERA	-0.321
wSF	-0.316
Zone% (pfx)	0.315
vFA (pfx)	0.310
Pace	0.308
SwStr%	0.305
CH%	-0.304
K/BB	0.301
Heart%*	0.300

*Bill Petti and Jeff Zimmerman’s Edge%, Heart%, and Outside% are 2008-2012, and aren’t completely finalized as of this writing.

Conclusions: well, this is why opposite field rates are more predictable, I’m guessing; see how many fairly strong correlations they have. Basically, opposite field is where popups and other fly balls tend to end up, but where not many grounders go. A good “rise” on hard, frequently thrown fastballs is how it’s generally done. Pitchers who induce a lot of them tend to throw either over the middle of the plate or on the inside edge, per Zimmerman and Petti’s Edge% numbers.

In the “Explaining Popups” section of a Community Research article I wrote, I concluded that fooling hitters into swinging underneath pitches is the main explanation behind popups, and you’ll notice that the same factors involved in my formula there are very important to Opposite%.

Final Thoughts

Besides the likely Middle%-HR connection, the most significant thing about Pull%/Middle%/Opposite% breakdowns for pitchers is really the LD%/GB%/PU% connection they translate to. To a defense, a batted ball’s trajectory and speed make a lot more difference to its degree of fielding difficulty than does the third of the field it was hit to (I mean, unless there’s a major shift that it’s beating). Still, I think batted ball direction trends lend some useful context to the complex dynamics that are involved.

Hope you’ve enjoyed this over-analysis!

22 Comments

rusty

11 years ago

Would you say that the magnitude of the effects you’ve explored is great enough that pitcher pull tendencies should be considered for infield shifts? Or is the variability there washed out by the identity of the hitter?

Steve Staude

Reply to rusty

Great question. Well, I haven’t given this enough research to say how much of a role the hitter plays in it all, but that’s something I may have to look into.

I think teams should probably just keep doing what I’m sure they’re already doing — looking at the pitcher’s and batter’s spray charts. All I can say is that it probably makes less sense to use an extreme pull shift when the pitcher is a flamethrower with fly ball tendencies.
