I’m always looking for new angles to unlock the mysteries of BABIP, so I was intrigued by Jeff Sullivan’s exploration of pull rates against pitchers. So I grabbed the data from baseball-reference.com, and set to work subjecting it to my usual rigmarole of correlations and multiple regressions. You know how they say if your only tool is a hammer, everything looks like a nail to you? Well, plug your ears — there’s about to be a lot of wild, uncontrolled pounding going on in here…
I’ll cut right to the chase — did I find anything interesting relating to pitchers’ overall effectiveness when it comes to their Pull%, Middle%, and Opposite%, as I’m calling them? Well, I found one decent connection that will seem obvious and stupid after you think about it, and a slight but kind of interesting connection. I’ll provide you with some correlation tables that have left few stones unturned. But, mainly, the research might help to set some things straight about how important this stuff actually is for pitchers.
My sample was composed of the career numbers of active pitchers, with a semi-arbitrary minimum of 623 batted balls to analyze per pitcher (long story… not important), which left n=244. Remember that an “AB” in the context of this study refers only to ABs that resulted in batted balls hit against each pitcher, including home runs (but not uncaught foul balls). That means if I’m talking about a batting average, it refers to the batting average on batted balls, not the pitcher’s overall batting average against. It also means strikeouts are out of the equation here. So are walks, so don’t you worry about on-base rates. And of course, it’s also going to mean that AVG and SLG, like Pablo Sandoval, are going to be a lot bigger than you’re used to seeing in baseball.
As Jeff said, hits that are pulled get hit a lot harder than those that aren’t. Jeff explained that this has to do with the bat speed being greater towards the end of the swing, but it probably also has to do with bat angles causing more fly balls to the opposite field, as explained here. As you probably know by now, fly balls have a lower batting average on balls in play, generally being easier putouts (other than home runs, which are removed from the equation for the purposes of BABIP). I believe there’s even more to it than bat speed and angle, which I’ll get to near the end. Anyway, here are the results from my sample:
So it seems balls hit up the middle of the field have the worst results for hitters. This probably has to do with both defensive alignments and the depth of center fields relative to left and right fields.
It’s Correlation Time
So, yes, it matters where the individual ball goes. Does that mean you can say “X is a ‘pull pitcher,’ so his BABIP should be higher”? How much does hit location really matter to a pitcher’s overall results, in the grand scheme of things? To take a deeper look at that, let’s look at the correlation table for my whole sample:
Before we make too much of that, let’s also look at one where the minimum batted balls for inclusion have been raised to 2000 (n=114):
Part of the effect of raising the minimum is the exclusion of one Jeremy Hellickson, the outlier among outliers. But I think the best conclusion that can be drawn from pitchers’ Pull/Mid/Opp breakdowns is this: some pitchers have more balls hit up the middle against them, and these pitchers tend to allow lower ISO rates (ISO, or isolated power, is SLG minus AVG). I think the main reason for this is that home runs are harder to hit to center field, due to the distances involved.
As for BABIP, I’m sorry to say it looks like a dead end, so far. It seems to have no direct connection at all to Pull/Mid/Opp. One thing I wanted to point out, though, is that BABIP is correlated with SLG but not ISO — this is due to BABIP’s only connection to SLG being the AVG component of SLG, which is removed in ISO. Yet ISO is still correlated with AVG. That’s because home runs are a part of both AVG and ISO but not BABIP. Just something to keep in mind when you’re analyzing the trends here.
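To make those relationships concrete, here is a minimal Python sketch of the four rate stats as this study defines them, where every “AB” is a batted ball (home runs included, strikeouts and walks excluded). The function name and the hit totals are made up for illustration, and the BABIP line is a simplification that ignores sacrifice flies and bunts.

```python
def batted_ball_rates(singles, doubles, triples, homers, batted_balls):
    """Rate stats where every 'AB' is a batted ball (home runs included)."""
    hits = singles + doubles + triples + homers
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    avg = hits / batted_balls
    slg = total_bases / batted_balls
    iso = slg - avg  # extra bases per batted ball; the AVG part of SLG is removed
    # BABIP drops home runs from both the hits and the opportunities
    # (simplified: sacrifice flies and bunts are ignored here).
    babip = (hits - homers) / (batted_balls - homers)
    return {"AVG": avg, "SLG": slg, "ISO": iso, "BABIP": babip}


# Hypothetical pitcher with 1,000 batted balls; the counts are invented.
rates = batted_ball_rates(singles=210, doubles=60, triples=5, homers=35,
                          batted_balls=1000)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Home runs show up in the numerators of AVG and ISO but are subtracted out of BABIP, which is why ISO can track AVG while having nothing to do with BABIP.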
I decided to test out ROE (reached on error) data since it came with the rest of the data. It’s a weak-ish connection, but a pretty consistent one through the different minimum AB samples I ran correlations on (I tried a few more): pitchers who have more batters reach base on error tend to allow lower ISOs. I think the explanation is probably a pretty simple one: more ground balls = fewer extra base hits, plus more errors. I won’t go into that anymore since it’s quite a tangent.
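For anyone who wants to try the same kind of threshold check, here is a rough Python sketch of running one correlation at several minimum batted-ball cutoffs. The field names (`batted_balls`, `roe_rate`, `iso`), the helper names, and the six pitchers are all invented for illustration; none of it is the data used above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_minimum(pitchers, x_key, y_key, minimums):
    """Map each minimum to (sample size, r) for pitchers meeting that minimum."""
    results = {}
    for minimum in minimums:
        sample = [p for p in pitchers if p["batted_balls"] >= minimum]
        if len(sample) < 3:  # too few pitchers for r to mean anything
            continue
        xs = [p[x_key] for p in sample]
        ys = [p[y_key] for p in sample]
        results[minimum] = (len(sample), pearson_r(xs, ys))
    return results


# Invented pitchers, purely to show the mechanics.
pitchers = [
    {"batted_balls": 700, "roe_rate": 0.010, "iso": 0.190},
    {"batted_balls": 900, "roe_rate": 0.014, "iso": 0.170},
    {"batted_balls": 1600, "roe_rate": 0.012, "iso": 0.180},
    {"batted_balls": 2100, "roe_rate": 0.016, "iso": 0.150},
    {"batted_balls": 2500, "roe_rate": 0.011, "iso": 0.185},
    {"batted_balls": 3000, "roe_rate": 0.015, "iso": 0.160},
]

sweep = correlation_by_minimum(pitchers, "roe_rate", "iso", [623, 1500, 2000])
for minimum, (n, r) in sweep.items():
    print(f"min {minimum}: n={n}, r={r:.3f}")
```

If the sign and rough size of r hold up as the minimum rises and the sample shrinks, that is the kind of consistency described above.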
I ran multiple regressions to see if there might be some hidden interactions between Pull/Mid/Opp, but it appears there aren’t.
Righty Batters vs. Lefty Batters
I was curious whether a pitcher’s Pull/Mid/Opp rates had any consistency between righty and lefty batters faced. I also wanted to find out whether each hit location type got different results depending on the batter’s handedness. The results:
|Righty Pull vs. Lefty Pull||0.215||-0.012||0.063||0.248|
|Righty Mid vs. Lefty Mid||0.359||0.168||0.140||0.010|
|Righty Opp vs. Lefty Opp||0.491||0.372||0.394||0.375|
|Righty Pull vs. Lefty Opp||-0.227||0.045||-0.066||-0.210|
|Righty Opp vs. Lefty Pull||-0.161||-0.011||-0.084||-0.195|
Conclusions: by far, the strongest connection here for pitchers overall is between their opposite-field rates and results against righty opposition and those against lefty opposition. To clarify, the results (BABIP, AVG, and SLG) are not at all influenced by the rates (e.g. Pulled Batted Balls by Righties/Total Batted Balls by Righties) in my calculations.
What’s Going on Here?
I’ve taken the liberty of running correlations between Pull/Mid/Opp and pretty much all of the rate stats FanGraphs has for pitchers, to see what odd connections might pop up. It’s a little sloppy, since I’m matching up 2003-2012 FanGraphs data (well, 2007-2012 in the case of PitchF/X data) with career numbers, some of which extend beyond 2003. I grovel at your feet for forgiveness for that, my masters. I’m only giving you correlations above the 0.3 cutoff that is generally regarded as better than “weak.” Here you go:
|Correlation to Pull%|
Main conclusion: throwing hard prevents batters from pulling the ball. Not a shocker. Changeups get pulled.
|Correlation to Middle%|
You won’t find popup percentage, “PU%,” in FanGraphs’ glossaries; I introduced it last week, defining it as IFFB% * FB%, or IFFB/Batted Balls. Pitchers with a lot of infield flies tend not to be up-the-middle types, for whatever reason. Part of that (probably most of it) has to do with all of the foul popups that become “batted balls” when they’re caught. That’s kind of a cause for concern: infield flies are undoubtedly making Pull and Opposite results look better than they really are. I say that because the fact that they were popups was the only thing that made them “in play”; they’d be non-factors otherwise, like foul liners and grounders.
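As a quick check on that PU% definition, here is a small Python sketch showing that IFFB% * FB% collapses to infield flies per batted ball. It assumes IFFB% means infield flies per fly ball and FB% means fly balls per batted ball; the function name and the counts are invented for illustration.

```python
from math import isclose


def popup_rate(infield_flies, fly_balls, batted_balls):
    """PU% computed as IFFB% * FB%."""
    iffb_pct = infield_flies / fly_balls  # share of fly balls that are infield flies
    fb_pct = fly_balls / batted_balls     # share of batted balls that are fly balls
    return iffb_pct * fb_pct


# Invented counts: 40 infield flies among 400 fly balls on 1,000 batted balls.
pu = popup_rate(infield_flies=40, fly_balls=400, batted_balls=1000)

# The fly-ball total cancels out, leaving infield flies per batted ball.
assert isclose(pu, 40 / 1000)
print(f"PU% = {pu:.1%}")
```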
I think we should also conclude that beyond the obvious distance-to-the-centerfield-wall issue, the lower home run rates for “Middle” pitchers have to do with their higher ground ball rates.
Z-Contact%, by the way, goes hand-in-hand with PU%, as I pointed out here.
|Correlation to Opposite%|
Conclusions: well, this is why opposite field rates are more predictable, I’m guessing. See how many fairly strong correlations Opposite% has. Basically, opposite field is where popups and other fly balls tend to end up, but where not many grounders go. A good “rise” on hard, frequently thrown fastballs is how it’s generally done. Pitchers who induce a lot of them tend to throw either over the middle of the plate or on the inside edge, per Zimmerman and Petti’s Edge% numbers.
In the “Explaining Popups” section of a Community Research article I wrote, I concluded that fooling hitters into swinging underneath pitches is the main explanation behind popups, and you’ll notice that the same factors involved in my formula there are very important to Opposite%.
Besides the likely Middle%-HR connection, the most significant thing about Pull%/Middle%/Opposite% breakdowns for pitchers is really the LD%/GB%/PU% connection they translate to. To a defense, a batted ball’s trajectory and speed make a lot more difference to its degree of fielding difficulty than does the third of the field it was hit to (I mean, unless there’s a major shift that it’s beating). Still, I think batted ball direction trends lend some useful context to the complex dynamics that are involved.
Hope you’ve enjoyed this over-analysis!
Steve is a robot created for the purpose of writing about baseball statistics. One day, he may become self-aware, and...attempt to make money or something?