Breaking New Grounders: Ground Balls Since 1950

Recently, in these pages, I’ve been looking at the relationship between ground-out/air-out ratios (GO/AO) — available in slightly different iterations at MLB.com and Baseball Reference (B-R) — and the ground-ball percentages (GB%) we host here at FanGraphs. (See Part One and Part Two of this most exciting of explorations.)

We know that GB%s measure the percentage of all batted balls that were hit on the ground and that GO/AOs are a ratio only of ground-outs to air-outs (with the Retrosheet data from B-R appearing to include line-outs as part of the air-out data and MLB appearing to exclude line-outs entirely). Despite this difference, it appears as though the correlation between GB% and GO/AO, especially the iteration available at B-R, is rather strong. Plotting actual GB%s against “expected GB%s” (xGB%), that is, the expected GB% using only GO/AO as the input, gives us a correlation coefficient of .98 for qualified pitchers from 2002 to 2010. The root mean square error (RMSE) between the GB%s and xGB%s is a mere 1.4%.
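For anyone who wants to run the same kind of check on their own numbers, here is a minimal Python sketch of the two measures mentioned above: the correlation coefficient and the RMSE. The function names and the four sample values are invented for illustration. They are not the 2002-2010 pitcher data, and this is not necessarily how the figures above were produced.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(actual, expected):
    """Root mean square error between actual and expected values."""
    n = len(actual)
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, expected)) / n)


# Made-up GB% and xGB% values for four hypothetical pitchers (NOT real data).
actual_gb = [40.0, 45.0, 50.0, 55.0]
expected_gb = [41.0, 44.0, 51.0, 54.0]

r = pearson_r(actual_gb, expected_gb)
error = rmse(actual_gb, expected_gb)
print(f"r = {r:.3f}, RMSE = {error:.2f} percentage points")
```

A high correlation alongside a low RMSE is what you'd want to see before trusting xGB% as a stand-in for GB%. The first says the two move together; the second says they also land close to each other in absolute terms.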

Because the data at B-R goes back to 1950, what started off merely as an attempt to provide a reference for people who had access to GO/AO but not GB% has potentially become an opportunity to estimate with some accuracy the GB%s of pitchers going back to the middle of last century.

“OMG” is what you’re saying to yourself, I’m sure.

Because I’m not very smart, this work is unfolding somewhat slowly — and with no little help from statistical luminaries such as Tangotiger, Colin Wyers, and Eric Van.

What I’ve done since last time is apply our equation to the average MLB GO/AOs at B-R for every year going back to 1950.

Here’s what it looks like:

A couple points on that:

1. The average xGB% for all years combined since 1950 is 43.7% — i.e. very, very close to the 43.8% mark for the years (2002-10) for which we have actual ground-ball data.

2. The graph raises an immediate question: Why do the xGB%s rise so dramatically from 1950 to the mid-1960s?

Here are some guesses at an answer to the question in point No. 2:

1. The data is somehow bad. For whatever reason, more air-outs or fewer ground-outs appear in the data than actually exist, thus skewing GO/AO numbers and producing lower xGB%s.

2. Pitchers were actually inducing fewer grounders.

3. Pitchers were inducing the same number of grounders, but infielders were converting fewer of those grounders into outs.

4. Perhaps owing to larger parks or weaker batters or different hitting approaches or some other reason, a greater percentage of balls hit in the air were caught by fielders.

We can sort of test this last point by looking to see if there’s any relationship between an increase in home runs and a decrease in the number of air-outs recorded relative to ground-outs.

First, here’s a graph of home runs per batted ball since 1950 (where “batted ball” is actually defined, for purposes of ease in finding the data, as any plate appearance that ended in neither a walk nor strikeout):
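That working definition of “batted ball” is simple enough to write down. The sketch below assumes you have season totals for home runs, plate appearances, walks, and strikeouts. The function name and the sample totals are made up for illustration and are not real league figures.

```python
def hr_per_batted_ball(home_runs, plate_appearances, walks, strikeouts):
    """Home runs per 'batted ball', using the shortcut definition above:
    any plate appearance that ended in neither a walk nor a strikeout."""
    batted_balls = plate_appearances - walks - strikeouts
    if batted_balls <= 0:
        raise ValueError("no batted balls left after removing walks and strikeouts")
    return home_runs / batted_balls


# Invented season totals, for illustration only (NOT real league data).
rate = hr_per_batted_ball(
    home_runs=100, plate_appearances=6000, walks=500, strikeouts=1000
)
print(f"HR per batted ball: {rate:.4f}")
```

Because the denominator is just plate appearances minus walks and strikeouts, it still counts things like hit-by-pitches as “batted balls.” That's the price of the shortcut.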

We see here a spike in that early part where we were getting lower-than-expected xGB%s.

On the other hand, when we plot our HR/Batted Ball data against all xGB%s, we find basically no correlation at all.

Regard:

Is it possible that an increase in HR per Batted Ball explains the early spike (1950-1964) but does little to account for changes after that?

Maybe, actually, yes:

If I’m concluding anything from this, it’s that I’m wary of making any statements about pitcher xGB%s prior to 1964. On the other hand, the GO/AO numbers since 1964 appear to be mostly stable, producing xGB%s neither greater nor less than present-day rates by more than 1%.
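The era split discussed above (1950-1964 versus everything after) amounts to computing the HR/Batted Ball-to-xGB% correlation separately for each range of years. Here is a rough Python sketch of that idea. The `seasons` list is entirely invented, built so that the early years line up and the later ones don't, and the function names are hypothetical. It shows the mechanics only and says nothing about the real numbers.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_for_years(seasons, first_year, last_year):
    """Correlate HR per batted ball with xGB% for seasons in [first_year, last_year].

    `seasons` is a list of (year, hr_per_batted_ball, xgb_pct) tuples.
    """
    subset = [(hr, xgb) for year, hr, xgb in seasons
              if first_year <= year <= last_year]
    hr_rates = [hr for hr, _ in subset]
    xgb_pcts = [xgb for _, xgb in subset]
    return pearson_r(hr_rates, xgb_pcts)


# Entirely made-up league averages (NOT real data), chosen to show a strong
# negative relationship early and none later.
seasons = [
    (1950, 0.030, 40.0),
    (1955, 0.034, 39.0),
    (1960, 0.032, 39.5),
    (1964, 0.028, 40.5),
    (1970, 0.025, 43.5),
    (1980, 0.030, 44.5),
    (1990, 0.025, 44.5),
    (2000, 0.030, 43.5),
]

r_early = correlation_for_years(seasons, 1950, 1964)
r_late = correlation_for_years(seasons, 1965, 2010)
print(f"1950-1964: r = {r_early:.2f}; 1965-2010: r = {r_late:.2f}")
```

A relationship that holds in one stretch of years can wash out when every season is lumped together, which is why it's worth checking the ranges separately.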





Carson Cistulli has published a book of aphorisms called Spirited Ejaculations of a New Enthusiast.

18 Comments
tangotiger
14 years ago

Carson: what if you focus ONLY on the Dodger games in the pre-1966 time period? That data is apparently very solid, since we even have pitch data there. I’d like to see if there’s a difference.

tangotiger
14 years ago

Retrosheet. What I mean to ask is if you repeat your xGB% chart (your first one), but ONLY with the Dodger games. Do you still get the same pattern or does it look smoother?