A Few Different Ways To Look at The Steroid Era, Graphically

Trying to point out where the steroid era begins and ends, using data, is not as easy as you might think. While there are substances you can take to increase muscle mass, and that has direct consequences on athleticism, there just isn’t a clear moment where the numbers pinpoint the beginning of steroid usage in Major League Baseball. That might be because the pitchers were on them too, or because the steroid era reaches further back than we suppose, or because it continues further into today’s game than we prefer to think.

Let’s start with something simple. Home runs per balls in play back to 2000, noting that 2005 was when the CBA language introducing testing was agreed upon:

[Chart: HR/BiP since 2000]

So the steroid agreement totally nailed it. Helped reduce home runs off balls in play by 0.5%. *Does the cleaning up the game dance.*

Less facetiously: this doesn’t seem like a big difference in homer rates. Physicist Alan Nathan penned a paper in 2009 suggesting that a 10% increase in muscle mass could beget a 30-70% increase in home runs per ball in play, by way of increased bat speed. But even if the increase had been on the smaller side, say 30%, we could have expected home runs per ball in play to fall to about 3.1% once testing arrived. Which we didn’t see.
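The arithmetic behind that expectation can be sketched in a few lines. This is only an illustration: the 4.0% starting rate is an assumed round number rather than a value read off the chart, and the sketch assumes the boost applied to the whole league, which it almost certainly did not. Only the 30% and 70% figures come from Nathan's range.

```python
def implied_clean_rate(observed_rate, boost):
    """Rate we would expect once a multiplicative boost is taken away.

    If steroids multiplied the home run rate by (1 + boost), then
    dividing the observed rate by (1 + boost) undoes that effect.
    """
    return observed_rate / (1.0 + boost)


# Assumed for illustration only; not a figure taken from the chart.
observed = 0.040

for boost in (0.30, 0.70):
    clean = implied_clean_rate(observed, boost)
    print(f"boost {boost:.0%}: implied clean rate {clean:.2%}")
```

With that assumed starting point, the 30% case lands near 3.1% and the 70% case drops well below it, which is why a half-point dip looks small by comparison.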

But not all of baseball was doing steroids. And there were pitchers doing it too. And where on your body you put that 10% of muscle mass seems very important. And if you zoom out a bit, it gets problematic. Here’s the same graph back to 1994:

[Chart: HR/BiP since 1994]

Oh. Yeah. Now it looks like there’s no real difference between the beginning of this era and the end. Fluctuations of as much as 0.5% must be random, considering the ebb and flow of this graph.

Why cut it off in 1994? Ken Caminiti admitted to doing steroids, and his career started in 1987. Let’s pull this back to 1980:

[Chart: HR/BiP since 1980]

What happened in 1993? Did they disperse needles after the season was over? More likely, it was baseball’s expansion that year (the Rockies and the Marlins) that diluted the pitching pool some. You see a little temporary boost in 1998 when Arizona and Tampa were added to the mix too. And add in the Coors Field effect.

But the Steelers were doing steroids in the 1970s… let’s just do the post-war-time HR/BiP graph, and annotate it with important dates, including rule changes and expansions:

[Chart: HR/BiP since 1950, annotated with rule changes and expansions]

If the zooming in and out didn’t do it for you, perhaps the annotated full version will convince you that this is a very complicated issue. Almost every big change in home run rate has a big change in baseball associated with it, if not two. In 1969, the mound was lowered (good for offense) and four teams were added (good for offense), and the strike zone was changed as well, to eliminate the high strike between the armpits and the shoulders. All of these changes pushed home runs on balls in play from 2.8% to 3.05%.

If there is a suspicious place on this graph, it’s in the mid-eighties. Here’s a table:

Year HR/BiP
1982 2.68%
1983 2.65%
1984 2.63%
1985 2.92%
1986 3.15%
1987 3.67%
1988 2.60%
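
To put the size of that swing in relative terms, here is a small sketch that uses only the values in the table above. The function name is just an illustrative choice.

```python
# HR/BiP by year, in percent, copied from the table above.
hr_per_bip = {
    1982: 2.68,
    1983: 2.65,
    1984: 2.63,
    1985: 2.92,
    1986: 3.15,
    1987: 3.67,
    1988: 2.60,
}


def relative_change(start_year, end_year):
    """Fractional change in HR/BiP between two years in the table."""
    start = hr_per_bip[start_year]
    end = hr_per_bip[end_year]
    return (end - start) / start


print(f"1984 -> 1987: {relative_change(1984, 1987):+.1%}")
print(f"1987 -> 1988: {relative_change(1987, 1988):+.1%}")
```

That works out to a climb of roughly 40% over three seasons, followed by a drop of roughly 29% in a single year.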

If we didn’t know anything about Jose Canseco, maybe we could call it random. But we know there were steroids in baseball in the mid-80s, we know that home runs per ball in play took a lurch forward then, too, we know that this lurch forward was stymied for a while by a 1988 change in the strike zone, and we know that baseball has largely returned to the homer rate that we settled on in 1987.

But before we say that 1987 was when baseball revealed what it would look like in the Steroid Era, we have to at least consider the other possibility. 1987 was the year of the Rabbit Ball, as it was commonly called. Larry Granillo wrote a great piece chronicling the players’ comments about the ball, and linking to prominent writers of the day surmising that there were other reasons. Other reasons, like the idea that pitchers of the late eighties were yuppies lacking a work ethic. Seriously.

Jay Jaffe, in his book Extra Innings, pointed out that baseball switched ball manufacturers in 1977 and that balls of different eras haven’t appeared similar in constitution. You’ll find 1977’s power peak annotated in the graph above as an expansion year, but it seems we could add something about the seams there too.

Hitting a baseball is a complicated mix of athleticism and learned skills. While it might be folly to say that steroids have no effect on the results of that endeavor, it’s also nearly impossible to pinpoint what steroids have meant to baseball’s most hallowed number. The home run rate has changed, drastically at times, over the history of baseball. It’s just that the rules, the strike zone, and even the ball have changed during that time, too. Which effect is to blame for 1987’s surge? Maybe the yuppies.

[My definition of balls in play for the purposes of this article was: BiP = PA – (K+BB+HBP). I used this definition because it included HRs, and I used BiP to try and control for fluctuations in the number of balls put in play.]
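
For anyone who wants to reproduce the rate, the definition above translates directly into code. This is a minimal sketch: the function names are illustrative, and the season totals are made-up numbers, not real league data.

```python
def balls_in_play(pa, k, bb, hbp):
    """BiP = PA - (K + BB + HBP), the definition used in this article.

    Home runs stay in the total because they are not subtracted out.
    """
    return pa - (k + bb + hbp)


def hr_per_bip(hr, pa, k, bb, hbp):
    """Home runs divided by balls in play, as a fraction."""
    bip = balls_in_play(pa, k, bb, hbp)
    if bip <= 0:
        raise ValueError("balls in play must be positive")
    return hr / bip


# Made-up season totals, for illustration only.
season = {"hr": 5000, "pa": 185000, "k": 31000, "bb": 17000, "hbp": 1500}

rate = hr_per_bip(**season)
print(f"HR/BiP: {rate:.2%}")
```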





With a phone full of pictures of pitchers' fingers, strange beers, and his two toddler sons, Eno Sarris can be found at the ballpark or a brewery most days. Read him here, writing about the A's or Giants at The Athletic, or about beer at October. Follow him on Twitter @enosarris if you can handle the sandwiches and inanity.

76 Comments
walt526
10 years ago

Please either set the vertical axis to 0% or provide confidence intervals. These “trends” are very likely just random chance.

Clark Kent
10 years ago
Reply to  Eno Sarris

You have a data set with a reported average, and a whole bunch of data which deviates from that reported average. It’s not a question of whether homers are homers or not, but a question of homoscedasticity.

Brian
10 years ago
Reply to  Clark Kent

Allow me!

Homoscedasticity: the property of having equal statistical variances

http://www.merriam-webster.com/dictionary/homoscedasticity

Dave Cameron's Puppy
10 years ago
Reply to  Clark Kent

Here you actually probably want to use standard error, as you’re trying to measure the difference between the averages in any given year. Standard deviation gives you an idea of the spread of the data, not how accurate your average is.
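
One rough way to see that distinction is sketched below. It assumes every ball in play is an independent trial with the same home run probability, which is a simplification since hitters and parks differ, and the counts are made-up numbers rather than real league totals.

```python
import math


def trial_standard_deviation(hr, bip):
    """Spread of a single ball-in-play outcome (1 = homer, 0 = not)."""
    p = hr / bip
    return math.sqrt(p * (1 - p))


def proportion_standard_error(hr, bip):
    """Uncertainty in the season-wide rate itself.

    This is the per-trial standard deviation divided by sqrt(n),
    so it shrinks as the number of balls in play grows.
    """
    return trial_standard_deviation(hr, bip) / math.sqrt(bip)


def approx_95_interval(hr, bip):
    """Normal-approximation interval: rate +/- 1.96 standard errors."""
    p = hr / bip
    se = proportion_standard_error(hr, bip)
    return p - 1.96 * se, p + 1.96 * se


# Made-up season totals, for illustration only.
hr, bip = 5000, 135500
low, high = approx_95_interval(hr, bip)
print(f"standard deviation per trial: {trial_standard_deviation(hr, bip):.4f}")
print(f"standard error of the rate:   {proportion_standard_error(hr, bip):.5f}")
print(f"approximate 95% interval:     {low:.2%} to {high:.2%}")
```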

Nate
10 years ago
Reply to  walt526

Do you have HR/BiP for each individual player in these data? If so you could bootstrap the league wide HR/BiP to obtain confidence intervals. This could also have the interesting side effect of certain players showing up more often in the extremes of the distribution, possibly suggesting steroid use.
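
A minimal sketch of that bootstrap idea, assuming per-player home run and ball-in-play counts are available. The player data here is randomly generated for illustration, the function name is hypothetical, and the percentile interval is just one simple choice among several bootstrap interval methods.

```python
import random


def bootstrap_league_rate(players, n_resamples=2000, seed=0):
    """Percentile bootstrap interval for league-wide HR/BiP.

    players: list of (hr, bip) tuples, one per player.
    Each resample draws players with replacement and recomputes
    the pooled rate from the drawn players' totals.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_resamples):
        sample = rng.choices(players, k=len(players))
        total_hr = sum(hr for hr, _ in sample)
        total_bip = sum(bip for _, bip in sample)
        rates.append(total_hr / total_bip)
    rates.sort()
    low = rates[int(0.025 * n_resamples)]
    high = rates[int(0.975 * n_resamples) - 1]
    return low, high


# Hypothetical players: random BiP counts and a 3% homer chance per BiP.
demo_rng = random.Random(1)
players = []
for _ in range(300):
    bip = demo_rng.randint(50, 500)
    hr = sum(demo_rng.random() < 0.03 for _ in range(bip))
    players.append((hr, bip))

low, high = bootstrap_league_rate(players)
print(f"bootstrap 95% interval: {low:.2%} to {high:.2%}")
```

Resampling whole players rather than individual balls in play keeps each hitter's own tendencies together, which is the point of the suggestion.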

anothernate
10 years ago
Reply to  Nate

I was thinking the same thing. There is also Deming’s famous paper on treating a census as a sample for theoretical motivation.

Anon
10 years ago
Reply to  anothernate

Do you have a citation on the paper? A quick check of Wikipedia was fruitless without more detail

Dave Cameron's Puppy
10 years ago
Reply to  walt526

Even the most stat minded baseball writers have a long way to go in terms of reporting data. There are never any error bars on any plots here and it drives me nuts.