Breaking News: Strikeouts Are Bad

March 12, 2020

When I first learned about a mysterious cabal of smart nerds who were analyzing baseball, I took the words I got from them as though passed down from heaven. I read Moneyball, of course. But I also read about DIPS theory, wOBA, and whatever else I could get my hands on. I read The Book so many times I wore it out and had to buy a new copy. It felt like there were cheat codes just under the surface of the sport that someone was highlighting for me.

Many of those lessons from 15 years ago are still kicking around in my head. I’m skeptical of BABIP-driven hitters, perhaps more skeptical than I should be. I dismiss batters with anomalous platoon splits, even if there’s something about them that really does make them unique. And recently I realized that I might be misunderstanding the signaling value of strikeout rate.

Back in the early 2000s, batters who struck out more hit better. That sounds counterintuitive, because strikeouts are bad. It’s actually not that weird though. Barry Bonds struck out more than Ozzie Smith in his career, just to pick two illustrative examples. Bonds isn’t even a great example, because his batting eye was otherworldly. Alex Rodriguez struck out twice as often as Omar Vizquel.

The popular opinion was that strikeouts weren’t really a negative indicator. A strikeout was bad, sure, but it was often a hidden indicator of some positive process under the hood. No one would say that being sore is good for your health, and yet people in great shape are probably sore more often than sedentary types, what with all the exercising. Amount of time spent being sore very likely has a positive correlation with health.

Take a look at a few correlations from 2000 to 2010. These look at every batter with 300 or more plate appearances in a given year:

Various Correlations to wOBA By Year

Year	Strikeouts	Walks	wOBACON
2000	0.054	0.568	0.899
2001	0.043	0.606	0.895
2002	0.066	0.685	0.872
2003	0.037	0.581	0.892
2004	0.026	0.646	0.851
2005	0.121	0.597	0.868
2006	0.074	0.575	0.858
2007	0.080	0.530	0.853
2008	0.081	0.514	0.837
2009	0.041	0.536	0.842
2010	0.177	0.537	0.863

Hey, more strikeouts mean more wOBA. Correlation, as you may have heard, doesn’t equal causation, and the slope was minimal (half a point of wOBA for every percentage point of strikeout rate), but it was there.
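For anyone who wants to try this at home, here is a minimal sketch of how a per-season correlation like the ones above could be computed. The row layout (`year`, `pa`, `k_pct`, `woba`) and the function names are assumptions made for illustration, and the sample rows are invented rather than real player seasons.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def k_rate_vs_woba_by_year(rows, min_pa=300):
    """Correlate K% with wOBA within each season, keeping only qualifying batters."""
    by_year = defaultdict(list)
    for row in rows:
        if row["pa"] >= min_pa:  # the 300 PA cutoff described in the text
            by_year[row["year"]].append((row["k_pct"], row["woba"]))
    return {
        year: pearson([k for k, _ in pairs], [w for _, w in pairs])
        for year, pairs in sorted(by_year.items())
    }


# Invented rows, only to show the assumed input shape (hypothetical field names).
rows = [
    {"year": 2000, "pa": 600, "k_pct": 12.0, "woba": 0.330},
    {"year": 2000, "pa": 550, "k_pct": 18.0, "woba": 0.350},
    {"year": 2000, "pa": 480, "k_pct": 24.0, "woba": 0.370},
    {"year": 2000, "pa": 120, "k_pct": 35.0, "woba": 0.250},  # dropped: under 300 PA
]
result = k_rate_vs_woba_by_year(rows)
```

Swapping `k_pct` for a walk rate or wOBACON column would give the other two columns of the table under the same assumptions.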

Let’s zoom that data out to every year since 2000:

Various Correlations to wOBA By Year

Year	Strikeouts	Walks	wOBACON
2000	0.054	0.568	0.899
2001	0.043	0.606	0.895
2002	0.066	0.685	0.872
2003	0.037	0.581	0.892
2004	0.026	0.646	0.851
2005	0.121	0.597	0.868
2006	0.074	0.575	0.858
2007	0.080	0.530	0.853
2008	0.081	0.514	0.837
2009	0.041	0.536	0.842
2010	0.177	0.537	0.863
2011	0.036	0.519	0.869
2012	-0.006	0.391	0.845
2013	0.007	0.447	0.838
2014	-0.098	0.373	0.800
2015	0.051	0.554	0.838
2016	-0.040	0.447	0.789
2017	-0.063	0.504	0.766
2018	-0.104	0.521	0.784
2019	-0.084	0.406	0.804

Wuh oh. Walks and production on contact are still correlated with positive outcomes, but higher strikeout rates are now associated with lower wOBA. That is unexpected.

Naturally, no one was ever saying that batters should strike out more to improve their outcomes. There was simply covariance (or correlation if you’re a normalizing sort) between striking out and doing good things like walking or smashing the ball. Maybe that’s the change. Did new training methods, or something else I can’t think of, somehow eliminate the link between strikeouts and positive outcomes?

K% Correlation to Positive Outcomes

Year	Walks	wOBACON
2000	0.278	0.427
2001	0.285	0.431
2002	0.198	0.501
2003	0.172	0.427
2004	0.147	0.498
2005	0.244	0.550
2006	0.248	0.522
2007	0.203	0.539
2008	0.283	0.558
2009	0.275	0.516
2010	0.270	0.602
2011	0.259	0.475
2012	0.256	0.478
2013	0.267	0.499
2014	0.184	0.467
2015	0.204	0.542
2016	0.204	0.534
2017	0.155	0.541
2018	0.123	0.478
2019	0.159	0.468

Nope! Those linkages are running strong, even if the correlation to walks has been a little bit wiggly of late. Again, this doesn’t say anything about causality, but it’s not hard to imagine that deeper counts lead to both more walks and more strikeouts, while swinging really, really, ridiculously hard produces more strikeouts and more damage when you do hit the ball. Neither of those fundamental relationships seems to have changed much in the last twenty years.

So what gives? I thought I’d approach the problem using an ad hoc method. I predicted each batter’s wOBA in each year using only their wOBA on contact (wOBACON if you’re hungry). From there, I took the “error,” the difference between the prediction and the batter’s actual wOBA, and compared that error term to a batter’s strikeout rate. This should, in theory, handle the covariance issue; a batter with a high strikeout rate but also high production on contact will have a higher predicted wOBA than one who strikes out less and dinks the ball more, which will get rid of that annoying cross-correlation.
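The two-step approach described above might be sketched roughly like this. It assumes the prediction is a simple one-variable least-squares fit and that the error is actual wOBA minus predicted wOBA; neither detail is spelled out, so treat both as guesses. The toy numbers are built so that K% is unrelated to contact quality, which is not true of real hitters but keeps the arithmetic easy to check.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def k_rate_vs_woba_error(wobacon, woba, k_pct):
    """Step 1: predict wOBA from wOBACON. Step 2: relate the leftover error to K%."""
    intercept, slope = fit_line(wobacon, woba)
    # Assumed sign convention: error = actual - predicted.
    errors = [actual - (intercept + slope * c) for c, actual in zip(wobacon, woba)]
    corr = pearson(k_pct, errors)
    _, woba_per_k_point = fit_line(k_pct, errors)
    return corr, woba_per_k_point


# Synthetic batters: wOBA is built from contact quality minus a fixed K% penalty.
wobacon = [0.30, 0.30, 0.40, 0.40]
k_pct = [10.0, 20.0, 10.0, 20.0]
woba = [0.1 + 0.6 * c - 0.003 * k for c, k in zip(wobacon, k_pct)]

corr, woba_per_k_point = k_rate_vs_woba_error(wobacon, woba, k_pct)
```

On this toy input the error term lines up perfectly with strikeout rate and the recovered slope is the -0.003 penalty that was baked in. Real data is noisier, and K% is correlated with wOBACON, so the numbers would not come out this clean.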

When we look only at how strikeout rate contributes to this error term, we get a real relationship. Strikeout rate is strongly correlated to the gap between predicted and actual production:

K% Correlation to wOBA Error

Year	Correlation	wOBA Per 1% K
2000	-0.754	-.0030
2001	-0.769	-.0029
2002	-0.755	-.0029
2003	-0.762	-.0030
2004	-0.759	-.0029
2005	-0.717	-.0025
2006	-0.726	-.0027
2007	-0.728	-.0027
2008	-0.707	-.0024
2009	-0.729	-.0026
2010	-0.680	-.0023
2011	-0.761	-.0026
2012	-0.767	-.0026
2013	-0.752	-.0026
2014	-0.786	-.0028
2015	-0.739	-.0026
2016	-0.750	-.0027
2017	-0.743	-.0027
2018	-0.771	-.0029
2019	-0.775	-.0028

That correlation, and the slope, are strong and consistent over time. Once you know how well someone does when they put the ball in play, that’s not shocking. For every one point increase in strikeout rate, we’d expect to see a wOBA roughly three points lower on average (assuming contact quality stays constant).

But there’s something weird here. Look at that table again: both the correlation and the slope are consistent over time. Adding strikeouts seems to hurt about as much, after accounting for loudness of contact, as it did in 2000. And yet we saw the evidence above: higher strikeout rates are now a sign of worse batters, not better.

Why is this? It’s because there’s a hidden factor we haven’t yet considered. A single point of strikeout rate might be tied to a similar decline in wOBA, but strikeout rates are far more dispersed now than they’ve ever been. It’s hardly a secret that strikeout rates have crept up over the years; non-pitchers struck out 15.9% of the time in 2000 and 22.4% of the time in 2019. That higher rate comes with wider variation:

K% Variation Over Time

Year	K%	StDev K%	3-Year Avg StDev
2000	15.9%	4.79%	–
2001	16.8%	5.08%	–
2002	16.3%	5.45%	5.11%
2003	15.9%	4.75%	5.10%
2004	16.3%	5.19%	5.13%
2005	16.0%	5.06%	5.00%
2006	16.3%	5.11%	5.12%
2007	16.6%	5.35%	5.17%
2008	17.0%	5.63%	5.36%
2009	17.5%	5.37%	5.45%
2010	18.0%	5.46%	5.49%
2011	18.1%	5.37%	5.40%
2012	19.2%	5.69%	5.51%
2013	19.3%	5.87%	5.64%
2014	19.9%	6.03%	5.86%
2015	19.9%	5.72%	5.87%
2016	20.6%	5.79%	5.85%
2017	21.2%	6.04%	5.85%
2018	21.7%	5.77%	5.87%
2019	22.4%	5.92%	5.91%
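The smoothed column in that table is consistent with a trailing three-year mean of the yearly standard deviations. A small sketch of that calculation, using the first six StDev values from the table (the function name is made up for illustration):

```python
def trailing_average(values, window=3):
    """Mean of each value and the (window - 1) values before it.

    Returns None for positions that do not yet have a full window of history.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(round(sum(chunk) / window, 2))
    return out


# StDev of K% for 2000 through 2005, copied from the table above.
stdev_k_pct = [4.79, 5.08, 5.45, 4.75, 5.19, 5.06]
smoothed = trailing_average(stdev_k_pct)
```

Because the inputs here are already rounded to two decimals, an individual year can land a hundredth off the published column, which was presumably built from unrounded values.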

And that wider variation seems to be enough. It was never good to be a higher-strikeout batter; there simply wasn’t as much variation. Every batter was clumped in the middle, and production on contact was more important. But production on contact hasn’t changed much in magnitude, and it hasn’t changed at all in variance:

wOBACON Variation Over Time

Year	wOBACON	StDev	3-Year Avg StDev
2000	.383	.061	–
2001	.371	.060	–
2002	.367	.057	.059
2003	.369	.053	.057
2004	.374	.052	.054
2005	.366	.051	.052
2006	.376	.052	.052
2007	.376	.055	.053
2008	.375	.053	.053
2009	.376	.052	.053
2010	.374	.057	.054
2011	.365	.052	.054
2012	.372	.055	.055
2013	.368	.055	.054
2014	.369	.052	.054
2015	.370	.055	.054
2016	.380	.050	.053
2017	.388	.055	.053
2018	.379	.052	.052
2019	.390	.056	.054

So now normal dispersion in strikeout rate is bigger, which means that a batter one standard deviation below the mean and a batter one standard deviation above the mean are much further apart in strikeout rate. It’s easier, in this day and age, to strike out enough that you just aren’t playable. Variance is wider, which means more players fit that bill. And correlation is a simple thing; it more or less looks at the data points and draws a line. A handful of batters striking out so much they torpedo their stats could flip the observed correlation of strikeout rate and wOBA, and now there’s an intuitive result where once there was a confusing one.
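One way to see how a wider spread alone could flip the sign is a toy linear model, wOBA = a + b × wOBACON + c × K%, with c negative. Under that model the raw covariance between K% and wOBA is b × Cov(K%, wOBACON) + c × Var(K%): the first piece grows in proportion to the K% spread, while the penalty piece grows with its square. The sketch below plugs in round numbers close to the tables above, plus an assumed contact slope of 0.6 that does not come from the article and is purely hypothetical.

```python
def raw_k_woba_covariance(sd_k, sd_contact, corr_k_contact, contact_slope, k_slope):
    """Cov(K%, wOBA) under the toy model wOBA = a + contact_slope * wOBACON + k_slope * K%."""
    cov_k_contact = corr_k_contact * sd_k * sd_contact
    return contact_slope * cov_k_contact + k_slope * sd_k ** 2


def breakeven_sd_k(sd_contact, corr_k_contact, contact_slope, k_slope):
    """K% standard deviation at which the raw covariance crosses zero."""
    return -contact_slope * corr_k_contact * sd_contact / k_slope


params = dict(
    sd_contact=0.054,     # roughly the wOBACON spread in the tables
    corr_k_contact=0.5,   # roughly the K%-to-wOBACON correlation in the tables
    contact_slope=0.6,    # ASSUMPTION: hypothetical value, not taken from the article
    k_slope=-0.0028,      # roughly the wOBA cost per point of K% in the tables
)

narrow = raw_k_woba_covariance(sd_k=5.0, **params)  # spread like the early 2000s
wide = raw_k_woba_covariance(sd_k=6.0, **params)    # spread like the late 2010s
threshold = breakeven_sd_k(**params)
```

With these inputs the covariance is positive at a five-point spread and negative at a six-point spread, so nothing about the per-strikeout cost has to change for the sign to flip. The exact crossover depends entirely on the assumed contact slope, and the model ignores walks, so this illustrates the mechanism rather than measuring it.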

I’d like to point out, at this point, that this whole article has arguably been nonsense. There’s no real meaning in these relationships; as we saw, adding strikeouts is as costly as it ever was, and racking up value when you put the ball in play is still king. In fact, it’s always possible that I interpreted these correlations incorrectly; I’m no mathematician, and I’m merely spitballing based on my intuition of how these relationships have changed. The magnitude of everything is tiny, and there’s nothing causal. It’s just fun with numbers.

But the baseball season hasn’t started yet, and in an increasingly grim world, we could all use a little frivolous data entertainment. If, like me, you enjoy a little mathematical tomfoolery, I hope this fits the bill. Strikeouts have always been bad! They just show up that way now, even if you don’t take the time to control for other things.

21 Comments

ballskwok (member)

4 years ago

I’m not sure if this is part of what you are finding or not, but I’ve been speculating that in the juiced ball era (which goes back to what, 2015?), there should be greater value in making contact as opposed to striking out. People have usually framed this as “if you strike out, the defense can’t make an error or a ball can’t find a hole or you can’t move a runner over.” But the last few years it’s more like “you can’t make a routine fly ball leave the yard.”

Cave Dameron

Reply to ballskwok

You also can’t hit into a double play if you strike out.

Reply to Cave Dameron

This is true and a counterpoint I make often. But I think the juiced ball might tip the scales a bit in terms of potential reward for making more contact.

Dominikk85 (member)

The reason why Ks are bad is not that they don’t move runners but that they are outs. A K isn’t worse than a groundout (or only marginally so) but the PA that became a K could also have become a single or a homer.

So you have to compare a K to the entire batted ball spectrum and not just outs.

Of course a high-K hitter can still be good with great walk rates and power (Adam Dunn), but given the same BB% and ISO, fewer Ks is better. K rate is basically the difference between Adam Dunn and Mike Trout (and defense :)), as their ISO and walk rates were rather similar in their primes.

But of course this doesn’t exist in a vacuum and sometimes higher K guys have more pop.
