Why Strikeouts Secretly Matter for Batters

by Steve Staude

September 20, 2013

I got my start at FanGraphs by writing Community Research articles. As you may have noticed, community authors have been very busy this season, cranking out a lot of interesting articles. One that caught my eye the other day was triple_r’s piece on the importance of strikeouts for hitters. The piece correctly pointed out, as other studies have, that there’s basically no correlation between a hitter’s strikeout rate and his overall offensive production. Strikeouts don’t matter; case closed, right? Well, not exactly.

Let me present a hypothetical situation. Say there’s a group of players who go to an “anti-aging” clinic in Florida and pick up some anabolic steroids. Let’s say these hypothetical players are named Bryan Raun, Ralex Odriguez, Tiguel Mejada, Phonny Jeralta, Celson Nruz, and Barry Bon… never mind. Yet, after using the steroids, it appears that the group, on average, has not improved. The steroids didn’t improve their performance, right? But wait: let’s also say that while visiting Florida, some of them contracted syphilis, which spread to their brains, causing delusions and severely impacting their judgment, strike-zone and otherwise. The players whose brains aren’t syphilis-addled have actually improved quite a bit, but their gains are completely offset by the losses suffered by those whose central nervous systems are raging with syphilis. So, the fact that the steroids actually do improve performance has been completely obscured by another factor that is somewhat (but not necessarily) associated with the steroids.

How does this analogy relate to the strikeout situation? Take a look at this table of correlations from the sample I’ll be using in this article — qualified batters in individual seasons, 1913-2012:

	ISO	BABIP	K%	BB%	AVG	OBP	SLG	wOBA	wRC+
ISO	1
BABIP	0.129	1
K%	0.461	-0.025	1
BB%	0.356	0.054	0.139	1
AVG	0.275	0.859	-0.350	0.083	1
OBP	0.418	0.642	-0.160	0.705	0.755	1
SLG	0.921	0.452	0.232	0.322	0.627	0.644	1
wOBA	0.693	0.584	-0.049	0.552	0.766	0.904	0.871	1
wRC+	0.724	0.515	0.101	0.550	0.662	0.825	0.854	0.921	1

The most obvious things we can gather about K% from this are that if you strike out a lot, you will probably have a lower batting average (and therefore, indirectly, a lower OBP), but that you will also tend to hit for more power. Being a power hitter may also cause you to get pitched around, leading to a higher BB%. Overall, though, the power and the contact basically offset each other.

But let’s think about causality. Does it make sense that striking out more would directly cause a drop in batting average? Definitely. Does it make sense that striking out more would directly cause a rise in ISO (which is SLG – AVG)? Not at all, really. If you put me in the majors, I’d K at least 95% of the time, but I definitely wouldn’t have a high ISO to show for it. How about reversing the causality — does trying to hit for more power make you K more? OK, maybe a bit. But I think there’s a more important factor at work: selection bias.

If a hitter is going to make it into the majors based on his offense, he’s usually either going to have to get on base a lot or hit for a lot of power. A deficit in one trait can be made up for if the other trait is good enough. Up to a point, of course; I bet Mike Tyson in his prime could have swung the bat extremely hard, but probably wouldn’t have made contact anywhere near enough to make the majors. Plus, he made way too much money boxing. Anyway, get a load of these guys from this season (ranks out of 141 qualified batters):

Name	K%	K% Rank	ISO	ISO Rank	BABIP	AVG	OBP	SLG	wOBA	wRC+	WAR
Brandon Moss	27.8%	9	0.255	4	0.308	0.257	0.339	0.512	0.367	136	1.6
Pedro Alvarez	31.2%	5	0.236	10	0.272	0.226	0.290	0.463	0.322	106	2.2
Alfonso Soriano	24.5%	15	0.233	11	0.286	0.255	0.294	0.487	0.336	109	2.6
Mark Trumbo	26.2%	12	0.231	12	0.275	0.240	0.299	0.471	0.331	112	2.8
Chris Carter	36.7%	1	0.226	15	0.309	0.220	0.318	0.446	0.334	111	0.5
Adam Dunn	30.8%	6	0.217	21	0.266	0.219	0.320	0.436	0.330	103	-0.3

Would they even be in the majors if they had a typical (about 0.145) ISO? Some definitely wouldn’t, and the others would be borderline players.

If you could get a random, representative sample of the entire world population, I believe it would become pretty apparent that strikeouts do in fact matter to batters. As it is, however, high-K% batters are generally selected to play in the majors only if they have phenomenal power to make up for it, and that has created bias in the sample.
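To make the selection-bias argument concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the population means and spreads for K% and ISO, the 1% cutoff standing in for "making the majors", and the toy production score (its two weights are borrowed from the K% and ISO coefficients of the wOBA regression later in the article). K% and ISO are drawn independently, so any link between them in the selected group comes purely from the selection step.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_selection(n_people=100_000, top_fraction=0.01, seed=2013):
    """Draw a hypothetical population, then keep only the most productive hitters."""
    rng = random.Random(seed)
    people = []
    for _ in range(n_people):
        # Hypothetical population: K% and ISO are independent of each other.
        k_rate = min(max(rng.gauss(0.35, 0.10), 0.02), 0.95)
        iso = max(rng.gauss(0.08, 0.05), 0.0)
        # Toy production score (assumption): power helps, strikeouts hurt.
        production = 0.484 * iso - 0.321 * k_rate
        people.append((k_rate, iso, production))

    # The "scouts" only promote the top slice by production.
    people.sort(key=lambda person: person[2], reverse=True)
    majors = people[: int(n_people * top_fraction)]

    def summarize(group):
        ks, isos, prods = zip(*group)
        return {
            "k_vs_iso": pearson(ks, isos),
            "k_vs_production": pearson(ks, prods),
        }

    return summarize(people), summarize(majors)


everyone, majors = simulate_selection()
print("everyone:", everyone)
print("majors:  ", majors)
```

In the full made-up population, K% is unrelated to ISO and strongly tied to lower production. Among the selected hitters, a positive K%/ISO correlation appears and the K%/production link weakens, even though nothing about the underlying relationship has changed.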

Another analogy: let’s say successful actresses are either very beautiful (e.g. Megan Fox) or very talented (e.g. Maggie Smith). Occasionally, an actress will be both (e.g. Charlize Theron). The selections of the casting people will pretty much keep actresses who are neither beautiful nor talented out of the big movies and shows. You might find a little bit of an imbalance towards beautiful actresses with not much talent; this could cause the overall sample to show no correlation between talent and success, when you’re only looking at those who’ve made it to the big time. But, as shown by all the actresses who never made it because they weren’t talented enough to overcome their lack of beauty, talent really does matter.

By the way, Charlize Theron is Miguel Cabrera. You heard me:

Name	K%	K% Rank	ISO	ISO Rank	BABIP	AVG	OBP	SLG	wOBA	wRC+	WAR
Miguel Cabrera	14.7%	95	0.305	2	0.353	0.347	0.443	0.653	0.461	195	7.5
Edwin Encarnacion	10.0%	131	0.262	3	0.247	0.272	0.370	0.534	0.387	145	4.1
David Ortiz	14.7%	95	0.254	5	0.321	0.309	0.394	0.562	0.399	150	3.7
Jose Bautista	15.9%	84	0.239	8	0.259	0.259	0.358	0.498	0.371	134	4.2

Edwin Encarnacion, meanwhile, is looking like a talented, beautiful actress who just hasn’t been getting great roles lately (as shown by his BABIP).

I think there are around four basic abilities that contribute to a player’s offense: Contact ability, Eye/Plate Discipline, Power, and Speed. A fifth consideration is of course luck, or randomness. They are all difficult to quantify, but there’s an interplay between them that contributes towards the more easily-quantified stats like AVG, OBP, SLG, and wOBA. Below is a table of those traits, each with what I believe to be its best proxies, listed in order.

Contact	Eye/Discipline	Power	Speed
Contact%	O-Swing%	HR/FB*	Spd (?)
K%	BB%	ISO	UBR (?)
AVG	OBP	HR/PA	SB numbers
LD% and PU%	K%	SLG	BABIP
SLG		BABIP
ISO
BABIP
OBP

The asterisk next to HR/FB is there because I feel like park influences taint the stat quite a bit as a measure of true power; a park-adjusted version of the stat would be ideal.

K% is not entirely direct for either Contact or Eye; some Ks are due to whiffing, while others are due to misjudging a pitch and taking it for a strike. It would be nice to see a distinction between the two types of Ks, I think.

The batted ball stats LD% (Line Drives) and PU% (Pop-Ups, which is FB%*IFFB%) are there because the trajectory of the contact matters too, not just whether contact was made.

I don’t think there really is a great proxy for speed, though. BIS is now timing players on base running events, and I think that has the potential to create some very good results. But we’ll be looking at wOBA here, which doesn’t really factor speed in much, other than the impact it has when a player legs out an infield hit, or pulls off an extra base on his hit.

Since my sample goes way back to 1913, whereas some of the best above stats are only available from 2002 onwards, I’ve chosen, for the study you’re about to see, K%, BB%, ISO, and BABIP as the key factors to examine. With these factors (sometimes using HR/PA instead of ISO), you can pretty closely approximate AVG, OBP, SLG, and wOBA.

Here’s my gratuitous flowchart, so we’re all on the same page here:

These are, of course, the basic ways a plate appearance can turn out. We have BB% and K% reflecting the chance of a walk or strikeout, and we have the semi-random, infrequent events like hit by pitch or reaching on error. The leftover is the balls hit in play. Since home runs aren’t (generally) subject to being caught, BABIP leaves them out of the equation, and only considers balls that are caught/putouts or those that become singles, doubles, or triples. A player’s power, reflected by ISO, plays a big part in deciding whether a batted ball will go for a home run or otherwise make it past the outfielders. Meanwhile, BABIP is part luck, part a reflection of a player’s contact, power, and speed. And you can’t neglect the role luck plays in everything.

Here’s how my key stats can be used to derive somewhat simplified versions of other stats:
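One way to write such derivations, as a sketch only: it assumes hit-by-pitches, sacrifices, and reaching on error are negligible, it uses HR% (home runs per plate appearance) as the power input, and BIP% is a label introduced here for the share of plate appearances ending in a ball in play. The exact forms in the original figure may differ.

$$
\begin{aligned}
\text{BIP\%} &\approx 1 - \text{BB\%} - \text{K\%} - \text{HR\%} \\
\text{AVG} &\approx \frac{\text{HR\%} + \text{BABIP} \cdot \text{BIP\%}}{1 - \text{BB\%}} \\
\text{OBP} &\approx \text{BB\%} + \text{HR\%} + \text{BABIP} \cdot \text{BIP\%} \\
\text{SLG} &= \text{AVG} + \text{ISO}
\end{aligned}
$$

In this form, every extra point of K% comes straight out of BIP%, which is why it drags down both AVG and OBP.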

Notice how Ks are a significant factor in the equations. wOBA is a little trickier, given its various, evolving weightings, but a simple multiple regression on the data yields the following equation: wOBA = 0.405*BB% – 0.321*K% + 0.484*ISO + 0.625*BABIP + 0.091. Viewed from this angle, with the interactions sorted out, strikeouts definitely hurt. The real derivation for wOBA based on the above stats looks more like the OBP derivation, except with the special weightings for the different hits added in. Still, that formula gets pretty close, with an R-squared of 0.962. More details on that, if you’re interested:

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%	Lower 99.0%	Upper 99.0%
Intercept	0.09137	0.00110	83.1	0	0.08921	0.09352	0.08853	0.09420
ISO	0.48371	0.00216	223.7	0	0.47947	0.48795	0.47813	0.48928
BABIP	0.62476	0.00354	176.7	0	0.61783	0.63169	0.61565	0.63387
K%	-0.32088	0.00232	-138.5	0	-0.32543	-0.31634	-0.32686	-0.31491
BB%	0.40500	0.00338	119.9	0	0.39838	0.41162	0.39630	0.41370
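For anyone who wants to play with the regression, here is a small sketch that wraps the rounded coefficients in a function. The function name and the example hitter are hypothetical, invented purely to show what "all else equal" means for the K% term.

```python
def estimate_woba(bb_rate, k_rate, iso, babip):
    """Linear wOBA estimate from the rounded regression coefficients quoted above.

    Rates are fractions of plate appearances (e.g. 0.25 for a 25% K%).
    """
    return 0.405 * bb_rate - 0.321 * k_rate + 0.484 * iso + 0.625 * babip + 0.091


# Hypothetical hitter (made-up numbers), before and after trimming 5 points of K%.
base = estimate_woba(bb_rate=0.09, k_rate=0.25, iso=0.200, babip=0.300)
fewer_ks = estimate_woba(bb_rate=0.09, k_rate=0.20, iso=0.200, babip=0.300)

print(f"estimated wOBA at 25% K%: {base:.4f}")
print(f"estimated wOBA at 20% K%: {fewer_ks:.4f}")
print(f"gain from fewer strikeouts: {fewer_ks - base:.4f}")
```

Holding walks, power, and BABIP fixed, cutting five points of K% is worth about 16 points of estimated wOBA for this hypothetical hitter. The hard part, of course, is whether the other inputs really would stay fixed.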

In the end, overall production, as represented by wOBA or wRC+, is what really matters to a hitter, of course. But as the regression formula (and common sense) illustrates, all else equal, overall production will increase as strikeouts go down. Mike Trout is a great example, as his ISO and BABIP have been almost identical these past two seasons:

Season	BB%	K%	ISO	BABIP	AVG	OBP	SLG	wOBA	wRC+	WAR
2012	10.50%	21.80%	0.238	0.383	0.326	0.399	0.564	0.409	166	10
2013	14.80%	18.30%	0.241	0.381	0.330	0.435	0.570	0.430	180	10.2

If Trout had not improved his K% in 2013, his OBP would have been about 13 points lower than it is, and his wOBA approximately 11 points lower. He’d still be amazing; just less so.
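The arithmetic behind those two figures can be sketched roughly as follows. The wOBA line uses the K% coefficient from the regression above. The OBP line is a simplification that assumes each additional strikeout replaces a ball in play that would have become a hit at his 2013 BABIP; that shortcut is an assumption of this sketch, not necessarily the exact calculation used.

$$
\begin{aligned}
\Delta\text{K\%} &= 0.218 - 0.183 = 0.035 \\
\Delta\text{wOBA} &\approx -0.321 \times 0.035 \approx -0.011 \\
\Delta\text{OBP} &\approx -\text{BABIP} \times \Delta\text{K\%} = -0.381 \times 0.035 \approx -0.013
\end{aligned}
$$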

Ted Williams, by the way: career 0.289 ISO and 7.2% K%. Amazing. It can be done.

Eno reported from an interview with Mark Trumbo this season,

He thinks there is a genetic component to the ability to discern balls and strikes — some are “born with it” — but he’s working to get the most out of what he has.

I agree with that, although perhaps a lot of it is developmental from an early age, and not necessarily genetic.

Here’s how I look at it: everybody has some innate ceiling when it comes to the abilities needed to succeed in baseball. Some people have too low a ceiling in one or more of the abilities, and would never make the majors, no matter how hard they worked; maybe they’re too small, not naturally athletic enough, have bad eyesight, or lack the necessary rapid visuospatial processing potential. There are others who could have made the majors if they’d worked hard enough at it, but lacked either the drive to reach their potential, the exposure to baseball, the opportunity, or just the luck (e.g. staying healthy). Let me throw out some numbers, wildly estimating the chance of being an elite hitter by multiplying all the factors out:

Trait	Power	Contact	Eye	Speed	Exposure	Opportunity	Drive	Luck	Whole Package		# of People on Earth
% of people possessing	5%	5%	10%	25%	5%	10%	1%	50%	0.00000016%	x 7 billion people =	10.9
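The multiplication in that table can be reproduced with a few lines of Python. This is just a sketch of the arithmetic: the percentages are the wild estimates from the table, and multiplying them straight through treats the traits as independent of one another, which is itself a simplifying assumption.

```python
from math import prod

# Share of people assumed to possess each trait (the table's wild estimates).
trait_shares = {
    "Power": 0.05,
    "Contact": 0.05,
    "Eye": 0.10,
    "Speed": 0.25,
    "Exposure": 0.05,
    "Opportunity": 0.10,
    "Drive": 0.01,
    "Luck": 0.50,
}
WORLD_POPULATION = 7_000_000_000

# Multiplying the shares assumes the traits are independent of each other.
whole_package = prod(trait_shares.values())
expected_people = whole_package * WORLD_POPULATION

print(f"whole package: {whole_package:.8%}")            # 0.00000016%
print(f"people on Earth with it all: {expected_people:.1f}")  # 10.9
```

Change any single estimate and the final headcount scales proportionally, which shows how sensitive the "about 11 people" figure is to the guesses.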

I think that helps explain why a player who can truly do it all is pretty rare. Cabrera has elite power, contact, and plate discipline, but is not even close in speed. This year, Chris Davis is showing the power and eye, but definitely not contact. Mike Trout is a bit short of elite in the contact department, but definitely has it in the other areas. Andrew McCutchen might have a bit of a case, although some of his traits are just borderline elite.

So, there are different ways to reach the same level of productivity, and by nature, people are unlikely to be outstanding in all of them. Talent evaluators in baseball will give a chance to a power hitter who strikes out a lot, because he’s capable of reaching the same level of production as a player who strikes out less but hits for less power. They’re just making the most of what’s available. A truly perfect hitter, however, would never strike out. Each strikeout is one fewer chance for a batted ball to fall in for a hit, or for the plate appearance to end in a walk. Obvious, right?

All else equal, strikeouts are bad. I guess the real question is, if you told a guy like Chris Carter to try to strike out less, how might that affect his overall performance? Is changing his approach more likely to be helpful or detrimental? Let me hear your thoughts.

55 Comments


Dave S

11 years ago

Great post. Thanks for writing it!

If you asked me that question about Carter, I’d ask followup questions… how many guys like Carter have ever been able to strike out less? And what happened to them?

My guess would be very few players like him were able to decrease their strikeouts, and those that could/did… ended up out of the game within two years. Because I think the dropoff in power/iso would outweigh the gain in contact… and additionally, I don’t think their OBP would go up much, as they’d begin to walk less (less power=less walks).

But that’s just a wild guess.

Steve Staude

Reply to Dave S

Thanks! Good idea. I’ll see what I can do.
