Why Strikeouts Secretly Matter for Batters
by Steve Staude
September 20, 2013

I got my start at FanGraphs by writing Community Research articles. As you may have noticed, community authors have been very busy this season, cranking out a lot of interesting articles. One that caught my eye the other day was triple_r’s piece on the importance of strikeouts for hitters. The piece correctly pointed out, as other studies have, that there’s basically no correlation between a hitter’s strikeout rate and his overall offensive production. Strikeouts don’t matter; case closed, right?

Well, not exactly. Let me present a hypothetical situation. Say there’s a group of players who go to an “anti-aging” clinic in Florida and pick up some anabolic steroids. Let’s say these hypothetical players are named Bryan Raun, Ralex Odriguez, Tiguel Mejada, Phonny Jeralta, Celson Nruz, and Barry Bon… never mind. Yet, after using the steroids, it appears that the group of them, on average, has not improved. The steroids didn’t improve their performance, right?

But wait: let’s also say that while visiting Florida, some of them contracted syphilis, which spread to their brains, causing delusions and severely impacting their judgment, strike-zone and otherwise. The players whose brains aren’t syphilis-addled have actually improved quite a bit, but their gains are completely offset by the losses suffered by those whose central nervous systems are raging with syphilis. So, the fact that the steroids actually do improve performance has been completely obscured by another factor that is somewhat (but not necessarily) associated with the steroids.

How does this analogy relate to the strikeout situation?
Take a look at this table of correlations from the sample I’ll be using in this article (qualified batters in individual seasons, 1913-2012):

| | ISO | BABIP | K% | BB% | AVG | OBP | SLG | wOBA | wRC+ |
|---|---|---|---|---|---|---|---|---|---|
| ISO | 1 | | | | | | | | |
| BABIP | 0.129 | 1 | | | | | | | |
| K% | 0.461 | -0.025 | 1 | | | | | | |
| BB% | 0.356 | 0.054 | 0.139 | 1 | | | | | |
| AVG | 0.275 | 0.859 | -0.350 | 0.083 | 1 | | | | |
| OBP | 0.418 | 0.642 | -0.160 | 0.705 | 0.755 | 1 | | | |
| SLG | 0.921 | 0.452 | 0.232 | 0.322 | 0.627 | 0.644 | 1 | | |
| wOBA | 0.693 | 0.584 | -0.049 | 0.552 | 0.766 | 0.904 | 0.871 | 1 | |
| wRC+ | 0.724 | 0.515 | 0.101 | 0.550 | 0.662 | 0.825 | 0.854 | 0.921 | 1 |

The most obvious thing we can gather about K% from this is that if you strike out a lot, you will probably have a lower batting average (and therefore, indirectly, a lower OBP), but that you will also tend to hit for more power. Being a power hitter may also cause you to get pitched around, leading to a higher BB%. The power and the contact basically offset each other overall, though.

But let’s think about causality. Does it make sense that striking out more would directly cause a drop in batting average? Definitely. Does it make sense that striking out more would directly cause a rise in ISO (which is SLG – AVG)? Not at all, really. If you put me in the majors, I’d K at least 95% of the time, but I definitely wouldn’t have a high ISO to show for it. How about reversing the causality: does trying to hit for more power make you K more? OK, maybe a bit.

But I think there’s a more important factor at work: selection bias. If a hitter is going to make it into the majors based on his offense, he’s usually either going to have to get on base a lot or hit for a lot of power. A deficit in one trait can be made up for if the other trait is good enough. Up to a point, of course; I bet Mike Tyson in his prime could have swung the bat extremely hard, but he probably wouldn’t have made contact anywhere near often enough to make the majors. Plus, he made way too much money boxing.
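The selection effect described above can be illustrated with a toy simulation. This is only a sketch under loud assumptions, not the analysis behind the article: power and strikeout tendency are drawn as independent standard-normal scores, "production" is simply power minus strikeouts, and only the top 5% by production "make the majors". The function names, the 5% cutoff, and the scoring rule are all hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=100_000, keep_fraction=0.05, seed=2013):
    """Toy model (assumption): power and strikeout tendency are independent
    z-scores, and production = power - strikeouts. Only the top
    keep_fraction of the population by production is 'selected'."""
    rng = random.Random(seed)
    power = [rng.gauss(0, 1) for _ in range(n)]
    k = [rng.gauss(0, 1) for _ in range(n)]
    prod = [p - s for p, s in zip(power, k)]

    # Production level needed to land in the selected group.
    cutoff = sorted(prod)[int(n * (1 - keep_fraction))]
    majors = [i for i in range(n) if prod[i] >= cutoff]

    def pick(xs):
        """Restrict a list to the selected players."""
        return [xs[i] for i in majors]

    return {
        "everyone: K vs power": pearson(k, power),
        "everyone: K vs production": pearson(k, prod),
        "majors: K vs power": pearson(pick(k), pick(power)),
        "majors: K vs production": pearson(pick(k), pick(prod)),
    }


results = simulate()
for label, r in results.items():
    print(f"{label:28s} {r:+.3f}")
```

In this toy model, strikeouts and power are unrelated across everyone but strongly positively correlated among the selected group, and the negative link between strikeouts and production is much weaker inside the selected group than in the full population, even though strikeouts hurt every individual by construction.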
Anyway, get a load of these guys from this season (ranks out of 141 qualified batters):

| Name | K% | K% Rank | ISO | ISO Rank | BABIP | AVG | OBP | SLG | wOBA | wRC+ | WAR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Brandon Moss | 27.8% | 9 | 0.255 | 4 | 0.308 | 0.257 | 0.339 | 0.512 | 0.367 | 136 | 1.6 |
| Pedro Alvarez | 31.2% | 5 | 0.236 | 10 | 0.272 | 0.226 | 0.290 | 0.463 | 0.322 | 106 | 2.2 |
| Alfonso Soriano | 24.5% | 15 | 0.233 | 11 | 0.286 | 0.255 | 0.294 | 0.487 | 0.336 | 109 | 2.6 |
| Mark Trumbo | 26.2% | 12 | 0.231 | 12 | 0.275 | 0.240 | 0.299 | 0.471 | 0.331 | 112 | 2.8 |
| Chris Carter | 36.7% | 1 | 0.226 | 15 | 0.309 | 0.220 | 0.318 | 0.446 | 0.334 | 111 | 0.5 |
| Adam Dunn | 30.8% | 6 | 0.217 | 21 | 0.266 | 0.219 | 0.320 | 0.436 | 0.330 | 103 | -0.3 |

Would they even be in the majors if they had a typical (about 0.145) ISO? Some definitely wouldn’t, and the others would be borderline players. If you could get a random, representative sample of the entire world population, I believe it would become pretty apparent that strikeouts do in fact matter to batters. As it is, however, high-K% batters are pretty much only selected to play in the majors if they have phenomenal power to make up for it, and that has created bias in the sample.

Another analogy: let’s say successful actresses are either very beautiful (e.g. Megan Fox) or very talented (e.g. Maggie Smith). Occasionally, an actress will be both (e.g. Charlize Theron). The selections of the casting people will pretty much keep actresses who are neither beautiful nor talented out of the big movies and shows. You might find a little bit of an imbalance towards beautiful actresses with not much talent; this could cause the overall sample to show no correlation between talent and success, when you’re only looking at those who’ve made it to the big time. But, as shown by all the actresses who never made it because they weren’t talented enough to overcome their lack of beauty, talent really does matter.

By the way, Charlize Theron is Miguel Cabrera.
You heard me:

| Name | K% | K% Rank | ISO | ISO Rank | BABIP | AVG | OBP | SLG | wOBA | wRC+ | WAR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Miguel Cabrera | 14.7% | 95 | 0.305 | 2 | 0.353 | 0.347 | 0.443 | 0.653 | 0.461 | 195 | 7.5 |
| Edwin Encarnacion | 10.0% | 131 | 0.262 | 3 | 0.247 | 0.272 | 0.370 | 0.534 | 0.387 | 145 | 4.1 |
| David Ortiz | 14.7% | 95 | 0.254 | 5 | 0.321 | 0.309 | 0.394 | 0.562 | 0.399 | 150 | 3.7 |
| Jose Bautista | 15.9% | 84 | 0.239 | 8 | 0.259 | 0.259 | 0.358 | 0.498 | 0.371 | 134 | 4.2 |

Edwin Encarnacion, meanwhile, is looking like a talented, beautiful actress who just hasn’t been getting great roles lately (as shown by his BABIP).

I think there are around four basic abilities that contribute to a player’s offense: Contact ability, Eye/Plate Discipline, Power, and Speed. A fifth consideration is of course luck, or randomness. They are all difficult to quantify, but there’s an interplay between them that contributes towards the more easily-quantified stats like AVG, OBP, SLG, and wOBA. Below is a table of those traits with what I believe to be the best proxies for those traits, listed in order.

Contact Eye/Discipline Power Speed Contact% O-Swing% HR/FB* Spd (?) K% BB% ISO UBR (?) AVG OBP HR/PA SB numbers LD% and PU% K% SLG BABIP SLG BABIP ISO BABIP OBP

The asterisk next to HR/FB is there because I feel like park influences taint the stat quite a bit as a measure of true power; a park-adjusted version of the stat would be ideal. K% is not an entirely direct measure of either Contact or Eye; some Ks are due to whiffing, while others are due to misjudging a pitch and taking it for a strike. It would be nice to see a distinction between the two types of Ks, I think. The batted ball stats LD% (Line Drives) and PU% (Pop-Ups, which is FB%*IFFB%) are there because the trajectory of the contact matters too, not just whether contact was made or not.

I don’t think there really is a great proxy for speed, though. BIS is now timing players on base running events, and I think that has the potential to create some very good results.
But we’ll be looking at wOBA here, which doesn’t really factor speed in much, other than the impact it has when a player legs out an infield hit, or pulls off an extra base on his hit.

Since my sample goes way back to 1913, whereas some of the best stats above are only available from 2002 onwards, I’ve chosen, for the study you’re about to see, K%, BB%, ISO, and BABIP as the key factors to examine. With these factors (sometimes using HR/PA instead of ISO), you can pretty closely approximate AVG, OBP, SLG, and wOBA. Here’s my gratuitous flowchart, so we’re all on the same page here:

These are, of course, the basic ways a plate appearance can turn out. We have BB% and K% reflecting the chance of a walk or strikeout, and we have the semi-random, infrequent events like hit by pitch or reaching on error. What’s left over is the balls hit in play. Since home runs aren’t (generally) subject to being caught, BABIP leaves them out of the equation, and only considers balls that are caught for putouts or those that become singles, doubles, or triples. A player’s power, reflected by ISO, plays a big part in deciding whether a batted ball will go for a home run or otherwise make it past the outfielders. Meanwhile, BABIP is part luck, part a reflection of a player’s contact, power, and speed. And you can’t neglect the role luck plays in everything.

Here’s how my key stats can be used to derive somewhat simplified versions of other stats:

Notice how Ks are a significant factor in the equations. wOBA is a little trickier, given its various, evolving weightings, but a simple multiple regression on the data yields the following equation:

wOBA = 0.405*BB% - 0.321*K% + 0.484*ISO + 0.625*BABIP + 0.091

Strikeouts definitely hurt when viewed from this angle, with the interactions sorted out. The real derivation for wOBA based on the above stats looks more like the OBP derivation, except with the special weightings for the different hits added in.
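As a rough illustration of how strikeouts enter these formulas, here is a minimal sketch. The `simple_obp` and `simple_avg` functions are my own simplified reconstruction, not necessarily the exact equations the article refers to: they assume every plate appearance ends in a walk, strikeout, home run, or ball in play, and ignore hit by pitch, sacrifices, and reaching on error. Only `regression_woba` uses coefficients taken from the article. The sample batter and all function names are hypothetical.

```python
def simple_obp(bb, k, hr, babip):
    """Simplified OBP per plate appearance (assumption: every PA ends in
    a walk, strikeout, home run, or ball in play)."""
    balls_in_play = 1.0 - bb - k - hr
    return bb + hr + babip * balls_in_play


def simple_avg(bb, k, hr, babip):
    """Simplified AVG under the same assumption; at-bats = PA minus walks."""
    balls_in_play = 1.0 - bb - k - hr
    return (hr + babip * balls_in_play) / (1.0 - bb)


def regression_woba(bb, k, iso, babip):
    """The article's fitted regression, with rates as fractions (0.20 = 20%)."""
    return 0.405 * bb - 0.321 * k + 0.484 * iso + 0.625 * babip + 0.091


# Hypothetical batter (illustrative numbers only), before and after cutting
# his strikeout rate by 3 percentage points with everything else unchanged.
before = dict(bb=0.09, k=0.22, babip=0.300)
after = dict(bb=0.09, k=0.19, babip=0.300)
hr, iso = 0.03, 0.160

obp_gain = simple_obp(hr=hr, **after) - simple_obp(hr=hr, **before)
avg_gain = simple_avg(hr=hr, **after) - simple_avg(hr=hr, **before)
woba_gain = regression_woba(iso=iso, **after) - regression_woba(iso=iso, **before)

print(f"OBP gain:  {obp_gain:.4f}")
print(f"AVG gain:  {avg_gain:.4f}")
print(f"wOBA gain: {woba_gain:.4f}")
```

Under these assumptions, each strikeout avoided becomes a ball in play, so one percentage point of K% is worth BABIP times 0.01 in OBP (about 3 points for a .300 BABIP hitter), and the regression prices the same change at roughly 3.2 points of wOBA.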
Still, that formula gets pretty close, with an R-squared of 0.962. More details on that, if you’re interested:

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 99.0% | Upper 99.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 0.09137 | 0.00110 | 83.1 | 0 | 0.08921 | 0.09352 | 0.08853 | 0.09420 |
| ISO | 0.48371 | 0.00216 | 223.7 | 0 | 0.47947 | 0.48795 | 0.47813 | 0.48928 |
| BABIP | 0.62476 | 0.00354 | 176.7 | 0 | 0.61783 | 0.63169 | 0.61565 | 0.63387 |
| K% | -0.32088 | 0.00232 | -138.5 | 0 | -0.32543 | -0.31634 | -0.32686 | -0.31491 |
| BB% | 0.40500 | 0.00338 | 119.9 | 0 | 0.39838 | 0.41162 | 0.39630 | 0.41370 |

In the end, overall production, represented by wOBA or wRC+, is what really matters to a hitter, of course. But as the regression formula (and common sense) illustrates, with all else equal, overall production will increase as strikeouts go down. Mike Trout is a great example, as his ISO and BABIP have been almost identical these past two seasons:

| Season | BB% | K% | ISO | BABIP | AVG | OBP | SLG | wOBA | wRC+ | WAR |
|---|---|---|---|---|---|---|---|---|---|---|
| 2012 | 10.50% | 21.80% | 0.238 | 0.383 | 0.326 | 0.399 | 0.564 | 0.409 | 166 | 10 |
| 2013 | 14.80% | 18.30% | 0.241 | 0.381 | 0.330 | 0.435 | 0.570 | 0.430 | 180 | 10.2 |

If Trout had not improved his K% in 2013, his OBP would have been about 13 points lower than it is, and his wOBA approximately 11 points lower. He’d still be amazing; just less so.

Ted Williams, by the way: career 0.289 ISO and 7.2% K%. Amazing. It can be done.

Eno reported from an interview with Mark Trumbo this season:

He thinks there is a genetic component to the ability to discern balls and strikes (some are “born with it”) but he’s working to get the most out of what he has.

I agree with that, although perhaps a lot of it is developmental from an early age, and not necessarily genetic. Here’s how I look at it: everybody has some innate ceiling when it comes to the abilities needed to succeed in baseball.
Some people have too low a ceiling in one or more of the abilities, and would never make the majors, no matter how hard they worked; maybe they’re too small, not naturally athletic enough, have bad eyesight, or lack the necessary rapid visuospatial processing potential. There are others who could have made the majors if they’d worked hard enough at it, but lacked either the drive to reach their potential, the exposure to baseball, the opportunity, or just the luck (e.g. staying healthy).

Let me throw out some numbers, wildly estimating the chance of being an elite hitter by multiplying all the factors out:

| Trait | Power | Contact | Eye | Speed | Exposure | Opportunity | Drive | Luck | Whole Package | # of People on Earth |
|---|---|---|---|---|---|---|---|---|---|---|
| % of people possessing | 5% | 5% | 10% | 25% | 5% | 10% | 1% | 50% | 0.00000016% | x 7 billion people = 10.9 |

I think that helps explain why a player who can truly do it all is pretty rare. Cabrera has elite power, contact, and plate discipline, but is not even close in speed. This year, Chris Davis is showing the power and eye, but definitely not contact. Mike Trout is a bit short of elite in the contact department, but definitely has it in the other areas. Andrew McCutchen might have a bit of a case, although some of his traits are just borderline elite.

So, there are different ways to reach the same level of productivity, and by nature, people are unlikely to be outstanding in all of them. Talent evaluators in baseball will give a chance to a power hitter who strikes out a lot, because he’s capable of reaching the same level of production as a player who strikes out less but hits for less power. They’re just making the most of what’s available.

A truly perfect hitter, however, would never strike out. Each strikeout is one less chance for a ball he could otherwise have hit to fall in for a hit, or for him to draw a walk. Obvious, right? All else equal, strikeouts are bad.
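For anyone who wants to check the wild estimate in the trait table earlier in this section, here is the arithmetic as a short sketch. The percentages are copied from that table; multiplying them treats the traits as independent, which is itself an assumption, and the variable names are just for illustration.

```python
import math

# Rough share of people possessing each trait, copied from the trait table.
shares = {
    "Power": 0.05,
    "Contact": 0.05,
    "Eye": 0.10,
    "Speed": 0.25,
    "Exposure": 0.05,
    "Opportunity": 0.10,
    "Drive": 0.01,
    "Luck": 0.50,
}

# Assumption: traits are independent, so the shares simply multiply.
whole_package = math.prod(shares.values())

world_population = 7_000_000_000
expected_people = whole_package * world_population

print(f"Share with the whole package: {whole_package:.2e}")
print(f"Expected people on Earth:     {expected_people:.1f}")
```

If the traits were positively related to each other (say, drive and opportunity tending to go together), the true figure would be higher than this product suggests.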
I guess the real question is, if you told a guy like Chris Carter to try to strike out less, how might that affect his overall performance? Is changing his approach more likely to be helpful or detrimental? Let me hear your thoughts.