Further Research In Pursuit of Finding Hitter Breakouts

Peter Aiken-USA TODAY Sports

Earlier this week, I wrote up a side project I’ve been working on recently: looking through exit velocity distributions to find interesting hitters. You can read that if you’d like (obviously, that’s how the internet works), but as a refresher, I looked through 2022 batted ball data for hitters whose 95th-percentile exit velocity was high but whose average exit velocity was low, as well as hitters who hit the ball hard consistently but didn’t have the results to show for it.

With a little more time to monkey around with the data, I’ve come to a few conclusions about this line of analysis. If you just want to read the article for those conclusions, no sweat: just search for the words “phenomenal cosmic power.” It’s been too long since I’ve used an Aladdin reference in an article, so I promise to shoehorn that one in somehow just before I explain my conclusions.

Okay, great, now that we’ve dispensed with the casuals, let’s talk through a bunch of procedure. You nerds (I say this with affection) love the procedure, I know. First things first: I took Baseball Savant data for all batted balls and grouped them by player and season. I skipped 2020 due to sample size issues and last season because we don’t have subsequent-year data. That left me with approximately 3,000 player-seasons of at least 50 batted balls.
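For the code-inclined, here's a minimal Python sketch of that grouping step. It isn't the actual pipeline: the `(player, season, exit_velocity)` tuple layout and the names `percentile` and `summarize` are assumptions made for illustration, and the percentile uses simple linear interpolation, which may differ from whatever Baseball Savant does.

```python
from collections import defaultdict
from statistics import mean

def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0-100) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

def summarize(batted_balls, min_balls=50, skip_seasons=(2020,)):
    """Group (player, season, exit_velocity_mph) tuples into player-seasons.

    Seasons listed in skip_seasons are dropped, as are player-seasons
    with fewer than min_balls batted balls.
    """
    groups = defaultdict(list)
    for player, season, ev in batted_balls:
        if season not in skip_seasons:
            groups[(player, season)].append(ev)

    summary = {}
    for key, evs in groups.items():
        if len(evs) >= min_balls:
            avg, p95 = mean(evs), percentile(evs, 95)
            summary[key] = {
                "n": len(evs),
                "avg_ev": avg,
                "p95_ev": p95,
                "gap": p95 - avg,  # the "inconsistency" measure used later
            }
    return summary
```

The `gap` field is the 95th-percentile vs. average exit velocity difference that the rest of the criteria lean on.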

For each of these seasons, I grabbed a heaping helping of data, whatever I thought I might need. We’re talking wRC+, OBP, SLG, strikeout rate, average exit velocity, 95th-percentile exit velocity, plate appearances, hair color, contact rate, maximum exit velocity – okay, fine, not hair color, but the rest of them. Then I began coming up with criteria that might quantify what I’m looking for – big-bopping but inconsistent hitters ready for a breakout – and tested them to see whether what I’m talking about actually makes any sense.

My first pass was to look at roughly the criteria I outlined in my initial article. I looked for players who fit the following criteria: a 95th-percentile vs. average exit velocity gap in the top 10% of hitters (inconsistent), a 95th-percentile exit velocity in the top 30% of hitters (powerful), a wRC+ below 110 (not yet hitting), and at least 200 plate appearances in both year one and year two (available). Out of our 3,000 player-seasons, exactly 50 fit all four criteria.
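Those four criteria translate into a filter fairly directly. The sketch below is one way to write it, not a record of how it was actually run: the dictionary keys (`gap`, `p95_ev`, `wrc_plus`, `pa`, `next_pa`) are hypothetical, and it assumes the percentile cutoffs are computed across all player-seasons pooled together rather than season by season.

```python
def top_cutoff(values, top_fraction):
    """Smallest value that still ranks within the top `top_fraction` of values."""
    ordered = sorted(values, reverse=True)
    keep = max(1, int(len(ordered) * top_fraction))
    return ordered[keep - 1]

def first_pass(seasons, gap_top=0.10, p95_top=0.30, max_wrc=110, min_pa=200):
    """Return the player-seasons that meet all four first-pass criteria.

    seasons: list of dicts with hypothetical keys
      gap      - 95th-percentile EV minus average EV
      p95_ev   - 95th-percentile exit velocity
      wrc_plus - year-one wRC+
      pa       - year-one plate appearances
      next_pa  - year-two plate appearances
    """
    gap_cut = top_cutoff([s["gap"] for s in seasons], gap_top)
    p95_cut = top_cutoff([s["p95_ev"] for s in seasons], p95_top)
    return [
        s for s in seasons
        if s["gap"] >= gap_cut          # inconsistent
        and s["p95_ev"] >= p95_cut      # powerful
        and s["wrc_plus"] < max_wrc     # not yet hitting
        and s["pa"] >= min_pa           # available in year one
        and s["next_pa"] >= min_pa      # available in year two
    ]
```

Loosening the thresholds later on is just a matter of passing different `gap_top` and `p95_top` values.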

How did those 50 batters fare in the subsequent year? They improved their wRC+ by 1.4 points. That’s not really a success – it’s definitely within the margin of error. But it’s worse than that, because picking a criterion of a wRC+ below 110 biases the results. Just as an example, 841 hitters in 2021 had 200 plate appearances, a wRC+ below 110, and batted 200 times in our sample. They improved their wRC+ by an average of 5.9 points. That makes good sense to me – that group will have a disproportionate number of players who performed below their true talent level. Our first-level breakout pick criteria stink.

Let’s be real, though: that was obvious. Plenty of skilled but inconsistent players continue to be skilled but inconsistent. What I’m actually after in this exercise, as I mentioned Monday, is finding players whose improvement I should be quicker to buy into. More specifically, I want to find players whose raw power is so prodigious that if they improve in other facets of the game, they’ll stand out. Accordingly, I added something new: change in O-Swing% from year one to year two.

The idea here is that if you find players who can mash but aren’t having success, the ones who improve on their weaknesses are likely to turn in excellent performances. To handle this, I took every hitter who met the exit velocity criteria from earlier and also lowered their O-Swing% from one year to the next. But bad news: this one doesn’t really work either. The 22 hitters who fit our expanded criteria improved their wRC+ by eight points from one year to the next. That’s great – but if you remove the exit velocity criteria entirely and look at the population as a whole, the 388 hitters who chased less in year two improved their wRC+ by nine points. Swinging less seems to be good for hitters putting up low wRC+’s, but it doesn’t interact with exit velocity meaningfully.

I had one last trick up my sleeve before I gave up: contact rate. This is kind of a case of saving the best for last, because this is the one I thought would do the best job. If a player is already bopping the ball when they make solid contact, I hypothesize that improving contact should improve all outcomes, merely thanks to more of those best-hit balls. I had to relax the criteria slightly here to get a reasonable sample; using our top-10% EV gap and top-30% EV criteria from above, only 17 players improved their contact rate from one year to the next. I settled on top-15% EV gap and top-40% EV, but the analysis seems robust to different groupings.

Great news – this one seems to work. There were 40 players who hit my final criteria: a 95th-percentile vs. average exit velocity gap in the top 15% of the league, a 95th-percentile exit velocity in the top 40% of the league, 200 plate appearances in both seasons, a higher contact rate in year two than year one, and a wRC+ below 110 in season one. Those 40 hitters improved their wRC+ in year two by an average of 16 points – juicy! Our control group – 200 plate appearances in each year, better contact rate in year two than year one, and a wRC+ below 110 in year one – featured 367 players who improved by an average of seven points of wRC+.
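The comparison itself is simple arithmetic: average the year-over-year wRC+ change in the test group, then do the same for the control group. Here's a rough sketch under some assumptions. The keys (`wrc_plus`, `wrc_next`, `contact`, `contact_next`, `pa`, `next_pa`) and the `is_power_gap` predicate are hypothetical stand-ins, and as written the control group includes the power-gap hitters, which is a guess about how the two groups overlap.

```python
from statistics import mean

def mean_wrc_change(players):
    """Average wRC+ change from year one to year two (raises if empty)."""
    return mean(p["wrc_next"] - p["wrc_plus"] for p in players)

def contact_improvers(players, max_wrc=110, min_pa=200):
    """Control group: 200 PA in each year, wRC+ below 110 in year one,
    and a higher contact rate in year two than in year one."""
    return [
        p for p in players
        if p["pa"] >= min_pa
        and p["next_pa"] >= min_pa
        and p["wrc_plus"] < max_wrc
        and p["contact_next"] > p["contact"]
    ]

def compare(players, is_power_gap):
    """Return (test-group mean change, control-group mean change).

    is_power_gap: hypothetical predicate marking hitters who meet the
    exit velocity gap and 95th-percentile exit velocity cutoffs.
    """
    control = contact_improvers(players)
    test = [p for p in control if is_power_gap(p)]
    return mean_wrc_change(test), mean_wrc_change(control)
```

The point of the control group is to strip out the improvement you'd expect from any sub-110 wRC+ hitter who makes more contact, so that whatever is left over can be credited to the exit velocity profile.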

This isn’t a definitive study, and there are all kinds of potential problems with it. I’m looking at full-season contact rate, which doesn’t exactly help if you’re looking at contact rates in mid-May and looking for breakouts. I had to try a bunch of different tests to find criteria that worked, and while this result is statistically significant at the p=.05 level, you can find a lot of things that are “statistically significant” but not actually meaningful if you measure enough different combinations of variables. All I’m actually saying, after all, is that if there were no real effect, there’d be a 5% or less chance of seeing a result this large by random chance.
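To put a number on that multiple-testing worry, here's a small sketch of the standard family-wise error calculation. It assumes the tests are independent, which these overlapping samples certainly aren't, so treat it as a rough guide only. The function names are made up for illustration, and `num_tests` is whatever count of combinations you tried; I'm not claiming a specific number for this study.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across `num_tests`
    independent tests, each run at significance level `alpha`."""
    return 1 - (1 - alpha) ** num_tests

def bonferroni_alpha(num_tests, alpha=0.05):
    """Per-test threshold that keeps the overall false-positive
    rate at or below `alpha` (the Bonferroni correction)."""
    return alpha / num_tests
```

For a sense of scale, six independent tests at the .05 level would turn up at least one false positive roughly 26% of the time, which is why a single significant result out of many tries deserves some skepticism.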

With all those caveats, I think this is a useful result. A lot of what it’s saying is obvious, but it’s useful to know when intuitive-sounding logic actually works. If you can find a hitter who is capable of tattooing the ball when they connect, and then they start making more contact, that hitter is liable to break out. Again, it’s not rocket science, but it’s a useful way to narrow the population of hitters you’re looking at when hunting for a breakthrough.

Let’s quickly run through two examples so that you can see the type of hitter I’m talking about. In 2017, Hunter Renfroe posted excellent top-end power numbers but also had a lot of weak contact in addition to strikeout issues. He finished the year with a 94 wRC+, and it didn’t look flukishly bad or anything; his wOBA outstripped his xwOBA, for example. Then in 2018, he started making more contact and put up a 114 wRC+ without meaningfully improving his raw power. Since then, he’s continued to flash that 2018 form; from ’19-22, he’s compiled a 110 wRC+, an above-average line. Maybe it’s not exactly a breakout, but the skills he already had made a modest improvement in contact rate count.

For a more recent version – and potentially a more exciting one – consider Andrés Giménez. He showed excellent top-end power even in a miserable 2021 season (.218/.282/.351, 74 wRC+). Then he started making more contact in 2022 while retaining his sneaky power, and put up a monster .297/.361/.466 line, good for a 140 wRC+. We don’t know how the rest of his career will go yet, but those are the kinds of players I’m after: ones with the raw power to really leverage their contact rate gains.

As I mentioned in my earlier article, I’m also interested in hitters who consistently hit the ball hard and post above-average 95th-percentile exit velocities. I used Mookie Betts and Freddie Freeman as archetypical examples of this kind of hitter in that article, but they’re best-case scenarios. Instead, I’m looking for hitters who combine those two traits but aren’t yet hitting. I ran the same tests as above, only using that group instead of the high-gap group from the first set of tests.

More specifically, I started by looking for players with a 95th-percentile vs. average exit velocity gap in the bottom 25% of the league, average exit velocity in the top half of the league, 200 plate appearances in each season, and a wRC+ below 110 in season one. This didn’t yield many seasons, even with those lenient criteria; as it turns out, it’s hard to hit the ball both hard and consistently, and put up a bad wRC+. Who knew?

Still, I got 33 players in that group, and they improved their wRC+ by an average of 6.5 points next year. If you’ll recall from above, players who recorded a wRC+ below 110 improved by roughly six points the next year. This seems largely indistinguishable from noise. The same held true when I added conditions for lower chase rate or higher contact rate – our low-EV-gap hitters performed about as well as their unfiltered counterparts. If anything works well for this group, it’s contact rate, but the sample size is absolutely minuscule. Only 17 hitters qualified over my six years of data, which makes me want to discard it as insufficient.

Now, let’s bring in the lazybones who skipped down to the end: as I suspected, hitters with phenomenal cosmic power but an itty bitty hard hit rate are a useful group of players to scan through for breakouts. You just need to add one important factor: contact rate improvement. Hitters who already hit the ball hard but don’t do so consistently improve by a lot when they start to make more contact. They improve by far more than a reasonable control group. This definitely isn’t settled science; there are a pile of potential problems with the data. But I’m satisfied that there’s something here.

If you’re looking at some early-season results and wondering which hitters to believe in, the ones who already hit it hard and now hit it more frequently are a good group to trust. As a parting thought, here’s the list of hitters to keep an eye on that I posted in my earlier article. If these guys start to make more contact at the plate (and who knows how likely that is), I’ll raise my expectations immediately:

Breakout Candidates
Player              95th-Percentile EV (mph)   Avg EV (mph)   Difference (mph)
Oneil Cruz          112.7                      91.9           20.8
Jesús Sánchez       109.0                      89.4           19.6
Jorge Alfaro        108.8                      89.4           19.4
Michael Harris II   108.7                      89.4           19.3
Nolan Jones         108.3                      88.5           19.8
Shea Langeliers     108.0                      87.3           20.7
Joey Bart           107.3                      86.6           20.7
Jo Adell            107.2                      87.1           20.1
Dermis Garcia       106.6                      86.5           20.1
Jose Siri           106.4                      86.1           20.3

Ben is a writer at FanGraphs. He can be found on Twitter @_Ben_Clemens.

1 year ago

Interesting, if intuitive, results! I guess the next question is: how do you identify the hitters in this group that are likely to increase their contact rate?

1 year ago
Reply to  D-Wiz

The non-scientific method would be to look for three things:

#1 – a crappy contact rate in the prior season (more room to improve)
#2 – a combination of young age but ~ 2-3 years MLB experience.
#3 – some record of hitting success in the high minors (more relevant the younger the player is)

Trevor May Care Attitude
1 year ago
Reply to  tz

Those of us who have been waiting for a breakout from Siri after following his MVP campaign in the Dominican Winter League (somewhat similar to the high minors) a few years ago were no doubt intrigued by his inclusion on the list. The glove obviously plays in CF, to say the least, so he’ll keep getting chances.

1 year ago
Reply to  tz

Contact rates are probably one of the least fixable things. Guys that couldn’t hit enough used to be AAAA sluggers – this was correct. We are living through the worst generation of hitters in 100 years. Cue the juiced balls!