The production of new data by means of new recording technology is exciting, and the more data we get, the better we can become at analyzing it. We have come a long way since PITCHf/x was made available, but we still have much more to learn. We also now have Statcast data with defensive numbers and figures, as well as exit velocity for hitters and against pitchers, and right now that data is very interesting. But many people are working hard to transform the data from merely interesting to actually useful. If it remains interesting without becoming useful, it is still fascinating information to have, but also trivial from an analytical perspective. Organizations want the information to be useful. Exit velocity, one of the streams of data made available by Statcast, appears to have the potential to be very useful. Right now, however, I am still unsure what we have, and I am not alone.
As Ben Badler of Baseball America recently noted on Twitter:
Common thing I’m hearing from execs: They have an enormous amount of new data, but they’re still learning to turn it into usable information. Even the more data-driven organizations are still just scratching the surface of separating signal from noise and understanding what has predictive value.
Back in September, I gathered a bunch of exit velocity data on major league pitchers and attempted to make some sense of it. There seemed to be some evidence to suggest that if a low exit velocity were a repeatable skill, it might be helpful in limiting home runs. Not exactly groundbreaking, but at least from my perspective, interesting. Others have studied the data and found that exit velocity was five parts the responsibility of the hitter and just one part the responsibility of the pitcher, so perhaps in retrospect, I should have focused on hitters. What follows is my current attempt to do so.
The same caveats with respect to the exit velocity data with pitchers are also true with hitters. Here’s a disclaimer from my September post that remains applicable to the current piece:
Tony Blengino discussed the issues earlier in the summer, determining that roughly one-quarter of all the batted ball data is unavailable, and that the numbers which are available are the product more often of positive outcomes (because Statcast has trouble recording weakly hit balls). Blengino concluded:
“Don’t get me wrong… Statcast is a great thing, and we are only scratching the surface of what it can eventually become. The sample generated for this article yielded some benchmarks which will serve as the foundation for some analysis you will see here in the coming weeks. Still, when one is faced with a data set, one must put it into some sort of context, while acknowledging its limitations. In many of my previous articles here, I have warned readers to never take pure average velocity data at face value; launch angles, BIP type frequencies, pull percentages, etc., significantly affect hitter and pitcher performance, and can easily be overlooked. For this year, at least, we should additionally be aware of the Statcast data set’s unique shortcomings, which adjust the context within which analysis takes place.”
With those caveats in mind, we still have a very large dataset to use from the 2015 season. In future seasons, hopefully we can fine-tune the work with a view towards becoming more confident in the results, but there is still a good amount of data from which to learn. As I approached this data, I had three goals, as follows.
- Determine if exit velocity matters.
Some of this work has already been done for me and much of it is intuitive, but we want to make sure that, in general terms, the harder a ball is hit, the more likely it is to have a favorable result for the batter, and to examine if there is a cumulative effect for individual players.
- If exit velocity matters, determine if it is repeatable.
We know that hitters’ strikeout and walk rates stabilize quickly, and that isolated slugging is not too far behind, but that we need years for batting average and BABIP to stabilize. While we only have one year of data available, we can look at the two largest samples we have, before and after the All-Star Break, to see if the numbers hold up.
- If exit velocity is repeatable, attempt to find some application for the data.
Is there something to be learned from the data that might be useful in future seasons?
Does Exit Velocity Matter?
The short answer is yes, but the question of how much is also relevant. Back in August, Bradley Woodrum wrote an extensive piece for The Hardball Times looking at the soft, med, and hard percentages from Baseball Info Solutions as well as the exit velocity data. Woodrum found that the information from Baseball Info Solutions, with a big sample (2002-2015), was much more closely correlated with offensive numbers like ISO and wRC+ than exit velocity, and even in a smaller sample, the BIS numbers were slightly superior.
We also know there is a bit of a donut hole, where weakly hit balls at certain launch angles tend to just make it over the infield for hits. However, batted balls in that particular range (65 mph to 74 mph) account for less than 10% of all batted balls, and with very few extra base hits coming on those hits, the overall effect on offense is not as exaggerated as the effect on batting average there might lead us to believe. Removing these batted balls had little effect on a player’s average exit velocity. Among players who recorded at least 100 batted balls, the correlation between average exit velocity and percentage of balls hit at least 90 mph was strong (r^2 = .85), both before and after the All-Star Break.
Despite the issues above, we know that generally, the harder a player hits the ball, the more likely it is to become a hit and do considerable damage. This tweet from Daren Willman breaks down outcomes by batted ball velocity.
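For readers who want to try this kind of check themselves, here is a minimal sketch of the per-player summaries described above. It assumes each player's batted balls are available as a plain list of exit velocities in mph; the function names and the input shape are my own illustration, not anything provided by Statcast.

```python
from statistics import mean

HARD_HIT_MPH = 90.0      # "hit at least 90 mph" threshold from the text
MIN_BATTED_BALLS = 100   # minimum sample used in the text


def summarize_player(exit_velocities):
    """Return (average exit velocity, share of balls hit >= 90 mph).

    Returns None when the player has fewer than MIN_BATTED_BALLS batted balls.
    """
    if len(exit_velocities) < MIN_BATTED_BALLS:
        return None
    avg_ev = mean(exit_velocities)
    hard_hits = sum(ev >= HARD_HIT_MPH for ev in exit_velocities)
    return avg_ev, hard_hits / len(exit_velocities)


def drop_donut_hole(exit_velocities, low=65.0, high=74.0):
    """Remove batted balls in the 65-74 mph 'donut hole' range."""
    return [ev for ev in exit_velocities if not (low <= ev <= high)]
```

Comparing `mean(exit_velocities)` with `mean(drop_donut_hole(exit_velocities))` for a given player shows how much the donut-hole balls move his average.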
— Daren Willman (@darenw) October 24, 2015
With this information, we can identify the relationship between exit velocity and that stat on which it would seem to exert the most influence — namely, slugging percentage. To examine the correlation between the two, I began by finding the midpoint of each exit-velocity range for which there was a sample of at least 1,000 at-bats. Then I took the various total-base outcomes from Willman’s data to determine slugging percentage. Having made those calculations, I found the correlations between the exit-velocity buckets and slugging.
Even with the known donut hole in the 65-74 mph range, the correlation is very strong between exit velocity and slugging percentage. Beginning with the highest numbers, slugging decreases rather steeply between each of the buckets until reaching a plateau around the 85-89 mph range.
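The bucket calculation described above can be sketched roughly as follows. This assumes each exit-velocity bucket is a dictionary holding its mph bounds, its hit counts by type, and its at-bats; those key names are hypothetical, and the plain Pearson correlation here is my assumption about the measure, since the post does not say exactly which one was used.

```python
from math import sqrt


def bucket_midpoint(low_mph, high_mph):
    """Midpoint of an exit-velocity range, e.g. 85-89 mph -> 87.0."""
    return (low_mph + high_mph) / 2


def slugging(singles, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def velocity_slugging_correlation(buckets, min_at_bats=1000):
    """Correlate bucket midpoints with bucket slugging percentage.

    Buckets with fewer than min_at_bats at-bats are dropped first.
    """
    kept = [b for b in buckets if b["at_bats"] >= min_at_bats]
    midpoints = [bucket_midpoint(b["low"], b["high"]) for b in kept]
    slg = [
        slugging(b["singles"], b["doubles"], b["triples"],
                 b["home_runs"], b["at_bats"])
        for b in kept
    ]
    return pearson_r(midpoints, slg)
```

One caveat with this approach: a straight-line correlation on bucket midpoints treats every bucket equally, regardless of how many batted balls it contains.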
Despite the regularity of such figures across the entire population of batters, transferring those results to individual players is unlikely to produce such obviously strong relationships, given the vastly smaller samples. That said, there is still some relationship there.
Looking at a sample of the 130 players with at least 100 batted balls in each half last season, we find that the top-30 players by average exit velocity produced an average wOBA of .371, with only Wilson Ramos below .325 and only Mark Trumbo, Albert Pujols, and Robinson Cano below .340 on the season. Meanwhile, the bottom-30 players by exit velocity recorded an average wOBA of .308, with only Charlie Blackmon, Adam Eaton, and Francisco Cervelli topping .340 and no players topping .350 for the year. This is not to say that a player cannot succeed with a low exit velocity, or that a player with a high exit velocity is guaranteed success, but there is a pretty clear relationship between the two.
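A comparison like this one is straightforward to reproduce. The sketch below assumes a list of `(name, average exit velocity, wOBA)` tuples, one per qualifying player; that layout and the function name are illustrative assumptions rather than the format of any actual data export.

```python
from statistics import mean


def group_woba_by_exit_velocity(players, group_size=30):
    """Return (mean wOBA of the top group, mean wOBA of the bottom group).

    players: list of (name, avg_exit_velocity, woba) tuples.
    Groups are the group_size highest and lowest players by exit velocity.
    """
    if len(players) < 2 * group_size:
        raise ValueError("not enough players for two non-overlapping groups")
    ranked = sorted(players, key=lambda p: p[1], reverse=True)
    top = ranked[:group_size]
    bottom = ranked[-group_size:]
    return mean([p[2] for p in top]), mean([p[2] for p in bottom])
```

Note that this is a simple average of each player's wOBA, not one weighted by plate appearances.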
The graph below shows a scatter plot of the 130 players with their average exit velocity and wOBA.
The relationship is not perfect, but the correlation is actually better than the one Woodrum found in his piece in August (the correlations of wOBA and wRC+ with exit velocity are nearly identical). While a correlation coefficient of close to 0.6 and an r-squared around 0.4 might not seem high, it is probably appropriate to provide some context by comparing them with more common statistics.
The table below contains the correlation numbers between a number of fairly common statistics (plus exit velocity) and overall production, as represented by wOBA. The data set here is the same as that produced by the 130 players mentioned above.
The results in the table provide for a mixed set of conclusions. On the one hand, exit velocity sits well below the more important offensive statistics when it comes to correlating with overall offensive output, even falling behind batting average. Generally speaking, the table helps show why FanGraphs and others encourage people to look at on-base percentage as opposed to batting average, as the former is more closely tied to offensive production.
To look at the data in a more optimistic light, it’s notable that exit velocity is so close to batting average. Batting average, like on-base percentage and slugging percentage, is a results-based stat that has a direct component-link to the overall production represented in wOBA. Exit velocity, on the other hand, represents nothing other than how hard the ball was hit. With batting average, we know that a ball dropped in and that the corresponding change in wOBA will move in the same direction. Even though exit velocity carries no information about the result, its correlation with wOBA is similar to that of batting average.
While the relationship might not be as strong as one would hope from this new technology, it is possible that improvements might provide a stronger relationship in the future. Even with the information that we have now, it is fair to say that there is a relationship between batted ball velocity and offensive production for hitters, allowing us to move forward to try to answer the next question: is exit velocity a repeatable skill?
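A correlation table of this kind can be assembled with a few lines of code. The sketch below assumes one dictionary per player, keyed by stat name (the key names such as `"wOBA"` and `"EV"` are hypothetical), and reports both r and r-squared for each stat against wOBA. As a point of reference, an r of 0.6 corresponds to an r-squared of 0.36.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlations_with_target(rows, target="wOBA"):
    """Return [(stat, r, r_squared), ...] sorted by r_squared, highest first.

    rows: list of dicts, one per player, mapping stat name -> value.
    Every stat other than the target is correlated with the target.
    """
    target_values = [row[target] for row in rows]
    table = []
    for stat in rows[0]:
        if stat == target:
            continue
        r = pearson_r([row[stat] for row in rows], target_values)
        table.append((stat, r, r * r))
    return sorted(table, key=lambda entry: entry[2], reverse=True)
```

Sorting by r-squared rather than r keeps strongly negative relationships, such as one might expect from strikeout rate, near the top of the table alongside strongly positive ones.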
Craig Edwards can be found on Twitter @craigjedwards.