In part one of this three-part series, we examined the relatively strong correlation between exit velocity and slugging percentage, as well as the similarly strong correlation with individual wOBA, a solid proxy for offensive production at the plate. While there might be some debate over how important exit velocity is to offensive production, particularly at the individual level, we know there is some relationship, and that relationship is enough to move on to the next question: whether exit velocity represents a repeatable skill.
We first attempted to answer the question of whether exit velocity matters. Once we know that it matters, it is still incredibly important to try to determine whether it is a skill. An appropriate analogy might be as follows: we know that pitcher BABIP against is important because when the BABIP is higher, the pitcher gives up more hits and runs. Unfortunately, we know a lot less about identifying pitchers who can suppress BABIP or pitchers who seem to be prone to a high BABIP. We might believe that it is a repeatable skill; however, if it takes an incredibly long time to figure out who has the skill and who does not, then using a pitcher’s BABIP against to try to predict future performance is of limited use.
With just one season of mostly complete exit-velocity data, coming to definitive conclusions is impossible — especially since we do not have enough data to be sure about a stabilization point, nor do we know if exit velocity is the type of information that fluctuates from year to year. However, we can take the biggest swath of data available, and see if exit velocity is a repeatable skill that could have some predictive uses.
To use the data available for individual players, I took every player that recorded at least 100 batted balls in both the first half and the second half of 2015 and compared their numbers. The list of players numbered 130, with 117 qualifying for the batting title, another 12 who recorded at least 425 plate appearances and David Murphy bringing up the rear with 391 plate appearances last year.
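The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this article: the `HalfSplit` record, its field names, and the sample rows are hypothetical, and only the 100-batted-ball threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass
class HalfSplit:
    """Hypothetical per-player record; the field names are assumptions."""
    player: str
    first_half_batted_balls: int
    second_half_batted_balls: int


def qualifying_players(splits, min_batted_balls=100):
    """Keep only players who reach the threshold in BOTH halves."""
    return [
        s for s in splits
        if s.first_half_batted_balls >= min_batted_balls
        and s.second_half_batted_balls >= min_batted_balls
    ]


# Made-up rows for illustration only (not real 2015 data).
sample = [
    HalfSplit("Player A", 180, 165),
    HalfSplit("Player B", 140, 95),   # misses the second-half cutoff
    HalfSplit("Player C", 100, 100),  # exactly at the cutoff, so included
]

kept = qualifying_players(sample)
print([s.player for s in kept])
```

Requiring the minimum in both halves, rather than across the full season, keeps either half from resting on a tiny sample.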
To provide appropriate comparisons, I used a number of different statistics from first half to second half to determine a relationship, or lack thereof, between the halves.
We will start with batting average, which has a poor correlation between halves.
The points are all over the place in this chart. A player’s first-half batting average is not very instructive in determining batting average for the second half. This should not come as much of a surprise given what we know about stabilization points for hitting statistics, which are covered in the FanGraphs library entry on sample size. Generally, the stabilization point for a hitting statistic is the sample size at which the correlation coefficient between one sample and the next same-size sample reaches 0.7 (for an r-squared of close to .5). In his piece on the subject, Russell Carleton describes it like this:
If I say that 100 PA is enough of a sample to get adequate reliability on a measure (let’s say batting average), then I should be able to take 100 PA and calculate the batting average from those and then another 100 PA with which to calculate AVG. I could do this for some group of players and I could see how well their AVGs in the first group of 100 PA correlate with their second group. Why am I obsessed with .70? Because a correlation of .70 means an R-squared of 49%. Anything north of .70 means that a majority of the variance (> 50%) is stable. Higher correlations mean more stability, which is always better, but .70 is usually “good enough for government work.”
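The split-sample check described in that passage amounts to computing a Pearson correlation between two same-size samples and squaring it. The sketch below is a minimal illustration that assumes each player's two values have already been computed; the function names and the sample numbers are hypothetical and are not drawn from this article's data set.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def is_stable(r, threshold=0.70):
    """Apply the .70 rule of thumb (an r of .70 is an r-squared of .49)."""
    return r >= threshold


# Made-up batting averages for five players in two samples (illustration only).
first_sample = [0.250, 0.310, 0.275, 0.290, 0.240]
second_sample = [0.265, 0.280, 0.300, 0.270, 0.255]

r = pearson_r(first_sample, second_sample)
print(f"r = {r:.2f}, r-squared = {r * r:.2f}, stable: {is_stable(r)}")
```

With real data, each list would hold one value per player, in the same player order in both lists.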
For batting average, the stabilization point is 910 at-bats, nearly two full years. As we have just one year of exit-velocity data, in order to compare apples to apples, we looked at just one year of batting-average data above, supplemented by our knowledge of the stabilization rate, even though we have multiple years of data for batting average.
So what happens if we look at a scatter plot for a statistic with a much quicker stabilization rate? For ISO, the stabilization rate is 160 plate appearances. This is what the half-to-half scatter plot for ISO looks like for 2015.
We know that if we ran the data for all players over many years, we would see a higher correlation, given what we know about the stabilization rate. Because we are looking at just 130 players in one season, the .56 correlation coefficient and 0.31 r-squared are understandably lower.
One last scatter plot before showing exit velocity depicts wOBA marks for our sample of 130. The plot below shows the relationship between halves for offensive production at the plate.
As the above plot shows, there is some relationship between first-half and second-half wOBA, unlike the complete lack of a relationship for batting average. However, the relationship is not as strong as the chart we considered for ISO. Now, let us look at the relationship between first-half and second-half exit velocity.
This is a slightly stronger relationship than ISO, which has a fairly quick stabilization rate. I also ran the numbers with higher minimums for batted balls in the first half (150) and second half (120), which limited the sample to 90 players, and got virtually the same correlation. We have what appears to be a relatively strong relationship when it comes to exit velocity by half compared to other statistics. The chart below has a rundown of a number of statistics, their correlation numbers for last season, and the stabilization rates from our library.
|Stabilization Rate (PA)||r||r-squared|
The relationship between first- and second-half exit velocity is not as strong as walk rate or strikeout rate, but it beats all the other statistics for last year. Eighty-six of 130 players recorded a second-half figure within 2.0 mph of their first-half exit velocity. Only 16 players produced changes greater than 3.0 mph from the first half to the second half.
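Counts like these come from comparing each player's two half-season averages against a tolerance. Here is a minimal sketch, assuming exit velocities are stored as (first half, second half) pairs in mph; the function name and sample values are hypothetical rather than this article's data, and treating the boundary as inclusive is also an assumption.

```python
def count_within(pairs, tolerance_mph):
    """Count players whose second-half average is within the tolerance of the first.

    The comparison is inclusive at the boundary (an assumption).
    """
    return sum(1 for first, second in pairs if abs(second - first) <= tolerance_mph)


# Made-up (first_half_mph, second_half_mph) pairs for illustration only.
exit_velocities = [
    (90.0, 91.5),  # change of 1.5 mph
    (88.0, 84.5),  # change of 3.5 mph
    (92.5, 90.0),  # change of 2.5 mph
    (87.0, 87.0),  # no change
]

within_two = count_within(exit_velocities, 2.0)
beyond_three = len(exit_velocities) - count_within(exit_velocities, 3.0)
print(within_two, beyond_three)
```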
There is going to be some variance in these results from year to year. As the chart shows, on-base percentage was ahead of slugging despite historical differences. That could mean in other years ISO might jump ahead of exit velocity, but if the exit-velocity data improves, the correlation might be even stronger. In any event, this data is encouraging. The preliminary data here suggests that exit velocity could stabilize relatively quickly, that it is repeatable, and that in-season conclusions based on the data could be meaningful. That leads to the final question I hoped to tackle with this data, which is finding an application for a player’s average exit velocity.
Craig Edwards can be found on Twitter @craigjedwards.