A Look at Quality of Contact Profiles

May 7, 2015

It seems like it should matter how hard you hit the baseball. That statement probably seems self-evident, but until this year we haven’t really had a whole lot of evidence to demonstrate whether that’s true. We have an old month of HITf/x data from 2009 and there’s non-public data about exit velocity, but until StatCast data arrived this year, we didn’t really have the tools to determine how much quality of contact matters.

Last week, FanGraphs launched quality of contact statistics courtesy of Baseball Info Solutions to add to this effort. The methodology isn’t based solely on a raw exit velocity, but the data stretches back to 2002 and it’s publicly available now and easy to manage. As soon as people realized the data was available, the sabermetric masses went to work to run preliminary tests on the data. One of the interesting things that showed up right away was that the data didn’t do a great job predicting itself in the future and things like Hard% didn’t correlate with stats like BABIP or LD% as well as we might have otherwise thought.

One possible reason for this is that the data might not be super accurate. There’s no way to run quality control tests without some type of independent batted ball technology and you and I don’t have access to enough data to make that work. That’s always a possibility, but BIS does great work so I doubt that’s the issue. Another reason the numbers might not line up is that quality of contact is based on the type of batted ball, so a hard fly ball is defined as hard relative to other fly balls and not relative to all batted balls in general. In other words, when looking at some of these correlations, it’s possible we just didn’t get granular enough with the numbers to get the results we were expecting.

I don’t think we know enough to really say why the year-to-year correlation for Hard% isn’t very high or why it doesn’t predict BABIP better than it does, but I also think looking at correlations for these kinds of statistics leaves a lot of ground uncovered. In essence, because batted balls are classified into one of three categories, a simple correlation isn’t the right methodology for seeing how these numbers relate to other stats or to themselves in the future. Realistically, you want to run a multivariate regression because a 15-50-35 soft-medium-hard profile is very different from a 25-40-35 profile even though the Hard% is exactly the same.
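To make the point about the two profiles concrete, here is a minimal sketch in Python. The function names are hypothetical and this is an illustration of the argument, not the regression itself: a single-stat comparison on Hard% cannot tell the two profiles apart, while any two of the three shares can.

```python
# The two example profiles from the text, as (soft, medium, hard) percentages.
profile_a = (15, 50, 35)
profile_b = (25, 40, 35)


def hard_only(profile):
    """What a simple correlation on Hard% alone gets to see."""
    return profile[2]


def free_components(profile):
    """The three shares sum to 100, so two of them pin down the profile.

    A regression would use two shares as predictors; including all three
    would make the predictors perfectly collinear.
    """
    soft, med, hard = profile
    assert soft + med + hard == 100
    return (soft, hard)


# Same Hard%, so a Hard%-only analysis treats these hitters as identical.
print(hard_only(profile_a) == hard_only(profile_b))
# Different once Soft% is included alongside Hard%.
print(free_components(profile_a) == free_components(profile_b))
```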

But the problem with running multivariate regressions is that they’re not very accessible. The science is better than what I’m going to show you, but the learning curve is much flatter for the data I’m going to present and will likely be almost as useful to the average reader. Keep in mind that we’re dealing with some shortcuts here and that I’m averaging out some relationships that might not be as simple as they look.

With those caveats out of the way, let’s take a look at how quality of contact profiles relate to a few stats, namely wRC+, ISO, and BABIP.

For this post, I took all players with at least 500 PA from 2012-2014. That’s 381 players, close to half a million plate appearances, and something like 325,000 classified batted balls. I then divided the sample of players up into four groups for each of the stats, so there’s a group of players with wRC+ ranging from 49-84, 85-99, 100-113, and 114-170, all with a roughly equal number of hitters in each group. I did the same thing for ISO and BABIP with different cut points, which are listed below.
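The exact procedure for choosing the cut points isn't spelled out here, so the following is only one plausible way to get four groups of roughly equal size: sort the hitters by the stat and slice the sorted list into quarters. The function name is hypothetical and the input is a stand-in for 381 real values.

```python
def split_into_four_groups(values):
    """Sort values and slice them into four groups of near-equal size."""
    ordered = sorted(values)
    n = len(ordered)
    # Integer boundaries at 0, n/4, n/2, 3n/4 and n.
    bounds = [i * n // 4 for i in range(5)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(4)]


# Stand-in data: 381 placeholder values, one per hitter in the sample.
groups = split_into_four_groups(range(381))
print([len(g) for g in groups])
print([(g[0], g[-1]) for g in groups])  # lowest and highest value per group
```

The first and last value in each group play the role of the cut points listed in the tables.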

Next, I calculated the average quality of contact profile for each group. To be clear, the Soft% for the top tier wRC+ group is the total number of soft hit balls divided by batted balls and not the average of each player’s Soft%. Some rounding was necessary, but it doesn’t look like that caused any problems.
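The distinction between a pooled rate and an average of individual rates is easy to see in a small sketch. The counts below are invented purely for illustration and the function names are hypothetical; the pooled version weights each hitter by how many balls he put in play, which is the approach described above.

```python
def pooled_rate(players, key):
    """Total events divided by total batted balls across the group."""
    total_batted_balls = sum(p["batted_balls"] for p in players)
    return sum(p[key] for p in players) / total_batted_balls


def mean_of_rates(players, key):
    """Unweighted average of each player's own rate."""
    return sum(p[key] / p["batted_balls"] for p in players) / len(players)


# Invented toy counts: a part-timer and an everyday player.
group = [
    {"soft": 30, "batted_balls": 200},
    {"soft": 180, "batted_balls": 900},
]

print(round(pooled_rate(group, "soft"), 3))
print(round(mean_of_rates(group, "soft"), 3))
```

With these toy numbers the two methods disagree because the everyday player contributes far more batted balls to the pooled total.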

The goal of this exercise is to help us get an idea of what a “good” quality of contact profile might look like. You can rank players by Hard%, but to really know if a player is doing well, you need to know all three numbers at once. Avoiding soft contact might be as valuable as hitting the ball hard, or it might not. Let’s dive in.

Weighted Runs Created Plus (wRC+)

wRC+	Soft%	Med%	Hard%
114+	14%	51%	35%
100-113	15%	54%	31%
85-99	17%	55%	28%
49-84	18%	57%	25%

Aside from getting a sense of the typical profile, the thing you learn here is that the differences are very subtle. The difference between the best hitters and the worst hitters is a drop of 10 percentage points in hard hit balls, divided up about evenly between soft hit balls and medium hit balls. In other words, we’re talking about two or three batted balls a week being hit soft or medium compared to hard creating the difference between great and awful. That seemed surprising at first, but it actually lines up pretty well with what we know about the game. The difference between great hitters and bad hitters is actually not that significant, as Crash Davis will surely remind you.
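As a rough sanity check on the per-week figure, the sketch below combines the approximate sample totals quoted earlier (about half a million plate appearances and 325,000 batted balls) with the 10-point Hard% gap. The weekly plate appearance counts of 25 and 30 are my assumption for an everyday player, not something taken from the data.

```python
# Approximate sample totals quoted earlier in the post.
TOTAL_PA = 500_000
TOTAL_BATTED_BALLS = 325_000

# Hard% gap between the top and bottom wRC+ groups (35% vs. 25%).
HARD_GAP = 0.10


def batted_balls_shifted_per_week(pa_per_week):
    """Batted balls per week that move from hard to soft or medium."""
    batted_ball_share = TOTAL_BATTED_BALLS / TOTAL_PA
    return pa_per_week * batted_ball_share * HARD_GAP


# Assumed weekly workloads for a regular; these are not from the data.
for pa_per_week in (25, 30):
    print(pa_per_week, round(batted_balls_shifted_per_week(pa_per_week), 2))
```

Under these assumptions the gap works out to roughly two batted balls a week, in the same neighborhood as the estimate above.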

Isolated Power (ISO)

ISO	Soft%	Med%	Hard%
.179+	14%	50%	35%
.146-.178	16%	53%	32%
.111-.145	16%	55%	29%
.046-.110	18%	59%	23%

Moving to ISO doesn’t change much. There’s a slightly larger gap between great and awful hitters in the hard to medium columns because you basically can’t hit for extra bases without hitting the ball hard. There aren’t a lot of soft doubles, triples, and homers, but there are a lot of ways to get singles without striking the ball very hard.

These first two tables are similar for a reason. Players with high ISO generally have high wRC+. They’re not perfectly overlapping groups, but it makes sense that power and overall production are similarly related to quality of contact profiles.

Batting Average on Balls in Play (BABIP)

BABIP	Soft%	Med%	Hard%
.322+	14%	54%	32%
.301-.321	16%	54%	30%
.281-.300	16%	54%	30%
.223-.280	17%	54%	29%

Here’s the one that grabs you. Compare this table to the two above and you notice a very different picture. The average profiles of a high BABIP guy and a low BABIP guy are virtually the same. Keep in mind that these are large samples of PA, so you’re not getting a lot of single year BABIP spikes throwing things for a loop.
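One quick way to see how different the BABIP table is from the other two is to subtract the bottom group's profile from the top group's in each table. The sketch below just re-keys the rounded percentages from the three tables; the names are hypothetical.

```python
# (Soft%, Med%, Hard%) for the top and bottom group in each table above.
TABLES = {
    "wRC+": {"top": (14, 51, 35), "bottom": (18, 57, 25)},
    "ISO": {"top": (14, 50, 35), "bottom": (18, 59, 23)},
    "BABIP": {"top": (14, 54, 32), "bottom": (17, 54, 29)},
}


def top_minus_bottom(table):
    """Percentage point gap between the top and bottom groups, by column."""
    return tuple(t - b for t, b in zip(table["top"], table["bottom"]))


for stat, table in TABLES.items():
    soft, med, hard = top_minus_bottom(table)
    print(f"{stat}: soft {soft:+d}, med {med:+d}, hard {hard:+d}")
```

The Hard% gap is 10 points for wRC+ and 12 for ISO, but only 3 for BABIP, with no difference at all in Med%.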

Cautiously, I think this table demonstrates that how hard you hit the ball is not really a determining factor when it comes to running above or below average BABIPs. When you think about it, that makes some sense. A hard hit fly ball is very different from a BABIP perspective than a hard hit line drive. The fly ball might have a higher run value, but it’s less likely to become a hit. Guys who don’t hit the ball very hard can run high BABIPs if they have the right spray chart and trajectory and guys who kill the ball might have low BABIPs if they hit a ton of pulled fly balls.

What this new data tells us is that high BABIP, average BABIP, and low BABIP hitters all tend to hit the ball with pretty similar authority. Guys who hit the ball hard don’t get more hits; their hits are just more valuable.

All of this is preliminary, and hopefully it helps you put some context around the new statistics. We also haven’t determined the “stabilization” rate for these statistics. I used a very large sample of data, but we don’t know how useful these stats are in month long samples versus year long samples, so that’s another point of caution.

In general, these stats seem to be pretty useful and could certainly offer some insight into struggling players and players who are breaking out when it comes to extra base power. I’m sure we’ll see plenty more about these numbers going forward, but for now, you have a little context to separate different quality of contact profiles.
