What Statcast’s New Bat Tracking Data Does and Doesn’t Tell Us

May 14, 2024

The bat tracking era is here, and nothing will ever be the same again. Wait, no, that’s not right. Baseball is going to continue pretty much exactly as it was. Pitchers will throw the ball, hitters will swing at it, and then people will run around the field either trying to catch it or touch bases. But baseball analysis is going to start looking different, because we analysts have new shiny toys and a plethora of new ideas to test out. That’s very exciting, and also possibly a little overwhelming. So today, I thought I’d take you on a tour of what the high-level summary numbers do and don’t say about hitting, as well as stump for more granular analysis. I’m sure I’m not alone on either of those points, but still, it’s good to say it out loud. So let’s talk about average swing speed, average swing length, squared-up rate and blast rate, shall we?

Swing harder, do better, right? Well, maybe. That makes sense broadly, and it particularly makes sense when you look at some of the names dotting the top of the swing speed leaderboard. Juan Soto, Aaron Judge, Yordan Alvarez, William Contreras, Mike Trout, Shohei Ohtani, Gunnar Henderson — there are plenty of hitters at the top of the swing speed leaderboard who are unquestionably excellent.

Pop down to the bottom, though, and you could make a pretty great offensive team out of the soft swingers too: Luis Arraez, Steven Kwan, Justin Turner, Marcus Semien, Isaac Paredes, Will Smith, Jose Altuve. In aggregate, there simply isn’t much correlation between average swing speed and offensive production, as measured by wRC+. More specifically, there’s a 0.11 correlation coefficient between swing speed and wRC+. That means, broadly speaking, that variation in swing speed explains only about 1% of variation in wRC+ (0.012 r-squared). A quick note: I used an 80 PA cutoff for this and all subsequent calculations in the article, just so we’re comparing apples to apples.
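For anyone who wants to reproduce that kind of number, here is a minimal sketch of the calculation: filter to hitters above a plate appearance cutoff, compute the Pearson correlation coefficient, and square it. The field names (`pa`, `swing_speed`, `wrc_plus`) and the helper names are hypothetical, chosen for illustration; they are not Statcast or FanGraphs column names.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either column is constant.
    return cov / sqrt(var_x * var_y)

def swing_speed_vs_wrc(hitters, min_pa=80):
    """Apply the PA cutoff, then return (r, r_squared).

    `hitters` is a list of dicts with hypothetical keys
    'pa', 'swing_speed', and 'wrc_plus'.
    """
    qualified = [h for h in hitters if h["pa"] >= min_pa]
    r = pearson_r(
        [h["swing_speed"] for h in qualified],
        [h["wrc_plus"] for h in qualified],
    )
    return r, r * r

# Sanity check on the figure quoted above: r = 0.11 implies r-squared near 0.012.
print(round(0.11 ** 2, 3))  # 0.012
```

The r-squared is just the square of the correlation coefficient, which is why a correlation that sounds modest (0.11) translates into almost no explained variance.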

That’s obvious if you stop and think about it. Baseball isn’t a fast swing competition. Swinging fast helps, obviously. But Giancarlo Stanton isn’t the best hitter in baseball history despite almost certainly being the hardest swinger, and Luis Arraez isn’t the worst hitter in baseball, or even close to it. Swing speed is merely one data point that describes a little bit of what a hitter does at the plate.

Swing length is another interesting new data point. Just like swing speed, you can draw some obvious conclusions without even looking through the data. A longer swing means more strikeouts, right? Well…

Swing length and strikeout rate have a 0.277 correlation coefficient and a 0.077 r-squared. That’s not bad. You can do better by looking at swing length against whiffs per swing, which strips out strike zone judgment (not what we’re looking for here). That gets us an r-squared of 0.152, which is better on a relative basis but still not huge; it’s about the same as the year-over-year correlation of BABIP, which we know is quite noisy. Both average swing speed and squared-up rate (in the opposite direction) have stronger correlations with whiffs per swing, in fact. But it’s generally true that slower, shorter swings that prioritize getting the head of the bat on the ball result in fewer strikeouts.

Swing length is also heavily correlated with pull rate. If you’re trying to hit the ball in front of the plate to pull the ball, your bat will naturally travel farther, even with the exact same swing mechanics, than if you meet the ball earlier in your swing. I don’t have a great way of controlling for this yet, but it’s plausible that by controlling for batted ball tendency, you could find even stronger relationships between swing length and contact rate. That’s not to say that a long swing is bad, just that it comes with tradeoffs.

Let’s get back to bat speed. It’s clear from an initial inspection that you can’t say a ton about a hitter’s overall performance just by looking at how fast they swing. There are still some things to learn here, though. For example: The harder your average swing, the more damage you do on contact in general. I’ve listed enough correlation coefficients to put even the most stat-obsessed readers to sleep, so at this point I’m just going to show a grid of all of them and be done with it:

A Big Pile of Bat Tracking Correlation Coefficients

Statistic	wRC+	K%	Whiff/Swing	BABIP	wOBACON	xwOBACON
Average Swing Speed	0.110	0.406	0.459	-0.007	0.351	0.510
Hard Swing Rate	0.164	0.301	0.376	0.016	0.357	0.511
Swing Length	-0.013	0.277	0.390	-0.087	0.148	0.186
Squared-Up%	0.142	-0.667	-0.664	-0.019	-0.184	-0.167
Blast%	0.361	0.007	0.043	0.111	0.381	0.573

Why is swing speed more closely correlated to xwOBACON than wOBACON? Two reasons. First, we’re dealing with small samples across the board and production on contact is noisy, which means that a hitter with a ton of seeing-eye singles on mishits can mess up the data. Second, wOBA cares a lot about the horizontal angle of your hits, but neither xwOBA nor swing speed does. Isaac Paredes doesn’t swing hard, and doesn’t hit the ball particularly hard as a result. His xwOBACON and swing speed agree. But he’s dumping those batted balls over the left field fence for home runs, and wOBA knows that.

I suspect that raw bat speed data is going to end up a lot like other raw pitch and exit velocity data: interesting but incomplete. I didn’t need a leaderboard to tell me that Giancarlo Stanton swings harder than any other player in baseball, because I have seen Giancarlo Stanton swing before. It’s really cool that we’re now measuring this, and I’m sure you’ll hear it on broadcasts all the time going forward, but the link to production is tenuous enough that I don’t think it’s a great statistic all by itself.

Now onto the slightly more complicated statistics: squared-up rate and blast rate, which measure hitters’ ability to hit the ball right on the nose, and do so with high swing speed in the case of blasts. Squared-up rate seems like an obviously great metric right off the bat. When you hit the ball right on the sweet spot, you’d expect a lot more line drives. After all, Luis Arraez is the king of soft line drives and also the king of squared-up rate. Just one problem: there’s no correlation between squared-up rate and line drive rate, at least in 2024 data.

Now, maybe that’s just a sample size issue. Line drive rate is noisy even at a seasonal level, never mind after a month and a half of play. And if you zoom in closer, the effect does seem real. Batted balls that Statcast categorizes as squared up carry a 28% line drive rate so far this year; balls that aren’t squared up have a 19.1% mark. The issue here is that for a hitter who increases their squared-up rate by five percentage points, we’re talking about an increase of roughly 0.4 percentage points of line drive rate. Half the players in baseball have a squared-up rate between 22% and 29.4%. The differences here are small. Beyond that, squared-up rate is essentially uncorrelated with BABIP and only weakly, and negatively, correlated with wOBACON and xwOBACON. Should we just give up on squared-up rate?
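The arithmetic behind that small line drive bump can be sketched as a simple weighted blend. The two line drive rates come from the paragraph above; the function names are made up for illustration, and the sketch assumes the squared-up rate is measured as a share of batted balls, which is a simplification.

```python
# Line drive rates quoted in the article (2024, through mid-May).
LD_RATE_SQUARED_UP = 0.28
LD_RATE_NOT_SQUARED_UP = 0.191

def expected_ld_rate(squared_up_share):
    """Blend the two line drive rates by the share of batted balls squared up.

    Assumption: squared_up_share is a fraction of batted balls, not of swings.
    """
    return (squared_up_share * LD_RATE_SQUARED_UP
            + (1 - squared_up_share) * LD_RATE_NOT_SQUARED_UP)

def ld_rate_gain(delta_squared_up):
    """Change in line drive rate for a given change in squared-up share."""
    return delta_squared_up * (LD_RATE_SQUARED_UP - LD_RATE_NOT_SQUARED_UP)

# Five more percentage points of squared-up contact, expressed in
# percentage points of line drive rate (a bit under half a point).
print(ld_rate_gain(0.05) * 100)
```

The gap between the two line drive rates is 8.9 percentage points, so shifting five percent of batted balls from one bucket to the other moves the overall rate by only about 0.05 × 8.9, which is where the figure of roughly 0.4 comes from.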

I don’t think so, despite those uninspiring numbers. Squared-up rate and average swing speed are quite correlated themselves, and in the logical direction: the harder you swing, generally speaking, the less frequently you hit the ball square. That’s why Juan Soto’s combination of fearsome swings and great contact is so impressive. If you run both swing speed and squared-up rate through a multivariate linear regression against measures of production on contact, they’re both significant. In other words, holding the other constant, swinging harder and squaring the ball up more frequently both increase production.

Without digging too deep into the statistical minutiae, these two statistics are so correlated that I don’t have a lot of confidence in that regression. But even after you correct for that multicollinearity, there’s a clear relationship: swing the bat harder or make optimal contact more frequently, and you’ll tend to do better on contact. Still, those two things don’t explain all or even most of a hitter’s production on contact. There’s plenty more to it than just swinging hard and catching the ball on the barrel of the bat. That’s a milquetoast conclusion, sure, but it’s still a useful one to me; it’s good to make sure that two plus two is four before you start on differential equations.
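To make the multicollinearity worry concrete, here is a rough sketch of a two-predictor regression on standardized variables, along with the variance inflation factor (VIF) that measures how much the overlap between predictors inflates coefficient uncertainty. The article does not say how the correction was done, so this only shows one common way to measure the problem. The function names and the sample numbers are made up for illustration and are not real Statcast data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def two_predictor_fit(x1, x2, y):
    """Standardized OLS coefficients for y ~ x1 + x2, plus the shared VIF.

    With exactly two predictors, the standardized betas and the VIF have
    closed forms in terms of the three pairwise correlations.
    """
    r_12 = pearson_r(x1, x2)   # correlation between the predictors
    r_y1 = pearson_r(x1, y)
    r_y2 = pearson_r(x2, y)
    denom = 1 - r_12 ** 2      # approaches 0 as the predictors become collinear
    return {
        "r_12": r_12,
        "beta_1": (r_y1 - r_y2 * r_12) / denom,
        "beta_2": (r_y2 - r_y1 * r_12) / denom,
        "vif": 1 / denom,
    }

# Made-up illustration: five hypothetical hitters.
swing_speed = [70.0, 71.5, 73.0, 74.5, 76.0]     # mph
squared_up = [30.0, 27.0, 28.0, 24.0, 25.0]      # percent
xwobacon = [0.340, 0.350, 0.380, 0.370, 0.410]

print(two_predictor_fit(swing_speed, squared_up, xwobacon))
```

A VIF near 1 means the predictors barely overlap; the more negative the correlation between swing speed and squared-up rate, the larger the VIF gets and the less weight any single coefficient deserves.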

The same is generally true of blast rate, the rate at which a hitter squares the ball up while swinging hard. It does a bit better because it’s capturing what I was talking about up above; harder swings square the ball up less frequently in general, so you want both when you’re looking for production. But again, plenty of other variables go into this as well. As you might expect, swinging your bat hard and squaring the ball up are at their best when it comes to producing solid contact at positive launch angles.

That sounds a lot like launch angle and exit velocity, and we know that those inputs do a good job, but not a perfect job, of explaining production. As best as I can tell, that’s just a fundamental limitation of statistics like this that isolate a small portion of what’s involved in hitting a baseball. There are plenty of other things you can do to generate value, and some of them might even decrease your swing speed or squared-up rate, which will forever frustrate analysis.

Okay, so we know that neither raw swing speed nor squared-up rate does a great job of predicting overall production. What can we learn from this new bat tracking data, then? First of all, it’s just incredibly cool that it exists. Hitters can and should behave differently based on their swing speed, and now we can quantify that more than ever before.

More importantly, the neat part of this data is largely in granular interpretation. I find it fascinating that in-zone swings at secondary pitches are meaningfully faster, on average, than in-zone swings at fastballs. Hitters swing faster at in-zone fastballs when they’re ahead in the count than when they’re behind in the count; that makes intuitive sense, but we have the actual evidence now. Before, you could say something like, “Hitters can sit on a fastball thrown to a particular area when they’re ahead in the count and unload if they get it, but they have to react when they’re defending the zone,” but now you can prove it. They do much better on those early-count swings, which we already knew; now we just know why with more certainty.

Here’s a fun one: Early-count secondary pitches get squared up less frequently than two-strike ones, but at higher bat speeds. There’s a strong intuitive pattern here. Hitters slow their swing and prioritize contact when they get behind in the count. But even though they’re squaring the ball up slightly more frequently, that squared-up contact is less productive – they’re swinging more slowly, after all.
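The granular splits described above boil down to grouping individual swings and summarizing each group. Here is a minimal sketch of that idea; the record layout (`pitch_group`, `count_state`, `bat_speed`, `squared_up`) is hypothetical, and the sample rows are invented purely to show the shape of the calculation.

```python
from collections import defaultdict

def summarize_swings(swings):
    """Average bat speed and squared-up rate per (pitch_group, count_state)."""
    buckets = defaultdict(list)
    for s in swings:
        buckets[(s["pitch_group"], s["count_state"])].append(s)

    summary = {}
    for key, group in buckets.items():
        n = len(group)
        summary[key] = {
            "swings": n,
            "avg_bat_speed": sum(s["bat_speed"] for s in group) / n,
            "squared_up_rate": sum(1 for s in group if s["squared_up"]) / n,
        }
    return summary

# Invented rows, not real tracking data.
sample = [
    {"pitch_group": "secondary", "count_state": "early", "bat_speed": 75.1, "squared_up": False},
    {"pitch_group": "secondary", "count_state": "early", "bat_speed": 73.8, "squared_up": True},
    {"pitch_group": "secondary", "count_state": "two_strike", "bat_speed": 70.2, "squared_up": True},
    {"pitch_group": "fastball", "count_state": "early", "bat_speed": 72.4, "squared_up": True},
]

for key, stats in sorted(summarize_swings(sample).items()):
    print(key, stats)
```

Keeping the swing count in each bucket matters: a split that looks dramatic on a few dozen swings may not hold up once the sample grows.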

When we get more months and years of bat tracking data, the applications will only increase. Is a hitter cold because that’s how hitters get sometimes, or is he physically compromised? Is that new shorter swing making up for its lower bat speed with better contact numbers? Has that aging veteran remade his swing to prioritize contact now that his bat speed is flagging? Mike Petriello is already talking about new applications, too: attack angle and miss distance will put swing speed data in much better context, and I’m excited to get them in the fold.

There are surely some cool applications on the pitching side, too; obviously pitchers don’t do a ton to affect bat speed, but “avoiding the fat part of the bat” has long been the holy grail of contact managers. Only 11.1% of swings at Hunter Harvey’s fastball square the ball up, while 34.4% of swings at Adrian Houser’s fastball do. That sounds like an amazing discovery, but we’ll need to see how stable these statistics are to really know for sure.

When we find ways to measure something previously unmeasurable, it’s tempting to ascribe great untapped analytical power to those things. And to be clear, I think that there are going to be some cool advances in public-side analysis that wouldn’t have been possible without this data. One thing that almost certainly won’t advance public-side analysis, though, is asking, “Oh hey, who swings the hardest?” and leaving it at that.

So go out and have fun looking at this new information, and reading the interpretations and musings of people like me who are trying to find some new stories to tell with it. But be aware of the inherent limitations. Bat tracking isn’t enough to tell you who’s good and who isn’t, and it doesn’t have to be. We already have a bunch of ways of measuring that. Now, we’re just expanding our horizons a bit more.

27 Comments

HappyFunBall (Member since 2019)

1 year ago

“obviously pitchers don’t do a ton to affect bat speed”

If a batter’s swing speed (or variation thereof) is a reaction to the count, pitch type, and location then I’d say that a pitcher who knows those tendencies absolutely does have some control

CC AFC (Member since 2016)

Reply to HappyFunBall

I think the point is more like a pitcher doesn’t control that any specific hitter swings at 74 mph vs 76 mph, as an example, rather than they don’t control whether a hitter has to utilize their “A” swing in a particular count.

Ryan (Member since 2024)

And the opposite may be true! In my comment above, I suggest that’s where the real hitter skills would stand out: In the ability to slightly change your swing speed and length based on pitch recognition.

A Salty Scientist (Member since 2024)

Reply to Ryan

Right. Conventional wisdom is that hitters should “shorten up” with 2 strikes. But what does that look like in practice? Do some hitters just have one swing, even with 2 strikes? Does shortening up more correlate with contact%?
