Archive for Research

The Messy Middle Part of the Season

Bill Streicher-USA TODAY Sports

Remember back in 2021 when Gen Z tried to tell everyone to move their side parts to the middle and swap their skinny jeans for a looser variety? While most Millennials responded with outward indignation, offline they begrudgingly tried on high-waisted mom jeans and posted up in the bathroom blowing out their hair in a new direction. But before long they let their hair go back to lying in the manner to which it had become accustomed and eschewed jeans completely in favor of athleisure-wear. Even as many of us considered complying with the directive of our teenaged overlords, it felt absurd that people who haven’t even finished developing their prefrontal cortexes are left in charge of dictating what’s cool. As it turns out, though, that’s exactly why teenagers decide what’s cool. Teenagers are the only members of society with the time, energy, and lack of rationality to care so deeply about something that matters so very little.

Those who stuck to their dated stylings and weathered the petty hail storm of Zoomer mocking were vindicated a couple of months ago, when the celebrity and influencer cohort brought back the side part, declaring it on-trend once more. Around that same time another trend was taking hold among the baseball commentariat: Using strength of schedule to determine which teams had actually earned their W-L records. Mostly, this meant arguing that the Phillies weren’t a top team in the league because they’d played a soft schedule. The discourse eventually spawned multiple articles arguing that while yes, Philadelphia hadn’t exactly been slaying dragons while walking a tightrope, its act wasn’t entirely smoke (generated by the clubhouse fog machine) and mirrors either.

Strength of schedule is not typically a prominent talking point when comparing MLB teams. It might occasionally come up when comparing September schedules in a tight postseason race, but as a phrase uttered in May, it’s typically part of a college baseball discussion, or because you’ve wandered into a BCS-era college football forum. College sports need strength-of-schedule metrics because teams don’t all play one another and the variation in team quality spans the Big Ten’s new geographical footprint. But in the major professional leagues, the schedule is fairly balanced, and even though the White Sox and Rockies exist, dominating the worst teams in MLB presents a tougher task than rolling over the University of Maryland Baltimore County Golden Retrievers. Read the rest of this entry »


Bat Tracking Shows That Hitting Is Reacting

Isaiah J. Downing-USA TODAY Sports

It’s been five weeks since Major League Baseball unveiled its first trove of bat tracking data. In that time, we’ve learned that Giancarlo Stanton swings hard, Luis Arraez swings quickly, and Juan Soto is a god who walks among us unbound by the irksome laws of physics and physiology. We’ve learned that Jose Altuve really does have the swing of a man twice his size, and that Oneil Cruz has the swing of a slightly less enormous man. Mostly, though, we’ve learned when and where batters swing their hardest. This is my fourth article about bat tracking data, and in gathering data for the previous three, I constantly found myself stuck in one particular part of the process: controlling for variables.

As baseball knowledge has advanced from the time of Henry Chadwick to the time of Tom Tango, we first found better, more descriptive ways to measure results. We went from caring about batting average to caring about OPS. We found better ways to weight the smaller results that add up to big ones, going from ERA to FIP and from OPS to wRC+. Then we got into the process behind those results. We moved to chase rates and whiff rates, and the ratio of fly balls to groundballs. With the advent of Statcast, we’ve been able to get deeper than ever into process. We can look at the physical characteristics of a pitch, just a single pitch, and model how well it will perform. Within a certain sample size, we can look at a rookie’s hardest-hit ball, just that one ball, and predict his future wRC+ more accurately than if we looked at the wRC+ from his entire rookie season.

Similarly, when I looked at average swing speed and exit velocity from the first week of bat tracking, I found that swing speed was more predictive of future exit velocity. Exit velocity is the result of several processes: You can’t hit the ball hard unless you swing hard and square the ball up, and you can’t square the ball up if you pick terrible pitches to hit. Between 2015 and 2023, our database lists 511 qualified batters. I measured the correlation between their average exit velocity and their wRC+ over that period. R = .63 and R-squared = .40. But because bat tracking takes us one more step away from results and toward process, it’s further divorced from overall success at the plate. The day after bat speed data was first released, Ben Clemens ran some correlation coefficients between some overall metrics of success. He found a correlation of .11 between average swing speed and wRC+. Now that we have more data, I re-ran the numbers and found that correlation has increased to .25. That’s a big difference, but over the same period, the correlation between wRC+ and average exit velocity is .47.

If you want to know how hard a batter is swinging, you’ll find that it’s dependent on the count, the type of pitch, the velocity of the pitch, the location of the pitch, the depth of contact, and whether contact takes place at all. As a result, if you want to measure any one factor’s effect on swing speed, you need to control for so, so many others. The more I’ve sorted through the data, the more I’ve come to appreciate the old adage that pitchers control the action. Bat tracking shows us just how right people are when they say that hitting is reactive. It shows us that different pitches essentially require different swings.

When Tess Taruskin started putting together her Visual Scouting Primer series, she asked around for scouting terms and concepts that people had a hard time picturing. Barrel variability was at the top of my list. I know that Eric Longenhagen is giving a glowing compliment when he says that a player can move his barrel all around the zone, but I’ve always had trouble picturing that. Maybe it’s because of the way I played the game when I was younger, but I’ve never really understood the concept of a grooved swing. When I was digging through the bat tracking data, seeing the effect of the pitch type, the location, and where in space the batter has to get the barrel in order to make solid contact, it finally clicked.

There’s obviously a reason that every hitter has a book, a certain way that pitchers try to get them out. I’m just not sure I ever connected it quite so clearly to the physical act of swinging, the flexibility, quickness, strength, and overall athleticism required to execute a competitive swing on different kinds of pitches in different locations. And that’s before we even get to the processing speed, judgment, and reaction time that come with recognizing the pitch and deciding not just whether to swing, but how to attack the ball. Bat tracking highlights the how.

There are a million ways to succeed at the plate. Derek Jeter used an inside-out swing to send the ball the other way. Isaac Paredes uses an inside-even-further-inside swing, reaching out and hooking everything he can down the line. Chas McCormick and Austin Riley time their swings in order to drive a fastball to the right field gap and pull anything slower toward left. Arraez, like Tony Gwynn before him, stays back and places the ball in the exact spot that he feels like placing it. Ted Williams preached a slightly elevated swing, making him the progenitor of today’s Doug Latta disciples, who try to get on plane with the ball early and meet it out front, where their bat is on an upward trajectory. Some players talk about trying to hit the bottom of the ball in order to create backspin and carry. I could go on and on. But no matter what school of thought batters subscribe to, they’re not the ones who decide what kind of pitch is coming. Bat tracking data show us just how adaptable their swing has to be. Here’s a map of the 13 gameday zones, broken down by the average speed of competitive swings in each zone for right-handed batters.

The batter can bend at the waist and drop his bat head on a low pitch, especially inside. A high pitch requires a flatter swing, and it’s much more about pure rotational speed. An outside pitch requires hitting the ball deeper, where the bat might not have reached full speed yet, but it also allows the batter to get his arms extended. I just described three different skills, and there are plenty more that we could dive into. Because every batter is an individual, each will be better or worse at some of them than others.

At the moment when all this clicked, I thought of Shohei Ohtani. Ohtani hits plenty of balls that are very obviously gone from the second he makes contact. But he also hits some of the most awkward home runs imaginable, swings that end up with his body contorted in some weird way that makes it seem impossible that he managed to hit the ball hard. He looks like he’s stepping in the bucket and spinning off the ball, he looks like he’s simply throwing out his bat to foul off an outside pitch, or he looks like he’s just not swinging very hard, and yet the ball ends up over the fence. Somehow this ball left the bat at 106.4 mph and traveled 406 feet.

It might appear that this swing was all upper body. However, a swing is a little bit like cracking a whip, where you’re working from the bottom up to send all of the energy to the very end of the line. Some hitters are better than others at manipulating their bodies to time that energy transfer perfectly. Here’s another way of looking at this.

On the left are the 26 homers that Cody Bellinger hit in 2023. On the right are Ohtani’s 44 homers. I realize that because Ohtani hit 18 more, his chart looks more robust. But it’s not just about the number of dots. It’s about the spread. I’m not trying to pick on Bellinger. I used him in part because he had a great season. I found his pitch chart by searching for players with the highest percentage of home runs in the very middle of the strike zone. At 46%, Bellinger had the highest rate of anyone who hit 20 home runs. If you make a mistake in the middle of the zone, he’ll destroy it. On the other hand, Ohtani is capable of hitting the ball hard just about anywhere. It’s even clearer if you look at the two players’ heat maps on hard-hit balls from last season.

Bellinger has never been the same player since his 2019 MVP campaign, and it’s generally assumed that the significant injuries that followed affected his swing. He can still do major damage, but on a smaller subset of pitches. This is one of the reasons that scouts focus so much on flexibility and athleticism and take the time to describe the swings of prospects as grooved or adaptable, long or short, rotational or not, top-hand or bottom-hand dominant. These things may not matter much in batting practice, but if there’s any kind of pitch you can’t handle, the game will find it. The best hitters find a way to get off not just their A-swing, but a swing that can succeed against whatever pitch is heading toward them.


What Happened to All Those Steals of Third Base?

Charles LeClaire-USA TODAY Sports

Athletes like Elly De La Cruz can skew our perception of reality. His powerful arm makes most shortstops look like they throw with a wet noodle. His 99th-percentile sprint speed makes most other baserunners look like they’re running on sand. His tall frame, which our website somehow lists at 6-foot-2, makes that guy on Hinge who claims he’s 6-foot-2 look like he’s actually 5-foot-8. Oh, and his 13 steals of third base this year might make you think steals of third are at an all-time high, which couldn’t be further from the truth.

As a fan of highly specific baseball stats – a bold statement to make on this website, I know – I like to check in on the stolen base rates at each bag. Practically speaking, that means I pay particularly close attention to steals of third, the oft-forgotten middle child of stolen bases. Steals of third are too common to receive the same amount of attention as steals of home; at the same time, they’re infrequent enough that they’ll always be overshadowed by the sheer number of second-base steals. Steals of home are almost guaranteed to make tomorrow morning’s highlight reel. Steals of second outnumber all others and thus dictate league-wide stolen base trends every year. Steals of third are stuck in the middle, and that’s especially true this season as their siblings are taking even more of the glory than usual.

The stolen base success rate at home (16-for-29, 55.2%) is the highest it’s been since at least 1969. Indeed, it’s above 50% for only the second time in that span. In addition, runners are on pace to steal home 36 times this year, which would rank second in the divisional era and well within shouting distance of first (38 SBH in 1998). Meanwhile, the overall stolen base rate (i.e. steals per game) is also on the rise, primarily driven by an increase in steals of second. The league is on pace to steal second base 166 more times in 2024 than it did last year, a 5.6% increase, as runners continue to test the limits of the New Rules™. Read the rest of this entry »
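The rate and pace figures above are plain arithmetic, and a minimal sketch makes the steps explicit. The function names are hypothetical, and the games-played value is a placeholder chosen only so the output matches the pace quoted above; the excerpt does not say how many games had been played. The 2,430 figure is the number of games in a full 30-team, 162-game regular season.

```python
SEASON_GAMES = 30 * 162 // 2  # 2,430 games in a full regular season


def success_rate(steals: int, attempts: int) -> float:
    """Stolen base success rate as a percentage."""
    return 100.0 * steals / attempts


def on_pace(count_so_far: int, games_played: int,
            season_games: int = SEASON_GAMES) -> float:
    """Scale a running total up to a full season at the current rate."""
    return count_so_far * season_games / games_played


# 16 successful steals of home in 29 attempts, as stated in the text.
print(f"{success_rate(16, 29):.1f}%")  # 55.2%

# ASSUMPTION: 1,080 league games played is a placeholder, not a sourced figure.
print(on_pace(16, games_played=1080))  # 36.0
```

The same `on_pace` helper would apply to steals of second or third, given the actual running totals and games played.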


The Kirby Corollary: Why Batters Don’t Swing at Sliders

Jay Biggerstaff-USA TODAY Sports

George Kirby had Javier Báez right where he wanted him. It was October 3, 2022, the last start of Kirby’s excellent rookie year, and Kirby had Báez, the king of chasing sliders off the plate, in an 0-2 count. His catcher, Cal Raleigh, set up off the plate, suggesting that Kirby would be targeting the outer edge.

Kirby hit his target with a well-executed slider. And Báez, instead of whiffing, hit it out of the park.

Báez wasn’t fooled; at seemingly no point did he think that pitch was a fastball. And Kirby’s lack of deception — defined here as a lack of overlap between the horizontal release angle (HRA) of his fastball and slider — may have played a part. Read the rest of this entry »


Maybe the Launch Angle Revolution Wasn’t Really About Launch Angle

Gary A. Vasquez-USA TODAY Sports

Over the past month, smart people have been deciphering the relationship between swing length and pitch location in MLB’s new bat tracking data. If you’re looking at raw data, it’s hard to know whether someone has a long swing because they like inside pitches (Isaac Paredes) or because their swing is actually long and loopy (Javier Báez). In order to make solid contact with an inside pitch, the barrel needs to meet the ball out in front of the plate, which means that it will take a longer journey to the point of contact than it would to meet a pitch over the middle of the plate. Below is a breakdown of Luis Arraez’s swing length against fastballs. As you can see, even the king of the short swing gets long when he has to reach pitches up-and-in or down-and-away.

Some of this is as old as the game itself. It’s the reason pitchers throw fastballs up and in, where a necessarily longer, slower swing makes them harder to catch up with. Bat tracking has given us numbers to back up another intuitive part of the game: Swing length is positively correlated with bat speed, confirming that players who are short to the ball sacrifice bat speed for bat control and contact ability. Those two correlations, pitch location to swing length and swing length to bat speed, got me thinking about the launch angle revolution.

The launch angle revolution really got its hooks into Major League Baseball in 2015. That’s the year Joey Gallo and Kris Bryant debuted, and the year Justin Turner and Daniel Murphy fully turned themselves from contact hitters into power threats. In The MVP Machine, Ben Lindbergh and Travis Sawchik documented what Turner was thinking in 2013, the very first time he tried out the new approach for which teammate Marlon Byrd had been proselytizing. “I was thinking, I’m just going to try and catch the ball as far out as I can in batting practice,” Turner said.

Catching the ball out front often means pulling it, especially in the air. The league’s overall pull rate is roughly the same as it was in 2011, but as you can see from the chart above, its pull rate on air balls — the line drives and fly balls where hitters do damage — hit an all-time high in 2017 and then again in four of the next five years. The exact approaches can differ. “I’m going to be on the fastball and drive it to right center, and if I’m a little early on the slider I’ll catch it out in front,” Austin Riley told Eno Sarris last year. And as Ben Clemens has noted, hitters have increased their pulled balls in the air simply by choosing to attack pitches that lend themselves to being launched in that direction. But strictly speaking, there isn’t a huge inherent advantage to pulling the baseball. If you’re going to hit a long fly ball, it’s better not to hit it to straightaway center, where the fence is deeper and the fielders are better, but that’s equally true for both pulling the ball and going the opposite way. In a sense, pulling the baseball is just a side effect of catching the ball out in front. Read the rest of this entry »


Squared-Up Rate and Launch Angle: A Visual Investigation

D. Ross Cameron-USA TODAY Sports

I continue to find Statcast’s bat tracking data fascinating. I also continue to find it overwhelming. Hitting is so complex that I can’t quite imagine boiling it down to just a few numbers. Even when I look at some of the more complex presentations of bat tracking, like squared-up rate, I sometimes can’t quite understand what it means.

I’ll give you an example: when I looked into Manny Machado’s early-season struggles last week, I found that he was squaring the ball up more frequently when he hit grounders than when he put the ball in the air. That sounds bad to me – hard grounders don’t really pay the bills. But I didn’t have much to compare it to, aside from league averages for those rates. And I didn’t have context for what shapes of squared-up rate work for various different successful batters.

What’s an analyst to do? If you’re like me in 2024, there’s one preferred option: ask my friendly neighborhood large language model to help me create a visual. I had an idea of what I wanted to do. Essentially, I wanted to create a chart that showed how a given hitter’s squared-up rate varied by launch angle. There’s a difference between squaring the ball up like Luis Arraez – line drives into the gap all day – and doing it like Machado. I hoped that a visual representation would make that a little clearer. Read the rest of this entry »
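The idea of breaking squared-up rate out by launch angle can be sketched in a few lines. This is not the code behind the article's chart, which is not shown in this excerpt; it is a minimal illustration that assumes each batted ball arrives as a hypothetical `(launch_angle, squared_up)` pair and that 10-degree bins are a reasonable granularity.

```python
import math
from collections import defaultdict


def squared_up_rate_by_launch_angle(batted_balls, bin_width=10):
    """Return {bin_start_degrees: squared-up rate} for each launch angle bin.

    batted_balls: iterable of (launch_angle_degrees, squared_up_bool).
    ASSUMPTION: the squared-up flag is supplied by the data source;
    this sketch does not compute it.
    """
    totals = defaultdict(int)
    squared = defaultdict(int)
    for launch_angle, is_squared_up in batted_balls:
        # floor() keeps negative angles in the correct bin (-5 -> -10).
        bin_start = math.floor(launch_angle / bin_width) * bin_width
        totals[bin_start] += 1
        if is_squared_up:
            squared[bin_start] += 1
    return {b: squared[b] / totals[b] for b in sorted(totals)}


# Made-up batted balls for illustration only.
sample = [(-12, True), (-5, True), (-3, False), (8, True),
          (15, False), (17, True), (32, False)]
print(squared_up_rate_by_launch_angle(sample))
```

Plotting the resulting bins for two hitters side by side would show whether one squares the ball up mostly on grounders while the other does it on line drives.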


Getting in the Weeds With Bat Tracking

Charles LeClaire-USA TODAY Sports

Like many other nerds, I have devoted a lot of time to slicing and dicing Baseball Savant’s new bat tracking data over the last few weeks. And like many other nerds, I’m not entirely sure how we’ll end up using this wealth of new information. More time, more data, and more brain power are needed to wring out whatever sweeping new truths it may hold. I’m going to write about bat tracking data in a more focused way next week. There are a couple things I think are really interesting; not necessarily new information, but ways that bat tracking data can give us hard numbers for things that we’ve already learned. In this article, I’ll be a bit more scattershot. I’d just like to take you through how I’ve processed all the information that has come out over the last few weeks.

First off, bat tracking will give us new stats that stabilize more quickly than existing ones, as that’s how granular metrics that separate underlying skills from results tend to work. In smaller samples, exit velocity turned out to be a better predictor of overall batting performance than wRC+ or wOBA. Now we have swing speed, which in smaller samples turns out to be a better predictor of exit velocity. To wit, I pulled data from the first week of bat tracking, April 3 to April 9, and compared it to each player’s overall numbers this season. I eliminated any player with fewer than five plate appearances during the first week or fewer than 100 PA during the entire season, which left me with a sample of 295 players. It was no contest. Full-season exit velocity had a much stronger correlation to first-week swing speed (R = .60) than it did to first-week exit velocity (R = .40). First-week swing speed also predicted full-season hard-hit rate better than first-week hard-hit rate did (R = .66 for swing speed, compared to R = .46 for hard-hit rate). If, after the first week, you want to know who’s going to hit the ball hard for the rest of the season, don’t look at exit velocity. Look at swing speed.
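The comparison described above boils down to a filter followed by two Pearson correlations. Here is a minimal sketch under stated assumptions: the dictionary keys and function names are hypothetical, the toy players are invented, and only the two plate appearance thresholds come from the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_predictors(players, min_week_pa=5, min_season_pa=100):
    """Correlate first-week swing speed and first-week exit velocity
    with full-season exit velocity, after applying the PA cutoffs."""
    kept = [p for p in players
            if p["week_pa"] >= min_week_pa and p["season_pa"] >= min_season_pa]
    season_ev = [p["season_ev"] for p in kept]
    r_swing = pearson_r([p["week_swing_speed"] for p in kept], season_ev)
    r_ev = pearson_r([p["week_ev"] for p in kept], season_ev)
    return len(kept), r_swing, r_ev


# Invented players for illustration; the last two fail a PA cutoff.
players = [
    {"week_pa": 20, "season_pa": 250, "week_swing_speed": 70.0, "week_ev": 88.0, "season_ev": 88.0},
    {"week_pa": 18, "season_pa": 240, "week_swing_speed": 72.0, "week_ev": 91.0, "season_ev": 89.0},
    {"week_pa": 22, "season_pa": 260, "week_swing_speed": 74.0, "week_ev": 87.0, "season_ev": 90.0},
    {"week_pa": 25, "season_pa": 270, "week_swing_speed": 76.0, "week_ev": 92.0, "season_ev": 91.0},
    {"week_pa": 3, "season_pa": 300, "week_swing_speed": 80.0, "week_ev": 95.0, "season_ev": 85.0},
    {"week_pa": 20, "season_pa": 60, "week_swing_speed": 65.0, "week_ev": 80.0, "season_ev": 93.0},
]
n, r_swing, r_ev = compare_predictors(players)
print(n, round(r_swing, 2), round(r_ev, 2))
```

With real data, swapping `season_ev` for a full-season hard-hit rate column would reproduce the second comparison in the paragraph.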

Read the rest of this entry »


The Dog Ate My Prospect

Joe Nicholson and Jerome Miron-USA TODAY Sports

Season one of One Tree Hill is a perfect season of television, and I will not be entertaining arguments to the contrary. In it we meet Nathan and Lucas Scott, the sons of hometown basketball hero, Dan Scott, who runs a local car dealership. Nathan was raised in the traditional nuclear family structure by Dan and his college sweetheart and wife. Lucas was raised in a single-parent household by his mother, Dan’s high school sweetheart. Despite sourcing their foundational genetic material from the same DNA pool, Nathan and Lucas are depicted at odds with one another in several key ways. Nathan is his father’s golden child and characterized as hyper competitive, entitled, and emotionally stunted; Lucas receives no acknowledgement from Dan and skews more intellectual, reserved, and empathetic. Both are super good at basketball and both crave the approval of their father. Nathan seemingly has it all, but presents as lonely and ill at ease in his environment. Lucas drew the short straw, but is mostly content and supported by several meaningful relationships.

The whole concept is a pretty straightforward exercise in nature vs. nurture, and if you haven’t seen One Tree Hill, don’t worry, I haven’t spoiled anything; this is all part of the show’s initial setup in the pilot episode. What the viewer is intended to puzzle out as the season unfolds is how much of Nathan’s arrogance and aggression is a reaction to his surroundings and how much is an inherent part of his character. And on the other hand, can Lucas, against his father’s wishes, learn to thrive in new surroundings as he steps into the spotlight of varsity basketball? Or is he more naturally suited to exist in the shadows?

I recently read almost six years of scouting reports, statistical breakdowns, and interviews covering two prospects from the 2018 MLB draft in an attempt to understand the how and why of each player’s career arc. More on that later, but for now, I want to emphasize how much easier it is to analyze a teen soap opera. And it’s not that the scouting reports were unclear, or that the statistical analysis was misleading, or that the players misrepresented themselves in interviews. It’s that taking 18- to 22-year-olds and turning them into big leaguers is a hard thing to do under the best of circumstances. Read the rest of this entry »


One Fastball Isn’t Enough

David Butler II-USA TODAY Sports

The fastball is dead. Or is it?

Every season has its share of articles detailing the league-wide decline in fastball usage, and 2024 is no exception. This time around, the spotlight has been on the Red Sox, who have seemingly crafted an elite rotation based on a delightfully succinct philosophy: Spin go brrrr. Indeed, they trail the league in four-seam fastball usage by a wide margin. But they’re also ninth in sinker usage and first in cutter usage as of this writing. This is incredibly interesting to me, especially after you consider the graph below:

In early counts (0-0, 0-1, and 1-0), when batters are more eager to swing and hunt for fastballs, we’ve reached a new minimum for four-seam fastballs. That checks out. But look at the combined rate of sinkers and cutters: It’s back up to levels last seen in 2018. So really, the Red Sox aren’t being hipsters. If anything, they represent what the league is thinking as a whole. The uptick is there, even if you exclude Boston. Read the rest of this entry »
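The early-count usage rates discussed above can be sketched as a filter on the count followed by a share calculation. This is an illustration rather than the method behind the article's graph; the tuple layout is hypothetical, and the pitch type abbreviations (FF for four-seamer, SI for sinker, FC for cutter, SL for slider) are assumed labels.

```python
from collections import Counter

# Counts are (balls, strikes): 0-0, 0-1, and 1-0, as listed in the text.
EARLY_COUNTS = {(0, 0), (0, 1), (1, 0)}


def early_count_usage(pitches):
    """Return {pitch_type: share of early-count pitches}.

    pitches: iterable of (balls, strikes, pitch_type).
    """
    counts = Counter(pitch_type for balls, strikes, pitch_type in pitches
                     if (balls, strikes) in EARLY_COUNTS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pitch_type: n / total for pitch_type, n in counts.items()}


# Made-up pitches; the last two are not early counts and are ignored.
sample = [(0, 0, "FF"), (0, 0, "SI"), (0, 1, "FC"), (1, 0, "SL"),
          (1, 2, "FF"), (3, 2, "FF")]
usage = early_count_usage(sample)
print(usage["FF"])                # 0.25
print(usage["SI"] + usage["FC"])  # 0.5
```

Summing the sinker and cutter shares, as in the last line, mirrors the combined rate the paragraph refers to.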


Umpiring Is About To Get Better

Charles LeClaire-USA TODAY Sports

For the last few years, I’ve been checking the accuracy rate of the ball-strike calls made by umpires, dividing the number of correct calls by the total number of takes. It’s a blunt approach, but because umpires make so many thousands of calls each year, it yields solid results. On Tuesday, I pulled the numbers for the 2024 season, and I found something I didn’t expect: Accuracy is going down rather than up. In every single season since the beginning of the pitch tracking era in 2008, umpires have gotten better at calling balls and strikes according to the Statcast strike zone. This is the first time I’ve ever pulled the numbers and seen a lower accuracy rate. However, this is also the first time I’ve checked the numbers this early in the season, and it turns out umpires tend to make better calls as the season goes on. Since 2017, accuracy in March, April, and May has been 0.19 percentage points lower than accuracy over the full season (though the difference in 2023 was just 0.03 percentage points). Here’s what that looks like in a graph.
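The accuracy rate described at the top of this section, correct calls divided by total takes, can be sketched as follows. The input format is an assumption: each take is a hypothetical pair recording whether the umpire called a strike and whether the tracked pitch location was inside the strike zone.

```python
def call_accuracy(takes):
    """Percentage of taken pitches where the call matched the tracked zone.

    takes: iterable of (called_strike: bool, in_zone: bool).
    A call is correct when a strike is called on a pitch in the zone
    or a ball is called on a pitch outside it.
    """
    takes = list(takes)
    correct = sum(1 for called_strike, in_zone in takes
                  if called_strike == in_zone)
    return 100.0 * correct / len(takes)


# Made-up takes: three correct calls and one ball called a strike.
sample = [(True, True), (False, False), (True, False), (False, False)]
print(call_accuracy(sample))  # 75.0
```

Because the denominator is every take, a full season involves so many pitches that even this blunt measure moves by only fractions of a percentage point from year to year.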

You know how at the beginning of every season, there are a couple blown calls during a nationally televised game (or at least, calls that appeared to be wrong according to the on-screen strike zone), and certain people start complaining that umpires are terrible and they’re getting worse? Those people always catch me off guard. I usually forget about the missed calls when the season ends, but those people somehow manage to keep their umpire anger at a high idle through the entirety of the offseason so that the instant baseball returns, they’re ready to shout about the umpires again without any need to ramp up. I don’t know how they do it without pulling an oblique, but in a sense, those angry people are right. Even though the umpires are always getting better year after year, they’re nearly always more accurate toward the end of the season than at the beginning — so much so that when the season starts, they’re worse than they were at the end of the previous season. For a month or two, the umpires really have gotten worse. We often say early in the season that pitchers are ahead of hitters. It turns out they’re ahead of umpires too.

For each season, I broke down the overall accuracy in two-month increments, essentially dividing the season into thirds. I also broke down the accuracy during spring training and the playoffs, although there are plenty of factors that make those numbers suspect. During spring training, the umpiring pool is much wider. Perhaps more importantly, there are far, far fewer tracked pitches during spring training, both because the number of games is so small and because not every stadium is set up for Statcast. That results in a much smaller, much less reliable sample. The playoffs are also a much smaller sample, but they’re also, at least in theory, selecting for better umpires. Working the playoffs is seen as an honor and a reward for performing well in the regular season. We should expect accuracy to be at its lowest during spring training and highest during the playoffs.
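Splitting the regular season into thirds can be sketched as a lookup from calendar month to segment. The month groupings match the three regular-season segments used in this piece; everything else, including the input format and the names, is an assumption, and spring training and playoff calls would need to be separated out before this step.

```python
from collections import defaultdict

# Month number -> regular-season segment.
SEGMENTS = {
    3: "Mar-May", 4: "Mar-May", 5: "Mar-May",
    6: "Jun-Jul", 7: "Jun-Jul",
    8: "Aug-Oct", 9: "Aug-Oct", 10: "Aug-Oct",
}


def accuracy_by_segment(calls):
    """Return {segment: accuracy percentage}.

    calls: iterable of (month, was_correct) for regular-season takes only.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for month, was_correct in calls:
        segment = SEGMENTS[month]
        totals[segment] += 1
        correct[segment] += int(was_correct)
    return {seg: 100.0 * correct[seg] / totals[seg] for seg in totals}


# Made-up calls for illustration only.
sample = [(4, True), (4, False), (5, True), (6, True), (7, True),
          (9, True), (9, True), (9, False), (9, True)]
print(accuracy_by_segment(sample))
```

Running this once per season and lining up the three values by year gives the kind of series the next paragraph describes.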

Generally speaking, the results fit our preconceptions. Spring training accuracy is very low and it features the volatility that we’d expect from a small dataset. Umpires are also more accurate in the playoffs. The red line is March, April and May, and as you can see, it’s nearly always below everything but the spring training line. Not only do umpires start getting better in June, but they keep getting better right through the end of the season, which is why the light blue line for August, September, and October is usually above the yellow line for June and July. The trend is a little bit easier to see if we focus just on pitches in the shadow zone, the area that’s one baseball’s width from the edge of the zone on either side.
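The shadow zone definition above, one baseball's width on either side of the zone's edge, can be sketched as a band between two rectangles. Several things here are assumptions rather than anything taken from the article: locations are measured in feet with the horizontal axis centered on the plate, the ball's width is approximated as 2.9 inches, and the parameter names are hypothetical. The 17-inch plate width is the standard dimension.

```python
BALL_WIDTH_FT = 2.9 / 12        # ASSUMPTION: approximate baseball diameter
PLATE_HALF_WIDTH_FT = 8.5 / 12  # home plate is 17 inches wide


def in_rect(x, z, half_width, bottom, top):
    """True if (x, z) lies inside a rectangle centered on the plate."""
    return -half_width <= x <= half_width and bottom <= z <= top


def in_shadow_zone(plate_x, plate_z, zone_bottom, zone_top,
                   band=BALL_WIDTH_FT):
    """True if the pitch is within one band-width of the zone's edge.

    The shadow zone is everything inside the zone expanded by `band`
    that is not also inside the zone shrunk by `band`.
    """
    outer = in_rect(plate_x, plate_z, PLATE_HALF_WIDTH_FT + band,
                    zone_bottom - band, zone_top + band)
    inner = in_rect(plate_x, plate_z, PLATE_HALF_WIDTH_FT - band,
                    zone_bottom + band, zone_top - band)
    return outer and not inner


# Hypothetical batter with a zone from 1.5 to 3.5 feet off the ground.
print(in_shadow_zone(0.0, 2.5, 1.5, 3.5))  # False: middle of the zone
print(in_shadow_zone(0.7, 2.5, 1.5, 3.5))  # True: right at the side edge
print(in_shadow_zone(1.5, 2.5, 1.5, 3.5))  # False: well off the plate
```

Using two rectangles rather than separate horizontal and vertical checks keeps pitches that are near one edge but far past another, such as a ball over the middle that sails a foot high, out of the shadow zone.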

In the graph above, the dotted line represents that season’s overall accuracy on calls in the shadow zone. Each data point represents the number of percentage points above or below that year’s average. Not only do the calls get better as the season goes on, there’s a definite gap between the first two months and the rest of the season. Umpires are decidedly worse in those first two months. However, 2023 was a real outlier. It was the first time since 2008 that umpires were more accurate in the beginning of the season than the end.

With that, I want to bring you back to 2024. So far this season, umpires have gotten 92.46% of calls right, down from 92.81% in 2023 and just two thousandths of a percentage point higher than in 2022. Based on everything I’ve shown you, we should expect umpires to get better over the rest of the season. However, the drop-off from last year is noticeable. Accuracy over the first two months of the season has only fallen once before, from 2009 to 2010, when it dropped by 0.16 percentage points. So far this season, accuracy has fallen by twice that amount: 0.32 percentage points. That’s a tiny change, on the order of one call per game, but that doesn’t make it any less real. We’ll have to wait and see how the rest of the season goes, but perhaps this year really could end up being different. Or, if it follows the pattern of the past decade and a half, accuracy will soon be on its way up.