Introducing THE BAT X for Pitchers… and THE BATcast Stuff Model!

I started this project nine and a half years ago. I’ve gone through multiple outside developers, cleared countless hurdles over an unknowable number of hours, taught myself a ton about coding and modeling, and studied thousands of variables. But six years after the introduction of THE BAT X for hitters, THE BAT X for pitchers is finally here!

Similar to THE BAT X for hitters, the pitcher version starts with the classic version of THE BAT (which was built long before Statcast and all the advanced metrics we have now, and mostly just makes use of surface stats) and blends in my brand new pitcher “stuff” model, which I’m calling THE BATcast. (Get it, because it’s THE BAT using Statcast data?)

Where can I find THE BAT X and THE BATcast stuff model?
Right here at FanGraphs! THE BAT X for pitchers is fully integrated into the projections pages, the player pages, the playoff odds, and the auction calculator. THE BATcast is in the process of being added, so you’ll hopefully see it soon.

When you say “stuff,” what does that mean?
I mean the physical attributes that describe a pitch: how it moves through space. Most people associate “stuff” with high velocity, loopy breaking balls, and lots of whiffs. While stuff can manifest that way, good stuff doesn’t necessarily mean a high strikeout rate. THE BATcast model is actually tons of separate models put together, each one predicting a different aspect of the game for a different type of pitch. I’m predicting strikeouts based on stuff, yes, but I’m also predicting walk rate, homers, BABIP, etc. So when people are outraged that Framber Valdez makes the top 10 on THE BATcast list, they need to understand that we aren’t just talking about swings and misses here. And the variables that describe stuff go beyond the obvious ones, like velocity and movement, that are clear for everyone to see.

What goes into THE BATcast, and what makes it different from Stuff+/PitchingBot?
Part of the reason this took so long to build is that I didn’t want to just reinvent the wheel. The stuff systems that began popping up during that 9 1/2-year span (like Stuff+ and PitchingBot) are great. If I were going to keep trying to build something, I knew it had to be unique somehow, or else what was the point? I could just use an existing system and integrate it into my projections.

As a starting point, THE BATcast includes all the variables you’d come to expect from a stuff model: velocity, movement, spin. It incorporates newer advances like seam-shifted wake and pitch mirroring and approach angles. It looks at the interactions between a pitch and the pitcher’s primary fastball (i.e. not just how fast his slider is, but also the velocity differential between the fastball and the slider). It accounts for different pitches being better or worse against hitters of a certain handedness. None of this is new ground, though. Without giving too much away, I think there are three main things (as far as I know) that THE BATcast is doing that others aren’t yet and that I’m comfortable sharing.

For one, while it uses these basic variables, it also runs them through a set of adjustments to try and squeeze more information out of them. For example, a slider from a side-arm pitcher may have more movement than one thrown over the top, but the batter is likely expecting this based on the arm angle, so that extra movement may not actually help. THE BATcast adjusts to try and account for unexpected movement, movement beyond what would be expected of a pitch given the arm angle, velocity, etc.
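To make the idea of “unexpected movement” concrete, here is a minimal sketch of one way it could be computed: fit a simple line predicting horizontal break from arm angle, then treat the leftover (the residual) as the movement the batter wouldn’t expect. This is an illustration only, not THE BATcast’s actual method. The function names, the use of arm angle as the only predictor, and the sample numbers are all assumptions made up for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def unexpected_movement(arm_angles, breaks):
    """Observed break minus the break expected from arm angle alone."""
    intercept, slope = fit_line(arm_angles, breaks)
    return [b - (intercept + slope * a) for a, b in zip(arm_angles, breaks)]


# Hypothetical sliders from five pitchers (made-up values).
# Arm angle in degrees (lower = more side-arm), horizontal break in inches.
arm_angles = [10.0, 20.0, 30.0, 40.0, 50.0]
breaks = [16.0, 14.0, 15.0, 10.0, 8.0]

residuals = unexpected_movement(arm_angles, breaks)
for angle, raw, extra in zip(arm_angles, breaks, residuals):
    print(f"arm angle {angle:>4.0f}: raw break {raw:>4.1f}, unexpected {extra:+.1f}")
```

In this toy data, the side-armer at 10 degrees has the most raw break, but the pitcher at 30 degrees has the most break relative to what his arm angle would suggest. A fuller version would condition on velocity and other inputs as well.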

THE BATcast also includes tunneling variables. Tunneling is a fun word that’s been thrown around a lot in recent years, but I feel like it’s used as a catch-all to mean a number of different things that aren’t truly tunneling. I’ve heard it used to describe release point differential or movement differential or different things like that. But when I use the word tunneling, I’m talking about looking at the full trajectory of a given pitch and comparing it to the trajectory of the pitcher’s other offerings. Does the pitch hide inside the path of the fastball to make it harder for the batter to pick it up? For how much of the path does it hide, and just how hidden is it? Does it stay inside that path up until a certain point and then sharply deviate, after a hitter has already had to make a decision about the pitch? In other words, how similar do the pitcher’s pitches look to each other at various points or ranges along the trajectory, and how does that deceive the batter?
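As a rough illustration of comparing full trajectories, the sketch below models each pitch with constant acceleration and measures how far apart a fastball and a slider are at two points: a hypothetical batter decision point and the front of the plate. This is not THE BATcast’s actual calculation. The constant-acceleration model, the 24-foot decision point, the field names, and every number in the two pitch dictionaries are assumptions chosen for the example.

```python
import math


def time_to_reach(y_target, pitch):
    """Smallest positive time at which the pitch reaches distance y_target."""
    a, b, c = 0.5 * pitch["ay"], pitch["vy0"], pitch["y0"] - y_target
    if a == 0:
        return -c / b
    root = math.sqrt(b * b - 4 * a * c)
    return min(t for t in ((-b - root) / (2 * a), (-b + root) / (2 * a)) if t > 0)


def location_at(y_target, pitch):
    """Horizontal (x) and vertical (z) position when the pitch is at y_target."""
    t = time_to_reach(y_target, pitch)
    x = pitch["x0"] + pitch["vx0"] * t + 0.5 * pitch["ax"] * t * t
    z = pitch["z0"] + pitch["vz0"] * t + 0.5 * pitch["az"] * t * t
    return x, z


def separation(y_target, pitch_a, pitch_b):
    """Distance in feet between two pitches at the same distance from the plate."""
    xa, za = location_at(y_target, pitch_a)
    xb, zb = location_at(y_target, pitch_b)
    return math.hypot(xa - xb, za - zb)


# Hypothetical pitches: positions in feet, velocities in ft/s, accelerations in ft/s^2.
fastball = {"x0": -2.0, "y0": 55.0, "z0": 6.0, "vx0": 5.0, "vy0": -135.0,
            "vz0": -5.0, "ax": -10.0, "ay": 30.0, "az": -15.0}
slider = {"x0": -2.0, "y0": 55.0, "z0": 6.0, "vx0": 4.0, "vy0": -123.0,
          "vz0": -3.0, "ax": 5.0, "ay": 25.0, "az": -30.0}

DECISION_POINT_FT = 24.0   # assumed distance from the plate where the batter commits
PLATE_FRONT_FT = 17 / 12   # front edge of home plate

sep_decision = separation(DECISION_POINT_FT, fastball, slider)
sep_plate = separation(PLATE_FRONT_FT, fastball, slider)
print(f"separation at decision point: {sep_decision:.2f} ft")
print(f"separation at plate:          {sep_plate:.2f} ft")
```

A pair that tunnels well in this sense stays close together at the decision point and diverges by the time it reaches the plate. Repeating the measurement at many distances along the path is one way a single pitch could end up described by a large number of tunneling variables.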

The third thing that I think separates THE BATcast from other systems is that it attempts to account for bridge pitches. The most common bridge pitch is a cutter. It bridges the gap between a pitcher’s fastball and the slider, so that instead of being able to look for two distinct shapes, the batter now has to look for three shapes that all sort of blur together. Instead of a fastball running left and the slider running right, if there’s a cutter in the middle that can do a bit of both, now you’ve created this wide range of possible movements for a pitch, making it much tougher for the batter to identify what’s coming. Even if the cutter itself isn’t good, simply having it can make the slider and the fastball better. So similar to tunneling, I looked at the trajectories of different pitches throughout a pitcher’s repertoire and a whole host of variables therein to try and account for bridge pitches.
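One simple way to picture the bridging effect is to sort a pitcher’s offerings by horizontal break and look at the biggest gap between neighboring shapes. The sketch below does that with made-up numbers. It is a toy measure invented for illustration, not one of THE BATcast’s variables, and it ignores vertical movement, velocity, and the full trajectories described above.

```python
def largest_shape_gap(arsenal):
    """Largest gap in horizontal break between neighboring pitch shapes.

    `arsenal` maps pitch name to horizontal break in inches
    (negative = runs left, positive = runs right).
    """
    breaks = sorted(arsenal.values())
    return max(right - left for left, right in zip(breaks, breaks[1:]))


# Hypothetical pitcher (made-up values): fastball runs left, slider runs right.
two_pitch = {"fastball": -8.0, "slider": 6.0}

# Same pitcher after adding a cutter that sits between the two.
three_pitch = {**two_pitch, "cutter": 1.0}

print(largest_shape_gap(two_pitch))    # 14.0
print(largest_shape_gap(three_pitch))  # 9.0
```

The cutter shrinks the largest gap from 14 inches to 9 inches, which is the “shapes blur together” effect in its simplest form: the batter has fewer clean dividing lines between pitch types.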

That sounds like a lot of stuff…
It is a lot of stuff (no pun intended). Because I’m looking at the full trajectory of pitches, there wind up being hundreds or thousands of tunneling and bridging variables to describe a single pitch. All told, the system evaluates roughly 4,000 variables and narrows them down to the most impactful from there.
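The article doesn’t say how the variables get narrowed down, so the sketch below shows just one common and very simple approach: score each candidate variable by the strength of its correlation with an outcome and keep the top few. The function names, the feature names, and the data are hypothetical, and a real system would likely rely on the model’s own measure of importance rather than a plain correlation filter.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has no variation."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def top_features(features, target, k):
    """Names of the k features most strongly correlated with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical candidate variables for five pitches (made-up values).
features = {
    "velocity": [90.0, 92.0, 94.0, 96.0, 98.0],
    "spin_rate": [2300.0, 2100.0, 2500.0, 2200.0, 2400.0],
    "constant_junk": [1.0, 1.0, 1.0, 1.0, 1.0],
}
whiff_rate = [0.18, 0.21, 0.24, 0.26, 0.30]

selected = top_features(features, whiff_rate, k=2)
print(selected)  # ['velocity', 'spin_rate']
```

With roughly 4,000 candidates instead of three, the same idea applies: most variables contribute little or duplicate one another, and only the ones that carry real signal survive.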

Why does THE BATcast like Player X more than Stuff+/PitchingBot?
As I said, the process for building THE BATcast begins with 4,000 variables, with each describing something about a given pitch, before the modeling process whittles those down to the most important ones. But with the power of machine learning comes the frustration of its black-box nature. It can deliver accuracy and results, but it’s much harder to put our finger on why it came up with a given result. We can do our best to interpret, but there are no clear answers to these types of questions, unfortunately.

Why does THE BAT X like Player Y more than THE BAT?
The simple answer is it likes something about his stuff more than his surface numbers. Because of the black-box methodology, it may not be easy to identify exactly what it likes, though.

Are there any planned upgrades?
Yes! This is the debut of the system, and there will be plenty of upgrades and refinements. Some of the bigger ones planned are altitude adjustments, reliever-to-starter conversions, and creating a “stuff” system for minor leaguers. I’m also in the process of finalizing the in-season version of this, which will hopefully allow us to identify fundamental changes in a pitcher very quickly without having to wait for the surface stats to stabilize.

Finally, I’d be remiss without offering some thank yous. Dr. Alan Nathan, Tom Tango, Alex Chamberlain, Eli Ben-Porat, Eno Sarris, Nick Pollack, Mike Petriello, Thomas Nestico, Josh Kalk, Mike Fast, Harry Pavlidis, Dan Brooks, Jak Jones from Deuce of Diamond Analytics, and a whole host of others all either provided their time to explicitly answer my questions or simply provided inspiration for this project, and for each of you I am extremely grateful.




