Archive for Research

Checking in on Pythagoras

Kiyoshi Mio-Imagn Images

This June 25, the Dodgers and Tigers both played their 81st game of the season. Both teams finished the day 50-31, sharing the best winning percentage in baseball at .617. The Tigers got there with a slightly better run differential, though; their Pythagorean winning percentage was a cool .608, while the Dodgers checked in at .595. Pythagorean record is implied by runs scored and allowed, and broadly regarded as a more stable measure of talent than simple wins and losses. Since that day, though, the Tigers have gone 35-40 (.467 with a .483 Pythag), while the Dodgers have gone 38-37 (.507 with a .556 Pythag).
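For readers who want to see how a Pythagorean record falls out of runs scored and allowed, here is a minimal sketch. The exponent of 2 is the classic textbook choice and is an assumption here, since the excerpt doesn't say which exponent the site uses. The run totals in the example are hypothetical, not either team's actual numbers.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Estimate winning percentage from runs scored and allowed.

    The default exponent of 2 is the classic formulation; it is an
    assumption, not necessarily the value used for the figures above.
    """
    if runs_scored <= 0 and runs_allowed <= 0:
        raise ValueError("need at least some runs to estimate a record")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def pythagorean_wins(runs_scored: float, runs_allowed: float,
                     games: int, exponent: float = 2.0) -> float:
    """Convert the Pythagorean percentage into expected wins over `games`."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


if __name__ == "__main__":
    # Hypothetical run totals through 81 games.
    pct = pythagorean_win_pct(400, 320)
    print(f"Pythagorean winning percentage: {pct:.3f}")
    print(f"Expected wins in 81 games: {pythagorean_wins(400, 320, 81):.1f}")
```

Comparing that output to a team's actual record gives the "lucky or unlucky relative to run differential" gap the paragraph describes.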

I’m bringing this up – last data project for a while, incidentally, I just had a bunch of things in my queue and couldn’t resist tackling them all – because “how good is that team, anyway?” has been a hot topic this year given the various surprising teams who have, at times, taken up the mantle of “hottest in baseball.” Versions of this question – “This team is doing well/poorly now, what does that mean for next month?” – have been both interesting and top of mind in 2025. The Tigers and Brewers played so well for so long that they each crashed the best-team-in-baseball debate. The Mets did their hot-and-cold thing. The Dodgers have endured multiple fallow stretches. Sometimes, teams felt like they were getting very lucky or unlucky relative to their run differential. But what does any of that even mean? Read the rest of this entry »


Fun With Playoff Odds Modeling

Gary A. Vasquez-Imagn Images

Author’s note: “Five Things I Liked (Or Didn’t Like) This Week” is taking a short break, but will return next Friday for the end of the regular season.

Earlier this week, I did the sabermetric equivalent of eating my vegetables by testing the accuracy of our playoff odds projections. I found that our odds do a pretty good job of beating season-to-date odds (particularly late) and pure randomness (particularly early; everything does pretty well late). It’s good to intermittently check in on the accuracy of our predictions. It’s also helpful to build a baseline as a benchmark to measure future changes or updates against.

Those are a bunch of solid, workmanlike reasons to write a measured, lengthy article. But boring! Who likes veggies? I want to beat the odds, and I want to flex a little mathematical muscle while doing it. So I goofed around with a computer program and tried to find ways to slice up and recombine our existing numbers into improved odds. It didn’t break the game wide open or anything, but I’m going to talk about my attempts anyway, because it’s September 19, there aren’t many playoff races going on, and you can only write so many articles about whether the Mets will collapse or if Cal Raleigh will hit 60 dingers.

What if you just penalized extreme values?
I first tried to correct for the fact that early-season projection-based odds (which I’m calling FanGraphs mode for the rest of the article) seem to be too confident and thus prone to large misses. I did so by applying a mean reversion factor that pulled every team’s values toward the league-wide average playoff chances (i.e. how many teams made the playoffs that year). This method varies based on the current playoff format; we have 16-team, 12-team, and 10-team samples in the data, and I adjusted each appropriately. I set the mean reversion factor so that it was strong early in the year and decayed to zero by the end of the season. Read the rest of this entry »
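The mean reversion idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the article's actual code: the linear decay schedule, the `initial_strength` of 0.3, and the function names are all hypothetical. The only details taken from the text are that the pull is toward the league-wide playoff rate for the format in use and that it fades to zero by season's end.

```python
def playoff_base_rate(playoff_teams: int, league_teams: int = 30) -> float:
    """League-wide average playoff chance, e.g. 12 of 30 teams."""
    return playoff_teams / league_teams


def shrink_odds(odds: float, season_fraction: float, base_rate: float,
                initial_strength: float = 0.3) -> float:
    """Pull a team's playoff odds toward the league base rate.

    season_fraction runs from 0.0 (Opening Day) to 1.0 (final day).
    The linear decay and the 0.3 starting strength are assumptions
    made for illustration only.
    """
    if not 0.0 <= season_fraction <= 1.0:
        raise ValueError("season_fraction must be between 0 and 1")
    strength = initial_strength * (1.0 - season_fraction)
    return (1.0 - strength) * odds + strength * base_rate


if __name__ == "__main__":
    base = playoff_base_rate(12)  # 12-team format
    for frac in (0.0, 0.5, 1.0):
        print(frac, round(shrink_odds(0.90, frac, base), 3))
```

With these made-up settings, a 90% team is pulled noticeably toward the base rate on Opening Day and not at all on the final day, which is the shape of adjustment the paragraph describes.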


MeatWaste Part 2: The Re-Meatening

Benny Sieu-Imagn Images

Last week, I dug into the data a little to see if there was any empirical basis to the suspicion that the Brewers lineup might not be cut out for October. The result was a new metric, if you want to call it that, called MeatWaste%. This number — the percentage of pitches that end up either in the dead center of the strike zone or out in Baseball Savant’s Waste region — I used as a proxy for pitcher quality. MeatWaste pitches are gifts to the batter, the kind of offering that produces an instant swing decision and either an easy take or a full-force swing.
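As a rough sketch of the bookkeeping behind a number like MeatWaste%, the function below counts the share of pitches whose location falls in either bucket. The location labels `"middle-middle"` and `"waste"` are hypothetical stand-ins; real pitch data would need to be mapped onto those buckets first, and the excerpt doesn't spell out how that was done.

```python
# Hypothetical location labels; real data would need to be mapped onto these.
MEATWASTE_LOCATIONS = {"middle-middle", "waste"}


def meatwaste_pct(locations: list) -> float:
    """Percentage of pitches in the dead-center or Waste buckets."""
    if not locations:
        return 0.0
    hits = sum(1 for loc in locations if loc in MEATWASTE_LOCATIONS)
    return 100.0 * hits / len(locations)


if __name__ == "__main__":
    sample = ["middle-middle", "shadow", "chase", "waste", "shadow"]
    print(f"MeatWaste%: {meatwaste_pct(sample):.1f}")
```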

I found two things: First, that the Brewers are better, relative to the league, on these two pitch locations than they are on the whole. And second, that these easy opportunities come around often in the regular season, but disappear in close playoff games. Simple enough, though there are limits to what this finding allows us to infer about the Brewers’ future. It’s why they play the games, after all. Read the rest of this entry »


A FanGraphs Playoff Odds Performance Update

Gregory Fisher-Imagn Images

Look, I get it. You keep refreshing FanGraphs, and it keeps saying that the Mets are 99.9999% likely to make the playoffs (okay, fine, 79.4%). You’ve seen the Mets play, though. They stink! They’re 32-48 since June 13. The White Sox are better than that! We think they’re going to make the playoffs? These Mets?! What, do we not watch the games or something?

Well, to be fair, our models don’t actually watch the games. They’re just code snippets. But given how the Mets’ recent swoon has created the most interesting playoff race in baseball this year, and given that our odds keep favoring them to pull out of a tailspin, the time is ripe to re-evaluate how our playoff odds perform. When we say a team is 80% likely to make the playoffs, what does that mean? Read on to find out.

In 2021, I sliced the data up in two ways to get an idea of what was going on. My conclusions were twofold. First, our model does a good job of doing what it says on the tin: Teams that we give an 80% playoff chance make the playoffs about 80% of the time, and so on. Second, our model’s biggest edge comes from the extremes. It’s at its best determining that teams are very likely, or very unlikely, to make the playoffs. Our flagship model did better than a model that uses season-to-date statistics to estimate team strength in the aggregate, with that coverage of extreme teams doing a lot of the work. Read the rest of this entry »
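The "80% means 80%" check is a calibration test, and a bare-bones version is easy to sketch. The code below buckets forecasts into equal-width bins and compares the average forecast in each bin to how often those teams actually made the playoffs. The bin count and the function name are assumptions for illustration; the article's actual slicing may differ.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes, n_bins: int = 10):
    """Group forecasts into equal-width bins and compare to results.

    Returns rows of (bin_low, bin_high, count, mean_forecast, observed_rate).
    The 10-bin default is an arbitrary choice for illustration.
    """
    buckets = defaultdict(list)
    for prob, made_playoffs in zip(forecasts, outcomes):
        # Clamp so a forecast of exactly 1.0 lands in the top bin.
        index = min(int(prob * n_bins), n_bins - 1)
        buckets[index].append((prob, bool(made_playoffs)))

    rows = []
    for index in sorted(buckets):
        pairs = buckets[index]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(1 for _, made in pairs if made) / len(pairs)
        rows.append((index / n_bins, (index + 1) / n_bins,
                     len(pairs), mean_forecast, observed))
    return rows


if __name__ == "__main__":
    # Made-up forecasts and outcomes, purely to show the output shape.
    probs = [0.8, 0.8, 0.8, 0.8, 0.8, 0.1, 0.1]
    made = [True, True, True, True, False, False, False]
    for row in calibration_table(probs, made):
        print(row)
```

A well-calibrated model shows `mean_forecast` and `observed_rate` close together in every bin, given enough team-seasons per bin.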


Opportunity, Takeoff Rate, and Stolen Base Opportunism

Rafael Suanes-Imagn Images

David Hamilton doesn’t wait casually at first base. He lurks, waiting for the slightest opening to take off. Watch an at-bat where Hamilton is on the bases, and he’s often as much a point of discussion as the man at the plate. Take the game between the Red Sox and Guardians on September 1, for example. Hamilton pinch-ran for Carlos Narváez with Connor Wong at the plate. Wong fouled off a bunt for strike one with the entire defense focused on Hamilton at first base. Then Hamilton stole second on the next pitch even with the catcher, pitcher, and infielders all fixed on his every move.

Hamilton isn’t the most prolific basestealer in the majors. He isn’t the most successful. But he is the baserunner who tries to steal most frequently, after adjusting for opportunities, and so he’s a great poster boy for what I’d like to talk about today: stolen base opportunities and takeoff rate.

It doesn’t take much to make a stolen base possible, just a runner and an open base. You do need both of those, though. Draw a walk to load the bases, and you’re not attempting a steal without something very strange going on. Stolen base opportunities aren’t easy to find in a box score or a game recap. They’re the negative space of baseball – no one’s counting them, and it’s easier to see where they aren’t than where they are. So, uh, I counted them. Read the rest of this entry »
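To make "a runner and an open base" concrete, here is a small sketch of how one might flag opportunities from a base state and turn them into a takeoff rate. Two simplifications are mine rather than the article's: steals of home are ignored, and takeoff rate is taken to be attempts divided by opportunities. The function names are hypothetical.

```python
def steal_opportunities(occupied: set) -> list:
    """Return the bases (1 or 2) whose runner has an open base ahead.

    `occupied` holds the occupied bases, e.g. {1, 3}. Steals of home are
    deliberately left out, which is a simplifying assumption.
    """
    return [base for base in (1, 2)
            if base in occupied and (base + 1) not in occupied]


def takeoff_rate(attempts: int, opportunities: int) -> float:
    """Steal attempts per opportunity; 0.0 when there were no chances."""
    return attempts / opportunities if opportunities else 0.0


if __name__ == "__main__":
    print(steal_opportunities({1}))        # runner on first, second open
    print(steal_opportunities({1, 2, 3}))  # bases loaded, nowhere to go
    print(takeoff_rate(10, 40))
```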


How Much Do Trail Runners Matter? An Investigation

Rick Scuteri-Imagn Images

Watch this play. What do you notice?

Here’s what I see: Brooks Lee lofts a soft fly ball 248 feet from home plate. Chandler Simpson circles it but loses a bit of momentum by the time it lands in his glove. Twins third base coach Tommy Watkins sends the not-particularly-fast Trevor Larnach (18th-percentile sprint speed). Shallow fly ball, slow runner, close play at the plate — Larnach slides in just ahead of the throw. It’s an exciting sequence, and I’ve missed an important part of it. Read the rest of this entry »


Welcome to Meatball Watch 2025

Charles LeClaire-Imagn Images

I’d like to present the meatball-iest pitch thrown so far in 2025:

I know, I know! I said that, but it’s just a foul ball. Hear me out, though, because I can put some data behind my claim. Here at FanGraphs, PitchingBot, our in-house pitch modeling system, looks at every single pitch thrown, regresses it against a huge database of past pitches, and uses some mathematical ingenuity to turn that into the expected outcomes of the pitch. That’s not the same as knowing which pitch is most likely to turn into a home run, but luckily, a good bit of mathematical wrangling can turn pitch grades into home run percentages.

Last year, I worked out the rough contours of converting PitchingBot grades into home run likelihood. This year, I’ve expanded that methodology to try to learn a little bit more about the pitchers doing the meatballing. If you’d like to skip through the how, you can head right down to the table labeled “Meatball Mongers.” If you’re here for the nitty gritty of turning pitch metrics into home run likelihood, though, here’s how I did it.

That Trent Thornton fastball had a lot of things working against it, and those things help explain how PitchingBot estimates the chances that a pitch will be hit for a home run. PitchingBot has a flowchart that explains how the model works. Here’s how the system assesses every pitch it grades:

Hey, a convenient “start here” label! How great! The “swing model” takes location, count, pitch type, movement, platoon matchups, and pretty much everything else you can imagine into account and guesses at the likelihood of a batter swinging at each pitch. That Thornton fastball was down the middle in an 0-1 count, and it’s not a particularly deceptive offering. In other words, hitters often swing at fastballs like that – 92.7% of the time, per PitchingBot’s model. Read the rest of this entry »


The Enigma: My Journey Through Statistical Artifacts in Pursuit of Hot Streaks

Brett Davis-Imagn Images

A warning up top: This article is about seeking and not finding, about the unique ways that data can mislead you. The hero doesn’t win in the end – unless the hero is stochastic randomness and I’m the villain, but I don’t like that telling of the tale. It all started with an innocuous question: Can we tell which types of hitters are streaky?

I approached this question in an article about Michael Harris II’s rampage through July and August. I took a cursory look at it and set it aside for future investigation after not finding any obvious effects right away. To delve more deeply, I had to come up with a definition of streakiness to test, and so I set about doing so.

My chosen method was to look at 20-game stretches to determine hot and cold streaks, then look at performance in the following 20 games to see which types of players were more prone to “stay hot” or “stay cold.” I started throwing out definitions and samples: 2021-2024, minimum 400 plate appearances on the season as a whole, overlapping sampling (so check games 1-20 vs. 21-40, 2-21 vs. 22-41, and so on), wOBA as my relevant offensive statistic, 50 points of wOBA deviation against seasonal average to convey hot or cold, 40-PA minimum per 20-game set to avoid weird pinch-hitting anomalies, throw out games with no plate appearances to skip defensive replacements — the list goes on and on. Read the rest of this entry »
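Those definitions translate fairly directly into a sliding-window loop. The sketch below uses the parameters named above (20-game windows, a 50-point wOBA threshold, a 40-PA minimum, games without a plate appearance dropped, overlapping samples). The input format is my assumption: each game is a `(woba_numerator, plate_appearances)` pair, which glosses over the fact that the real wOBA denominator is not exactly plate appearances.

```python
WINDOW = 20        # games per stretch
THRESHOLD = 0.050  # wOBA points above or below the seasonal average
MIN_PA = 40        # minimum plate appearances per 20-game stretch


def window_woba(games):
    """Return (wOBA, PA) for a list of (woba_numerator, pa) game tuples."""
    pa = sum(game_pa for _, game_pa in games)
    if pa == 0:
        return 0.0, 0
    return sum(num for num, _ in games) / pa, pa


def label_streaks(games, season_woba):
    """Find hot/cold 20-game stretches and the wOBA of the next 20 games.

    Returns (start_index, label, first_woba, next_woba) tuples using
    overlapping windows: games 1-20 vs. 21-40, 2-21 vs. 22-41, and so on.
    """
    played = [g for g in games if g[1] > 0]  # skip games with no PA
    results = []
    for start in range(len(played) - 2 * WINDOW + 1):
        first = played[start:start + WINDOW]
        following = played[start + WINDOW:start + 2 * WINDOW]
        first_woba, first_pa = window_woba(first)
        next_woba, next_pa = window_woba(following)
        if first_pa < MIN_PA or next_pa < MIN_PA:
            continue
        if first_woba - season_woba >= THRESHOLD:
            label = "hot"
        elif season_woba - first_woba >= THRESHOLD:
            label = "cold"
        else:
            continue
        results.append((start, label, first_woba, next_woba))
    return results
```

Because the windows overlap, neighboring samples share most of their games and are far from independent, which is worth keeping in mind when reading anything into the results.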


Hit-By-Pitch Rates Have Been Falling for Five Years Now

Daniel Kucin Jr.-Imagn Images

What is the sound of a batter not getting hit by a pitch? I ask because as hit-by-pitch rates climbed over the years (and kept climbing), we writers have made lots of noise about them. In 2007, Steve Treder published an article called “The HBP Explosion (That Almost Nobody Seems to Have Noticed)” in The Hardball Times. After that, everybody noticed. We’ve seen articles about rising hit-by-pitch rates here at FanGraphs, Baseball Prospectus, the Baseball Research Journal, MLB.com, The Athletic, SportsNet, FiveThirtyEight, the Wall Street Journal — even the Clinical Journal of Sports Medicine. The venerable Rob Mains of Baseball Prospectus has been writing about it (and writing about it and writing about it) ever since he was the promising Rob Mains of the FanGraphs Community Blog. Tom Verducci wrote about the “hit-by-pitch epidemic” for Sports Illustrated in 2021, then wrote a different article with a nearly identical title just two months ago. There’s good reason for all this noise, and in order to show it to you, I’ll reproduce the graph Devan Fink made when he wrote about this topic in 2018:

Hit-by-pitches have been rising since the early 1980s, and despite a decline in the 1970s, you could argue that they’ve been rising ever since World War II. Devan’s graph ends in 2018, but the numbers kept on going up — for a while, anyway. Here’s a graph that shows the HBP rate in recent years. After a couple decades of sounding the HBP alarm, it’s time for us to unring that bell (which I assume, without having looked it up, is an easy thing to do):

Congratulations, everybody, we’ve done it! We’ve ended the epidemic. The HBP rate has fallen in four of the last five seasons. It’s safe to leave your home again. You can enter a public space without fear that you’ll be bombarded with stray baseballs. Rob Mains can finally take a vacation. Tom Verducci can finally take a deep breath. Read the rest of this entry »


The Most Feared Hitter in Baseball

Charles LeClaire-Imagn Images

“Who is the most feared hitter in baseball?” is not a question I set out to answer. That would be too easy! Step one: Write “Aaron Judge.” Step two: Let out a bemused chuckle. Obviously it’s Aaron Judge. Who would have commissioned such a silly article? Step three: Get lunch. That does sound pretty tempting, I must admit, but that’s not this article. This one is a little bit weirder.

I started by asking the opposite question: “Who is the least feared hitter in baseball?” I had a simple idea for how to test it. Take a look at the rate of pitches over the heart of the plate that each batter sees when behind in the count – more strikes than balls. A hitter who sees tons of pitches down the middle in a bad hitting situation isn’t a guy who scares opponents. Pitchers are so not afraid that they’re chucking pitches down Broadway even in the situations where that’s least necessary and least advantageous. Read the rest of this entry »
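The test described above boils down to a filtered rate, sketched below. The pitch record layout (dicts with `balls`, `strikes`, and a `zone` label of `"heart"`) is a hypothetical format chosen for illustration. The "behind in the count" filter follows the definition in the text: more strikes than balls.

```python
def heart_rate_when_behind(pitches):
    """Share of pitches over the heart of the plate when the batter is behind.

    Each pitch is a dict with 'balls', 'strikes', and 'zone' keys; the
    'heart' label is a hypothetical encoding of pitch location.
    Returns None if the batter never saw a pitch while behind.
    """
    behind = [p for p in pitches if p["strikes"] > p["balls"]]
    if not behind:
        return None
    heart = sum(1 for p in behind if p["zone"] == "heart")
    return heart / len(behind)


if __name__ == "__main__":
    sample = [
        {"balls": 0, "strikes": 1, "zone": "heart"},
        {"balls": 0, "strikes": 2, "zone": "chase"},
        {"balls": 1, "strikes": 2, "zone": "heart"},
        {"balls": 2, "strikes": 1, "zone": "heart"},  # ahead, so excluded
        {"balls": 1, "strikes": 2, "zone": "shadow"},
    ]
    print(heart_rate_when_behind(sample))
```

Ranking hitters by this rate from high to low would put the least feared at the top, at least by this one measure.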