Baseball Baselining: How Likely Is a Comeback?
Monday night, my wife posed a baseball question I couldn’t immediately answer. As the Angels and Giants went to the eighth inning with the Halos up by a run, she had a simple question: How often does a team that’s losing after seven innings come back and win? I guess I could have gone to our wonderful WPA Inquirer, a fun little tool for hypotheticals. That tells me that the Giants had around a 25% chance to win heading into the eighth. But I took her question as a broader one, concerned not just with that specific game, but with all games. How likely is a comeback?
I didn’t know the answer offhand, and I couldn’t find it on Google either (secret professional writer tip: use Google). So I did what anyone in my situation would do: I said “I don’t know, but now I’m going to write an article about this.” Two days later, here we are.
I’m hardly the first person to do research on comebacks. Russell Carleton has been looking into comebacks for a while. Rob Mains has too. Chet Gutwein investigated comeback wins and blown saves here at FanGraphs in 2021. Everyone loves to write about comebacks. Baseball Reference even keeps a list of the biggest comeback wins. They’re memorable games, and fertile ground for investigation.
I’m not trying to split the atom here, though. I don’t have any groundbreaking new research on how we’ve been misunderstanding the very nature of baseball for decades. I’m going for something simpler, a big ol’ batch of tables that answers this broad question: When I’m watching a baseball game in inning X, how likely is it that the team that’s currently winning will see its lead hold up?
Again, you could get more granular than that if you wanted to. You could consider the size of the lead, the teams involved, the stadium, any number of things. You could just look at a win probability graph in real time and have a much better idea than a broad generalization. But that’s not really what I’m going for. I don’t want to look at a win probability graph every time I watch a baseball game. I usually want to sip my drink and enjoy a lovely evening at the park or on my couch. I’m trying to teach my brain to say “Here’s what you can expect.”
I do this in a lot of areas of life. How often will my train be late? How many people will no-show to our dinner party? What are the odds of me drawing a playable hand in whatever board game I’m currently into? It helps me to have a probability in my head, because it gives me an idea of how rare (or how mundane) the subsequent thing that transpires is. I also like having a number rather than a general feeling, because general feelings are a little squishy and subjective for me. The number doesn’t have to be exact; it’s just a little bit of brain training.
That’s why I’m doing this. Now, let’s set some baselines. I decided to break out my data into three groups: the last five seasons, the last 10 seasons, and the last 20 seasons. There’s no particular scientific reason for this; I just wanted to see whether much has changed. I queried our database like so: How many games had a different team in front after the eighth than at the end of the game? How about the seventh? The sixth? I continued down the line.
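For anyone who wants to try the same count on their own data, here's a minimal sketch of that query in Python. It is not the actual database query. The game format (a list of cumulative away/home scores after each completed inning, with the final score last) and the names `Game`, `leader`, `comeback_rate`, and `sample_games` are all assumptions made for illustration. It also assumes that games tied after the inning in question are left out of the denominator, since there's no leader to overtake.

```python
from typing import List, Optional, Tuple

# Hypothetical record: cumulative (away, home) scores after each completed
# inning; the last entry is the final score.
Game = List[Tuple[int, int]]


def leader(score: Tuple[int, int]) -> Optional[str]:
    """Return 'away', 'home', or None if the score is tied."""
    away, home = score
    if away > home:
        return "away"
    if home > away:
        return "home"
    return None


def comeback_rate(games: List[Game], inning: int) -> float:
    """Among games with a leader after `inning`, the share the other team won."""
    led = 0
    flipped = 0
    for game in games:
        if len(game) < inning:
            continue  # game never reached this inning
        ahead = leader(game[inning - 1])
        if ahead is None:
            continue  # assumption: tied games are excluded from the count
        led += 1
        winner = leader(game[-1])
        if winner is not None and winner != ahead:
            flipped += 1
    return flipped / led if led else float("nan")


# Toy three-inning "games", purely to show the mechanics.
sample_games: List[Game] = [
    [(1, 0), (1, 0), (1, 3)],  # away led early, home won
    [(0, 2), (0, 2), (0, 2)],  # home led throughout
    [(0, 0), (1, 0), (1, 0)],  # tied after one, away won
    [(2, 0), (2, 1), (2, 1)],  # away led throughout
]

for n in (1, 2):
    print(f"after inning {n}: {comeback_rate(sample_games, n):.1%}")
```

Running the same function for innings one through eight over each date range would fill in one column of a table like the ones in this piece.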
Without further ado, here’s the data. If you’re looking for a single-line takeaway, just look at the right-most column and round to the nearest 5%. If one team is leading after a given inning, here are the odds that the other team will win the game:
Inning | Past 20 Years | Past 10 Years | Past 5 Years |
---|---|---|---|
1 | 31.3% | 30.9% | 30.8% |
2 | 28.2% | 27.8% | 27.5% |
3 | 25.0% | 24.5% | 24.0% |
4 | 20.9% | 20.5% | 19.9% |
5 | 17.5% | 17.0% | 16.5% |
6 | 13.3% | 12.7% | 12.8% |
7 | 9.0% | 8.6% | 8.9% |
8 | 4.6% | 4.5% | 4.7% |
These numbers seem tiny, and in fact they are. They’re felled by a critical flaw: plenty of times, the game is obviously out of reach. When we lump together one-run nail-biters with 10-run thumpings, the aggregates will look weird. To handle this in a lazy but simple way, I ran the data again and threw out any game with a score differential of four or more runs. In my head, those games are already decided; I’m more interested in how likely a score is to flip in a contested game. Here’s that amended chart:
Inning | Past 20 Years | Past 10 Years | Past 5 Years |
---|---|---|---|
1 | 33.2% | 32.8% | 32.6% |
2 | 31.3% | 31.0% | 30.8% |
3 | 29.5% | 29.0% | 28.7% |
4 | 26.2% | 25.8% | 25.1% |
5 | 23.6% | 22.8% | 22.5% |
6 | 19.2% | 18.3% | 18.6% |
7 | 14.2% | 13.6% | 14.2% |
8 | 7.9% | 7.7% | 8.0% |
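The filter behind this second table (dropping anything with a lead of four or more runs) can be sketched the same way. This is a hedged illustration, not the real query: the `(leader_runs, trailer_runs, trailer_won)` snapshot format and the names `contested_comeback_rate` and `max_margin` are hypothetical, and it assumes the margin is measured at the end of the inning in question rather than at the final score.

```python
from typing import Iterable, Tuple

# Hypothetical snapshot for one inning of one game:
# (runs for the team in front, runs for the team behind, did the trailer win?)
Snapshot = Tuple[int, int, bool]


def contested_comeback_rate(
    snapshots: Iterable[Snapshot], max_margin: int = 3
) -> float:
    """Comeback rate among games where the lead is between 1 and `max_margin` runs."""
    kept = 0
    comebacks = 0
    for leader_runs, trailer_runs, trailer_won in snapshots:
        margin = leader_runs - trailer_runs
        if margin <= 0 or margin > max_margin:
            continue  # skip ties and leads bigger than max_margin
        kept += 1
        comebacks += trailer_won  # True counts as 1
    return comebacks / kept if kept else float("nan")


# Toy data: three close games (one comeback) and two blowouts (no comebacks).
toy = [
    (3, 2, True),
    (5, 0, False),
    (4, 1, False),
    (2, 0, False),
    (7, 1, False),
]

print(f"all leads:      {contested_comeback_rate(toy, max_margin=99):.1%}")
print(f"within 3 runs:  {contested_comeback_rate(toy):.1%}")
```

In the toy data, removing the two blowouts lifts the comeback rate, which is the same direction of change as between the two tables.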
That reads a bit better, and it gives us some overall guidelines. In the early innings, trailing teams win the game nearly a third of the time – there are plenty of at-bats left to create some drama. By the middle innings, we’re talking about 20-25%, roughly the odds of flipping heads twice in a row. By crunch time, the team in the lead generally stays ahead, just like you’d expect.
Now that I’ve stopped to think about it, I prefer a slightly refined version of that rule of thumb. I’m going to break out the score for the ninth inning specifically, because “save situation, closer coming in” is such a big part of baseball broadcasts that I want to remember those specifics. Before that, I’ll default to generalities. It goes like this: in the early innings, the odds of a comeback are 33%. In the middle innings, they’re 25%. By the start of the eighth inning, they’re down to 15%. Those numbers lump together every deficit of three runs or fewer – you can shade the odds higher if it’s a one-run deficit, or lower for bigger gaps. By the start of the ninth, the odds of a comeback are 15% in a one-run game, 5% in a two-run game, and 2.5% in a three-run game, which means you can almost ignore three-run games and treat them as over.
That’s a short enough rule of thumb to hold in my head, so that’s what I’ll be using from now on. You can, of course, add bells and whistles if you want. You could break the eighth inning down by deficit: 20-25% with a one-run deficit, 10% with a two-run deficit, and 5% with a three-run deficit. That’s more nuance than I want when I’m coming up with heuristics, but your mileage may vary.
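If you'd rather keep the main rule of thumb in a function than in your head, a small lookup is enough. This sketch covers only the basic version, not the optional eighth-inning breakdown. The function name and arguments are made up for illustration, and the inning boundaries are an assumption: "early" is treated as after innings one through three and "middle" as after innings four through six, which the rule itself doesn't pin down.

```python
from typing import Optional


def comeback_rule_of_thumb(
    innings_completed: int, deficit: Optional[int] = None
) -> float:
    """Rough comeback probability for the trailing team after a completed inning."""
    if not 1 <= innings_completed <= 8:
        raise ValueError("rule of thumb covers innings 1 through 8")

    if innings_completed == 8:
        # Start of the ninth: broken out by deficit.
        by_deficit = {1: 0.15, 2: 0.05, 3: 0.025}
        if deficit not in by_deficit:
            raise ValueError("ninth-inning rule covers deficits of 1 to 3 runs")
        return by_deficit[deficit]

    if innings_completed == 7:
        return 0.15  # start of the eighth
    if innings_completed >= 4:
        return 0.25  # assumed "middle innings": after 4, 5, or 6
    return 0.33      # assumed "early innings": after 1, 2, or 3


print(comeback_rule_of_thumb(2))     # early innings
print(comeback_rule_of_thumb(7))     # heading to the eighth
print(comeback_rule_of_thumb(8, 1))  # one-run game heading to the ninth
```

Raising an error for ninth-inning deficits bigger than three is a design choice here, since the rule doesn't give a number for those games.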
It didn’t take too long to solve that particular problem, so I thought I’d answer one more using this same data. It’s one I’ve often wondered about when I go to a game in person. It goes like this: How often is the game result decided after a given inning?
As an example, take a game where the home team leads 4-2 after four and goes on to win. The result would have been the same whether the game ended after four innings or played out to its conclusion. Note that I’m not accounting for what happens in between – the road team might have taken a lead only to blow a save or something along those lines. I could account for that, but at large computational cost, and without much upside in my opinion. Those intermittent hijinks are fun for sure, but at the end of the day, “leading after X innings and goes on to win” feels fairly decided to me, even if the winning team makes you sweat a bit.
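This formulation of the question brings tie games back into the fold. A tie game obviously hasn’t been decided, so it falls into the same bucket as comebacks. With that said, here’s the data, showing how often the result is still undecided after each inning (the game is tied, or the team in front goes on to lose):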
Inning | Past 20 Years | Past 10 Years | Past 5 Years |
---|---|---|---|
1 | 66.9% | 67.3% | 67.3% |
2 | 52.9% | 52.9% | 52.7% |
3 | 43.0% | 43.0% | 42.2% |
4 | 35.5% | 35.5% | 34.8% |
5 | 29.7% | 29.5% | 29.4% |
6 | 24.2% | 23.9% | 23.8% |
7 | 19.2% | 18.9% | 19.0% |
8 | 14.0% | 14.0% | 14.3% |
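The "undecided" count can be sketched with a small change to the comeback logic: ties stay in the sample and land in the undecided bucket. As before, this is an illustration under assumptions rather than the actual query. The game format (cumulative away/home scores after each completed inning, final score last) and the names `Game`, `lead_sign`, and `undecided_share` are hypothetical, and the final score is assumed never to be a tie.

```python
from typing import List, Tuple

# Hypothetical record: cumulative (away, home) scores after each completed
# inning; the last entry is the final score (assumed not to be a tie).
Game = List[Tuple[int, int]]


def lead_sign(score: Tuple[int, int]) -> int:
    """Return 1 if away leads, -1 if home leads, 0 if tied."""
    away, home = score
    return (away > home) - (away < home)


def undecided_share(games: List[Game], inning: int) -> float:
    """Share of games that are tied after `inning` or led by the eventual loser."""
    eligible = [g for g in games if len(g) >= inning]
    if not eligible:
        return float("nan")
    undecided = sum(
        1 for g in eligible if lead_sign(g[inning - 1]) != lead_sign(g[-1])
    )
    return undecided / len(eligible)


# Toy three-inning "games", purely to show the mechanics.
toy_games: List[Game] = [
    [(1, 0), (1, 0), (1, 3)],  # away led early, home won: undecided after 1 and 2
    [(0, 2), (0, 2), (0, 2)],  # decided from the first inning
    [(0, 0), (1, 0), (1, 0)],  # tied after 1, decided after 2
    [(2, 0), (2, 1), (2, 1)],  # decided from the first inning
]

for n in (1, 2, 3):
    print(f"undecided after inning {n}: {undecided_share(toy_games, n):.1%}")
```

Because tied games count as undecided here, these shares run higher than the comeback rates for the same inning.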
The stability of these numbers over time interests me. The difference between a 20-year average and a five-year average isn’t particularly big in any given inning or way of looking at it. Somehow, the past five years have the most games that haven’t been decided by the end of the eighth, which I would not have guessed in the era of everyone throwing 100.
So, how likely is a comeback when a team is trailing after seven innings? It happens about 15% of the time in the abstract, but 25% of the time in the Giants’ specific situation (i.e. trailing by one run). By the time they made it to the ninth down a run, it was almost exactly 15%. That’s a healthy amount of uncertainty, at least in my opinion, and it was certainly a fun amount of uncertainty for Giants fans, because their team hung six runs on the Angels to win in convincing fashion. It’s a good reminder that none of these numbers are absolute, and as Yogi Berra once said, it ain’t over ’til it’s over.
Ben is a writer at FanGraphs. He can be found on Twitter @_Ben_Clemens.
So what are the odds of a comeback after 8 innings with the best reliever on the mound and being 3 runs back? The win probability from last night before Houston’s rally said 3.2%. That seemed high.
General win probability figures aren’t going to adjust for individual player performances, home park, prevailing conditions, etc. It’s just a more detailed version of what Ben does here, for most intents and purposes. Based on the current game state (inning, score, runners on, outs), the kind of general WP stat you’ll find most places is the chance based on the data across many, many games that team X will win based solely on the game-state.
The less likely the comeback, the more memorable it is. Yes, we can stop watching when our team is hopelessly behind, but then Daniel Camarena and Kyle Tucker hit grand slams, or Rajai Davis and Kirk Gibson hit two-run homers, and things are upside down.