Hey FanGraphs, Your Math Isn’t Mathing… Or Is It?

Denis Poroy and Cary Edmondson-Imagn Images

If you spend some time poking around the nooks and crannies of FanGraphs, you’ll eventually encounter one weird thing. Go to our Depth Charts Team WAR Totals page, and you’ll see all 30 teams arranged by the amount of WAR we project them to accrue this season. Go to our Projected Standings page, and you’ll see the winning percentage we expect for each team. Sometimes, those two pages seem to be displaying the exact same information. Sometimes, they don’t quite line up.

Take right now, for instance. We project the Padres for 40.8 WAR, the Giants for 38.7 WAR, and the Diamondbacks for 38.2 WAR. Look at the projected standings, however, and we have the Padres down for a .490 winning percentage, the Giants at .504, and the Diamondbacks at .501. That doesn’t feel right. Shouldn’t the team with the most projected WAR also project for the best record? Well, buckle up, because to explain how this works, we’re going to have to do some math.

We’ll break this one down into two parts. First, what does a team WAR projection mean? Most basically, it’s the sum of each player on that team’s WAR projection, but we’ll have to get more specific than that. Our projection systems can spit out a WAR, but that’s not their real output. They project actual on-field baseball results. Manny Machado’s Depth Charts projection is for 644 plate appearances, 28 doubles, 26 homers, 127 strikeouts, and so on. The WAR part of it gets calculated after the fact.

To quote our glossary, WAR “summarize(s) a player’s total contributions to their team in one statistic.” As you can imagine, there’s some approximation involved. One of the key features of WAR is that it puts everything into a neutral context. That’s how it can compare disparate players and situations. One good broad generalization of WAR is that it thinks about everything in linear weights plus adjustments. Every home run is worth the same given the same park factor. A stolen base is a stolen base. Gertrude Stein would approve.

That’s incredibly useful as an estimate. It’s just that, though: an estimate. The Blue Jays had a .333 on-base percentage as a team last year. The Guardians had a .296 team OBP. That’s 12.5% more baserunners for the Jays, given the same number of plate appearances. A double was more valuable to Toronto than to Cleveland because there were, on average, more baserunners to advance. The Pirates had a .350 slugging percentage, the worst in baseball. Walks are worth less if the team around you is less likely to drive you home with loud contact. But wOBA, and thus WAR, doesn’t care about that; those measures count every double as the same and every walk as the same, and then toss in a park adjustment after the fact to handle stadium effects.
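
To make that baserunner arithmetic concrete, here is a quick sketch. The two OBP figures come from the paragraph above; the 6,000 plate appearances is a round number assumed purely for illustration, not either team’s actual total.

```python
# OBP figures are quoted in the article; PLATE_APPEARANCES is a
# hypothetical round number used only to turn rates into counts.
PLATE_APPEARANCES = 6000

jays_obp = 0.333
guardians_obp = 0.296

jays_baserunners = jays_obp * PLATE_APPEARANCES
guardians_baserunners = guardians_obp * PLATE_APPEARANCES

# Relative gap in baserunners; the PA total cancels out of the ratio.
relative_gap = jays_baserunners / guardians_baserunners - 1
print(f"Extra baserunners for Toronto: {relative_gap:.1%}")
```

A linear-weights system assigns a double the same value in both lineups, even though one of them has noticeably more runners on base to drive in.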

Pitching WAR is a bit more complicated, but it’s based on the same idea: creating a context-neutral approximation of value to allow the comparison of players in different situations. We apply other adjustments after the fact for leverage and for a pitcher’s own effect on their run environment, but the broad strokes remain the same. When you’re looking to compare players on different teams without giving them too much credit (or docking them too much) for the teammates who happen to play around them, WAR is very useful.

You’ll notice, though, that none of those steps involve adding up the WAR of a team to estimate its overall talent level. Context neutrality is great for the main purpose of WAR, but it makes less sense at the team level. The sum of a team’s WAR is how well we’d expect all its players to do if they were each independently put in a neutral context, but they don’t play in a neutral context; they play on their own team. The differences aren’t enormous (home runs are good no matter who you play for, and strikeouts are pretty bad for everyone), but they can easily be on the order of a dozen or two runs across an entire season for an entire team.

While the Depth Charts are summing up WAR, the projected standings are doing something else entirely. We use the same projections there, but in a different way. In the same way that WAR takes in a ton of component stats, pushes them through a formula, and spits out wins, we throw team-level component stats into a big equation to do projected standings. The difference is that we use BaseRuns instead of a context-neutral WAR summation.

BaseRuns will make your eyes glaze over. Here’s just one of the four components of calculating a team’s BaseRuns:

B = 1.1*[1.4*TB – 0.6*H – 3*HR + 0.1*(BB + HBP – IBB) + 0.9*(SB – CS – GDP)]

While the exact math of BaseRuns might be hard to grasp, the concept isn’t. The formula approximates run scoring based on specific team context. Research shows that this does a better job of approximating run scoring than a linear weights formula, because it does a better job with values far from league average, like the Blue Jays’ and Guardians’ offenses in 2025.
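
For readers who would rather see it as code, here is a minimal sketch of how the pieces fit together. The `baseruns_b` function is a direct transcription of the B component shown above. The combining step, A × B / (B + C) + D, is the standard published structure of BaseRuns (baserunners times the rate at which they score, plus home runs). This article doesn’t spell out A, C, or D, so the sketch takes them as inputs rather than guessing at them, and the function names are hypothetical.

```python
def baseruns_b(tb, h, hr, bb, hbp, ibb, sb, cs, gdp):
    """The B ("advancement") component, transcribed from the article."""
    return 1.1 * (
        1.4 * tb
        - 0.6 * h
        - 3 * hr
        + 0.1 * (bb + hbp - ibb)
        + 0.9 * (sb - cs - gdp)
    )


def baseruns(a, b, c, d):
    """Combine the four components into a run estimate.

    a: baserunners, b: advancement, c: outs, d: automatic runs (homers).
    b / (b + c) acts as the share of baserunners who come around to score.
    """
    return a * b / (b + c) + d


# Tiny made-up inputs, just to show the call shape.
b_component = baseruns_b(tb=10, h=5, hr=1, bb=2, hbp=1, ibb=1, sb=1, cs=0, gdp=1)
runs = baseruns(a=100, b=50, c=150, d=20)
```

The key design point is the ratio B / (B + C): because the score rate depends on the team’s own totals, the value of any one event shifts with the lineup or pitching staff around it, which a linear-weights formula cannot capture.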

We calculate team-level runs scored and runs allowed estimates using BaseRuns, then use those to create an expected winning percentage using a Pythagorean approximation. WAR never enters the equation here, even though we’re using all the same building blocks to construct our estimates. Instead of aggregating them up via the WAR formula and layering in adjustments from there, we aggregate them up via the BaseRuns formula. WAR is better at comparing individual players. BaseRuns is better at measuring how players interact.
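
The Pythagorean step can be sketched in a few lines. This is the generic form of the formula, not necessarily the exact version used for the projected standings: the exponent of 1.83 is an assumption (a commonly cited value; the classic formula uses 2), and the 700-run team is hypothetical.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 1.83) -> float:
    """Expected winning percentage from runs scored and runs allowed.

    exponent=1.83 is an assumed, commonly cited value; the classic
    version of the formula uses 2.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Hypothetical team: 700 runs scored, with two runs-allowed scenarios.
even = pythagorean_win_pct(700, 700)
worse = pythagorean_win_pct(700, 722)  # 22 more runs allowed

print(f"{even:.3f} vs {worse:.3f}")
print(f"Win gap over 162 games: {(even - worse) * 162:.1f}")
```

Nothing in this step knows about WAR. It only sees the runs scored and runs allowed estimates that come out of BaseRuns.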

How does this play out in the specific case of the Padres and Giants? The big difference comes in how BaseRuns looks at each team’s pitching projections. The Giants staff projects to allow far fewer home runs than league average. That’s partially a stadium effect, but it’s also the effect of having a ton of good sinkerballers, led by Logan Webb, the league’s very best groundball pitcher. The Giants allowed the fewest home runs in baseball last year, and they’ve been among the three best teams at preventing the long ball in each of the last five years.

BaseRuns looks at that skill and says that it will lead to fewer runs scoring than a naive, context-neutral ERA estimator would expect. Allowing baserunners is less harmful when you don’t allow home runs, as it turns out, and BaseRuns handles that quite well. There’s a real interaction effect going on here. With fewer home runs allowed, every other way of reaching base is less damaging to the pitching team. WAR isn’t looking at that; it’s measuring everything in a neutral context.
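
Here is a rough sketch of that interaction effect. Only the B component appears in this article; the A, C, and D definitions below follow the commonly published version of BaseRuns and should be treated as assumptions. Both pitching lines are invented for illustration: they are identical except that 30 home runs allowed have been turned into outs for the second staff.

```python
def baseruns_total(pa, h, tb, hr, bb, hbp=0, ibb=0,
                   sb=0, cs=0, gdp=0, sf=0, sh=0):
    """Run estimate; A, C, and D are assumed from the commonly
    published formula, while B matches the article."""
    a = h + bb + hbp - 0.5 * ibb - hr
    b = 1.1 * (1.4 * tb - 0.6 * h - 3 * hr
               + 0.1 * (bb + hbp - ibb) + 0.9 * (sb - cs - gdp))
    c = pa - bb - sf - sh - hbp - h + cs + gdp
    d = hr
    return a * b / (b + c) + d


def marginal_walk_value(team: dict) -> float:
    """Extra runs allowed from one additional walk (one more PA, one more BB)."""
    with_walk = dict(team, pa=team["pa"] + 1, bb=team["bb"] + 1)
    return baseruns_total(**with_walk) - baseruns_total(**team)


# Hypothetical season totals allowed by two pitching staffs.
homer_prone = dict(pa=6100, h=1350, tb=2200, hr=190, bb=500)
# Same staff with 30 homers turned into outs: 30 fewer hits, 120 fewer total bases.
homer_stingy = dict(pa=6100, h=1320, tb=2080, hr=160, bb=500)

walk_cost_prone = marginal_walk_value(homer_prone)
walk_cost_stingy = marginal_walk_value(homer_stingy)
print(f"Cost of a walk: {walk_cost_prone:.3f} vs {walk_cost_stingy:.3f} runs")
```

Under these assumptions, the same walk costs the homer-suppressing staff slightly less than it costs the homer-prone one. That is the effect a context-neutral measure leaves out.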

But San Francisco allows 15% fewer home runs than the average team. That’s pretty clearly not neutral – it’s a miniature, localized Deadball era. Meanwhile, the Padres also play in a park that suppresses offense, and yet we project them to allow more home runs than league average. BaseRuns looks at that and sees more runs scored than you’d expect. In all, we have the Padres allowing 22 more runs than the Giants. That’s despite 15 projected pitching WAR for the Padres and 12.2 projected pitching WAR for the Giants.

That’s pretty much the entire difference between a WAR-based projection of the Padres and Giants, and a BaseRuns-based projection. WAR uses context-neutral statistics and some clever adjustments for leverage and park to say that the Padres pitchers are better. BaseRuns uses different math to say that Giants pitchers project to allow fewer runs. They take the same data and end up in different places.
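
A back-of-the-envelope check of that claim, using only numbers quoted in this article plus one assumption: the common rule of thumb that roughly 10 runs equal one win.

```python
GAMES = 162
RUNS_PER_WIN = 10.0  # assumption: a common rule of thumb, not a site constant

# All figures below are quoted in the article.
padres = {"war": 40.8, "win_pct": 0.490, "pitching_war": 15.0}
giants = {"war": 38.7, "win_pct": 0.504, "pitching_war": 12.2}
extra_runs_allowed_by_padres = 22

# Overall gap between the two pages, in wins (positive favors San Diego).
war_gap = padres["war"] - giants["war"]
standings_gap = (padres["win_pct"] - giants["win_pct"]) * GAMES
total_swing = war_gap - standings_gap

# The same comparison, restricted to pitching.
pitching_war_gap = padres["pitching_war"] - giants["pitching_war"]
pitching_runs_gap = -extra_runs_allowed_by_padres / RUNS_PER_WIN
pitching_swing = pitching_war_gap - pitching_runs_gap

print(f"Total swing between pages: {total_swing:.1f} wins")
print(f"Swing from pitching alone: {pitching_swing:.1f} wins")
```

Under that rule of thumb, the pitching-side swing comes out about the same size as the overall swing between the two pages, which is consistent with pitching accounting for most of the disagreement.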

If you’re looking for the biggest single driver of this discrepancy, look no further than Mason Miller and crew. The Padres closer is projected for 2.3 WAR, more than any Giants pitcher other than Webb. Some of that is because he projects for a 2.43 ERA. A lot of it is because of our leverage adjustment – we multiply the raw WAR output of high-leverage relievers to account for the fact that they enter the game in more important situations.

In fact, leverage adjustments play a huge role in the Padres’ WAR projection. We have their bullpen down for a 3.89 ERA, 3.96 FIP, and 5.0 WAR. We have the Giants ’pen projected for a 3.91 ERA, 3.94 FIP – and 1.1 WAR. Same runs allowed, more or less, and in a similar offensive environment, but four wins of difference by the way we define WAR. Maybe the Padres will leverage their phenomenal unit into more wins than you’d expect based on their run differential, but they had the best bullpen in baseball last year and played exactly at their Pythagorean expectation, so it’s certainly not a given.

There’s no equivalent to that leverage adjustment in our BaseRuns calculation. To bring back Ms. Stein, a run is a run is a run as far as BaseRuns is concerned. That’s a weakness of the BaseRuns formula, just as not accounting for specific context is a weakness of the way WAR works. BaseRuns gets a lot closer to the truth, at least in my estimation, but at their core, both systems are trying to take some basic statistics and turn them into an estimate of talent.

If you’re dead set on using WAR at the team level, though, don’t worry. You still can! We have a WAR-based version of our playoff odds available on the site. In my backtests, it’s done nearly as well as the BaseRuns version, certainly within a margin of error. The truth is, both BaseRuns and WAR are abstractions. “Here’s how the players will do, now tell me how the team will do” isn’t a solved problem.

In fact, if you think about it, it’s interesting that both formulas often come to such similar conclusions about team talent. They’re built completely differently. They both have blind spots and things they’re best at. They compare teams that are, in some cases, quite different from each other, and they get answers within a few wins over a 162-game season most of the time. So who’s better between the Padres and Giants? It all depends on what you mean by “better,” and how you want to measure it.

Ben is a writer at FanGraphs. He can be found on Bluesky @benclemens.

3 Comments
mike sixel (Member since 2016)
30 minutes ago

Just came to say Ben also explained some things to me about math earlier, and he seems like a nice guy. Didn’t talk down to me, just tried to help me understand. Also, this is interesting math!

Farbsy15 (Member since 2023)
5 minutes ago
Reply to  mike sixel

Don’t ever expect a fraction of that respect from anyone at Baseball Prospectus. Learned my lesson a few times