Did WAR Ruin the MVP Conversation?

On Wednesday, I wrote about one of my favorite topics: The impact of sabermetrics on the practice and analysis of baseball. Specifically, in this case: How MVP voters behave in the post-Fire Joe Morgan era. And for those of you who got to the end of that 2,000-word post and did not feel sated, there’s good news! This was not the question I actually set out to answer when I started kicking the topic around.

Welcome to Part 2.

The very name of the MVP award invites voters to consider the value of a certain player’s contributions. For nearly 100 years, that was a tricky proposition. How do you weigh differences in position, playing style, and park factors, or hitting versus pitching versus fielding versus baserunning? It’s enough to boggle the mind.

One of the earliest and most enduring projects of the sabermetrics movement has been the pursuit of a catch-all metric that captures the entire picture. To measure every player’s production, put it in context, and spit out a single number that says he was worth X number of runs or Y number of wins. The implications of such a number are obvious for an enterprise such as identifying the most valuable player in the league in a given season.

Critics from days gone by — or critics from now, who are so behind the times they only sound like they’re from days gone by — would wield WAR as a straw man against more numerate analysts. “If you believe in WAR, why not just vote the WAR leaderboard?”

The answer is that WAR, while published to the tenth of a win, is a model built on assumptions that might not hold to that level of precision. It’s a good starting point, but it’s not the whole conversation, especially in a close race.

And you can tell that WAR is not settled science because even the sabermetrics eggheads can’t agree on what WAR is. FanGraphs, obviously, has its own WAR, and I use that because I’m a good company man. But there are dozens of MVP voters, and many thousands of people who publish analysis of baseball for public consumption. Unfortunately, some of them get their information from other sources. (But we’re working on that, I promise.)

When people talk about WAR, they’re either talking about FanGraphs WAR, Baseball Reference WAR, or WARP, from Baseball Prospectus. (The “P” at the end stands for “Player,” as in wins above replacement player.) Each makes its own assumptions about the role of defense in pitcher evaluation, just to name one point of divergence. So while all three mainstream WARs usually arrive at similar conclusions, they don’t arrive at the same conclusion. Which is what you’d need to make “sort by WAR” a coherent award voting strategy.

I’ve compiled a list of every player since 2000 to finish in the top 10 in MVP voting, and their WAR totals that year for fWAR, bWAR, and WARP — 481 seasons in all. On only one occasion — Justin Turner in 2017 — did all three major flavors of WAR agree on a player’s value to within a tenth of a win. So I feel pretty confident that Turner was worth exactly 5.6 WAR in 2017. Everything else is open to interpretation.

There’s another complicating factor: All three flavors of WAR get updated every so often, as wins above replacement is not some inviolable concept that was considered perfect when it was first willed to humanity by the gods. It’s a statistical model, after all, and those ought to get updated when new information emerges. So the WAR totals you see now might not be exactly the same as they were when voters were looking at the various sites’ leaderboards back in the day.

So there are different ways of answering the question of how closely WAR and the MVP vote line up. From a normative standpoint, you could look at it as the voters getting the question right, or as a check on WAR — do the numbers match what we see with our eyes, or is this math left to run amok?

From 2000 to 2023 (since we don’t have the MVP vote tallies from this year yet), there were 134 individual league leaders in WAR. That’s 24 seasons times two leagues times three WAR brands, minus the five years (2000 to 2004) for which BP doesn’t have published WAR data.

Out of those 134 individual player seasons, the WAR leader — for each league and WAR type — finished in the top 10 in MVP voting 120 times. Here are the exceptions.

WAR Leaders Outside the MVP Top 10
Season | League | Player | WAR(P) Type
2005 | AL | Victor Martinez | Baseball Prospectus
2006 | AL | Grady Sizemore | FanGraphs
2008 | AL | Nick Markakis | Baseball Reference
2009 | AL | Zack Greinke | FanGraphs and B-Ref
2009 | AL | Justin Verlander | Baseball Prospectus
2011 | NL | Cliff Lee | Baseball Reference
2011 | NL | Brian McCann | Baseball Prospectus
2014 | AL | Corey Kluber | Baseball Reference
2014 | AL | José Bautista | Baseball Prospectus
2016 | NL | Buster Posey | Baseball Prospectus
2021 | NL | Corbin Burnes | FanGraphs
2021 | NL | Zack Wheeler | Baseball Reference
2022 | NL | Juan Soto | Baseball Prospectus
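
For anyone who wants to check the counting, here is a minimal Python sketch of the arithmetic. It assumes only the season, league, and brand counts stated in the text; the variable names are illustrative. Note that Greinke's row covers two leaderboards, so the 13 rows in the table account for 14 misses.

```python
# Count the WAR leaderboards from 2000 to 2023, as described in the text.
seasons = 2023 - 2000 + 1        # 24 seasons
leagues = 2                      # AL and NL
brands = 3                       # fWAR, bWAR, WARP
warp_missing_seasons = 5         # 2000 to 2004: no published WARP data

leaderboards = seasons * leagues * brands - warp_missing_seasons * leagues

# Leaderboards topped per row of the exceptions table, in table order.
# Greinke's 2009 row counts twice (FanGraphs and B-Ref).
boards_per_row = [1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1]
misses = sum(boards_per_row)
top10 = leaderboards - misses

print(leaderboards, misses, top10)      # 134 14 120
print(f"{top10 / leaderboards:.1%}")    # 89.6%
```

In other words, the WAR leader landed in the MVP top 10 on roughly nine leaderboards out of 10.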

It should surprise no one that starting pitchers are heavily represented here. MVP voters hate pitchers for reasons that are not entirely clear to me. A starting pitcher can lead the league in WAR and not get the time of day from voters. Greinke, apparently, can lead the league in two different WARs and still be ignored.

On Wednesday, I dragged Max Nichols, the poor soul who cast the one dissenting MVP vote against Carl Yastrzemski in 1967. Today, I’d like to praise another singularly iconoclastic voter: Nick Piecoro of the Arizona Republic in 2018.

That year, Christian Yelich ran away with the MVP vote, winning 29 first-place votes for a season in which he missed the Triple Crown by two home runs and one RBI. He hit .326/.402/.598, which was not only a paradigm-shifting breakout for Yelich himself, but also a transformative one for his team. Yelich, who was traded from Miami to Milwaukee the previous winter, led the unfashionable Brewers to the No. 1 seed in the National League. Narratively, it was not unlike Yaz in 1967: an unbelievable individual season at the center of the league’s best team-level story.

And yet Jacob deGrom was so much better. This was the 10-9, 1.70 ERA season that spawned a million memes and ended up, somehow, going underrated by voters. The Mets ace led the league in all three WAR categories and had more than a win on Yelich in all three cases, up to 2.6 WAR over the eventual MVP according to B-Ref. And yet he only finished fifth.

Not every MVP debate has a right and wrong answer. This one did, and Piecoro was the only voter who found it.

Most of the rest of the ignored WAR leaders topped the table in WARP, which can be a bit stingy and occasionally spits out an unexpected result. Nevertheless, it’s a reminder that Soto was really good even in a down year in 2022, and McCann was hugely underrated in general.

I did want to highlight Markakis because, at the risk of being unkind, he’s the last person I expected to lead the league in anything, ever. He led the league in bWAR in 2008 by having, in some respects, a typical Markakis season: He hit around .300, with about 20 homers and 40-odd doubles, and played good defense in right. What was unusual is that he also walked a career-high 99 times, which elevated his OBP to .406, and DRS credited him with 22 runs saved above average, which is a ridiculous number. That was enough to boost a very, very good season to 7.4 bWAR (and 6.1 fWAR, so it’s not like B-Ref was an outlier here), and in a weak season that topped the AL.

This was my favorite moment in researching this piece. Finding out Markakis led the league in WAR once was like finding out Cesar Tovar was actually better than Yastrzemski in 1967. Suffice it to say, the voters did not notice. Not only did Markakis miss out on the top 10 in MVP balloting, he didn’t get a single vote from anyone.

Now that we’ve seen the WAR leaders who didn’t get much love from the MVP voters, let’s look at the other side of the equation: MVPs who didn’t lead the league in any version of WAR. Since 2000, there have been 18 such cases. (Incidentally, that list includes the past two Twins MVPs and the past three Phillies MVPs, so in cases where voters dislike WAR they apparently love sarcasm and smoked and/or cured meats.)

That group of 18 also includes both MVPs in 2000 and nine of the 20 MVPs between 2000 and 2009. That inflection point would seem to line up with the historical proliferation of advanced stats. The Fire Joe Morgan era ended in 2008, and the early 2010s were to the anti-intellectual voter what the Hundred Days were to Napoleon. Those years included the Jack Morris Hall of Fame debate and the Miguel Cabrera vs. Mike Trout MVP campaigns of 2012 and 2013.

But recent history doesn’t line up with the WAR leaderboards either. From 2020 to 2022, four out of six MVPs — José Abreu, Freddie Freeman, Bryce Harper, and Paul Goldschmidt — failed to top any WAR leaderboard. Of course, all of those races were hard cases: 2020 because of the 60-game season, 2021 and 2022 because of a diffuse and pitcher-dominated WAR leaderboard.

Indeed, many of these supposedly WAR-deficient MVPs only missed out on league leadership by fractions of a win. So let’s narrow that group to MVPs who won in years where a different player led the league in at least two different WAR categories by at least a win in each case.

Miscarriages of WAR
Year | League | Actual MVP | bWAR | fWAR | WARP | WAR MVP | bWAR | fWAR | WARP
2002 | AL | Miguel Tejada | 5.7 | 4.5 | n/a | Alex Rodriguez | 8.8 | 10.0 | n/a
2012 | AL | Miguel Cabrera | 7.1 | 7.3 | 6.5 | Mike Trout | 10.5 | 10.1 | 6.5
2013 | AL | Miguel Cabrera | 7.5 | 8.6 | 7.0 | Mike Trout | 8.9 | 10.1 | 7.5
2015 | AL | Josh Donaldson | 7.1 | 8.7 | 7.0 | Mike Trout | 9.6 | 9.3 | 8.0
2017 | AL | Jose Altuve | 7.7 | 7.7 | 5.4 | Aaron Judge | 8.0 | 8.7 | 8.3
2018 | NL | Christian Yelich | 7.3 | 7.7 | 5.9 | Jacob deGrom | 9.9 | 9.0 | 7.0
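
The filter behind this table can be sketched in a few lines of Python. This is one reading of the rule, not necessarily how the list was built: it assumes the one-win margin is measured between the WAR leader and the actual MVP, using the numbers in the table above. The function name and data layout are hypothetical.

```python
# Rows copied from the "Miscarriages of WAR" table, as (bWAR, fWAR, WARP).
# None marks a season with no published WARP.
rows = {
    (2002, "AL"): {"mvp": (5.7, 4.5, None), "leader": (8.8, 10.0, None)},
    (2012, "AL"): {"mvp": (7.1, 7.3, 6.5), "leader": (10.5, 10.1, 6.5)},
    (2013, "AL"): {"mvp": (7.5, 8.6, 7.0), "leader": (8.9, 10.1, 7.5)},
    (2015, "AL"): {"mvp": (7.1, 8.7, 7.0), "leader": (9.6, 9.3, 8.0)},
    (2017, "AL"): {"mvp": (7.7, 7.7, 5.4), "leader": (8.0, 8.7, 8.3)},
    (2018, "NL"): {"mvp": (7.3, 7.7, 5.9), "leader": (9.9, 9.0, 7.0)},
}


def categories_won_by_a_win(mvp, leader, margin=1.0):
    """Count WAR flavors in which the leader beats the MVP by at least `margin`."""
    count = 0
    for mvp_war, leader_war in zip(mvp, leader):
        if mvp_war is None or leader_war is None:
            continue  # skip flavors with no data for that season
        # Round to one decimal so float noise (8.7 - 7.7) cannot hide a 1.0 gap.
        if round(leader_war - mvp_war, 1) >= margin:
            count += 1
    return count


qualifying = {
    key: categories_won_by_a_win(row["mvp"], row["leader"])
    for key, row in rows.items()
}

for (year, league), n in qualifying.items():
    print(year, league, n, n >= 2)
```

Under that reading, every row clears the bar in at least two flavors of WAR, and 2018 is the only one that clears it in all three.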

This list doesn’t include cases like the 2001 AL race or both races in 2006, where it’s clear that a purely WAR-based voter would not have given the MVP to the player who won in real life, but the top of the leaderboard is close enough that it was less clear which player should’ve won. (I had forgotten, for instance, how good Jason Giambi was in 2001 and Carlos Beltrán was in 2006.)

Every one of these cases has a strong narrative basis. Cabrera won the Triple Crown in 2012. Donaldson, Altuve, and Yelich turned up-and-coming teams into juggernauts. Nobody respects pitchers or A-Rod, and for some reason Trout became a proto-culture war volleyball in the waning days of the sport’s battle over empirics.

From 2000 to 2004, FanGraphs and Baseball Reference WAR had the same leader nine times out of 10 chances. That might have something to do with methodological convergence, but it probably has more to do with Barry Bonds and Alex Rodriguez just dragging everyone else in the league during that period. The voters — most of whom had probably never heard of WAR at the time — gave the consensus WAR leader the MVP five times out of those nine chances, and voted him second on two other occasions.

In the 19 seasons that followed, one player led the league in all three WARs on 15 occasions. The voters agreed 10 times.

WAR Triple Crown Winners, 2005-Present
Year | League | Name | bWAR | fWAR | WARP | MVP
2007 | AL | Alex Rodriguez | 9.4 | 9.6 | 6.3 | Rodriguez
2008 | NL | Albert Pujols | 9.2 | 8.7 | 9.2 | Pujols
2009 | NL | Albert Pujols | 9.7 | 8.4 | 10.4 | Pujols
2012 | NL | Buster Posey | 7.6 | 9.8 | 8.0 | Posey
2013 | AL | Mike Trout | 8.9 | 10.1 | 7.5 | Miguel Cabrera
2015 | NL | Bryce Harper | 9.7 | 9.3 | 8.0 | Harper
2015 | AL | Mike Trout | 9.6 | 9.3 | 8.0 | Josh Donaldson
2016 | AL | Mike Trout | 10.5 | 8.7 | 8.9 | Trout
2017 | AL | Aaron Judge | 8.0 | 8.7 | 8.3 | Jose Altuve
2018 | NL | Jacob deGrom | 9.9 | 9.0 | 7.0 | Christian Yelich
2019 | NL | Cody Bellinger | 8.6 | 7.9 | 6.9 | Bellinger
2020 | AL | Shane Bieber | 3.2 | 3.1 | 2.5 | José Abreu
2021 | AL | Shohei Ohtani | 8.9 | 8.0 | 10.2 | Ohtani
2022 | AL | Aaron Judge | 10.5 | 11.1 | 10.0 | Judge
2023 | AL | Shohei Ohtani | 9.9 | 8.9 | 9.3 | Ohtani

In three of the five cases where the voters disagreed (Trout in 2013 and 2015 and Judge in 2017), the consensus WAR leader came second. The other two involved pitchers, deGrom in 2018 and Shane Bieber in 2020. Neither won the MVP, but they are the two most recent full-time pitchers to finish in the top five.
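
The agreement count from the WAR Triple Crown table can be reproduced with a short tally. This is a minimal Python sketch: the rows are copied from the table, with MVP names expanded to full names so they can be compared directly (an editorial convenience, not part of the original data).

```python
# (year, league, consensus WAR leader, actual MVP), copied from the table.
triple_crowns = [
    (2007, "AL", "Alex Rodriguez", "Alex Rodriguez"),
    (2008, "NL", "Albert Pujols", "Albert Pujols"),
    (2009, "NL", "Albert Pujols", "Albert Pujols"),
    (2012, "NL", "Buster Posey", "Buster Posey"),
    (2013, "AL", "Mike Trout", "Miguel Cabrera"),
    (2015, "NL", "Bryce Harper", "Bryce Harper"),
    (2015, "AL", "Mike Trout", "Josh Donaldson"),
    (2016, "AL", "Mike Trout", "Mike Trout"),
    (2017, "AL", "Aaron Judge", "Jose Altuve"),
    (2018, "NL", "Jacob deGrom", "Christian Yelich"),
    (2019, "NL", "Cody Bellinger", "Cody Bellinger"),
    (2020, "AL", "Shane Bieber", "José Abreu"),
    (2021, "AL", "Shohei Ohtani", "Shohei Ohtani"),
    (2022, "AL", "Aaron Judge", "Aaron Judge"),
    (2023, "AL", "Shohei Ohtani", "Shohei Ohtani"),
]

# Split the seasons by whether the voters picked the consensus WAR leader.
agreed = [row for row in triple_crowns if row[2] == row[3]]
disagreed = [row for row in triple_crowns if row[2] != row[3]]

print(len(triple_crowns), len(agreed), len(disagreed))  # 15 10 5
for year, league, leader, mvp in disagreed:
    print(year, league, f"{leader} led all three WARs; {mvp} won")
```

That is a two-thirds hit rate when all three sites agree, and every one of the five misses is either a Trout or Judge second-place finish or a pitcher.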

Has WAR turned MVP voting into a leaderboard-reading exercise, as critics predicted? Not really. Thanks to the three publishers’ subtle differences in methodology, I’m not sure that it can. When there is consensus, the voters usually follow suit (again, unless there’s a pitcher involved). But that was basically the case 20 years ago, when on-base percentage was state of the art.

If MVP voting has become predictable in the 2020s, and if it’s the result of pressures and innovations brought on by sabermetrics and online media, I don’t think it’s because everyone is just blindly following WAR. We’re all getting better information now, and people with clubhouse access and award votes are better equipped to use that information. Whether that leads to conformity is of secondary importance to the quality of the analysis.

Even in the 1960s, there was a risk of being brigaded for expressing an unpopular opinion; just ask the ill-fated Max Nichols. And fear of the dogpile has only grown since then. It’s an incentive not to disagree with the consensus, sure, but it’s an even bigger incentive to get your facts straight. Surely nobody would place a league-average utilityman over an 11-WAR slugger on an MVP ballot in 2024. (With that said, nothing would make me happier than finding out, in a week’s time, that someone voted for Matt Vierling over Judge for AL MVP. I will throw an actual party if that happens, and you’re all invited.)

So is it that voters no longer have the courage of their convictions, or is it just that they have better convictions these days?





Michael is a writer at FanGraphs. Previously, he was a staff writer at The Ringer and D1Baseball, and his work has appeared at Grantland, Baseball Prospectus, The Atlantic, ESPN.com, and various ill-remembered Phillies blogs. Follow him on Twitter, if you must, @MichaelBaumann.

75 Comments
cashgod27 (Member since 2024)
1 month ago

I will defend the Yelich over deGrom vote. Was deGrom the better player that season? Yes. Was he the more valuable player? I disagree. The Mets had a better winning percentage when deGrom didn’t pitch than when he did. That’s the furthest thing from deGrom’s fault, and it’s not fair. But I don’t think the “value” of the “Most Valuable Player” should be completely hypothetical as it would be with 2018 deGrom.

Moate (Member since 2022)
1 month ago
Reply to cashgod27

So your argument here is “pitcher wins are an important statistic”? That’s…weird, especially in an era where bullpen usage is higher than in bygone eras, so starters have less direct control of the 9 definite innings.

cashgod27 (Member since 2024)
1 month ago
Reply to  Moate

I think pitcher wins are the dumbest stat alive, I just think it’s understandable to say a guy whose team lost most of the games he played in wasn’t the most valuable player in the league, even if he was the “best”.

mikejunt (Member)
1 month ago
Reply to cashgod27

That his team famously lost a lot of games where he gave up 1 run over 7 innings is not an argument against his value; it’s just an indictment of the rest of his team. And that was the story of 2018 Jacob DeGrom.

In your world, I guess, DeGrom would only have been valuable if he’d matched Orel Hershiser’s consecutive scoreless innings streak, because that’s what he would have had to do for the Mets to win more of the games he pitched – they simply scored 0, 1 or 2 runs in a huge percentage of his games, and won some of them anyway because he was so dominant.

That’s not a fault of Jacob DeGrom.

NATS Fan (Member since 2018)
1 month ago
Reply to  mikejunt

The games were boring and the bats fell asleep.

dl80 (Member since 2020)
1 month ago
Reply to  cashgod27

I am of the opinion that MVP basically does mean “best.” That’s how I would vote, anyway.

cowdisciple (Member since 2016)
1 month ago
Reply to  cashgod27

How many of those games would they have lost with someone else pitching?

Dmjn53
1 month ago
Reply to  cashgod27

“Alright cashgod27 thanks for the call. Next up on WFAN is Jimmy from Staten Island, who thinks Juan Soto walks too much”

Matt (Member since 2020)
1 month ago
Reply to Dmjn53

Say less Poindexter. Go upstairs and make your mom dinner. Have her clean your pocket protector. Talk to a girl. Are you not old enough to be familiar with the standard response to your ain’t been clever since 1999 insults?

dl80 (Member since 2020)
1 month ago
Reply to  Matt

the standard response to your ain’t been clever

?

Oh, Beepy. (Member since 2024)
1 month ago
Reply to  Matt

Sometimes I wonder how people like you even end up as FG members to begin with. This is one of the most pro-nerd spaces on the internet and yet you pay to be a member here and rail against nerds.

Thanks for funding the site man, please don’t feel the need to comment while you’re here.

tung_twista
1 month ago
Reply to  cashgod27

I understand why people are downvoting this, but it is really fascinating.
And no, this is not about pitcher wins.
Mets won 44% of deGrom starts in 2018 when he averaged 6.8 innings with 1.99 RA9 and 1.70 ERA.
They also won 48% of non-deGrom starts in 2018 when the starting pitchers averaged 5.4 innings with 4.37 RA9 and 4.12 ERA.
Just looking at the numbers, it would have been surprising if the Mets had “only” 4% higher win probability in deGrom starts, but apparently, 32 games is still a small enough sample size for these kinds of wacky results to happen.

mikejunt (Member)
1 month ago
Reply to tung_twista

This isn’t a surprise; players and teams routinely put up fluky results over whole seasons.

dl80 (Member since 2020)
1 month ago
Reply to  tung_twista

It was probably a fluke, but it’s super weird. In 2016, 2018, and 2019, DeGrom was well below the Mets’ overall run support.

2016: Mets 4.2, DeGrom 3.5
2017: Mets 4.6, DeGrom 5.1
2018: Mets 4.2, DeGrom 3.5
2019: Mets 4.9, DeGrom 4.1

The years before and after, he was average or even above average in terms of run support, but it almost seems like something happened those years that made his team less likely to score runs for him.

NATS Fan (Member since 2018)
1 month ago
Reply to dl80

He faced the ace of the other team most of the time.

Phil (Member since 2016)
1 month ago
Reply to  NATS Fan

In his first few starts, perhaps. But after a short while most teams will have had a different number of games played due to off days, rain outs, etc. And I don’t think teams will then try to line up their ace against the opposition’s ace in the regular season.

adam30
1 month ago
Reply to  cashgod27

How many games did they win with him starting? Him being the best pitcher on the planet would mean that their record would have been worse had any other pitcher been on the mound on those days he pitched. Any pitcher not on deGrom’s level would theoretically mean even fewer wins, more stress on the bullpen for subsequent games, etc. Hence, value.

Chip Locke (Member since 2016)
1 month ago
Reply to  cashgod27

Most people replying to this are not actually considering the argument. The argument is, as I understand it, that the Mets could have replaced Degrom with a completely average starting pitcher and their record would not have changed very much. There is nothing Degrom could do to change that, but it does mean his performance is less meaningful from a W-L perspective.

Put another, weirder, way (couple glasses of wine deep at this point): if I have two bottles of wine, and one is better than the other but my fiancée pours out half of the better bottle, that less good bottle is now more valuable to me.

It’s an argument you can disagree with, but it is way more sound than commenters here are making it out to be. Prof. Cashgod (I assume it’s a teacher’s surname) isn’t saying Degrom “wasn’t valuable”, they are saying his value gets diminished when his team sucks.