WAR: It Works

We use Wins Above Replacement around here a lot, as one of the focuses of the site is to accurately quantify the value each player produces, and WAR is the best tool we have to do that. However, it faces a decent amount of skepticism from people who don’t trust its various components: they don’t like the numbers that UZR spits out for defense, they don’t believe in replacement level, or they believe that pitchers do have control over their BABIP rates.

So, the question is, does WAR work? If it’s designed well, there should be a pretty strong correlation between a team’s total WAR and their actual record. Fans of WAR rejoice – there is.

For 2009, the correlation between a team’s projected record based on their WAR total and their actual record was .83. This is a robust number, especially considering that WAR is almost completely context-independent and currently has some notable omissions: baserunning (besides SB/CS, which are included in wOBA) and catcher defense are both ignored in the calculations. We also don’t have an adjustment for differences in leagues, so we’re not accounting for the fact that the AL is better than the NL.
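For anyone who wants to run this kind of check themselves, here is a minimal sketch of the calculation. It assumes that a team's projected wins are a replacement-level baseline plus its total WAR, and that the correlation is a plain Pearson coefficient; the post doesn't spell out either detail. The function names, the baseline of 47 wins, and the four team rows are all made up for illustration and are not 2009 data.

```python
from math import sqrt

def projected_wins(team_war, replacement_wins):
    """Projected wins = replacement-level baseline + team WAR (assumed model)."""
    return replacement_wins + team_war

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical inputs: NOT real team totals, and 47 is only a placeholder baseline.
team_war = [50.0, 40.0, 30.0, 20.0]
actual_wins = [95, 88, 80, 66]
projected = [projected_wins(w, replacement_wins=47.0) for w in team_war]

r = pearson_r(projected, actual_wins)
print(f"correlation between projected and actual wins: {r:.2f}")
```

Note that adding the same baseline to every team doesn't change the correlation, so the choice of replacement level only matters for the size of the gaps, not for the correlation itself.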

Despite these imperfections, WAR still performs extremely well. One standard deviation of the difference between WAR-projected wins and actual wins is 6.4 wins, and every single team is within two standard deviations. Only four teams were more than 10 wins away from their WAR-projected total, with Tampa Bay ending up the furthest from our expectation (96.6 projected wins, 84 actual wins), and 18 of the 30 teams were within six wins of their WAR-projected total.
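The Tampa Bay figure is easy to check by hand: the gap is 96.6 - 84 = 12.6 wins, which is about 1.97 standard deviations at 6.4 wins apiece, so even the biggest miss sits just inside the two-standard-deviation line. A rough sketch of that check, plus a small summary helper, follows. It assumes the spread is a population standard deviation of (actual minus projected); the post doesn't say exactly how the 6.4 was computed, and the helper names are hypothetical.

```python
from statistics import pstdev

def error_summary(projected, actual, within=6.0):
    """Summarize the gaps between projected and actual win totals."""
    diffs = [a - p for p, a in zip(projected, actual)]
    return {
        "sd": pstdev(diffs),                             # spread of the gaps (assumed definition)
        "within": sum(abs(d) <= within for d in diffs),  # teams inside the cutoff
        "largest_gap": max(diffs, key=abs),              # signed gap of the biggest miss
    }

def gap_in_sds(projected, actual, sd):
    """Size of one team's gap, measured in standard deviations."""
    return abs(actual - projected) / sd

# Figures quoted in the post: 96.6 projected wins, 84 actual wins, SD of 6.4 wins.
tampa = gap_in_sds(96.6, 84, 6.4)
print(f"Tampa Bay gap: {tampa:.2f} standard deviations")
```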

For comparison, the correlation between Pythagorean expected record and actual record is .91, and pythag includes some aspects of context (performance with men on base, for instance) that impact runs scored and allowed, so we would expect it to predict actual record somewhat better than a context-independent metric like WAR. The fact that WAR is even close to somewhat-context-included pythag is impressive in its own right.
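For reference, the Pythagorean expectation estimates winning percentage from runs scored and runs allowed. Here is a minimal sketch using the classic exponent of 2; the post doesn't say which exponent was used for the .91 figure, so treat the exponent, the 162-game default, and the function name as assumptions.

```python
def pythag_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from the Pythagorean formula RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)

# A team that scores exactly as many runs as it allows is expected to go .500.
print(pythag_wins(800, 800))  # 81.0
```

Because the inputs are actual runs scored and allowed, anything that changes how many runs a team really plated, such as timely hitting, is already baked in, which is the context advantage described above.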

WAR isn’t perfect. But given the known limitations and the variations in how contextual situations impact final record, it does an awfully impressive job of projecting wins and losses.

Dave is the Managing Editor of FanGraphs.

JoeR43
16 years ago

Try taking extra data (from 2002 on) and plot it out. Still, anything with > 80% correlation is usually good to go.