Why Our Pitcher WAR Uses FIP
by Dave Cameron
October 1, 2010

Yesterday, respected scribe Joe Posnanski put out the following message on his Twitter account:

“Fangraphs WAR has Cliff Lee at 6.6, best in AL. Baseball Ref WAR has Cliff Lee at 4.2, not even in Top 10. We need a summit.”

Since Baseball Reference implemented Sean Smith’s version of Wins Above Replacement, people have been pointing out the differences between the two systems. While they use the same basic framework, they have different inputs, which naturally lead to different results. For position players, the defensive system is the main driver of the discrepancies, as we use UZR while Sean uses Total Zone. For pitchers, however, it’s not just a different system, but a fundamental difference in what is being measured, and this is what drives the big gaps in WAR for pitchers like Lee.

Our version of pitcher WAR is essentially based on FIP, meaning that a pitcher is judged by his walk rate, strikeout rate, and home run rate (and, of course, the quantity of innings that he throws and the role in which he throws them). Sean takes a pitcher’s actual runs allowed, then makes adjustments to try to compensate for the defense behind him. The two systems might have the same goal, but they’re measuring two different things.

Over the last few months, I’ve seen plenty of comments about how some people don’t like our FIP-based version of pitcher WAR for various reasons, so today I thought I’d explain the thought process of why we decided to build it like we did.

There is essentially one big problem that anyone evaluating pitchers has to deal with: how do you separate responsibility for hits allowed? There are usually four or five variables at work on any ball that is put in play – the quality of the pitch, the quality of the swing, the quality of the defensive play, the effects of the ballpark the game is being played in, and sometimes weather.
All of these factors influence whether that ball ends up as a hit or an out, and, obviously, they don’t all have to do with the pitcher. So, in creating a statistic that attempts to isolate just the pitcher’s contribution, you have to figure out how you want to deal with the other factors.

Pretty much everyone just assumes that the quality of the opposing hitter essentially evens out over the course of a season, and it likely does most of the time, at least enough to where it doesn’t make a huge difference whether that is factored in or not.

The two environmental factors, park and weather, are usually lumped together in one estimate of how a park plays, which is part of the adjustment in pitcher WAR for both our version and Sean Smith’s version. I have some issues with using a single park factor for every player, and I think before too long we’ll have a better way of adjusting for those things, but that’s another post entirely.

The last issue – quality of defensive contribution – is where we do something quite different. When we sat down and talked about how to handle this issue, I essentially realized that there is no right way to do this, based on the statistics we currently have available. There are compromises that have to be made, as we simply don’t have the tools available to really allow us to correctly assign responsibility between a pitcher, hitter, or defender on each hit or out in play. So, no matter what we did, there’d be a problem, and we’d just have to acknowledge that issue.

In the end, we had to choose between two different methods – assuming that the pitcher had no responsibility for the outcome of a ball in play, or attempting to approximate the amount of time that the result was due to the pitcher or the fielder. Ideally, we’d be able to do the latter – which is how Sean approaches it – but I just don’t think we currently have the tools available to make an accurate enough judgment on how to apportion that responsibility.
The why for that last paragraph deserves its own post, so I’ll have a follow-up here on the site in a few hours, where we’ll go through the problems with using both defense-adjusted RA and FIP for pitcher WAR, and show why I think that the FIP model is preferable for now.
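
For readers who want to see what being judged by walk rate, strikeout rate, and home run rate looks like in practice, here is a minimal sketch of the commonly published FIP formula: FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + constant. The function name, the stat line, and the 3.10 constant are illustrative assumptions, not the exact inputs behind the WAR figures discussed above. The constant is normally reset each season so that league FIP lines up with league ERA, and turning FIP into WAR involves further adjustments (park, innings, role) that are not shown here.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching, in its commonly published form.

    Only outcomes the defense cannot touch are counted: home runs,
    walks, hit batters, and strikeouts. Balls in play are ignored.

    `innings` must be a true decimal (200 1/3 innings is 200.333...,
    not the box-score shorthand 200.1).
    `constant` is an assumed placeholder; the real value varies by season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + constant


if __name__ == "__main__":
    # Hypothetical stat line, not any real pitcher's season.
    value = fip(home_runs=20, walks=50, hit_by_pitch=5,
                strikeouts=200, innings=200.0)
    print(f"FIP: {value:.3f}")
```

Notice that hits allowed on balls in play never appear in the calculation, which is exactly the "no responsibility for the outcome of a ball in play" assumption described above.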