As we announced yesterday, win values for pitchers are now available on the site. As before, we’re going to go through the process of explaining the calculations that lead to the values you see here on FanGraphs and lay the foundation for understanding what these win values represent.
To start with, let’s take a look at the main input that goes into the win value calculation – a pitcher’s FIP, or Fielding Independent Pitching, which calculates a pitcher’s responsibility for the runs he allows based on his walks, strikeouts, and home runs allowed. The FIP formula is (HR*13+(BB+HBP-IBB)*3-K*2)/IP, plus a league-specific factor that scales FIP to match league average ERA for a given season and league. For win value purposes, we modified the league-specific factor to scale FIP to RA instead of ERA.
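For readers who want to see the arithmetic, here is a minimal Python sketch of the formula above. The function names are ours, purely for illustration. The way the league factor is derived (league runs per nine innings minus the league's unscaled FIP component) is an assumption about how the scaling is done, not a description of the site's exact code. Innings are taken as true decimal innings, so 200 1/3 innings is 200.333, not 200.1.

```python
def fip_core(hr, bb, hbp, ibb, k, ip):
    """Unscaled FIP component: (HR*13 + (BB+HBP-IBB)*3 - K*2) / IP."""
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip


def league_factor(league_runs, league_ip, league_core):
    """Assumed scaling: the constant that makes league FIP equal the
    league's runs per nine innings. Pass earned runs to scale to ERA,
    or all runs allowed to scale to RA."""
    league_per_nine = 9 * league_runs / league_ip
    return league_per_nine - league_core


def fip(hr, bb, hbp, ibb, k, ip, factor):
    """FIP on the chosen scale: unscaled component plus league factor."""
    return fip_core(hr, bb, hbp, ibb, k, ip) + factor


if __name__ == "__main__":
    # Made-up league totals and pitcher line, for illustration only.
    lg_core = fip_core(hr=2500, bb=8000, hbp=800, ibb=500, k=16000, ip=21600.0)
    factor = league_factor(league_runs=11000, league_ip=21600.0, league_core=lg_core)
    print(round(fip(hr=20, bb=50, hbp=5, ibb=3, k=180, ip=200.0, factor=factor), 2))
```

Switching between the ERA scale and the RA scale only changes the runs total fed into the league factor; the pitcher's own walks, strikeouts, and home runs enter the calculation the same way in both cases.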
Why did we use FIP? I know this is a popular question, and it’s something I wrestled with myself. However, what I couldn’t get away from is that we wanted the context sensitivity for the position player and pitcher win values to be as close as possible. wRAA, the offensive input into Win Values for position players, is context-neutral – a hitter does not get credit for his situational performance, such as hitting well with runners in scoring position. Since we aren’t giving hitters credit for situational performance, we can’t give it to pitchers either, in order to maintain the same situation-neutral scale.
This is going to lead to some questions – we’re aware of that. Claiming that Javier Vazquez was a +5.2 win pitcher in 2006, when traditional metrics will tell you that he went 11-12 with a 4.84 ERA, is going to be a tough sell. We know.
However, the tangled web of responsibility for run prevention is not accurately unraveled by simply giving pitchers credit and blame for all earned runs and fielders credit and blame for all unearned runs. As most of you know, many of the variables that go into a pitcher’s ERA are ones the pitcher himself simply doesn’t have control over. We have to try to extract the pitcher’s responsibility from his team’s run prevention while he’s on the mound. Using ERA or RA simply adds too many non-pitcher factors into the equation, to the point that we’re no longer just evaluating the pitcher.
FIP removes defense from the equation by only looking at three factors that a pitcher has demonstrable control over – walks, strikeouts, and home runs allowed. By using FIP, we’re isolating the pitcher’s core abilities and evaluating him based on those skills. Now, we’re not claiming that FIP captures everything a pitcher is responsible for. It is not the perfect context-neutral pitcher run modeler – we know that. But when confronted with a choice of including way too many non-pitcher inputs or leaving out a few minor actual pitcher inputs, the latter was the better choice. You will get more accurate win values for a pitcher using FIP than you will using ERA or RA.
Getting back to Vazquez for a second – his 2006 FIP was a full run lower than his ERA. The driving forces behind his struggles were a .321 BABIP and a 65.8% LOB%. Most everyone would agree that we don’t want to penalize him for poor defense played behind him, but how do we untangle the responsibility for the lack of stranded runners? Vazquez was horrible with men on base in ’06, but most of that was BABIP related – a .343 BABIP with men on versus a .284 BABIP with the bases empty. If we’re going to say that he’s not responsible for his high batting average on balls in play, and the batting average on balls in play was responsible for the lack of runners stranded, then how do we remove the former but not the latter? This is what I mean by a tangled web of responsibility in terms of run prevention.
If you want to make the argument that the context-sensitive stuff, such as how often a pitcher leaves runners on base, should be included, then you also need to be prepared to fight for WPA/LI as the offensive metric of choice for hitter win values. And honestly, I won’t put up much of an argument – there’s a case to be made for context-sensitive win values as a useful metric, and I’d imagine there will be a day that those are publicly available too. But there’s a more compelling argument for context-neutral win values, which is what we’ve decided to present here. What most of us are interested in knowing is how well a player performed in helping his team win, regardless of the performance of his teammates. To answer that, we have to strip out as much context as we can.
Think of FIP as the pitcher version of wRAA. wRAA doesn’t include non-SB/CS baserunning or situational hitting. FIP doesn’t include batted ball data or situational pitching. Neither is perfect, but both give us the vast majority of the context-neutral picture.
That doesn’t mean that we’re set in our ways and that these win values will never be improved upon. If and when a new metric like tRA is proven to be significantly more effective in valuing pitchers (and I’m hopeful that it will be, given more data exploration on the topic), we won’t be standing here as guardians of the infallibility of FIP. We want to get to the truth, and do so as quickly and as accurately as possible. I will encourage you (especially those of you in the “tRA is awesome/FIP sucks” camp), though, to not let minor differences cause you to miss the fact that FIP and tRA lead to very similar results.
This afternoon, we’ll talk about replacement level for pitchers, how it differs for each league and role, and how we tackled the issue.
Dave is the Managing Editor of FanGraphs.