On Cy Youngs and Theoretical Pitcher WAR Models

by Dave Cameron

November 13, 2013

Here at FanGraphs, we have two different models of pitcher WAR: one based on FIP, and one based on runs allowed. These represent the extreme opposite ends of the viewpoints on how much credit or blame a pitcher should receive for events in which his teammates have some significant influence. If you go with strictly a FIP-based model, a pitcher is only judged on his walks, strikeouts, and home runs, and the events of hits on balls in play and the sequencing of when events happen are not considered as part of the evaluation.

If you go with the RA9-based model, then everything that happens while the pitcher is on the mound — and in some cases, what happens after they are removed for a relief pitcher — is considered the pitcher’s responsibility, and he’s given full credit or blame for what his teammates do while he’s pitching.

Both of these models are wrong. It is clear that pitchers have some influence over the rate at which their balls in play are turned into hits, and over the order in which the events they give up occur, but it is also evident that they are not solely responsible for those two things. The quality of defensive support behind a pitcher, and the timing of when the defense either bails out or screws over its teammate, have an impact on a pitcher’s runs allowed total. The truth of nearly every pitcher’s performance lies somewhere in between his FIP-based WAR and his RA9-based WAR.

The trick is that it’s not so easy to know exactly where on the spectrum that point lies, and it’s not the same point for every pitcher. Some pitchers hold down BABIP better than others. Some pitchers are better at pitching from the stretch than others. Some pitchers on good defensive teams didn’t actually get good defensive support behind them, and vice versa, and we can’t simply assume equal defensive support for every pitcher based on a team’s total aggregate UZR or DRS.

But, at the same time, we know that both FIP-WAR and RA9-WAR are giving either too much or too little credit to pitchers for their past results. Having both on the site gives us a nice ability to say that a player was likely worth between x and y wins, and you can essentially defend almost any number in between those two points. The question in evaluating pitchers essentially comes down to how much of the Fielding Dependent aspects of pitching you want to give the pitcher credit for.

So, with the Cy Young announcements coming this evening, I thought it would be useful to look at the entire spectrum of potential pitcher-WAR models, and how the candidates would rate based on a compromise between the two extremes. We already have the 100% FIP and 100% runs allowed models on the site, but what about all the WAR-models that would result from a compromise between the two, giving some but not full credit to pitchers for things measured by FDP? That’s what the tables below are going to show. First, the American League, because that’s the more interesting conversation.

The players in this table posted at least +6.0 WAR either by FIP-WAR or by RA9-WAR, meaning that they have some legitimate claim on Cy Young consideration depending on your perspective. The table is initially sorted by 100% FIP-WAR, but each column is sortable, so you can see how the leaders would stack up by the various hybrid models that contained some parts FIP and some parts RA9. The first number in the header represents what percentage weight the FIP-based WAR is being given in that calculation, so it moves in 10% increments, from left to right, from 100% FIP-WAR to 0% FIP-WAR. The results.

Name	FIP-WAR	90/10	80/20	70/30	60/40	50/50	40/60	30/70	20/80	10/90	RA9-WAR
Max Scherzer	6.4	6.4	6.4	6.3	6.3	6.3	6.3	6.3	6.2	6.2	6.2
Anibal Sanchez	6.2	6.2	6.2	6.1	6.1	6.1	6.1	6.1	6.0	6.0	6.0
Felix Hernandez	6.0	5.9	5.8	5.7	5.6	5.6	5.5	5.4	5.3	5.2	5.1
Yu Darvish	5.0	5.2	5.3	5.5	5.7	5.9	6.0	6.2	6.4	6.5	6.7
James Shields	4.5	4.7	4.8	5.0	5.1	5.3	5.4	5.6	5.7	5.9	6.0
Hisashi Iwakuma	4.2	4.5	4.7	5.0	5.2	5.5	5.8	6.0	6.3	6.5	6.8
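
For anyone who wants to reproduce the hybrid columns, here is a minimal Python sketch. It assumes each hybrid number is a straight weighted average of a pitcher's FIP-WAR and RA9-WAR, which is how the column headers are described; the names `blend_war` and `blend_row` are illustrative labels, not anything that exists on the site. Because the inputs below are the rounded endpoints from the table, a result can land 0.1 away from the published column in places.

```python
def blend_war(fip_war, ra9_war, fip_weight):
    """Weighted average of FIP-WAR and RA9-WAR.

    fip_weight is the share given to FIP-WAR (1.0 = pure FIP, 0.0 = pure RA9).
    Assumption: the hybrid columns are simple linear blends of the two figures.
    """
    if not 0.0 <= fip_weight <= 1.0:
        raise ValueError("fip_weight must be between 0 and 1")
    return fip_weight * fip_war + (1.0 - fip_weight) * ra9_war


def blend_row(fip_war, ra9_war):
    """All 11 models, from 100/0 down to 0/100 in 10-point steps."""
    return [blend_war(fip_war, ra9_war, w / 10) for w in range(10, -1, -1)]


# Yu Darvish, using the rounded endpoints from the table above.
darvish = blend_row(5.0, 6.7)
print([f"{value:.2f}" for value in darvish])
```

The same two-argument call works for any pitcher in either league; only the endpoints change.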

Because of their excellence in both FIP and runs allowed, the two Tigers grade out as at least +6 WAR pitchers by every combination model, including the two polar extremes. No matter what amount of credit you want to give to a pitcher for his FDP, Scherzer and Sanchez will essentially come out as legitimate Cy Young candidates.

Darvish and Iwakuma become legitimate candidates if you believe that a pitcher is responsible for a large majority of the fielding dependent outcomes, as Darvish climbs above +6 WAR in every model that weights runs allowed as at least 60% of the calculation; Iwakuma gets over +6 WAR in the models where runs allowed is at least 70% of the calculation. In the 40/60 and 30/70 models, however, Scherzer still comes out on top.

In the 20/80 model, Darvish actually takes the lead, but the differences are so tiny that any reasonable conclusion would be that it’s a three-way tie. Darvish and Iwakuma essentially tie in the 10/90 and 0/100 models, with Scherzer finally sliding out of the top position in just those two calculations.

Across the spectrum of potential pitcher WAR models, Scherzer is either the leader or tied for the lead in 9 of the 11 calculations. The only way to give the 2013 AL Cy Young Award to someone besides Scherzer is to lean almost entirely on runs allowed, and believe that absolutely everything that happens when a pitcher is on the mound is the responsibility of the pitcher. I find that position pretty hard to defend, personally, and given that the pitchers with a very small advantage on the right side of the spectrum have a much larger deficit on the left side, I think the reasonable conclusion is that Scherzer was the American League’s best pitcher in 2013.
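
The column-by-column tally can be checked with a short Python sketch. The numbers are copied from the AL table; the helper `leaders_by_model` and its `tolerance` argument are hypothetical, added here only to make the idea of a "tie" explicit, and the 0.2 threshold in the usage note is an arbitrary choice rather than anything the models prescribe.

```python
MODELS = ["100/0", "90/10", "80/20", "70/30", "60/40", "50/50",
          "40/60", "30/70", "20/80", "10/90", "0/100"]

# Rounded values from the AL table, FIP-WAR on the left, RA9-WAR on the right.
AL = {
    "Max Scherzer":    [6.4, 6.4, 6.4, 6.3, 6.3, 6.3, 6.3, 6.3, 6.2, 6.2, 6.2],
    "Anibal Sanchez":  [6.2, 6.2, 6.2, 6.1, 6.1, 6.1, 6.1, 6.1, 6.0, 6.0, 6.0],
    "Felix Hernandez": [6.0, 5.9, 5.8, 5.7, 5.6, 5.6, 5.5, 5.4, 5.3, 5.2, 5.1],
    "Yu Darvish":      [5.0, 5.2, 5.3, 5.5, 5.7, 5.9, 6.0, 6.2, 6.4, 6.5, 6.7],
    "James Shields":   [4.5, 4.7, 4.8, 5.0, 5.1, 5.3, 5.4, 5.6, 5.7, 5.9, 6.0],
    "Hisashi Iwakuma": [4.2, 4.5, 4.7, 5.0, 5.2, 5.5, 5.8, 6.0, 6.3, 6.5, 6.8],
}


def leaders_by_model(table, tolerance=0.0):
    """For each model, list every pitcher within `tolerance` WAR of the top value."""
    n_models = len(next(iter(table.values())))
    leaders = []
    for i in range(n_models):
        best = max(row[i] for row in table.values())
        # The small epsilon guards against float noise in the subtraction.
        leaders.append([name for name, row in table.items()
                        if best - row[i] <= tolerance + 1e-9])
    return leaders


for label, names in zip(MODELS, leaders_by_model(AL)):
    print(f"{label:>6}: {', '.join(names)}")
```

Calling `leaders_by_model(AL, tolerance=0.2)` instead treats anything within 0.2 WAR of the top as tied, which is one way to capture how close the 20/80 column really is.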

Now, for the National League, which isn’t quite as compelling.

Name	FIP-WAR	90/10	80/20	70/30	60/40	50/50	40/60	30/70	20/80	10/90	RA9-WAR
Clayton Kershaw	6.5	6.7	7.0	7.2	7.4	7.7	7.9	8.1	8.3	8.6	8.8
Adam Wainwright	6.2	6.1	6.0	6.0	5.9	5.8	5.7	5.6	5.6	5.5	5.4
Matt Harvey	6.1	6.1	6.1	6.1	6.1	6.1	6.0	6.0	6.0	6.0	6.0

If you absolve the pitchers of all their non-BB/K/HR events and ignore the timing of when events occurred, Kershaw is in a pretty tight race with Wainwright and Harvey. If you give any credit to the pitcher for his FDP events, however, Kershaw starts to pull away easily, taking roughly a +1 WAR lead as early as the 80/20 calculation. By the time you get to 100% runs allowed, Kershaw is nearly +3 WAR ahead of the next closest NL starter. As much as I love Adam Wainwright, there’s really no case for anyone besides Kershaw here. Any calculation that includes both FIP and runs allowed will have Kershaw running circles around the rest of the field.
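
To see how quickly the gap opens up, the sketch below computes Kershaw's margin over the best of the other two pitchers in each column. It uses only the rounded numbers in the NL table, so a margin can be off by 0.1 from what unrounded figures would give; `margin_over_field` is a hypothetical helper name.

```python
# Rounded values from the NL table, FIP-WAR on the left, RA9-WAR on the right.
NL = {
    "Clayton Kershaw": [6.5, 6.7, 7.0, 7.2, 7.4, 7.7, 7.9, 8.1, 8.3, 8.6, 8.8],
    "Adam Wainwright": [6.2, 6.1, 6.0, 6.0, 5.9, 5.8, 5.7, 5.6, 5.6, 5.5, 5.4],
    "Matt Harvey":     [6.1, 6.1, 6.1, 6.1, 6.1, 6.1, 6.0, 6.0, 6.0, 6.0, 6.0],
}


def margin_over_field(table, name):
    """Gap between `name` and the best other pitcher, one entry per model."""
    margins = []
    for i, value in enumerate(table[name]):
        best_other = max(row[i] for other, row in table.items() if other != name)
        # Round to one decimal to match the precision of the table.
        margins.append(round(value - best_other, 1))
    return margins


print(margin_over_field(NL, "Clayton Kershaw"))
```

The margin grows in every step from left to right, which is the pattern described above: the more weight runs allowed gets, the further ahead Kershaw is.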

I think eventually we’ll get to a point where we have enough data to support a model of pitcher WAR that includes both FIP and FDP events, and my guess (and right now, it’s only a guess) is that the data will suggest something in the range of the 70/30 model, giving pitchers some credit for FDP but being closer to FIP than RA9. If I had to put my money down on one single calculation that captured all pitchers’ past value most accurately (and I couldn’t switch between models for different types of pitchers), I’d probably pick the 70/30 model from the tables above.

And that’s why I don’t have any problem with Scherzer or Kershaw winning today. They’re both deserving candidates, no matter how you try to isolate a pitcher’s contributions to run prevention.

38 Comments

ASURay

11 years ago

Both of these models are wrong.

All models are wrong, so you’re okay 🙂

abreutime

Reply to ASURay

But some are useful!

Kate Upton

Reply to abreutime

I’m NEVER wrong. And ALWAYS useful. 🙂

Owen

Reply to Kate Upton

This is wrong. But maybe useful? http://nyc.barstoolsports.com/m/sports-page/kate-upton-yankee-fan/
