New SIERA, Part Three (of Five): Differences Between xFIPs and SIERAs

by Matt Swartz

July 20, 2011

Who’s up and who’s down? Which pitcher will improve upon a stellar season, and who is headed for the trash heap? SIERA and xFIP attempt to answer these questions from year to year, but they’re not totally interchangeable metrics. Why? The biggest difference is the way each uses strikeout rates.

That’s not to say that the two statistics don’t generally say the same thing. In fact, they’re much more reliable – and calculated much differently – than traditional ERA. For a quick example, take a look at the top 10 pitchers in all three metrics during the past four seasons.

2010

Rk	Pitcher	SIERA	Pitcher	xFIP	Pitcher	ERA
1	Roy Halladay	2.90	Roy Halladay	2.80	Clay Buchholz	2.25
2	Francisco Liriano	3.02	Francisco Liriano	2.95	Josh Johnson	2.28
3	Cliff Lee	3.10	Adam Wainwright	3.02	Felix Hernandez	2.37
4	Josh Johnson	3.10	Josh Johnson	3.02	Roy Halladay	2.41
5	Adam Wainwright	3.13	Cliff Lee	3.06	Adam Wainwright	2.49
6	Jered Weaver	3.15	Tim Lincecum	3.09	Ubaldo Jimenez	2.65
7	Mat Latos	3.16	Felix Hernandez	3.14	Jaime Garcia	2.78
8	Felix Hernandez	3.20	Jon Lester	3.18	Roy Oswalt	2.79
9	Tim Lincecum	3.21	Mat Latos	3.21	David Price	2.79
10	Jon Lester	3.25	Cole Hamels	3.28	Tim Hudson	2.88

2009

Rk	Pitcher	SIERA	Pitcher	xFIP	Pitcher	ERA
1	Javier Vazquez	2.86	Javier Vazquez	2.77	Zack Greinke	2.12
2	Tim Lincecum	2.92	Tim Lincecum	2.83	Chris Carpenter	2.31
3	Zack Greinke	3.04	Roy Halladay	3.00	Tim Lincecum	2.47
4	Justin Verlander	3.06	Dan Haren	3.02	Felix Hernandez	2.59
5	Dan Haren	3.07	Jon Lester	3.09	Jair Jurrjens	2.64
6	Roy Halladay	3.12	Zack Greinke	3.09	Adam Wainwright	2.70
7	Jon Lester	3.15	Justin Verlander	3.20	Roy Halladay	2.80
8	Ricky Nolasco	3.23	Ricky Nolasco	3.23	Clayton Kershaw	2.85
9	Josh Beckett	3.40	Josh Beckett	3.30	Matt Cain	2.89
10	Chris Carpenter	3.43	Adam Wainwright	3.32	J.A. Happ	2.89

2008

Rk	Pitcher	SIERA	Pitcher	xFIP	Pitcher	ERA
1	Roy Halladay	3.13	CC Sabathia	3.06	Cliff Lee	2.58
2	CC Sabathia	3.16	Roy Halladay	3.11	Johan Santana	2.60
3	Tim Lincecum	3.18	Tim Lincecum	3.13	Tim Lincecum	2.61
4	Dan Haren	3.21	Dan Haren	3.16	CC Sabathia	2.74
5	Josh Beckett	3.21	Josh Beckett	3.19	Roy Halladay	2.79
6	Brandon Webb	3.23	Derek Lowe	3.32	Daisuke Matsuzaka	2.80
7	Ervin Santana	3.32	Brandon Webb	3.33	Ryan Dempster	2.82
8	Derek Lowe	3.37	Mike Mussina	3.43	Cole Hamels	3.05
9	Randy Johnson	3.55	Ervin Santana	3.48	Jon Lester	3.10
10	Mike Mussina	3.55	A.J. Burnett	3.51	Jake Peavy	3.11

2007

Rk	Pitcher	SIERA	Pitcher	xFIP	Pitcher	ERA
1	Erik Bedard	2.95	Erik Bedard	2.90	Jake Peavy	2.54
2	Johan Santana	3.20	Felix Hernandez	3.27	Brandon Webb	3.01
3	Josh Beckett	3.32	Johan Santana	3.30	John Lackey	3.01
4	Jake Peavy	3.33	Brandon Webb	3.30	Brad Penny	3.03
5	Felix Hernandez	3.35	Josh Beckett	3.31	Fausto Carmona	3.06
6	Brandon Webb	3.43	John Smoltz	3.33	Dan Haren	3.07
7	Derek Lowe	3.45	Jake Peavy	3.34	John Smoltz	3.11
8	Javier Vazquez	3.45	Cole Hamels	3.40	Chris Young	3.12
9	John Smoltz	3.46	CC Sabathia	3.40	Erik Bedard	3.16
10	Cole Hamels	3.46	Derek Lowe	3.40	Roy Oswalt	3.18

Pitchers atop the leaderboards both for SIERA and for xFIP usually stick there. It’s not a coincidence. The best pitchers usually have the best stuff. On the other hand, the ERA league leaders are almost always a mix of luck and good performances, meaning it’s difficult for pitchers to replicate those successes from year to year.

Three pitchers appear three times on both the SIERA and xFIP leaderboards from 2007 to 2010 (Roy Halladay, Josh Beckett and Tim Lincecum). Out of that trio, only Halladay appears three times among the ERA leaders. In the past four seasons, there have been 13 pitchers with multiple years in the top 10 in SIERA and 14 pitchers with multiple years in the top 10 in xFIP. But only six pitchers have managed to crack the top 10 in ERA more than once in the past four years.

From this, it’s obvious that xFIP and SIERA generally agree — with a few exceptions. Let’s have a look at those pitchers so it’s clearer when — and why — the two metrics differ.

Among the 88, 78 and 92 pitchers with enough innings to qualify for the ERA title in 2008, 2009 and 2010, respectively, there were several whose SIERA and xFIP rankings were at least 10 spots apart. Here’s the 2010 list.

2010:

Pitcher	SIERA	xFIP	SIERA Rank (of 92)	xFIP Rank (of 92)
Clay Buchholz	4.29	4.07	67	56
Jaime Garcia	3.81	3.87	37	22
Tommy Hanson	3.74	3.87	30	42
Colby Lewis	3.58	3.74	19	35
Ian Kennedy	3.99	4.10	47	59
Shaun Marcum	3.62	3.71	20	32
Jeremy Guthrie	4.51	4.60	75	86
Phil Hughes	4.05	4.13	48	60
Brian Matusz	4.21	4.31	59	73
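
The filter behind a list like this one can be sketched in a few lines of Python. This is only an illustration: the four rows are copied from the 2010 table, the 10-spot threshold comes from the text, and the names `rank_gaps` and `MIN_GAP` are made up for the example.

```python
# Rows copied from the 2010 table: (pitcher, SIERA rank, xFIP rank), out of 92.
ranks_2010 = [
    ("Clay Buchholz", 67, 56),
    ("Tommy Hanson", 30, 42),
    ("Colby Lewis", 19, 35),
    ("Jeremy Guthrie", 75, 86),
]

MIN_GAP = 10  # threshold described in the text; the name is hypothetical


def rank_gaps(rows, min_gap=MIN_GAP):
    """Return (pitcher, gap, preferred_by) for ranks at least min_gap apart."""
    flagged = []
    for pitcher, siera_rank, xfip_rank in rows:
        gap = abs(siera_rank - xfip_rank)
        if gap >= min_gap:
            # A smaller rank number means the metric rates the pitcher higher.
            preferred_by = "SIERA" if siera_rank < xfip_rank else "xFIP"
            flagged.append((pitcher, gap, preferred_by))
    return flagged


for pitcher, gap, preferred_by in rank_gaps(ranks_2010):
    print(f"{pitcher}: {gap} spots apart, rated higher by {preferred_by}")
```

Run on these rows, the sketch flags Buchholz as the one pitcher xFIP likes better, while Hanson, Lewis and Guthrie all rank higher by SIERA.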

Pitchers with good, but not great, groundball rates fare much better with xFIP than they do with SIERA. That’s why SIERA was more bearish on Clay Buchholz than xFIP. Because of that — and also because of Buchholz’s modest strikeout rate — SIERA assumed the Red Sox pitcher was hittable. SIERA also wasn’t as excited about Jaime Garcia as xFIP was after 2010, but Garcia improved his walk rate and maintained a low ERA.

SIERA saw Tommy Hanson’s high strikeout rate in 2010 and knew that his low BABIP wasn’t a fluke. And neither SIERA nor xFIP had much confidence in Jeremy Guthrie’s chance to reproduce his 3.83 ERA in 2010. SIERA figured he’d at least control damage by giving up less-costly walks.

Only a few pitchers in 2009 had large spreads between their xFIP and SIERA rankings.

2009:

Pitcher	SIERA	xFIP	SIERA Rank (of 78)	xFIP Rank (of 78)
Ted Lilly	3.78	3.90	21	33
Mark Buehrle	4.62	4.37	66	55
Jered Weaver	4.25	4.40	45	55
Zach Duke	4.58	4.25	60	49
Johnny Cueto	4.34	4.51	51	66

The pitchers whom xFIP is particularly cautious about are guys who give up a lot of fly balls despite whiffing lots of batters. On the other hand, SIERA tends to favor exactly those types of pitchers.

SIERA likes Ted Lilly because it thinks he can limit his BABIP and HR/FB, just like Jered Weaver has done. Both these pitchers generate weak contact because of their high strikeout rates — and both of their high fly ball rates mean they generate shorter, more catchable balls. Their SIERAs are regularly more attractive than their BABIPs.

Zach Duke’s low 2009 strikeout rate looked like a recipe for disaster to SIERA, and that’s exactly what happened in 2010. Since strikeouts mean more to SIERA than other outs, SIERA puts more distance between guys like Lilly and Duke than other estimators.

Consider this: If you subtract 20 Ks from Lilly’s 2009, his SIERA would shoot up from 3.78 to 4.09, a 31-point difference. On the other hand, his xFIP would go up from 3.95 to 4.22, a 27-point difference. Adding 20 strikeouts to Zach Duke’s 2009 would help his SIERA improve from 4.58 to 4.32, a 26-point difference, while his xFIP would fall 21 points to 4.04. These discrepancies are not massive, but they add up.
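
To put those swings on a per-strikeout basis, here is a rough sketch of the arithmetic. The before-and-after values are the ones quoted in the paragraph above (Duke’s 4.58 is treated as his SIERA, matching the 2009 table), and the helper name `points` is invented for the example. Nothing here recomputes SIERA or xFIP; it only divides the quoted changes by 20.

```python
STRIKEOUTS_MOVED = 20  # size of the hypothetical change in each scenario


def points(before, after):
    """Change between two ERA-scale values, in hundredths of a run."""
    return round((after - before) * 100)


# (before, after) pairs as quoted in the text, not recomputed from any formula.
scenarios = {
    "Lilly SIERA, 20 fewer K": (3.78, 4.09),
    "Lilly xFIP, 20 fewer K": (3.95, 4.22),
    "Duke SIERA, 20 more K": (4.58, 4.32),
    "Duke xFIP, 20 more K": (4.25, 4.04),
}

per_strikeout = {}
for label, (before, after) in scenarios.items():
    change = abs(points(before, after))
    per_strikeout[label] = change / STRIKEOUTS_MOVED
    print(f"{label}: {change} points, {per_strikeout[label]:.2f} per strikeout")
```

In both cases the SIERA figure moves more per strikeout than the xFIP figure, which is the pattern the examples are meant to show.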

Strikeouts matter more to SIERA than to xFIP. Fly balls matter less, so a pitcher like Lilly would suffer less from being slightly below average in that category. Duke would benefit less from being slightly above average.

Along those lines, there were more pitchers with big xFIP and SIERA differences in 2008, including some familiar names from the 2009 and 2010 lists.

2008:

Pitcher	SIERA	xFIP	SIERA Rank (of 88)	xFIP Rank (of 88)
Ted Lilly	3.93	4.08	29	41
Nick Blackburn	4.61	4.42	67	54
Oliver Perez	4.58	4.74	64	76
Jered Weaver	4.06	4.70	58	72
Manny Parra	4.06	3.81	35	23
Vicente Padilla	4.50	4.65	59	71
Johnny Cueto	4.11	4.31	38	52
Justin Verlander	4.50	4.70	58	72

Nick Blackburn’s low 2008 strikeout rate looked like trouble to SIERA, while Justin Verlander was the classic high-strikeout, low-groundball guy whom SIERA loves. Since SIERA doesn’t see large differences between pitchers with groundball rates below 50%, pitchers who really miss bats while allowing a few more fly balls (like Verlander, Weaver and Lilly) are the types of pitchers who look better to SIERA than to xFIP.

Johnny Cueto appears on both the 2008 and 2009 lists. With strikeout rates slightly above average and fly ball rates slightly below average, Cueto fits the mold of a pitcher whom SIERA prefers, compared to xFIP. Whiffs indicate better performance in other areas. Keeping the ball on the ground often does not.

Pitchers with control problems fare differently depending on how many hitters they can strike out. SIERA thought Manny Parra’s wildness, along with his so-so 2008 strikeout and groundball rates, was a sign of trouble. And it was in 2009. Still, SIERA made a mistake in thinking that Oliver Perez’s strikeout rate would mitigate some of his control problems, which clearly didn’t happen.

Remove 20 walks from Parra’s 2008 line and his SIERA would go down 40 points to 3.66. His xFIP would go down 36 points — to 3.45. But SIERA thinks those extra walks were a slightly bigger sign of trouble. In a way, it smelled doom.
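
As a quick cross-check, the adjusted figures can be walked back to Parra’s starting values in the 2008 table. The sketch below assumes only the numbers quoted in this article; the dictionary layout and variable names are invented for illustration.

```python
WALKS_REMOVED = 20  # size of the hypothetical change

# Adjusted values and drops as quoted in the text; "table" is the 2008 list.
parra = {
    "SIERA": {"adjusted": 3.66, "drop_points": 40, "table": 4.06},
    "xFIP": {"adjusted": 3.45, "drop_points": 36, "table": 3.81},
}

implied_start = {}
per_walk = {}
for metric, row in parra.items():
    # Adding the quoted drop back should recover the value listed in the table.
    implied_start[metric] = round(row["adjusted"] + row["drop_points"] / 100, 2)
    per_walk[metric] = row["drop_points"] / WALKS_REMOVED
    matches = implied_start[metric] == row["table"]
    print(f"{metric}: starts at {implied_start[metric]:.2f} "
          f"(matches table: {matches}), {per_walk[metric]:.1f} points per walk")
```

Both starting values line up with the table, and each walk removed is worth about 2.0 points of SIERA against 1.8 points of xFIP.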

With Perez, SIERA thought that as long as he maintained his strikeout rate, he’d do well enough. But drop his strikeouts by 20 to make it look more like his 2009 rate, and his SIERA would spike to 4.87, clearly moving Perez from acceptable to unacceptable territory. These examples show why it’s important to treat SIERA as an estimator and not as a predictor. In reality, Perez’s talent level in 2009 was probably around a 4.58 ERA.

Fans these days know that pitchers have little control over BABIP and HR/FB. Taking this hypothesis to the extreme, xFIP shows fans the effects of the most important parts of pitching. Relaxing that hypothesis is more realistic, though, which is what SIERA does by allowing groups of pitchers to have different BABIPs and HR/FBs.

Run prevention directly due to strikeouts, walks and home runs is better evaluated with xFIP; run prevention due to other factors is better evaluated with SIERA. This is why both are useful. Now that SIERA and xFIP are available at FanGraphs, readers have the tools to understand pitching like never before.

With these two statistics, it’s worth testing them against each other – and then against the rest of the field. We’ll do that tomorrow.

40 Comments

Ben Hall

13 years ago

Matt,

I’m really liking this series, and I’m excited that SIERA is at Fangraphs. I do think you should be a bit more cautious in some of your statements.

“SIERA saw Tommy Hanson’s high strikeout rate in 2010 and knew that his low BABIP wasn’t a fluke.” Seems like “suspected” would be better than “knew.” While I’m trusting your research has shown that high strikeout pitchers have lower BABIP, there certainly have been high strikeout pitchers with average or high BABIP.

“Both these pitchers generate weak contact because of their high strikeout rates — and both of their high fly ball rates mean they generate shorter, more catchable balls.” Again, these ideas may generally be true, but I don’t think we should state them as fact.

That said, on second glance most of the time you did state these ideas as likelihoods rather than absolute truths, so I’m probably nitpicking. Looking forward to tomorrow’s entry.
