New SIERA, Part One (of Five): Pitchers with High Strikeouts Have Low BABIPs

by Matt Swartz

July 18, 2011

Predicting baseball statistics is a tough job, especially when it comes to pitchers.

For every Roy Halladay pitch machine, there are 10 James Shieldses – guys whose ERAs change a run or two every year. Basically, it’s a crapshoot when it comes to figuring out the next ace – or the former ace-in-waiting who’ll lose his job by the All-Star break. Don’t believe me? Consider this: In the past 11 years, four hitters have led the major leagues in WAR; eight pitchers have led the majors in ERA.

So while the league-leading run producers might be predictable, league-leading run preventers change almost every year. But hope isn’t completely lost when it comes to figuring out the next pitching superstar – or dud. Some pitching stats are very predictable, and focusing on those few numbers might lead us to a better system to evaluate talent.

Enter Skill-Interactive ERA, or SIERA, which follows in the footsteps of xFIP by using statistics that don’t change much from year to year: strikeout rate, walk rate and ground-ball rate.

When Eric Seidman and I developed the original SIERA — which appeared at Baseball Prospectus more than a year ago — we didn’t totally appreciate why it worked. Essentially, since it uses regression analysis, it asks the question: What’s the typical ERA for pitchers with similar strikeout, walk and ground-ball rates in recent years?

Our thought was that the main reason SIERA worked so well was that it took into account the interplay between those three statistics. But on further analysis, it turns out that SIERA is successful mainly because it assumes a low BABIP and HR/FB for strikeout pitchers (and for fly ball pitchers, as well).

By including both of these effects, SIERA measures pitcher performance in unique ways. Unlike batters, pitchers generally participate in consecutive plate appearances. Metrics like wOBA can approximate hitter performance using linear weights — effectively assuming that the hitter isn’t responsible for the events immediately before or after his performance. Pitchers also have only a share of the responsibility for the outcome of a plate appearance.

Metrics like FIP and xFIP attempt to isolate the outcomes that pitchers do control — like strikeouts, walks and home runs — and then credit them only with those outcomes. Using linear weights and highlighting the run-preventing and run-creating effects of these defense-independent events, the numbers can approximate a pitcher’s contribution to wins and losses far better than a simple ERA formula.

Take American League ERA-leader, Jered Weaver, as an example. xFIP sees Weaver’s 7.81 K/9, 2.06 BB/9 and 48.3% fly ball rate and gives him a 3.32. SIERA sees the same numbers and puts Weaver at 3.15.

So why is SIERA such a Weaver fan? It’s pretty simple: For pitchers, the year-to-year correlation in strikeout rate, walk rate and home run rate is significantly higher than the year-to-year correlation in BABIP. That implies pitchers have a greater ability to control those outcomes.

Understanding this, Tom Tango developed FIP, which credits a pitcher with the effects of strikeouts, walks and home runs, while assuming that the player had a league-average BABIP. Taking that a step forward, xFIP also assumes a league-average home-run-per-fly-ball rate. Nate Silver took another approach by creating QERA, which uses regression analysis to estimate ERA. And last year, Eric Seidman and I developed SIERA, which also used regression analysis to predict earned run average but made several tweaks to Silver’s original version.
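
To make the difference between FIP and xFIP concrete, here is a minimal sketch of the two estimators in their commonly published form. The 13/3/2 weights are the standard ones; the inclusion of hit batters, the additive constant (it is recalculated every season so that league FIP matches league ERA, so the 3.10 default is only a placeholder) and the stat line in the usage example are assumptions for illustration, not figures from this article.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching in its commonly published form.

    `ip` is innings pitched as a decimal (200 and one third innings = 200.333...).
    `constant` is a placeholder; the real value changes from season to season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, lg_hr_per_fb, bb, hbp, k, ip, constant=3.10):
    """xFIP: same as FIP, but home runs are replaced by the number a pitcher
    would have allowed with a league-average home-run-per-fly-ball rate."""
    expected_hr = fb * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


if __name__ == "__main__":
    # Hypothetical stat line, not any real pitcher's season.
    print(fip(hr=20, bb=50, hbp=5, k=200, ip=200.0))
    print(xfip(fb=200, lg_hr_per_fb=0.105, bb=50, hbp=5, k=200, ip=200.0))
```

Neither function has any term for hits on balls in play, which is exactly the league-average BABIP assumption described above.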

As a result, SIERA gives more credit for a strikeout than FIP and xFIP do, and less blame for a fly ball. It not only credits pitchers with the run-dampening effect of a strikeout, but also assumes that high-strikeout pitchers allow fewer hits and home runs than FIP and xFIP expect. While xFIP and FIP both assumed that Weaver’s 2010 BABIP was .297, SIERA assumed Weaver had a BABIP similar to that of other pitchers with high strikeout and fly-ball rates – about .278. His actual BABIP was .277.

Weaver isn’t an exception. Last year’s strikeout-rate leader, Jon Lester, had a BABIP of .291. Tim Lincecum led the league in strikeout rate in 2009, and his BABIP was .288. And while Lincecum’s BABIP was .310 in 2008, the 2007 strikeout-rate leader was Erik Bedard, who had a BABIP of .284.

In fact, look at the BABIPs for the league-leaders in strikeout rate during the past nine years:

Year	Pitcher	SO/PA	BABIP
2010	Jon Lester	26.1%	.291
2009	Tim Lincecum	28.8%	.288
2008	Tim Lincecum	28.6%	.310
2007	Erik Bedard	30.2%	.284
2006	Johan Santana	26.5%	.271
2005	Mark Prior	26.8%	.281
2004	Randy Johnson	30.1%	.267
2003	Kerry Wood	30.0%	.275
2002	Randy Johnson	32.3%	.291
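
As a quick check on the table above, the short sketch below averages the nine leaders' BABIPs and counts how many fall under .297. That .297 figure is the league-average BABIP the article cites for 2010; applying it to all nine seasons is a simplifying assumption, since the league average moves a little from year to year.

```python
from statistics import mean

# (year, pitcher, strikeouts per plate appearance, BABIP), copied from the table above
LEADERS = [
    (2010, "Jon Lester", 0.261, 0.291),
    (2009, "Tim Lincecum", 0.288, 0.288),
    (2008, "Tim Lincecum", 0.286, 0.310),
    (2007, "Erik Bedard", 0.302, 0.284),
    (2006, "Johan Santana", 0.265, 0.271),
    (2005, "Mark Prior", 0.268, 0.281),
    (2004, "Randy Johnson", 0.301, 0.267),
    (2003, "Kerry Wood", 0.300, 0.275),
    (2002, "Randy Johnson", 0.323, 0.291),
]

# Assumption: the 2010 league figure is used as the yardstick for every season.
LEAGUE_BABIP = 0.297

babips = [row[3] for row in LEADERS]
avg_babip = mean(babips)
below = sum(b < LEAGUE_BABIP for b in babips)

print(f"Average BABIP of the strikeout-rate leaders: {avg_babip:.3f}")  # 0.284
print(f"Leaders below {LEAGUE_BABIP:.3f}: {below} of {len(babips)}")      # 8 of 9
```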

Pitchers who allow less contact see weaker contact from hitters. While the batter, the defense and luck have a larger influence on whether a batted ball lands for a hit, pitchers have a role, too. Ground-ball pitchers also allow more hits on balls in play, but fewer extra-base hits.

Looking at all 3,328 pitcher-seasons (with at least 40 innings pitched) between 2002 and 2010, I sorted players into four groups by strikeout rate. BABIP falls with each step up in strikeout rate, and HR/FB either falls or holds steady.

STRIKEOUT GROUP	BABIP	HR/FB
HIGH	.286	9.1%
MEDIUM-HIGH	.295	10.2%
MEDIUM-LOW	.298	10.7%
LOW	.301	10.7%
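
The grouping behind a table like this can be sketched in a few lines. This is only an illustration of the idea, not the exact procedure used for the numbers above: it assumes four equal-sized groups and a simple unweighted average within each group, and the function name and input format are hypothetical. The 3,328 pitcher-seasons themselves are not reproduced here.

```python
from statistics import mean


def babip_by_strikeout_group(seasons, n_groups=4):
    """Split pitcher-seasons into equal-sized groups by strikeout rate.

    `seasons` is an iterable of (strikeout_rate, babip) pairs.
    Returns the mean BABIP of each group, highest-strikeout group first.
    Assumption: each pitcher-season counts equally, regardless of innings.
    """
    ranked = sorted(seasons, key=lambda s: s[0], reverse=True)
    n = len(ranked)
    if n < n_groups:
        raise ValueError("need at least one pitcher-season per group")
    # Integer arithmetic on the indices keeps group sizes within one of each other.
    groups = [ranked[i * n // n_groups:(i + 1) * n // n_groups]
              for i in range(n_groups)]
    return [mean(babip for _, babip in group) for group in groups]
```

The same split works for HR/FB by passing (strikeout_rate, hr_per_fb) pairs instead.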

Pitchers do have some control over their BABIPs, but there’s too much noise in their actual numbers to infer their true skill levels. Weaver has had BABIPs as low as .238 and as high as .316 during his career. Still, even small samples of peripherals say more about a pitcher’s BABIP skill level than his BABIP alone can.

On the extreme end, SIERA assumed the lowest BABIPs in 2010 for:

Jered Weaver (.278 predicted, .277 actual)

Ted Lilly (.280 predicted, .254 actual)

Phil Hughes (.284 predicted, .275 actual)

Colby Lewis (.284 predicted, .277 actual)

Matt Cain (.285 predicted, .254 actual)

It assumed the highest BABIPs for:

Jon Garland (.307 predicted, .268 actual)

Paul Maholm (.306 predicted, .332 actual)

Fausto Carmona (.306 predicted, .284 actual)

Rick Porcello (.305 predicted, .308 actual)

Clay Buchholz (.305 predicted, .263 actual)
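
One rough way to read the two lists above is to compare SIERA's implied BABIPs against a flat league-average guess. The sketch below computes the mean absolute error of each on these ten pitchers. The .297 baseline is the 2010 league figure cited earlier; using mean absolute error as the yardstick is a choice made here for illustration, not a test from the article.

```python
# pitcher: (BABIP implied by SIERA, actual 2010 BABIP), copied from the lists above
PREDICTIONS = {
    "Jered Weaver": (0.278, 0.277),
    "Ted Lilly": (0.280, 0.254),
    "Phil Hughes": (0.284, 0.275),
    "Colby Lewis": (0.284, 0.277),
    "Matt Cain": (0.285, 0.254),
    "Jon Garland": (0.307, 0.268),
    "Paul Maholm": (0.306, 0.332),
    "Fausto Carmona": (0.306, 0.284),
    "Rick Porcello": (0.305, 0.308),
    "Clay Buchholz": (0.305, 0.263),
}

LEAGUE_BABIP = 0.297  # the league-average figure FIP and xFIP assume for 2010


def mean_abs_error(pairs):
    """Average absolute gap between predicted and actual values."""
    pairs = list(pairs)
    return sum(abs(pred - actual) for pred, actual in pairs) / len(pairs)


siera_mae = mean_abs_error(PREDICTIONS.values())
baseline_mae = mean_abs_error(
    [(LEAGUE_BABIP, actual) for _, actual in PREDICTIONS.values()]
)

print(f"SIERA-implied BABIP error: {siera_mae:.4f}")
print(f"League-average BABIP error: {baseline_mae:.4f}")
```

On these ten pitchers the sketch gives roughly .021 for SIERA's implied BABIPs and .027 for the flat baseline, though ten pitchers chosen from the extremes are far too few to prove much on their own.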

While Weaver’s strikeout-inducing prowess makes him an extreme example, SIERA is on the money more often than not. Time after time, a player’s future ERA is closer to SIERA than to any similar pitching metric. SIERA does this not by factoring in projection trends, like reversion to the mean and aging, but by answering the most basic question that can be asked about a pitcher: How well did the guy actually pitch?

The next four stories will:

1. Discuss the previous research I’ve done on pitching and how SIERA utilizes it. Discuss the SIERA changes for FanGraphs. Introduce the new formula and explain why it works.

2. Discuss pitchers with large differences in their xFIPs and SIERAs and explain what they teach us about pitching.

3. Test SIERA against different ERA estimators.

4. Discuss attempted SIERA changes that didn’t work.

86 Comments

eric

13 years ago

can somebody explain to me that point about 4 hitters leading in WAR and 8 pitchers leading in ERA? I’m a bit confused.

Ari Collins

Reply to eric

WAR is more consistent than ERA, which fluctuates wildly. Thus you have more different pitchers leading in ERA than you have different hitters leading in WAR.

jay

“while the league-leading run producers might be predictable, league-leading run preventers change almost every year”

Basically, it’s a lot more difficult to predict the ERA leaders than the WAR leaders. Personally, I would argue that difference is probably due to the comparative strength of WAR (ERA is a terrible predictive stat).

Reply to jay

apologies for the repeat. guess my computer is moving slowly today

MikeS

In addition to the accurate replies given, it may also have something to do with the fact that Barry Bonds (3 times) and Albert Pujols (4) don’t pitch. Two guys may be skewing this sample quite a bit.

Old Hoss

Reply to MikeS

That and the fact that pitchers are generally more injury prone than hitters.
