Park Factors and ERA Estimators: Part III

February 28, 2012

When we last left the question of how park factors affect ERA estimators, we found that the estimators performed best in hitters’ parks when looking at starting pitchers. FIP and xFIP performed better than tERA or SIERA at predicting the next year’s ERA for this group of pitchers. For the other park types, the pattern looked similar to what we generally see: SIERA generally performs best, and every estimator predicts a pitcher’s YR2_ERA better than YR1_ERA does.

But what if we want to predict how pitchers with certain batted-ball profiles (fly ball vs. ground ball) will perform in different parks? If we’re trying to predict how C.J. Wilson (lifetime 1.68 GB/FB ratio) will perform moving from Texas to Anaheim, or how Michael Pineda (0.81 GB/FB ratio) will fare moving from pitcher-friendly Safeco to hitter-friendly Yankee Stadium, in which estimator(s) should we have more faith? That is the focus of Part III.

I used the same methodology as Part II to determine park type. I then coded each pitcher as ground ball or fly ball based on their GB/FB ratio. A pitcher’s GB/FB is one of the most consistent metrics (for starting pitchers, the year-over-year correlation is 0.87, the highest of all outcome metrics), so there was little concern about a pitcher changing their batted-ball profile between seasons. A GB/FB greater than 1 was coded as ground ball; less than 1 was coded as fly ball. In the end, 1,387 season pairs were included in the analysis:

Park Type	Fly ball	Ground ball
Hitter	157	71
Pitcher	144	99
Neutral	650	266
Total	951	436
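The coding rule described above is simple enough to sketch in a few lines of Python. The function name is hypothetical, and the handling of a ratio of exactly 1 is an assumption, since the write-up only covers ratios above and below 1.

```python
def classify_batted_ball_profile(gb_fb_ratio: float) -> str:
    """Label a pitcher-season by GB/FB ratio using the cutoff of 1."""
    if gb_fb_ratio > 1.0:
        return "ground ball"
    if gb_fb_ratio < 1.0:
        return "fly ball"
    # Assumption: the article does not say how a ratio of exactly 1 was coded.
    return "unclassified"


# The two pitchers used as examples in this article
print("C.J. Wilson:", classify_batted_ball_profile(1.68))
print("Michael Pineda:", classify_batted_ball_profile(0.81))
```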

Here’s what I found:

Generally, estimators perform better for ground ball pitchers. Across all parks, estimators pick up additional power relative to YR1_ERA when they’re applied to ground ball pitchers, with SIERA doubling the predictive power of YR1_ERA. But when we split the parks into hitter- and pitcher-friendly, we find that tERA performs best for both types of pitchers in hitters’ parks, while SIERA works best in pitchers’ parks, at least doubling the predictive power of YR1_ERA for both fly ball and ground ball pitchers.

Here are the results for all park types:

All Parks	Fly ball (n=951)				Ground ball (n=436)
Metric	R	RSQR	RMSE	Sig	R	RSQR	RMSE	Sig
ERA	0.315	10%	1.137	.01 level	0.310	10%	1.167	.01 level
FIP	0.365	13%	1.115	.01 level	0.392	15%	1.129	.01 level
xFIP	0.375	14%	1.110	.01 level	0.404	16%	1.123	.01 level
tERA	0.363	13%	1.116	.01 level	0.422	18%	1.113	.01 level
SIERA	0.400	16%	1.098	.01 level	0.449	20%	1.097	.01 level

The results are right in line with what we find overall for ERA estimators, with the estimators picking up a decent amount of power relative to YR1_ERA for ground ball pitchers.
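For readers who want to build the R, RSQR, and RMSE columns from their own season pairs, here is a minimal sketch. It assumes each row comes from a one-variable linear fit of YR2_ERA on the Year 1 metric, and that RMSE is the residual error of that fit. The article does not spell out the exact RMSE calculation, so treat that part as an assumption. The function name and the sample numbers are made up for illustration.

```python
import math


def evaluate_estimator(yr1_metric, yr2_era):
    """Return (r, r_squared, rmse) for a least-squares fit of yr2_era on yr1_metric."""
    n = len(yr1_metric)
    if n != len(yr2_era) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 season pairs")

    mean_x = sum(yr1_metric) / n
    mean_y = sum(yr2_era) / n
    sxx = sum((x - mean_x) ** 2 for x in yr1_metric)
    syy = sum((y - mean_y) ** 2 for y in yr2_era)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(yr1_metric, yr2_era))

    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation (the R column)

    # Fit YR2_ERA = intercept + slope * metric, then measure the leftover error.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    squared_errors = [(y - (intercept + slope * x)) ** 2
                      for x, y in zip(yr1_metric, yr2_era)]
    # Assumption: RMSE divides by n; dividing by n - 2 is another common choice.
    rmse = math.sqrt(sum(squared_errors) / n)
    return r, r ** 2, rmse


# Made-up season pairs for illustration only; these are not the article's data.
sample_yr1_siera = [3.40, 3.95, 4.20, 3.10, 4.60, 3.75]
sample_yr2_era = [3.20, 4.30, 3.90, 3.60, 4.80, 3.50]
r, r_squared, rmse = evaluate_estimator(sample_yr1_siera, sample_yr2_era)
print(f"R = {r:.3f}, RSQR = {r_squared:.0%}, RMSE = {rmse:.3f}")
```

Running the same function once per estimator and per pitcher group would give one table row each.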

When we restrict this to hitter-friendly parks, we find that the power of the estimators relative to YR1_ERA increases, but their overall effectiveness declines. Except, that is, for tERA. tERA actually picks up 2 percentage points of variance explained in YR2_ERA for fly ball pitchers (15%, versus 13% across all parks). And while every estimator loses ground when applied to ground ball pitchers in hitters’ parks, tERA holds at the same 15% R-squared it posts for fly ball pitchers and explains roughly seven times as much of the variance in YR2_ERA as YR1_ERA does.

Let’s use Pineda’s move to Yankee Stadium for our first example. A good place to start is to examine his tERA, given his batted-ball profile and the park type. Last year, his ERA was 3.74. His tERA was 3.42.

Hitter Parks	Fly ball (n=157)				Ground ball (n=71)
Metric	R	RSQR	RMSE	Sig	R	RSQR	RMSE	Sig
ERA	0.272	7%	1.023	.01 level	0.141	2%	1.172	not significant
FIP	0.367	13%	0.989	.01 level	0.301	9%	1.129	.05 level
xFIP	0.335	11%	1.001	.01 level	0.288	8%	1.134	.05 level
tERA	0.388	15%	0.980	.01 level	0.382	15%	1.094	.01 level
SIERA	0.333	11%	1.002	.01 level	0.332	11%	1.117	.01 level
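The Sig column depends on sample size as well as on R, which matters for the small ground ball group here (n=71). The sketch below shows one common way to check a correlation against zero, using the Fisher z-transform with a normal approximation. This is an assumed method; the article does not say which test produced the Sig column. The function names are hypothetical. Under this approximation, an R of 0.141 with n=71 gives a p-value of roughly 0.24, well short of the .05 level.

```python
import math


def correlation_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for the hypothesis that the true correlation is zero."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    # Fisher z-transform: atanh(r) is roughly normal with standard error 1/sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def significance_label(p: float) -> str:
    """Map a p-value to the labels used in the tables."""
    if p < 0.01:
        return ".01 level"
    if p < 0.05:
        return ".05 level"
    return "not significant"


# Ground ball pitchers in hitter parks (n=71), using R values from the table
for metric, r in [("ERA", 0.141), ("xFIP", 0.288), ("tERA", 0.382)]:
    p = correlation_p_value(r, 71)
    print(f"{metric}: p = {p:.3f} ({significance_label(p)})")
```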

When we switch over to pitcher-friendly parks, we find that ERA estimators are far better at predicting ground ball pitchers’ performances than fly ball pitchers’. xFIP and SIERA more than doubled the R-squared of YR1_ERA and FIP nearly did, with SIERA accounting for almost 30% of the variance in ERA in the second year. Previously, the effectiveness of the estimators declined when moving to pitcher-friendly parks. That pattern holds here, at least for fly ball pitchers. Not so for ground ball pitchers. Here, we find the strongest R-squared numbers outside of starters in hitter-friendly parks (see Part II).

And what about C.J. Wilson? His move to Anaheim suggests that we start with SIERA for clues as to how he will perform this year. Last season, Wilson posted a 2.94 ERA and a 3.44 SIERA.

Pitcher Parks	Fly ball (n=144)				Ground ball (n=99)
Metric	R	RSQR	RMSE	Sig	R	RSQR	RMSE	Sig
ERA	0.238	6%	1.152	.01 level	0.337	11%	1.134	.01 level
FIP	0.267	7%	1.143	.01 level	0.458	21%	1.070	.01 level
xFIP	0.301	9%	1.131	.01 level	0.487	24%	1.052	.01 level
tERA	0.268	7%	1.143	.01 level	0.425	18%	1.090	.01 level
SIERA	0.340	12%	1.116	.01 level	0.527	28%	1.024	.01 level

Finally, we find that in neutral parks the estimators again resemble their performance across all parks. A little explanatory power is picked up for ground ball pitchers versus fly ball pitchers, but overall there weren’t many surprises.

Neutral Parks	Fly ball (n=650)				Ground ball (n=266)
Metric	R	RSQR	RMSE	Sig	R	RSQR	RMSE	Sig
ERA	0.335	11%	1.158	.01 level	0.328	11%	1.168	.01 level
FIP	0.384	15%	1.135	.01 level	0.390	15%	1.138	.01 level
xFIP	0.402	16%	1.126	.01 level	0.423	18%	1.120	.01 level
tERA	0.375	14%	1.140	.01 level	0.418	17%	1.123	.01 level
SIERA	0.427	18%	1.117	.01 level	0.474	22%	1.089	.01 level

ERA estimators are just that: estimators. Even the very best explain only about 30% of the variation we see year to year in actual ERA. That being said, they offer a better barometer of a pitcher’s true performance and likely future performance than ERA alone.

We gain some additional leverage by breaking up the population of pitchers, since there is much that can be masked in a large aggregation. In this case, we find that tERA provides the best estimate for both ground ball and fly ball pitchers in hitter-friendly parks, relative to ERA and to the other estimators. SIERA, however, takes the crown for both types of pitchers in pitcher-friendly parks. Generally, SIERA is the leading estimator for ground ball hurlers, since it at least doubles the variance explained relative to YR1_ERA across all park scenarios.
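The "doubles the variance explained" comparison can be checked directly from the R values in the tables, since variance explained is R squared. The short sketch below does that for SIERA and ground ball pitchers in each park scenario. The helper name is hypothetical. It works from the unrounded R values, so the ratios can differ slightly from what the rounded RSQR percentages suggest.

```python
# (YR1_ERA R, SIERA R) for ground ball pitchers, copied from the tables in this article
ground_ball_r = {
    "All parks": (0.310, 0.449),
    "Hitter parks": (0.141, 0.332),
    "Pitcher parks": (0.337, 0.527),
    "Neutral parks": (0.328, 0.474),
}


def variance_explained_ratio(r_baseline: float, r_estimator: float) -> float:
    """How many times more YR2_ERA variance the estimator explains than the baseline."""
    return (r_estimator ** 2) / (r_baseline ** 2)


ratios = {park: variance_explained_ratio(*pair) for park, pair in ground_ball_r.items()}
for park, ratio in ratios.items():
    print(f"{park}: SIERA explains {ratio:.1f}x the variance that YR1_ERA does")
```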

2 Comments

Joe

13 years ago

So what do you think we should expect from Mat Latos in 2012, then? Having pitched in Petco his whole career he had a shining 3.48 Career SIERA. That drops all the way to a 2.87 Career tERA. Now he’s moving to a park that StatCorner ranks as a Park Factor of 103 for both LHB/RHB wOBA, so hitter-friendly-ish. There’s a career .46 ERA difference in his home/road splits.

Any thoughts on what his 2012 ERA might look like?
