Park Factors and ERA Estimators: Part III

When we last left the question of how park factors affect ERA estimators, we found that the estimators performed best in hitters’ parks when looking at starting pitchers. FIP and xFIP performed better than tERA or SIERA when predicting the next year’s ERA for this group of pitchers. For the other park types, the pattern looked similar to what we generally see: SIERA generally performs best, while all estimators predict a pitcher’s YR2_ERA better than YR1_ERA does.

But what if we want to predict how pitchers with certain batted-ball profiles (fly ball vs. ground ball) will perform in different parks? If we’re trying to predict how C.J. Wilson (lifetime 1.68 GB/FB ratio) will perform after moving from Texas to Anaheim, or how Michael Pineda’s (0.81 GB/FB ratio) move from pitcher-friendly Safeco to hitter-friendly Yankee Stadium will turn out, in which estimator(s) should we have more faith? That is the focus of Part III.

I used the same methodology as Part II to determine park type. I then coded each pitcher as ground ball or fly ball based on their GB/FB ratio. A pitcher’s GB/FB is one of the most consistent metrics (for starting pitchers, the year-over-year correlation is 0.87, the highest of all outcome metrics), so there was little concern about a pitcher changing their batted-ball profile between seasons. A GB/FB greater than 1 was coded as ground ball; less than 1 was coded as fly ball. In the end, 1,387 season pairs were included in the analysis:

Park Type   Fly ball   Ground ball
Hitter      157        71
Pitcher     144        99
Neutral     650        266
Total       951        436
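
The coding rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis: the function name `batted_ball_profile` is hypothetical, and because the article does not say how a ratio of exactly 1 was handled, the sketch assumes that case is simply left uncoded. The counts are copied from the table so the totals can be checked.

```python
from typing import Optional


def batted_ball_profile(gb_fb: float) -> Optional[str]:
    """Code a pitcher season by GB/FB ratio (hypothetical helper)."""
    if gb_fb > 1:
        return "ground ball"
    if gb_fb < 1:
        return "fly ball"
    return None  # assumption: a ratio of exactly 1 is left uncoded


# Season-pair counts copied from the table above.
counts = {
    "Hitter": {"fly ball": 157, "ground ball": 71},
    "Pitcher": {"fly ball": 144, "ground ball": 99},
    "Neutral": {"fly ball": 650, "ground ball": 266},
}

fly_total = sum(row["fly ball"] for row in counts.values())
ground_total = sum(row["ground ball"] for row in counts.values())

print(batted_ball_profile(1.68))  # C.J. Wilson's lifetime ratio
print(batted_ball_profile(0.81))  # Michael Pineda's ratio
print(fly_total, ground_total, fly_total + ground_total)
```

The column sums reproduce the Total row, and the two totals together give the 1,387 season pairs cited in the text.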

Here’s what I found:

Generally, estimators perform better for ground ball pitchers. Across all parks, estimators pick up additional power relative to YR1_ERA when they’re applied to ground ball pitchers, with SIERA doubling the predictive power of YR1_ERA. But when we split the parks into hitter- and pitcher-friendly, we find that tERA performs best for both types of pitchers in hitter’s parks, while SIERA works best in pitcher’s parks, at least doubling the predictive power of YR1_ERA for both fly ball and ground ball pitchers.

Here are the results for all park types:

All Parks   Fly ball (n=951)                  Ground ball (n=436)
Metric      R      RSQR   RMSE   Sig          R      RSQR   RMSE   Sig
ERA         0.315  10%    1.137  .01 level    0.310  10%    1.167  .01 level
FIP         0.365  13%    1.115  .01 level    0.392  15%    1.129  .01 level
xFIP        0.375  14%    1.110  .01 level    0.404  16%    1.123  .01 level
tERA        0.363  13%    1.116  .01 level    0.422  18%    1.113  .01 level
SIERA       0.400  16%    1.098  .01 level    0.449  20%    1.097  .01 level

The results are right in line with what we find overall for ERA estimators, with the estimators picking up a decent amount of power, relative to YR1_ERA, for ground ball pitchers.
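
For readers who want to see how the R, RSQR, and RMSE columns relate to one another, here is a minimal Python sketch. It assumes each row comes from a simple linear fit of YR2_ERA on the year-one metric, and that RMSE is the root mean squared residual of that fit; the article does not spell out the exact procedure, so treat both as assumptions. The function name `fit_metrics` and the sample numbers are made up for illustration.

```python
import math


def fit_metrics(x, y):
    """Return (r, r_squared, rmse) for a simple linear fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Assumption: RMSE is the root mean squared residual of the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / n)
    return r, r * r, rmse


# Made-up numbers for illustration only; not real pitcher data.
yr1_estimator = [3.10, 3.85, 4.20, 3.55, 4.60, 3.95]
yr2_era = [3.40, 3.60, 4.50, 3.20, 4.30, 4.10]

r, r_squared, rmse = fit_metrics(yr1_estimator, yr2_era)
print(f"R={r:.3f}  RSQR={r_squared:.0%}  RMSE={rmse:.3f}")
```

Under these assumptions, a higher R always goes with a lower RMSE within the same group of pitchers, which matches the pattern in the table.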

When we restrict this to hitter-friendly parks, we find that the relative power of the estimators increases, but their overall effectiveness declines. Except, that is, for tERA. tERA actually picks up 2 percentage points of variance explained in YR2_ERA for fly ball pitchers (15%, versus 13% across all parks). And while every estimator, tERA included, loses ground when applied to ground ball pitchers in hitter’s parks, tERA holds the same 15% R-squared it posts for fly ball pitchers there and explains roughly seven times as much of the variance in YR2_ERA as YR1_ERA does.

Let’s use Pineda’s move to Yankee Stadium for our first example. A good place to start is to examine his tERA, given his batted-ball profile and the park type. Last year, his ERA was 3.74. His tERA was 3.42.

Hitter Parks   Fly ball (n=157)                  Ground ball (n=71)
Metric         R      RSQR   RMSE   Sig          R      RSQR   RMSE   Sig
ERA            0.272  7%     1.023  .01 level    0.141  2%     1.172
FIP            0.367  13%    0.989  .01 level    0.301  9%     1.129  .05 level
xFIP           0.335  11%    1.001  .01 level    0.288  8%     1.134  .05 level
tERA           0.388  15%    0.980  .01 level    0.382  15%    1.094  .01 level
SIERA          0.333  11%    1.002  .01 level    0.332  11%    1.117  .01 level
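
The "times the variance explained" comparison for ground ball pitchers in hitter’s parks can be reproduced from the R column of the table above. The sketch below is illustrative only; the helper name `variance_explained_ratio` is hypothetical. It squares the unrounded R values rather than dividing the rounded percentages, so the tERA figure comes out a little over seven rather than exactly 15/2.

```python
def variance_explained_ratio(r_estimator: float, r_baseline: float) -> float:
    """How many times more YR2_ERA variance the estimator explains than the baseline."""
    return (r_estimator ** 2) / (r_baseline ** 2)


# R values copied from the hitter-park table, ground ball columns.
hitter_ground_r = {
    "ERA": 0.141,
    "FIP": 0.301,
    "xFIP": 0.288,
    "tERA": 0.382,
    "SIERA": 0.332,
}

for metric, r in hitter_ground_r.items():
    if metric != "ERA":
        ratio = variance_explained_ratio(r, hitter_ground_r["ERA"])
        print(f"{metric}: {ratio:.1f}x the variance explained by YR1_ERA")
```

The ratios are large here mostly because YR1_ERA explains so little (about 2%) for this group, so a small baseline inflates every comparison.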

When we switch over to pitcher-friendly parks, we find that ERA estimators are far better at predicting ground ball pitchers’ performances than fly ball pitchers’. For ground ball pitchers, xFIP and SIERA more than doubled the R-squared of YR1_ERA, and FIP came close, with SIERA accounting for almost 30% of the variance in ERA in the second year. Previously, the effectiveness of the estimators declined when moving to pitcher-friendly parks. That pattern holds here, at least for fly ball pitchers. Not so for ground ball pitchers. Here, we find the strongest R-squared numbers outside of starters in hitter-friendly parks (see Part II).

And what about C.J. Wilson? His move to Anaheim suggests that we start with SIERA for clues as to how he will perform this year. Last season, Wilson posted a 2.94 ERA and a 3.44 SIERA.

Pitcher Parks   Fly ball (n=144)                  Ground ball (n=99)
Metric          R      RSQR   RMSE   Sig          R      RSQR   RMSE   Sig
ERA             0.238  6%     1.152  .01 level    0.337  11%    1.134  .01 level
FIP             0.267  7%     1.143  .01 level    0.458  21%    1.070  .01 level
xFIP            0.301  9%     1.131  .01 level    0.487  24%    1.052  .01 level
tERA            0.268  7%     1.143  .01 level    0.425  18%    1.090  .01 level
SIERA           0.340  12%    1.116  .01 level    0.527  28%    1.024  .01 level

Finally, we find that, in neutral parks, the estimators again resemble their performance across all parks. There’s a little explanatory power picked up for ground ball pitchers versus fly ball pitchers, but overall there weren’t many surprises.

Neutral Parks   Fly ball (n=650)                  Ground ball (n=266)
Metric          R      RSQR   RMSE   Sig          R      RSQR   RMSE   Sig
ERA             0.335  11%    1.158  .01 level    0.328  11%    1.168  .01 level
FIP             0.384  15%    1.135  .01 level    0.390  15%    1.138  .01 level
xFIP            0.402  16%    1.126  .01 level    0.423  18%    1.120  .01 level
tERA            0.375  14%    1.140  .01 level    0.418  17%    1.123  .01 level
SIERA           0.427  18%    1.117  .01 level    0.474  22%    1.089  .01 level

ERA estimators are just that: estimators. Even the very best explain only about 30% of the variation we see year to year in actual ERA. That being said, they offer a better barometer of a pitcher’s true performance and likely future performance than ERA alone.

We gain some additional leverage by breaking up the population of pitchers, since a large aggregation can mask a great deal. In this case, we find that tERA provides the best estimate for both ground ball and fly ball pitchers in hitter-friendly parks, relative both to ERA and to the other estimators. SIERA, however, takes the crown for both types of pitchers in pitcher-friendly parks. Generally, SIERA is the leading estimator for ground ball hurlers, since it at least doubles the variance explained relative to YR1_ERA across all park scenarios.
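
The summary above can be checked against the tables with a short lookup. This Python sketch is only a convenience for reading the results, not a model: the names `R_VALUES`, `best_estimator`, and `lift_over_era` are hypothetical, and the R values are copied from the four tables in this article.

```python
# R values (YR1 metric vs. YR2_ERA) copied from the four tables in this article.
R_VALUES = {
    ("all", "fly ball"): {"ERA": 0.315, "FIP": 0.365, "xFIP": 0.375, "tERA": 0.363, "SIERA": 0.400},
    ("all", "ground ball"): {"ERA": 0.310, "FIP": 0.392, "xFIP": 0.404, "tERA": 0.422, "SIERA": 0.449},
    ("hitter", "fly ball"): {"ERA": 0.272, "FIP": 0.367, "xFIP": 0.335, "tERA": 0.388, "SIERA": 0.333},
    ("hitter", "ground ball"): {"ERA": 0.141, "FIP": 0.301, "xFIP": 0.288, "tERA": 0.382, "SIERA": 0.332},
    ("pitcher", "fly ball"): {"ERA": 0.238, "FIP": 0.267, "xFIP": 0.301, "tERA": 0.268, "SIERA": 0.340},
    ("pitcher", "ground ball"): {"ERA": 0.337, "FIP": 0.458, "xFIP": 0.487, "tERA": 0.425, "SIERA": 0.527},
    ("neutral", "fly ball"): {"ERA": 0.335, "FIP": 0.384, "xFIP": 0.402, "tERA": 0.375, "SIERA": 0.427},
    ("neutral", "ground ball"): {"ERA": 0.328, "FIP": 0.390, "xFIP": 0.423, "tERA": 0.418, "SIERA": 0.474},
}


def best_estimator(park: str, profile: str) -> str:
    """Return the metric with the highest R for a park type and batted-ball profile."""
    table = R_VALUES[(park, profile)]
    return max(table, key=table.get)


def lift_over_era(park: str, profile: str, metric: str) -> float:
    """Ratio of variance explained (R squared) by a metric versus YR1_ERA."""
    table = R_VALUES[(park, profile)]
    return table[metric] ** 2 / table["ERA"] ** 2


for (park, profile) in R_VALUES:
    leader = best_estimator(park, profile)
    lift = lift_over_era(park, profile, leader)
    print(f"{park:8s} {profile:12s} {leader:6s} {lift:.1f}x YR1_ERA")
```

For the two examples in the article, `best_estimator("hitter", "fly ball")` points to tERA for a pitcher like Pineda and `best_estimator("pitcher", "ground ball")` points to SIERA for a pitcher like Wilson.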

Bill leads Predictive Modeling and Data Science consulting at Gallup. In his free time, he writes for The Hardball Times, speaks about baseball research and analytics, has consulted for a Major League Baseball team, and has appeared on MLB Network's Clubhouse Confidential as well as several MLB-produced documentaries. He is also the creator of the baseballr package for the R programming language. Along with Jeff Zimmerman, he won the 2013 SABR Analytics Research Award for Contemporary Analysis. Follow him on Twitter @BillPetti.

2 Comments
Joe
12 years ago

So what do you think we should expect from Mat Latos in 2012, then? Having pitched in Petco his whole career, he had a shining 3.48 Career SIERA. That drops all the way to a 2.87 Career tERA. Now he’s moving to a park that StatCorner rates at a Park Factor of 103 for both LHB/RHB wOBA, so hitter-friendly-ish. There’s a career .46 ERA difference in his home/road splits.

Any thoughts on what his 2012 ERA might look like?