Research – What if everything was equal?

The article I wrote about Juan Padilla being the luckiest pitcher in baseball got me thinking that maybe combining the three statistics of BABIP, HR/FB, and LOB% was not the best way to display a pitcher's overall “luck”. So instead of championing “Luck Factor” as a useful stat, I'm going to make a “luckless” ERA, or luck-independent ERA (iERA), which should have better practical applications. The formula is actually pretty simple:

iERA = ( ((.3627 * IP) - (.1287 * K) + (.1408 * FB) + (.3 * BB)) * 8.28) / IP

Basically this formula was the result of a bunch of relatively simple algebra on the three “luck” statistics HR/FB, BABIP and LOB%. Instead of solving each formula for each player, I plugged in the major league average for each, and proceeded to find out how many home runs, hits, unearned runs, and finally earned runs a player would have if everything was equal. This formula should normalize ERA for park, league, and in some sense, competition. In other words, it gives a pitcher's ERA as if all pitchers pitched under the same conditions.
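For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the formula above. The function name, argument names, and sample stat line are made up for illustration. It also assumes innings pitched is passed as a true decimal (72 1/3 innings as 72.333, not the box-score shorthand 72.1), and that K, FB and BB are season totals of strikeouts, fly balls and walks.

```python
def iera(ip: float, k: int, fb: int, bb: int) -> float:
    """Luck-independent ERA, using the coefficients from the formula in this article.

    ip: innings pitched as a true decimal (assumption: not box-score notation)
    k:  strikeouts, fb: fly balls allowed, bb: walks
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    weighted = (0.3627 * ip) - (0.1287 * k) + (0.1408 * fb) + (0.3 * bb)
    return weighted * 8.28 / ip


# Hypothetical stat line, not a real pitcher.
print(round(iera(ip=200.0, k=180, fb=220, bb=60), 2))
```

Because every term is divided by IP at the end, the result only depends on the pitcher's strikeout, fly ball and walk rates per inning.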

[Figure 1: image not reproduced]

Let's take a look at how iERA compares to regular ERA. Naturally there is some correlation here, but the correlation isn't that great. I just did this to make sure they were actually measuring two different things, and they are. What iERA does have a high correlation with is a pitcher's strikeout-to-walk ratio. Not much of a surprise here, as strikeouts and walks (and fly balls) are what set pitchers apart in iERA.

[Figure 2: image not reproduced]

So you're probably asking, why should I care about a pitcher's iERA? It's actually quite good at telling you how lucky a pitcher was in any particular year. As you can see, once a player's ERA and iERA start to differ by more than 20%, there's about a 75% chance that his ERA and iERA will be closer the following year. Once you start to see a difference in ERA and iERA over 70%, it's pretty much a given that his ERA and iERA will be closer the next season.

[Figure 3: image not reproduced]

Furthermore, the amount that a player's ERA will revert to iERA is somewhat a function of how far it deviated in the first place. Players don't tend to stray from a 0% difference in ERA and iERA much further than 20% on average.

[Figure 4: image not reproduced]
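The tables below report a “% Dif” column without spelling out how it is calculated. The printed values appear to be consistent, to within a point of rounding, with a symmetric percent difference: the gap between ERA and iERA divided by the average of the two. That is an inference from the numbers, not a stated definition, and the function name here is just for illustration.

```python
def pct_diff(era: float, iera: float) -> float:
    """Symmetric percent difference: (ERA - iERA) divided by the mean of the two.

    Assumption: inferred from the tabulated values, not stated in the article.
    """
    return (era - iera) / ((era + iera) / 2) * 100


# Rows copied from the tables in this article: (label, ERA, iERA, printed % Dif)
samples = [
    ("Luke Hudson 2004", 2.42, 5.07, -71),
    ("Lance Cormier 2004", 8.14, 5.12, 46),
    ("Tim Redding 2005", 10.57, 5.51, 63),
    ("Juan Padilla 2005", 1.49, 4.74, -105),
]

for label, era, iera_value, printed in samples:
    computed = pct_diff(era, iera_value)
    print(f"{label}: computed {computed:.1f}%, printed {printed}%")
```

A negative number means the pitcher's ERA was lower than his iERA (he was “lucky”), and a positive number means the opposite. Small mismatches are expected because the ERA and iERA values in the tables are themselves rounded.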

So just for kicks, let's take a look at some of the bigger discrepancies between iERA and ERA over the past four years and see what those pitchers did the following year. Here are 5 starters and 5 relievers who had a much lower ERA than iERA in 2004.

		2004			2005		
Name		ERA	iERA	% Dif	ERA	iERA	% Dif
Joe Nathan	1.62	3.80	-81%	2.70	3.65	-30%
Mike Gonzalez	1.25	2.83	-78%	2.70	4.15	-42%
Steve Kline	1.79	4.00	-77%	4.28	4.67	-9%
Luke Hudson	2.42	5.07	-71%	6.38	5.25	19%
Akinori Otsuka	1.75	3.62	-70%	3.59	4.19	-15%

		2004			2005		
Name		ERA	iERA	% Dif	ERA	iERA	% Dif
Jake Peavy	2.27	3.83	-51%	2.88	3.56	-21%
Al Leiter	3.21	5.07	-45%	6.13	5.41	12%
Bruce Chen	3.02	4.66	-43%	3.83	4.51	-16%
Carlos Zambrano	2.75	4.02	-38%	3.26	3.91	-18%
Tomo Ohka	3.40	4.81	-34%	4.04	4.63	-14%

Each of them did see an increase in their ERA and they all saw a drastic reduction in the percent difference between their ERA and iERA. Now let's take a look at the players that had a much higher ERA than iERA.
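As a quick sanity check on that claim, this short Python sketch takes the % Dif values straight from the two tables above and counts how many pitchers ended up closer to 0% in 2005 than in 2004. “Closer” is read here as a smaller absolute value, which is my interpretation of the sentence above; the variable names are illustrative.

```python
# (name, 2004 % Dif, 2005 % Dif), copied from the two tables above
rows = [
    ("Joe Nathan", -81, -30),
    ("Mike Gonzalez", -78, -42),
    ("Steve Kline", -77, -9),
    ("Luke Hudson", -71, 19),
    ("Akinori Otsuka", -70, -15),
    ("Jake Peavy", -51, -21),
    ("Al Leiter", -45, 12),
    ("Bruce Chen", -43, -16),
    ("Carlos Zambrano", -38, -18),
    ("Tomo Ohka", -34, -14),
]

# A pitcher "narrowed" if his 2005 gap is smaller in absolute terms than his 2004 gap.
narrowed = [name for name, dif_2004, dif_2005 in rows if abs(dif_2005) < abs(dif_2004)]

print(f"{len(narrowed)} of {len(rows)} pitchers had ERA and iERA closer together in 2005")
```

Using the absolute value matters for pitchers like Luke Hudson and Al Leiter, whose gap flipped from negative to positive while still shrinking in size.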

		2004			2005		
Name		ERA	iERA	% Dif	ERA	iERA	% Dif
Lance Cormier	8.14	5.12	46%	5.11	4.46	14%
Sergio Mitre	6.62	4.22	44%	5.37	4.17	25%
Mike Wood	5.94	4.37	30%	4.46	4.78	-7%
Brian Fuentes	5.64	4.32	26%	2.91	3.84	-28%
Neal Cotts	5.65	4.39	25%	1.94	4.25	-75%

		2004			2005		
Name		ERA	iERA	% Dif	ERA	iERA	% Dif
Casey Fossum	6.65	4.61	36%	4.92	4.50	9%
Derek Lowe	5.42	4.19	26%	3.61	3.69	-2%
Joaquin Benoit	5.68	4.42	25%	3.72	4.74	-24%
Brett Myers	5.52	4.48	21%	3.72	3.75	-1%
Scott Kazmir	5.67	4.62	20%	3.77	4.60	-20%

All of these pitchers saw a decrease in ERA, some more drastic than others. Neal Cotts had a very large difference between ERA and iERA in 2005, which will certainly revert in 2006. Brian Fuentes also showed a very big drop in ERA, but notice that it was also reflected in his 2005 iERA. Here's one last list: the pitchers who showed a very large difference between ERA and iERA in 2005.

Name		ERA	iERA	% Dif
Juan Padilla	1.49	4.74	-105%
Mariano Rivera	1.38	3.38	-84%
Chad Cordero	1.82	4.36	-82%
Huston Street	1.72	4.02	-80%
Cliff Politte	2.00	4.52	-77%

Name		ERA	iERA	% Dif
Tim Redding	10.57	5.51	63%
Alan Embree	7.62	4.28	56%
Greg Aquino	7.76	4.39	56%
Ryan Wagner	6.11	3.60	52%
Paul Wilson	7.77	4.91	45%

I'd like to emphasize that a large discrepancy between a pitcher's ERA and iERA doesn't mean that pitcher can't still be very good or very bad the next season. It just means that there's a high probability that pitcher will have an ERA much closer to his iERA the following season. If you subscribe to the theories of BABIP, HR/FB, and LOB% (and even if you don't), I think you'll find iERA a useful tool, as it essentially combines the three of them into one simple stat that you can compare with any player's actual ERA to get an idea of how “lucky” a player was in a single season.

Update: Just a quick addendum. I've been told that iERA is essentially a pre-existing stat called xFIP. I was aware of FIP, but not of xFIP. I think xFIP might be a little better (even though the two are VERY close). If you're interested in xFIP, I'll point you to the article “I'm Batty for Baseball Stats” over at the Hardball Times.





David Appelman is the creator of FanGraphs.

2 Comments
Lotos
18 years ago

Where calculate % Dif?