Zack Greinke In ‘The Peripheral Disconnect’

Pitchers whose ERAs and estimators disagree are extremely interesting to analyze. On one hand, their signature run prevention mark might appear toward the top of leaderboards while the underlying numbers aren’t as fruitful. On the other side of the spectrum are pitchers like Zack Greinke, who, as Chris Cwik pointed out yesterday, has a vast disconnect between his ERA and FIP. In fact, it’s been that way since the first week of May when he came off of the disabled list.

His 2.63 FIP and 2.12 xFIP suggest that the newly-minted Brewers starter has been one of the best in the league. But Greinke’s actual 5.63 ERA is closer to the bottom than the top, and is three runs higher than his adjusted marks. One of the more popular stats here is E-F, a sortable number that measures the gap between ERA and FIP. Pitchers with a large separation are expected to regress in some fashion, because it is incredibly rare for anyone to finish with a huge disagreement between those two data points.

I thought about taking that concept a bit further and calculating the difference between ERA and xFIP, since the latter metric is a better predictor of future earned run average than its predecessor. This makeshift ‘E-X’ number would be measurable from 2002-now, and it piqued my curiosity to see which pitchers had the largest such gaps in that span.
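As a quick illustration of how these gaps are computed, here is a minimal sketch using only the Greinke figures quoted above. The function name `gap` and the dictionary layout are my own choices for the example, not anything taken from the leaderboards.

```python
def gap(era, estimator):
    """Return ERA minus an estimator (FIP or xFIP), rounded to two decimals.

    A positive value means the pitcher's ERA is worse than the estimator suggests.
    """
    return round(era - estimator, 2)


# Greinke's marks as quoted in this article (through roughly two months).
greinke = {"era": 5.63, "fip": 2.63, "xfip": 2.12}

e_f = gap(greinke["era"], greinke["fip"])    # E-F: ERA minus FIP
e_x = gap(greinke["era"], greinke["xfip"])   # E-X: ERA minus xFIP

print(f"E-F: {e_f:.2f}  E-X: {e_x:.2f}")
```

The same subtraction applied to full-season ERA and xFIP values is all that sits behind the E-X numbers in the list below.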

Granted, the separation in Greinke’s numbers is unlikely to remain as substantial as the season wears on, but he serves here as an introductory device rather than as a player being compared. It obviously wouldn’t be accurate to compare his numbers through two months to those amassed over full seasons by other hurlers.

Looking at pitchers whose ERAs exceeded their xFIPs since 2002, here are the guys with the biggest ERA-xFIP deltas:

Carlos Silva (2008), 1.88 E-X
Silva’s numbers looked terrible above the surface: a 4-15 record and a 6.46 ERA. Beneath those numbers, however, was a 61 percent strand rate and .342 BABIP. He managed a K/BB above 2.0 without striking anyone out, and induced grounders on 44 percent of his balls in play. His ERA estimators by no means suggested he was an upper echelon pitcher, but rather that he was closer to league average than the bottom of the pile. The 1.88 delta between his 6.46 ERA and 4.58 xFIP is the largest in this span.

Ricky Nolasco (2009), 1.83 E-X
Can’t say his name is shocking to see here. If I had polled you prior to revealing the list, Nolasco’s and Javier Vazquez’s names would likely have popped up. Nolasco tossed 185 innings in 2009 with a 5.06 ERA, despite a 9.5 K/9, 2.1 BB/9, 3.35 FIP and 3.23 xFIP. Unlike Silva, the estimators here suggest that Nolasco was one of the best in the league, and his 4.3 WAR in 2009 bears that out.

Unfortunately, nothing has really changed for Nolasco, who continues to be one of the more frustrating pitchers in baseball to follow, and potentially the Vazquez of this generation.

Nate Robertson (2008), 1.73 E-X
His numbers almost directly mirror those of Silva: a 6.35 ERA and 4.62 xFIP in 168 2/3 innings. Robertson struck more batters out but issued walks at a greater frequency. The gap between his ERA and his estimators seems to have plagued him throughout his career, as he has a career .304 BABIP, 68 percent strand rate, 5.01 ERA and 4.45 xFIP.

James Shields (2010), 1.63 E-X
Finally, an example of a guy who actually met the expectations of his xFIP in the following season. Last year, Shields put up an ugly 5.18 ERA, but struck out 8.3 batters per nine in the tougher American League, while producing a 4.2 K/BB ratio. He finished the season with a 3.55 xFIP, and while his numbers are regression-bound this season (in the other direction, given his .256 BABIP and 82 percent strand rate), his peripherals have largely remained the same and his performance is more in line with what those underlying numbers suggest.

Jose Lima (2005), 1.63 E-X
Once again, we alternate the terrible made to look closer to league average with the ‘meh’ made to look solid. In this case, however, we’re looking at the equivalent of putting lipstick on a pig. Lima might not have been as bad as a 6.99 ERA suggests, but his 5.36 xFIP wasn’t exactly inspiring. He makes the top five because even someone with peripherals as unsightly as his couldn’t be expected to sustain a 7.00 ERA pace.

What stood out when looking at the 870 pitcher-seasons of at least 150 innings in the span was how many of those with an ERA at least a run greater than their xFIP fell in line with Silva and Robertson as opposed to Nolasco and Shields. Of the 42 whose E-X was 1.00+, the weighted averages included a 5.49 ERA, 4.54 FIP and 4.22 xFIP.

The only pitcher in the span with a sub-4.00 ERA to also have a delta of one or more runs is Curt Schilling, back in 2002, when he posted a 3.23 ERA, 2.40 FIP and 2.22 xFIP in 259 1/3 frames.

Perhaps this speaks to the notion that better pitchers are less likely to see large disconnects between their ERA and estimators. Because of their skillsets, it is less likely for a fluky BABIP or strand rate to influence run-prevention. For every Cole Hamels’ 2009 there are likely double-digit Cole Hamels’ 2010 replicas. Either way, there is no way that Greinke’s current three-run separation persists without his shattering the E-X record. Considering he has an 11.6 K/9 and 1.7 BB/9, it seems much more likely that his ERA improves than that his estimators materially suffer.

Eric is an accountant and statistical analyst from Philadelphia. He also covers the Phillies at Phillies Nation and can be found here on Twitter.

Matt Hunter
12 years ago

It seems to me like the DIPS theory doesn’t work as well for really awful pitchers. I’m not an expert by any means, but intuitively, there are going to be pitchers that are bad enough that they can maintain awful BABIPs and never strand runners because they can never get a batter out. It seems that this, albeit small, study supports this notion. Am I right?

Matt Hunter
12 years ago
Reply to  Matt Hunter

Also, according to his peripherals, James Shields is pitching WAY better than last season. His xFIP is 2.90, compared to 3.55 the year before. Yeah, he’ll likely regress a little, but it seems that in addition to some good luck, he has transformed into an elite pitcher.

12 years ago
Reply to  Matt Hunter

FIP isn’t a DIPS stat.

FIP is directly influenced by BABIP, because its denominator is IP instead of PA.

Matt Hunter
12 years ago
Reply to  RC

Interesting. So because IP is influenced by BABIP, and IP is in the FIP formula, it’s not DIPS. I’ve never heard that before. Well still, I don’t think that takes away from my point. I’m just saying that DIPS theory probably assumes, like “yes” says below me, that the quality of competition is normally distributed. And even if FIP isn’t entirely independent of defense, it is mostly (and is according to the glossary here) so I would still say the small study here is relevant.

12 years ago
Reply to  RC

You can’t be DIPS (Defense Independent Pitching Statistic) when part of your stat is directly influenced by defense.

In a roundabout way, IP is a product of strikeouts and BABIP.

The biggest problem is that having a HIGH BABIP (i.e., allowing more hits) reduces your IP and raises your K/9.

A) For example, if you have a pitcher who K’s 20% of the batters he faces, and has a BABIP of .750, through 10 batters, he’s going to strike out 2, give up 6 hits, and get 2 outs on BIP (no homers or walks).

He’ll have a K/9 of 13.5 and have pitched 1.1 innings.

B) Now, the same pitcher has a lucky day, and has a BABIP of .250 (no homers or walks). He’ll strike out 2, give up 2 hits, and get 6 outs via BIP. He’ll have a K/9 of 6.75, and have pitched 2.2 innings.

See the problem there? From a Defense-Independent perspective, these two days were exactly the same. From FIP’s perspective:

A) (-2*2)/1.333 = -3.0 (or normalized 3.10 - 3.0 = .1)
B) (-2*2)/2.666 = -1.5 (or normalized 3.10 - 1.5 = 1.6)

See the problem here? If you’re being truly defense independent, these pitchers were exactly the same. FIP doesn’t see it that way; it actually penalizes the second pitcher for getting outs via BIP, i.e., it can punish pitchers for getting groundballs, or being in front of good defenses.

It behaves a little differently when the term is positive, but BABIP is still a very important factor in it. In short, it’s a very flawed stat.
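
The two example days above can be checked with a short sketch. It assumes the simplified FIP form (13*HR + 3*BB - 2*K)/IP plus a constant, with the 3.10 constant taken from the example above. The helper name `day_line` and its parameters are made up for illustration, and hit batters, which are normally grouped with walks, are left out because the example has none.

```python
def day_line(batters, k_rate, babip, constant=3.10):
    """Pitching line for one outing with no walks or homers, as in the example."""
    strikeouts = round(batters * k_rate)
    balls_in_play = batters - strikeouts
    hits = round(balls_in_play * babip)
    outs = strikeouts + (balls_in_play - hits)
    innings = outs / 3  # true innings, e.g. 4 outs is 1.333, written "1.1" in box scores
    k_per_9 = 9 * strikeouts / innings
    # Simplified FIP with HR = BB = 0, so only the strikeout term remains.
    fip = (-2 * strikeouts) / innings + constant
    return {"outs": outs, "k9": k_per_9, "fip": fip}


day_a = day_line(10, 0.20, 0.750)  # unlucky day: .750 BABIP
day_b = day_line(10, 0.20, 0.250)  # lucky day: .250 BABIP

for label, line in (("A", day_a), ("B", day_b)):
    print(f"{label}: outs={line['outs']} K/9={line['k9']:.2f} FIP={line['fip']:.2f}")
```

Both days have the same two strikeouts against ten batters, yet the lower-BABIP day records twice as many outs, so the identical strikeout credit is spread over more innings and the resulting FIP comes out higher.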

12 years ago
Reply to  Matt Hunter

I would agree and suggest that theories about BABIP and HR/FB rely on the fact that the quality of competition and outcomes are normally distributed. Things that fall outside of this are subject to different statistics or at least different values within the systems.

12 years ago
Reply to  Matt Hunter

I think a better explanation for this issue with DIPS is that “awful” pitchers are likely to have been unlucky.