Did Jon Gray Deserve His Demotion to the Minors?

In one sense, Jon Gray’s 2018 season has been pretty successful. He’s struck out 29% of the batters he’s faced this year, for example, which ranks 12th among 90 qualified starters. His walk rate, at 7%, sits in the top third for starters. His home-run rate of 1.1 per nine innings is right in the middle of the pack among that sample, too, as are his 92 innings.

That he’s done much of his work at elevation in Colorado makes those numbers even more impressive. His 3.08 FIP has produced a 2.5 WAR, one of the top 15 figures in baseball. Unfortunately, the Rockies haven’t received the benefit of that good pitching. In fact, Gray’s 5.77 ERA ranks 88th out of 90 starters. The massive difference between his ERA and FIP would represent the largest such disparity in baseball history, and it was of sufficient concern to the Rockies to send Gray to Triple-A.

Not too long ago, Jay Jaffe wrote a piece on Jon Lester, whose season was also busting historical norms. Lester’s ERA was significantly lower than his FIP. So far this season, Gray is Lester’s opposite. The graph below shows every pitcher’s FIP and ERA this season.

You can see Jon Lester over there on the left on his own. If you go to the right, you can see Jon Gray with nobody even close to him. It should be evident that, in the middle, most players are reasonably close when it comes to ERA and FIP. The average differential per player is 0.53. Of the 88 ERA and FIP pairs in this sample, 77 are within one run. So far this season, Gray’s 2.69 ERA-FIP is roughly double that of the player closest to him, as the table below shows.

Biggest ERA-FIP Gaps, 2018

| Name | Team | ERA | FIP | E-F |
| --- | --- | --- | --- | --- |
| Jon Gray | Rockies | 5.77 | 3.08 | 2.69 |
| Jason Hammel | Royals | 5.56 | 4.20 | 1.37 |
| Lance Lynn | Twins | 5.49 | 4.37 | 1.12 |
| Sonny Gray | Yankees | 5.44 | 4.39 | 1.05 |
| Nick Pivetta | Phillies | 4.71 | 3.68 | 1.03 |
| Luke Weaver | Cardinals | 5.16 | 4.22 | 0.93 |
| Vince Velasquez | Phillies | 4.69 | 3.81 | 0.87 |
| Luis Castillo | Reds | 5.85 | 5.03 | 0.82 |
| Carlos Carrasco | Indians | 4.24 | 3.42 | 0.82 |
| Zack Wheeler | Mets | 4.47 | 3.66 | 0.80 |

Qualified starting pitchers.
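
The E-F column is simply ERA minus FIP, so the top of the table can be rechecked in a few lines. Here is a minimal Python sketch using only the ERA and FIP values printed above. The variable names are illustrative, and because the inputs are already rounded, a recomputed gap can differ from the printed one by 0.01.

```python
# ERA and FIP as printed in the 2018 table above (top five qualified starters).
rows = [
    ("Jon Gray", 5.77, 3.08),
    ("Jason Hammel", 5.56, 4.20),
    ("Lance Lynn", 5.49, 4.37),
    ("Sonny Gray", 5.44, 4.39),
    ("Nick Pivetta", 4.71, 3.68),
]

# ERA-FIP gap for each pitcher, largest first.
gaps = sorted(
    ((name, round(era - fip, 2)) for name, era, fip in rows),
    key=lambda item: item[1],
    reverse=True,
)

# How far ahead of the runner-up is the leader?
leader, runner_up = gaps[0], gaps[1]
ratio = leader[1] / runner_up[1]

for name, gap in gaps:
    print(f"{name:<14}{gap:5.2f}")
print(f"{leader[0]} vs. {runner_up[0]}: {ratio:.2f}x")
```

On these inputs the ratio between Gray and Hammel comes out just under two, which is where "roughly double" comes from.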

That isn’t just remarkable for this season. Since 1901, here are the biggest differences among qualified pitchers.

Largest ERA-FIP Since 1901

| Name | Team | Season | ERA | FIP | E-F |
| --- | --- | --- | --- | --- | --- |
| Jon Gray | Rockies | 2018 | 5.77 | 3.08 | 2.69 |
| Jack Knott | Browns | 1936 | 7.29 | 5.16 | 2.12 |
| George Caster | Athletics | 1940 | 6.56 | 4.52 | 2.04 |
| Hub Pruett | Phillies | 1927 | 6.05 | 4.11 | 1.94 |
| Chris Bosio | Brewers | 1987 | 5.24 | 3.38 | 1.86 |
| John Burkett | Rangers | 1998 | 5.68 | 3.89 | 1.78 |
| Bert Blyleven | Twins | 1988 | 5.43 | 3.66 | 1.77 |
| Joe Oeschger | Braves | 1923 | 5.68 | 3.91 | 1.77 |
| Ernie Wingard | Browns | 1927 | 6.56 | 4.80 | 1.76 |
| Bobo Newsom | – – – | 1942 | 4.73 | 2.99 | 1.74 |
| Ricky Nolasco | Marlins | 2009 | 5.06 | 3.35 | 1.71 |
| Early Wynn | Senators | 1942 | 5.12 | 3.42 | 1.70 |
| Jack Lamabe | Red Sox | 1964 | 5.89 | 4.21 | 1.68 |
| Rick Wise | Phillies | 1968 | 4.55 | 2.89 | 1.66 |
| Pol Perritt | Cardinals | 1913 | 5.25 | 3.59 | 1.66 |

Qualified starting pitchers.

There are a few Hall of Famers on that list in Blyleven and Wynn, but nobody comes close to what Gray has done thus far. Just to get a few more familiar names, here’s the same list since 1995.

Largest ERA-FIP Since 1995

| Name | Team | Season | ERA | FIP | E-F |
| --- | --- | --- | --- | --- | --- |
| Jon Gray | Rockies | 2018 | 5.77 | 3.08 | 2.69 |
| John Burkett | Rangers | 1998 | 5.68 | 3.89 | 1.78 |
| Ricky Nolasco | Marlins | 2009 | 5.06 | 3.35 | 1.71 |
| Jaime Navarro | White Sox | 1997 | 5.79 | 4.21 | 1.59 |
| Jose Mercedes | Orioles | 2001 | 5.82 | 4.32 | 1.51 |
| LaTroy Hawkins | Twins | 1999 | 6.66 | 5.16 | 1.50 |
| Edinson Volquez | – – – | 2013 | 5.71 | 4.24 | 1.47 |
| Nate Robertson | Tigers | 2008 | 6.35 | 4.99 | 1.36 |
| Derek Lowe | Braves | 2011 | 5.05 | 3.70 | 1.35 |
| Clay Buchholz | Red Sox | 2014 | 5.34 | 4.01 | 1.33 |
| Jose Jimenez | Cardinals | 1999 | 5.85 | 4.53 | 1.32 |
| Zack Greinke | Royals | 2005 | 5.80 | 4.49 | 1.31 |
| Mike Oquist | Athletics | 1998 | 6.22 | 4.93 | 1.30 |

Qualified starting pitchers.

There are some good pitchers on this list, too, including Zack Greinke. The odds are against Gray maintaining such a high difference. With half a season to go, Gray’s ERA is likely to be considerably closer to his FIP moving forward. If, the rest of the way, Gray’s FIP is one run lower than his ERA like it was in 2016, his ERA-FIP will end up right around Chris Bosio’s 1.86 number from 1987. If Gray’s ERA is half a run higher than his FIP like it was last year, he’ll end up with something close to Jaime Navarro’s 1.59 from 1997 and not even crack the top 15 all-time.
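
Those two scenarios are just innings-weighted averages. Because ERA and FIP are both per-inning rates, a full-season ERA-FIP gap is the innings-weighted mean of the gaps from each part of the season. The sketch below assumes Gray throws another 92 innings the rest of the way, matching his first-half total. That workload is an assumption made for illustration, not a projection from any leaderboard, and the function name is hypothetical.

```python
def season_gap(first_ip, first_gap, rest_ip, rest_gap):
    """Innings-weighted ERA-FIP gap over a full season."""
    total_ip = first_ip + rest_ip
    return (first_ip * first_gap + rest_ip * rest_gap) / total_ip


FIRST_IP, FIRST_GAP = 92.0, 2.69  # Gray's 2018 figures so far, from the article
REST_IP = 92.0                    # ASSUMED rest-of-season workload

# Scenario 1: ERA one run above FIP the rest of the way (like 2016).
like_2016 = season_gap(FIRST_IP, FIRST_GAP, REST_IP, 1.0)
# Scenario 2: ERA half a run above FIP the rest of the way (like 2017).
like_2017 = season_gap(FIRST_IP, FIRST_GAP, REST_IP, 0.5)

print(f"2016-style finish: {like_2016:.3f}")
print(f"2017-style finish: {like_2017:.3f}")
```

With equal innings in each half, the two results land at roughly 1.85 and 1.60, in line with the Bosio and Navarro comparisons. A lighter second-half workload would leave the final gap closer to 2.69.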

As we are getting close to the All-Star Break, it might be useful to take a look at the biggest half-season differences from our splits leaderboards, which go back to 2002. Here are the biggest first-half differences for pitchers with at least 70 first-half innings.

Largest ERA-FIP by Half Since 2002

| Name | Team | Season | 1st Half IP | 1st Half ERA | 1st Half FIP | 1st Half ERA-FIP |
| --- | --- | --- | --- | --- | --- | --- |
| Glendon Rusch | MIL | 2003 | 82.1 | 8.09 | 4.38 | 3.71 |
| Jon Gray | COL | 2018 | 92.0 | 5.77 | 3.08 | 2.69 |
| Tim Lincecum | SFG | 2012 | 96.2 | 6.42 | 4.01 | 2.42 |
| Ubaldo Jimenez | BAL | 2016 | 79.1 | 7.03 | 4.63 | 2.40 |
| Zack Greinke | MIL | 2011 | 74.1 | 5.45 | 3.05 | 2.40 |
| Colby Lewis | TEX | 2014 | 84.0 | 6.54 | 4.17 | 2.37 |
| Ricky Nolasco | FLA | 2009 | 90.2 | 5.76 | 3.56 | 2.20 |
| Jake Arrieta | BAL | 2012 | 101.1 | 6.13 | 4.04 | 2.09 |
| Edwin Jackson | TBD | 2007 | 74.1 | 7.26 | 5.19 | 2.07 |
| John Lackey | BOS | 2011 | 79.0 | 6.84 | 4.84 | 2.00 |
| Manny Parra | MIL | 2009 | 71.2 | 6.78 | 4.80 | 1.98 |
| Ryan Dempster | CIN | 2003 | 96.0 | 6.75 | 4.78 | 1.97 |
| Sidney Ponson | BAL | 2004 | 113.0 | 6.29 | 4.35 | 1.94 |
| Edinson Volquez | SDP | 2013 | 109.2 | 5.74 | 3.85 | 1.89 |
| AVERAGE | | | 89.0 | 6.54 | 4.28 | 2.26 |

Min. 70 IP.

That was quite a performance from Glendon Rusch. He would actually go on to have a couple of productive seasons as a Cubs swingman, but 2003 might have soured the Brewers on his future. Scanning the list for similar performances to Gray, another Brewer, Zack Greinke, sticks out with a near-identical FIP to Gray this season. As the average indicates, we have roughly average to maybe below-average pitchers by FIP accompanied by horrendous ERAs. The next table shows how those players performed in the second half.

Second-Half Performance for Largest ERA-FIP

| Name | Team | Season | 1st Half FIP | 1st Half ERA-FIP | 2nd Half IP | 2nd Half ERA | 2nd Half FIP | 2nd Half ERA-FIP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Glendon Rusch | MIL | 2003 | 4.38 | 3.71 | 18.0 | 4.00 | 2.14 | 1.86 |
| Tim Lincecum | SFG | 2012 | 4.01 | 2.42 | 89.1 | 3.83 | 4.36 | -0.53 |
| Ubaldo Jimenez | BAL | 2016 | 4.63 | 2.40 | 52.2 | 2.39 | 3.60 | -1.21 |
| Zack Greinke | MIL | 2011 | 3.05 | 2.40 | 97.1 | 2.59 | 2.92 | -0.33 |
| Colby Lewis | TEX | 2014 | 4.17 | 2.37 | 86.1 | 3.86 | 4.75 | -0.89 |
| Ricky Nolasco | FLA | 2009 | 3.56 | 2.20 | 94.1 | 4.39 | 3.15 | 1.24 |
| Jake Arrieta | BAL | 2012 | 4.04 | 2.09 | 0.0 | | | 0.00 |
| Edwin Jackson | TBD | 2007 | 5.19 | 2.07 | 86.1 | 4.48 | 4.62 | -0.14 |
| John Lackey | BOS | 2011 | 4.84 | 2.00 | 81.0 | 6.00 | 4.58 | 1.42 |
| Manny Parra | MIL | 2009 | 4.80 | 1.98 | 68.1 | 5.93 | 4.96 | 0.97 |
| Ryan Dempster | CIN | 2003 | 4.78 | 1.97 | 16.2 | 6.48 | 6.63 | -0.15 |
| Sidney Ponson | BAL | 2004 | 4.35 | 1.94 | 102.2 | 4.21 | 4.54 | -0.33 |
| Edinson Volquez | SDP | 2013 | 3.85 | 1.89 | 59.2 | 5.73 | 4.98 | 0.75 |
| AVERAGE | | | 4.28 | 2.26 | 66.0 | 4.49 | 4.27 | 0.20 |

Min. 70 IP.

As we might expect, the players’ first-half FIPs line up pretty well with their second-half FIPs. What’s interesting is that the second-half ERAs also line up pretty well with the FIPs from both the first and second halves. While this is what we would expect to see, it’s nice to have it show up so neatly.
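
The group averages behind that observation are easy to reproduce from the table. The Python sketch below copies each pitcher's first-half FIP, second-half ERA, and second-half FIP from the rows above. Treating Jake Arrieta's blank second-half ERA and FIP as missing values is an assumption on my part, though it is the reading that matches the printed AVERAGE row.

```python
from statistics import mean

# (name, 1st-half FIP, 2nd-half ERA, 2nd-half FIP), copied from the table above.
# Arrieta has no second-half ERA or FIP listed, so those are None (assumed missing).
rows = [
    ("Glendon Rusch", 4.38, 4.00, 2.14),
    ("Tim Lincecum", 4.01, 3.83, 4.36),
    ("Ubaldo Jimenez", 4.63, 2.39, 3.60),
    ("Zack Greinke", 3.05, 2.59, 2.92),
    ("Colby Lewis", 4.17, 3.86, 4.75),
    ("Ricky Nolasco", 3.56, 4.39, 3.15),
    ("Jake Arrieta", 4.04, None, None),
    ("Edwin Jackson", 5.19, 4.48, 4.62),
    ("John Lackey", 4.84, 6.00, 4.58),
    ("Manny Parra", 4.80, 5.93, 4.96),
    ("Ryan Dempster", 4.78, 6.48, 6.63),
    ("Sidney Ponson", 4.35, 4.21, 4.54),
    ("Edinson Volquez", 3.85, 5.73, 4.98),
]

# Simple (unweighted) averages, skipping missing values.
first_half_fip = mean(r[1] for r in rows)
second_half_era = mean(r[2] for r in rows if r[2] is not None)
second_half_fip = mean(r[3] for r in rows if r[3] is not None)

print(f"1st-half FIP: {first_half_fip:.2f}")   # 4.28
print(f"2nd-half ERA: {second_half_era:.2f}")  # 4.49
print(f"2nd-half FIP: {second_half_fip:.2f}")  # 4.27
```

These are unweighted averages across pitchers, so Rusch's 18 second-half innings count as much as Ponson's 102.2. An innings-weighted version would be a reasonable alternative.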

One problem the above doesn’t solve is why Gray’s FIP, specifically, is so much lower than his ERA. A portion of the responsibility goes to his home park. Pitchers routinely post higher ERAs than FIPs in Coors Field because BABIP is a lot higher in Coors Field. Balls in play are not incorporated into FIP, so larger swings, like the one we see at Coors Field, are going to drive up ERA a bit. That only explains a very small portion of Gray’s differential, though. For the rest, please see the graph below depicting BABIP and left-on-base percentages for all qualified starting pitchers.

Previous research indicates that a vast majority of the difference between FIP and ERA is due to two factors, the two stats seen in the graph above: BABIP and LOB%. Gray is the worst in both, about 50 points clear in BABIP and with few peers in LOB% this season.
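
For reference, the three statistics in play can be written out from their standard public definitions. The sketch below uses the commonly published formulas for FIP, BABIP, and LOB%. The stat line is entirely hypothetical (it is not Gray's), and the FIP constant of 3.10 is an assumed placeholder, since the real constant is recalculated for each season.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching: homers, walks, HBP and strikeouts only.

    `constant` is an ASSUMED placeholder; the real value varies by season.
    `ip` must be true innings (92 and 1/3 is 92.333, not 92.1).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: homers and strikeouts are excluded."""
    return (h - hr) / (ab - k - hr + sf)


def lob_pct(h, bb, hbp, r, hr):
    """Left-on-base percentage: share of baserunners who did not score."""
    return (h + bb + hbp - r) / (h + bb + hbp - 1.4 * hr)


# HYPOTHETICAL stat line, for illustration only.
line = dict(ip=92.0, ab=370, h=100, hr=11, bb=25, hbp=3, k=110, sf=2, r=60)

example_fip = fip(line["hr"], line["bb"], line["hbp"], line["k"], line["ip"])
example_babip = babip(line["h"], line["hr"], line["ab"], line["k"], line["sf"])
example_lob = lob_pct(line["h"], line["bb"], line["hbp"], line["r"], line["hr"])

print(f"FIP   {example_fip:.2f}")
print(f"BABIP {example_babip:.3f}")
print(f"LOB%  {example_lob:.1%}")
```

Hits on balls in play and runs allowed never enter the FIP function, which is why an unusually high BABIP or an unusually low LOB% shows up in ERA but not in FIP.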

A really poor BABIP might be an indicator that Gray is no longer an MLB-caliber pitcher. The rest of his stats say otherwise, however. Per Baseball Savant, his expected BABIP is about 50 points lower than his actual figure. As league-wide expected BABIP is about 20 points higher than actual BABIP, even once you factor in Coors Field, Gray’s BABIP is about 50 points too high based on the quality of contact, leaving the rest to luck and defense.

As for left-on-base percentage, a really low figure might make us think that perhaps Gray has trouble pitching with runners on base, and that Gray’s 1.91 FIP with bases empty compared to 4.67 with runners on (and his .274 xwOBA with bases empty compared to .334 with runners on) speaks to the same issue. However, the latter number is roughly average for the league regarding xwOBA and pretty close to average for FIP once Coors Field is factored in. Jeff Zimmerman theorized that the issue might be pitching meatballs behind in the count, but even Gray’s numbers behind in the count are similar to an average pitcher’s in those situations. His velocity has been down in his last few starts. Ben Lindbergh noted the absence of competitive pitches from Gray this season. However, none of those theories explains a league-worst left-on-base rate or the massively high BABIP.

The Rockies would have to keep Gray in Triple-A the rest of the season to game his service time, so that is an unlikely motivation. That said, if he hits the disabled list in the minors, he would not accrue MLB service time the way he would if he were DL’d now and something more serious was discovered.

Jon Gray is performing historically so far, but not in the way he would like. Now he’s pitching in a city he’d probably prefer not to. Based on the history of others, as well as his own, there seems to be a pretty good chance, absent injury, that his ERA is going to be headed downward soon, even if the big-league Rockies won’t be seeing the benefit of that downturn at the moment.

Craig Edwards can be found on twitter @craigjedwards.

5 years ago
Reply to  tz

On the positive side, I’m sure the Rockies and their not-at-all-incompetent front office will handle this situation in a way that benefits both Gray & the team in the long run.

5 years ago
Reply to  Sleepy

I wish I had your optimism. This, the Desmond signing (to play 1B!), and the ridiculous amount of money they throw at ok-but-not-great relievers tells me all I need to know about the competence of the Rockies’ front office.

5 years ago
Reply to  Jon

Given the Coors Field Effect, we will have to account for park factors when analyzing how far that went over your head.

5 years ago
Reply to  v2micca

Is it possible that the thin air makes it harder to general manage in Colorado? Maybe folks aren’t getting enough oxygen to their brains.

Tunnel Snakes
5 years ago
Reply to  Sarachim

Maybe it’s the thin air, in conjunction with the great breweries in Denver and the legalized marijuana.

5 years ago
Reply to  Sleepy

Not too often you see a guy demoted who is on his way to a 5-win season.

The difference is so large it can’t possibly be all bad luck. I am sure he has stuff to work on. But I am pretty confident that (a) Craig is right here and he’ll bounce back and (b) there is no way whoever takes his spot in the rotation will be as good.

5 years ago
Reply to  sadtrombone

Well, it’s not as if pitcher WAR is some objective measure. According to bWAR, Gray is on pace for less than 1 WAR.

I prefer fWAR over bWAR for pitchers, but there are some things better represented by bWAR also.

Mark Davidson
5 years ago
Reply to  caesar_solid

What exactly is better presented by bWAR in this scenario? I’m curious because I’m especially leery of RA9 in this situation given the information that his xwOBA is 50 points lower than his current wOBA allowed.

5 years ago
Reply to  Mark Davidson

The argument for using bWAR was never all that strong. It was something about inducing soft contact vs. allowing hard contact, and how that was a skill that bWAR picked up and fWAR didn’t. Of course, bWAR also picks up all kinds of other noise too because it overfits based on defensive statistics. So it didn’t really work very well.

With xwOBA on hand, I think that serves as a much better foil for fWAR than bWAR ever did.

5 years ago
Reply to  Mark Davidson

One valid argument is that FIP consistently overrates certain types of pitchers (usually ones with one great strikeout pitch but everything else they throw is garbage) while underrating others (weak contact pitchers). Another is that while an abstraction like FIP is useful for predicting future performance, it’s not really representing actual past performance.