An Annual Reminder from Eric Hosmer and Adam Jones

If you woke up this morning, looked at the WAR Leaderboards for position players and saw Mike Trout, Jose Altuve, and Manny Machado near the top, you might have had a sense that all is right with the world. After all, those three players are some of the very best in major-league baseball, and we would expect to see them at the top of the list. Of course, when you look closely at the leaderboard, it’s important to note that there are 171 qualified players. To regard the WAR marks as some sort of de facto ranking for all players would be foolish. For some players, defensive value has a large impact on their WAR total, and it’s important, when considering WAR values one-third of the way into the season, to consider the context in which those figures were produced.

“Small sample size” is a phrase that’s invoked a lot throughout the season. At FanGraphs, we try to distinguish what might be a small-sample aberration from what could be a new talent level. Generally speaking, the bigger the sample size, the better — and this is especially true for defensive statistics, where we want a very big sample before drawing conclusions about a player’s talent level. Last year, I attempted to provide a warning on the reliability of defensive statistics. Now that the season has reached its third month, it’s appropriate to revisit that work.

Before getting to this year’s players, let us first repeat the necessary caveats when looking at defensive numbers. First, from Mitchel Lichtman’s UZR primer:

Most of you are familiar with OPS, on base percentage plus slugging average. That is a very reliable metric even after one season of performance, or around 600 PA. In fact, the year-to-year correlation of OPS for full-time players, somewhat of a proxy for reliability, is almost .7. UZR, in contrast, depending on the position, has a year-to-year correlation of around .5. So a year of OPS data is roughly equivalent to a year and half to two years of UZR.

Keeping in mind these sample-size constraints, last season I identified players for whom we had a three-year defensive track record and then looked for outliers to find players who might be underrated by WAR at the time. The five biggest negative disparities between track records and current 2015 numbers came from Dustin Pedroia, Jason Heyward, Ian Desmond, Denard Span, and Chase Headley. Span played in only about 20 more games the rest of the season and Pedroia missed a month, but the rest of the group played roughly a full season. All improved their numbers.

The chart below simply shows DEF (which includes positional adjustment) at the one-third mark last season, DEF the rest of the way, and the difference between the two.

2015 SSS Defensive Warning and ROS Performance
Player          | 1/3 DEF | ROS DEF | Difference
Dustin Pedroia  |    -0.6 |     4.9 |        5.5
Ian Desmond     |    -2.6 |     5.9 |        8.5
Jason Heyward   |    -1.4 |    17.8 |       19.2
Denard Span     |    -4.5 |     0.5 |        5.0
Chase Headley   |    -5.5 |     4.7 |       10.2

Defensive numbers take longer to stabilize, and even one year isn’t a considerable amount of time, but the supposedly slow starts by the players above were more likely a function of the small sample size rather than some actual change in talent.

When running the numbers this year, I found another group of players whose current WAR totals might underestimate their actual contributions, especially if one is trying to use current numbers to put someone “on pace” for some total at the end of the season. When conducting the exercise this year, I purposely limited the group of players in the sample, using only qualified players from 2013 to 2015 who’d recorded at least 3,000 innings in the field over the course of those seasons. This resulted in a sample of just 56 players. That’s not ideal, perhaps, but this way we can get a pretty fair assessment of the players’ talent levels heading into this season. After identifying the sample, I then took the total fielding-run marks from the three seasons, divided by three to get a seasonal average, then divided by three again to get an expected number for this point in the season. (Note: this is also known as “dividing by nine.”)
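
The arithmetic above is simple enough to sketch directly; the fielding-run total in the example is hypothetical:

```python
# Sketch of the expectation described above: take three seasons of
# fielding runs, average to a per-season rate, then scale to one-third
# of a season. Equivalent to dividing the three-year total by nine.
def expected_third_season(fielding_runs_2013_2015):
    """fielding_runs_2013_2015: total fielding runs over the three seasons."""
    seasonal_avg = fielding_runs_2013_2015 / 3  # average runs per season
    return seasonal_avg / 3                     # expected runs at the 1/3 mark

# Hypothetical example: +18 fielding runs across 2013-2015.
print(expected_third_season(18))  # 18 / 9 = 2.0
```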

I compared the 2013-2015 numbers for all the relevant players with the players’ numbers this year. Most of the players weren’t far off from their expected numbers. Thirty-two of 56 players had recorded a mark within three runs of their averages; 44 players were within five runs of expectations. There were six players who’d recorded at least six runs below expectations at this point in the season.
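
The comparison described above amounts to counting how many players sit within a given number of runs of their expected one-third-season mark. A minimal sketch, using the expected and actual DEF values for the six flagged players from the tables in this piece:

```python
# Sketch: bucket players by how far their 2016 DEF sits from the
# 1/3-season expectation built from their 2013-2015 averages.
def count_within(expected, actual, runs):
    """Count players whose actual mark is within `runs` of expected."""
    return sum(1 for e, a in zip(expected, actual) if abs(a - e) <= runs)

# The six players flagged in this piece (LeMahieu, Hosmer, Frazier,
# Bruce, Jones, Reddick), values taken from the tables below.
expected = [2.6, -3.3, 3.3, -2.2, 1.7, 0.4]        # 1/3-season DEF AVG
actual   = [-3.7, -11.7, -3.1, -15.5, -4.9, -6.1]  # 2016 DEF to date

print(count_within(expected, actual, 5))  # -> 0; all six miss by 6+ runs
```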

First, the infielders:

Potentially Underrated WAR Due to Defense
Player       | 2016 WAR | 2013-2015 UZR AVG | 2013-2015 DRS AVG | 2013-2015 1/3-Season DEF AVG | 2016 DEF | 2016 DEF Difference
DJ LeMahieu  |      0.5 |               5.9 |               9.7 |                          2.6 |     -3.7 |                -6.3
Eric Hosmer  |      0.7 |               1.0 |               2.3 |                         -3.3 |    -11.7 |                -8.4
Todd Frazier |      0.7 |               7.8 |               6.0 |                          3.3 |     -3.1 |                -6.4

The columns above show the annual averages for UZR and DRS before getting to the 1/3 DEF averages (which, again, include positional adjustment), as well as the difference in 2016 thus far. Prior to this season, all three players had been average (in Hosmer’s case) to solidly above average (for LeMahieu and Frazier) by both DRS and UZR. Looking at this year’s numbers, we might suppose they’ve been terrible, but odds are this is just a small-sample blip and that the numbers will stabilize as the season wears on.

LeMahieu might not be an All-Star like he was the first half of last season, but he is better than a replacement player. Hosmer is having an excellent offensive season (144 wRC+), but his 0.7 WAR (108 out of 171 qualified batters) doesn’t represent how good he’s been this season. Frazier has been solid with the bat (110 wRC+) even with his .191 BABIP weighing him down, but his WAR number doesn’t quite represent where he stands among the rest of the league.

On to the outfielders:

Potentially Underrated WAR Due to Defense
Player       | 2016 WAR | 2013-2015 UZR AVG | 2013-2015 DRS AVG | 2013-2015 1/3-Season DEF AVG | 2016 DEF | 2016 DEF Difference
Jay Bruce    |     -0.2 |               0.0 |               5.0 |                         -2.2 |    -15.5 |               -13.3
Adam Jones   |      0.1 |               2.9 |               2.0 |                          1.7 |     -4.9 |                -6.6
Josh Reddick |      0.7 |               6.5 |               8.0 |                          0.4 |     -6.1 |                -6.5

I’m not going to tell you Jay Bruce is good on defense. His numbers have been bad since the beginning of 2014, so Bruce is likely a poor outfielder. He might fit better as a designated hitter on some club. However, he likely isn’t the worst outfielder of all time, like these numbers make him out to be. Over the last five years, no right fielder has even broken -20 UZR over the course of a season. Bruce, meanwhile, is “on pace” to double that. Even if you think Bruce is in line to be one of the worst right fielders of this decade, his final numbers are likely to be closer to half his current pace.

Adam Jones has had fluctuating seasonal numbers throughout his career, but even taking a negative view, he likely isn’t this bad. From 2013 to 2015, Jones had 31 assists, and much of his defensive value in UZR derives from his arm, but Jones has just one assist this season. Jones’ low number is likely a matter of opportunities, which should even out over the course of the season. As for Reddick, currently on the disabled list, his defense is likely not as good as it was when he was one of the better defensive right fielders in baseball, but he also likely has not turned into one of the worst right fielders in baseball overnight.

When using any statistics, context is important. Some might choose to dismiss a statistic if it appears to have flaws. Others might choose to examine those apparent flaws to get a better understanding of how to use a particular statistic. This is important to do with defensive metrics. A player gets four or five plate appearances with somewhat similar difficulty every time up. A fielder doesn’t get the same opportunities and some plays are vastly more difficult than others. Defensive metrics capture that difficulty, which is not an easy thing to do. That means we should exercise a bit more patience when it comes to dismissing them or drawing conclusions.

Craig Edwards can be found on Twitter @craigjedwards.

London Yank

Headley’s poor defensive performance was definitely a result of a change in ability. He developed a case of the yips throwing to first base and made a ton of errors because of it. I suppose you can think of these sorts of real fluctuations in ability as part of the sampling error inherent in these measurements; however, I don’t think this is what people generally have in mind when they think of sample-related variation.