As we move into May and people start to check our WAR leaderboards, there will inevitably be a discussion about why certain players rank highly, especially if they aren’t putting up big offensive numbers. Most of the time, that discussion revolves around the player’s defensive value; for example, last August, Alex Gordon sat on top of our WAR leaderboards, which generated a fair amount of controversy at the time.
Here at FanGraphs, we use Ultimate Zone Rating (UZR) as the fielding component of WAR. UZR is one of two defensive run estimators we host here on the site, the other being Defensive Runs Saved (DRS). Both metrics go beyond traditional fielding stats: they use the same Baseball Info Solutions (BIS) data set and assign runs to players by dividing the field into different areas and then comparing each play to the league average. At FanGraphs, we don’t have UZR values for catchers or pitchers, so those positions are simply removed from any data visualizations in this post. We also have great library entries that go over the minutiae of the metrics better than I can in this post.
To help illustrate some of the differences and similarities from this season, I have created an interactive visualization showing the UZR and DRS for players so far this year. In general, whenever you attempt to measure the same underlying talent level in different ways, you are going to have some agreement and some disagreement between the numbers. The scatter plot shows how far the two metrics diverge; the r-squared value between them is 0.44, meaning one metric accounts for a bit less than half of the variance in the other.
Data is current through 5/5/2015, and each player’s data point represents his defensive metrics at all positions.
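For readers who want to check an r-squared figure like the one above on their own export of the leaderboards, here is a minimal sketch. It assumes the value is the square of the ordinary Pearson correlation between each player's UZR and DRS, which is the usual meaning for a two-variable scatter plot but is not spelled out in this post. The function name `r_squared` and the sample numbers are hypothetical placeholders, not real player data.

```python
def r_squared(xs, ys):
    """Return the squared Pearson correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r-squared is undefined when one series is constant")
    return (sxy * sxy) / (sxx * syy)


# Made-up placeholder values, one entry per player. NOT real UZR/DRS data.
uzr = [2.1, -0.5, 1.3, -1.8, 0.4, 3.0]
drs = [3.0, 0.0, 1.0, -3.0, 2.0, 2.0]

print(round(r_squared(uzr, drs), 2))
```

A value near 1 would mean the two metrics rank and scale players almost identically; a value near 0 would mean knowing one tells you little about the other.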
In a perfect world, these measurements would be identical, but unfortunately, fielding data is fairly noisy. Both metrics are attempting to measure the fielding value of each player in the first month of the season, but the measurements are imperfect, with noise arising from both the variables that go into measuring fielding value and the performances of the player. Of course, all baseball stats contain noise (see Dee Gordon, Major League Leader in WAR), but fielding stats do have more than most.
Of course, when the measures agree, it’s easier to have more confidence in the conclusion; I don’t think anyone doubts that Andrelton Simmons has been a big asset to the Braves so far this year, for instance. That said, keep in mind that both systems are using the same data source, so if there were an issue with the quality of the underlying data, neither system would act as a check-and-balance for the other. However, Baseball Info Solutions has been collecting fielding data for over a decade, and the odds of an undiscovered systematic bias are pretty low at this point.
After seeing the correlation between DRS and UZR for plays through early May, I wondered what an entire year of fielding data would produce. More data should yield a clearer picture of the true talent level. Below is the same scatter plot for 2010-2014: I took each player-year with the requisite qualified innings at a given position and plotted them, which gives a more attractive r-squared value of 0.66. When I averaged the stats over the same time frame, there was even more agreement between the two defensive stats; the r-squared value became 0.74.
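The averaging step can be sketched as follows. This is only an illustration under stated assumptions: it takes "averaged the stats" to mean a simple unweighted mean of each player's qualifying seasons, which this post does not confirm, and the names `average_by_player` and `pearson_r2` along with every row of sample data are hypothetical.

```python
from collections import defaultdict


def pearson_r2(xs, ys):
    """Squared Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


def average_by_player(rows):
    """Collapse (player, season, uzr, drs) rows to {player: (mean_uzr, mean_drs)}.

    Assumption: an unweighted mean across seasons, not weighted by innings.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # uzr sum, drs sum, season count
    for player, _season, uzr, drs in rows:
        entry = totals[player]
        entry[0] += uzr
        entry[1] += drs
        entry[2] += 1
    return {p: (u / n, d / n) for p, (u, d, n) in totals.items()}


# Made-up placeholder player-seasons. NOT real UZR/DRS data.
rows = [
    ("Player A", 2013, 6.0, 10.0),
    ("Player A", 2014, 4.0, 2.0),
    ("Player B", 2013, -3.0, 1.0),
    ("Player B", 2014, -1.0, -5.0),
    ("Player C", 2013, 0.5, -2.0),
    ("Player C", 2014, 1.5, 3.0),
]

# r-squared across individual player-seasons.
per_season = pearson_r2([r[2] for r in rows], [r[3] for r in rows])

# r-squared across multi-year averages, one point per player.
averaged = average_by_player(rows)
multi_year = pearson_r2(
    [uzr for uzr, _ in averaged.values()],
    [drs for _, drs in averaged.values()],
)

print(round(per_season, 2), round(multi_year, 2))
```

Averaging tends to raise the agreement because season-to-season noise in each metric partly cancels out, leaving more of the shared signal. Weighting each season by innings played would be a reasonable alternative to the unweighted mean used here.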
With the potentially exciting development of tracking data such as MLB’s Statcast system, we’ll likely be able to improve and refine these metrics in the future, giving everyone a better understanding of defensive value. But while the current numbers are imperfect, they are also worth keeping an eye on, and incorporating into our overall view of a player’s defensive value to his team.
I code a bunch of things here. I really need to update my blog about statistics at stats.seandolinar.com.