On Nick Markakis and Defensive Metrics
by Dave Cameron
December 4, 2014

Yesterday, the Atlanta Braves signed Nick Markakis to a four-year, $45 million contract. As Mike Petriello noted, the deal just doesn't make much sense from a numerical perspective, since nearly every piece of evidence we have suggests that Markakis just isn't that valuable a player. I've previously compared Markakis to Nori Aoki, and I'd imagine that Aoki will sign for something like half of what Markakis just received from the Braves.

But if we're looking for an explanation of why the Braves valued Markakis (and they weren't alone, since Atlanta was not the only team bidding for his services), it's actually not much of a puzzle. The sentiment behind the deal can essentially be summed up in this statement:

braves actually replaced 1 gold glover (heyward) with another (markakis. the new RF, who also has GA roots, has a 4-yr deal — Jon Heyman (@JonHeymanCBS) December 3, 2014

While the metrics here will suggest that Jason Heyward is a star and Nick Markakis is a scrub, traditional evaluations are going to make them look very similar. And if you just look at their 2014 offensive numbers, they actually do look pretty similar: both hit for minimal power (.113 ISO for Heyward, .111 for Markakis) while controlling the strike zone, and that combination led to a slightly above-average wRC+ for both. If you also happen to think that the defensive gap between the two is small, then Markakis begins to look a lot like a veteran version of Atlanta's former right fielder.

And if you have never heard of UZR or DRS, you might well think that Markakis is just as good a defender as Heyward. He has won two Gold Glove awards, including one last year. He has a career .994 fielding percentage, seventh highest among active players, and his 93 outfield assists rank behind only Ichiro and Torii Hunter among current outfielders.
Markakis excels at the kinds of things people have historically valued in defenders, so if you evaluate him on those metrics, or by an eye test that leans heavily on those factors, he looks like a pretty good defender. That's why there is such a stark gap between how the advanced metrics and the Fans Scouting Report evaluate Markakis' defensive value. As Jeff noted back in February, the fans have consistently rated Markakis' glovework far higher than the numbers have, putting him nearly 15 runs better than what UZR suggests. If you add 15 runs of defensive value to his ledger, four years and $45 million for Markakis is perfectly rational.

So this signing basically comes down to how much weight you put on modern defensive statistics, and it essentially suggests that the Braves are putting no weight on them whatsoever, rejecting their conclusions entirely. The Braves are simultaneously acknowledging the value of defense (if they didn't think glove work mattered, they wouldn't be paying significant money to a right fielder who hits like a middle infielder) while rejecting statistical attempts to quantify that value.

Plenty of observers are going to agree with that assessment, since defensive metrics certainly are not perfect. But there's a big divide between imperfect and useless, and most of the acknowledged issues with statistical evaluations of defense stem from the limitations of small samples. The year-to-year fluctuations in defensive metrics come from the fact that players are graded on roughly 50 plays per year: the number of chances they get that are neither routine nor impossible. In a sample that size, small fluctuations can lead to big differences; we see the same thing in small samples of offensive metrics, for instance. That's why caution is suggested when evaluating any player's defensive true-talent level based on a single-season calculation.
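To put that 15-run gap in context, here is a minimal sketch of the arithmetic, using two common sabermetric rules of thumb that are my assumptions rather than figures from this article: roughly 10 runs per win, and a free-agent market price of roughly $7 million per win.

```python
# Rough sketch: converting a per-season defensive-runs gap into contract value.
# Both constants below are assumed rules of thumb, not figures from the article.
RUNS_PER_WIN = 10.0          # ~10 runs of value per win
DOLLARS_PER_WIN = 7_000_000  # ~$7M per win on the free agent market

def extra_value(defensive_runs_gap, years):
    """Annual and total surplus value implied by a per-season runs gap."""
    wins_per_year = defensive_runs_gap / RUNS_PER_WIN
    annual = wins_per_year * DOLLARS_PER_WIN
    return annual, annual * years

annual, total = extra_value(15, 4)
print(f"${annual/1e6:.1f}M per year, ${total/1e6:.1f}M over the deal")
# -> $10.5M per year, $42.0M over the deal
```

Under those assumptions, 15 extra runs a year is worth roughly $10.5 million annually, which is close to the entire average annual value of the contract and shows why the deal hinges on that defensive disagreement.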
If all you have is one year of UZR or DRS and you want to know what to expect from that player's defense in the future, you're best served regressing that number heavily toward the mean. How heavily you regress depends on how skeptical you are, but let's say you think there's very little actual information in any one season of UZR or DRS, so you put just 10% weight on the calculation. In other words, if all you knew about a player was that he had a +20 defensive rating, you'd actually think he was a +2 defender, barely above average.

But there's a corollary to accepting that metrics like UZR and DRS contain roughly 10% real information per season: after 10 years, you'd have collected enough information for most of the noise to be filtered out, leaving the remainder as mostly real information. It's one thing to suggest that UZR and DRS need to be regressed heavily after a year or two, but you basically have to think they're entirely worthless to insist on regressing them heavily after a player has racked up 12,000 innings in the field. And there's no reason to believe they're entirely worthless: if you try to build a model to predict team wins, you actually do better by including unregressed defensive metrics in the calculations than by discarding the information.

While many argue that the only way to evaluate defense is by actually watching the games, that is exactly how the data for UZR and DRS are collected. The potential for human error is a legitimate issue in the collection of the data (and the same kind of human error is a legitimate issue for people basing evaluations on their own personal observations), but after a decade of different humans collecting the information, the odds of a systematic bias sustaining itself solely against Markakis are slim indeed.
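The regression described above can be sketched in code. This is a simple shrinkage model of my own construction, not anything taken from UZR or DRS: treat each season's rating as true talent plus noise, and give the observed average a weight that grows with the number of seasons observed.

```python
# A minimal sketch (an assumed model, not the article's) of regression to
# the mean for defensive ratings. A single season gets weight w = 0.10,
# matching the 10% figure in the text; the league mean is 0 runs.
def regressed_estimate(avg_rating_per_season, n_seasons, single_season_weight=0.10):
    """Shrink an n-season average toward the league mean (0 runs).

    Under this model the weight on the observed average grows with
    sample size: weight = n*w / (n*w + (1 - w)).
    """
    w = single_season_weight
    weight = (n_seasons * w) / (n_seasons * w + (1 - w))
    return weight * avg_rating_per_season

# One season at +20 runs regresses to +2, the article's example.
print(round(regressed_estimate(20, 1), 1))   # -> 2.0
# Ten seasons averaging +20 keep a majority of the observed signal.
print(round(regressed_estimate(20, 10), 1))  # -> 10.5
```

With a 10% single-season weight, ten seasons of data earn a bit more than half weight under this model: still regressed, but far from worthless, which is the article's larger point about long-run samples.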
No one thinks defensive metrics are perfect, but neither should anyone think they're completely worthless, especially after a decade of fairly consistent ratings. The Braves are essentially betting that UZR and DRS are so systematically wrong that they add no real information even after 12,000 innings of data collection. I don't happen to think that's a defensible position. Skepticism of defensive evaluations is still perfectly rational; wholesale rejection of them, in a very large sample, is probably not.