Let’s Play With New Defensive Data

March 14, 2017

Here’s the thing about catch probability: It’s existed for close to two decades. It’s at the heart of what we refer to as the advanced defensive metrics. Defensive Runs Saved and Ultimate Zone Rating couldn’t exist, let alone do anything, without catch probabilities in some form. It’s just that, for the longest time, those probabilities were generalized, educated guesses. You might’ve heard that baseball has entered the information era.

Here’s a weekend tweet from Daren Willman:

Here’s the new @statcast catch probability leaderboard we introduced @sabr https://t.co/I1NDOCRVFO

— Daren Willman (@darenw) March 12, 2017

If you missed the link in there somehow, here it is again: the Statcast Catch Probability Leaderboard. We have most of two years of Statcast information, and now we’re getting to see it applied to player defense. Specifically, in this case, outfielder defense. If you don’t entirely understand what catch probability is, here’s the MLB.com glossary entry. Take a given fly ball or line drive to the outfield. What are the odds a given batted ball is caught? Statcast can tell us, by considering hang time and necessary distance to cover. This is the start of something beautiful.

When you have one catch probability, that’s pretty cool. When you have a lot of them, you start to understand how good or bad certain players are at making catches. As with anything else, the data adds up over time. And now I should say a few things before going any further. People like Daren Willman and Mike Petriello are going to treat this data most responsibly. They’re there on the inside, and they know what should or shouldn’t be done with the numbers. And the numbers will probably be tweaked down the line, since for now they don’t consider directionality or proximity to the wall. This is a simple first attempt. And for what’s about to follow, I’ve performed my analysis using buckets. Not every player’s bucket is the same! Let me explain. Here are the existing buckets, and their actual overall conversion rates since 2015:

5 Star Plays: 8% caught
4 Star Plays: 42% caught
3 Star Plays: 68% caught
2 Star Plays: 84% caught
1 Star Plays: 93% caught

Let’s take…Charlie Blackmon and Gerardo Parra? Sure, why not. Over the last two years, Blackmon has made five 5 Star plays, out of 69 opportunities. Parra has made six 5 Star plays, out of 57 opportunities. Based on the overall average for that bucket, Blackmon “should have” made 5.9 plays, and Parra “should have” made 4.8. But maybe Blackmon’s actual opportunities averaged out to a 10% catch rate, or a 5% catch rate. Maybe the same applies to Parra instead. Not every player’s opportunity buckets are equal, so everything, everything here is just an estimate. Statcast hasn’t yet eliminated defensive error bars.
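For anyone who wants to follow along, here is a minimal sketch of the bucket math described above. The function names are my own, and the rates are the rounded percentages from the list, so this is an approximation: with a flat 8%, the 5 Star estimates come out to about 5.5 for Blackmon and 4.6 for Parra, a little below the 5.9 and 4.8 quoted, presumably because the quoted figures used an unrounded rate (that part is an assumption).

```python
# League-wide conversion rates by star bucket, as listed in the article.
# These are rounded percentages, so results are approximate.
BUCKET_RATES = {5: 0.08, 4: 0.42, 3: 0.68, 2: 0.84, 1: 0.93}


def expected_plays(opportunities, rates=BUCKET_RATES):
    """Expected catches for a {stars: opportunities} mapping."""
    return sum(n * rates[stars] for stars, n in opportunities.items())


def plays_above_expected(made, opportunities, rates=BUCKET_RATES):
    """Actual catches minus expected catches across all buckets given."""
    return sum(made.values()) - expected_plays(opportunities, rates)


# 5 Star plays only, using the counts quoted in the article.
blackmon = plays_above_expected({5: 5}, {5: 69})
parra = plays_above_expected({5: 6}, {5: 57})
print(round(blackmon, 2), round(parra, 2))
```

A full +/- would pass all five buckets in each mapping. The same caveat applies: every opportunity in a bucket is treated as if it carried the bucket's average catch rate.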

I just can’t help but get into the data. It’s presented in a way similar to the Inside Edge data we already have on the site, but the Statcast information should be better. So, even considering the caveats I’ve mentioned, how about we look at some of the best Statcast defensive outfielders? On the left side of this table are the top 10 players in plays made over expected. On the right side, the same, but per 1,200 innings.

Statcast Outfield Defense, 2015 – 2016

Player	+/- Plays	Player	+/- Plays/1200
Kevin Kiermaier	46	Kevin Kiermaier	26.7
Billy Hamilton	40	Billy Hamilton	25.0
Lorenzo Cain	37	Jake Marisnick	24.0
Jake Marisnick	34	Lorenzo Cain	21.5
Mookie Betts	32	Juan Lagares	19.4
Ender Inciarte	32	Byron Buxton	19.2
Adam Eaton	27	Ender Inciarte	17.5
Kevin Pillar	24	Travis Jankowski	16.4
Jason Heyward	24	Keon Broxton	15.9
Odubel Herrera	23	Peter Bourjos	15.1

SOURCE: Baseball Savant

I’m not sure there’s a surprise in the bunch. Which is probably more of a good sign than a bad one — one wouldn’t think we’ve been completely wrong all this time. For as much as people have openly criticized the advanced defensive numbers, I think the bulk of the disagreement has centered on infield play, especially in the age of infielders moving around all over the place. We’ve long had a pretty good grasp on the outfield, I think. Statcast here mostly supports the information we already had. Kevin Kiermaier? Amazing! Billy Hamilton? Amazing! Keon Broxton? You better believe he’s amazing!

Maybe one way of interpreting this is as further evidence that Kiermaier has been better out there than Kevin Pillar. I know that’s been fiercely debated, but Statcast knows more than most of us do. There’s still room for these numbers to be adjusted, so Blue Jays fans can continue to take some heart. Travis Jankowski has apparently got it. Peter Bourjos has apparently still got it.

Moving on, we go to the other end.

Statcast Outfield Defense, 2015 – 2016

Player	+/- Plays	Player	+/- Plays/1200
Matt Kemp	-39	Hanley Ramirez	-23.7
Mark Trumbo	-27	Mark Trumbo	-22.1
Andrew McCutchen	-22	Matt Kemp	-18.3
Melky Cabrera	-21	Robbie Grossman	-18.1
Nick Markakis	-19	Danny Valencia	-13.6
Ryan Braun	-17	Preston Tucker	-13.4
Yasmany Tomas	-16	Tyler Naquin	-13.3
Hanley Ramirez	-15	Daniel Nava	-13.1
Jeff Francoeur	-13	Jeff Francoeur	-13.0
Jorge Soler	-13	Jorge Soler	-13.0

SOURCE: Baseball Savant

Again, this largely supports what many already suspected. This is why the offensive bar for Matt Kemp is so high, at least for as long as he’s in the National League. I don’t know the exact run value of an outfield play not made, but my ballpark guess is right around one run. That, in turn, would put Kemp around -39 runs over two seasons. Now, we can’t ignore Mark Trumbo, though. Kemp is at -39 plays in 2,575 outfield innings. Trumbo is at -27 plays in 1,461 outfield innings. Which means Trumbo comes out worse on a rate basis, better than only Hanley Ramirez, who sure as heck doesn’t play outfield anymore. The Orioles know that Trumbo is a bat-first player. They might not fully appreciate the extent to which that’s been true. Trumbo and Kemp are not so dissimilar.
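The rate comparison is simple enough to sketch. The helper name below is made up for illustration, and the inputs are the rounded totals quoted above, which is likely why the results land a tenth off the table’s -18.3 and -22.1.

```python
def per_1200(plays_above_expected, innings, scale=1200):
    """Scale a raw +/- total to a rate per `scale` outfield innings."""
    return plays_above_expected / innings * scale


kemp = per_1200(-39, 2575)
trumbo = per_1200(-27, 1461)
print(round(kemp, 1), round(trumbo, 1))  # about -18.2 and -22.2

# Under the article's ballpark guess of one run per missed play,
# the raw +/- doubles as a rough run estimate (an assumption, not a
# measured run value).
RUNS_PER_PLAY = 1.0
print(-39 * RUNS_PER_PLAY)
```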

Trumbo is a league-worst -9 on 1 Star plays. Those are the easy ones, and Trumbo has converted nine fewer of them than you’d expect. Kemp has the worst mark on 5 Star plays and 3 Star plays, and he’s second-worst on 4 Star plays. Kiermaier is the easy leader in 4 Star plays; Hamilton leads everyone in the 5 Star category. Hamilton has made the greatest number of the most sensational catches, but Kiermaier has him beat elsewhere.

We’re always interested in the surprises. So I ran some quick and easy math to try to figure out who those might be. For every regular or semi-regular outfielder over the past two years, I calculated +/- from the Inside Edge data set. I also gathered the range component of DRS, and the range component of UZR (because catch probability has nothing to do with, say, throwing arm). I then ran a regression using these three advanced metrics against the Statcast numbers. From there I could calculate an “expected” Statcast +/-.
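To make that concrete, here is a rough sketch of the kind of regression described: ordinary least squares with an intercept and three predictors, written in plain Python. It is not the exact procedure used for this post, which isn’t spelled out beyond the paragraph above. The function names are hypothetical, and the numbers in `toy_rows` and `toy_statcast` are invented placeholders, not real player data.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares fit via the normal equations.

    rows: one tuple of predictors per player. Returns [intercept, b1, ...].
    """
    X = [[1.0, *r] for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Expected Statcast +/- for one player's predictor values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# INVENTED toy inputs: (Inside Edge +/-, DRS range, UZR range) per player.
toy_rows = [(10, 8, 6), (-5, -3, -7), (2, 5, -1),
            (-8, -10, -4), (15, 9, 12), (0, -2, 3)]
toy_statcast = [12, -9, 6, -3, 10, -1]  # INVENTED Statcast +/- values

coefs = fit_ols(toy_rows, toy_statcast)
# Residual = actual Statcast +/- minus expected; the extremes are the outliers.
residuals = [actual - predict(coefs, row)
             for row, actual in zip(toy_rows, toy_statcast)]
print(sorted(residuals, reverse=True))
```

Large positive residuals would flag players the older metrics may have underrated, and large negative ones the reverse.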

Important! UZR and DRS are adjusted for position. The Statcast information is not. So, in the following analysis, center fielders will look a little better, and corner outfielders will look a little worse. But anyway, here’s Statcast +/- against expected +/-:

There’s a good and pretty linear-looking relationship. That’s what you’d like to see, if the data we already had was worth anything. But you’re here for the outliers! Here are the five players with the biggest positive differences between Statcast +/- and expected +/-:

Adam Eaton
Adam Jones
Leonys Martin
Jake Marisnick
Lorenzo Cain

These are players our numbers have probably underrated a little bit. Eaton, for example, might well be a defensive superstar. Meanwhile, here are the five players with the biggest negative differences between Statcast +/- and expected +/-:

Nick Markakis
Matt Kemp
Kole Calhoun
Bryce Harper
Curtis Granderson

These are players our numbers have probably overrated a little bit. This is too simple an analysis to grant too much weight, especially given the issues with playing different positions, but there should be some substance at the extremes, and these last 10 players listed are extreme. It’s something to think about as we all move ahead, eager to learn more from the information system we’ve wanted to have forever.

Some of this, I’m sure, has been irresponsible, but who can think so clearly when given access to a new source of data? One of the major upsides of the Statcast system installation in the first place was that it seemed like it could give us unprecedented defensive clarity. We’re obviously not there yet, but every week brings us closer.

48 Comments

Joshua Miller

8 years ago

So it looks like Adam Jones was right: the Orioles really do need to get more athletic in the corners (not that we didn’t already know that). It’s too bad Trumbo still appears slated to play a decent chunk in right, and Seth Smith probably isn’t much better.

Chimendime

Reply to Joshua Miller

Trumbo is not playing outfield this season. Rickard and Gentry will get the OF playing time.

Reply to Chimendime

I’ll believe it when I see it. Rickard sucks, and Gentry hasn’t gotten more than 200 PAs since 2014. Trumbo is still projected to get the majority of RF PAs here on FanGraphs, and he should only get more time in right with the Alvarez signing.

formerly matt w

Although Alvarez will be practicing in the outfield.

bosoxforlife

Reply to formerly matt w

Which should send all Oriole fans into a catatonic state.

Reply to bosoxforlife

He can’t make the throws to play 3rd and couldn’t stick at 1st. Just imagine him in RF. Ooof.

isaacmeep

I agree Rickard is awful, but Gentry has had injury issues from 2014 on. If he’s anywhere near the player he used to be, he might actually end up the everyday RFer.

Alvarez will be in AAA, as will Mancini, both of whom are now outfielders.

Also, I wouldn’t completely write off this year’s Rule 5 guy Tavarez. He’s looked good this spring and could actually get legitimate OF time this year.

Reply to isaacmeep

Even at his best Gentry is more of a AAAA guy.

bohknowsbmore

9.7 WAR over the period 2011-2014 is far more than AAAA.