Gold Gloves Are About Defense… Right?

Juan Soto
Bill Streicher-USA TODAY Sports

When Juan Soto was announced as a finalist for the Gold Glove this year, I was perplexed. To say that the right fielder struggled with his defense this season would be an understatement. As recently as Game 2 of the NLCS, he was still making (or, failing to make) plays like this:

So I turned to Rawlings’ official website to get a better understanding of how Gold Gloves are won. In order to qualify for a Gold Glove, infielders and outfielders must have played in the field for at least 698 innings through their teams’ first 138 games. Maybe Soto grades out better in this subsample. In the absence of custom date ranges for advanced fielding statistics, I compared Soto to the 76 other players who were in the outfield for at least 698 innings on the season. Here is where he ranked:

Juan Soto Qualified Outfielder Ranks
Metric  Rank (of 77)
DRS     T-48
UZR     T-56
OAA     77

Both Defensive Runs Saved (DRS) and Ultimate Zone Rating (UZR) peg him as below average, and Outs Above Average (OAA) has him as abysmal, ranking him dead last.
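As a rough illustration of how ranks like these can be produced (this is not the code behind the table), here is a minimal Python sketch that applies the 698-inning threshold and then assigns ranks with a "T-" prefix for ties. The player names, innings, and OAA values are hypothetical placeholders, and the `Outfielder` class and `competition_ranks` helper are names invented for this example.

```python
from dataclasses import dataclass

MIN_INNINGS = 698  # Rawlings innings threshold quoted in the article


@dataclass
class Outfielder:
    name: str       # hypothetical placeholder, not a real player
    innings: float  # defensive innings played
    oaa: int        # Outs Above Average (made-up values)


def competition_ranks(players, key):
    """Rank players by `key`, highest first; tied players share a 'T-' rank."""
    values = sorted((key(p) for p in players), reverse=True)
    ranks = {}
    for p in players:
        value = key(p)
        rank = values.index(value) + 1  # 1 + number of strictly better values
        tied = values.count(value) > 1
        ranks[p.name] = f"T-{rank}" if tied else str(rank)
    return ranks


players = [
    Outfielder("Player A", 1100.0, 12),
    Outfielder("Player B", 950.0, 3),
    Outfielder("Player C", 900.0, 3),
    Outfielder("Player D", 1200.0, -16),
    Outfielder("Player E", 400.0, 20),  # under the threshold, so excluded
]

# Keep only players who meet the innings requirement, then rank by OAA.
qualified = [p for p in players if p.innings >= MIN_INNINGS]
ranks = competition_ranks(qualified, key=lambda p: p.oaa)

for name, rank in ranks.items():
    print(name, rank)
```

With this toy data, Player B and Player C share second place and Player D is last among the four qualifiers, which mirrors the "T-48" and "77" style of entries in the table.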

Going just off of those numbers, Soto makes no sense as a finalist. However, there’s another key Rawlings criterion that may have made him more appealing: infielders and outfielders who played the field for the requisite 698 innings only qualify at the position they played the most. Due to this added wrinkle, there were only five other outfielders besides the three finalists in the National League who could be considered: Hunter Renfroe, Nick Castellanos, Starling Marte, Seiya Suzuki, and Randal Grichuk. Here is how Soto ranks in the group of six made up of himself and those five:

Juan Soto Qualified Right Fielder Ranks
Metric  Rank (of 6)
DRS     4
UZR     3
OAA     6

Still not great. But there’s one thing that elevates Soto above all five of those players: his offensive production. Per Rawlings’ official site, Gold Gloves do not take offense into account and are decided in part by a fielder’s SABR Defensive Index, a composite measure of defense that includes six distinct data sources. But there is also a voting component — specifically, the manager and six coaches from each team vote. Besides the inability to vote for their own players or players outside of their league, there are no restrictions on who they can vote for among the qualifiers.

It stands to reason that, given how people view Soto as a generational talent based on all of his skills combined, they will extend that view to each of his individual skills, including his defense. This is an example of the halo effect, a mental shortcut that describes the human tendency to attribute specific positive characteristics to another person or thing based on a generally positive opinion of them.

I ran some statistical tests on this year’s and last year’s Gold Glove finalists to put this theory to the test. First, I compared Gold Glove finalists to their qualified peers. For infielders and outfielders, this meant playing at least 698 innings of defense on the season (all positions played combined). For catchers, the Rawlings criterion is that they must have played in at least 69 of their team’s first 138 games to qualify for Gold Glove consideration. It does not specify a minimum number of innings, nor does it specify whether all 69 of those games had to have come at catcher. So I counted a player as a qualified catcher as long as they played at least 69 games on the season, their primary position was catcher, and they logged at least 500 total innings in the field. I also excluded the new utility category, since including it would make comparisons across years difficult. Here are the results:

GG Finalists vs. Non-Finalists, wRC+
Year  Finalists  Non-Finalists
2021  109        103
2022  110        102

Not only are these numbers almost identical across the two seasons, but when you put them together, they also form one statistically significant result: on average, Gold Glove finalists are significantly better hitters than their qualified peers.
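The article does not say which significance test was used, so the following is only a sketch of one reasonable option: a one-sided permutation test on the difference in mean wRC+ between two groups. The wRC+ values below are hypothetical, and `permutation_test` is a helper written for this example rather than a library function.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=5_000, seed=0):
    """One-sided permutation test of whether mean(group_a) > mean(group_b).

    Returns the observed difference in means and an approximate p-value.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    # The add-one correction keeps the estimated p-value above zero.
    p_value = (at_least_as_large + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical wRC+ values, not real player data.
finalists = [130, 118, 95, 112, 104, 121, 99, 108]
non_finalists = [101, 88, 110, 97, 105, 92, 115, 84, 100, 96, 107, 90]

observed_gap, p_value = permutation_test(finalists, non_finalists)
print(f"difference in mean wRC+: {observed_gap:.3f}")
print(f"approximate one-sided p-value: {p_value:.4f}")
```

A permutation test makes no assumption about how wRC+ is distributed, which is convenient with groups this small. A two-sample t-test would be a common alternative.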

Naturally, Gold Glove finalists also tend to grade out as better fielders than their qualified peers:

GG Finalists vs. Non-Finalists, Fielding
Year  Metric  Finalists  Non-Finalists
2021  DRS     7.13       -0.67
2021  UZR     3.18       -0.33
2021  OAA     5.74       -0.28
2022  DRS     9.29       -1.18
2022  UZR     4.65       -0.69
2022  OAA     6.29       -0.61

Piecing these last two findings together, you might be wondering if better hitters are also better fielders in general. In other words, is the halo effect a good bias to have when it comes to choosing Gold Glove winners?

Well, for this year and last, wRC+ was not significantly correlated with either DRS or UZR among qualified fielders. This year, wRC+ was weakly correlated with OAA, but not in the way you might think:

On average, a one-point increase in wRC+ corresponded with a 0.51-point decrease in OAA. Among Gold Glove finalists across the last two years, the trend was even more stark: a one-point increase in wRC+ corresponded with a 1.37-point decrease in OAA.

Further solidifying this finding, DRS was also negatively correlated with wRC+ among finalists: a one-point increase in wRC+ corresponded with a 0.89-point decrease in DRS. OAA is not calculated for catchers, but DRS is, so the DRS result is an important addition: it covers the catchers that the OAA comparison leaves out. Overall, it seems the worst-fielding Gold Glove finalists tended to be the best-hitting. Coupled with the fact that Gold Glove finalists tend to fare better at the plate than other qualified fielders, this leads me to believe that the halo effect helps bolster certain candidacies on the basis of their hitting prowess.
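Statements of the form "a one-point increase in wRC+ corresponded with an X-point decrease" read like the slope of a simple linear regression. Assuming that is what was fit (the article does not spell out the model), a minimal sketch of the calculation looks like this. The data points are hypothetical and `ols_slope_and_r` is a helper defined for this example.

```python
from statistics import mean


def ols_slope_and_r(x, y):
    """Return the least-squares slope of y on x and the Pearson correlation."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx               # change in y per one-unit change in x
    r = sxy / (sxx * syy) ** 0.5    # strength and direction of the relationship
    return slope, r


# Hypothetical finalists: wRC+ on the x-axis, OAA on the y-axis.
wrc_plus = [90, 100, 110, 120, 130]
oaa = [8, 6, 5, 1, 0]

slope, r = ols_slope_and_r(wrc_plus, oaa)
print(f"slope: {slope:.2f} OAA per point of wRC+")
print(f"correlation: {r:.3f}")
```

For this toy data the slope is -0.21, meaning each extra point of wRC+ goes with 0.21 fewer Outs Above Average. The slope reports the size of the trend, while the correlation coefficient reports how tightly the points follow it.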

I say certain candidacies because this isn’t always the case. Take Jackie Bradley Jr., who has a great reputation with the glove but has struggled with the bat the last two years to the tune of a 46 wRC+. Though his defense hasn’t been amazing, he has been a Gold Glove finalist each of the last two years likely due to his storied fielding fame. If a player’s candidacy is borderline, like Soto’s and Bradley’s, reputation seems more likely to play a role. For Bradley, a past Gold Glove lies behind the reputation; for Soto, it’s more likely that his hitting ability has subtly influenced how the coaches who vote view him.

There was one more test I wanted to run. How much does any of this really matter in terms of deciding the ultimate Gold Glove winner? While Soto is a finalist, he is only one of three alongside Daulton Varsho and Mookie Betts, the top two in OAA among NL right fielders. The chances of Soto winning are slim, right? Well, the ultimate winners last year outpaced the losing finalists in wRC+ 114 to 107 on average. It’s a small sample — I could only use 16 winning hitters and 32 losing hitters — but the result bears mentioning.

Soto only bested Betts by one point in wRC+, and Betts is clearly the better defender, so I would still put my money on the Dodger garnering more votes than the Padre. And we can’t forget Varsho, who was tops in the NL at his position in OAA. But keep an eye on this year’s winners overall to see if the trend holds.





Alex is a FanGraphs contributor. His work has also appeared at Pinstripe Alley, Pitcher List, and Sports Info Solutions. He is especially interested in how and why players make decisions, something he struggles with in daily life. You can find him on Twitter @Mind_OverBatter.

47 Comments
sogoodlooking
1 year ago

It’s still baffling that writers will refer to Soto as a ‘generational’ talent when he’s a 5-6 win DH, particularly now that we have DH’s in the NL. He’s Bryce Harper with more walks and fewer HRs. A majority of his seasons have OPS+’s under 150. He just showed he could slump for 3/4 of a season. His final line was .242/.401/.452. He’s had two seasons over 3.7 fWAR.

Soto is a heckuva hitter. He is not a generational talent.

tipaion
1 year ago
Reply to  sogoodlooking

His closest B-R comparable through 22 was Mike Trout. His closest comparables now, through his age 23 season, are Bryce Harper (after his nine win season), Frank Robinson, and Mickey Mantle.

The “two seasons over 3.7 fWAR” argument is a real surface reading of the data that ignores his 2.6 fWAR in 47 games in 2020. Would he have put up 7-8 wins if that season had lasted 162 games? Probably not! But he’s averaged 5.9 per 162 even with a couple of slumps at the plate that were generally agreed to have reflected some poor luck rather than a serious issue with his swing, and baseball players simply don’t do that at 23… also, wait, why is Bryce Harper with more walks a bad thing? This confuses me more and more the more I read it back.

tipaion
1 year ago
Reply to  tipaion

I’ve just noticed Harper’s closest B-R comparable through 29 is Barry Bonds. On the one hand, this illustrates pretty well why I shouldn’t be consistently relying on them in these kinds of arguments. On the other, this is exactly why I find them so entertaining. That’s hilarious.

D.K. Willardson
1 year ago
Reply to  tipaion

..but there *are* serious swing issues that have cropped up. His vertical bat angle has gone from slightly below average to super flat ~ 4th flattest in all of MLB. (Players with low VBA have much less consistency in launch angles and lower contact quality)

2019 – 29.8
2020 – 29.1
2021 – 25.8
2022 – 24.8

While he is great at generating value through his on-base skill – his xWOBA on contact (xWOBAcon) is not even top 35 in the league. So IMO, unless he fixes his path, he will not perform at a level most are expecting.

TKDC
1 year ago
Reply to  tipaion

Those BR comps are based on very rudimentary tools. I believe several of the FG writers have said they are next to useless, or “just for fun.”

John DiFool2
1 year ago
Reply to  TKDC

To be specific, it was one of Bill James’ little toys from the mid-80’s. Someone really needs to overhaul the algorithms there.

PC1970
1 year ago
Reply to  John DiFool2

Yeah, it was from Bill James’ Hall of Fame book. I think he used it as a basic tool to see who Player X was truly comparable to. IIRC, even he admitted it was pretty basic- relied on the “baseball card” stats, didn’t take into account rate stats, offensive eras or defensive position, etc.

He used it as a part of an HOF argument- Looking at Player X’s 10 most similar comps- How many are in the HOF? How many will be? Was he better (or worse) than them?

Certainly wasn’t meant as something to definitively rank players. I’m surprised B-R still has it..though it is fun to look at.

MikeS
1 year ago
Reply to  sogoodlooking

He only has 3 seasons below 5.6 WAR. One of those was 116 games as a 19 year old when he put up 3.7, and one was in shortened 2020 when he put up 2.6 in 47 games. He has had one “bad” season out of 5, when he was still worth 3.6 wins.

He has 22.5 wins through his age 23 season. That’s 18th all time and the only guys ahead of him not in the hall of fame are Mike Trout, Joe Jackson, Alex Rodriguez, Andruw Jones, Sherry Magee, and Cesar Cedeno. Just based on numbers, most of those guys should be in the HoF. He’d be higher if he had gotten to play more in 2020.

He needs to keep doing it, but through 23 he is one of the all time greats and there isn’t much reason to believe he is going to get worse since he derives all his value with his bat.

TKDC
1 year ago
Reply to  MikeS

Just going to argue with your last line. Why on earth would you believe someone who only derived value from their bat is a safer bet to maintain a high value? Anecdotally, to me, the opposite is true.

MikeS
1 year ago
Reply to  TKDC

He probably has more power yet to develop since power develops last and guys who hit like he has through age 23 tend to keep hitting for their whole career. Speed and defense decline with age. For instance, Andruw Jones was a very good hitter, but not a great hitter. Through age 30 he was 126.8 oRAR, but 290.1 dRAR. He got most of his value from being an elite center fielder. Once his defense became just average (which happened all of a sudden at 31), he was a less than 2 win/year player over his last 5 years. Soto is 10th all time in oRAR for his age and the only guys ahead of him on that list who didn’t have very productive offensive years after age 30 are Shoeless Joe and Trout.

TKDC
1 year ago
Reply to  MikeS

“Very productive offensive years after age 30” leaves just an absolute ton of wiggle room. Who are these players? I don’t know what oRAR is, but most players who are stud hitters before age 23 are not pumpkins on the bases and in the field like Soto (perhaps in the past a player like Soto wouldn’t have gotten called up until his defense improved?)

Soto may develop power and he may hit at an elite level in his 30s, but there is no evidence that those types of players age better. If anything, it’s the opposite. More athletic players, which can reasonably be judged by skill at all facets of the game, tend to age better.

Andruw Jones had personal issues and gained 50 pounds between age 29 and 31. Not a great example.

terencemann
1 year ago
Reply to  sogoodlooking
terencemann
1 year ago
Reply to  terencemann

/eyeroll

soddingjunkmail
1 year ago
Reply to  sogoodlooking

I don’t think we know yet. If he keeps putting up ‘only’ 5-6 win seasons for another decade and sprinkles in a career year here and there I’m very comfortable calling him a generational talent.

If 5-6 becomes 3-4 in a couple years and he has a gradual decline, then yeah, I’ll agree with you.

jrod
1 year ago
Reply to  sogoodlooking

He just turned 24.

sadtrombone
1 year ago
Reply to  jrod

There is a ton of time to prove lots of people wrong, but even if he’s not a generational talent or whatever the likely outcome is still awfully good. Is anyone sad that Joey Votto is merely a probable Hall of Famer instead of Ted Williams?

Seamaholic
1 year ago
Reply to  sogoodlooking

You’re completely missing his age, which is why he’s considered generational. Although I gotta say, I’m not a big fan of just assuming a young guy will get better linearly until he hits his peak at 27 or whatever. It’s lazy. It’s quite possible Soto just had a very young peak. I put Acuna in that same category.