How Should We Evaluate a Manager?

October 3, 2016

I’ve got a vote for American League Manager of the Year this season and I’m terrified. It’s my first vote as a member of the Baseball Writers’ Association of America, and it’s the impossible one.

Maybe impossible is too tough a word. I’m sure I’ll figure something out in time to submit a vote. But evaluating the productivity of a manager just seems so difficult. We’ve seen efforts that use the difference between projected and actual wins, or between “true talent” estimations for the team and their actual outcomes. But those attribute all sorts of random chance to the manager’s machinations.

I’d like to instead identify measurable moments where a manager exerts a direct influence on his team, assign those values or ranks, and see where each current manager sits. So what are those measurable moments?

As far as I, noted idiot, can figure, it’s these variables that define a manager:

When he uses his best relievers.
How rigid his approach to the bullpen is.
Where he puts his best hitters in the lineup.
How often he bunts with non-pitchers.

What follows is a difficult, probably error-laden, and almost certainly futile effort to score managers on how they’ve done in these four categories. And the worst part is that, when it’s done, not only can I not reveal my vote, but the work won’t really be done — because there’s no doubt that managers are also in charge of personalities and shepherding a clubhouse full of athletes into a cohesive unit. We won’t find a number for that one, and it seems a necessary moment of chaos that threatens to upend our best efforts.

For the first of the above variables, however, what we’d like to do is have a measure of the reliever’s true talent, and then we can match it up with leverage index, which tells us how important the moment is in the context of the game. We’ve got two ways to do this.

The first, called Bullpen Management Above Random by creator Tim Kniker, uses weighted on-base average (wOBA) allowed to lefties and righties to judge each reliever’s true talent. Kniker was kind enough to run this year’s BMAR for us.

Bullpen Management Above Random, 2016

Manager	Manager above Marcel	Optimal above Marcel	% of Optimal	Runs Above Marcel	WAM
Bryan Price	0.018	0.091	19.3%	34.2	3.5
Buck Showalter	0.019	0.084	22.4%	33.0	3.4
Joe Girardi	0.020	0.100	20.5%	32.8	3.3
Ned Yost	0.017	0.066	25.4%	28.5	2.9
Pete Mackanin	0.016	0.081	19.7%	27.5	2.8
Bob Melvin	0.015	0.071	21.7%	27.3	2.8
Brad Ausmus	0.016	0.053	31.0%	27.0	2.8
Mike Matheny	0.017	0.084	20.1%	26.9	2.7
Andy Green	0.015	0.061	24.3%	26.8	2.7
Clint Hurdle	0.014	0.063	22.4%	26.3	2.7
Jeff Banister	0.015	0.073	21.0%	25.7	2.6
Kevin Cash	0.016	0.064	24.6%	25.4	2.6
John Farrell	0.016	0.059	27.0%	23.9	2.4
Joe Maddon	0.015	0.068	21.6%	21.9	2.2
AVERAGE	0.012	0.109	11.2%	20.8	2.1
Craig Counsell	0.012	0.061	19.0%	20.6	2.1
Terry Francona	0.013	0.061	21.1%	20.4	2.1
Chip Hale	0.010	0.081	12.7%	19.9	2.0
Robin Ventura	0.012	0.061	20.2%	19.4	2.0
Don Mattingly	0.011	0.068	15.6%	18.7	1.9
Paul Molitor	0.009	0.068	13.9%	17.8	1.8
Walt Weiss	0.010	0.093	10.6%	17.5	1.8
A.J. Hinch	0.009	0.084	11.0%	15.5	1.6
Terry Collins	0.009	0.072	12.3%	14.3	1.5
Fredi Gonzalez	0.027	0.109	24.6%	13.3	1.4
Dave Roberts	0.007	0.069	9.7%	12.0	1.2
John Gibbons	0.008	0.077	10.1%	11.8	1.2
Mike Scioscia	0.006	0.069	9.3%	11.3	1.2
Bruce Bochy	0.007	0.067	10.7%	10.9	1.1
Brian Snitker	0.007	0.079	9.3%	10.2	1.0
Scott Servais	0.005	0.067	8.1%	8.9	0.9
Dusty Baker	-0.003	0.062	-4.9%	-4.6	-0.5

SOURCE: Retrosheet, Tim Kniker

Manager Above Marcel is expressed in points of wOBA the manager suppressed using reliever choice. WAM = Wins Above Marcel.
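
The "% of Optimal" column appears to be the manager's wOBA suppression divided by the best suppression that was available to him. That reading is my assumption from the column names and the numbers, not something Kniker spelled out, and the function name below is made up for illustration. A minimal sketch:

```python
def percent_of_optimal(manager_above_marcel: float, optimal_above_marcel: float) -> float:
    """Share of the available wOBA suppression the manager captured (assumed definition)."""
    if optimal_above_marcel == 0:
        raise ValueError("optimal_above_marcel must be non-zero")
    return manager_above_marcel / optimal_above_marcel


# Rounded values copied from the table above: (Manager above Marcel, Optimal above Marcel).
rows = {
    "John Farrell": (0.016, 0.059),
    "Dusty Baker": (-0.003, 0.062),
}

for name, (actual, optimal) in rows.items():
    print(f"{name}: {percent_of_optimal(actual, optimal):.1%}")
```

Because the table inputs are rounded to three decimals, the output can differ from the published column by a few tenths of a percentage point.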

Many Cards fans don’t respect Mike Matheny’s bullpen management, so perhaps this passes the sniff test. For the purposes of my later vote, I find it interesting that John Farrell is second in the league in the percent of optimal usage, and that Buck Showalter has a really good bullpen and uses it well. Terry Francona is above average by percent of optimal, but third in this trio.

However, wOBA allowed may not be the best way to judge a player’s true talent. Rob Arthur and Rian Watt developed the Weighted Reliever Management score to judge the same thing, and they used Baseball Prospectus’ Deserved Runs Allowed as the measure of talent. Arthur is a generous person, so he ran a one-year version of wRM for this year for our benefit.

Weighted Reliever Management Scores by Team, 2016

Team	DRAcor
Brewers	-0.88
Red Sox	-0.82
Pirates	-0.80
Astros	-0.70
Yankees	-0.68
Braves	-0.66
Royals	-0.66
Padres	-0.65
Phillies	-0.62
White Sox	-0.61
Mets	-0.60
Tigers	-0.54
Giants	-0.52
Rays	-0.51
Indians	-0.49
Orioles	-0.43
Reds	-0.41
Athletics	-0.41
Twins	-0.40
Mariners	-0.36
Blue Jays	-0.35
Angels	-0.32
Rockies	-0.25
Marlins	-0.25
Nationals	-0.22
Dodgers	-0.17
Rangers	-0.15
Cubs	-0.15
Diamondbacks	0.00
Cardinals	0.05

SOURCE: Rob Arthur, FiveThirtyEight

Arthur wanted to point out that these are not the same as the wRM+ scores in the article, because they aren’t based on multiple years and regressed in the same way. Instead, shown here is the raw correlation between DRA and the leverage index of the relievers on the team, weighted by innings pitched. More negative means that the better relievers are being used with the most leverage, which means the manager is doing well.
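
For anyone who wants to see the shape of that calculation, here is a rough sketch of an innings-weighted correlation between DRA and leverage index. It is not Arthur's code; the function name and the four-man bullpen are invented for illustration, and I am assuming a standard weighted Pearson correlation.

```python
import math


def weighted_correlation(xs, ys, weights):
    """Weighted Pearson correlation between xs and ys."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    var_y = sum(w * (y - mean_y) ** 2 for w, y in zip(weights, ys))
    return cov / math.sqrt(var_x * var_y)


# Made-up bullpen: lower DRA is better, higher leverage index is a bigger moment.
dra = [2.50, 3.40, 4.10, 5.20]
leverage = [1.85, 1.30, 0.95, 0.60]
innings = [65.0, 60.0, 55.0, 40.0]

score = weighted_correlation(dra, leverage, innings)
print(f"{score:.2f}")  # negative: the better arms got the bigger moments
```

In this toy bullpen the best pitcher gets the highest leverage, so the score comes out strongly negative, which is the direction the table rewards.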

By this measure, Farrell does well; Matheny, not so much.

Finally, there might be a third way, a better way to evaluate reliever usage: using projections instead of true-talent estimators. Because Arthur used full-season DRA and leverage numbers in his analysis, he was really answering the question “Looking back, did the manager make the right decisions?” It might make sense, however, to ask “Based on the info he had then, did the manager make the right decisions?” We have the benefit of looking at the results here, but since we also don’t have the full season or multiple years of regression, there’s more uncertainty with these rankings than the ones he published.

I’m a big fan of flexibility in approach. Flexibility, to me, is the sign of a supple mind, one that allows for considerations beyond one’s own worldview. So it’s fun that Ben Lindbergh and Dan Brooks developed a stat that measures the rigidity of reliever usage. They looked at the number of outs the team had recorded before a given reliever entered, and the variation in that number, to see if some managers were less flexible in their usage. Lindbergh was kind enough to run this stat for 2016, leading up to September, which is fine for me because September baseball is played on the moon by aliens.

Reliever Role Rigidity by Team, 2016

Team	Team Reliever Role Rigidity
ANA	3.64
MIN	3.57
OAK	3.52
CIN	3.33
TBA	3.26
SLN	3.22
CLE	3.16
LAN	3.12
BOS	3.10
HOU	3.09
SDN	3.05
PIT	3.02
ARI	3.00
CHA	2.99
MIL	2.94
KCA	2.93
SEA	2.92
MIA	2.89
ATL	2.87
WAS	2.85
BAL	2.84
CHN	2.82
NYA	2.82
TEX	2.78
PHI	2.77
COL	2.73
NYN	2.68
SFN	2.68
DET	2.64
TOR	2.29

RRR is the standard deviation of the number of team outs recorded before a reliever’s appearance. In this case, we calculated the RRR for relievers with at least ten innings and then did a weighted average for each team. Lower = more rigid.
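
Here is a small sketch of that calculation as I understand it. Two details are my assumptions, since the description above doesn't pin them down: I use the population standard deviation, and I weight each reliever by innings pitched. The function names and sample data are made up.

```python
import math


def role_rigidity(entry_outs):
    """Population standard deviation of team outs recorded when a reliever entered."""
    mean = sum(entry_outs) / len(entry_outs)
    return math.sqrt(sum((outs - mean) ** 2 for outs in entry_outs) / len(entry_outs))


def team_rrr(relievers, min_innings=10.0):
    """Innings-weighted average RRR over relievers who meet the innings minimum.

    relievers: list of (innings_pitched, entry_outs) pairs.
    """
    qualified = [(ip, outs) for ip, outs in relievers if ip >= min_innings]
    total_ip = sum(ip for ip, _ in qualified)
    return sum(ip * role_rigidity(outs) for ip, outs in qualified) / total_ip


# Made-up bullpen: a ninth-inning-only closer, a roving fireman, and a
# low-innings arm who falls under the ten-inning cutoff.
bullpen = [
    (20.0, [24, 24, 24, 24]),
    (20.0, [15, 18, 21, 24, 12]),
    (5.0, [3, 27]),
]

print(role_rigidity([24, 24, 24, 24]))  # a strict closer never varies
print(round(team_rrr(bullpen), 2))
```

A closer who only ever enters with 24 outs on the board has no spread at all, which is why rigid bullpens sink toward the bottom of the table.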

Here, a lower number means a more rigid approach to the bullpen. Brad Ausmus scores poorly even as he tried to find the right combination of relievers to lead up to closer Francisco Rodriguez. The Indians have been better at moving the pieces around once they got the right pieces — that was the reason Lindbergh brought out RRR this year, since Cody Allen has been moved around more than most closers.

For the next piece, Jonah Pemstein helped assess the ability of a manager to put the right players in the right spots in the lineup. What he did was look at the projected daily wOBA for each player in each lineup slot for each team, which was awesomely provided by Matt Hunter and SaberSim. Pemstein then ranked those wOBAs on the team, and then correlated their actual lineup slot to their ideal lineup slot, using Tom Tango’s The Book as a guide for the best way to put a club’s assorted players into the lineup.

What we have here, then, is a correlation between actual and ideal lineup slots. The closer a team is to one, the closer they were to putting out an ideal lineup every night.
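
To make that concrete, here is a sketch for a single lineup card. The rank-to-slot mapping is my own hypothetical reading of The Book's guidance (three best hitters in the first, second, and fourth spots, the next two in the fifth and third, the rest in descending order); Pemstein's exact mapping and the way he combined lineups across a season may well differ.

```python
import math

# Hypothetical mapping: index = quality rank (0 is the best hitter), value = ideal slot.
IDEAL_SLOT_BY_RANK = [1, 4, 2, 5, 3, 6, 7, 8, 9]


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lineup_correlation(projected_woba_by_slot):
    """Correlate actual slots (1-9) with the ideal slot for each hitter.

    projected_woba_by_slot[i] is the projected wOBA of the hitter batting in slot i + 1.
    """
    order = sorted(range(9), key=lambda i: projected_woba_by_slot[i], reverse=True)
    ideal = [0] * 9
    for rank, index in enumerate(order):
        ideal[index] = IDEAL_SLOT_BY_RANK[rank]
    actual = list(range(1, 10))
    return pearson(actual, ideal)


# Made-up lineups: one built to match the mapping, one with the same hitters reversed.
by_the_book = [0.400, 0.380, 0.360, 0.390, 0.370, 0.340, 0.330, 0.320, 0.310]
backwards = list(reversed(by_the_book))

print(round(lineup_correlation(by_the_book), 3))
print(round(lineup_correlation(backwards), 3))
```

A team-level number would then come from pooling or averaging these over every lineup card in the season, which is another assumption on my part.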

Correlation of Actual to Ideal Lineup by Team

Team	Correlation of Actual to Ideal Lineup
Pittsburgh	0.836
Anaheim	0.806
Chicago (N.L.)	0.751
Baltimore	0.724
Miami	0.716
San Diego	0.715
Toronto	0.707
Boston	0.676
St. Louis	0.674
Washington	0.672
Houston	0.668
Cincinnati	0.667
Colorado	0.652
Detroit	0.647
Arizona	0.641
New York (N.L.)	0.639
San Francisco	0.637
Tampa Bay	0.634
Philadelphia	0.617
Cleveland	0.611
Milwaukee	0.611
Atlanta	0.607
Seattle	0.605
Minnesota	0.598
Texas	0.593
Los Angeles (N.L.)	0.571
Chicago (A.L.)	0.515
Oakland	0.480
New York (A.L.)	0.444
Kansas City	0.426

SOURCE: SaberSim, Jonah Pemstein

To see what these distributions look like, Pemstein also created some visuals. Let’s look at the extremes and find the distribution of lineups for Pittsburgh and Kansas City. Clint Hurdle has his team’s lineups bunched up over by the right side, the ideal side, while Kansas City has a whole lot of Alcides Escobar at the top of the lineup.

For the purposes of my vote, Showalter gets some more love, while Farrell and Gibbons also do well. Francona, though, takes a little hit. Looks like SaberSim liked Tyler Naquin better than other projection systems, and he batted eighth for most of the year.

We’ve saved the easiest for last. Rather than going through each bunt situation in a granular way, I thought we could say something wide-ranging that should still ring true — namely, that it’s rarely a good idea to bunt with a non-pitcher. They should be good enough with the bat that it lowers win expectancy to trade their out for a base.

The American League plays the National League enough that a simple leaderboard tweak should give us what we want. Which teams bunted the least often with their non-pitchers? Answer: the Red Sox, with the Orioles in third, and the Tigers and Blue Jays in fifth and sixth.

In sum, we have some arguments for the managers who will be on the short list for American League Manager of the Year. John Farrell, John Gibbons, and Buck Showalter do well across the board, while Jeff Banister and Terry Francona have their strengths and weaknesses. Picking between them will be difficult. Almost impossible, even.

Thanks to Ben Lindbergh, Dan Brooks, Tim Kniker, Jeff Zimmerman, Jonah Pemstein, Rob Arthur, Rian Watt, and Matt Hunter for their help in putting this together.

48 Comments

HappyFunBall (Member since 2019)

8 years ago

Not only is clubhouse management a necessary moment of chaos that threatens to upend our best efforts, it MIGHT be the biggest thing a manager has to do. Not to mention his share of organizational decisions about how to handle near-MLB ready prospects and roster decisions generally.

Dusty Baker is an incredibly confounding example. His entire career is making quantifiably questionable strategic and tactical decisions … and winning a crapload of games anyway.

I would say that a good manager absolutely needs someone on his staff who has a good handle on in-game tactics. That could be the bench coach, however, or the hitting or pitching coaches. But that’s a resource just the same as a bench bat or a LOOGY reliever, to be leaned on when needed but not necessarily every single time.

Doorknob11

Reply to HappyFunBall

Agreed. I think the most important part of being a manager is how they manage a clubhouse. Personally, I think the pitching coach is the one who makes a lot of the pitching-change decisions. Dusty Baker would not have a manager job if it weren’t for his amazing way of making guys feel good about themselves.

Carlos Baerga (Member since 2024)

Right there with you. It comes down to leadership. Can you get your guys to walk out there and give more effort than they would give if you weren’t there? Do you create a positive culture where players and fans are smiling more often than they are not? Do your players love you and would they do anything for you and the team? Why don’t you interview a lifetime rental player like Jose Guillen, Matt Stairs, or Octavio Dotel and ask them about their favorite managers. Interview some current young stars who have already played for 3 managers and get their opinion. Not everything in this game can be broken down into a statistic even though sometimes it feels that way.

Arjon

I agree that clubhouse management is a vital part of managing. The big question I can’t begin to answer is how to assess and factor in the degree to which each clubhouse is hard or easy to manage.