The Strange Opacity of the So-Called Pitching Tools

Recently, I put together a calculator for my own personal use — and perhaps, eventually, the use of this site’s readers — that helps to translate (roughly) a batting prospect’s individual tool grades into wins. Lead prospect analyst Kiley McDaniel already provided something along these lines for Overall Future Value back in September by means of a chart, of which this is an excerpt:

Grade Role WAR
80 Top 1-2 7.0
75 Top 2-3 6.0
70 Top 5 5.0
65 All-Star 4.0
60 Plus 3.0
55 Above Avg 2.5
50 Avg Regular 2.0
45 Platoon/Util 1.5
40 Bench 1.0
35 Emergency Call-Up 0.0
30 Organizational -1.0

Even for those of us unfamiliar with the parlance of scouting, this is fairly intuitive. A player who receives a 50 FV grade is regarded as an average player. An average player, in statistical terms, is one who produces roughly two wins over the course of a full season. It follows, then, that a player who receives a 50 FV is one we might reasonably expect to produce about 2.0 WAR in a season at the height of his talents. Players who receive better grades than a 50 are likely, by some margin, to produce more than 2.0 WAR; those with worse grades than a 50, less than 2.0 WAR.
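
To make the chart concrete, here's a minimal Python sketch of that lookup. The dictionary simply transcribes McDaniel's excerpt above; the interpolation for in-between grades is an assumption of mine, not his.

```python
# FV grade -> rough single-season peak WAR, transcribed from the chart above.
FV_TO_WAR = {
    80: 7.0, 75: 6.0, 70: 5.0, 65: 4.0, 60: 3.0,
    55: 2.5, 50: 2.0, 45: 1.5, 40: 1.0, 35: 0.0, 30: -1.0,
}

def fv_to_war(grade):
    """Return the approximate peak WAR implied by a 20-80 future-value grade.

    Grades between the charted values are linearly interpolated, which is
    my assumption, not McDaniel's.
    """
    if grade in FV_TO_WAR:
        return FV_TO_WAR[grade]
    lo = max(g for g in FV_TO_WAR if g < grade)
    hi = min(g for g in FV_TO_WAR if g > grade)
    return FV_TO_WAR[lo] + (grade - lo) / (hi - lo) * (FV_TO_WAR[hi] - FV_TO_WAR[lo])

print(fv_to_war(70))  # 5.0 -- a "Top 5" player
print(fv_to_war(52))  # 2.2 -- interpolated between average and above average
```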

My interest in devising a calculator was essentially to create a framework for translating the individual tool grades into an Overall Future Value grade — and also, perhaps, to make the relationships between the different tools a bit more explicit. An 80 hit tool, for example, is more valuable than an 80 run tool. How much more, though? It’s a question I’ve considered here before, but the calculator helps in addressing a large quantity of individual prospects, each with distinct profiles, all at once.

By my flawed calculations, McDaniel’s future-tools grades for Byron Buxton (60 hit, 55 power, 80 speed, 70 field*) translate, in sum, to about six wins over the course of a single season. Kris Bryant’s (50+ hit, 70 power, 45 speed, 45+ field*) translate to about four wins. Both win figures equate roughly to the total denoted by McDaniel’s future value grades for those prospects (70, in each case).

*The arm tool is omitted here, but can move the field grade up or down a bit if it’s particularly good or bad.

The translations of those grades to wins, as I say, don’t precisely match McDaniel’s own FV estimates, but they are close. And, at root, there’s real validity to expressing batter tools in runs or wins — because most of those tools are already directly tied to widely available stats, anyway.

Consider, by way of example, the two tools that relate directly to batting — the hit and power tools. The hit tool denotes a player’s ability to hit for average. The power tool is tied to an estimate of the home runs a player is likely to hit over a full season. The exact relationship between the tool grades and the numbers they represent might differ based either on the organization or on the era’s run environment. The basic translations, however, are pretty universal. From McDaniel’s piece last September, an excerpt from another chart:

Grade Avg HR
80 .320 40+
75 .310 35-40
70 .300 30-35
65 .290 27-30
60 .280 23-27
55 .270 19-22
50 .260 15-18
45 .250 12-15
40 .240 8-12
35 .230 5-8
30 .220 3-5

Again, the precise equivalences might differ from one scout or organization to the next, but McDaniel provides this information having worked for three organizations and evaluated the prospects of many more.
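
For reference, here are those translations transcribed into Python. The batting averages come straight from the chart; for power, I’ve taken the midpoint of each home-run range, which is my own simplification.

```python
# Hit grade -> batting average, straight from McDaniel's chart above.
HIT_TO_AVG = {80: .320, 75: .310, 70: .300, 65: .290, 60: .280,
              55: .270, 50: .260, 45: .250, 40: .240, 35: .230, 30: .220}

# Power grade -> home runs per season, using the midpoint of each charted
# range (an 80 is simply set at 40). The midpoints are my simplification.
POWER_TO_HR = {80: 40.0, 75: 37.5, 70: 32.5, 65: 28.5, 60: 25.0,
               55: 20.5, 50: 16.5, 45: 13.5, 40: 10.0, 35: 6.5, 30: 4.0}

print(HIT_TO_AVG[60], POWER_TO_HR[55])  # 0.28 20.5 -- Buxton's hit and power
```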

Here’s the thing about batting average and home runs: while there are other variables that inform a batter’s production — like his rate of doubles and triples, for example, or his walk rate — batting average and home runs account for a substantial portion of a batter’s output.

To illustrate, I took merely 10 minutes of my dumb day to create an offensive estimator using only batting average and home runs. As a sample, I used all qualified batter seasons — there are 429 of them — from 2012 to -14. I ran a multiple regression* to determine the relationship between batting average and home runs per 600 plate appearances (on the one hand) and wOBA (on the other).

*The results of that: xwOBA = (AVG * 0.82) + (HR600 * 0.0023) + 0.072
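
Expressed as a function, with a sanity check against one of the hitters in the chart further below, that looks like this:

```python
def expected_woba(avg, hr600):
    """Estimate wOBA from batting average and home runs per 600 PA,
    using the regression coefficients noted above."""
    return (avg * 0.82) + (hr600 * 0.0023) + 0.072

# Miguel Cabrera's 2013 line (per the top-10 chart below): .348 AVG, 40.5 HR/600.
print(round(expected_woba(.348, 40.5), 3))  # 0.451 -- vs. the charted .452;
                                            # the small gap is input rounding
```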

Having merely those two figures, it turns out, allows one to make some very reliable assumptions about a batter’s production. Below is the expected wOBA of all those batters from 2012 to -14 — again, utilizing merely batting average and home runs per 600 plate appearances as the inputs — plotted against those same batters’ actual wOBAs.

[Figure: expected wOBA plotted against actual wOBA for all qualified batters, 2012-14]

There’s very little deviation, one finds, between the expected and actual wOBA figures.

Here’s a chart including the top-10 hitters by expected wOBA from the past three seasons:

# Name Team Season PA AVG HR600 xwOBA wOBA
1 Miguel Cabrera Tigers 2013 652 .348 40.5 .452 .455
2 Miguel Cabrera Tigers 2012 697 .330 37.9 .431 .417
3 Ryan Braun Brewers 2012 677 .319 36.3 .419 .413
4 Victor Martinez Tigers 2014 641 .335 30.0 .417 .411
5 Chris Davis Orioles 2013 673 .286 47.3 .417 .421
6 Jose Abreu White Sox 2014 622 .317 34.7 .413 .411
7 Adrian Beltre Rangers 2012 654 .321 33.0 .413 .388
8 Mike Trout Angels 2012 639 .326 28.2 .406 .409
9 Andrew McCutchen Pirates 2012 673 .327 27.6 .405 .403
10 Buster Posey Giants 2012 610 .336 23.6 .403 .406

Again, very little deviation between expected and actual wOBA.

What both the graph and chart above reveal is that batting average and home runs explain a great deal about a hitter’s output. Which, this illustrates the remarkable utility of the scouting tools for hitters: they relate strongly to production in an entirely intuitive way.
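
By way of illustration, here’s roughly how the pieces sketched above chain together into a grades-to-wins calculator. To be clear: the run values for the speed and field tools, the league-average wOBA, and the other conversion constants below are rough, FanGraphs-style assumptions of mine, not McDaniel’s figures and not the precise internals of my own calculator.

```python
# A rough grades-to-wins chain, reusing HIT_TO_AVG, POWER_TO_HR, and
# expected_woba() from the sketches above. Every constant here is an
# approximate, FanGraphs-style assumption of mine.
LEAGUE_WOBA = .315   # roughly the league-average wOBA, 2012-14
WOBA_SCALE = 1.15    # divisor that converts wOBA above average into runs
RUNS_PER_WIN = 10.0  # rough runs-to-wins conversion
REPLACEMENT = 20.0   # runs per 600 PA separating replacement level from average

# Very rough seasonal run values for the non-batting tools (assumed).
SPEED_RUNS = {80: 6, 70: 4, 60: 2, 50: 0, 45: -1, 40: -2}
FIELD_RUNS = {80: 15, 70: 10, 60: 5, 50: 0, 45: -2, 40: -5}

def grades_to_war(hit, power, speed, field):
    """Translate four tool grades into a rough single-season WAR figure."""
    xwoba = expected_woba(HIT_TO_AVG[hit], POWER_TO_HR[power])
    batting_runs = (xwoba - LEAGUE_WOBA) / WOBA_SCALE * 600
    total_runs = batting_runs + SPEED_RUNS[speed] + FIELD_RUNS[field] + REPLACEMENT
    return total_runs / RUNS_PER_WIN

# Buxton's future grades (60 hit, 55 power, 80 speed, 70 field):
print(round(grades_to_war(60, 55, 80, 70), 1))  # ~5.4 -- in the neighborhood
                                                # of the ~6 wins cited above
```

The positional adjustment and the arm tool are ignored here, which is partly why the result lands a bit shy of the roughly six wins cited above.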

And this is where the title of this post becomes relevant. Because, after having created the grades-to-wins calculator for batters, I considered how I might devise an equivalent sort of thing for pitchers. Immediately, the differences between the two evaluation systems — the collection of tools relevant to hitters versus those used for pitchers — became evident (in a way which, were I smarter, would probably have become evident before). For while the batter tools all relate — sometimes more strongly, sometimes more weakly — to a batter’s output, or his product, the so-called pitcher tools are much more representative of his process.

Consider McDaniel’s grades for baseball’s top pitching prospect, Dodgers left-hander Julio Urias, as an example:

[Image: McDaniel’s scouting grades for Julio Urias]

Here one finds that Urias has a 65 future-value grade on his fastball. There actually is some suggestion of an objective scale where fastball grade is concerned, as McDaniel illustrates by means of a chart from which this is an excerpt:

Grade FBv
80 97
75 96
70 95
65 94
60 93
55 92
50 90-91
45 89
40 88
35 87
30 86

As McDaniel also notes, however, fastball grade isn’t a representation of velocity exclusively. Considerations for movement and deception and command are integrated into that grade, as well. For pitcher tools, however, this is where any relationship to the distinctly objective ends. For while a 65 grade on a fastball usually means something like “94 mph with standard movement, etc.,” a 65 grade on a curveball has no such grounding in the empirical. When I asked McDaniel what a 65 grade on a curveball actually means, he replied that, in the mind of whichever scout has applied that grade to the pitch, it means that the curve in question is definitely better than the Platonic average major-league curveball and also a little bit better than an above-average major-league curveball*. What that means in terms of product, however, remains obscure. “Better” at what isn’t expressly clear (although one would assume that “better at preventing runs” is somewhere at the heart of it).

*Note: he didn’t say these exact words, but some collection of words resembling them. One can only listen so closely to Kiley McDaniel.

It’s strange, the different obligations placed on a scout by the batting and pitching tools. A scout, in attempting to assign a grade to a batter’s hit tool, is asked not merely to synthesize a great deal of observational data concerning that batter’s bat speed and bat control and physical strength but also to compare that information to all the other batters in his mental library and moreover to have some reasonably precise idea about how that information translates to actual, literal batting average at the major-league level. That same scout, attempting to assign a grade to a pitcher’s curveball, is asked to perform the first two of those procedures — i.e. to synthesize observational data regarding the curveball and then to compare it with all the other curveballs in his mental library — but is not then asked to perform that final translation to an actual major-league metric, such as swinging-strike rate or the like. McDaniel suggests that it’d be possible without much difficulty to devise a rough formula that would translate a pitcher’s pitch grades into an Overall Future Value grade. Which, there’s utility in that, but it also lacks any grounding in the empirical such as that offered by the batter tools and grades.
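
For what it’s worth, such a formula might look something like the sketch below. Everything in it, the weights especially, is invented purely for illustration; McDaniel suggested only that such a formula is possible, not what it would contain.

```python
# A purely hypothetical sketch of the sort of formula McDaniel describes:
# a weighted blend of pitch and command grades, with the fastball weighted
# most heavily. The weights are invented for illustration only.
def pitch_grades_to_fv(fastball, secondaries, command):
    """Blend pitch grades into a rough overall FV on the 20-80 scale."""
    best_secondary = max(secondaries)
    return round(0.4 * fastball + 0.3 * best_secondary + 0.3 * command)

# A pitcher with a 65 fastball (like Urias), plus hypothetical 60/55
# secondaries and 55 command:
print(pitch_grades_to_fv(65, [60, 55], 55))  # 60
```

Again, that’s the shape of the thing rather than the thing itself: the grounding problem discussed above is precisely that there’s no empirical scale against which to calibrate those weights.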

On the one hand, there’s little urgency to change the system. It appears to work for organizations, and that’s where the information is most urgently sought. Kevin Goldstein, formerly of the Internet and currently of the Houston Astros (as their Director of Pro Scouting), once told me that, even if the grades aren’t always grounded in empirical data, they still work as an effective shorthand between scouts and their scouting directors and their general managers. Which, that’s what language is — in the Saussurean sense, at least: a collection of signs which mean roughly the same thing both for the one producing them and for the one consuming them.

For those of an analytical bent, however — which likely describes the reader who’s made his or her way through 1,500 words to this point — there’s always a compulsion to link the observational data with the quantitative. The work by Eno Sarris and Daniel Schwartz on Arsenal Score does this to some degree: they integrate both the swinging-strike and ground-ball rate for each offering within a pitcher’s repertoire and then tie those figures, at some level, to the run-preventing properties of those two metrics. I attempted a similarly granular experiment last year with Brandon Finnegan, using not pitch-specific data but rather fastball velocity, swinging-strike rate, and first-pitch-strike rate to produce an ERA estimate. These are all just experiments, however, and any further consideration of specific models is outside the scope of this post. Instead, the intention here has been a naive but hopefully constructive one — namely, to document some differences (rarely expressly stated) between the batting and pitching tools, the relevance of the grades applied to each, and their implications for major-league future value.




