Most of you are familiar with Ultimate Zone Rating (UZR), a metric for evaluating fielding using detailed play-by-play data. For first basemen, as for all other fielders, it measures the number of ground balls that are fielded and turned into outs as compared to an average-fielding first baseman, given the same number, type, and location of ground balls, as well as the same number of outs, base-runner configuration, and batter handedness (plus an adjustment for the parks and the ground/fly tendency of the pitchers). The difference between these two values is a fielder's UZR. It is usually expressed as a number of runs saved or cost compared to an average player at that position over a specified number of defensive games, and you will often see it as a rate stat, generally per 150 games.
What UZR does not measure for first basemen, because the requisite data is not readily available, are the theoretical runs saved or cost by virtue of a first baseman's skill at successfully catching errant throws or throws in the dirt. For lack of a better word, I call these "scoops," even though they necessarily include poor throws (e.g., high or off-line) that are not in the dirt.
Fortunately, there is a way to estimate this skill, using a relatively simple method "invented" by Tom Tango, called "with or without you," or WOWY. The WOWY methodology was explained by Tango in an excellent article in the 2008 Hardball Times as well as in various online forums, including his (and my) blog, www.insidethebook.com.
Basically, it goes like this: Figure out what happens when a particular player is on the field (generally something that explicitly involves, or is at least affected by, that player, such as the number of ground balls a particular SS, say Derek Jeter, fields and turns into outs). Then figure out the same thing when that player is not on the field, but all other relevant variables (in this example, mostly the pitcher, park, and opposing batters) remain constant. The difference between the two rates (per whatever denominator you like) should reflect the difference in performance, and by extension skill (skill is basically performance regressed toward some mean, the amount of regression being a function of the sample size), between the player in question and the average player in the "without" portion of the data set.
In other words, to use the Jeter example: if Derek fields 4 balls per 9 innings and all other SS (let's say that our sample of "other SS" is large and unbiased enough to consider them league average) field 4.5 balls per 9 innings, given the exact same pool of pitchers, batters, and parks, then we can safely say (more or less, within the bounds of sample error) that Jeter is 0.5 balls per 9 innings worse than the average SS. As I said, it is simple and brilliant. The results are extremely telling if we can get large enough samples of Jeter and of lots of other SS who are "matched" with Jeter on parks, pitchers, batters, etc.
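For the code-minded, here is a minimal sketch of that basic WOWY comparison, using the Jeter example; the function and variable names are mine, not part of any published implementation.

```python
def wowy_rate_diff(plays_with, innings_with, plays_without, innings_without):
    """Difference in balls converted per 9 innings, with vs. without the player.

    The "without" sample is assumed to come from the same matched pool of
    pitchers, parks, and batters, per the WOWY method described above.
    """
    rate_with = 9 * plays_with / innings_with
    rate_without = 9 * plays_without / innings_without
    return rate_with - rate_without

# Hypothetical totals consistent with the Jeter example (4.0 vs. 4.5 per 9):
print(wowy_rate_diff(4000, 9000, 45000, 90000))  # -0.5 balls per 9 innings
```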
The methodology is a little more complicated than that, but hopefully you get the general idea. Anyway, the same thing can be done with first basemen in order to estimate their "scooping" performance and ability. What I did was look at every ground ball that was thrown from each infielder to each first baseman and put it into one of two buckets: one, when there was no error on the throw, and two, when there was an error on the throw. The assumption is that when a throwing error is made, there was an errant throw (no duh) and the first baseman was not able to somehow coax that bad throw into an out by scooping it out of the dirt, jumping in the air, catching it while off the bag and still making the play, etc. Obviously, most of the time when a throwing error occurs there is nothing any first baseman can do about it. However, in the long run, we can assume that a certain fixed percentage of bad throws from each infield position will always result in an error, while another fixed percentage of those bad throws has a chance to be "saved" by a skilled (or tall) first baseman.
We can also assume that when an error is not made, sometimes a bad throw occurred and the first baseman "saved the day." Again, most outs and non-errors occur on easily catchable throws, but a certain fixed percentage of them will occur on bad throws that are skillfully and successfully handled by the first baseman.
Given these assumptions (which are true, by the way), if we look at all throws by a particular player at a particular position to a certain first baseman and then compare the results (error or no error) to when throws are made from the same infielder at the same position to all other first basemen, the difference can be attributed to the “scooping skill” of that particular first baseman.
Say that all infielders threw to Todd Helton 1000 times in 2007. And say that those exact same infielders threw to some other first baseman (and we are going to assume that all of these "other" first basemen are, collectively, average) around 1000 times as well (it does not really matter how many throws went to these other first basemen, although we would like it to be a lot). Now let's say that 15 throwing errors were made on the throws to Helton and 20 were made on the throws to all the other first basemen. We can safely say (with some uncertainty of course, due to sample error) that Helton was 5 plays per 1000 throws (about a full season, actually) better than the other pool of first basemen, whom we are assuming to be average. So Todd is 5 plays, or around 4 runs, per season above average at "scoops."
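As a quick check of the arithmetic, here is the Helton example in code. The 0.8 runs per saved play is simply my reading of the "5 plays is around 4 runs" conversion above, not an independently sourced constant.

```python
# Helton example: throwing-error rates with vs. without him, per 1000 throws.
errors_with, throws_with = 15, 1000        # throws to Helton
errors_without, throws_without = 20, 1000  # same infielders, other first basemen

# Plays saved per 1000 throws (positive = fewer errors with Helton on the bag).
plays_saved = 1000 * (errors_without / throws_without - errors_with / throws_with)

# "5 plays is around 4 runs" implies roughly 0.8 runs per saved play;
# this conversion is inferred from the article, not an official constant.
RUNS_PER_SAVED_PLAY = 0.8
print(plays_saved, plays_saved * RUNS_PER_SAVED_PLAY)  # 5.0 plays, ~4.0 runs
```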
In actuality, what I did was match up every player at every infield position (including pitcher and catcher) with every first baseman, and then, for each specific first baseman, I did a "with and without" and weighted that difference by the minimum of two numbers: the throws made by the same fielder to the first baseman in question and the throws made by that fielder to all other first basemen.
For example, let's say that over the sample time period, Edgar Renteria threw 300 balls to Albert Pujols (by the way, I used all data from 2000 to 2008) and made 6 throwing errors. Now let's say that Renteria also threw 800 balls to other first basemen (on the Cardinals or any other team he played for) and made 20 errors. That is 2 errors per 100 throws to Albert versus 2.5 per 100 to everyone else, a difference of 0.5 errors per 100 throws in favor of Albert. Thus we give him credit so far for 0.5 errors per 100, weighted by 300 throws (the lesser of 300 and 800). We do that for every fielder who threw at least 20 times to Albert and at least 20 times to other first basemen, and then we add everything up to get a weighted average. This weighted average represents a first baseman's "scooping" skill as compared to an average first baseman, or at least as compared to the average first baseman in that player's "matched pool" of first basemen.
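To make the bookkeeping concrete, here is a sketch of that matching-and-weighting step in Python. The data layout and names are hypothetical (this is not my actual code); the 20-throw minimum is the threshold given above.

```python
from collections import defaultdict

def scoop_rating(throw_log, first_baseman, min_throws=20):
    """Weighted WOWY throwing-error rate difference for one first baseman.

    throw_log: iterable of (fielder, first_baseman, was_error) tuples, one
    per throw. Returns (errors avoided per 100 throws, total matched throws).
    """
    with_fb = defaultdict(lambda: [0, 0])     # fielder -> [throws, errors] to this 1B
    without_fb = defaultdict(lambda: [0, 0])  # fielder -> [throws, errors] to others

    for fielder, fb, was_error in throw_log:
        bucket = with_fb if fb == first_baseman else without_fb
        bucket[fielder][0] += 1
        bucket[fielder][1] += int(was_error)

    weighted_diff, total_weight = 0.0, 0
    for fielder, (n_with, err_with) in with_fb.items():
        n_without, err_without = without_fb.get(fielder, [0, 0])
        if n_with < min_throws or n_without < min_throws:
            continue  # require 20+ throws both with and without, per the text
        # Positive = fewer errors with this first baseman, per 100 throws.
        diff = 100 * (err_without / n_without - err_with / n_with)
        weight = min(n_with, n_without)  # weight by the smaller sample
        weighted_diff += diff * weight
        total_weight += weight

    return (weighted_diff / total_weight if total_weight else 0.0), total_weight
```

Run over the 2000-2008 log for each first baseman in turn, this reproduces the weighted average described above: positive numbers mean fewer throwing errors when that first baseman is on the bag.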
That last caveat, that each first baseman is compared only to his own "matched pool," is the primary flaw in the methodology. Since I am only comparing each first baseman to the other first basemen who had the same fielders throwing to them, it is possible, and in some cases inevitable, for one first baseman to be compared to a pool of primarily good or bad first basemen, since each one will tend to be compared with the others on his own team. For example, Helton will tend to be compared with whoever else manned first base for the Rockies when he was not on the field. If those few backup first basemen happened to be particularly bad at "scooping" balls, then Helton will be overrated by this methodology. In fact, because of this flaw, and because regular first basemen tend to be better at "scooping" than backups, any regular will tend to be overrated (because he is often compared to a backup) and any backup will tend to be underrated (because he is often compared to a regular). Keep in mind, though, that because some of Helton's infielder teammates also played on other teams, he is not always compared to another Rockies first baseman. Anyway, for now, it looks like we are going to have to live with this flaw or weakness in the system. The correct way to handle the data, of course, is to adjust each first baseman's rating by the ratings of his "others," using an iterative process. In the future I may do this, but for now, we are stuck with version 1.0.
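For what it is worth, that iterative adjustment might look something like the following sketch: recompute each rating, fold in the average (adjusted) rating of the comparison pool, re-center, and repeat until the numbers settle. This is only my guess at what "version 2.0" could be; nothing like it is implemented here.

```python
def adjust_for_pool(raw_ratings, pool_weights, iterations=25):
    """Iteratively adjust raw WOWY ratings for the quality of each player's
    comparison pool.

    raw_ratings:  {player: raw rating vs. his own "others"}
    pool_weights: {player: {other_player: matched-throw weight}}
    """
    adjusted = dict(raw_ratings)
    for _ in range(iterations):
        new = {}
        for player, raw in raw_ratings.items():
            pool = pool_weights[player]
            total = sum(pool.values())
            # Weighted average adjusted rating of this player's comparison pool.
            pool_avg = (sum(adjusted[o] * w for o, w in pool.items()) / total
                        if total else 0.0)
            # A below-average pool (negative avg) pulls an inflated raw
            # rating back down; an above-average pool pushes it up.
            new[player] = raw + pool_avg
        # Re-center so the league average stays at zero.
        mean = sum(new.values()) / len(new)
        adjusted = {p: r - mean for p, r in new.items()}
    return adjusted
```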
Here are the results (again, using play-by-play data from 2000 to 2008):
Interestingly, and not unexpectedly, there were significant across-the-board differences in the scooping ability of the average tall and short, lefty and righty, and regular and backup first baseman.
Tall (>6'1") RH: +0.6 runs per 1000 matched throws (roughly one full season's worth of throws)
Tall LH: +1.2 runs
Short RH: -0.8 runs
Short LH: +0.6 runs
Fewer than 300 matched throws (mostly backups): -1.5 runs
300 to 1000 throws: +0.4 runs
More than 1000 throws (mostly regulars): +0.6 runs
Players with at least 1000 matched throws in total (for each fielder who threw to a given first baseman, the lesser of the two paired throw counts is what gets added to that total):
Best (runs per 1000 matched throws):
Berkman +4 runs, 2928 matched throws
Choi +4, 1361
Conine +3, 3100
Conor Jackson +3, 1790
Loney +3, 1679
Mientkiewicz +3, 4344
Daryle Ward +3, 1119
Olerud +2.5, 4481
Sexson +2.5, 6471
Tony Clark +2.5, 2880
Dan Johnson +2.5, 1774
Overbay +2.5, 4834
Worst (runs per 1000 matched throws):
Karros -4, 2986
Mo Vaughn -4, 1644
Galarraga -3, 1842
Stairs -3, 1122
Bagwell -2.5, 4931
Casey -2.5, 6444
Julio Franco -2.5, 2010
Swisher -2.5, 1134
Thome -2.5, 4476
When we regress these numbers in order to turn them into "true talent" scooping ability, we get somewhat smaller values, depending of course on the number of matched throws (the sample size). In fact, it appears that the spread in talent between the best and worst "scoopers" at first base is on the order of plus or minus 2-3 runs (a 4-6 run spread). So before you start opining about how your favorite first baseman is so great defensively because he "saves so many errors," consider that scooping ability is probably worth less than a quarter of total defensive ability or value at first base. Fielding grounders is at least 75% of the package, and "scooping" is the rest. But every little bit helps.
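If you want to run the regression yourself, a standard shrinkage toward the mean looks like the sketch below. The regression constant (how many "phantom" league-average throws to add) is an assumption on my part for illustration, not a number from this article.

```python
def regress_to_mean(observed_runs_per_1000, matched_throws, k=2000):
    """Shrink an observed scoop rating toward the league mean (zero).

    k is the assumed number of phantom league-average throws; a larger k
    means heavier regression. 2000 here is an illustrative placeholder.
    """
    return observed_runs_per_1000 * matched_throws / (matched_throws + k)

# Berkman's observed +4 runs per 1000 over 2928 matched throws, regressed:
print(regress_to_mean(4.0, 2928))  # ~2.4 runs of estimated "true talent"
```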