Base running linear weights, or base running runs, also called Ultimate Base Running (UBR), is similar to the outfield-arm portion of UZR. Whatever credit (positive or negative) is given to an outfielder for a runner hold, advance, or kill on a batted ball is also given, in reverse, to the runner (or runners). There are also some plays for which a runner is given credit (again, plus or minus) that do not involve an outfielder, such as being safe or out going from first to second on a ground ball to the infield, or advancing, holding, or being thrown out going from second to third on a ground ball to SS or 3B.
Runs are awarded to base runners in the same way they are awarded to outfielders on “arm” plays. The average run value, in terms of the starting base/out state, is subtracted from the actual run value (in terms of the resulting base/out state, plus any runs scored) on each play in which a base runner is involved. The result of that subtraction is the run value awarded to the base runner on that play.
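As a sketch of that subtraction (the RE values and the +0.30 average below are hypothetical placeholders, not the actual tables UBR uses):

```python
def play_run_value(re_before, re_after, runs_scored=0):
    """Actual run value of a play: change in run expectancy plus any runs scored."""
    return re_after - re_before + runs_scored

def runner_credit(actual_value, average_value):
    """UBR credit to the runner: the play's actual run value minus the average
    run value of similar plays from the same starting base/out state."""
    return actual_value - average_value

# Hypothetical example: runner on 1st with 1 out (RE 0.51) takes third on a
# single, ending in a state worth 0.95 runs; similar plays average +0.30.
credit = runner_credit(play_run_value(0.51, 0.95), 0.30)  # about +0.14 runs
```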
2. Fielder Credits and Debits
3. Batted Ball Types
4. GDP and Arm
5. Base Runner and Outs Adjustments
6. Other Infield Positioning
7. Outfield Positioning and “Wall Balls”
8. Infield Line Drives and Pop-ups
9. Park Adjustments
10. The Baselines
11. Recent Revisions
12. Sample Size and Reliability
13. UZR and Aging
14. Consistency from Year to Year
15. Does UZR tell us what actually happened on the field?
Actually, he didn’t throw any (to them), but the question remains the same. Like the sac-bunting question that I addressed a few days ago, this question is not easy to answer, because it involves game theory.
Let me start by addressing the idea of a “pitcher’s best pitch.” There is no such thing as a “pitcher’s best pitch!” I know that sounds like blasphemy, and it somewhat depends on what we mean by “best,” but it’s true. How can I say that? Because a pitcher’s goal against any given batter in any given situation is to throw all of his pitches in certain proportions (for example, in situation A, I, the pitcher, might throw my fastball 60% of the time, my curveball 25% of the time, and my changeup 15% of the time) such that the average result of each pitch (measured any way you want, but ideally in win expectancy) is the same, and such that it doesn’t matter which pitch or pitches the batter is “looking” for. Again, you’ll notice a lot of similarities between this and my sac bunt discussion from a few days ago. And since a pitcher should, in the long run (he doesn’t have to, of course), face an unbiased sample of batters and game situations, the average value of all of his pitches should be roughly equal. If the values of all of a pitcher’s pitches are equal, how can he have a “best pitch?” He can’t – sort of.
How can that be, when clearly one pitcher’s fastball is better than another’s, one pitcher’s slider is better than another’s, etc.? Because the quality of each of a pitcher’s pitches, as measured by its results, depends on two things: one, the quality of that pitch in a vacuum (as in, if we were scouting a pitcher and we said, “Let’s see your fastball”), and two, the percentage of the time he throws that pitch in any given situation. The “better” the pitch in a vacuum, the more he throws it (depending also on how many other pitches he has), such that all of his pitches eventually have the same value in any given situation.
So, if you define “best” or “good” as the quality of a pitch in a vacuum, or when the batter knows it’s coming (or thinks that all pitches are equally likely), then yes, pitchers do have a “best” pitch. But if we define the quality of a pitch as the value of its average result in a game situation, then all of a pitcher’s pitches are of the same “quality.”
Let me give you an example: Let’s say that Lidge has an average fastball but a very good slider, and that’s all he throws. And let’s say that we have another pitcher with a great fastball and only an average slider. You would be tempted to say that Lidge’s “best” pitch is his slider and that the other pitcher’s “best” pitch is his fastball. And if we put a batter at the plate and told him what was coming, Lidge’s slider would outperform the other guy’s slider and the other guy’s fastball would outperform Lidge’s fastball. Even if we didn’t tell the batter what was coming but he assumed an equal likelihood of receiving each pitch, the results would be the same.
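To make the game-theory point concrete, here is a minimal two-pitch, two-guess sketch. The run values are made-up numbers, not measurements; the solve finds the fastball frequency at which the batter gains nothing by sitting on either pitch:

```python
def equilibrium_fastball_rate(fb_sit_fb, fb_sit_sl, sl_sit_fb, sl_sit_sl):
    """Fastball frequency p that makes the batter indifferent between
    sitting fastball and sitting slider (payoffs = runs to the batter).
    Indifference: p*fb_sit_fb + (1-p)*sl_sit_fb == p*fb_sit_sl + (1-p)*sl_sit_sl
    """
    return (sl_sit_sl - sl_sit_fb) / (
        (fb_sit_fb - fb_sit_sl) + (sl_sit_sl - sl_sit_fb)
    )

# Hypothetical payoffs: each pitch gets hit hard only when the batter sits on it.
p = equilibrium_fastball_rate(0.05, -0.03, -0.02, 0.04)  # ~0.43 fastballs

# At that mix, the batter's expected value is the same whatever he looks for:
ev_sit_fb = p * 0.05 + (1 - p) * (-0.02)
ev_sit_sl = p * (-0.03) + (1 - p) * 0.04
```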
The answer to that question is complicated. There is no easy yes or no answer, and that is not so much because there are so many variables whose values we don’t know; it has to do with game theory. Oh, and in case you didn’t watch the 6th game of the ALCS, or you forgot: in the 8th inning, with the Yankees up 3-2, Swisher bunted with a runner on first (and no outs, of course), and when he reached on an ROE, Melky bunted with runners on first and second.
Many people, including those who are sabermetrically inclined, typically decry the sacrifice bunt: why give away outs? The conventional (and lazy) sabermetric wisdom used to be that sac bunt attempts were almost always incorrect, at least ever since The Hidden Game of Baseball told us so, and legions of sabermetric fans and even sabermetricians looked at the RE and WE tables and noticed that the game state after an out and base-runner advance was worse than the one before; hence, the sac bunt is wrong.
The problem, of course, is that this is a ridiculously simplistic way to answer the question, on two fronts. One, the WE or RE before and after a “successful” sac bunt, using a standard table, is based on an average batter in an average lineup against an average pitcher and defense in an average stadium on an average spring day. At least some analysts recognized that in different contexts those numbers would have to be revised. However, most of them also noted that the gap between the “before” and “after” states was so large (in favor of the “before” state, which assumes hitting away most of the time) that it would take an enormously bad hitter, like a pitcher, to make it correct to bunt. And they would basically be right.
Now, there is a more important and pertinent reason why looking at RE and WE charts and comparing the “before” and “after” numbers does not help you answer whether a sac bunt (by a non-pitcher) is correct in any given situation. And that is because a sac bunt attempt obviously does not lead to an out and a base-runner advance 100% of the time (or even close to it); in fact, the average result of a sac bunt attempt is not even equivalent to an out and a base-runner advance. Also, the average result varies a lot with the speed and bunting skill of the batter, and with whether and by how much the defense is anticipating the bunt (among other things).
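A sketch of the right comparison: the expected value of the bunt attempt is a weighted average over its possible outcomes, set against the hitting-away run expectancy. All numbers below are hypothetical placeholders, not real frequencies:

```python
# Hypothetical run-expectancy values, runner on 1st, 0 outs.
RE_HIT_AWAY = 0.90  # average batter swings away

# (probability, resulting run expectancy) for a sac bunt attempt; the real
# distribution varies with the batter's speed/bunting skill and the defense.
bunt_outcomes = [
    (0.75, 0.70),  # successful sacrifice: runner on 2nd, 1 out
    (0.10, 1.55),  # batter reaches (hit or error): 1st and 2nd, 0 outs
    (0.15, 0.55),  # failed attempt: out recorded, runner still on 1st
]

ev_bunt = sum(p * re for p, re in bunt_outcomes)  # ~0.76, worse than 0.90 here
```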
Most of you are familiar with Ultimate Zone Rating (UZR), a metric for evaluating fielding using detailed play-by-play data. For first basemen, as for all other fielders, it measures the number of ground balls that are fielded and turned into outs, compared to an average-fielding first baseman given the same number, type, and location of ground balls, as well as the same outs, base-runner configuration, and batter handedness (plus an adjustment for the parks and the ground/fly tendency of the pitchers). The difference between these two values is a fielder’s UZR. It is usually expressed as a number of runs saved or cost compared to an average player at that position over a specified period of time (usually defensive games). You will often see it as a rate stat, generally per 150 games.
What UZR does not measure for first basemen, because the requisite data is not readily available, are the theoretical runs saved or cost by virtue of a first baseman’s skill at successfully catching errant throws or throws in the dirt. For lack of a better word, I call these “scoops,” even though they also include poor throws (e.g., high or off-line) that are not in the dirt.
Fortunately, there is a way to estimate this skill, using a relatively simple method “invented” by Tom Tango, called “without and with you”, or WOWY. The WOWY methodology was explained by Tango in an excellent article in the 2008 Hardball Times as well as in various online forums, including his (and my) blog, www.insidethebook.com.
Basically, it goes like this: Figure out what happens when a particular player is on the field (generally something that explicitly involves that player, or at least is affected by that player, such as the number of ground balls a particular SS – say Derek Jeter – fields and turns into outs). Then figure out the same thing when that player is not on the field, but all other relevant variables (in this example, mostly the pitcher, park, and opponent batters) remain constant. The difference between the two rates (per whatever you want) should reflect the difference in skill, or at least in performance (skill is usually performance regressed toward some mean, the amount of regression being a function of the sample size of the performance) of whatever you are measuring, between the player in question and the average player when the player in question is not on the field (but in the data set).
In other words, to use the Jeter example, if Derek fields 4 balls per 9 innings and all other SS (let’s say that our sample of “other SS” is large and unbiased enough to consider them league average) field 4.5 balls per 9 innings, given the exact same pool of pitchers, batters, and parks, then we can safely say (more or less – within the bounds of sample error) that Jeter is .5 balls per 9 innings worse than the average SS. As I said, it is simple and brilliant. The results are extremely telling if we can get large enough samples of Jeter and lots of other SS that are “matched” with Jeter based upon parks, pitchers, batters, etc.
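The Jeter comparison reduces to a simple rate difference; here is a minimal sketch (the counts are invented to match the .5-balls-per-9 example):

```python
def wowy(with_events, with_opps, without_events, without_opps):
    """'With' rate minus 'without' rate, per opportunity."""
    return with_events / with_opps - without_events / without_opps

# Hypothetical counts: Jeter fields 400 balls in 900 innings; the matched
# pool of other SS fields 900 balls in 1800 innings behind the same
# pitchers, in the same parks, against the same batters.
per_nine = wowy(400, 900, 900, 1800) * 9  # -0.5 balls per 9 innings
```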
The methodology is a little more complicated than that, but hopefully you get the general idea. Anyway, the same thing can be done with first basemen in order to estimate their “scooping” performance and ability. What I did was look at every ground ball that was thrown from each infielder to each first baseman. I put that ground ball into one of two buckets: One, when there was no error on the throw, or two, when there was an error on the throw. The assumption is that when a throwing error is made, there is an errant throw (no duh) and the first baseman is not able to somehow coax that bad throw into an out by scooping it out of the dirt, jumping in the air, catching it while off the bag and still making the play, etc. Obviously most of the time when a throwing error occurs, there is nothing that any first baseman can do about it. However, in the long run, we can assume that a certain fixed percentage of bad throws from each infield position will always result in an error while another fixed percentage of those bad throws have a chance to be “saved” by a skilled (or tall) first baseman.
We can also assume that when an error is not made, that sometimes a bad throw occurs and the first baseman “saves the day.” Again, most outs and non-errors occur on easily catchable throws, but a certain fixed percentage of them will occur on bad throws that are skillfully and successfully handled by the first baseman.
Given these assumptions (which are true, by the way), if we look at all throws by a particular player at a particular position to a certain first baseman and then compare the results (error or no error) to when throws are made from the same infielder at the same position to all other first basemen, the difference can be attributed to the “scooping skill” of that particular first baseman.
Say that all infielders threw to Todd Helton 1000 times in 2007. And say that those exact same infielders also threw to some other first baseman (and we are going to assume that all of these “other” first basemen are, collectively, average) around 1000 times (it does not really matter how many throws went to these other first basemen, although we would like it to be a lot). Now let’s say that only 15 throwing errors were made on the throws to Helton, while 20 were made on the throws to all the other first basemen. We can safely say (with some uncertainty, of course, due to sample error) that Helton was 5 plays per 1000 throws (about a full season, actually) better than the other pool of first basemen, whom we are assuming to be average. So Todd is 5 plays, or around 4 runs, per season above average at “scoops.”
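A sketch of the plays-to-runs conversion for a hypothetical first baseman (the 0.8 runs-per-error factor is my rough assumption, chosen so 5 saved plays comes out near 4 runs):

```python
RUNS_PER_SAVED_ERROR = 0.8  # rough assumed cost of a throwing error, in runs

def scoop_runs(errors_with, throws_with, errors_without, throws_without):
    """Plays saved over `throws_with` attempts, converted to runs above average."""
    rate_diff = errors_without / throws_without - errors_with / throws_with
    plays_saved = rate_diff * throws_with
    return plays_saved * RUNS_PER_SAVED_ERROR

# Hypothetical: 15 errors on 1000 throws to him vs. 20 per 1000 to others
# = 5 plays saved, or about 4 runs above average.
runs = scoop_runs(15, 1000, 20, 1000)
```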
In actuality, what I did was match up every player at every infield position (including pitcher and catcher) with every first baseman, and then, for each specific first baseman, I did a “with and without” and weighted that difference by the minimum of two numbers: the throws made to the first baseman in question and the throws made to all other first basemen by the same fielder.
For example, let’s say that over the sample period Edgar Renteria threw 300 balls to Albert Pujols (by the way, I used all data from 2000 to 2008) and made 6 throwing errors. Now let’s say that Renteria also threw 800 balls to other first basemen (on the Cardinals or any other team he played for) and made 20 errors. The difference in error rate is .5 errors per 100 throws in favor of Albert. Thus we give him credit so far for .5 errors per 100, weighted by 300 throws (the lesser of 300 and 800). We do that for every fielder who threw to Albert at least 20 times and to other first basemen at least 20 times, and then we add everything up to get a weighted average. This weighted average represents a first baseman’s “scooping” skill as compared to an average first baseman, or at least as compared to the average first baseman in that player’s “matched pool” of first basemen.
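Putting the matching and weighting together, a sketch of the per-first-baseman calculation (the function name and data layout are my assumptions; the 20-throw cutoff mirrors the text):

```python
def matched_scoop_rating(pairs, min_throws=20):
    """Weighted-average error-rate difference for one first baseman.

    pairs: one tuple per infielder who threw to him:
      (errors_to_him, throws_to_him, errors_to_others, throws_to_others).
    Each fielder's rate difference is weighted by the smaller throw count;
    fielders below the minimum on either side are skipped."""
    num = den = 0.0
    for e_him, t_him, e_oth, t_oth in pairs:
        if t_him < min_throws or t_oth < min_throws:
            continue
        diff = e_oth / t_oth - e_him / t_him  # positive = fewer errors with him
        weight = min(t_him, t_oth)
        num += diff * weight
        den += weight
    return num / den if den else 0.0

# Renteria-to-Pujols example: 6 errors in 300 throws to Albert vs. 20 in 800
# to others = .5 fewer errors per 100 throws, weighted by 300.
rating = matched_scoop_rating([(6, 300, 20, 800)])  # 0.005 per throw
```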
That last part of the last sentence is the primary flaw in the methodology. Since I am only comparing each first baseman to all other first basemen who had the same fielders throwing to them, it is possible, and in some cases inevitable, for one first baseman to be compared to a pool of primarily good or bad first basemen, since each one will tend to be compared with the others on his own team. For example, Helton will tend to be compared with whoever else manned first base for the Rockies when he was not on the field. If those few backup first basemen happened to be particularly bad at “scooping” balls, then Helton will be overrated by this methodology. In fact, because of this flaw, and because regular first basemen tend to be better at “scooping” than backups, any regular will tend to be overrated (because he is often compared to a backup) and any backup will tend to be underrated (because he is often compared to a regular). Keep in mind, though, that because some of Helton’s infielder teammates also played on other teams, he is not always compared to another Rockies first baseman. Anyway, for now, it looks like we are going to have to live with this flaw or weakness in the system. The correct way to handle the data, of course, is to adjust each first baseman’s rating by that of his “others” via an iterative process. In the future I may do this, but for now we are stuck with version 1.0.
Here are the results:
Again, I used play-by-play data from 2000 to 2008.
Interestingly, and not unexpectedly, there were significant across-the-board differences in the scooping ability of the average tall and short, lefty and righty, and regular and backup first baseman.
Tall (>6’1”) RH: .6 runs per 1000 matched throws (about a full season).
Tall LH: 1.2 runs
Short RH: -.8 runs
Short LH: .6 runs
Less than 300 matched throws: -1.5 runs
300 to 1000 throws: .4 runs
More than 1000 throws: .6 runs
Players with at least 1000 matched throws in total, where the minimum of each fielder’s paired throw counts is what gets added to the total:
Best per 1000 throws
Berkman +4 runs, 2928 matched throws
Choi +4, 1361
Conine +3, 3100
Conor Jackson +3, 1790
Loney +3, 1679
Mientkiewicz +3, 4344
D Ward +3, 1119
Olerud +2.5, 4481
Sexson +2.5, 6471
Tony Clark +2.5, 2880
Dan Johnson +2.5, 1774
Overbay +2.5, 4834
Worst per 1000 throws
Karros -4, 2986
Mo Vaughn -4, 1644
Galarraga -3, 1842
Stairs -3, 1122
Bagwell -2.5, 4931
Casey -2.5, 6444
Julio Franco -2.5, 2010
Swisher -2.5, 1134
Thome -2.5, 4476
When we regress these numbers in order to turn them into “true talent” scooping ability, we get slightly smaller values, depending of course on the number of matched throws (the sample size). In fact, it appears that the spread in talent between the best and worst “scoopers” at first base is on the order of plus or minus 2-3 runs (a 4-6 run spread). So before you start opining about how your favorite first baseman is so great defensively because he “saves so many errors,” consider that scooping ability is probably worth less than ¼ of total defensive ability or value at first base. Fielding grounders is at least 75% of the package, and “scooping” is the rest. But every little bit helps.
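The regression step can be sketched with the standard shrink-toward-the-mean form; the constant k (the number of throws at which observation and prior get equal weight) is a placeholder I made up, not an estimate from the data:

```python
def regressed_scoop_talent(observed_runs, matched_throws, k=2000):
    """Shrink an observed per-1000-throw scoop rating toward the league
    average of zero; more matched throws means less shrinkage."""
    return observed_runs * matched_throws / (matched_throws + k)

# Example: an observed +4 runs over 2928 matched throws shrinks to ~+2.4.
talent = regressed_scoop_talent(4.0, 2928)
```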