Author Archive

UZR Updates!

The first UZR updates of the 2010 season are in, and from here on out they’ll be updated every Sunday night.

There have been a few improvements made to UZR this year, which will also be reflected in prior years’ UZR data. The changes do impact a few players, but for the most part, each player’s UZR is either unchanged or within a couple of runs of his rating before the improvements. Mitchel Lichtman, the man behind UZR, outlines the changes below:

Park factors have been improved, especially for “quirky parks and portions of parks,” such as LF and CF at Fenway, LF in Houston, RF in the Metrodome, and the entire OF in Coors Field. Of course, park factors in general are updated every year, as we get more data in each park, and as new parks come into existence and old parks make material (to fielding) changes.

In the forthcoming UZR splits section, we will also be presenting UZR home and road splits, as a sanity check for those of you who are skeptical of park factors. Please keep in mind that regardless of the quality of the park adjustments, there can and will be substantial random fluctuations in the difference between home and away UZRs and it is best to evaluate a fielder based on as much data as possible (e.g., using home and road stats combined), as we do with most metrics and statistics.

Adjustments have been added to account for the power of the batter as a proxy for outfielder positioning, so that, for example, if an outfielder happened to have “faced” a disproportionate percentage of batters with less than or more than average power, the UZR calculations will make the appropriate adjustments (as best as it can). Obviously, these kinds of adjustments are more important for smaller samples of data than for larger samples, since, in larger samples, these kinds of anomalies (in terms of opponents faced) tend to “even out.”

For infielders, similar adjustments are made for the speed of the batter, as a proxy for infielder positioning and how quickly the infielders have to field and release the ball, as well as the speed of the throw.

When a “shift” is on in the infield, according to the BIS stringers, if the play was affected by the shift, the UZR engine ignores the play. As well, if an air ball hits the outfield wall and in the judgment of the BIS stringers, no outfielder could have caught the ball, the play is similarly ignored.

Also keep in mind that UZR does not include first basemen “scoops” or the ability of the first baseman to influence hits and errors caused by errant throws from the other infielders. According to my (MGL) research, yearly “scoops” numbers are generally in the 1-4 run range, which means that the true talent range of most first basemen with respect to “scoops” is probably in the plus or minus 2 runs per year range – i.e., not much.


Dave Cameron Joins FanGraphs Full Time

I’m pleased to announce that Dave Cameron will be joining FanGraphs full time. Dave will continue to be the managing editor of FanGraphs and will have his duties expanded to other areas of the business.

Dave has played a pivotal role in the growth of FanGraphs since joining in early 2008. It’s very exciting to be able to bring Dave onboard in a larger capacity where he’ll be able to devote even more of his time to all things baseball, in addition to working on new and exciting FanGraphs projects.


Some Thoughts on Batted Ball Data

Colin Wyers wrote a post today about potential bias in batted ball data. While I don’t have anything in particular to say about the results of his bias study, I have to disagree with his conclusion and debunk some of the information provided about the differences between Stat Corner tRA and FanGraphs tRA, which he uses to illustrate his point:

For starters, the difference in tRA between FanGraphs and Stat Corner is a poor stat to illustrate GB/FB/LD bias because there are other differences in the way the two sites calculate it. Let’s take Felix Hernandez, for whom BIS and Gameday have very similar batted ball profiles for 2010.

        GB%     LD%     FB%
BIS     67.6    13.5    16.6
GD      65.8    13.2    15.8

Now, here’s the difference between FanGraphs tRA and StatCorner tRA:

FG – 4.62
SC – 5.05

That’s almost half a run of difference. Why are they so different? I would imagine it’s probably the component park factors, mainly on LD% and HR%.

Actually, I’ll plug both of those stat lines into the FanGraphs tRA calculator and see what I get: 4.62 and 4.70. So, about .08 of the difference is due to the GB/FB/LD inputs and the other .35 is park factors (or potentially slightly different weights).
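The arithmetic behind that split can be written out as a short sketch. The variable names are mine, and the residual is simply whatever is left once the batted ball effect is removed; it is not a direct measurement of park factors.

```python
# tRA values quoted above for Felix Hernandez, 2010
fg_tra_bis = 4.62  # FanGraphs calculator with the BIS batted ball profile
fg_tra_gd = 4.70   # FanGraphs calculator with the Gameday batted ball profile
sc_tra = 5.05      # StatCorner's published tRA

# Total gap between the two sites
total_gap = sc_tra - fg_tra_bis

# Part of the gap explained purely by swapping the GB/FB/LD inputs
batted_ball_gap = fg_tra_gd - fg_tra_bis

# Everything else: park factors, weights, or other calculation differences
residual_gap = total_gap - batted_ball_gap

print(f"total gap:        {total_gap:.2f}")
print(f"batted ball data: {batted_ball_gap:.2f}")
print(f"everything else:  {residual_gap:.2f}")
```

Holding the calculator fixed and changing only the inputs is what isolates the batted ball piece; the rest cannot be attributed to a single cause without StatCorner's exact weights.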

Furthermore, if you look at the correlation of individual player GB% from 2003 to 2008 between BIS and Retrosheet data, you get .94. That’s among all players, whether they pitched 1 inning or 200 innings. Here are all three:

GB% – .94
FB% – .85
LD% – .72

It’s not like the two data sources are telling you completely different things. For the most part, they agree, especially on GB%.
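For anyone who wants to run the same kind of check on their own data, here is a minimal sketch of a Pearson correlation between two sources. I'm assuming a plain Pearson correlation is the right statistic here; the five GB% values are made up for illustration and are not real BIS or Retrosheet numbers.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical GB% for the same five pitchers as recorded by two data sources
gb_source_a = [44.1, 51.3, 38.7, 60.2, 47.5]
gb_source_b = [43.0, 52.8, 40.1, 58.9, 46.2]

r = pearson_r(gb_source_a, gb_source_b)
print(f"GB% correlation: {r:.2f}")
```

Each pitcher counts equally in this version, so a 1-inning sample carries as much weight as a 200-inning one.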

Baseball Info Solutions also rotates its scorers to try to avoid any scorer bias, as Ben Jedlovec stated here:

BIS Scorers are assigned “randomly”. We’re not using a random number generator, but it’s almost as effective. Scorers have a designated number (Ex. Scorer #11) which are then rotated through different slots in the schedule. If scorers 7 and 8 are scoring the late (west coast) games one day, they’ll be rotated to early games the next time around. There’s some miscellaneous switching to accommodate vacation, etc. too. In the end, everyone’s getting a good mix of every team in every park.

We also have several different quality control methods in place to make sure that scorers are consistent with their hit locations and types. We added some new tests this season using the hit timer to flag the batted ball data, so the 2009 data is better than ever.

Ben continues with:

BIS gets an almost entirely new set of video scouts each season. If you’re seeing the same “bias” in the same parks year after year, I can’t see how it would be related to the individual scorer.

It’s also important to note that BIS has an additional classification of batted ball data, Fliners, which is not displayed on FanGraphs and is instead lumped in with Line Drives and Fly Balls. Fliners come in two varieties, Fliner-Line Drives and Fliner-Fly Balls.

Colin tackled the line drive issue before at The Hardball Times, to which Cory Schwartz of MLBAM responded:

our trajectory data is indeed validated as thoroughly as all of our other data: not just once, but three times: first, by a game-night manager who monitors the data entered by the stringer, second by a next-day editor who reviews trajectories against video, and third by Elias Sports Bureau. We take great care in the accuracy of all our data, including trajectories.

None of this is to say that your original premise is not true: line drive vs. fly ball is indeed a somewhat subjective distinction that may be influenced by a number of factors, not just press box height. But I disagree with your assertion that the accuracy of our quality is inferior in this (or any other) regard.

Now we know that there is subjectivity in batted ball stats, but in Colin’s conclusion he writes:

In the meantime, consider this my sabermetric crisis of faith. It’s not that I don’t believe in the objective study of baseball. I’m just not convinced at this point that something dealing with batted-ball data is, at least wholly, an objective study. And where does this leave us with existing metrics that utilize batted-ball data? Again, I’m not sure.

For me, this is a bit of an extreme conclusion to make. For stats like GB% I think there is little to be concerned about, but once you get to LD%, I think you should realize there is some subjectivity involved. Is it worth disregarding entirely or having a “sabermetric crisis of faith” over? In my opinion, probably not.

We all want the best data possible and there are some exciting projects underway to collect more granular and precise data, but in the meantime, I don’t see any reason to dismiss the data that is currently available. Better batted ball data will certainly lead to more accurate results; I don’t think it will show completely different results.

Author’s Note: This is an expansion of my thoughts from a comment I posted on insidethebook.com


ZiPS In Season Projections

The ZiPS in-season projections have just gone live on the site and all the pre-season projections are now hidden by default. You can still see the pre-season projections by clicking on the “Show Projections” button right below each section’s heading.

There are two separate in-season ZiPS projections:

RoS = Rest of Season, or what a player will do for the remainder of the season.
Update = An updated full season projection for the player in question.
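As a rough illustration of how the two lines relate, here is a sketch that assumes the Update line is simply the stats a player has already accumulated plus his RoS projection. That is my reading of the two definitions above, not a description of how ZiPS itself is computed, and all of the numbers are hypothetical.

```python
def updated_projection(to_date, rest_of_season):
    """Add season-to-date counting stats to the rest-of-season projection."""
    keys = set(to_date) | set(rest_of_season)
    return {k: to_date.get(k, 0) + rest_of_season.get(k, 0) for k in sorted(keys)}


def batting_average(line):
    """Rate stats are recomputed from the combined counting stats."""
    return line["H"] / line["AB"] if line["AB"] else 0.0


# Hypothetical player: actual stats so far and a made-up RoS projection
to_date = {"AB": 100, "H": 32, "HR": 5}
ros = {"AB": 450, "H": 126, "HR": 20}

update = updated_projection(to_date, ros)
print(update)
print(f"Updated AVG: {batting_average(update):.3f}")
```

The point of the sketch is that a hot start raises the Update line even if the RoS projection barely moves, because the stats already in the books are treated as fixed.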

Huge thanks to Dan Szymborski of Baseball Think Factory for allowing us to use these again this season!


Defensive Runs Saved – Clarification

I’ve seen some confusion out there about the Fielding Bible runs saved metrics on FanGraphs. The Fielding Bible has two metrics, one in plays made above average and another in runs above average.

We are only displaying the information in Runs Above Average.

For those of you looking for the closest comparable to UZR, it is DRS. rPM (Plus Minus Runs Saved) would be comparable to RngR + ErrR.

Here’s the correlation between the two for players who played at least 500 innings in a season from 2003 to 2009:

Here’s another, more in-depth look: A Quick Comparison of UZR and Plus/Minus

For the most part they will agree, but there are definitely some players where they don’t. You can read more about the differences between UZR and Defensive Runs Saved here:

MGL on the differences between UZR and +/-


+/-, RZR, and New Fielding Stats

There have been a few changes to the fielding sections of the site.

The biggest change is that John Dewan’s Fielding Bible +/- runs saved is now available on all the player pages and leaderboards going back to 2003. The stats that are associated with +/- runs saved are:

All in Runs Above Average:
rSB – Stolen Base Runs Saved (Catchers/Pitchers)
rBU – Bunt Runs Saved (1B/3B)
rGDP – Double Play Runs Saved (2B/SS)
rARM – Outfield Arms Runs Saved
rHR – HR Saving Catch Runs Saved
rPM – Plus Minus Runs Saved
DRS – Total Defensive Runs Saved
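To show how the pieces fit together, here is a small sketch that assumes DRS is the plain sum of whichever components above apply to a player. That assumption is mine, based on DRS being listed as the total; the shortstop's numbers are hypothetical.

```python
# Component columns listed above, all in runs above average
DRS_COMPONENTS = ("rSB", "rBU", "rGDP", "rARM", "rHR", "rPM")


def total_drs(player_line):
    """Sum the runs-saved components present for a player (assumed formula)."""
    return sum(player_line.get(component, 0) for component in DRS_COMPONENTS)


# Hypothetical shortstop: only double play and plus/minus runs apply
shortstop = {"rGDP": 2, "rPM": 9}
print(total_drs(shortstop))
```

Components that do not apply to a position are treated as zero here, so the same function works for a catcher, an infielder, or an outfielder.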

We’ve also added Revised Zone Ratings, which The Hardball Times used to carry. I’ll quote from THT:

Revised Zone Rating is the proportion of balls hit into a fielder’s zone that he successfully converted into an out. Zone Rating was invented by John Dewan when he was CEO of Stats Inc. John is now the owner of Baseball Info Solutions, where he has revised the original Zone Rating calculation so that it now lists balls handled out of the zone (OOZ) separately (and doesn’t include them in the ZR calculation) and doesn’t give players extra credit for double plays (Stats had already made that change). We believe both changes improve Zone Ratings substantially. To get a full picture of a player’s range, you should evaluate both his Revised Zone Rating and his plays made out of zone (OOZ). You can read more about the Revised Zone Ratings in this article.

BIZ – Balls in Zone
Plays Made – Total Plays Made
OOZ – Plays Made out of Zone
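Following the THT definition quoted above, Revised Zone Rating can be sketched as in-zone plays divided by balls in zone, with OOZ kept separate. The `plays_in_zone` input is my assumption: it should count only outs recorded on balls in zone, so if a Plays Made total includes out-of-zone plays, subtract OOZ first. The fielder's numbers are hypothetical.

```python
def revised_zone_rating(plays_in_zone, balls_in_zone):
    """Proportion of balls hit into a fielder's zone that he converted to outs."""
    if balls_in_zone <= 0:
        raise ValueError("balls_in_zone must be positive")
    return plays_in_zone / balls_in_zone


# Hypothetical fielder: 300 balls in zone, 246 converted, plus 55 plays out of zone
biz = 300
in_zone_plays = 246
ooz = 55

rzr = revised_zone_rating(in_zone_plays, biz)
print(f"RZR: {rzr:.3f}, OOZ: {ooz}")
```

OOZ is reported next to the rate rather than folded into it, which matches the advice above to look at both numbers when judging range.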

There are also some other fielding stats that were added:

FE – Fielding Errors
TE – Throwing Errors
DPS – Double Plays Started
DPT – Double Plays Turned
DPF – Double Plays Finished
Scp – First Baseman Scoops

Everything is available in the player pages and leaderboards and will be updated nightly.

This season’s UZR updates will be coming soon and we’ll have more on that later.


Contact% and Swing Strike% Correction

It’s come to my attention that the 2010 Contact% and Swinging Strike% numbers looked a bit off. The four stats impacted (Z-Contact%, O-Contact%, Contact%, and SwStr%) are now fixed.


SBNation Partnership

I’m pleased to announce FanGraphs’ newest partner: SBNation! We’ll be highlighting their awesome content right here on the FanGraphs player pages, in the form of player specific related content, and on the home page and blogs.

And from SBNation.com you’ll be just a click away from FanGraphs stats and articles on their player pages and blogs.

We think this cross-integration of our content will help both FanGraphs readers and SBNation readers be even more informed as baseball fans!


The iPad = My Baseball TV

The first thing I did on my iPad was check out how the FanGraphs app looked on it. It works, but truth be told you’re better off using the website on an iPad, which will work in its entirety. I much prefer the FanGraphs app on the iPhone to the website, but it just doesn’t make a whole lot of sense to use it on a bigger screen. Now if you could use the FanGraphs app as an iPad widget and leave it in the corner all the time or something, that would be cool.

The second thing I did was download MLB At Bat 2010 for iPad.

MLB.tv through MLB At Bat 2010 works great on the iPad. While I was doing the live chat yesterday, I had the iPad propped up with whichever game I wanted to watch that was not blacked out. It’s like having a handheld television, specifically designed to watch baseball games.

And of course you have all sorts of statistical overlays available to you while you watch the game. Want to know which players are in the field, or want to see a boxscore while watching the game? No problem, those are all available right at your fingertips.

There’s also a high resolution MLB Gameday view, where you can see each at-bat, pitch by pitch along with lineup, boxscore, and video highlights. You can pull up so many extra “stat boxes” that it will more or less fill up the entire screen. It all looks really great and because you can’t access the web based MLB Gameday on your iPad, this is really the only official MLB live scoring option available to you.

The downside is that it costs $15, again. And I say again because I already have MLB At Bat 2010 for my iPhone, which cost $15 too. I also noticed a few opening day bugs, which weren’t show stoppers or anything, but caused the application to crash on me a couple of times. I’m sure these will get ironed out pretty quickly.

While it’s one of the more expensive apps you’ll purchase, it does provide the most robust and prettiest live scoring experience on your iPad, mainly because the iPad’s lack of Flash prevents many popular web-based solutions (including MLB’s very own) from working. And if you have an MLB.tv subscription and an iPad, getting MLB At Bat 2010 for the iPad is really a no-brainer.


2010 Stats Are Here!

It’s always a mystery to me how long it’s going to take to get the site in a state where it can load the next year’s stats. Fortunately, everything appears to be working (after a hiccup that caused all the player pages to error out) and all one game of 2010 stats have been loaded successfully! Remember that pitch type stats usually run one day behind and UZR is updated every Sunday.

Stats are typically updated nightly around 4am Eastern Time.

Tomorrow night I’ll switch over all the default views to 2010 stats.