Author Archive

Want to Work for a Major League Team?

I’m sure there are more than a few FanGraphs readers qualified for this job:

Major League organization is looking for a motivated individual with significant (5+ years) experience in both web development and database application design/maintenance for a full-time position. Must have experience or working knowledge in all of the following areas: Linux (Ubuntu a plus), PHP, PostgreSQL, HTML, CSS, Unix Shell Scripting, Python, Ruby, and XML data importing. Understanding baseball statistics and player evaluation methodologies is also required. Love of baseball will definitely help but is not strictly required.

Please send your resume and a brief explanation of your experience to mlbprogrammer@gmail.com.


Fan Projections Now Exportable

I’ve made a few updates as requested for the Fan Projections.

– The Fan Projections now require only 15 votes to be eligible. This added 200 or so players to the mix, and I imagine that most MLB regulars and some not-so-regulars should have Fan Projections now.

– There’s now a UZR column in the sortable projections.

– You can now export the Fan Projections to either Excel or CSV files on the projections page.


Poll: Tim Lincecum Arbitration

Tangotiger’s running the same poll over at insidethebook.com:


The 2010 Marcel Projections

Yesterday, Tangotiger released the 2010 Marcel projections and now they’re available on all the player pages and in the sortable projection pages.

A couple of things to note about the Marcel projections:

– About Marcel, Tango himself writes, “it is the most basic forecasting system you can have, that uses as little intelligence as possible. It uses 3 years of MLB data, with the most recent data weighted heavier. It regresses towards the mean. And it has an age factor.”

– The wOBA calculation I’m using for FanGraphs is going to be different than the ones included in the projections. To keep things consistent, I’m including SB and CS and using the 2009 weights, just like all the other projections.
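To make Tango's description concrete, here is a minimal sketch of a Marcel-style rate projection: weight three seasons with the most recent counting most, regress toward the league mean, then apply an age factor. The function name, the 5/4/3 weights, the 1,200 plate appearances of regression, the peak age of 29, and the per-year age adjustments are all assumptions chosen for illustration, not values taken from this post.

```python
def marcel_style_rate(seasons, league_rate, age,
                      weights=(5, 4, 3),     # assumed: most recent season weighted heaviest
                      regression_pa=1200,    # assumed: league-average PA added for regression
                      peak_age=29):          # assumed peak age
    """Project a per-PA rate from up to three seasons.

    seasons: list of (events, plate_appearances), most recent season first.
    league_rate: league-average events per plate appearance.
    """
    weighted_events = sum(w * ev for w, (ev, _pa) in zip(weights, seasons))
    weighted_pa = sum(w * pa for w, (_ev, pa) in zip(weights, seasons))

    # Regress toward the mean by blending in league-average plate appearances.
    regressed = (weighted_events + league_rate * regression_pa) / (
        weighted_pa + regression_pa
    )

    # Age factor (assumed sizes): small boost before the peak, small penalty after.
    if age < peak_age:
        age_adjustment = 1 + 0.006 * (peak_age - age)
    else:
        age_adjustment = 1 - 0.003 * (age - peak_age)

    return regressed * age_adjustment


# Hypothetical hitter: 30 HR in 600 PA in each of the last three seasons.
projected_hr_rate = marcel_style_rate([(30, 600)] * 3, league_rate=0.03, age=29)
```

With these made-up inputs the projected rate lands between the player's own 0.050 and the league's 0.030, which is the regression step at work.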


2010 CHONE Projections!

Courtesy of baseballprojection.com and Sean Smith, we’ve got the CHONE projections available in all the player pages and the sortable leaderboards!

You’ll also note that there are WAR values generated off the CHONE projections for non-pitchers. A couple of things to note about these: they use the CHONE defensive projections (which are based off Total Zone), but everything else is calculated in the same manner as the usual FanGraphs WAR. I am also not park adjusting the “Batting” component, which is the same deal as the “Fan” projections. I’ll probably do the WAR for pitchers sometime next week.


Some Thoughts for the New Year

This past week, while a lot of us were on hiatus, there was a good bit of discussion going on in the blogosphere about the role stats played in baseball. This eventually led to “The Mike Silva Chronicles” posts over at insidethebook.com, in which Mike Silva asked 10 questions and Tangotiger answered them. They’re without a doubt worth reading.

One of these posts in particular had to do with “Stats Saturation” in which Mike Silva asked:

Do you believe the advanced metric community is saturating the market with stats, to the point where progress that was initially made with the evolution of traditional stats is now minimal at best. Wouldn’t it be wiser to allow some of the current metrics to gain acceptance before you “advance the current advanced metrics” (corny phrase so to speak).

FanGraphs has been around for almost 5 years and this is a question which is frequently on my mind.

For the most part, FanGraphs does not create new statistics. It’s more of an aggregation of what I consider the best publicly available sabermetric work on the internet from the minds of people like Tom Tango, Mitchel Lichtman, Dan Szymborski, Dave Studeman, Sean Smith, with contributions from many others including our own staff.

There are certainly a lot of stats available on FanGraphs. By my count there are well over 100 different metrics available for batters and pitchers, which are generally broken into logical sections which make them a bit easier to handle. Some of these stats have caught on more in the mainstream, while others I’m sure will continue to only be used among the sabermetric crowd.

With so many stats, I can certainly understand the concern that adding even more might be overkill. The truth of the matter is that there really are different categories of stats, and even those who are not particularly interested in sabermetrics are very much interested in scouting statistics.

Four of the stat sections on FanGraphs are devoted entirely to scouting statistics. These include the Pitch Types, Plate Discipline, and Batted Ball sections. The data in these sections is based literally off what Baseball Info Solutions scouts see. There are no fancy calculations and everything is fairly intuitive. Was the ball in the strike zone or out of the strike zone? How fast was the pitch and what type of pitch was it? Is there room for some disagreement on these? Of course. There’s always going to be some disagreement between scouts, but you really can’t blame the formulas.

Which brings us to the stats which are based on formulas. Many of the metrics on FanGraphs are rooted in “linear weights”. wOBA, wRC, wRAA, wRC+, FIP, and the Batting component of our Value section are all entirely linear weight based. These all use linear weights to measure different things, or in some instances the same thing expressed in a different way. But at their heart they’re all measuring bucketed stats in runs.
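As a rough illustration of what "linear weight based" means, the sketch below assigns each bucketed event a weight, sums the weighted events per plate appearance to get a wOBA-style rate, and then converts that rate into runs above average (wRAA). The weights, the wOBA scale of 1.2, and the use of plate appearances as the denominator are placeholder assumptions for the example, not the actual FanGraphs values for any season.

```python
# Illustrative weights only (assumed, not official values for any year).
ILLUSTRATIVE_WEIGHTS = {
    "uBB": 0.70, "HBP": 0.73, "1B": 0.89, "2B": 1.27,
    "3B": 1.61, "HR": 2.07, "SB": 0.25, "CS": -0.50,
}


def woba(counts, pa, weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted sum of bucketed events, per plate appearance."""
    return sum(weight * counts.get(event, 0) for event, weight in weights.items()) / pa


def wraa(player_woba, league_woba, pa, woba_scale=1.2):
    """Runs above average: the wOBA gap, rescaled to runs, times playing time."""
    return (player_woba - league_woba) / woba_scale * pa


# Hypothetical batting line.
line = {"uBB": 60, "HBP": 5, "1B": 100, "2B": 30, "3B": 3, "HR": 25, "SB": 10, "CS": 4}
player_woba = woba(line, pa=650)
runs_above_average = wraa(player_woba, league_woba=0.330, pa=650)
```

The same weighted sum feeds wOBA, wRAA and wRC; they differ only in how the result is scaled and what baseline it is compared against.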

Now neither scouting stats nor linear weight based stats are, in my mind, particularly controversial. There are, however, controversies on how linear weights are adjusted, such as by park or by league. And then when scouting stats and linear weights are combined, it seems to be a particularly hot button issue. UZR, for instance, takes scouting data (where the ball landed, how hard it was hit, its trajectory), turns that data into linear weights depending on how the ball was classified, and then applies adjustments on top of that.

UZR and the Pitch Type values are really statistical scouting. They’re all about putting a value on what you see.

So getting back to Mike Silva’s question of whether it’s better to wait for acceptance before introducing new stats, I’ll say it depends. The standard linear weight based stats on FanGraphs aren’t going anywhere. I really don’t think it’s worth introducing an entirely new named set of statistics for what might be a very small increase in accuracy. The current ones may from time to time be tweaked slightly, but they’ll remain under the same familiar names and will represent the same things.

Though when it comes to stats that really do bring a new point of view, whether it be with new scouting data, or drastically different models, then I’ll say they’re welcome at FanGraphs. The other type of stat that I’m not opposed to adding is one that makes an existing model more accessible, which I felt was the case with wRC+.

With all that said, I’d just like to throw out a quick reminder for the new year about thinking. Please take the time to understand the stat you’re using before you use it in an argument and before you criticize it. It’s best when the stats available on this site are used in thoughtful, open-minded discussions that enhance your knowledge and, most importantly, your enjoyment of the game of baseball.


What is wRC+?

As you may have noticed, there’s now an extra column in the “Advanced” section for batting stats called “wRC+”. You can think of this stat as a wOBA based version of OPS+. It’s park and league adjusted and it’s on a very similar scale to OPS+. The difference is that it uses wRC, which is based on wOBA.

For those of you not familiar with the scale, 100 is average. Anything higher is above average and anything below 100 is below average.
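For readers who want to see how a 100-is-average index can fall out of wRC, here is a simplified sketch. It is one plausible way to build such an index, not necessarily the exact calculation used on the site: the function names, the single multiplicative park factor, and the sample league values are all assumptions for illustration.

```python
def wrc(woba, league_woba, woba_scale, league_runs_per_pa, pa):
    """Runs created from wOBA: runs above average per PA plus the league rate."""
    return ((woba - league_woba) / woba_scale + league_runs_per_pa) * pa


def wrc_plus(woba, pa, league_woba, woba_scale, league_runs_per_pa, park_factor=1.0):
    """Index the player's park-adjusted wRC per PA against the league rate.

    park_factor is assumed to be a simple multiplier: above 1.0 for a
    hitter-friendly park, below 1.0 for a pitcher-friendly one.
    """
    player_rate = wrc(woba, league_woba, woba_scale, league_runs_per_pa, pa) / pa
    return 100 * (player_rate / park_factor) / league_runs_per_pa


# Assumed league context for the example.
league = dict(league_woba=0.330, woba_scale=1.2, league_runs_per_pa=0.12)

average_hitter = wrc_plus(0.330, 600, **league)
good_hitter = wrc_plus(0.380, 600, **league)
good_hitter_in_hitters_park = wrc_plus(0.380, 600, park_factor=1.05, **league)
```

A league-average hitter in a neutral park comes out at 100, a better hitter comes out above 100, and the same line produced in a hitter-friendly park is marked down.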

Big thanks go out to Tangotiger for pointing out how easy it was to implement this particular stat. It’s now available in all the player pages for major league and minor league players going all the way back to 1871. Please note that it is not park adjusted for minor league stats, but it is league adjusted.

It’s also in the career leaderboards and will soon be making its way into the individual season leaderboards too.

Update: I’ve removed pitchers from the league baselines, so the values will have changed slightly since this morning.


The Hardball Times & FanGraphs

The Hardball Times and FanGraphs have a long history of working together. I used to write the original Daily Graphing series over at the Hardball Times when FanGraphs first launched, and if you look at the list of contributors over at THT, you’ll see more than a few familiar FanGraphs regulars among them.

To further our relationship, we’ll be promoting the great writing over at THT throughout the site. You’ve probably already noticed the THT article box on our homepage, and relevant THT articles will also start showing up on the player pages. The Hardball Times will no longer be providing stats and will instead defer to the FanGraphs stats sections.

With the THT stats pages departing, there were also a few stats unique to THT that didn’t make the jump over to FanGraphs. We know xFIP is important to everyone, so it’s now available on all the player pages and leaderboards. The more detailed catcher fielding data will eventually be added to FanGraphs in some form or another, but no decisions have been made on anything else just yet.

While I do realize that no one’s going to be thrilled by the departure of the THT stats pages, both Dave Studeman and I think that this will be a winning situation for everyone. If there are stats or graphs you really wanted to see updated daily next season that won’t be on THT anymore, just let us know. We’d like to make the transition for you as smooth as possible.


Want to be an MLB.com Stringer?

If you ever wanted to be an MLB.com stringer, now’s your chance to apply!


MLB.com, the Official Site of Major League Baseball, is seeking stats stringers in these markets for the 2010 season:

· Boston
· Cincinnati
· Cleveland
· Detroit
· Miami
· Milwaukee
· Texas

Stats stringers are responsible for digitally scoring games from one of the 30 MLB ballparks, which provides the data used in the live content applications on MLB.com, including Gameday and MLB.TV, real-time highlights and text alerts, and by our business partners. This is a perfect part-time job for a diligent, responsible employee who happens to be a big baseball fan.



Fan Projection Targets: 12/2/2009

Today’s targets are: Kelly Shoppach, Josh Johnson, Jake Peavy

Yesterday, our three targets were Jay Bruce, Ricky Nolasco and Edwin Jackson. Let’s see how they fared:

Jay Bruce ended the day with some 200-plus ballots cast and it seems everyone is pretty optimistic about him having a much improved season with a .358 wOBA and somewhere right around 3 WAR. The interesting thing here, though, is that there are apparently only eight Reds fans who visit FanGraphs, and they expect him to have a considerably better year with a .373 wOBA and 4.4 WAR! They project his defense to continue at high levels with a UZR of 7.4 compared to 3.2 for non-fans.

However, fans are not expecting a repeat for Edwin Jackson and instead have projected him for a healthy dose of regression. Everyone thinks the walks will go up, the strikeouts will go down a bit, and the ERA will jump back over 4. Tigers fans think he’ll fare slightly better than non-fans, but everyone seems to be on the same page.

And finally Ricky Nolasco, who is always a hot subject for debate, continued to be controversial in the collective fans’ minds. Now, we don’t really have enough data to run a full study, but generally, fans are torn between two buckets when filling out their ERA selection. Nolasco’s ERA voting was 24% for 3.00-3.49, 46% for 3.50-3.99 and then 25% for 4.00-4.49. Everyone thinks he’ll continue to strike out over 8 per 9 innings and nearly everyone thinks he’ll walk under 2.5 per 9, so no one is questioning his peripherals, but his actual results continue to be a big question mark.
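As a quick back-of-the-envelope check on that split, the sketch below turns the three quoted ERA buckets into a single vote-weighted estimate by using each bucket's midpoint. This is only an assumed way of summarizing bucketed ballots, not how the Fan Projections are actually computed, and because the three quoted shares add up to 95% the estimate is normalized over just those buckets.

```python
# ERA buckets and vote shares quoted above for Ricky Nolasco.
era_votes = {
    (3.00, 3.49): 0.24,
    (3.50, 3.99): 0.46,
    (4.00, 4.49): 0.25,
}


def bucket_weighted_mean(votes):
    """Vote-weighted average of bucket midpoints, normalized over the listed buckets."""
    total_share = sum(votes.values())
    weighted = sum(share * (low + high) / 2 for (low, high), share in votes.items())
    return weighted / total_share


def spread_outside_top_bucket(votes):
    """Share of the listed votes that fell outside the most popular bucket."""
    total_share = sum(votes.values())
    return 1 - max(votes.values()) / total_share


estimated_era = bucket_weighted_mean(era_votes)
disagreement = spread_outside_top_bucket(era_votes)
```

With these shares the midpoint estimate comes out to roughly 3.75, while about half of the listed votes sit outside the most popular bucket, which is one way to put a number on how split the ballots are.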