Author Archive

Pitcher Win Values Explained: Part Five

We ended the last post in this series by talking about run environments. Generally, when you hear someone talk about a run environment, you think of either a specific park that has a notable influence on run scoring (say, Coors Field) or an era of baseball where the offensive level was significantly different than it is today (the dead ball era, for instance). In these environments, the game is a bit different, and runs can be either much more valuable or less valuable in helping win a game, depending on the context of the environment.

However, it’s not just in extreme parks or long ago where the run environment varies from the modern norms. Indeed, the run environment of current baseball varies from day to day depending on which pitcher is on the mound. CC Sabathia, through his dominance with the Indians and Brewers last year, created his own traveling low-run environment. When he took the mound, runs became hard to come by. Through his own abilities, Sabathia created a run environment in his own starts that wasn’t that close to the league average run environment of 2008.

This presents an issue. If we were to use the standard league average runs to wins conversion based on a normal run environment, we’d run into problems. By virtue of creating his own run environment, Sabathia has changed the context of the value of runs in relation to wins. All pitchers do this to an extent, and the further away from league average they are, the more they influence their particular environment.

So, if we know that pitchers are changing the relationship between runs and wins in their starts, then we need a dynamic runs to wins estimator that adjusts for their individual run environment. This is where the always awesome Tom Tango comes to the rescue, as usual. In talking with him about this, he suggested that we use the formula (((League RA + Pitcher’s RA)/2)+2)*1.5, which would handle the runs to wins conversion in differing environments.

Essentially, that formula averages the pitcher’s FIP scaled to RA with the league average RA, then adds in the constants to create the run to win conversion for a given environment. If we looked at an average pitcher in the AL, for instance, the formula would give us (4.78 + 4.78)/2, which is of course 4.78. (4.78 + 2)*1.5 gives us 10.17, which is the average runs to wins conversion for the AL in 2008.

Now, if we had a pitcher whose park adjusted FIP scaled to RA was 3.00, then the environment would be 3.89 RA, and the runs to wins conversion would be 8.84. As you can see, an excellent pitcher significantly lowers the number of runs it takes to equal a win. Doing the basic “divide runs by 10” thing shortchanges good pitchers and overvalues bad pitchers.
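For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conversion described above. The function and variable names are mine, not anything from the site's actual code, and the only inputs are the numbers quoted in this post.

```python
def runs_per_win(league_ra: float, pitcher_ra: float) -> float:
    """Runs-to-wins conversion for a single pitcher's run environment.

    Follows the formula quoted in the text:
    (((League RA + Pitcher's RA) / 2) + 2) * 1.5
    pitcher_ra is assumed to be the pitcher's park adjusted FIP scaled to RA.
    """
    environment = (league_ra + pitcher_ra) / 2
    return (environment + 2) * 1.5


AL_2008_RA = 4.78  # league average runs per game quoted in the text

average = runs_per_win(AL_2008_RA, AL_2008_RA)  # league average pitcher
ace = runs_per_win(AL_2008_RA, 3.00)            # the 3.00 pitcher from the example

print(f"average pitcher: {average:.2f} runs per win")
print(f"3.00 RA pitcher: {ace:.3f} runs per win")
```

To put that gap in perspective with a made-up number: 20 runs saved would be worth roughly 1.97 wins at the league average conversion, but roughly 2.26 wins in the 3.00 pitcher's environment.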

So, we’ve built this dynamic runs to wins conversion tool into the win values you see on the page. I know, this is a level of detail that most of you won’t care about and can understandably be tough to wrap your head around, but as we said, we want these win values to be transparent, and this calculation is required if you’re trying to re-engineer the values on the site.


Pitcher Win Values Explained: Part Four

As we talked about last night, pitcher replacement level is set at a .380 win% for starters and .470 win% for relievers. However, because of the differences between the AL and the NL, as well as varying offensive levels over the years, there isn’t a fixed mark that we can point to as replacement level FIP that works for each year and each role.

However, since we’ve got the .380/.470 marks, we can derive those numbers with just a little bit of work. Let’s walk through the process, first using a 2008 American League starting pitcher as our example.

The league average runs per game in the AL last year was 4.78. The FIPs that are displayed on the pitcher’s player card here at FanGraphs are scaled to ERA, but for the win values, we modified the formula slightly to scale it to match league RA. However, there’s a shortcut if you want to take a pitcher’s traditional FIP and have it match up with the league RA – that’s dividing his FIP by .92.

For instance, a 4.40 FIP divided by .92 will give you a 4.78 FIP. That .92 is the ERA-RA bridge, and allows us to conclude that 4.40 would be a league average FIP in the American League last year. So, a pitcher with a 4.40 FIP in a neutral park would be a league average pitcher. Or, put back into win% terms, a .500 pitcher.
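The shortcut is simple enough to sketch in a couple of lines of Python. The function names here are just for illustration, and .92 is the ERA-RA bridge quoted above.

```python
ERA_TO_RA = 0.92  # the ERA-RA bridge quoted in the text


def fip_to_ra_scale(fip: float) -> float:
    """Convert a traditional (ERA-scaled) FIP to the RA scale."""
    return fip / ERA_TO_RA


def ra_scale_to_fip(ra_scaled: float) -> float:
    """Convert an RA-scaled number back to a traditional FIP."""
    return ra_scaled * ERA_TO_RA


print(f"{fip_to_ra_scale(4.40):.2f}")  # a 4.40 FIP on the RA scale
print(f"{ra_scale_to_fip(4.78):.2f}")  # the 2008 AL league RA as a traditional FIP
```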

Now, because we’ve set replacement level at .470 for relievers and .380 for starters, we know that a replacement level FIP for an AL reliever will be a lot closer to 4.40 than it will be for a starter. How much closer? Running the numbers through the formula gives us a 4.68 FIP (traditional, not scaled to RA) for an AL reliever and 5.63 for an AL starter. So, if you’re looking at a pitcher’s FIP here on his FanGraphs page, and that pitcher happens to be in the American League, those are the numbers you’d want to compare him to in order to see how far away from replacement level he is.

For the NL in 2008, the numbers are 4.45 for a reliever and 5.37 for a starter – the lack of a DH drives down the league’s offensive level, and so the performance of a replacement level pitcher will appear better in the NL than in the AL.

Remember, these are park neutral numbers, so if you’re looking at a player who pitched in a park that significantly affects offense, you’ll have to adjust his FIP to account for the park effects. If the NL starter that we were looking at pitched in a park that suppressed offense by 5%, then a replacement level for that park would be 5.10, not 5.37. Thus, you’d want to use the lower replacement level for his home innings, and the league average replacement level for his road innings. Assuming equal distribution, that would make the replacement level FIP 5.23 for that NL starter pitching in a park with a park factor of .95.
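Here is a rough sketch of that home/road blend in Python. The function name and the `home_share` parameter are my own labels, and the even 50/50 split is the same "equal distribution" assumption made above. Because the code doesn't round the home number to 5.10 before averaging, it can land a hundredth off the figure in the text.

```python
def park_adjusted_replacement_fip(
    league_replacement_fip: float,
    park_factor: float,
    home_share: float = 0.5,
) -> float:
    """Blend home and road replacement level FIPs.

    home_share is the fraction of innings thrown at home (assumed 0.5 here).
    """
    home = league_replacement_fip * park_factor  # replacement level in his park
    road = league_replacement_fip                # league average on the road
    return home_share * home + (1 - home_share) * road


nl_starter = 5.37  # 2008 NL starter replacement level FIP from the text
home_level = nl_starter * 0.95
blended = park_adjusted_replacement_fip(nl_starter, 0.95)

print(f"home replacement level: {home_level:.2f}")
print(f"blended replacement level: {blended:.3f}")
```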

As you can see, the run environment that the pitcher exists in has a substantial effect on the replacement level value. But the impact of run environments doesn’t stop there, and is further complicated by the fact that starting pitchers have a significant impact on their own run environment. The expected offensive level is a lot lower in a game where Johan Santana is pitching than where Cha Seung Baek is pitching. In order to calculate the runs to wins conversion for each pitcher, we have to take into account that a pitcher impacts his own run environment. We’ll talk more about this later this afternoon.


Pitcher Win Values Explained: Part Three

This afternoon, we talked about why we chose FIP as the metric to base our pitcher win values on. Now, we turn our attention to the other key aspect involved with understanding how many wins a pitcher was worth – the value of a replacement level pitcher as the baseline.

As we did with position players, we’re defining replacement level production as the expected performance you could get from players who can be acquired for virtually no cost. This pool of players would include free agents who sign minor league contracts or for the league minimum, Rule 5 draft picks, guys claimed on waivers, and minor league veterans who can’t shake the “Quad-A” label.

Some walking examples of that group from this off-season would include R.A. Dickey, Clay Hensley, Jason Johnson, Gary Majewski, and Tomo Ohka. Not the most impressive group of guys ever, but that’s why they’ve signed for nothing. They represent a portion of the free talent community, and that’s the group that we want to define as zero value pitchers.

Here’s a good discussion about the historical quality of replacement level pitchers. As Tom notes, it is extremely important to distinguish between roles.

While both involve hurling a ball towards home plate, starting and relieving are still remarkably different. Relievers are, in general, failed starting pitchers who are given an easier task that their skillset will allow them to handle. They are selectively managed to face hitters whom they have the best chance of getting out, and they get to throw at maximum effort on nearly every pitch, giving them greater velocity over their shorter appearances.

Nearly every starting pitcher in baseball could be a useful relief pitcher. Very few relief pitchers could be useful starting pitchers. The distribution of pitching talent is skewed very heavily towards the rotation, and because of this and the extra skills required to pitch 5+ innings per start, we use different replacement levels for starting and relieving in order to capture the additional value added by starting pitchers above and beyond simple run prevention.

What are those replacement levels? Perhaps it’s easiest to understand them in relation to a single game. If we assume that a team has a league average offense, a league average defense, and a league average bullpen, and that they are playing a league average opponent, we would expect them to win any single game started by a replacement level starter 38% of the time.

At the same time, we’d expect a team with a league average offense, a league average defense, and a league average starting pitcher, facing a league average opponent, to win 47% of any games in which their replacement level bullpen was used.

So, we call a .380 win% the replacement level line for a starting pitcher, and a .470 win% the replacement level line for a relief pitcher. However, league average is different for the AL and NL, thanks to the DH, and of course offensive levels vary from year to year. So, how do we take these replacement level winning percentages and compare them to the RA-scaled FIPs we talked about earlier?

We’ll work through those calculations tomorrow.


Pitcher Win Values Explained: Part Two

As we announced yesterday, win values for pitchers are now available on the site. As before, we’re going to go through the process of explaining the calculations that lead to the values you see here on FanGraphs and lay the foundation for understanding what these win values represent.

To start with, let’s take a look at the main input that goes into the win value calculation – a pitcher’s FIP, or Fielding Independent Pitching, which calculates a pitcher’s responsibility for the runs he allows based on his walks, strikeouts, and home runs allowed. The FIP formula is (HR*13+(BB+HBP-IBB)*3-K*2)/IP, plus a league-specific factor that scales FIP to match league average ERA for a given season and league. For the win value purposes, we modified the league specific factor to scale FIP to RA instead of ERA.
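As a minimal sketch, the formula translates to Python like this. The stat line and the 3.20 constant below are invented purely for illustration. The real league-specific factor changes by season and league and isn't given in this post, so it's left as a parameter.

```python
def fip(hr: int, bb: int, hbp: int, ibb: int, k: int, ip: float,
        league_constant: float) -> float:
    """Fielding Independent Pitching, per the formula in the text.

    ip is assumed to be true decimal innings (e.g. 200.333 for 200 1/3),
    and league_constant is the league-specific scaling factor, supplied
    by the caller because its value is not given here.
    """
    core = (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip
    return core + league_constant


# Hypothetical stat line, not a real pitcher's season.
example = fip(hr=20, bb=50, hbp=5, ibb=3, k=180, ip=200.0, league_constant=3.20)
print(f"{example:.2f}")
```

Swapping in a constant that scales to league RA instead of league ERA is the only change needed to get the version used for the win values.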

Why did we use FIP? I know this is a popular question, and it’s something I wrestled with myself. However, what I couldn’t get away from is that we wanted the context sensitivity for the position player and pitcher win values to be as close as possible. wRAA, the offensive input into Win Values for position players, is context-neutral – a hitter does not get credit for his situational performance, such as hitting well with runners in scoring position. Since we aren’t giving hitters credit for situational performance, we can’t give it to pitchers either, in order to maintain the same situation neutral scale.

This is going to lead to some questions – we’re aware of that. Claiming that Javier Vazquez was a +5.2 win pitcher in 2006, when traditional metrics will tell you that he went 11-12 with a 4.84 ERA, is going to be a tough sell. We know.

However, the tangled web of responsibility for run prevention is not accurately unraveled by simply giving pitchers credit and blame for all earned runs and fielders credit and blame for all unearned runs. As most of you know, there are so many extra variables that go into a pitcher’s ERA that the pitcher himself simply doesn’t have control over. We have to try to extract the pitcher’s responsibility from his team’s run prevention while he’s on the mound. Using ERA or RA simply adds too many non-pitcher factors into the equation to the point that we’re no longer just evaluating the pitcher.

FIP removes defense from the equation by only looking at three factors that a pitcher has demonstrable control over – walks, strikeouts, and home runs allowed. By using FIP, we’re isolating the pitcher’s core abilities and evaluating him based on those skills. Now, we’re not claiming that FIP captures everything a pitcher is responsible for. It is not the perfect context-neutral pitcher run modeler – we know that. But when confronted with a choice of including way too many non-pitcher inputs or leaving out a few minor actual pitcher inputs, the latter was the better choice. You will get more accurate win values for a pitcher using FIP than you will ERA or RA.

Getting back to Vazquez for a second – his 2006 FIP was a full run lower than his ERA. The driving forces behind his struggles were a .321 BABIP and a 65.8% LOB%. Most everyone would agree that we don’t want to penalize him for poor defense played behind him, but how do we untangle the responsibility for the lack of stranded runners? Vazquez was horrible with men on base in ’06, but most of that was BABIP related – a .343 BABIP with men on versus a .284 BABIP with the bases empty. If we’re going to say that he’s not responsible for his high batting average on balls in play, and the batting average on balls in play was responsible for the lack of runners stranded, then how do we remove the former but not the latter? This is what I mean by a tangled web of responsibility in terms of run prevention.

If you wanted to make the argument that the context-sensitive stuff, such as how often a pitcher leaves runners on base, should be included, then you also need to be prepared to fight for WPA/LI as the offensive metric of choice for hitter win values. And honestly, I won’t put up much of an argument – there’s a case to be made for context-sensitive win values as a useful metric, and I’d imagine there will be a day that those are publicly available too. But, there’s a more compelling argument for context neutral win values, which is what we’ve decided to present here. What most of us are interested in knowing is how well a player performed in helping his team win, regardless of the performance of his teammates. To answer that, we have to strip out as much context as we can.

Think of FIP as the pitcher version of wRAA. wRAA doesn’t include non-SB/CS baserunning or situational hitting. FIP doesn’t include batted ball data or situational pitching. Neither is perfect, but both give us the vast majority of the context-neutral picture.

That doesn’t mean that we’re set in our ways and that these win values will never be improved upon. If and when a new metric like tRA is proven to be significantly more effective in valuing pitchers (and I’m hopeful that it will be, given more data exploration on the topic), we won’t be standing here as guardians of the infallibility of FIP. We want to get to the truth, and do so as quickly and as accurately as possible. I will encourage you (especially those of you in the “tRA is awesome/FIP sucks” camp), though, to not let minor differences cause you to miss the fact that FIP and tRA lead to very similar results.

This afternoon, we’ll talk about replacement level for pitchers, how it differs for each league and role, and how we tackled the issue.


Pitcher Win Values Explained: Part One

Since we released the Win Values for hitters here on the site, one of the main questions was when we were going to add them for pitchers. The answer: today. If you go to a pitcher’s player page here on FanGraphs, you’ll see the newly added Value section down at the bottom.

For example, here’s Johan Santana’s Win Values for the past five years:

2004: +7.6 wins
2005: +7.2 wins
2006: +7.1 wins
2007: +3.4 wins
2008: +4.6 wins

During his stretch of dominance with the Twins, he was consistently amazing. He took a step back in his last year in Minnesota though, and while he rebounded somewhat this year, he hasn’t been the same elite guy that he was in his prime during the last two years. Still very good, certainly, as a +4.6 win pitcher is among the best in the game, but not quite the guy he was from 2004 to 2006.

Other fun pitchers to look at: Brad Lidge (+3 wins from a closer in ’08 – quite the addition for Philly), Barry Zito (the Giants should have seen this coming), and Ben Sheets (seriously, someone should give this guy some money).

So, now, for the obvious question – how on earth did we come up with these things?

Starting tomorrow, we’ll do a week long series explaining the calculations behind pitcher win values and the questions that arose during the process. They’re far more complicated than hitter win values, honestly – there are issues of run environments, leverage, and context that had to be accounted for, and in many cases, the decisions of how to handle these things aren’t cut and dried. So, over the next few days, we’ll dig into those issues and talk about how we arrived at the values we did, and what the positives and negatives of those decisions are.


The Young Dilemma

The big news of the day in baseball is that the Texas Rangers have officially asked Michael Young to move to third base to make room for hot prospect Elvis Andrus, and that he was offended enough by that to either request or demand a trade. After all, he was just awarded his first Gold Glove as a shortstop, and I’m certain that he feels he’s a good defensive player, despite a mountain of evidence to the contrary.

The UZR that we publish here on FanGraphs from the BIS data source has him as an atrocious fielder in 2004 and 2005, but merely just not good from 2006 to 2008. John Dewan’s +/- system, based on the same BIS data, agrees with that assessment. The UZR data that MGL has published based on the Stats Inc data is also in agreement. While there was a rough transition for Young, he’s gotten quite a bit better at shortstop over the last few years, to the point where he’s now just not good instead of disastrous. After working that hard, it’s not surprising that he feels offended at being asked to move off the position.

It’s for the best, though.

A current projection for Young’s defense in 2009 at shortstop would be around -7 or -8 runs, and getting worse by the year. He is 32 years old, and defense peaks at about age 24 or 25 – he’s a pretty long way from his defensive prime. It’s not hard to see him getting back to that -10 to -15 range as an SS pretty soon, especially if he gets bit by an injury or two. And really, there are very few circumstances where it’s worth it to an organization to keep a -10 to -15 defender at a premium position, especially when they have an opening on the roster at an easier spot to defend.

The Rangers don’t really have a long term third baseman hanging around, but Andrus looks like a legitimate major league shortstop in the making. Andrus is almost certainly the better defender of the two, and if the goal is to have Young stick in Texas while surrounding him with the best talent possible, having him slide to third base to get Andrus’ glove into the line-up makes a lot of sense.

However, that brings us to the second part of the news story – Young apparently has little interest in moving to third base and would prefer to be traded. Does having Young at third make more sense for the Rangers than having no Young at all?

Young is clearly in decline, as we talked about a few months ago. His win values the last three years are 3.6 wins in 2006, 2.6 wins in 2007, and 1.7 wins in 2008. CHONE thinks he’s going to bounce back and be a +2.5 win player in 2009, so he’s not without value, but he’s not the all-star that he’s made out to be anymore.

However, as a third baseman, his defensive deficiencies could be hidden – there are just fewer opportunities for third basemen to use their gloves, so Young’s issues going to his left wouldn’t be as magnified as they are at shortstop. He’s going to project as a league average-ish hitter either way, and the difference in position adjustments would be mostly covered by his expected improvement in relative defensive performance. He’s a +2 to +2.5 win player at either SS or 3B, though there’s probably some pretty significant decline coming in future years.

Can you trade Young given his contract? He’s due something like $60 to $65 million over the next five years (part of the $80 million extension he signed has already been paid), which values him as a +2.5 to +3.0 win player going forward. Given that he’s a +2.0 to +2.5 win player right now and headed for his age 32-36 seasons, the contract costs more than Young is worth, especially in this market. If the Rangers can actually find a team that values him as a shortstop and will take his entire contract off their hands, it’s worth pursuing. $12 million buys a lot in free agency right now, and it seems extremely unlikely that Young would be able to demand a 5 year, $60 to $65 million contract if he were available on the free market.

In reality, I don’t expect a real suitor to step forward for Young. The teams that need shortstops already low-balled Rafael Furcal this winter, and Orlando Cabrera is currently struggling to drum up interest. Having Young accept a move to third base is likely the best scenario for all involved.


A Royal Dump

Man, I feel bad for Royals fans. Today, they announced a two year contract for Willie Bloomquist, worth a total of $3 million. That gives us this unholy quartet of transactions:

Kyle Farnsworth – 2 years, $9 million, 2006 to 2008: 0.1 wins above replacement
Mike Jacobs – 1 year, $3.5 million, 2006 to 2008: 0.7 wins above replacement
Horacio Ramirez – 1 year, $1.9 million, 2006 to 2008: 0.2 wins above replacement
Willie Bloomquist – 2 years, $3 million, 2006 to 2008: 0.3 wins above replacement

Those Win Value totals are not per year, but three year totals. Over the last three seasons, that foursome has been worth 1.3 wins combined. That’s 12 player-seasons to accumulate just over one marginal win. If you were trying to scrape up examples of major league players who represent replacement level performance, you couldn’t do much better than this group.

That those four will earn a collective $11.4 million in 2009 is pretty staggering. $11.4 million for four guys who, if everything goes right, will add something like one win to the Royals roster next year. $11.4 million for one win. I guess it’s better than the $12 million they spent to get 0.2 wins from Jose Guillen last winter, but that’s in the same way that getting stabbed is better than getting shot.
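If you want to check the totals, here’s a quick Python tally of the four deals listed above. It assumes each contract pays out evenly per year, which is my simplification. The real year-by-year salaries could be structured differently.

```python
# (player, contract years, total value in $ millions, 2006-2008 wins above replacement)
signings = [
    ("Kyle Farnsworth", 2, 9.0, 0.1),
    ("Mike Jacobs", 1, 3.5, 0.7),
    ("Horacio Ramirez", 1, 1.9, 0.2),
    ("Willie Bloomquist", 2, 3.0, 0.3),
]

SEASONS = 3  # 2006, 2007, and 2008

total_war = sum(war for _, _, _, war in signings)
player_seasons = len(signings) * SEASONS

# Assumption: every contract is split evenly across its years.
salary_2009 = sum(total / years for _, years, total, _ in signings)

print(f"{total_war:.1f} wins over {player_seasons} player-seasons")
print(f"${salary_2009:.1f} million owed in 2009")
```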

Dayton Moore came from Atlanta with a strong record of player development. He’s going to have to be a Hall of Fame caliber GM at cultivating talent from within his own organization to overcome what can only be described as a devastatingly poor ability to evaluate major league players.

Really, in a market that could only be described as the best buyer’s market we’ve seen in a long time, Dayton Moore has still managed to squander money on unproductive players who do nothing to help his team win. That’s amazing.


The Projections

As David announced this morning, the CHONE projections for 2009 are now available on the site. Sean Smith does great work with these, and they’ve been proven to be among the most accurate forecasting systems out there. Unlike the very bare bones Marcels (which are still quite good as a baseline), the CHONE projections include park effects and minor league data, which improve their accuracy a bit, especially for players where we have minimal major league data.

With that announcement, we now have three projections on each player’s page here on the site – Marcel, CHONE, and the Bill James projections from Baseball Info Solutions. One of the things that I’ve seen people do quite a bit is to take an average of the different projection systems available – especially Marcel and the James projections, since some feel Marcel is too pessimistic and James is too optimistic. However, the key is to understand the context of what each system is projecting. I figured today’s announcement would be a good starting point for a look at what I mean.

As an example, let’s take a look at Wladimir Balentien. He’s 24, has had success in the minors, but failed miserably in his first few hundred major league plate appearances last year. Here are the three forecasts for him for 2009 from the projections we have on the site:

Marcel: .239/.298/.402
CHONE: .231/.302/.409
James: .239/.312/.444

All three systems see a guy who isn’t going to hit for a high average due to his contact problems, but they vary a bit on how often he’ll draw a walk and how much power he’ll show. If you just look at the raw projections, you’d say that Marcel and CHONE aren’t big fans, but that the Bill James projections think he’s got some value. After all, it’s a 40 point OPS gap between James and CHONE.

However, take a look at this:

Marcel: 301 AB, -8.3 wRAA, -13.78 wRAA per 500 AB
CHONE: 472 AB, -6.3 wRAA, -6.67 wRAA per 500 AB
James: 426 AB, -6.6 wRAA, -7.74 wRAA per 500 AB

wRAA, of course, is the linear weights runs above (or in this case, below) a league average hitter. CHONE is actually more optimistic about Balentien than James’ projections. The entirety of the 40 point OPS difference is the projected level of offense in baseball next year. It has nothing to do with Balentien, and everything to do with what the various systems forecast as league average offense in 2009. Marcel is the one who really hates Balentien, but that’s to be expected, considering that the only data going in is his 2008 major league performance, while the other two factor in his minor league success.
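The per 500 AB scaling is just a rate conversion, sketched here in Python with names of my own choosing. Since the inputs are the rounded wRAA totals listed above, the results can differ from the quoted figures by a hundredth or so.

```python
# (projected AB, projected wRAA) for Balentien from each system, as listed above
projections = {
    "Marcel": (301, -8.3),
    "CHONE": (472, -6.3),
    "James": (426, -6.6),
}


def per_500_ab(wraa: float, ab: int) -> float:
    """Scale a wRAA total to a common 500 AB playing-time baseline."""
    return wraa / ab * 500


scaled = {name: per_500_ab(wraa, ab) for name, (ab, wraa) in projections.items()}

# Sort from most optimistic to least optimistic.
for name, value in sorted(scaled.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.2f} wRAA per 500 AB")
```

Putting everyone on the same playing time is what lets you compare the systems directly, since the raw wRAA totals are driven as much by projected at-bats as by projected quality.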

You can’t look at just the BA/OBP/SLG numbers without adjusting for the projected context of that particular system. For whatever reason (my guess is that they’re not handling minor league translations very well and are overenthusiastic about young players, but that would take some more research to confirm), the Bill James projections always come out very high in forecast offense. The CHONE and Marcel projections usually match the upcoming year’s offensive level a bit better.

So, when looking at the various projection systems and the rate stats of the players, keep the environments that are being projected in mind. The value of a player isn’t in his rate stats, but in the value he provides above the baseline.


The Closer Markup

With Trevor Hoffman signing a one year contract today (fairly rare for a free agent closer), we’re presented with an opportunity to measure just how much teams are willing to pay for the proven closer label.

For instance, let’s compare Hoffman with a very similar non-closer who also signed a one year deal earlier this winter.

BB/9, 2005 to 2008:

Hoffman: 1.87, 1.86, 2.35, 1.79
Not Hoffman: 1.97, 2.00, 2.10, 1.66

From a practical standpoint, there’s no difference in their walk rates.

K/9, 2005 to 2008:

Hoffman: 8.43, 7.14, 6.91, 9.13
Not Hoffman: 5.92, 8.33, 7.97, 7.51

Slight edge to Hoffman here, but not a huge one. We’re talking about half a strikeout per nine innings, or about five to six per year. Very marginal difference.

HR/9, 2005 to 2008:

Hoffman: .47, .86, .31, 1.59
Not Hoffman: .49, .94, .89, 1.66

Again, a slight advantage to Hoffman, but not a big one, especially when park effects are included. And this is what we’d expect, given that they’re almost equal in flyball rates – Hoffman is at 45.5% FB% since 2002, and Not Hoffman is at 45.2%.
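For a quick sanity check on those gaps, here’s a small Python sketch that takes a simple unweighted average of the four seasons for each rate. That’s my shortcut. Weighting by innings pitched would be more precise, but innings totals aren’t listed here.

```python
from statistics import mean

# Season-by-season rates, 2005 to 2008, copied from the lists above
rates = {
    "BB/9": {"Hoffman": [1.87, 1.86, 2.35, 1.79],
             "Not Hoffman": [1.97, 2.00, 2.10, 1.66]},
    "K/9": {"Hoffman": [8.43, 7.14, 6.91, 9.13],
            "Not Hoffman": [5.92, 8.33, 7.97, 7.51]},
    "HR/9": {"Hoffman": [0.47, 0.86, 0.31, 1.59],
             "Not Hoffman": [0.49, 0.94, 0.89, 1.66]},
}

# Gap = Hoffman's average minus Not Hoffman's average (unweighted by innings).
gaps = {}
for stat, by_pitcher in rates.items():
    gaps[stat] = mean(by_pitcher["Hoffman"]) - mean(by_pitcher["Not Hoffman"])
    print(f"{stat}: Hoffman minus Not Hoffman = {gaps[stat]:+.2f}")
```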

Overall, their skill sets are extremely similar. Both are experienced good command flyball guys who miss bats and are slightly HR prone. Both have long track records of success. Both took one year contracts to move to new clubs this winter.

Trevor Hoffman got $6 million, and could earn up to $7.5 million if he pitches well. Bob Howry, our Not Hoffman for this exercise, got $2.75 million, and could earn up to $3.65 million if he pitches well. Hoffman got, essentially, twice as much money for the same skills because he comes with the proven closer label.

Even in a depressed economic market, the closer markup is still ridiculous.


Hoffman to Milwaukee

According to Rosenthal, the Brewers have signed Trevor Hoffman to a one year, $6 million contract for 2009. Hoffman will go to Milwaukee to replace the retired Salomon Torres as the closer for the Brew Crew, leaving the west coast for an opportunity to be the 9th inning guy on a contender.

How much does Hoffman have left, though? At 41 years old and with an average fastball of 86 MPH, he’s clearly not going to be blowing anyone away at this point. He still relies on an excellent change-up that’s been his bread and butter for his whole career, and while his strikeout rate remained as good as ever last year, he was victimized by a lot of long balls that could spell some trouble for his future.

As with most soft-tossers, Hoffman is one of the most extreme flyball pitchers in baseball. Nearly half of his balls in play go in the air, which is the main reason that he’s been able to run a career .280 BABIP – flyballs get turned into outs a lot more often than groundballs. Combined with his excellent command, this has allowed Hoffman to keep people off the bases and rack up the saves.

However, there’s a big difference between being an extreme flyball pitcher in Petco Park and in Miller Park, which will be his new home. While Petco suppressed home runs by 14%, Miller Park inflates them by 6%. San Diego is heaven for extreme flyball guys like Hoffman, and leaving the friendly confines of that huge outfield could present some problems for him.

His CHONE projection for 2009 has him at a 4.17 ERA in a neutral park. In Miller Park, that’s probably more like 4.30 to 4.40. Given that, it’s tough to see Hoffman being better than a +1 win pitcher this year. I’m betting that this isn’t going to work out as well as Milwaukee would have hoped, and they’ll be right back in the market for a closer again next winter.