Archive for November, 2009

The Projection Process Behind Being a Fan

Let’s talk a bit more about what makes the Fan Projections different from those that Rally, Dan Szymborski, and Tangotiger have produced and will continue to produce.

The most obvious difference is the amount of math involved. Frankly, we’re not requiring any. There’s no need to copy, paste, and weight each season like these are your personal Marcels. Now, if that method is the most comfortable in your estimation, then sure, do as you please. Most will probably do a little eyeballing and nudge the numbers up or down based on personal knowledge, anecdotal evidence, or just pure gut feelings.

For instance, yesterday I began filling out my Rays projections and up popped B.J. Upton’s name. Everyone – well, those who read that other site I’m on most of the time – knows about my fandom of Upton. I find him to be a fantastic talent with immense upside. The questions are A) how healthy is his shoulder and B) how much of the potential is left? He’s already a solid player, but since his 2007 season the talk about him becoming a 30/30 producer has gone unfulfilled.

This all came to a head when I reached the home runs drop-down. Upton hit 24 homers as a 22-year-old and has hit 20 in total since. The numbers said … well, I didn’t know what they said. I really, really wanted to go 20+. He has the quick wrists, as everyone saw in the 2008 post-season. But then again, potential isn’t static and pitchers figured out how to pitch to his weaknesses last year.

To truncate this process, which is probably interesting to no one, I chose 15-19 after a good three or four minutes of internal debate. All of these factors came into play. I added the scouting observations, subtracted the health concerns, and did math without really doing math. Only afterwards did I run the 5-4-3 weighting to find 13-14 homers as the average. Not a huge difference all told, but I think most people are going to have small conflicts like this throughout the process.
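
For readers who do want to check their gut against the arithmetic, here is a minimal sketch of the 5-4-3 weighting mentioned above. The function name is made up for illustration, and the split of Upton’s 20 homers since 2007 into 11 (2009) and 9 (2008) is an assumption, since only the combined total appears above.

```python
def weighted_projection(values, weights=(5, 4, 3)):
    """Weighted average of per-season totals, most recent season first."""
    if len(values) != len(weights):
        raise ValueError("need one weight per season")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Home runs, most recent season first: 2009, 2008, 2007.
# The 11/9 split of the 20 homers since 2007 is assumed for illustration.
upton_hr = [11, 9, 24]
print(round(weighted_projection(upton_hr), 1))
```

Under those assumed inputs the weighted average lands between 13 and 14, in line with the figure quoted above.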

Ultimately, the resolutions will guide the final projections. Remember, we’re not asking you to submit the ZiPS, CHONE, or Marcels projections for these players. If we wanted that, we would consult with them. We want everyone to vote as they wish and if that means being a little optimistic or pessimistic about certain players, then so be it. Just remember, your ballot won’t be ditched if you disagree with CHONE.


To Offer Or Not?

Tomorrow represents the first real day of the hot stove league, as teams are required to make arbitration decisions on their respective free agents. Once teams understand who will and who will not require draft pick compensation to sign, the process will accelerate, as teams will be able to more accurately assess the cost of signing a particular player.

For some players, the decision to offer arbitration or not is an easy one. The Angels will certainly be offering it to John Lackey and Chone Figgins, while the White Sox will not be offering it to Jermaine Dye. It doesn’t take much in the way of analysis to reach those conclusions.

However, there is a large pool of players for whom the decision is not so cut-and-dried. Should the Dodgers offer Orlando Hudson arbitration, even though they don’t really want him back, in order to secure the compensation he would bring as a Type A free agent? How about Mike Cameron, who has already been replaced in Milwaukee and been told that he is not in their plans, but has the type of skillset not likely to be correctly valued in arbitration? Injury-prone pitchers such as Rich Harden and Erik Bedard also present dilemmas to the Cubs and Mariners, respectively.

For many of these players, teams will decide that the risk of an arbitration offer being accepted is too high, cutting ties with a player they either don’t want or don’t believe they can afford. However, in many of these cases, I believe that teams may be incorrectly valuing the actual cost of the offer.

The marginal cost of the arbitration offer is not the full value of the player’s potential 2010 salary. It is not even the dollars beyond that which a team would be happy to pay the player. It is only the dollars beyond what any one team in baseball would be happy to pay for that player that are actually being risked.

Let’s use Adrian Beltre and the Mariners as an example. Based on some back-of-the-envelope calculations, I’m presuming that the Mariners have approximately $25 million to spend this winter as they shop to fill various needs. That is one of the main reasons why Beltre probably won’t be back in Seattle next year, as he would eat up a significant chunk of that budget, limiting the team’s options when pursuing other positions of need. In reality, Beltre would probably make at least $10 million if he accepted arbitration, and likely closer to the $13 million he earned in 2009.

However, the Mariners aren’t risking $10 to $13 million by offering Beltre arbitration. His market value is significantly north of $0, and on a one year deal, it’s probably somewhere between $8 and $12 million, I’d imagine. So, in reality, the Mariners would be risking something like $4 million, as that would be the potential difference between the arbitration award and his free market value. Remember, a team is free to trade a player who accepts arbitration, so it wouldn’t be particularly hard for the Mariners to then ship Beltre to, say, Philadelphia along with some cash to cover the difference between what Philly wants to pay him and what he may get in arbitration.

So, if the risk is ~$4 million, what is the actual cost of assuming that risk? That requires a probability calculation of how likely the player is to accept the offer. In many of these borderline cases, I’d assume the actual probability is probably around 50 percent, plus or minus 10 percentage points or so. That’s why they are borderline cases – it isn’t easy to figure out how the player would react to an arbitration offer.

With a probability of around 50 percent, that cuts the total risk in half. In the Beltre example, that would lower the cost of assuming that particular risk to $2 million, once the potential that he wouldn’t accept is factored in.

Is a supplemental pick in the #35-#45 range worth $2 million to the Mariners? Most of the research done on the subject would say yes, and that given these numbers, Seattle is better off taking the risk of Beltre accepting their offer to receive the potential reward of the compensation.

This is the calculation that teams should be doing – figuring the cost of a potential arbitration award over the market value of the player, adjusting for probability that he accepts the offer, and comparing the cost of that risk to the benefit of the compensation pick.
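
As a rough sketch of that calculation, the comparison can be written out in a few lines. The function names are hypothetical; the $12 million award and $8 million market value are just one pair consistent with the ranges above, and the $3 million pick value is a placeholder, not a researched figure.

```python
def expected_arbitration_cost(award, market_value, p_accept):
    """Expected dollars at risk from offering arbitration, in millions.

    Only the part of the award above what the market would pay is at
    risk, and only if the player accepts the offer.
    """
    overpay = max(0.0, award - market_value)
    return p_accept * overpay


def should_offer(award, market_value, p_accept, pick_value):
    """True when the compensation pick is worth more than the expected cost."""
    return pick_value > expected_arbitration_cost(award, market_value, p_accept)


# Beltre example: a ~$4M gap and a ~50 percent chance that he accepts.
# The pick value of $3M is a hypothetical placeholder.
cost = expected_arbitration_cost(award=12.0, market_value=8.0, p_accept=0.5)
print(cost)  # 2.0
print(should_offer(12.0, 8.0, 0.5, pick_value=3.0))
```

The useful part is less the answer than the inputs: a team that plugs in the full salary instead of the gap over market value will get a very different, and overstated, cost.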

If a team is assessing their actual risk as the full potential salary that they would have to pay out if a player accepts, they’ll be overstating their own liability and miscalculating the costs and benefits of an arbitration offer. I would suspect we will see multiple teams do just this tomorrow, leaving value on the table by being overly risk averse.


The Humility of Statistical Projection

It’s “projection week” here at FanGraphs, which is a nice coincidence, since I was going to post about projections, anyway. While I dabble with my own projections (which probably will never see the light of day), no one wants to hear about that. Instead, I’ve just assembled some (very) non-technical reminders that might be helpful when looking at projections.

I’ve often heard the complaint that projections are “arrogant,” “put too much faith in the numbers,” or the classic “they rely on what a player has already done, but they don’t tell you what a player will do.” I want to emphasize that projection systems are not based on esoteric “tricks,” but rather are based on the fact that we don’t know very much about the player from the numbers.

Projection is not divination. I’ve sometimes heard that projection systems aren’t worth looking at because “after all, they projected an .800 OPS for player x and he ended up with an .850 OPS.” That’s a straw man, but it gets at the general point: projections are not prophetic divinations of the future, but attempts to measure the “true talent” of players at any given point in time. The “general formula” for player performance is: true talent + luck + environment. (I’ll table discussion of parks and aging for now.)

The problem is that we don’t know, at least from the raw stats, what exactly is “luck” and what represents a player’s “true talent.” Moreover, “luck” doesn’t just mean things like BABIP rates. Even a player getting 700 PA in a season will have varying levels of performance around his true talent, what we call “hot streaks” or “cold streaks.” (Cf. Willie Bloomquist, April 2009.) To single these streaks out raises the question: how do we distinguish the “streaks” from the “true talent” parts of the seasons from which the projections draw? Projection systems use different methods; here I’ll mention basic factors that are used by most good projection systems. This may be old hat, but they are worth discussing because of how often they are passed over.

Regression to the mean. This is a very important concept, so important that I’m leery of screwing up the explanation. The best introductory piece I’ve read is one by Dave Studeman. In short: given a lack of any other information about a player, our “best guess” is that he’s an average member of (some particular) population. The more data we have on the player, the more we can separate him from the “average” population. This is one place where sample size issues come into play. [Note that there is a great deal of debate about how to regress, e.g., what the “population” should be. For examples, search at The Book Blog or Baseball Think Factory.]
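
One simple way to picture regression to the mean is to pad a player’s own sample with some amount of league-average performance. The sketch below does that; the function name, the .330 population mean, and the 200 PA of “padding” are all assumptions chosen for illustration, not values from any particular system.

```python
def regress_to_mean(observed_rate, sample_size, population_mean, regression_amount):
    """Blend an observed rate with the population mean.

    `regression_amount` is the number of league-average trials added to
    the player's own sample; the bigger his sample, the less they matter.
    """
    total = sample_size + regression_amount
    return (observed_rate * sample_size + population_mean * regression_amount) / total


# A hypothetical .400 wOBA hitter, seen over 100 PA and then over 600 PA.
small = regress_to_mean(0.400, 100, population_mean=0.330, regression_amount=200)
large = regress_to_mean(0.400, 600, population_mean=0.330, regression_amount=200)
print(round(small, 3), round(large, 3))
```

The same .400 line is pulled most of the way back toward average in the small sample and much less in the large one, which is the “more data, more separation” idea in numbers.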

Weighted average. Say a projection involves the last three years of performance. Do you simply take the three year average? Well, no, true talent can change from year to year. More recent years are thus weighted more heavily (5-4-3 for hitters and 5-3-2 for pitchers are common weights). Alex Gordon had a .321 major-league wOBA in 2009, and a .344 in 2008. Do we automatically assume that .321 is closer to his true talent? No, because the .321 was in only 189 PA, while the .344 was in 571 PA.
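
To see how recency and playing time pull in different directions, here is a small sketch using the Gordon numbers above. Multiplying each season’s recency weight by its plate appearances is one common way to combine the two; the function name is made up, and only two seasons are included because those are the only ones given here.

```python
def weighted_rate(seasons, recency_weights):
    """Average a rate stat across seasons, weighting by recency and by PA.

    `seasons` is a list of (rate, plate_appearances), most recent first.
    """
    numerator = 0.0
    denominator = 0.0
    for (rate, pa), weight in zip(seasons, recency_weights):
        numerator += weight * pa * rate
        denominator += weight * pa
    return numerator / denominator


# Alex Gordon's wOBA and PA from the text: 2009 first, then 2008.
gordon = [(0.321, 189), (0.344, 571)]
print(round(weighted_rate(gordon, recency_weights=(5, 4)), 3))
```

Even with 2009 weighted more heavily per plate appearance, the blended figure sits closer to .344 than to .321, because 2008 simply has three times the playing time behind it.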

This isn’t all there is to projection, but you’d be surprised how much work those basic concepts do. Tom Tango’s Marcel works entirely from a weighted average, regression, and a very basic age adjustment, and it hangs in with the “big boys” pretty well. No projection system will ever be perfect, of course. Part of that is the influence of “luck” and the limited samples we have from all players. Part of it is also that some players don’t have that much information available on them. Players develop differently.

The point is that we simply don’t know ahead of time which players will be exceptions. Projection systems generally do better when looking at how they project groups of players, rather than focusing on individual successes or failures, as in the case of Matt Wieters (ahem). The point I’ve been trying to make in a roundabout way is that regression, weighted averages, generic aging curves, etc. might miss out on certain players, but are based on studies that show how most players would do. They are humble confessions of ignorance on an individual level, but are still the best overall bet. Expecting anything more leads to folly.

One might express the difference as that between making a conservative, diversified investment and “just knowing” that Enron stock will continue to rise. Tough choice.

More later this week on “breakouts,” “outliers,” and other traps.


Pitch Data and Projections

As you know by now, FanGraphs is hosting a new projection system this year, the Fan Projections. Dave Cameron filled us in on why projecting players in this manner can be effective, thanks to the wisdom-of-the-crowds idea. My piece of wisdom to impart is the value of pitch data to inform your projection. This data can tell you if a pitcher has gained or lost speed on his fastball, picked up a new pitch, or is throwing a certain pitch more or less often. Thus it is another tool in diagnosing whether a big shift in pitcher performance was luck-based or a shift in true talent. You can find that data in the pitchf/x section of each pitcher page or in the Pitch Type section (these pitch classifications are from BIS, which can be slightly different than the pitchf/x ones).

A fitting example is Brian Bannister, a pitchf/x devotee himself. Although we don’t ask you to project it here, one value you could project is his 2010 GB%, which will heavily influence his HR/9 and ERA. Here are Bannister’s GB%s over his career (in green).

[Chart: Brian Bannister’s GB% by season]

Using a Marcel-like 5/4/3 weighting of the past three years to project his 2010 GB%, we would get around 43%. But we know that Bannister was a much different pitcher in 2009 than he was in 2008 and 2007. In 2007 and 2008 he threw almost 60% fastballs, but in 2009 he threw the fastball just 17% of the time and a previously unused cutter 50% of the time. This change in pitch type supports the change in GB%, since cutters tend to result in more ground balls than fastballs, and makes us more confident this increase in ground balls is for real going forward.

We still want to regress somewhat to prior performance and league average, but the change in pitch usage means we should value 2009 more heavily, and project him around 47% or so. From there we can project his HR/9 and ERA based on the assumption that he is above average at getting grounders.


Wisdom Of The Crowd

Today, David announced the newest addition to the site, and one we’re all pretty excited about – Fan Projections. We’ve hosted the forecasts of most of the various top projection systems over the last few years, and you’ve probably become accustomed to hearing various writers quote CHONE, ZiPS, or Marcel. With FanGraphs now offering the ability to aggregate projections from various sources, we’ll have a new set of projections to offer this winter – those of the crowd.

If you’ve read James Surowiecki’s “The Wisdom of Crowds”, you’ll no doubt recognize the theory behind the endeavor. As Surowiecki suggests, there is evidence that certain groups of lay people can give better estimates than any single expert, due to the unique experiences we all have in life. By blending our understandings together, we can eliminate some of our individual biases and enhance the shared wisdom of the population.

Tom Tango has done some research on this as it pertains to projections in baseball, and Surowiecki’s theory holds up pretty well. In a four-part series that matched up six projection systems against 165 fans, the aggregate projections of the fans were essentially the equal of the complex statistical models. Individual fans by themselves didn’t fare so well, but when all fans were combined into a single projection, they held their own.

This is, essentially, Surowiecki’s argument in a nutshell. We all have our limitations of understanding, but the whole can be greater than the sum of its parts.
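
A toy simulation makes the same point. The sketch below is not a reconstruction of Tango’s study; it simply assumes each fan’s guess is the true value plus independent, unbiased noise, which is the key assumption behind the whole idea. The .800 “true” OPS and the 50-point spread are made-up inputs, and the function name is hypothetical.

```python
import random
from statistics import mean


def crowd_vs_individuals(true_value, n_fans, noise_sd, seed=0):
    """Simulate noisy fan guesses; return (crowd error, average individual error)."""
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_fans)]
    crowd_error = abs(mean(guesses) - true_value)
    avg_individual_error = mean(abs(g - true_value) for g in guesses)
    return crowd_error, avg_individual_error


# 165 fans, as in the series mentioned above; the other inputs are invented.
crowd, individual = crowd_vs_individuals(true_value=0.800, n_fans=165, noise_sd=0.050)
print(f"crowd miss: {crowd:.3f}, typical fan miss: {individual:.3f}")
```

The averaged guess can never miss by more than the average individual miss, and when the errors really are independent it usually misses by far less. If everyone shares the same bias, though, averaging will not remove it.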

So, in that spirit, we offer you the opportunity to make your voice heard. Use the Fan Projections to add your personal wisdom to the crowd, based on your insights and experiences. Despite the fact that this site is statistically inclined, we have a fairly broad base of readers, offering a wide variety of opinions and views – the kind of crowd where the wisdom of many can really shine through. Try projecting some players every day, putting real thought into what you expect from each player in 2010.

If we have the diverse, intelligent crowd that I think reads this site, don’t be too surprised if the Fan Projections end up hanging with the big boys. Let’s put the wisdom of the FanGraphs crowd to use and see what happens.


Derek Lowe and Red Flags

Prior to the 2009 season, the Atlanta Braves signed Derek Lowe to a four year, 60 million dollar contract. One year and 15 million dollars later, the Braves seem to be having second thoughts, as Lowe has been repeatedly mentioned in trade rumors.

It’s not that Lowe wasn’t a productive pitcher last year. His 4.06 FIP makes him an above-average SP, and in 195 IP that makes him worth 2.7 wins above replacement. His ERA of 4.67 probably is part of the reason the Braves are willing to shop Lowe, but he’s not as bad as that would suggest. His tRA of 4.61 suggests an ERA (or tERA) of 4.24 – not quite as good as his FIP but still above average, and 195 IP of above-average pitching is valuable. Unfortunately for the Braves, it’s not worth 15 million dollars.

Still, Lowe is only one year removed from a 3.26 FIP/3.27 tERA, the third of three straight sub-4.00 FIP/tERA seasons. He has shown the ability to be an ace-quality pitcher, and even if he doesn’t return to his five-win form from 2008, it’s possible that he could post some four win seasons and be worth his contract.

There are three red flags that come up for Lowe when projecting his future. The first of these is his age. The next three seasons, for which he’ll be paid 45 million dollars, will be his age 37, 38, and 39 seasons. Pitcher attrition happens often at age 30, and the risk becomes even higher at this advanced age. Lowe did see a dip in velocity last year, but he did make 34 starts, and didn’t spend any time on the DL, which would likely be the biggest warning sign that age was catching up with him.

The other two red flags have to do with patterns we see in Lowe’s results this year. First, Lowe’s ground ball rate fell 4% from last year’s 60.3%, already a career low. With statistics like BABIP, we can’t make conclusions based on one year’s worth of data because these statistics don’t stabilize over the course of a season. However, ground ball rate stabilizes after only 200 plate appearances, and this substantial drop of 4% means we’ll see more fly balls and line drives out of Lowe, meaning more home runs and more hits.

The third red flag is probably the most alarming, and that’s the fact that Lowe’s K/BB fell from 3.26 in his fantastic 2008 to a poor 1.76 in 2009. His strikeout rate fell by a point and his walk rate rose by a point. Although it is possible that both of these rates return to form in 2010, given Lowe’s age and the fact that both statistics also tend to stabilize over the course of the season, it’s probably more likely that we see Lowe’s K/BB remain closer to 2.00 than 3.00.

Thanks to his ability to induce ground balls, which even after the drop is still above average, Lowe can still be a productive pitcher, but there are three very good reasons for the Braves to try and get something in return for Lowe’s unfavorable contract, especially when combined with their abundance of starting pitchers. If a team can get the Braves to eat some of Lowe’s salary, they could be getting an asset, but thanks to the red flags mentioned above, it’s unlikely that Lowe will be a 15 million dollar pitcher over the course of his contract.


Andruw Jones & DeWayne Wise Find Jobs

White Sox sign OF Andruw Jones for $500K

Jones’ 2009 season was hardly as well publicized as his 2008 voyage through the depths of offensive hell, but he managed a decent bounce back in 331 plate appearances. 17 homers were hit, a .323 OBP was had, and Jones nearly had a higher slugging percentage (.459) than his 2008 OPS (.505). As for 2010, the deal is a coup for Ken Williams. Jones flashed his highest ISO since 2006 and while playing in Arlington artificially enhanced those numbers, it’s still encouraging to see his HR/FB and ability to drive the ball seemingly regress towards normal. Despite that iffy contribution in 2008 Jones is fully capable of producing league average offense — and perhaps better if he plays mostly against southpaws.

Jones’ athleticism has dipped considerably, but it seems unlikely that his defensive ability has become so poor, so quickly that he wouldn’t be worth at least the half a million in 2010. If he does perform, the deal includes roughly a million in performance incentives, and Jones will be owed whatever amount shy of $3.2M remains. He’s not going broke because of this deal. Frank McCourt, on the other hand, has to be tickled. The worst case here is Jones being released after a slow start to the season. Even then, it’s only $500K; Mark Kotsay is making double that next year.

Phillies sign DeWayne Wise to a minor league deal

The man who Jones somewhat replaces also found a new home before the holidays. Wise is best known for his ridiculous catch to preserve a perfect game last year. Presumably Wise will find some time on the Phillies’ bench throughout the long season.


The 2010 Fan Projections!

For those of you who have been hanging around FanGraphs since at least last season, you’ll know that we carry various projections in the off-season, which are for the most part generated by computer programs.

This off-season, in addition to carrying the various computer generated projections, we’ve teamed up with Tangotiger of insidethebook.com to give you, the fans, a chance to generate your own projection line for each major league player. Hopefully our collective brains will be able to pinpoint things that computer systems don’t.

With that said, let me give you all a quick tour of a projection ballot:

[Screenshot: a Fan Projections ballot]

Before you can project any players, you’ll have to select, toward the top of the screen, the team you follow most closely. If you really don’t follow a team, just pick one. You’ll only have to do this once.

After you’ve selected a team, there are 8 categories for pitchers and 10 categories for position players. Pick the values in the drop-down boxes closest to what you think the player will do in 2010, hit the submit button and you’re done! If you made a mistake, you can always go back and change your selection at any time.

That’s really all there is to it. You can filter players by team, or if you go to the player pages, you can project players individually. If you want to see all the players you’ve projected, you can click on the “My Rankings” button which will show you only what you specifically projected a player to do.

As with all new features, we hope everything is bug free and we test things as much as possible, but if you do notice any issues, please let us know.


The Reds’ Bright Spot

Say what you will about the Cincinnati Reds, but as a team they play air-tight defense. I don’t think much has been made of it, but the Redlegs led the National League in UZR with 52 runs saved last season. Just looking at the team’s current depth chart, they might improve on last year’s mark. This isn’t to say their squeaky clean glovework is going to somehow launch them into contention next year, but hey, when you can find a bright spot for a languishing franchise such as the Reds, it needs to be highlighted.

There are a couple of nifty new sets of defensive projections that have recently come out. Jeff Zimmerman of Beyond the Boxscore has cooked some up, and Steve Sommer has some projections that go the extra mile and regress UZR to a population based on the Fans Scouting Report.

Player             Jeff   Steve
Joey Votto            2       3
Brandon Phillips      7       7
Scott Rolen           7       8
Paul Janish           4       7
Chris Dickerson       1       8
Jay Bruce             1       4

Be sure to click on the links if you would like to read up on their methods.

There’s not a weak link on this chain. I’m assuming Drew Stubbs will be their center fielder after Willy Taveras’ replacement-level season, but we can’t put anything past Dusty Baker. Stubbs gets glowing reviews from scouts, and his Total Zone stats (found at MinorLeagueSplits.com) agree: the numbers have him at +58 in 423 minor league games, including 19 runs in 107 games in AAA last season.

UZR had Stubbs at 8 runs in just 42 games in the big leagues, for what it’s worth. The bottom line is he can go get ’em.

Paul Janish may be Adam Everett-light, and I mean that as a compliment. I think. He hit for a paltry .275 wOBA and is projected to do the same next year, but in just less than 600 innings on the field he was good for 12 runs as measured by UZR. Small sample, yes, but the fans like him and the Reds like him enough to start him next season.

The Scott-Rolen-for-Edwin-Encarnacion-plus-prospects trade is still a head scratcher to me, but Encarnacion was consistently -10 on defense, whereas Scott Rolen is still mostly Scott Rolen.

Things might get ugly yet again this summer in Cincinnati, but it won’t be for a lack of fielding.


Jeff Bailey: The New Josh Phelps?

Remember Josh Phelps? In 2002, he came up as a catcher-turned-first-baseman with the Blue Jays and absolutely smashed the ball (.396 wOBA) for 287 plate appearances. His 2003 was decent, if not mind-blowing. Phelps struggled badly in 2004 and got traded to Cleveland mid-season. He bounced around the majors and minors for a while, and though he never blew anyone away for any length of time again, he was mentioned as recently as last off-season as a minor-league deal or possible stopgap/platoon guy at 1B/DH. With his glorious half-season in 2002 far off in the rear-view mirror, the days of being the semi-darling of a few isolated bloggers are probably over for Phelps; he’ll be 32 next season and only saw action at the minor-league level for the Giants in 2009.

The point with Phelps was not that he was some super-duper mystery pickup that would put a team over the top. The reason he was brought up was because he was F.A.T. (Freely Acquirable Talent) that could make a contribution in the right situation. Unless a player is utterly horrible defensively (and I realize that one could have made such a claim about Phelps), if he’s an above-average hitter, he probably has a place somewhere, especially if he can be had on a minor-league deal.

Which brings us (finally) to the player at hand: Jeff Bailey. He’ll be 31 next season (almost as old as Phelps). He’s accumulated 159 plate appearances in the majors from 2007-2009. He’s primarily played first base, but has seen a bit of duty in the outfield as well, although that’s a stretch. He’s basically a glorified career minor-leaguer who’s seen the majors when the Red Sox had injuries.

I’m hardly an expert on all things Bailey, but his CHONE projection caught my eye: .249/.348/.417, or 5 runs above average per 150 games. ZiPS concurs, projecting him at .258/.345/.415. The UZR data is in too small a sample to be relevant; Rally’s TotalZone projection for Bailey last season and the Fans Scouting Report this season suggest that Bailey is probably average-to-below-average defensively.

Bailey looks like about a 1 WAR player. He’s going to be 31 and has little (if any) upside. He’s not a player every team should be after for even the right price. It depends on the situation. For example, if there’s an NL team with a hole at first base, maybe a platoon of Bailey and, I dunno… Eric Hinske might be a good idea if the team lacks other options. If a team really has no AAA depth, Bailey’s definitely worth a look.
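
For anyone wondering where a figure like “about 1 WAR” can come from, here is one rough way to add it up. Only the +5 batting runs comes from the projections quoted above; the fielding number, the first-base positional adjustment, the replacement-level credit, and the ten-runs-per-win conversion are commonly used approximations that I am assuming here, and the whole thing presumes a full season of playing time. The function name is made up.

```python
RUNS_PER_WIN = 10.0  # common rule of thumb, assumed here


def war_estimate(batting, fielding, positional, replacement, runs_per_win=RUNS_PER_WIN):
    """Add up run components and convert the total to wins."""
    return (batting + fielding + positional + replacement) / runs_per_win


bailey = war_estimate(
    batting=5.0,        # CHONE, runs above average per 150 games (from the text)
    fielding=-2.5,      # assumed stand-in for "average-to-below-average"
    positional=-12.5,   # assumed full-season first-base adjustment
    replacement=20.0,   # assumed full-season replacement-level credit
)
print(bailey)  # 1.0
```

Nudge the fielding or playing-time assumptions a little and the answer moves by a few tenths of a win in either direction, which is why “about 1 WAR” is as precise as this should get.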

Like the latter-day Phelps, Bailey doesn’t have much to offer other than a non-horrible right-handed bat. He should be available on a minor-league deal, or, if a bidding war breaks out, at the major-league minimum. Perhaps this is obvious, but remember the New Josh Phelps when you read about a team trading actual talent for or giving millions to the New Mike Jacobs.