Author Archive

Fans Scouting Report, Part 2

Following up David’s announcement, let me give you the lowdown on a project that is near and dear to my heart. You know how saberists and statheads are accused of being all about the numbers, and ignoring the human component? This project is the antithesis of that. This is all about the human component.

The idea behind it goes all the way back to the mid-1980s, when Bill James, in his Baseball Abstracts, asked his fans to rate each player by position, 1-30. He compiled their results, and it became the rankings in at least one of his annuals. I was sick and tired of how the media would tell us fans how good or bad newly traded fielders were when, invariably, what they said did not match up with reality. Really, Kaz Matsui was such a good-fielding shortstop that he could displace the young Jose Reyes? I believed that stuff each time, even though we kept getting confirmation that it wasn’t true.

Put the two things together (James’ crowdsourcing plus distrust of the media’s objectivity), and you get The Scouting Report, By the Fans, For the Fans, which I’ve run for the last eight years. I can’t tell you how incredibly insightful you guys are. Well, individually, you aren’t. Indeed, individually, you are as useless as I am, and any other individual. That’s just the way it is. The power is when you get just a few of you guys together. That coalescing is where the real brains of the operation lie. All I do is provide the brawn to bring you guys together under one voice (and remove any obvious party-crashers).

The end result is that you have a bunch of Giants fans evaluate Giants players, and a bunch of Rangers fans evaluate Rangers players, and if a Giants fan is interested in the fielding talent traits of Elvis Andrus or Vladimir Guerrero, all he has to do is ask 20 or 50 Rangers fans. He does that asking simply by looking at the results of the Fans Scouting Report.

And, the power of crowdsourcing is really making waves. A few years ago, just days before Opening Day, I started collecting your views on the depth chart of your team (expected games played, expected innings). Once again, rather than try to aggregate from 30 media sources on the latest depth for the 30 teams, I instead pooled the power you guys provide. And you continue to impress me with how much, collectively, you know. Fangraphs last year expanded on that idea even more, by including forecasts of performance as well.

I’ve done here-and-there crowdsourcing on contracts. And Fangraphs has really expanded on that idea too this year. I’ve crowdsourced for favorite movies and most outstanding players, among other things. Really, there is so much on which I would never listen to any single person (myself included, because, after all, who am I?), yet the collective view trumps anything and everyone out there. It seems like a paradox.

Anyway, so here we are. I’ve provided to Fangraphs the results for the 2009 and 2010 seasons, and am working on preparing the other six seasons back to 2003. There are seven traits that the fans are asked to evaluate, on the idea of capturing the entire spectrum of fielding talents on display (from crack of bat, to last out). Furthermore, by focusing on the particulars, it removes from the fan the impulse to give an overall evaluation. When I compile it, I weight them a certain way to give that overall evaluation. The end result is a score from 0 to 100, with 50 as the average. One standard deviation is 20, meaning that for each trait, about 16% of the players will exceed a score of 70, and the same share will fall below 30. Furthermore, I convert those scores to runs, so they are directly comparable to MGL’s UZR and Dewan’s plus/minus.
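To see where the 16% figure comes from, here is a minimal sketch in Python. It assumes the scores are roughly normally distributed around 50 with a standard deviation of 20; the function names are made up for illustration and are not part of how the report is actually compiled.

```python
import math

MEAN_SCORE = 50.0  # league-average score on the 0-100 scale
SD_SCORE = 20.0    # one standard deviation on that scale


def z_to_score(z):
    """Map a standardized rating (z-score) onto the 0-100 scale."""
    return MEAN_SCORE + SD_SCORE * z


def share_above(score):
    """Share of players expected above `score`, assuming a normal distribution."""
    z = (score - MEAN_SCORE) / SD_SCORE
    # Upper-tail probability of the standard normal, via the error function.
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    print(f"score at +1 SD: {z_to_score(1.0):.0f}")
    print(f"share above 70: {share_above(70):.1%}")
    print(f"share below 30: {1.0 - share_above(30):.1%}")
```

Under that normality assumption, both tails come out just under 16%.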

The pinnacle of sabermetrics is the convergence of performance analysis and scouting observations. And I think that the voice you guys provide, as a group, is part of that convergence.


Forecasters Challenge 2010 – Mid-Season Results

Remember when Fangraphs readers were asked to put in their player forecasts? Well…

In this competition, each Pro forecaster is put in his own league against 21 average forecasters (the Joes). In order to make the apples-to-apples comparison, each Pro faced off against the same 21 average forecasters, from the same draft spot. So, when Marcel drafted first against Joe1 through Joe21 in the Marcel Draft1 league, Chone also drafted first against Joe1 through Joe21 in the Chone Draft1 league. Each of the 22 Pro forecasters followed the same pattern. In addition to that, Marcel then drafted second against 21 new Joes (Joe22 through Joe42). And so on, and so on, and so on.

In the end, each Pro ended up in 22 leagues, each facing identical competition.

Where did these Joes come from? That was the fun part. Fangraphs asked their readers in the offseason for their forecasts. So, what I did was this: for each team, I took the 21 fans who provided the greatest number of forecasted players. For the Yankees, I’d have Reader1 through Reader21. For the Red Sox, I’d have Reader22 through Reader42, and so on. I then randomly selected one fan for each of the 30 teams and called that collection of readers Joe1. Of the remaining readers, I repeated those steps, until I ended up with 21 Joes, from Joe1 to Joe21. I then put them into the competition as noted in the earlier paragraph.

I then did the same thing for Joe22 to Joe42: randomly selecting one fan from each team. In this way, I am able to mix-and-match 630 (21×30) fans’ picks into 21×22 collections (462 Joes).

This model, I believe, should represent a reasonable group of average Joes that you would find in a Fantasy League.
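The Joe-building procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the team labels, the `demo_readers` helper, and the seed are placeholders, and shuffling each team's 21 readers and dealing them out in order is used as an equivalent of drawing one remaining reader at random for each team.

```python
import random


def demo_readers(n_teams=30, per_team=21):
    """Placeholder data: Reader1..Reader21 for the first team, and so on."""
    readers = {}
    for t in range(n_teams):
        start = t * per_team + 1
        readers[f"Team{t + 1:02d}"] = [f"Reader{i}" for i in range(start, start + per_team)]
    return readers


def build_joes(readers_by_team, n_joes, rng):
    """Build `n_joes` composite forecasters, each holding one reader per team."""
    shuffled = {}
    for team, readers in readers_by_team.items():
        if len(readers) < n_joes:
            raise ValueError(f"{team} has fewer than {n_joes} readers")
        pool = list(readers)
        rng.shuffle(pool)  # random order = random draws without replacement
        shuffled[team] = pool
    # Joe k takes the k-th reader from every team's shuffled pool.
    return [{team: pool[k] for team, pool in shuffled.items()} for k in range(n_joes)]


def build_all_joes(readers_by_team, n_joes=21, n_draft_spots=22, seed=2010):
    """Repeat the draw once per draft spot, re-mixing the same readers each time."""
    rng = random.Random(seed)
    joes = []
    for _ in range(n_draft_spots):
        joes.extend(build_joes(readers_by_team, n_joes, rng))
    return joes


if __name__ == "__main__":
    joes = build_all_joes(demo_readers())
    print(len(joes), "Joes, each made of", len(joes[0]), "readers")
```

Within any one batch of 21 Joes, every one of the 630 readers is used exactly once; across the 22 batches the same readers are simply re-mixed.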

We now come to the results, as of the All-Star Game break:

fan  points  wins  value  fan
108     968    20    227  Chone
115     960    19    224  John Eric Hanson
105     921    18    218  Brad Null
118     967    17    203  Steamer
116     929    15    198  KFFL
299     917    15    192  Consensus
112     904    14    189  FantasyPros911
113     897    14    184  FeinSports.com
126     890    13    184  Bloomberg Sports
129     858    12    163  Wells Oliver
132     898    12    163  Fantistics
120     895    11    155  Razzball
127     859     9    143  Future of Fantasy
131     832     8    129  Fangraphs Community
217     841     6    115  Marcel
109     828     6    109  Chris Gehringer
102     801     6    100  Ask Rotoman
125     797     4     75  BigScoreSports
111     757     1     35  Fantasy Scope
106     751     0     32  CAIRO
130     689     2     29  Baseball Info Solutions

If you want more details, explanations, etc: Full Explanation & Results


Do Fans Know More About Their Own Teams Than Other Teams?

When it comes to forecasting players, do fans know more about their own team than other teams?

Randy Winn showed the biggest gap in terms of forecasted playing time. The Yankee fans had him down for 96 games, while the non-Yankee fans had him down for 110. (In my processing, where I added additional filters, it was 89 and 119.) And if we look at the Yankee fans at my site, they had him for 89 games. So far, he’s played in 26 out of 40 games, a rate of 105 games per season. He’s pretty much in the middle of the two at this point, though a shade closer to the non-Yankee fans.

The next biggest gap was Mike Fontenot of the Cubs, with a Cub forecast of 100 games and non-Cub of 125 games (or 105, 129 using my numbers). He’s played 32 games out of 41, or a rate of 126 games. Non-Cub fans nailed this one. The Cub fans at my site had him at 115 games.

Then there’s Coco Crisp, who has yet to play a game. Let’s forget him.

Jayson Nix of the White Sox: 81 v 98. Actual 13 of 39. Both are bad, but the Sox-fans are closer.

On the flip side are those team fans more enthused about players. Nate Schierholtz was 131 for Giants fans and 116 for non-Giants fans. Actual: 37 of 39. Giants fans nailed it.

At this point, it’s hard to say what’s the best estimate. Are the team fans closer to their teams, or are they too close (eye of the storm) to see well? Or, do we simply follow the time-tested tradition of splitting the difference between the two? We’ll check back at the end of the year.
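For anyone who wants to redo the arithmetic, here is a small Python sketch of the pace calculation used above: games played so far, prorated to a 162-game season, then compared against the two forecasts. The numbers are the ones quoted in this post; the function names are just for illustration.

```python
SEASON_GAMES = 162

# (player, team-fan forecast, other-fan forecast, games played, team games so far)
FORECASTS = [
    ("Randy Winn", 96, 110, 26, 40),
    ("Mike Fontenot", 100, 125, 32, 41),
    ("Jayson Nix", 81, 98, 13, 39),
    ("Nate Schierholtz", 131, 116, 37, 39),
]


def pace(games_played, team_games, season_games=SEASON_GAMES):
    """Prorate games played so far to a full season."""
    return games_played / team_games * season_games


def closer_group(own, other, games_played, team_games):
    """Return which forecast sits closer to the current full-season pace."""
    projected = pace(games_played, team_games)
    if abs(own - projected) < abs(other - projected):
        return "team fans"
    return "other fans"


if __name__ == "__main__":
    for name, own, other, played, team_games in FORECASTS:
        rate = pace(played, team_games)
        winner = closer_group(own, other, played, team_games)
        print(f"{name}: pace {rate:.0f} games, closer: {winner}")
```

Four players is far too small a sample to settle anything, which is why the split so far (two for each side) is only a curiosity.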


Winnable Games

On Monday, May 17, the Boston Red Sox led the New York Yankees 9-7 entering the ninth inning. The historical record tells us that a team has more than a 90% chance of winning that game. The Red Sox lost, turning a winnable game into an actual lost game.

Let’s count a winnable game as any game where the team has at least an 81.21% chance of winning at any point during the game prior to entering the ninth inning. We see this happened 2430 times last year. Not coincidentally, there were 2430 games played in 2009. Basically, if you have more winnable games than actual wins, then this means you lost a few more games than expected. You can call it bad timing, or non-clutch, or whatever term you like.

In 2009, the Mets had 77 winnable games, but ended the season with only 70 wins. They also had 89 lose-able games (chance of winning as low as 18.79% or worse), and ended up with 92 losses. They led the league in not winning as many games as they should have. On the other end of the spectrum were the clutch-filled New York Yankees. The Yanks had 97 winnable games, compared to their actual 103, and 69 lose-able games, but ended up losing only 59. That’s an 8 game improvement.

The 2009 Rays had an interesting season: they had 19 games that were considered both winnable and lose-able. For example, on August 30, 2009:

In the bottom of the 8th inning, up by two, with two outs, they were in a winnable position: 86.4% chance of winning. After Polanco hit a three-run HR two batters later, they had an 83.9% chance of losing. The Rays led the league in most winnable/lose-able games. The Chicago Cubs were on the other end, with only four games that were both winnable and lose-able.
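The definitions above translate directly into code. This is a rough Python sketch, assuming you have a team's win probability at each point of a game before the ninth inning; the function names are hypothetical. It also compares the counts with the actual record, and averaging the two Yankees gaps is only my assumption about how to reach a single "games better than expected" number.

```python
WINNABLE = 0.8121  # at least this chance of winning at some point
LOSEABLE = 0.1879  # chance of winning as low as this, or worse


def classify_game(win_probs):
    """Return (winnable, lose_able) for one game's pre-ninth win probabilities."""
    return max(win_probs) >= WINNABLE, min(win_probs) <= LOSEABLE


def record_gaps(winnable, wins, lose_able, losses):
    """Extra wins over winnable games, and lose-able games that were not lost."""
    return wins - winnable, lose_able - losses


if __name__ == "__main__":
    # Rays, August 30, 2009: 86.4% to win, then 83.9% to lose after the home run.
    print(classify_game([0.864, 1.0 - 0.839]))  # winnable and lose-able

    yankees = record_gaps(winnable=97, wins=103, lose_able=69, losses=59)
    mets = record_gaps(winnable=77, wins=70, lose_able=89, losses=92)
    print("Yankees gaps:", yankees, "average:", sum(yankees) / 2)
    print("Mets gaps:", mets, "average:", sum(mets) / 2)
```

A game counts as both winnable and lose-able whenever its win probability crosses both thresholds, which is exactly what happened in the Rays example.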

Finally, the Arizona Diamondbacks provided their fans with the most thrills. They had 99 lose-able games, but ended up with the win 17 times to lead the league. The Pirates were only able to win 5 of their lose-able games.

As for 2010, the Diamondbacks have reversed course, with 22 winnable and 19 lose-able games, compared to an actual 16-23 record. The Tigers, on the other hand, have 18 winnable and 24 lose-able games, compared to an actual 22-16 record. And the Red Sox have been involved in seven games where they had the chance to both win and lose the game, to lead the league.


Ryan Howard’s Extension

In the four seasons from 2006-2009, Ryan Howard has accumulated 16 wins, according to http://www.baseballprojection.com/war/h/howar001.htm . If we look at all players born since Babe Ruth, for the same age group as Ryan Howard, that puts him in the top 200 players, along with teammates Chase Utley and Jimmy Rollins.

Ryan Howard was already signed for the 2010 and 2011 seasons before he signed his latest extension. Therefore, we are looking to see what happens to a great player in the third through seventh seasons following his established performance level. Focusing on the 178 great players born between 1895 and 1971, the list is headed by Willie Mays at 37 wins, followed by Babe Ruth and Lou Gehrig, each at 36 wins. Here’s how these great players aged:

Age 27 to 30: 21 wins
Age 31 to 32: 9 wins
Age 33 to 37: 11 wins
Age 38 onwards: 2 wins

This does not look good for the Phillies. While a win costs about 4 to 5 million dollars today, each win in the 2012-2016 seasons will cost somewhere around 7 million dollars. Granting a five-year extension, two years out, to a typical star player should be worth around 77 million dollars (11 wins at 7 million dollars each), not 125 million dollars.

Perhaps Howard is a special case. We already gave him a special consideration, seeing that the group average was 21 wins at age 27 to 30, while he was at 16. We’ll give him the following two further considerations: let’s set the minimum win level for the previous four seasons at 20 (which brings us down to 87 players), and a minimum win level of 8 wins for the next two seasons (which implies that we expect Ryan Howard to be healthy and very productive for the next two seasons). That leaves us with 61 players, with the following results:

Age 27 to 30: 25 wins
Age 31 to 32: 12 wins
Age 33 to 37: 16 wins
Age 38 onwards: 4 wins

We are being extremely generous to Ryan Howard with our comparables. We’ve got the four prior seasons totalling 25 wins (compared to the 16 for Howard’s last four seasons) and 12 wins for the two upcoming seasons (of which Howard has played only one month so far). And under those very generous parameters, the expectation is to get 16 wins for the five years that Howard signed for. And if we increase the cost of a win to 8 million dollars per win, that’s 128 million dollars for five years.

Therefore, in order to justify this contract, we have to take the most optimistic position we possibly can on every parameter we have considered in order to justify the extension as a fair deal. And when you are that optimistic, you probably have a big chance of it blowing up in your face.

Given that we’ve shown that, on average, this deal was not good, the question remains: what are the odds the deal will turn out to be good? It’s a bad deal to buy a lottery ticket… unless you are the winner. Looking at a more restrictive pool of players (minimum 16 wins in the previous four years, at least 4 wins in the next two years, paying 7 million dollars per win), we have a total of 161 players in our pool (who averaged 22 wins in the 4 previous seasons), of which 20% generated at least 18 wins in the five years that we are focused on. And generating at least 18 wins is worth at least 125 million dollars. The odds are therefore as follows:

50% chance of deal being worth at least 10 wins (for an average of 18 wins)
25% chance of deal being worth 4 to 9 wins (for an average of 7 wins, or a value of 50 million dollars)
25% chance of deal being worth 3 or fewer wins (losing over 100 million dollars in value)

This means the Phillies have a 50/50 shot of either breaking even or facing an albatross. Ideally, you want a 50/50 shot of having a good deal and 50/50 on a bad deal. And that’s presuming that Ryan Howard was a superstar who averaged 22 wins, not 16.
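All of the dollar figures in this post come from the same multiplication: expected wins times the cost of a win. Here is a minimal Python sketch of that conversion using the numbers quoted above; the scenario labels are mine, and the post rounds the 49 million dollars in the middle bucket to 50.

```python
EXTENSION_MILLIONS = 125  # five-year extension, in millions of dollars


def contract_value(wins, millions_per_win):
    """Value of a stretch of seasons: expected wins times the cost of a win."""
    return wins * millions_per_win


# label -> (wins over the five extension years, millions of dollars per win)
SCENARIOS = {
    "typical star": (11, 7),
    "generous comparables, pricier wins": (16, 8),
    "top half of outcomes (average)": (18, 7),
    "middle quarter of outcomes (average)": (7, 7),
}

if __name__ == "__main__":
    for label, (wins, price) in SCENARIOS.items():
        value = contract_value(wins, price)
        surplus = value - EXTENSION_MILLIONS
        print(f"{label}: {value}M, surplus vs. extension: {surplus:+}M")
```

Only the two most favorable rows clear the 125 million dollar price, and only barely.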

***

There are 101 players born since Babe Ruth, limited by the following:
– 2400 or more PA in the 27-30 age group
– at least 16 wins in the 27-30 age group
– at least 4 wins in the 31-32 age group

The totals for these 101:
– 22 wins at age 27-30
– 10 wins at age 31-32
– 11 wins at age 33-37

Here’s everyone born since Mike Schmidt (29 of those 101 players), ordered by best to worst at age 33-37:

27-30	31-32	33-37	Player Name
32	17	34	Schmidt	Mike
27	19	22	Henderson	Rickey
20	10	21	Palmeiro	Rafael
20	11	21	Sosa	Sammy
35	11	20	Boggs	Wade
16	9	18	Butler	Brett
16	14	18	Thome	Jim
27	6	18	Yount	Robin
17	10	17	Gwynn	Tony

19	15	16	Biggio	Craig
21	14	16	Ripken	Cal
28	13	15	Bagwell	Jeff
24	6	12	Murray	Eddie
28	12	10	Giambi	Jason
21	11	10	Hernandez	Keith
22	8	10	Thomas	Frank
21	10	10	Williams	Bernie
22	10	9	Trammell	Alan
18	6	8	McGriff	Fred
16	8	8	White	Devon

17	6	7	Salmon	Tim
18	10	6	Puckett	Kirby
22	8	3	Belle	Albert
30	7	3	Griffey	Ken
19	8	2	Cirillo	Jeff
16	5	2	McReynolds	Kevin
19	10	1	Murphy	Dale
16	7	0	Vaughn	Mo
18	6	-1	Barfield	Jesse

So, you have a one-third chance of being Jim Thome, a one-third chance of being Frank Thomas, and a one-third chance of being Mo Vaughn.
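To put numbers on those three tiers, here is a short Python sketch that averages the age 33-37 wins for each block of the table above. The data is copied from the table; the tier names (taken from the three players just mentioned) and the function name are only labels for illustration.

```python
# (player, wins at 27-30, wins at 31-32, wins at 33-37), in table order
THOME_TIER = [
    ("Mike Schmidt", 32, 17, 34), ("Rickey Henderson", 27, 19, 22),
    ("Rafael Palmeiro", 20, 10, 21), ("Sammy Sosa", 20, 11, 21),
    ("Wade Boggs", 35, 11, 20), ("Brett Butler", 16, 9, 18),
    ("Jim Thome", 16, 14, 18), ("Robin Yount", 27, 6, 18),
    ("Tony Gwynn", 17, 10, 17),
]
THOMAS_TIER = [
    ("Craig Biggio", 19, 15, 16), ("Cal Ripken", 21, 14, 16),
    ("Jeff Bagwell", 28, 13, 15), ("Eddie Murray", 24, 6, 12),
    ("Jason Giambi", 28, 12, 10), ("Keith Hernandez", 21, 11, 10),
    ("Frank Thomas", 22, 8, 10), ("Bernie Williams", 21, 10, 10),
    ("Alan Trammell", 22, 10, 9), ("Fred McGriff", 18, 6, 8),
    ("Devon White", 16, 8, 8),
]
VAUGHN_TIER = [
    ("Tim Salmon", 17, 6, 7), ("Kirby Puckett", 18, 10, 6),
    ("Albert Belle", 22, 8, 3), ("Ken Griffey", 30, 7, 3),
    ("Jeff Cirillo", 19, 8, 2), ("Kevin McReynolds", 16, 5, 2),
    ("Dale Murphy", 19, 10, 1), ("Mo Vaughn", 16, 7, 0),
    ("Jesse Barfield", 18, 6, -1),
]


def late_wins(tier):
    """Total and average wins at age 33-37 for one tier of players."""
    total = sum(row[3] for row in tier)
    return total, total / len(tier)


if __name__ == "__main__":
    tiers = {"Thome": THOME_TIER, "Thomas": THOMAS_TIER, "Vaughn": VAUGHN_TIER}
    for name, tier in tiers.items():
        total, average = late_wins(tier)
        print(f"{name} tier: {len(tier)} players, {average:.1f} wins at 33-37")
```

The three tiers average roughly 21, 11, and under 3 wins at age 33-37.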

Ryan Howard was paid like he had a 100% chance of being Jim Thome.


“Those are the little things that I don’t think you can see in the box score, ever.”

That’s what Mike Lowell said. Actually, we can put it in. This is what the win expectancy matrix was set up for.

Let’s look at this chart. It’s the bottom of the 12th (look at the bottom of the 9th), the batter just flied out to make the 2nd out, and Scutaro would 99% of the time remain on 1B. That sets the win expectancy at .562. But Scutaro actually went to 2B, for a win expectancy of .610. So, we credit Scutaro with +.048 wins.

There you go, Mr. Lowell, it is now officially in the boxscore. If it’s tangible, we can measure it. And Scutaro going to 2B is tangible and measurable.

All we need to do is get the scorers to make a better notation in the boxscore so we can separate these plays better. If it was a deep fly ball on which any runner would have advanced, we give 100% of the credit to the batter. If it was a fly ball that essentially makes the Scutaro play like a SB attempt, we give 100% of the credit to the runner. The tough ones are the in-between plays where the split is not so easily done.

By the way, had Scutaro been thrown out, the win expectancy goes down to .500, or a drop of .062 wins. So, it’s a very heads-up play by Scutaro, as he only needed to be safe 56% of the time to break even. So, the issue is not that Scutaro went for it, but rather, why don’t more runners go for it on those plays? As Scutaro said: do or die.
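The 56% comes from weighing what the runner stands to gain against what he stands to lose. A small Python sketch, using the three win expectancy values quoted above (the function name is just for illustration):

```python
def breakeven_rate(we_stay, we_safe, we_out):
    """Success rate at which trying for the extra base neither gains nor loses.

    Solves p * gain == (1 - p) * loss for p.
    """
    gain = we_safe - we_stay  # wins added by taking the base
    loss = we_stay - we_out   # wins given up when thrown out
    return loss / (gain + loss)


if __name__ == "__main__":
    rate = breakeven_rate(we_stay=0.562, we_safe=0.610, we_out=0.500)
    print(f"gain if safe: {0.610 - 0.562:+.3f} wins")
    print(f"loss if out:  {0.500 - 0.562:+.3f} wins")
    print(f"break-even success rate: {rate:.0%}")
```

Any runner who expects to be safe more often than that rate is adding wins, on average, by going.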

***

Also, someone else pointed out to me that they intentionally walked the next batter, raising the win expectancy from .610 to .613. Why did it go up, if the lead runner wins the game in either case? And having another runner on base now makes outs possible at three bases. I didn’t look into it, other than to guess that it’s because two walks now win the game.


J.A. Happ, FIP and Situational Walks

“That’s pretty much how I pitch, to try to keep my FIP as low as possible,” Greinke said.

“You’d like everybody to cut down on walks,” [J.A. Happ’s pitching coach Rich] Dubee said, “but you have to look at the walks . . . Are they just some real bad deliveries or are they walks where you are pitching around a guy in a situation? Some walks are justifiable. Some aren’t. He’s got intelligence. He’s got awareness of lineups, hitters, situations. Those all play into the game.”

***
FIP is fielding independent pitching, a metric I devised following Voros McCracken’s sabermetric-shattering DIPS (defense-independent pitching stats) theory. If you split a pitcher’s performance based on those that involve his fielders and those that don’t, then FIP is only concerned with that one component of pitching which does not involve his fielders. This would be analogous to SLG not considering walks, or OBP counting a walk and HR equally. Each of these metrics is only concerned with one perspective.

The focus of FIP is on weighting a pitcher’s walks, strikeouts, hit batters, and home runs, and those weights are constant across all game situations. This is of course wrong. It is wrong because a strikeout has much more value with a runner on 3B and fewer than two outs than if it occurred with the bases empty. A walk is less costly with first base open and two outs than if you had a runner on first base. As I said, all these metrics are focused on one perspective.
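For reference, here is the commonly published form of FIP as a short Python sketch. The fixed weights (13 for a home run, 3 for a walk or hit batter, minus 2 for a strikeout) are the standard ones; the league constant varies by season, so the 3.10 default is an assumption, and the stat line in the example is made up.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding independent pitching with the same weights in every situation.

    `constant` puts the result on an ERA-like scale; it changes from season
    to season, and 3.10 here is only a placeholder.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


if __name__ == "__main__":
    # Hypothetical season: 20 HR, 50 BB, 5 HBP, 150 K in 200 innings.
    print(f"FIP: {fip(hr=20, bb=50, hbp=5, k=150, ip=200):.2f}")
```

Note that nothing in the function knows the base-out state or the score, which is exactly the limitation described above.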

There is another metric I devised called Situational Wins (i.e., sum of each individual WPA/LI), and it gives a separate equation for each game situation. In short, the very thing that Happ’s pitching coach is (correctly) bringing up as a shortcoming of FIP is being handled with Situational Wins.

The average walk costs a pitcher about .030 wins. That is, a pitcher gives up a walk, and his team’s chance of winning goes down by 30 points. If they had a .560 chance of winning before the walk, it goes down to .530. This is true on average. But Happ’s pitching coach is saying that Happ is not the average, and that his walks are actually issued more often when it least matters. Is this true of Happ?

David was kind enough to send me the Situational Wins for J.A. Happ’s unintentional walks. Of the 54 walks he issued, their average win value was… .030 wins! That is, his walks were NOT situational. His strikeouts tell a similar story, as the win value of Happ’s strikeouts is similar to that of the average pitcher.

Where Happ did excel was in not allowing many singles, doubles, and triples. In addition to that, he minimized this aspect of his game in dangerous situations. This double-whammy is where he was very successful. It explains how he was able to strand a league-high 85% of his runners on base, and no one was anywhere close to that. However, it’s more accurate to say that the Phillies with Happ on the mound were very successful.

The historical precedent is that it would be very difficult for a pitcher to repeat his situational pitching with regard to batting average on balls in play (BABIP). Not only did Happ have a low BABIP overall (.270, which was among the league leaders), but in high-leverage situations, it was an unfathomably low .141.

In the face of two metrics that show Happ performed fantastically well, but in an expectedly non-persistent manner, it is his performance with walks, strikeouts and home runs that is persistent. And his performance in that regard, his FIP, was league-average.


Poll Results: Best-to-Worst 3+ Year Free Agent Contracts for 2010

The first number (the ordinal ranking) is where The Book Blog readers ranked the deal. The second number is the percentage of Fangraphs readers who think the deal gave the team the best value, and the third is the percentage who think it was the worst deal:

Based on 250 Book Blog readers and 3000 Fangraphs readers.

1. 58 – 03 – Figgins

3. 11 – 05 – Polanco
2. 08 – 03 – Byrd

5. 07 – 10 – Lackey
4. 04 – 09 – Wolf

6. 06 – 17 – Holliday
7. 04 – 19 – Bay
8. 01 – 34 – Lyon


The Anti-“Let the Pitcher Hit” Thread

In this thread, your ONLY comments allowed are those that do NOT favor the pitcher being part of the hitting lineup. I will delete any post otherwise.

There is really only one reason that you don’t want the pitcher to bat: the gap in talent level is too great. While a team may play a no-hit, great-field player, a team would never play a no-pitch, great-hit pitcher. A pitcher truly is a different class of player. So, I’m looking for alternatives.

The first two are a hybrid approach:
1. Let the home manager make the call. Just as in spring training the managers decide whether to play with a DH, and just as in the World Series they alternate the DH, this rule allows the home manager to decide when to play the DH. The marketplace will decide how much DH is too much DH. Story potential for every game.

2. Make the DH a pinch hitter for the pitcher, but the pitcher can stay in the game (the one-and-done DH). Manager has to decide how much he wants to deplete his bench. He can go to 4 pinch hitters in a game while his pitcher pitches a complete game, or he can let the pitcher bat in low-leverage situations. The marketplace will decide how much DH is too much DH, with an in-game cost. Story potential if the bench gets depleted.

These are pure: no pitcher-as-batter:
3. Rotate the guy as DH onto the field, maybe alternating with the 1B and/or LF. The one-dimensional player like Adam Dunn is always exposed.

4. Limit the number of games a player can play as DH. Similar to the rotation rule in #3, but rather than rotate in-game, he rotates between games. Say, at most 50% of games played as DH. Same “expose” rule as above.

5. Do away with the DH, and go with 8 hitters in a lineup. Wreaks havoc with the magic nine of baseball, but so be it.

6. Same as number 5, and reduce the innings down to 8: 8 batters, 8 innings. And, one less crappy reliever to see. Game time shortened by twenty minutes.

7 to n: You tell me

Related thread: “Pro pitcher-as-batter”.


The Pro-“Let the Pitcher Hit” Thread

In this thread, your ONLY comments allowed are those that favor the pitcher being part of the hitting lineup. I will delete any post otherwise. You can put in whatever argument you want to support you. For example:

1. Tradition

2. All nine players should take the field and bat in the lineup: pure 2-dimensional.

3. The gap in talent level means that you have to decide when to bring in a pinch hitter or when to bunt. Extra layer of strategy consideration.

4 to n. You tell me

Related thread: “Anti pitcher-as-batter”.