THT Projections: A (Quick) Closer Look
Earlier this week the much-anticipated Hardball Times 2007 Season Preview was released, and with it a brand new projection system. I recently took a look at the Bill James, CHONE, ZiPS, and Marcel projection systems to see how they differed. Let’s throw THT into the mix and see where its major differences lie.
First off, let’s see how THT fares against the other projection systems in OPS and ERA as a whole when compared to the Marcel projection system (the simplest of the five).
System        ERA-R^2   OPS-R^2
ZiPS          .725      .908
Bill James    .714      .875
CHONE         .699      .865
THT           .681      .837
And in English: of the four systems compared against Marcel, THT’s is the least similar. (This looks only at batters with 300+ at-bats and pitchers with 100+ innings.)
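For anyone who wants to run this kind of comparison themselves, here is a minimal sketch. It assumes the R^2 figures are squared Pearson correlations between two systems' projections for the same players, which the post does not spell out. The function name is my own, and the sample uses only the five hitters' Marcel and THT OPS projections listed further down, so its result says nothing about the full-sample numbers in the table.

```python
# Sketch: R^2 between two projection systems, taken here as the squared
# Pearson correlation (an assumption; the post does not define it).

def r_squared(xs, ys):
    """Return the squared Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)

# Marcel and THT OPS projections for the five hitters listed in this post
# (Thomas, Ramirez, Cano, Duncan, Cabrera). These players were picked
# because the systems disagree on them, so expect a low value.
marcel_ops = [.874, .843, .852, .891, .787]
tht_ops = [.982, .714, .766, .753, .715]

print(round(r_squared(marcel_ops, tht_ops), 3))
```

To reproduce the table you would feed in every qualifying player's projection from each system rather than this handful.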
So which batters does THT disagree on the most in terms of OPS?
Name              Bill James   CHONE   Marcel   THT    ZiPS
Frank Thomas      .939         .853    .874     .982   .892
Hanley Ramirez    .801         .791    .843     .714   .777
Robinson Cano     .860         .842    .852     .766   .836
Chris Duncan      .862         .776    .891     .753   .803
Melky Cabrera     .766         .796    .787     .715   .800
Except for Frank Thomas, whom THT projects to have a phenomenal season, THT is the low projection for the other four players. It’s interesting to note that those four are also first- or second-year major league players. There’s generally a lot of disagreement about Chris Duncan and Hanley Ramirez, but for Robinson Cano and Melky Cabrera the THT projection appears to be the sole point of difference. Let’s look at the pitchers:
Name              Bill James   CHONE   Marcel   THT    ZiPS
Tony Armas Jr.    4.85         4.64    4.96     5.81   4.88
Carlos Zambrano   3.40         3.47    3.48     2.77   3.46
Cliff Lee         4.43         4.20    4.48     5.04   4.55
James Shields     4.03         4.29    4.72     5.03   4.70
Brandon Webb      3.53         3.60    3.65     3.07   3.85
Randy Johnson     4.31         3.77    4.33     3.43   3.63
THT clearly hates Tony Armas Jr. more than the other systems do, projecting an ERA about a point higher, while it loves Carlos Zambrano, whose projected ERA is roughly 0.70 lower than in the other systems. I threw in Randy Johnson since he was next on the list. The projections for him look pretty evenly divided between a 4.30-ish ERA and a 3.50-ish ERA.
Anyway, the THT projections are broadly similar to the others, but there are a number of key differences worth a look. There’s also a lot more to projections than ERA and OPS, so I’m sure you’ll find many other unique aspects to THT’s projection system. As with any projection system, we’ll have to wait and see which one turns out to be the most accurate for 2007.
David Appelman is the creator of FanGraphs.
I would drop the threshold for pitchers to 60 IP, or even 50 IP. Otherwise, you are essentially only looking at starters.
Or, you could skip the threshold altogether and simply compute:
sum(PA * abs(Marcel minus x) ) / sum(PA)
where PA is the PA or IP in system X, and “x” is OPS or ERA of system X
This will neatly bypass the issue of selective sampling.
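Here is a rough sketch of that formula, in case it helps. The function name and the tuple layout are my own choices. The OPS values are the Marcel and THT figures from the post, but the PA numbers are hypothetical placeholders, since the post does not list projected playing time.

```python
# Sketch of: sum(PA * abs(Marcel - x)) / sum(PA)
# Each row is (playing_time, marcel_value, system_value), where
# playing_time is system X's projected PA (hitters) or IP (pitchers).

def weighted_abs_diff(rows):
    """Playing-time-weighted mean absolute difference from Marcel."""
    rows = list(rows)  # allow generators, since we iterate twice
    total_playing_time = sum(pt for pt, _, _ in rows)
    weighted_gap = sum(pt * abs(marcel - other) for pt, marcel, other in rows)
    return weighted_gap / total_playing_time

# OPS: Marcel and THT from the post. PA: HYPOTHETICAL placeholders.
hitters = [
    (550, .874, .982),  # Frank Thomas
    (650, .843, .714),  # Hanley Ramirez
    (600, .852, .766),  # Robinson Cano
]

print(round(weighted_abs_diff(hitters), 4))
```

Because every player contributes in proportion to projected playing time, part-timers and relievers stay in the sample without dominating it, which is the point of dropping the cutoff.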
To be fair, you should go one step further:
The OPS of Jeter under Marcel would be his forecasted OPS minus the league OPS that Marcel expects. Repeat for each of the systems. To figure each system's league OPS, use that system's own forecasted OPS and PA.
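A minimal sketch of that adjustment follows. It assumes the league OPS for a system can be approximated as the PA-weighted average of that system's forecasted OPS, which is my reading of the suggestion, not a stated method. The function names are mine, and the PA values in the example are hypothetical placeholders. The OPS values are the Marcel figures from the post.

```python
# Sketch: express each player's OPS relative to the league OPS implied
# by the same projection system.
# forecasts maps player name -> (projected PA, projected OPS).

def league_ops(forecasts):
    """PA-weighted average OPS across one system's forecasts (approximation)."""
    total_pa = sum(pa for pa, _ in forecasts.values())
    return sum(pa * ops for pa, ops in forecasts.values()) / total_pa

def relative_ops(forecasts):
    """Return each player's OPS minus the system's implied league OPS."""
    lg = league_ops(forecasts)
    return {name: ops - lg for name, (pa, ops) in forecasts.items()}

# OPS: Marcel values from the post. PA: HYPOTHETICAL placeholders.
marcel = {
    "Frank Thomas": (550, .874),
    "Hanley Ramirez": (650, .843),
    "Melky Cabrera": (500, .787),
}

for name, diff in relative_ops(marcel).items():
    print(f"{name}: {diff:+.3f}")
```

Run it once per system on that system's full player pool, then compare the relative figures rather than the raw ones, so a system that simply expects more offense league-wide is not counted as disagreeing on every hitter.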