New SIERA, Part Five (of Five): What Didn't Work
by Matt Swartz
July 22, 2011

SIERA is now available at FanGraphs, and several important changes have been made to improve its performance, as discussed in parts one, two, three and four. Still, there were some components that I didn't adopt. For the sake of completeness, and for those interested in what I learned about pitching along the way, I include these in the final installment of my five-part series.

1. Park-adjusted batted-ball data

FanGraphs' Dave Appelman developed park factors for batted balls for me, but these weren't used in the new SIERA. I figured that since park-specific scorer bias might make batted-ball data less accurate, removing that noise might help. Park effects come with noise of their own, though, and park-adjusting the batted-ball data didn't improve the predictions. The gain simply wasn't large enough: the standard deviations of the difference between actual and park-adjusted fly-ball and ground-ball rates were only 1.4% and 1.5%, respectively, and the park factors themselves always fell between 95.2 and 103.9 (the average park factor is set to 100).

The pitcher whose SIERA changed most under park-adjusted batted-ball totals was LaTroy Hawkins, who in 2007 went from 3.63 to 3.73. The second-biggest difference came when Aaron Cook went from 4.00 to 4.08 in 2008. Brian Lawrence went from 3.55 to 3.63 in 2002, and Cook made the list again in 2007, when he went from 4.63 to 4.71. Only 16 other pitchers saw changes of more than .05 in either direction. As a result, very few pitchers saw major changes to their SIERAs, and the combined effect was a weaker prediction.

2. Year-specific ground-ball coefficients

I wanted to know whether yearly changes to batted-ball scoring required adjustments, so I allowed the coefficients on ground-ball rate and ground-ball rate squared to vary from year to year.
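The article doesn't show its regression code, so as a rough illustration of why year-specific ground-ball terms run into trouble, here is a minimal sketch on synthetic data. All numbers here (sample sizes, rates, coefficients) are made up for demonstration; the point is that with only a season's worth of pitchers per year, and with GB and GB^2 nearly collinear, the per-year coefficient standard errors come back large relative to the coefficients themselves.

```python
# Hypothetical sketch, NOT the actual SIERA regression: letting the
# ground-ball coefficients vary by year via year-interaction columns.
import numpy as np

rng = np.random.default_rng(0)
years = np.repeat(np.arange(2002, 2011), 80)   # ~80 pitcher-seasons per year
gb = rng.normal(0.44, 0.06, years.size)        # made-up ground-ball rates
era = 4.2 - 2.3 * gb - 4.9 * gb**2 + rng.normal(0, 0.9, years.size)

# Design matrix: intercept, plus separate GB and GB^2 columns for each year.
cols = [np.ones(years.size)]
for y in np.unique(years):
    mask = (years == y).astype(float)
    cols += [mask * gb, mask * gb**2]
X = np.column_stack(cols)

beta, *_ = np.linalg.lstsq(X, era, rcond=None)

# Coefficient standard errors: sqrt of the diagonal of s^2 * (X'X)^-1.
resid = era - X @ beta
s2 = resid @ resid / (X.shape[0] - X.shape[1])
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))

# With ~80 observations per year and GB highly correlated with GB^2, the
# standard errors dwarf the true coefficients (-2.3 and -4.9 above).
print(se[1:3])  # SEs on one year's GB and GB^2 coefficients
```

Splitting nine years of data into nine separate coefficient pairs leaves each pair estimated from a fraction of the sample, which is the "lack of data" problem described next.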
The net effect was that a lack of data dominated the calculations, and no real improvement came from this adjustment. I abandoned the plan when I saw the standard errors on these coefficients.

3. Adding hit-by-pitches and subtracting intentional walks

I tried changing BB to UBB + HBP. Due to the volatility of HBP, though, this didn't improve prediction either, and the resulting changes were very small, so I left it as BB for simplicity's sake. Since my research showed that pitchers with low walk rates tend to allow less-damaging walks, I think of intentional walks as corner solutions on a spectrum of intent. Put more simply: how many of Greg Maddux's unintentional walks had an element of intent?

4. Including pitchers with fewer than 40 innings

If you take all pitchers with 162 IP or more and regress next season's BABIP against strikeout rate, walk rate, ground-ball rate and ground-ball rate squared, you get an R^2 of .0805; regressing it against the previous year's BABIP yields an R^2 of just .0184. Home runs per fly ball can likewise be predicted better using those same peripherals (an R^2 of .0575) than with the previous year's HR/FB rate (an R^2 of .0386). If you lower the innings restriction, though, the R^2 statistics shrink. Some of this is sample size, but some of it is because pitchers who don't throw many innings often don't deserve to be regressed to typical major league levels of BABIP and HR/FB. Since predicting BABIP and HR/FB is one of SIERA's major strengths, it's important not to include pitchers who muddy the analysis. Because of that, I was hesitant to include pitchers with fewer than 40 innings (the cut-off that Eric Seidman and I used when creating the original SIERA). Still, I was curious whether keeping pitchers with fewer than 40 innings this time, but weighting them less, would help.
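To make the weighting idea concrete, here is a minimal sketch of innings-weighted least squares on synthetic data. The one-variable "model" and every number in it are invented for illustration (this is not the SIERA specification); it just shows the mechanics of down-weighting low-inning pitchers, and measures how much a single Rosario-style ERA shifts the fit with and without the weights.

```python
# Hypothetical sketch, not the actual SIERA regression: down-weighting
# low-inning pitchers by IP in a toy one-variable least-squares fit.
import numpy as np

rng = np.random.default_rng(1)
ip = np.append(rng.uniform(40.0, 220.0, 200), 1.0)  # one 1-inning pitcher
so_rate = rng.normal(0.18, 0.04, ip.size)            # made-up strikeout rates
era = 6.5 - 12.0 * so_rate + rng.normal(0, 0.8, ip.size)
era[-1] = 54.0                                       # a Rosario-style outlier

X = np.column_stack([np.ones(ip.size), so_rate])

def wls(X, y, w):
    """Weighted least squares via row-scaling by sqrt(weight)."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

ones = np.ones(ip.size)
# Shift in the fitted intercept caused by the outlier, unweighted vs. weighted:
d_unweighted = wls(X, era, ones)[0] - wls(X[:-1], era[:-1], ones[:-1])[0]
d_ip_weighted = wls(X, era, ip)[0] - wls(X[:-1], era[:-1], ip[:-1])[0]
print(d_unweighted, d_ip_weighted)  # the unweighted shift is far larger
```

In this toy setup, weighting by innings does tame the outlier, but as described next, in the real data the extreme ERAs of tiny-sample pitchers still dominated the fit.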
But because pitchers with few innings can have gigantic ERAs (Sandy Rosario had a 54.00 ERA in one inning last year), the regression was pulled toward matching SIERA to ERAs like Rosario's at any cost to measuring everyone else's skill. The formula made no sense.

5. SIRA, Skill-Interactive Run Average

Finally, I attempted to develop an alternative to Skill-Interactive Earned Run Average (SIERA), which I called Skill-Interactive Run Average, or SIRA. Since the goal of pitching is to prevent runs, rather than simply to prevent earned runs, I figured that scaling SIERA to "run average" might make for a more accurate estimator. As it turned out, there was basically no benefit to doing this. The coefficients were similar between the two equations, and the ordering of pitchers was similar as well. The following table shows the coefficients for SIERA and SIRA. Note the similarities:

Variable                          SIERA coefficient       SIRA coefficient
(SO/PA)                           -15.518                 -17.172
(SO/PA)^2                           9.146                  10.332
(BB/PA)                             8.648                   8.413
(BB/PA)^2                          27.252                  27.800
(netGB/PA)                         -2.298                  -2.555
(netGB/PA)^2                       -4.920                  -5.049
(SO/PA)*(BB/PA)                    -4.036                   8.793
(SO/PA)*(netGB/PA)                  5.155                   6.808
(BB/PA)*(netGB/PA)                  4.546                  -0.910
Constant                            5.534                   6.078
Year coefficients (versus 2010)   -0.020 to +0.289        -0.003 to +0.320
% innings as SP                     0.367                   0.363

Year               2002   2003   2004   2005   2006   2007   2008   2009   2010
SIERA coefficient  -.020  +.093  +.154  +.037  +.289  +.226  +.116  +.103  .000
SIRA coefficient   -.003  +.104  +.172  +.031  +.320  +.234  +.118  +.092  .000

With such similar coefficients, it's not surprising that the SIERA and SIRA pitcher rankings were almost identical. I looked at the top 25 in SIERA and SIRA from 2002 to 2010; the two lists shared at least 24 of 25 pitchers each year, and in some seasons they were the same 25. The closest thing to a major difference was in 2006, when Johan Santana led the league in SIRA over Brandon Webb, with a 3.42 SIRA to Webb's 3.43.
But because ground-ball pitchers are slightly more likely to give up unearned runs, Webb's SIERA was 3.02, compared with Santana's 3.16. Overall, no starting pitcher's ranking changed more than 12 spots out of about 90 when flipping between SIERA and SIRA. In 2004, Brandon Webb was 40th in SIERA but only 52nd in SIRA; Ted Lilly was 28th in SIRA but 38th in SIERA. Otherwise, everyone changed fewer than 10 spots. With so few changes, I opted for familiarity and decided to publish SIERA over SIRA.

These five tweaks were abandoned, but some changes, such as yearly run-environment adjustments and pitching-role ERA differences, stuck and were important. And that, essentially, is what matters. In essence, SIERA was a strong statistic after its original development. But with a new run environment, it needed to be refreshed. Now, with its extensive changes here at FanGraphs, the metric should stand the test of time.