Tool Version 2: Pitching Correlations with Improved Filtering
by Steve Staude
January 23, 2014

Now kids, I’m not a user of the Twitter, but I did follow a link to it in one of Eno’s articles last week, whereupon I came across an interesting question posed in a tweet by one @b_g_h: “Are hitters with low BB% also more volatile [because] of BABIP variance?” There are different ways to address the question statistically, but it seemed to me that with an altered version of my pitching correlation tool, one could provide insight into issues like this, at least regarding pitchers. A hitting version is waiting in the wings, don’t worry.

So, what I bring to you today is an interactive and downloadable spreadsheet that allows you not only to analyze the relationship between any two pitching statistics, but also to filter your data by any three statistics of your choosing.

I’ll show you an example of what I did with this new version of the tool. I started by setting the minimum innings pitched per season filter to 50, then broke down those pitchers into evenly-split Low, Mid, and High groups in each of the following essential statistics: BB%, K%, HR/FB, and BABIP. The boundaries between the groups were as follows:

Stat    Low-Mid  Mid-High
BB%     0.075    0.094
K%      0.161    0.204
HR/FB   0.087    0.108
BABIP   0.281    0.301

I set both Stat 1 and Stat 2 to “wOBA against,” with Stat 1 set to Year 0, and Stat 2 to Year 1 (Stat 1 needs to stay as Year 0 in this version of the tool, by the way). Then, it was a matter of isolating the three evenly-split groups for each stat (tertiles) one at a time with the filters; so, for the Low BB% group, I kept the Min BB% filter at 0, and set the Max to 0.075, then copied and pasted values of the correlation results off to the side (setting a quick access shortcut for the paste values command in Excel is pretty handy for this, by the way…or a macro, if you’re fancy).
For the Mid BB% group, I set the Min filter to 0.07501 (to avoid repeats) and the Max to 0.094, and so on. Here were the results:

                              YTY Correlation in wOBA against
Statistic         Tertile  Low Estimate (CI=95%)  Base   High Estimate (CI=95%)  Sample Size
BB%               Low      0.377                  0.447  0.511                   557
                  Mid      0.283                  0.372  0.454                   389
                  High     0.299                  0.389  0.472                   372
K%                Low      0.044                  0.143  0.239                   391
                  Mid      0.113                  0.206  0.295                   425
                  High     0.244                  0.324  0.400                   502
HR/FB             Low      0.365                  0.437  0.504                   522
                  Mid      0.277                  0.373  0.462                   336
                  High     0.244                  0.328  0.407                   460
BABIP             Low      0.325                  0.403  0.475                   484
                  Mid      0.342                  0.431  0.513                   352
                  High     0.337                  0.413  0.485                   482
Overall (50+ IP)           0.360                  0.406  0.450                   1318

OK, you’ve probably noticed the sample sizes between the groups don’t look so even here. The reason is that some pitchers who make the 50 IP cut in Year 0 don’t make it in Year 1 (the next year). With some exceptions, good pitchers in one year tend to get a decent amount of playing time in the next, unsurprisingly.

But, back to the point: the table here does seem to support the notion that some types of pitchers are more consistent than others. One caveat, of course, is that we’re basing this observation on two seasons per pitcher, and it’s safe to say that not every pitcher’s true “type” will be captured accurately in our samples.

Take a look at K%: we can say there’s a 95% chance that pitchers with a low K% have somewhere between a 0.044 and a 0.239 year-to-year (YTY) correlation in their wOBA against; meanwhile, high-strikeout pitchers have a YTY wOBA correlation between 0.244 and 0.400, also with 95% confidence. Since there’s no overlap between those ranges, we can safely say with over 95% confidence that high-strikeout pitchers are indeed more consistent than low-K% pitchers in their wOBA against. That makes sense, because strikeout rates are fairly consistent year-to-year (try K% year 0 vs. K% year 1 in the tool to see what I mean), and striking out hitters is going to be predictably good for a pitcher’s wOBA against.
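The K% intervals quoted above can be reproduced, to within rounding, with the standard Fisher z-transformation. The article doesn't say how the tool computes its intervals, so treat the method as an assumption; `correlation_ci` is a hypothetical helper name, not something from the spreadsheet. A minimal sketch:

```python
import math
from statistics import NormalDist


def correlation_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson correlation.

    Assumption: the Fisher z-transformation, which treats atanh(r) as
    roughly normal with standard error 1 / sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    # Two-sided critical value, about 1.96 for 95% confidence.
    crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Base correlations and sample sizes taken from the K% rows of the table.
low_k = correlation_ci(0.143, 391)
high_k = correlation_ci(0.324, 502)

# The groups' intervals overlap only if the Low group's upper bound
# reaches the High group's lower bound.
intervals_overlap = low_k[1] >= high_k[0]

print(f"Low K%:  {low_k[0]:.3f} to {low_k[1]:.3f}")
print(f"High K%: {high_k[0]:.3f} to {high_k[1]:.3f}")
print("Overlap:", intervals_overlap)
```

Because the base correlations in the table are themselves rounded to three decimals, the bounds this produces can differ from the table in the last digit.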
I should note that K% did rise over the course of the sample, so just be aware of that. If you download the spreadsheet and add in columns that allow you to filter by percentiles in each season, you can minimize this issue.

HR/FB groups seem to be leaning towards low HR/FB pitchers being more consistent, but there’s too much overlap between the top and bottom groups to confirm that.

Getting closer to the original question of this article, there doesn’t seem to be much difference in the BABIP groups, year-to-year, at least not when it comes to pitchers.

It’s worth noting that high-K% pitchers tend to have low BABIP and HR/FB rates. So, it’s possible that there are some hidden effects going on here. But that’s what multiple regression is for, and what stats like SIERA and my own BERA and SBERA have attempted to address.

Anyway, let’s do the same exercise, but this time with Runs Allowed per 9 IP (RA9) as the basis for the comparison.

                              YTY Correlation in RA9
Statistic         Tertile  Low Estimate (CI=95%)  Base   High Estimate (CI=95%)  Sample Size
BB%               Low      0.344                  0.415  0.482                   557
                  Mid      0.235                  0.327  0.413                   389
                  High     0.263                  0.355  0.441                   372
K%                Low      0.059                  0.157  0.252                   391
                  Mid      0.147                  0.239  0.327                   425
                  High     0.231                  0.312  0.389                   502
HR/FB             Low      0.312                  0.388  0.458                   522
                  Mid      0.295                  0.390  0.477                   336
                  High     0.190                  0.277  0.359                   460
BABIP             Low      0.305                  0.384  0.458                   484
                  Mid      0.283                  0.376  0.463                   352
                  High     0.241                  0.323  0.401                   482
Overall (50+ IP)           0.323                  0.370  0.416                   1318

Once again, BB% seems not to matter, but this time the high and low K% intervals overlap a bit at the 95% confidence level (0.059 to 0.252 versus 0.231 to 0.389), so the interval comparison alone can’t confirm with 95% confidence that K% makes a difference.
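Comparing intervals for overlap is a conservative check: two 95% intervals can overlap slightly even when a direct test of the difference would clear the 95% bar. One common way to test the difference directly is to compare the Fisher-transformed correlations. The sketch below is not part of the tool. It assumes the Low and High groups are independent samples, which is only approximately true here since the same pitcher can land in different groups in different years, and `compare_correlations` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two correlations.

    Assumptions: the two groups are independent, and atanh(r) is roughly
    normal with variance 1 / (n - 3) (the Fisher z-transformation).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z2 - z1) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Low K% vs. High K% rows of the RA9 table (base correlation, sample size).
z_stat, p_value = compare_correlations(0.157, 391, 0.312, 502)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.3f}")
```

Under those assumptions the p-value comes out below 0.05, which would suggest the K% gap in RA9 consistency is more solid than the overlapping intervals make it look. Treat that as a rough check, not a replacement for the multiple-regression approach mentioned above.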
I don’t think it’s particularly useful in this exercise, but I know some of you are going to want to know about FIP, so here it is:

                              YTY Correlation in FIP
Statistic         Tertile  Low Estimate (CI=95%)  Base   High Estimate (CI=95%)  Sample Size
BB%               Low      0.486                  0.547  0.603                   557
                  Mid      0.354                  0.438  0.515                   389
                  High     0.335                  0.422  0.502                   372
K%                Low      0.124                  0.220  0.313                   391
                  Mid      0.188                  0.278  0.363                   425
                  High     0.322                  0.398  0.469                   502
HR/FB             Low      0.464                  0.529  0.588                   522
                  Mid      0.425                  0.508  0.583                   336
                  High     0.347                  0.425  0.497                   460
BABIP             Low      0.392                  0.465  0.532                   484
                  Mid      0.377                  0.463  0.542                   352
                  High     0.419                  0.490  0.555                   482
Overall (50+ IP)           0.433                  0.476  0.517                   1318

Seeing as how BB, K, and HR are the inputs to FIP, it’s no surprise that we’re seeing more of an effect from them, I think.

OK, now that we’re through with all that, let’s take a look at the same sort of analysis for batter wOBAs:

                              YTY Correlation in wOBA
Filter            Tertile  Low Estimate (CI=95%)  Base   High Estimate (CI=95%)  Sample Size
BB%               Low      0.274                  0.366  0.452                   366
                  Mid      0.360                  0.443  0.519                   397
                  High     0.442                  0.513  0.577                   459
K%                Low      0.474                  0.542  0.603                   465
                  Mid      0.489                  0.559  0.623                   407
                  High     0.387                  0.473  0.550                   350
HR/FB             Low      0.202                  0.301  0.393                   350
                  Mid      0.189                  0.287  0.380                   355
                  High     0.430                  0.498  0.560                   517
BABIP             Low      0.401                  0.481  0.553                   398
                  Mid      0.505                  0.581  0.648                   333
                  High     0.478                  0.543  0.603                   491
Overall (50+ IP)           0.323                  0.370  0.416                   1222

Here, it looks as though high-BB% hitters are more consistent, though not quite with 95% certainty; I think that makes a lot of sense, seeing as how BB% is very consistent YTY for batters, along with the fact that a walk is an automatic wOBA booster.

High-K% batters may be less consistent, but it’s hard to say that with much confidence.

High-HR/FB sluggers are very significantly more consistent in their wOBA than low or even mid-HR/FB types. Besides also tending to walk a lot, these slugger types are definitely less affected by the whims of BABIP, due to those uncatchable balls they hit out into the outfield bleachers.
Unfortunately, the BABIP difference is looking iffy here. But this could once again be a job for multiple regression. Batter BABIP has some positive correlation to both speed and power, which tend to be negatively correlated with each other (but a guy with both speed and power, e.g. Mike Trout, is likely to have a high BABIP). So this could be a good situation in which to apply multiple filters, when that particular tool is released.

I’ll leave you with this bonus interactive tool that illustrates more clearly the range of possible correlations that lie within a given confidence interval, given an observed correlation and a particular sample size.

Unless your sample size is gigantic, you can only say the true correlation is the same as the observed correlation with near-0% confidence. There’s really no such thing as nailing a particular correlation down with exactly 100% confidence either, although you can come close with an extremely large sample. On the flip side, see how wide the spread between the minimum and maximum correlation estimates gets when the sample is down to four pairs of observations. If you want to make a statement with 99% confidence with a sample that small, you may have to say the true correlation is “somewhere between 1 and -1.”

Well, I hope you enjoy and use these tools well!