Tool Version 2: Pitching Correlations with Improved Filtering

Now kids, I’m not a user of the Twitter, but I did follow a link to it in one of Eno’s articles last week, whereupon I came across an interesting question posed in a tweet by one @b_g_h: “Are hitters with low BB% also more volatile [because] of BABIP variance?”  There are different ways to address the question statistically, but it seemed to me that with an altered version of my pitching correlation tool, one could provide insight into issues like this, at least regarding pitchers.  A hitting version is waiting in the wings, don’t worry.  So, what I bring to you today is an interactive and downloadable spreadsheet that allows you not only to analyze the relationship between any two pitching statistics, but also to filter your data by any three statistics of your choosing.

I’ll show you an example of what I did with this new version of the tool.  I started by setting the minimum innings pitched per season filter to 50, then broke down those pitchers into evenly-split Low, Mid, and High groups in each of the following essential statistics: BB%, K%, HR/FB, and BABIP.  The boundaries between the groups were as follows:

| Stat | Low-Mid | Mid-High |
| --- | --- | --- |
| BB% | 0.075 | 0.094 |
| K% | 0.161 | 0.204 |
| HR/FB | 0.087 | 0.108 |
| BABIP | 0.281 | 0.301 |
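For anyone who would rather script the tertile split than eyeball it in the spreadsheet, here is a minimal sketch in Python. The BB% values are made up for illustration, and the function names (`tertile_bounds`, `tertile_label`) are hypothetical; the rule that a value equal to a boundary falls in the lower group is an assumption based on the filter settings described in this article.

```python
import statistics


def tertile_bounds(values):
    """Return the (low-mid, mid-high) cut points that split values into thirds."""
    low_mid, mid_high = statistics.quantiles(values, n=3, method="inclusive")
    return low_mid, mid_high


def tertile_label(value, low_mid, mid_high):
    """Label a value Low, Mid, or High; a value on a boundary goes to the lower group."""
    if value <= low_mid:
        return "Low"
    if value <= mid_high:
        return "Mid"
    return "High"


# Made-up BB% values for nine pitcher seasons (not from the real data set).
bb_pct = [0.050, 0.062, 0.071, 0.078, 0.085, 0.090, 0.097, 0.105, 0.120]

low_mid, mid_high = tertile_bounds(bb_pct)
labels = [tertile_label(v, low_mid, mid_high) for v in bb_pct]

for value, label in zip(bb_pct, labels):
    print(f"{value:.3f} -> {label}")
```

With real data you would feed in every qualifying pitcher season for one stat at a time and read off the two boundaries.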

I set both Stat 1 and Stat 2 to “wOBA against,” with Stat 1 set to Year 0 and Stat 2 to Year 1 (Stat 1 needs to stay as Year 0 in this version of the tool, by the way).  Then it was a matter of isolating the three evenly-split groups (tertiles) for each stat, one at a time, with the filters.  So, for the Low BB% group, I kept the Min BB% filter at 0 and set the Max to 0.075, then copied and pasted the values of the correlation results off to the side (setting a quick access shortcut for the paste values command in Excel is pretty handy for this, by the way… or a macro, if you’re fancy).  For the Mid BB% group, I set the Min filter to 0.07501 (to avoid repeats) and the Max to 0.094, and so on.  Here were the results:
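The filter-then-correlate step can be sketched outside of Excel, too. What follows is a rough Python version under a few assumptions: the record layout (`pitcher`, `year`, `ip`, `bb_pct`, `woba`) and the function names are hypothetical, the sample seasons are invented, and the filter is assumed to apply to the Year 0 value of the stat. It is not the spreadsheet's actual formula, just the same idea.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def yty_pairs(seasons, stat, filter_stat, lo, hi, min_ip=50):
    """Pair each Year 0 season with the same pitcher's Year 1 season.

    Both seasons must meet the innings minimum, and the Year 0 value of
    filter_stat must fall within [lo, hi].
    """
    by_key = {(s["pitcher"], s["year"]): s for s in seasons if s["ip"] >= min_ip}
    pairs = []
    for (pitcher, year), year0 in by_key.items():
        year1 = by_key.get((pitcher, year + 1))
        if year1 is not None and lo <= year0[filter_stat] <= hi:
            pairs.append((year0[stat], year1[stat]))
    return pairs


# Invented seasons for illustration only.
seasons = [
    {"pitcher": "A", "year": 2010, "ip": 180, "bb_pct": 0.060, "woba": 0.300},
    {"pitcher": "A", "year": 2011, "ip": 175, "bb_pct": 0.065, "woba": 0.310},
    {"pitcher": "B", "year": 2010, "ip": 120, "bb_pct": 0.070, "woba": 0.330},
    {"pitcher": "B", "year": 2011, "ip": 110, "bb_pct": 0.072, "woba": 0.325},
    {"pitcher": "C", "year": 2010, "ip": 60, "bb_pct": 0.074, "woba": 0.350},
    {"pitcher": "C", "year": 2011, "ip": 40, "bb_pct": 0.080, "woba": 0.360},
    {"pitcher": "D", "year": 2010, "ip": 200, "bb_pct": 0.055, "woba": 0.280},
    {"pitcher": "D", "year": 2011, "ip": 190, "bb_pct": 0.058, "woba": 0.295},
    {"pitcher": "E", "year": 2010, "ip": 150, "bb_pct": 0.110, "woba": 0.340},
    {"pitcher": "E", "year": 2011, "ip": 140, "bb_pct": 0.100, "woba": 0.335},
]

# Low BB% group: Min filter 0, Max filter 0.075.
pairs = yty_pairs(seasons, stat="woba", filter_stat="bb_pct", lo=0.0, hi=0.075)
r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
print(f"pairs: {len(pairs)}, YTY correlation: {r:.3f}")
```

Pitcher C passes the filter in Year 0 but falls short of 50 IP in Year 1, so that season drops out of the sample.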

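YTY Correlation in wOBA against

| Statistic | Tertile | Low Estimate (95% CI) | Base | High Estimate (95% CI) | Sample Size |
| --- | --- | --- | --- | --- | --- |
| BB% | Low | 0.377 | 0.447 | 0.511 | 557 |
| BB% | Mid | 0.283 | 0.372 | 0.454 | 389 |
| BB% | High | 0.299 | 0.389 | 0.472 | 372 |
| K% | Low | 0.044 | 0.143 | 0.239 | 391 |
| K% | Mid | 0.113 | 0.206 | 0.295 | 425 |
| K% | High | 0.244 | 0.324 | 0.400 | 502 |
| HR/FB | Low | 0.365 | 0.437 | 0.504 | 522 |
| HR/FB | Mid | 0.277 | 0.373 | 0.462 | 336 |
| HR/FB | High | 0.244 | 0.328 | 0.407 | 460 |
| BABIP | Low | 0.325 | 0.403 | 0.475 | 484 |
| BABIP | Mid | 0.342 | 0.431 | 0.513 | 352 |
| BABIP | High | 0.337 | 0.413 | 0.485 | 482 |
| Overall (50+ IP) | | 0.360 | 0.406 | 0.450 | 1318 |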

OK, you’ve probably noticed the sample sizes between the groups don’t look so even here.  The reason for that is that some pitchers who make the 50 IP cut in Year 0 don’t make it in Year 1 (the next year).  With some exceptions, good pitchers in one year tend to get a decent amount of playing time in the next, unsurprisingly.

But, back to the point: the table here does seem to support the notion that some types of pitchers are more consistent than others.  One caveat, of course, is that we’re basing this observation off of two seasons per pitcher, and it’s safe to say that not every pitcher’s true “type” will be captured accurately in our samples.

Take a look at K%: we can say with 95% confidence that the year-to-year (YTY) correlation in wOBA against for low-K% pitchers is somewhere between 0.044 and 0.239; meanwhile, for high-strikeout pitchers it is between 0.244 and 0.400, also with 95% confidence.  Since there’s no overlap between those ranges, we can safely say with over 95% confidence that high-strikeout pitchers are indeed more consistent than low-K% pitchers in their wOBA against.  That makes sense, because strikeout rates are fairly consistent year-to-year (try K% year 0 vs. K% year 1 in the tool to see what I mean), and striking out hitters is going to be predictably good for a pitcher’s wOBA against.
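If you want to reproduce the K% ranges yourself, here is a minimal sketch using the Fisher z-transformation, a standard way to put a confidence interval around a correlation. The article doesn't spell out how the tool computes its intervals, so treat the method as an assumption, though it lands within rounding of the K% rows in the wOBA-against table. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def correlation_ci(r, n, confidence=0.95):
    """Confidence interval for a Pearson correlation via the Fisher z-transformation."""
    z = math.atanh(r)                      # transform r to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)            # standard error on the transformed scale
    crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def intervals_overlap(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Base correlations and sample sizes for the Low and High K% groups (wOBA against).
low_k = correlation_ci(0.143, 391)
high_k = correlation_ci(0.324, 502)

print(f"Low K%:  {low_k[0]:.3f} to {low_k[1]:.3f}")
print(f"High K%: {high_k[0]:.3f} to {high_k[1]:.3f}")
print("Overlap:", intervals_overlap(low_k, high_k))
```

The sample size matters as much as the correlation here: the same base value with fewer pairs gives a wider range.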

I should note that K% did rise over the course of the sample, so just be aware of that.  If you download the spreadsheet and add in columns that allow you to filter by percentiles in each season, you can minimize this issue.
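As a sketch of what such a percentile column might compute, here is a small Python version that ranks each pitcher's stat against others from the same season only. The row layout, the function name `season_percentiles`, and the K% values are all made up for illustration, and the mid-rank handling of ties is just one reasonable choice.

```python
from collections import defaultdict


def season_percentiles(rows, stat):
    """Add a within-season percentile (0 to 1) for stat to each row.

    Ties get the midpoint rank, so a value is compared only against
    other seasons from the same year.
    """
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row[stat])

    result = []
    for row in rows:
        peers = by_year[row["year"]]
        below = sum(1 for v in peers if v < row[stat])
        equal = sum(1 for v in peers if v == row[stat])
        pct = (below + 0.5 * equal) / len(peers)
        result.append({**row, stat + "_pctile": pct})
    return result


# Invented K% values: the later season has a higher league-wide strikeout rate.
rows = [
    {"pitcher": "A", "year": 2008, "k_pct": 0.14},
    {"pitcher": "B", "year": 2008, "k_pct": 0.16},
    {"pitcher": "C", "year": 2008, "k_pct": 0.18},
    {"pitcher": "D", "year": 2008, "k_pct": 0.20},
    {"pitcher": "E", "year": 2013, "k_pct": 0.18},
    {"pitcher": "F", "year": 2013, "k_pct": 0.20},
    {"pitcher": "G", "year": 2013, "k_pct": 0.22},
    {"pitcher": "H", "year": 2013, "k_pct": 0.24},
]

ranked = season_percentiles(rows, "k_pct")
for row in ranked:
    print(row["pitcher"], row["year"], row["k_pct"], row["k_pct_pctile"])
```

In this toy data, an 18% strikeout rate ranks above the middle of the pack in the earlier season and at the bottom in the later one, which is the kind of drift a raw K% filter would miss.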

HR/FB groups seem to be leaning towards low HR/FB pitchers being more consistent, but there’s too much overlap between the top and bottom groups to confirm that.

Getting closer to the original question of this article, there doesn’t seem to be much difference in the BABIP groups, year-to-year — at least not when it comes to pitchers.

It’s worth noting that high-K% pitchers tend to have low BABIP and HR/FB rates.  So, it’s possible that there are some hidden effects going on here.  But that’s what multiple regression is for, and what stats like SIERA and my own BERA and SBERA have attempted to address.

Anyway, let’s do the same exercise, but this time with Runs Allowed per 9 IP (RA9) as the basis for the comparison.

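YTY Correlation in RA9

| Statistic | Tertile | Low Estimate (95% CI) | Base | High Estimate (95% CI) | Sample Size |
| --- | --- | --- | --- | --- | --- |
| BB% | Low | 0.344 | 0.415 | 0.482 | 557 |
| BB% | Mid | 0.235 | 0.327 | 0.413 | 389 |
| BB% | High | 0.263 | 0.355 | 0.441 | 372 |
| K% | Low | 0.059 | 0.157 | 0.252 | 391 |
| K% | Mid | 0.147 | 0.239 | 0.327 | 425 |
| K% | High | 0.231 | 0.312 | 0.389 | 502 |
| HR/FB | Low | 0.312 | 0.388 | 0.458 | 522 |
| HR/FB | Mid | 0.295 | 0.390 | 0.477 | 336 |
| HR/FB | High | 0.190 | 0.277 | 0.359 | 460 |
| BABIP | Low | 0.305 | 0.384 | 0.458 | 484 |
| BABIP | Mid | 0.283 | 0.376 | 0.463 | 352 |
| BABIP | High | 0.241 | 0.323 | 0.401 | 482 |
| Overall (50+ IP) | | 0.323 | 0.370 | 0.416 | 1318 |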

Once again, BB% seems not to matter, since its high and low ranges overlap quite a bit.  This time the high and low K% ranges overlap as well, if only slightly (between 0.231 and 0.252) at the 95% confidence level; by that standard, we can’t confirm with 95% confidence that K% makes a difference.
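Checking whether two 95% ranges overlap is a conservative way to compare groups. A more direct option, which is not what the tool does, is to test the difference between the two correlations on the Fisher z scale. Here is a minimal sketch; the function name is hypothetical, and the test assumes the two groups are independent samples, which is only roughly true here since the same pitcher can show up in more than one season pair.

```python
import math
from statistics import NormalDist


def correlation_difference_test(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z2 - z1) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Low vs. High K% groups, YTY correlation in RA9 (base values and sample sizes).
z_stat, p_value = correlation_difference_test(0.157, 391, 0.312, 502)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.3f}")
```

With the RA9 figures this works out to a z of about 2.43 and a two-sided p-value of about 0.015, so the direct test is less cautious than the overlap check. Given the independence caveat, it's best read as a hint that the K% effect on RA9 may be real, not as a firm result.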

I don’t think it’s particularly useful in this exercise, but I know some of you are going to want to know about FIP, so here it is:

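YTY Correlation in FIP

| Statistic | Tertile | Low Estimate (95% CI) | Base | High Estimate (95% CI) | Sample Size |
| --- | --- | --- | --- | --- | --- |
| BB% | Low | 0.486 | 0.547 | 0.603 | 557 |
| BB% | Mid | 0.354 | 0.438 | 0.515 | 389 |
| BB% | High | 0.335 | 0.422 | 0.502 | 372 |
| K% | Low | 0.124 | 0.220 | 0.313 | 391 |
| K% | Mid | 0.188 | 0.278 | 0.363 | 425 |
| K% | High | 0.322 | 0.398 | 0.469 | 502 |
| HR/FB | Low | 0.464 | 0.529 | 0.588 | 522 |
| HR/FB | Mid | 0.425 | 0.508 | 0.583 | 336 |
| HR/FB | High | 0.347 | 0.425 | 0.497 | 460 |
| BABIP | Low | 0.392 | 0.465 | 0.532 | 484 |
| BABIP | Mid | 0.377 | 0.463 | 0.542 | 352 |
| BABIP | High | 0.419 | 0.490 | 0.555 | 482 |
| Overall (50+ IP) | | 0.433 | 0.476 | 0.517 | 1318 |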

Seeing as how BB, K, and HR are the factors of FIP, it’s no surprise that we’re seeing more of an effect from them, I think.

OK, now that we’re through with all that, let’s take a look at the same sort of analysis for batter wOBAs:

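YTY Correlation in wOBA

| Filter | Tertile | Low Estimate (95% CI) | Base | High Estimate (95% CI) | Sample Size |
| --- | --- | --- | --- | --- | --- |
| BB% | Low | 0.274 | 0.366 | 0.452 | 366 |
| BB% | Mid | 0.360 | 0.443 | 0.519 | 397 |
| BB% | High | 0.442 | 0.513 | 0.577 | 459 |
| K% | Low | 0.474 | 0.542 | 0.603 | 465 |
| K% | Mid | 0.489 | 0.559 | 0.623 | 407 |
| K% | High | 0.387 | 0.473 | 0.550 | 350 |
| HR/FB | Low | 0.202 | 0.301 | 0.393 | 350 |
| HR/FB | Mid | 0.189 | 0.287 | 0.380 | 355 |
| HR/FB | High | 0.430 | 0.498 | 0.560 | 517 |
| BABIP | Low | 0.401 | 0.481 | 0.553 | 398 |
| BABIP | Mid | 0.505 | 0.581 | 0.648 | 333 |
| BABIP | High | 0.478 | 0.543 | 0.603 | 491 |
| Overall (50+ IP) | | 0.323 | 0.370 | 0.416 | 1222 |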

Here, it looks as though high-BB% hitters are more consistent, though not quite with 95% certainty; I think that makes a lot of sense, seeing as how BB% is very consistent YTY for batters, along with the fact that a walk is an automatic wOBA booster.

High-K% batters may be less consistent, but it’s hard to say that with much confidence.

High-HR/FB sluggers are very significantly more consistent in their wOBA than low or even mid-HR/FB types.  Besides also tending to walk a lot, these slugger types are definitely less affected by the whims of BABIP, due to those uncatchable balls they hit out into the outfield bleachers.

Unfortunately, the BABIP difference is looking iffy here.  But this could once again be a job for multiple regression.  Batter BABIP has some positive correlation to both speed and power, which tend to be negatively correlated with each other (but a guy with both speed and power, e.g. Mike Trout, is likely to have a high BABIP).  So this could be a good situation in which to apply multiple filters, when that particular tool is released.

I’ll leave you with this bonus interactive tool that illustrates more clearly the range of possible correlations that lie within a given confidence interval, given an observed correlation and a particular sample size:

Unless your sample size is gigantic, you can only say the true correlation is the same as the observed correlation with near-0% confidence.  There’s really no such thing as nailing a particular correlation down with exactly 100% confidence either, although you can come close with an extremely large sample.  On the flip side, see how wide the spread between the minimum and maximum correlation estimates gets when the sample is down to four pairs of observations — if you want to make a statement with 99% confidence with a sample that small, you may have to say the true correlation is “somewhere between 1 and -1.”
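To get a feel for those numbers without the interactive version, here is a rough Python sketch of the same idea. It assumes the Fisher z-transformation as the interval method, which is a standard choice and an assumption about how the tool works. The observed correlation of 0.5 is an arbitrary example value, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def correlation_ci(r, n, confidence):
    """Range of plausible true correlations for an observed r from n pairs."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)            # requires at least four pairs
    crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


observed_r = 0.5  # arbitrary example value

# How the 99% range narrows as the number of paired observations grows.
for n in (4, 30, 300, 3000):
    low, high = correlation_ci(observed_r, n, 0.99)
    print(f"n = {n:>4}: {low:+.3f} to {high:+.3f}")

tiny = correlation_ci(observed_r, 4, 0.99)
large = correlation_ci(observed_r, 3000, 0.99)
```

With only four pairs, the 99% range for an observed 0.5 runs from roughly -0.97 to nearly 1, which is about as uninformative as it gets. At 3,000 pairs it tightens to well under a tenth of a point in total width.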

Well, I hope you enjoy and use these tools well!

Steve is a robot created for the purpose of writing about baseball statistics. One day, he may become self-aware, and...attempt to make money or something?


Great stuff Steve. Tell me you work for a team or BIS or something, because if not, I can’t imagine it would be long before you’d have an opportunity to do so.