Tool Version 2: Pitching Correlations with Improved Filtering

by Steve Staude

January 23, 2014

Now kids, I’m not a user of the Twitter, but I did follow a link to it in one of Eno’s articles last week, whereupon I came across an interesting question posed in a tweet by one @b_g_h: “Are hitters with low BB% also more volatile [because] of BABIP variance?” There are different ways to address the question statistically, but it seemed to me that with an altered version of my pitching correlation tool, one could provide insight into issues like this, at least regarding pitchers. A hitting version is waiting in the wings, don’t worry. So, what I bring to you today is an interactive and downloadable spreadsheet that allows you not only to analyze the relationship between any two pitching statistics, but also to filter your data by any three statistics of your choosing.

I’ll show you an example of what I did with this new version of the tool. I started by setting the minimum innings pitched per season filter to 50, then broke down those pitchers into evenly-split Low, Mid, and High groups in each of the following essential statistics: BB%, K%, HR/FB, and BABIP. The boundaries between the groups were as follows:

Stat	Low-Mid	Mid-High
BB%	0.075	0.094
K%	0.161	0.204
HR/FB	0.087	0.108
BABIP	0.281	0.301
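
For anyone who would rather compute boundaries like these outside the spreadsheet, here is a minimal Python sketch. The `bb_pct` values are invented for illustration (they are not from the tool's data set), and `statistics.quantiles` is just one reasonable way to get the two cut points; the spreadsheet may define the split a little differently.

```python
from statistics import quantiles

# Hypothetical Year 0 walk rates, one per pitcher-season (illustrative only).
bb_pct = [0.052, 0.061, 0.068, 0.074, 0.079, 0.083,
          0.088, 0.093, 0.099, 0.107, 0.115, 0.124]

# n=3 returns the two cut points that split the data into three equal groups.
low_mid, mid_high = quantiles(bb_pct, n=3, method="inclusive")

def tertile(value, low_mid, mid_high):
    """Label a value Low, Mid, or High; a value on a cut point joins the lower group."""
    if value <= low_mid:
        return "Low"
    if value <= mid_high:
        return "Mid"
    return "High"

labels = [tertile(v, low_mid, mid_high) for v in bb_pct]
counts = {name: labels.count(name) for name in ("Low", "Mid", "High")}
print(low_mid, mid_high, counts)
```

The two cut points play the role of the Low-Mid and Mid-High columns in the table above.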

I set both Stat 1 and Stat 2 to “wOBA against,” with Stat 1 set to Year 0 and Stat 2 to Year 1 (Stat 1 needs to stay as Year 0 in this version of the tool, by the way). Then it was a matter of isolating the three evenly-split groups (tertiles) for each stat, one at a time, with the filters. For the Low BB% group, I kept the Min BB% filter at 0 and set the Max to 0.075, then copied and pasted the values of the correlation results off to the side (setting a quick-access shortcut for the paste-values command in Excel is pretty handy for this, by the way… or a macro, if you’re fancy). For the Mid BB% group, I set the Min filter to 0.07501 (to avoid repeats) and the Max to 0.094, and so on. Here were the results:

YTY Correlation in wOBA against
Statistic	Tertile	Low Estimate (95% CI)	Base	High Estimate (95% CI)	Sample Size
BB%	Low	0.377	0.447	0.511	557
	Mid	0.283	0.372	0.454	389
	High	0.299	0.389	0.472	372

K%	Low	0.044	0.143	0.239	391
	Mid	0.113	0.206	0.295	425
	High	0.244	0.324	0.400	502

HR/FB	Low	0.365	0.437	0.504	522
	Mid	0.277	0.373	0.462	336
	High	0.244	0.328	0.407	460

BABIP	Low	0.325	0.403	0.475	484
	Mid	0.342	0.431	0.513	352
	High	0.337	0.413	0.485	482

Overall (50+ IP)		0.360	0.406	0.450	1318

OK, you’ve probably noticed the sample sizes between the groups don’t look so even here. The reason is that some pitchers who make the 50 IP cut in Year 0 don’t make it in Year 1 (the next year), so they drop out of the paired sample. With some exceptions, good pitchers in one year tend to get a decent amount of playing time in the next, unsurprisingly, which is why the Low BB%, High K%, and Low HR/FB groups end up the largest.
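
To make the pairing and filtering concrete, here is a rough Python sketch of the procedure as I read it. Everything in it is an assumption for illustration: the `SEASONS` rows are made up, the function names are mine, and I am assuming the IP minimum applies to both years while the BB% filter applies to Year 0 only.

```python
from math import sqrt

# Hypothetical pitcher-seasons: (pitcher, year, IP, BB%, wOBA against).
# These rows are invented for illustration; they are not from the tool's data.
SEASONS = [
    ("A", 2010, 180, 0.060, 0.300), ("A", 2011, 175, 0.062, 0.310),
    ("B", 2010, 120, 0.070, 0.330), ("B", 2011, 40, 0.075, 0.350),   # misses the IP cut in Year 1
    ("C", 2010, 95, 0.065, 0.320), ("C", 2011, 110, 0.068, 0.315),
    ("D", 2010, 200, 0.110, 0.340), ("D", 2011, 190, 0.100, 0.335),  # outside the BB% filter
    ("E", 2010, 70, 0.072, 0.345), ("E", 2011, 88, 0.080, 0.338),
]

def yty_pairs(seasons, min_ip=50, bb_min=0.0, bb_max=1.0):
    """Pair each Year 0 season with the same pitcher's Year 1 season.

    Assumes the IP minimum applies to both years and the BB% filter
    applies to Year 0 only.
    """
    by_key = {(p, y): (ip, bb, woba) for p, y, ip, bb, woba in seasons}
    pairs = []
    for (pitcher, year), (ip0, bb0, woba0) in by_key.items():
        next_season = by_key.get((pitcher, year + 1))
        if next_season is None:
            continue  # no Year 1 season on record
        ip1, _, woba1 = next_season
        if ip0 >= min_ip and ip1 >= min_ip and bb_min <= bb0 <= bb_max:
            pairs.append((woba0, woba1))
    return pairs

def pearson_r(pairs):
    """Plain Pearson correlation between the Year 0 and Year 1 values."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

low_bb_pairs = yty_pairs(SEASONS, min_ip=50, bb_min=0.0, bb_max=0.075)
print(len(low_bb_pairs), round(pearson_r(low_bb_pairs), 3))
```

Pitcher B shows the attrition effect: he passes every Year 0 filter but falls out of the sample because of his Year 1 innings.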

But, back to the point: the table here does seem to support the notion that some types of pitchers are more consistent than others. One caveat, of course, is that we’re basing this observation on two seasons per pitcher, and it’s safe to say that not every pitcher’s true “type” will be captured accurately in our samples.

Take a look at K%: the 95% confidence interval for the year-to-year (YTY) correlation in wOBA against runs from 0.044 to 0.239 for low-K% pitchers, and from 0.244 to 0.400 for high-strikeout pitchers. Since those ranges don’t overlap (if only barely), we can say with over 95% confidence that high-strikeout pitchers are indeed more consistent than low-K% pitchers in their wOBA against. That makes sense, because strikeout rates are fairly consistent year-to-year (try K% year 0 vs. K% year 1 in the tool to see what I mean), and striking out hitters is going to be predictably good for a pitcher’s wOBA against.
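
Checking whether two intervals overlap is a conservative shortcut. A more direct comparison is a z-test on the Fisher-transformed correlations, sketched below in Python. This is my own illustration rather than something the tool does; the function name `compare_correlations` is hypothetical, and the test assumes the two groups are independent samples, which is only approximately true here since the same pitcher can land in different groups in different years.

```python
from math import atanh, sqrt
from statistics import NormalDist

def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for a difference between two independent Pearson correlations."""
    z1, z2 = atanh(r1), atanh(r2)           # Fisher z-transform of each correlation
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of the difference z2 - z1
    z = (z2 - z1) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p

# Low-K% vs. high-K% rows from the wOBA-against table above.
z, p = compare_correlations(0.143, 391, 0.324, 502)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With the table's rounded inputs, the p-value comes out below 0.01, which agrees with the "over 95% confidence" reading.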

I should note that K% did rise over the course of the sample, so just be aware of that. If you download the spreadsheet and add in columns that allow you to filter by percentiles in each season, you can minimize this issue.
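
Here is one way those per-season percentile columns might be built, as a small Python sketch. The rows are invented to mimic a league-wide rise in K%, and the mid-rank percentile definition is my own choice, not something taken from the spreadsheet.

```python
from collections import defaultdict

# Hypothetical (pitcher, season, K%) rows; the later season has higher K% across the board.
rows = [
    ("A", 2008, 0.150), ("B", 2008, 0.170), ("C", 2008, 0.190), ("D", 2008, 0.210),
    ("A", 2013, 0.170), ("B", 2013, 0.190), ("C", 2013, 0.210), ("D", 2013, 0.230),
]

def season_percentiles(rows):
    """Return {(pitcher, season): percentile of K% among that season's pitchers}."""
    by_season = defaultdict(list)
    for _, season, k_pct in rows:
        by_season[season].append(k_pct)

    percentiles = {}
    for pitcher, season, k_pct in rows:
        peers = by_season[season]
        below = sum(1 for v in peers if v < k_pct)
        equal = sum(1 for v in peers if v == k_pct)
        # Mid-rank definition: ties (including the pitcher himself) count as half.
        percentiles[(pitcher, season)] = (below + 0.5 * equal) / len(peers)
    return percentiles

pct = season_percentiles(rows)
print(pct[("C", 2008)], pct[("B", 2013)])
```

Both of those pitcher-seasons have a 0.190 K%, but they sit at different percentiles within their own seasons, which is the point of filtering on percentiles instead of raw rates.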

The HR/FB groups lean towards low-HR/FB pitchers being more consistent, but there’s too much overlap between the top and bottom groups to confirm that.

Getting closer to the original question of this article, there doesn’t seem to be much difference in the BABIP groups, year-to-year — at least not when it comes to pitchers.

It’s worth noting that high-K% pitchers tend to have low BABIP and HR/FB rates. So, it’s possible that there are some hidden effects going on here. But that’s what multiple regression is for, and what stats like SIERA and my own BERA and SBERA have attempted to address.

Anyway, let’s do the same exercise, but this time with Runs Allowed per 9 IP (RA9) as the basis for the comparison.

YTY Correlation in RA9
Statistic	Tertile	Low Estimate (95% CI)	Base	High Estimate (95% CI)	Sample Size
BB%	Low	0.344	0.415	0.482	557
	Mid	0.235	0.327	0.413	389
	High	0.263	0.355	0.441	372

K%	Low	0.059	0.157	0.252	391
	Mid	0.147	0.239	0.327	425
	High	0.231	0.312	0.389	502

HR/FB	Low	0.312	0.388	0.458	522
	Mid	0.295	0.390	0.477	336
	High	0.190	0.277	0.359	460

BABIP	Low	0.305	0.384	0.458	484
	Mid	0.283	0.376	0.463	352
	High	0.241	0.323	0.401	482

Overall (50+ IP)		0.323	0.370	0.416	1318

Once again, the BB% groups overlap too much to call. This time, the high and low K% intervals also overlap slightly (from 0.231 to 0.252), so the interval check alone can’t confirm with 95% confidence that K% makes a difference.

I don’t think it’s particularly useful in this exercise, but I know some of you are going to want to know about FIP, so here it is:

YTY Correlation in FIP
Statistic	Tertile	Low Estimate (95% CI)	Base	High Estimate (95% CI)	Sample Size
BB%	Low	0.486	0.547	0.603	557
	Mid	0.354	0.438	0.515	389
	High	0.335	0.422	0.502	372

K%	Low	0.124	0.220	0.313	391
	Mid	0.188	0.278	0.363	425
	High	0.322	0.398	0.469	502

HR/FB	Low	0.464	0.529	0.588	522
	Mid	0.425	0.508	0.583	336
	High	0.347	0.425	0.497	460

BABIP	Low	0.392	0.465	0.532	484
	Mid	0.377	0.463	0.542	352
	High	0.419	0.490	0.555	482

Overall (50+ IP)		0.433	0.476	0.517	1318

Seeing as how BB, K, and HR are the components of FIP, it’s no surprise that we’re seeing more of an effect from them, I think.

OK, now that we’re through with all that, let’s take a look at the same sort of analysis for batter wOBAs:

YTY Correlation in wOBA
Filter	Tertile	Low Estimate (95% CI)	Base	High Estimate (95% CI)	Sample Size
BB%	Low	0.274	0.366	0.452	366
	Mid	0.360	0.443	0.519	397
	High	0.442	0.513	0.577	459

K%	Low	0.474	0.542	0.603	465
	Mid	0.489	0.559	0.623	407
	High	0.387	0.473	0.550	350

HR/FB	Low	0.202	0.301	0.393	350
	Mid	0.189	0.287	0.380	355
	High	0.430	0.498	0.560	517

BABIP	Low	0.401	0.481	0.553	398
	Mid	0.505	0.581	0.648	333
	High	0.478	0.543	0.603	491

Overall (50+ IP)		0.323	0.370	0.416	1222

Here, it looks as though high-BB% hitters are more consistent, though not quite with 95% certainty; I think that makes a lot of sense, seeing as how BB% is very consistent YTY for batters, along with the fact that a walk is an automatic wOBA booster.

High-K% batters may be less consistent, but it’s hard to say that with much confidence.

High-HR/FB sluggers are very significantly more consistent in their wOBA than low or even mid-HR/FB types. Besides also tending to walk a lot, these slugger types are definitely less affected by the whims of BABIP, due to those uncatchable balls they hit out into the outfield bleachers.

Unfortunately, the BABIP difference is looking iffy here. But this could once again be a job for multiple regression. Batter BABIP has some positive correlation to both speed and power, which tend to be negatively correlated with each other (but a guy with both speed and power, e.g. Mike Trout, is likely to have a high BABIP). So this could be a good situation in which to apply multiple filters, when that particular tool is released.

I’ll leave you with this bonus interactive tool that illustrates more clearly the range of possible correlations that lie within a given confidence interval, given an observed correlation and a particular sample size:

Unless your sample size is gigantic, you can only say the true correlation is the same as the observed correlation with near-0% confidence. There’s really no such thing as nailing a particular correlation down with exactly 100% confidence either, although you can come close with an extremely large sample. On the flip side, see how wide the spread between the minimum and maximum correlation estimates gets when the sample is down to four pairs of observations — if you want to make a statement with 99% confidence with a sample that small, you may have to say the true correlation is “somewhere between 1 and -1.”
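
If you want to play with these ranges in code, the sketch below uses the Fisher z-transformation, the standard way to put a confidence interval around a Pearson correlation. I am assuming that method here because it approximately reproduces the low-K% row from the first table; the function name `correlation_ci` is my own, and the tool may compute its intervals differently.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

def correlation_ci(r, n, confidence=0.95):
    """Confidence interval for a Pearson correlation via the Fisher z-transformation."""
    if n < 4:
        raise ValueError("need at least 4 pairs of observations")
    # Two-sided critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit / sqrt(n - 3)   # standard error of Fisher z is 1/sqrt(n - 3)
    center = atanh(r)                   # transform r to the z scale
    return tanh(center - half_width), tanh(center + half_width)

# Low-K% row from the wOBA-against table: r = 0.143 on 391 pairs.
lo, hi = correlation_ci(0.143, 391)
print(round(lo, 3), round(hi, 3))

# Four pairs of observations at 99% confidence: the interval spans nearly -1 to 1.
tiny_lo, tiny_hi = correlation_ci(0.5, 4, confidence=0.99)
print(round(tiny_lo, 3), round(tiny_hi, 3))
```

The second call uses a made-up observed correlation of 0.5 to show how little four pairs of observations can tell you.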

Well, I hope you enjoy and use these tools well!

20 Comments

Billy

11 years ago

Great stuff Steve. Tell me you work for a team or BIS or something, because if not, I can’t imagine it would be long before you’d have an opportunity to do so.

Steve Staude

Reply to Billy

Thank you! Nope, not yet. I’m a bit tied down to staying in Southern California, but if somebody out there wants to offer me something around here or a chance to work remotely…
