The Predictive Power of AFL Batting Stats: A Partial Study

November 21, 2013

Although they are generally thought to offer little in the way of predictive power, the batting lines of prospects in the Arizona Fall League are nonetheless frequently cited by baseball writers in discussions of those same prospects. Nor is this entirely surprising: one wants to make some sort of comment about Kris Bryant, for example, who’s just finished his own AFL season with six home runs and a .727 slugging percentage. Even after noting that he recorded those figures in just 92 plate appearances, one is compelled to suggest that Bryant’s performance was impressive. And it was, certainly, within the context of the 2013 season of the Arizona Fall League.

The present author, attempting to behave somewhat responsibly, has produced statistical reports for the AFL this fall which utilize an offensive metric (called SCOUT+) that combines regressed home-run, walk, and strikeout rates in a FIP-like equation to produce a result not unlike wRC+. By isolating and regressing those metrics (i.e. not BABIP) which become reliable in smaller samples, one reasons, it’s possible to reduce the noise otherwise present in slash lines — and perhaps to better identify how performances from the AFL might inform future major-league production.
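The exact constants behind SCOUT+ aren't given here, so what follows is only a rough sketch of the general idea: shrink each rate toward a league average, then combine the regressed rates linearly and scale the result so that 100 is average. The function names, league rates, stabilization constants, and weights are all hypothetical placeholders, not the values actually used.

```python
def regress_rate(observed, pa, league_rate, stabilization_pa):
    """Shrink an observed per-PA rate toward the league rate.

    stabilization_pa is a hypothetical constant: the number of
    league-average plate appearances blended into the sample.
    """
    return (observed * pa + league_rate * stabilization_pa) / (pa + stabilization_pa)


def scout_like_index(hr, bb, k, pa, league, stabilization, weights):
    """Illustrative FIP-style linear index, scaled so league average = 100.

    NOTE: the weights are placeholders, not the real SCOUT+ coefficients.
    """
    hr_r = regress_rate(hr, pa, league["hr"], stabilization["hr"])
    bb_r = regress_rate(bb, pa, league["bb"], stabilization["bb"])
    k_r = regress_rate(k, pa, league["k"], stabilization["k"])
    # Home runs and walks add value; strikeouts subtract it.
    value_above_average = (
        weights["hr"] * (hr_r - league["hr"])
        + weights["bb"] * (bb_r - league["bb"])
        - weights["k"] * (k_r - league["k"])
    )
    return 100 * (1 + value_above_average)


# Hypothetical inputs, chosen only to make the sketch runnable.
LEAGUE = {"hr": 0.022, "bb": 0.100, "k": 0.200}
STABILIZATION = {"hr": 170, "bb": 120, "k": 60}
WEIGHTS = {"hr": 13.0, "bb": 3.0, "k": 2.0}

average_hitter = scout_like_index(0.022, 0.100, 0.200, 80, LEAGUE, STABILIZATION, WEIGHTS)
strong_hitter = scout_like_index(0.032, 0.140, 0.149, 82, LEAGUE, STABILIZATION, WEIGHTS)
```

Under these assumptions a hitter with exactly league-average rates lands at 100 regardless of sample size, while a short hot streak is pulled most of the way back toward average before it counts.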

“How successful is this (theoretically) more responsible and (definitely) more nerdy attempt to measure AFL production, to the extent that it might hold within it some manner of predictive power?” one might, perhaps already has, wondered. “Not very,” appears to be the answer.

To test how well recent AFL performances might correlate with major-league performance, I began by isolating player-seasons from the top and bottom 25% of the SCOUT leaderboards from each of the last three AFL seasons (2010-12, that is). With slightly more than 100 batters in each season, that gives us about 26 “good” hitters and 26 “bad” hitters from each of the past three years — or 79, precisely, of each.
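For the curious, the selection step amounts to something like the sketch below. The helper name, the tuple layout, and the rounding rule for the group size are my own assumptions for illustration; the eight-player season is made up.

```python
def split_top_bottom(leaderboard, fraction=0.25):
    """Return (best, worst) slices of one season's leaderboard.

    leaderboard: list of (player_name, scout_plus) tuples.
    """
    ranked = sorted(leaderboard, key=lambda row: row[1], reverse=True)
    # Group size: the chosen fraction of the field, at least one player.
    n = max(1, round(len(ranked) * fraction))
    return ranked[:n], ranked[-n:]


# Made-up season of eight hitters (names and values are placeholders).
season = [("A", 131), ("B", 118), ("C", 109), ("D", 102),
          ("E", 97), ("F", 91), ("G", 84), ("H", 72)]
best, worst = split_top_bottom(season)
```

Run once per season and pooled, a split like this on fields of slightly more than 100 batters gives groups of roughly 26 apiece.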

To give the reader a sense of the difference in performance between the best and worst hitters by this methodology, here’s a table featuring the respective performances of each:

Type	Age	PA	HR%	BB%	K%	AVG	OBP	SLG	BABIP	SCOUT+
Best	22.5	82	3.2%	14.0%	14.9%	.306	.408	.496	.340	122
Worst	22.0	76	1.3%	6.8%	25.6%	.252	.310	.370	.335	79

The 79 hitters from the Best group homered about 2.5 times more frequently than, walked twice as often as, and struck out at a little more than half the rate of their counterparts in the Worst group. The strengths and weaknesses of each group are exhibited in the average slash lines, as well. The Best group posted a higher batting average (by about 50 points), higher on-base percentage (by about 100 points), and a higher slugging percentage (by nearly 130 points). Conveniently, one finds that the BABIPs for each group are almost identical as well (.340 and .335, respectively), so one needn’t make any other sort of allowances for that.
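The arithmetic behind those comparisons is easy enough to check against the table. A short sketch (the dictionary layout is mine; the figures are the table's):

```python
# AFL group averages, copied from the Best/Worst table.
best = {"hr": 3.2, "bb": 14.0, "k": 14.9, "avg": .306, "obp": .408, "slg": .496}
worst = {"hr": 1.3, "bb": 6.8, "k": 25.6, "avg": .252, "obp": .310, "slg": .370}

# Rate stats compared as ratios (Best divided by Worst).
ratios = {stat: round(best[stat] / worst[stat], 2) for stat in ("hr", "bb", "k")}

# Slash-line stats compared as gaps in "points" (thousandths).
point_gaps = {stat: round((best[stat] - worst[stat]) * 1000)
              for stat in ("avg", "obp", "slg")}

print(ratios)      # {'hr': 2.46, 'bb': 2.06, 'k': 0.58}
print(point_gaps)  # {'avg': 54, 'obp': 98, 'slg': 126}
```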

That the hitters from the Best group have been about half a year older than those in the Worst isn’t entirely surprising. All things being equal, batters who are both (a) young but also (b) closer to their peak age (i.e. about 27) are likely to perform more ably than those who are less close to their peak age.

To get a sense of how these AFL performances might be predictive of major-league performance, I identified all the players from both groups who’d graduated to the majors. Of the 79 hitters from the Best group, 41* have recorded at least one major-league plate appearance. Of the 79 hitters from the Worst group, only 31 (i.e. 10 fewer) have done the same.

*Or 42, if you count Derek Norris, who qualified as one of the best hitters in two separate AFL seasons.

That the former group has graduated more of its constituents to the majors probably merits some investigation. Among the possible explanations, it would seem the most likely ones are either (a) the players from the Best group are actually better, (b) the players from the Best group, being older on average, were closer to the majors anyway, or (c) the groups are equally talented, and chance is the cause for the uneven distribution.
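One rough way to size up explanation (c) is a pooled two-proportion z-test on the graduation counts. This is a sketch of a standard test, not something performed for the study itself, and it assumes the 158 player-seasons are independent, which isn't strictly true given that a player can appear in more than one season.

```python
from math import erf, sqrt


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for a pooled two-proportion z-test."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Standard normal CDF, computed from the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# 41 of 79 Best hitters and 31 of 79 Worst hitters have reached the majors.
z, p_value = two_proportion_z_test(41, 79, 31, 79)
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

With these counts the test returns a z of about 1.6 and a two-sided p-value of about 0.11, which is to say that chance can't be comfortably ruled out at the conventional 0.05 level.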

Whatever the explanation, it’s worth noting that a higher percentage of the Best group’s graduates have recorded fewer than 100 major-league plate appearances. Using that figure (i.e. 100 PA) as the threshold for consideration, one is left with 28 hitters from the Best group and 23 hitters from the Worst group, i.e. a difference of only five, which is a considerably smaller gap.
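The shares in question follow directly from the counts already given. A quick sketch (the dictionary keys are mine):

```python
# Counts from the text: players with at least 1 PA and at least 100 PA in the majors.
groups = {
    "Best": {"debuted": 41, "reached_100_pa": 28},
    "Worst": {"debuted": 31, "reached_100_pa": 23},
}

for name, g in groups.items():
    g["under_100_pa"] = g["debuted"] - g["reached_100_pa"]
    g["under_100_share"] = g["under_100_pa"] / g["debuted"]
    print(f"{name}: {g['under_100_pa']} of {g['debuted']} under 100 PA "
          f"({g['under_100_share']:.1%})")
# Best: 13 of 41 under 100 PA (31.7%)
# Worst: 8 of 31 under 100 PA (25.8%)
```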

The 100-plate-appearance threshold is also convenient, insofar as it represents a sample size at which metrics like walk and strikeout rate are beginning, at least, to reflect true talent. Below are the major-league figures thus far for both of the groups in question.

First, from the Best group:

Name	Team	PA	HR%	BB%	K%	BABIP	AVG	OBP	SLG	wRC+
Wil Myers	Rays	373	3.5%	8.8%	24.4%	.362	.293	.354	.478	131
Darin Ruf	Phillies	330	5.2%	10.6%	31.2%	.333	.257	.348	.489	130
Bryce Harper	Nationals	1094	3.8%	10.7%	19.6%	.308	.272	.353	.481	128
Jason Kipnis	Indians	1480	2.6%	10.4%	19.3%	.316	.270	.349	.424	117
Jedd Gyorko	Padres	525	4.4%	6.3%	23.4%	.287	.249	.301	.444	110
Anthony Rendon	Nationals	394	1.8%	7.9%	17.5%	.307	.265	.329	.396	100
Jerry Sands	Dodgers	251	1.6%	10.4%	23.9%	.316	.244	.325	.376	100
Robbie Grossman	Astros	288	1.4%	8.0%	24.3%	.353	.268	.332	.370	97
J.B. Shuck	—	570	0.4%	6.7%	10.7%	.320	.290	.335	.359	96
Derek Norris	Athletics	540	3.0%	10.7%	25.4%	.282	.226	.315	.383	96
Didi Gregorius	—	425	1.6%	8.7%	16.5%	.296	.255	.330	.369	90
Nick Franklin	Mariners	412	2.9%	10.2%	27.4%	.290	.225	.303	.382	90
Dustin Ackley	Mariners	1471	1.5%	9.2%	18.7%	.294	.245	.315	.354	89
Charlie Blackmon	Rockies	481	1.9%	2.9%	15.4%	.332	.291	.321	.416	88
Conor Gillaspie	—	500	2.8%	8.2%	16.4%	.262	.241	.302	.381	83
Nolan Arenado	Rockies	514	1.9%	4.5%	14.0%	.296	.267	.301	.405	79
Marc Krauss	Astros	146	2.7%	6.8%	30.8%	.279	.209	.267	.366	74
Logan Schafer	Brewers	367	1.1%	7.4%	17.4%	.261	.219	.285	.336	70
Johnny Giavotella	Royals	424	0.7%	4.5%	16.7%	.284	.240	.278	.335	65
Tony Cruz	Cardinals	332	0.6%	3.9%	17.2%	.280	.236	.271	.331	64
Brock Holt	—	144	0.0%	7.6%	12.5%	.282	.250	.302	.298	64
Aaron Hicks	Twins	313	2.6%	7.7%	26.8%	.241	.192	.259	.338	63
Eduardo Escobar	—	332	0.9%	6.6%	19.9%	.280	.228	.280	.307	61
Chris Herrmann	Twins	197	2.0%	9.6%	27.4%	.248	.189	.268	.297	58
Ryan Lavarnway	Red Sox	291	1.7%	5.8%	23.4%	.256	.208	.258	.327	55
David Adams	Yankees	152	1.3%	5.9%	28.3%	.263	.193	.252	.286	45
Cord Phelps	Indians	123	1.6%	7.3%	23.6%	.195	.159	.221	.248	32
Josh Vitters	Cubs	109	1.8%	6.4%	30.3%	.154	.121	.193	.202	3
Average	—	449	2.0%	7.6%	21.5%	.285	.236	.298	.364	81

And next, from the Worst group:

Name	Team	PA	HR%	BB%	K%	BABIP	AVG	OBP	SLG	wRC+
Mike Trout	Angels	1490	4.2%	12.5%	20.5%	.366	.314	.404	.544	163
Brandon Belt	Giants	1252	2.6%	10.1%	23.0%	.339	.273	.351	.447	125
Matt Adams	Cardinals	410	4.6%	6.8%	25.4%	.332	.275	.324	.476	124
Christian Yelich	Marlins	273	1.5%	11.4%	24.2%	.380	.288	.370	.396	116
Jean Segura	—	789	1.5%	4.8%	13.6%	.321	.287	.326	.403	99
Corey Dickerson	Rockies	213	2.3%	7.5%	19.2%	.307	.263	.316	.459	98
Nick Franklin	Mariners	412	2.9%	10.2%	27.4%	.290	.225	.303	.382	90
Kirk Nieuwenhuis	Mets	422	2.4%	8.8%	30.8%	.329	.236	.305	.366	88
Brandon Crawford	Giants	1246	1.3%	7.9%	17.8%	.285	.241	.304	.346	83
Grant Green	—	153	0.7%	6.5%	28.8%	.351	.250	.301	.343	83
Derrick Robinson	Reds	216	0.0%	8.3%	20.4%	.331	.255	.322	.323	81
Anthony Gose	Blue Jays	342	0.9%	6.4%	28.1%	.336	.240	.294	.361	77
Xavier Avery	Orioles	107	0.9%	10.3%	21.5%	.286	.223	.305	.340	76
Dave Sappelt	—	274	0.7%	6.2%	14.6%	.292	.251	.301	.343	74
Devin Mesoraco	Reds	589	2.7%	7.5%	17.7%	.248	.225	.282	.359	70
Pete Kozma	Cardinals	552	0.5%	8.2%	20.7%	.292	.232	.293	.315	66
Charlie Culberson	—	127	1.6%	3.1%	23.6%	.333	.264	.286	.355	62
Adeiny Hechavarria	—	715	0.7%	4.8%	17.9%	.279	.232	.269	.311	57
Ryan Wheeler	—	161	0.6%	6.2%	19.9%	.288	.233	.280	.320	56
Andy Parrino	—	229	0.4%	12.2%	27.9%	.267	.186	.295	.242	51
Adam Moore	—	271	2.2%	3.7%	28.4%	.259	.200	.237	.310	48
Austin Romine	Yankees	168	0.6%	5.4%	25.0%	.268	.201	.248	.279	41
Jake Marisnick	Marlins	118	0.8%	5.1%	22.9%	.232	.183	.231	.248	29
Average	—	458	1.6%	7.6%	22.6%	.305	.242	.302	.359	81

What one notices (besides the fact that Mike Trout was somehow the worst at something one time*) is that the groups are more or less similar in terms of major-league performance.

*His line in the 2011 edition of the AFL: 111 PA, 1 HR, 4.5% BB, 29.7% K, .245/.279/.321 (.347 BABIP).

For reference, here’s each group’s average major-league line, one next to the other:

Type	PA	HR%	BB%	K%	AVG	OBP	SLG	BABIP	wRC+
Best	449	2.0%	7.6%	21.5%	.236	.298	.364	.285	81
Worst	458	1.6%	7.6%	22.6%	.242	.302	.359	.305	81
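The Average rows are consistent with simple, unweighted means across players: every hitter counts once, whether he has 109 plate appearances or 1,490. A sketch that reproduces the PA and wRC+ averages from the two major-league tables (the list layout and helper name are mine):

```python
from statistics import mean

# (PA, wRC+) pairs copied from the two major-league tables.
BEST = [(373, 131), (330, 130), (1094, 128), (1480, 117), (525, 110),
        (394, 100), (251, 100), (288, 97), (570, 96), (540, 96),
        (425, 90), (412, 90), (1471, 89), (481, 88), (500, 83),
        (514, 79), (146, 74), (367, 70), (424, 65), (332, 64),
        (144, 64), (313, 63), (332, 61), (197, 58), (291, 55),
        (152, 45), (123, 32), (109, 3)]
WORST = [(1490, 163), (1252, 125), (410, 124), (273, 116), (789, 99),
         (213, 98), (412, 90), (422, 88), (1246, 83), (153, 83),
         (216, 81), (342, 77), (107, 76), (274, 74), (589, 70),
         (552, 66), (127, 62), (715, 57), (161, 56), (229, 51),
         (271, 48), (168, 41), (118, 29)]


def simple_average(rows, column):
    """Unweighted mean of one column, rounded to the nearest whole number."""
    return round(mean(row[column] for row in rows))


best_pa, best_wrc = simple_average(BEST, 0), simple_average(BEST, 1)
worst_pa, worst_wrc = simple_average(WORST, 0), simple_average(WORST, 1)
print(best_pa, best_wrc)    # 449 81
print(worst_pa, worst_wrc)  # 458 81
```

A PA-weighted mean would be a different design choice, one that lets the longest careers dominate each group's line.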

Whereas, previously, one noted rather large differences between the two groups so far as their AFL home-run, walk, and strikeout rates were concerned, here, at the major-league level, those differences have disappeared almost entirely. The Best group has homered slightly more often and (for reasons that are probably interesting, but which I’ll ignore presently) has recorded a noticeably lower BABIP (.285 versus .305). Otherwise, the offensive output has been rather similar and, in the case of park-adjusted hitting relative to league average, precisely the same (as denoted by the 81 wRC+).

Is the Arizona Fall League good for something? Almost certainly. Are the stats produced by the players there — even those stats which become reliable in smaller samples — predictive of future major-league performance? It’s possible, yes, but if that is the case, it’s difficult to detect by this methodology.
