Outliers, Breakouts, and the Owl of Minerva

by Matt Klaassen

December 2, 2009

As part of “projection week” here at FanGraphs, this post follows Monday’s by discussing two phenomena that are often brought up in relation to projections: “Outliers” and “Breakouts.” Although they contain elements of truth, these notions are often used in problematic fashion to show that projections are “wrong.”

An “outlier” is a season that appears to differ greatly from a player’s usual performance. Some will claim that said season should be ignored when projecting a player, since it “obviously” does not represent his real skill. A “breakout” season is one in which a (usually young) player greatly exceeds expectations and/or past performance. The season is seen as establishing a new level of performance such that prior performance should be weighted much less heavily or ignored.

You may have noticed the potential contradiction. While the “outlier theory” claims that a single season deviating from an apparently established level of performance should be thrown out, the “breakout theory” claims that a single season deviating greatly from earlier seasons means that it should be looked at to the exclusion of the others. This isn’t necessarily a contradiction, as one could hold that there are particular conditions for outliers and breakouts — outliers might only apply to players in their prime, or breakouts to young players. Still, it’s worth noting, as you’ll often see the same person assert both.

The deeper and more important point is that by looking at one-year deviations as establishing a new level of performance that thus takes on a greater weight (breakout!) or as being irrelevant and thus in need of exclusion (outlier), both positions implicitly assume they already know what we’re trying to find out when projecting a player: his “true talent.” Recall the “general formula for player performance” from Monday’s post: performance = true talent + luck. The various methods that projection systems use (regression, weighted averages, age adjustments, etc.) are meant to take the (limited) data we have for a player and filter out luck in order to estimate his current true talent. These methods are predicated on the fact that we can’t pinpoint the player’s true talent given the limited performance samples we have, so we make our best estimate based on probabilities.
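To make the "weighted averages plus regression" idea concrete, here is a minimal sketch of how a simple system might blend several seasons. It is not the method of any particular projection system: the function name `project_rate`, the 5/4/3 recency weights, the 1,200 plate appearances of league-average "ballast," and the sample wOBA-style numbers are all assumptions chosen for illustration.

```python
def project_rate(seasons, league_rate, weights=(5, 4, 3), regression_pa=1200):
    """Estimate true talent from recent seasons.

    seasons: list of (rate, plate_appearances), most recent season first.
    weights: hypothetical recency weights, most recent season first.
    regression_pa: hypothetical amount of league-average performance
                   mixed in to regress the estimate toward the mean.
    """
    weighted_total = 0.0
    weighted_pa = 0.0
    for (rate, pa), weight in zip(seasons, weights):
        weighted_total += weight * rate * pa
        weighted_pa += weight * pa
    # Regression to the mean: add a fixed chunk of league-average play.
    weighted_total += regression_pa * league_rate
    weighted_pa += regression_pa
    return weighted_total / weighted_pa


# A hitter with two league-average seasons followed by a big year.
recent_first = [(0.400, 600), (0.330, 600), (0.330, 600)]
proj = project_rate(recent_first, league_rate=0.330)
print(round(proj, 3))
```

Under these assumed settings the big season pulls the estimate up, but nowhere near all the way: the projection lands between the old level and the new one. The season is neither thrown out nor treated as the only one that counts.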

Labeling a single season as irrelevant or supremely relevant to estimating a player’s true talent implicitly assumes that one already knows that player’s true talent. One can certainly cite examples of each kind to support the case for a “breakout” or “outlier.” One could just as easily come up with (many more) examples of the opposite, where a perceived “breakout” or “outlier” turned out not to have the (in)significance assigned to it. But to do either obscures the important point. It is true that individual players age differently and deviate from expectations. However, projection systems only achieve the overall accuracy they have by projecting players as a whole based on the data on hand. An apparent “outlier” season from two years ago may weigh less heavily because of the passage of time and/or, say, because BABIP is regressed more heavily than other skills. An apparent “breakout” by a young player may have more impact on the projection because of age adjustment, greater playing time, etc. But projection systems do not and should not give such seasons special treatment beyond their standard adjustments.
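The claim that apparent breakouts often fail to hold can be illustrated with a toy simulation of performance = true talent + luck. This is a sketch with made-up parameters, not real player data: the league mean, the spread of talent, the size of the luck term, and the “breakout” cutoff are all assumptions, and the function name `simulate_breakouts` is hypothetical.

```python
import random


def simulate_breakouts(n_players=20000, league_mean=0.330, talent_sd=0.020,
                       luck_sd=0.030, breakout_cutoff=0.380, seed=0):
    """Simulate two seasons per player; track those with a big first season."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_players):
        # True talent is fixed for the player; luck is drawn fresh each season.
        talent = rng.gauss(league_mean, talent_sd)
        season1 = talent + rng.gauss(0.0, luck_sd)
        season2 = talent + rng.gauss(0.0, luck_sd)
        if season1 >= breakout_cutoff:
            first.append(season1)
            second.append(season2)
    return len(first), sum(first) / len(first), sum(second) / len(second)


count, avg_breakout, avg_followup = simulate_breakouts()
print(count, round(avg_breakout, 3), round(avg_followup, 3))
```

In this toy world, the group of players who posted a big first season is, on average, better than the league the next year, but well short of the “breakout” level. Some of them really are that good; we just cannot tell which ones from a single season.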

A famous German sabermetrician once wrote, “the owl of Minerva begins its flight only with the onset of dusk.” Although in retrospect we can look back on the careers of particular players and identify certain seasons as “outliers” or “breakouts,” this can only be done years later, when we have a perspicuous overview of a period of a player’s career as a whole. Projection systems work in the midst of player performance without the benefit of historical perspective, and have to do the best they can based on the information at hand. Doing anything more would abandon the humble presuppositions upon which player projection rests.
