Small Sample Usefulness

Over the last decade or so, the main sabermetric truisms have been, in no particular order: we like hitters with plate discipline and power, bunting is bad, the modern-day bullpen is inefficient, and don’t make decisions based on small sample sizes. The last of these comes up often early in a season, when strange things happen, like Mike Hampton striking out 10 batters in a game or Emilio Bonifacio hitting .600 for a week. We trot out the old “it’s early, don’t make any rash judgments” line, and work to convince people that what they’ve seen so far isn’t likely to continue.

However, like most truisms, this is often taken to an illogical extreme. People have begun to lean on “small sample size” like a crutch that helps them defend their original position in the face of evidence that they might not be correct. The evidence might not be overwhelming, but as it begins to pile up, remaining wedded to your preseason thoughts is just as ignorant as overreacting to the performance.

Let’s use Victor Martinez as an example. I talked about him whacking the ball in April the other day. He’s had a great month, hitting .388/.438/.624 and generally being one of the best hitters in baseball. We can be pretty sure that Martinez won’t keep hitting this well, of course, as it is a small sample of data so far.

However, regardless of what your preseason projection for Martinez was, you should now be quite a bit more optimistic about his performance over the rest of the year than you were on Opening Day. Dan Szymborski released an updated ZiPS projection that accounts for April data (thanks, Dan, great stuff!), and ZiPS now thinks Martinez will hit .305/.380/.467 the rest of the season, up from a preseason projection of .293/.366/.447. That small sample size that we’re not supposed to get excited about has increased his projection for the remainder of the season by 34 points of OPS. That’s a significant change.
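For anyone who wants to check the arithmetic, here is a minimal sketch of the 34-point figure. OPS is on-base percentage plus slugging percentage, and the two slash lines are the ones quoted above; the function and variable names are just illustrative.

```python
def ops_points(slash_line):
    """Return OPS in points (thousandths) for an (AVG, OBP, SLG) slash line."""
    _avg, obp, slg = slash_line
    # OPS = OBP + SLG; rounding to whole points avoids float noise.
    return round((obp + slg) * 1000)

# Slash lines quoted in the article.
preseason = (0.293, 0.366, 0.447)
updated = (0.305, 0.380, 0.467)

change = ops_points(updated) - ops_points(preseason)
print(ops_points(preseason), ops_points(updated), change)  # 813 847 34
```

That is an .813 OPS projection becoming an .847 one on the strength of a single month.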

April is only one month of the season. Things won’t end the way they are now. We do have to be careful about drawing conclusions on small sample sizes. However, let’s not fall into the opposite trap, either – there is useful information to be gleaned from the beginning of the season. Pretending like nothing has changed is just as uninformed as pretending like the current performances will be sustained.
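One way to picture how a month of data nudges a projection without taking it over is a simple weighted average of the old projection and the new results, with weights set by plate appearances. To be clear, this is a toy sketch and not how ZiPS works: the function name is made up, and both the 500 plate appearances of "prior weight" and the observed plate appearance counts are hypothetical numbers chosen for illustration. Only the .366 and .438 on-base percentages come from the article.

```python
def blended_rate(prior_rate, prior_weight_pa, observed_rate, observed_pa):
    """Weighted average of a prior projection and an observed rate.

    prior_weight_pa is a hypothetical stand-in for how many plate
    appearances of evidence the preseason projection is "worth".
    """
    total_pa = prior_weight_pa + observed_pa
    return (prior_rate * prior_weight_pa + observed_rate * observed_pa) / total_pa

PRESEASON_OBP = 0.366    # from the article
APRIL_OBP = 0.438        # from the article
PRIOR_WEIGHT_PA = 500    # hypothetical assumption

# Hypothetical sample sizes: the same hot streak counts for more as it lasts longer.
for observed_pa in (25, 100, 400):
    estimate = blended_rate(PRESEASON_OBP, PRIOR_WEIGHT_PA, APRIL_OBP, observed_pa)
    print(observed_pa, round(estimate, 3))
```

The estimate never jumps all the way to the hot-streak number, and it never stays put at the preseason number either. It moves a little with every plate appearance, and more as the sample grows.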

Don’t hang onto your preseason projections like they’re gospel. You’ve got new information in front of you. Use it.

Dave is the Managing Editor of FanGraphs.

Ryan B
16 years ago

When does a small sample size cease to be small? I’m sure everyone can agree that a week of Bonifacio is small, but what if one week had turned into two? Or three? At what point does it become meaningful?

lookatthosetwins
16 years ago
Reply to  Ryan B

There isn’t a certain point when things become meaningful. Every bit of information has a certain amount of meaning. One day of going 4 for 4 might add half a point to a projected OPS. One week might add 5 points. These numbers are completely made up, but the point is that every game, every at-bat is meaningful; it just becomes more so as the sample size increases.