As with my work last year on pitcher aging and velocity decline, I am always looking for reliable indicators or signals of change in players. One thing I’ve been interested in trying to better understand is changes in performance that might signal or herald a large drop-off in performance in the following year.
So, one Saturday morning I decided to do some statistical fishing.
I took three-year trended data for hitters with greater than 300 plate appearances from 2007 through 2012 and identified hitters whose wOBA declined by at least .030 points from one year (YR2) to the next (YR3). I then backed up and looked at the change in various statistics between YR1 and YR2 to see if there was a way to better predict those players that were poised to experience a large decline in their wOBA in YR3.
I even had a nifty nickname ready for the eventual statistic–CLIFFORD. (The idea being that the statistic would help predict hitters likely to “fall off a cliff” offensively. Hey, when you do this kind of work as much as I do, it just comes with the territory.)
I calculated quartile values for the year-over-year change in about 40 statistics, including various plate discipline and PITCHf/x metrics. I then calculated the relative risk that a hitter would put up a wOBA at least .030 points lower in YR3 if their performance in each of those statistics decreased from YR1 to YR2 by at least as much as the 25th percentile.
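The relative-risk calculation described above can be sketched in a few lines of pandas. This is a reconstruction, not the original code; the data frame layout and column names (one row per player, with YR1/YR2/YR3 values in separate columns) are assumptions for illustration.

```python
import pandas as pd

def relative_risk(df: pd.DataFrame, metric: str) -> float:
    """Relative risk of a >= .030 wOBA drop in YR3 for hitters whose
    YR1 -> YR2 change in `metric` is at or below the 25th percentile.
    Column names like `{metric}_yr1` and `woba_yr3` are hypothetical."""
    change = df[f"{metric}_yr2"] - df[f"{metric}_yr1"]
    exposed = change <= change.quantile(0.25)              # big YR1->YR2 decline in the metric
    declined = (df["woba_yr2"] - df["woba_yr3"]) >= 0.030  # "cliff" in YR3

    risk_exposed = declined[exposed].mean()      # P(decline | exposed)
    risk_unexposed = declined[~exposed].mean()   # P(decline | not exposed)
    return risk_exposed / risk_unexposed
```

A relative risk above 1 means hitters with a large YR1-to-YR2 drop in the metric were more likely than everyone else to fall off in YR3.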
Here are the metrics with a relative risk over 1:
| Metric | Relative Risk of YR3 Decline |
| ------ | ---------------------------- |
I then decided to see if there was some combination of metrics that would provide decent predictive power. So, I tried a few combinations, looking to classify hitters as “cliff” candidates if they had at least three of the following four:
- YR2 – YR1 at or below the 25th percentile in Z-Contact%(pfx)
- YR2 – YR1 at or below the 25th percentile in UBR
- YR2 – YR1 at or below the 25th percentile in Spd
- YR2 – YR1 at or below the 25th percentile in FA%
Why did I choose these four? First, I needed to start somewhere. Second, they were the four metrics with the highest relative risk for players that declined by more than .030. And, third, they made sense as a starting point. Players that suffer a large decline in zone contact could be experiencing a decline in bat speed. The two base running metrics (UBR and Spd) might help predict a little bit of aging, but more importantly a general decline in speed and quickness that would impact performance. Four-seam fastball percentage also made sense to me, since hitters that fared better against straight fastballs might have trouble adjusting if pitchers stopped throwing them as much.
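The "at least three of four" tagging rule can be sketched as follows. Again, this is an illustrative reconstruction: the metric column names are hypothetical stand-ins for Z-Contact% (pfx), UBR, Spd, and FA%, and the percentile thresholds are computed within the sample as described above.

```python
import pandas as pd

# Hypothetical column-name stems for the four metrics
METRICS = ["z_contact_pfx", "ubr", "spd", "fa_pct"]

def tag_clifford(df: pd.DataFrame) -> pd.Series:
    """Flag a hitter as a CLIFFORD candidate when at least three of the
    four YR1 -> YR2 changes fall at or below that metric's 25th percentile."""
    flags = pd.DataFrame(index=df.index)
    for m in METRICS:
        change = df[f"{m}_yr2"] - df[f"{m}_yr1"]
        flags[m] = change <= change.quantile(0.25)
    return flags.sum(axis=1) >= 3  # True = cliff candidate
```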
I then calculated the relative risk of a player’s wOBA declining by more than .030 for players tagged as CLIFFORD candidates compared to those who were not tagged:
| Group | % Declining by more than .030 |
| ----- | ----------------------------- |
Here’s where I got excited. This methodology identified players that were twice as likely to see their offensive performance decline sharply. Not only did I get excited, I also got ahead of myself. I took to Twitter, teasing that I had this wonderful new toy that could help identify decline candidates. I had the article all written in my head. I was all set to go.
But then I stopped and thought, well, maybe projection systems already do as good a job predicting this type of decline. Maybe even Tango’s Marcel does–I had just assumed they didn’t. The hypothesis, then, was that CLIFFORD would produce a higher relative risk for declining players than Marcel.
So I went back and matched up Marcel projections for the players in the data set, identified those players that Marcel projected would decline by at least .030, and then calculated the relative risk compared to players not tagged by Marcel as decliners:
| Group | % Declining by more than .030 |
| ----- | ----------------------------- |
Stymied, thwarted, mission aborted.
For all my calculations and gyrations, simply looking at Marcel projections would have done just as good a job–in fact, a slightly better one–at identifying cliff candidates*.
Why am I sharing this negative (in some sense, embarrassing) result with you fine readers?
For starters, we don’t do enough of this. By we, I mean any researcher in just about any discipline–not just baseball analysis. More and more, journals, newspapers, online forums, and conferences are filled with the reporting out of positive results (“hey, look, I confirmed my hypothesis! I discovered a new thing!”). And while positive results are interesting and important in their own right, negative results are (or, should be) just as important. Any discipline progresses in large part by falsifying hypotheses, replicating results, and figuring out what doesn’t work so it can focus on what might.
Of course, I’m just as guilty of this as the next guy. Hopefully this will inspire more people to write up and post their own failures, since they can move us forward just as much as the successes.
*A few commenters have rightly noted that given the lack of overlap between CLIFFORD and Marcel decline candidates there is still a chance of something interesting there. I still think the “failure” story is worth telling, but I will be looking into what use we might still be able to pull out of CLIFFORD in the near future.
Bill leads Predictive Modeling and Data Science consulting at Gallup. In his free time, he writes for The Hardball Times, speaks about baseball research and analytics, has consulted for a Major League Baseball team, and has appeared on MLB Network's Clubhouse Confidential as well as several MLB-produced documentaries. He is also the creator of the baseballr package for the R programming language. Along with Jeff Zimmerman, he won the 2013 SABR Analytics Research Award for Contemporary Analysis. Follow him on Twitter @BillPetti.