You Should Trust the Projections
Here on FanGraphs, a lot of the data we present is built around future projections. We host both pre-season and in-season forecasts from multiple systems, and these forecasts form the basis for many of our tools, including our Playoff Odds model. It’s safe to say that we like forecasts.
But I know not everyone reading this post trusts the projection systems, especially when they are forecasting things that we don’t want to be true. It’s natural to want to discard prior information when it doesn’t align with what we want to believe, and often, the forecasts serve as a wet blanket alternative to enthusiasm and excitement. What’s to like about a system that takes the fun out of breakout performances, or tells us that we need to be more patient with the guy who just looks terrible every time we watch him?
And it’s not like projection systems are infallible. Those forecasts never saw Cliff Lee coming, and they certainly didn’t anticipate Jose Bautista was going to go from career scrub to superstar at the age of 29. There are plenty of examples where players made dramatic, unexpected turns in their career arc, and using their track record to project their future would have been wrong. The fact that there are players who have radically changed their performance base allows us to extrapolate that possibility to every player who is performing in a way inconsistent with their projection.
But we just shouldn’t do it. Projections might be fun-sucking, soulless algorithms that are incapable of picking up on adjustments that players can and do make, but the reality is that these forecasts do a pretty good job of predicting the future, even for players whose current performances and projections don’t line up at all.
Mitchel Lichtman has done the heavy lifting here, and has two great posts up at his blog on this topic. I’m going to borrow liberally from the posts because the conclusions are important, but don’t short yourself and just read these excerpts; go read the entirety of both posts. It’s worth your time.
Let’s start with his post on hitters.
I have a database of my own proprietary projections on a month-by-month basis for 2007-2013. So, for example, 2 months into the 2013 season, I have a season-to-date projection for all players. It incorporates their 2009-2012 performance, including AA and AAA, as well as their 2-month performance (again, including the minor leagues) so far in 2013. These projections are park and context neutral.
We can then compare the projections with both their season-to-date performance (also context-neutral) and their rest-of-season performance in order to see whether, for example, a player who is projected at .350 even though he has hit .290 after 2 months will perform any differently in the last 4 months of the season than another player who is also projected at .350 but who has hit .410 after 2 months. We can do the same thing after one month (looking at the next 5 months of performance) or 5 months (looking at the final month performance). The results of this analysis should suggest to us whether we would be better off with Butler for the remainder of the season or with Gomez, or with Hosmer or Morse.
I took all players in 2007-2013 whose projection was at least 40 points less than their actual wOBA after one month into the season. They had to have had at least 50 PA. There were 116 such players, or around 20% of all qualified players. Their collective projected wOBA was .341 and they were hitting .412 after one month with an average of 111 PA per player. For the remainder of the season, in a total of 12,922 PA, or 494 PA per player, they hit .346, or 5 points better than their projection, but 66 points worse than their season-to-date performance.
Here, we see some evidence that perhaps the rest-of-season forecast is underweighting the recent performance relative to past track record, but it’s doing so to the tune of five points of wOBA, which is not much at all. What about guys on the other end of the spectrum? How do the projections work for guys who look totally lost at the plate?
There were 92 such players and they averaged 110 PA during the first month with a .277 wOBA. Their projection after 1 month was .342, slightly higher than the first group. Interestingly, they only averaged 464 PA for the remainder of the season, 30 PA less than the “hot” group, even though they were equivalently projected, suggesting that managers were benching more of the “cold” players or moving them down in the batting order. How did they hit for the remainder of the season? .343, or almost exactly equal to their projection.
The optimistic “don’t worry about it, it’s just a slump” forecast basically nailed the future performance of the guys who looked the worst at the plate. When a guy is 70 points under his wOBA forecast, it’s easy to think that maybe he’s dealing with an injury or some off-the-field issue that the projection system isn’t accounting for, or maybe pitchers are exploiting a flaw that they hadn’t found previously, but in the long-term, these guys just went right back to being good hitters.
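To make the grouping concrete, here is a minimal Python sketch of the kind of bucketing described above: keep hitters with at least 50 PA, label them "hot" or "cold" when their season-to-date wOBA sits 40 or more points away from their projection, then compare PA-weighted averages for each group. The player records, the field names, and the choice to weight by PA are all assumptions for illustration. This is not MGL's code or data.

```python
from dataclasses import dataclass


@dataclass
class HitterSplit:
    """One hypothetical hitter. wOBA values are in points (.341 is 341)."""
    name: str
    projected: int  # updated projection after the early sample
    to_date: int    # season-to-date wOBA
    early_pa: int   # PA in the early sample
    rest: int       # rest-of-season wOBA
    rest_pa: int    # PA over the rest of the season


def classify(split, min_pa=50, gap=40):
    """Return 'hot', 'cold', or None using the thresholds quoted in the post."""
    if split.early_pa < min_pa:
        return None
    diff = split.to_date - split.projected
    if diff >= gap:
        return "hot"
    if diff <= -gap:
        return "cold"
    return None


def weighted_mean(group, value_field, weight_field):
    """PA-weighted average of one field (the weighting is an assumption)."""
    total = sum(getattr(s, weight_field) for s in group)
    return sum(getattr(s, value_field) * getattr(s, weight_field) for s in group) / total


def summarize(splits):
    """Average projection, season-to-date, and rest-of-season wOBA per group."""
    summary = {}
    for label in ("hot", "cold"):
        group = [s for s in splits if classify(s) == label]
        if not group:
            continue
        summary[label] = {
            "players": len(group),
            "projected": weighted_mean(group, "projected", "rest_pa"),
            "to_date": weighted_mean(group, "to_date", "early_pa"),
            "rest": weighted_mean(group, "rest", "rest_pa"),
        }
    return summary


# Made-up records, purely to exercise the functions above.
SAMPLE = [
    HitterSplit("Hitter A", 340, 420, 110, 350, 500),
    HitterSplit("Hitter B", 350, 400, 100, 340, 500),
    HitterSplit("Hitter C", 345, 280, 105, 345, 400),
    HitterSplit("Hitter D", 330, 340, 120, 335, 450),  # within 40 points
    HitterSplit("Hitter E", 320, 400, 30, 330, 200),   # too few PA
]

summary = summarize(SAMPLE)
for label, row in summary.items():
    print(label, row)
```

The question the study asks is then simply whether each group's "rest" number lands closer to "projected" or to "to_date".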
But maybe you don’t care about a one-month slump or hot streak. Maybe once it gets to two or three months, you don’t consider it randomness anymore, and now you think the forecasts are definitely missing something. Here’s some more data.
About halfway into the season, around 9% of all qualified (50 PA per month) players were hitting at least 40 points below their projections in an average of 271 PA. Their collective projection was .334 and their actual performance after 3 months and 271 PA was .283. Basically, these guys, despite being supposed league-average full-time players, stunk for 3 solid months. Surely, they would stink, or at least not be up to “par,” for the rest of the season. After all, wOBA at least starts to “stabilize” after almost 300 PA, right?
Well, these guys, just like the “cold” players after one month, hit .335 for the remainder of the season, 1 point better than their projection. So after 1 month or 3 months, their season-to-date performance tells us nothing that our up-to-date projection doesn’t tell us. A player is expected to perform at his projected level regardless of his current season performance after 3 months, at least for the “cold” players. What about the “hot” ones, you know, the ones who may be having a breakout season?
There were also about 9% of all qualified players who were having a “hot” first half. Their collective projection was .339, and their average performance was .391 after 275 PA. How did they hit the remainder of the season? .346, 7 points better than their projection and 45 points worse than their actual performance.
Again, we see some evidence that the forecasts are not putting enough weight on the recent performance, but it’s a half dozen points of wOBA, and the forecasts are much closer to reality than thinking that the current performance is their new level of talent.
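One rough way to put a number on "not enough weight" is to ask what share of the gap between season-to-date wOBA and the projection actually carried over to the rest of the season. The sketch below uses only the group averages quoted above. The carryover ratio itself is a back-of-the-envelope measure introduced here for illustration, not something MGL computes in his posts.

```python
def carryover(projected, to_date, rest):
    """Share of the (to_date - projected) gap that showed up the rest of the way.

    All inputs are wOBA in points (.341 is 341). A value of 0 means the
    projection was exactly right; 1 means the early performance was.
    """
    return (rest - projected) / (to_date - projected)


# (projection, season-to-date, rest-of-season), taken from the quoted excerpts.
HITTER_GROUPS = {
    "hot after 1 month": (341, 412, 346),
    "cold after 1 month": (342, 277, 343),
    "cold after 3 months": (334, 283, 335),
    "hot after 3 months": (339, 391, 346),
}

for label, (projected, to_date, rest) in HITTER_GROUPS.items():
    share = carryover(projected, to_date, rest)
    print(f"{label}: {share:+.1%} of the gap carried over")
```

For the hot groups this works out to roughly 7% after one month and 13% after three. For the cold groups it is slightly negative, meaning those hitters finished a point above their projections rather than below them.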
But that’s hitters. What about pitchers? Pitchers can add new pitches, tweak their mechanics, move to the other side of the mound, or change arm-angles. They can add and lose velocity. They can learn a new grip on a pitch. There are so many additional variables that forecasting pitchers should be much more difficult, and we’ll need to lean more heavily on current performance rather than trusting the forecasts to the same degree that we trust them for hitters, right?
After one month, there were 256 pitchers, or around 1/3 of all qualified pitchers (at least 50 TBF), who pitched terribly, to the tune of a normalized ERA (NERA) of 5.80 (league average is defined as 4.00). I included all pitchers whose NERA was at least 1/2 run worse than their projection. What was their projection after that poor first month? 4.08. How did they pitch over the next 5 months? 4.10. They faced 531 more batters over the last 5 months of the season.
What about the “hot” pitchers? They were projected after one month at 3.86 and they pitched at 2.56 for that first month. Their performance over the next 5 months was 3.85. So for the “hot” and “cold” pitchers after one month, their updated projection accurately told us what to expect for the remainder of the season and their performance to-date was irrelevant.
MGL’s “NERA” metric is component-based, so you’re not just seeing the vagaries of team defense or ball in play luck evening out over the course of the season. You’re looking at guys who were legitimately pitching well or poorly, the Aaron Harangs and Marco Estradas of the world. And as he notes, there is even less evidence that in-season performance tells us something about pitchers who get off to a one-month start that differs from what the forecasts would have expected.
What about if we break it out by pitcher-type, based on what we know about pitchers already?
In fact, if we look at pitchers who had good projections after one month and divide those into two groups: One that pitches terribly for the first month, and one that pitches brilliantly for the first month, here is what we get:
Good pitchers who were cold for 1 month
First month: 5.38
Projection after that month: 3.79
Performance over the last 5 months: 3.75
Good pitchers who were hot for 1 month
First month: 2.49
Projection after that month: 3.78
Performance over the last 5 months: 3.78
So, and this is critical, one month into the season if you are projected to pitch above average, at, say 3.78, it makes no difference whether you have pitched great or terribly thus far. You are going to pitch at exactly your projection for the remainder of the season!
Yet the cold group faced 587 more batters and the hot group 630. Managers again are putting too much emphasis on those first-month stats.
What if you are projected after one month as a mediocre pitcher but you have pitched brilliantly or poorly over the first month?
Bad pitchers who were cold for 1 month
First month: 6.24
Projection after that month: 4.39
Performance over the last 5 months: 4.40
Bad pitchers who were hot for 1 month
First month: 3.06
Projection after that month: 4.39
Performance over the last 5 months: 4.47
Same thing. It makes no difference whether a poor or mediocre pitcher had pitched well or poorly over the first month of the season. If you want to know how he is likely to pitch for the remainder of the season, simply look at his projection and ignore the first month. Those stats give you no more useful information.
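The four pitcher groups can be lined up side by side to see how far each candidate predictor missed. This is a small sketch using only the NERA figures listed above, stored in hundredths of a run to keep the arithmetic exact. The variable and function names are made up for illustration.

```python
# (first month, projection after that month, last 5 months),
# NERA in hundredths of a run, copied from the four groups above.
PITCHER_GROUPS = {
    "good, cold": (538, 379, 375),
    "good, hot": (249, 378, 378),
    "bad, cold": (624, 439, 440),
    "bad, hot": (306, 439, 447),
}


def misses(first_month, projection, rest):
    """Return (projection miss, first-month miss) in runs of NERA."""
    return abs(rest - projection) / 100, abs(rest - first_month) / 100


for label, row in PITCHER_GROUPS.items():
    projection_miss, first_month_miss = misses(*row)
    print(f"{label}: projection off by {projection_miss:.2f}, "
          f"first month off by {first_month_miss:.2f}")
```

In every group the projection misses by less than a tenth of a run, while the first-month line misses by more than a full run.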
MGL goes through and repeats the process for larger samples for pitchers as well, and finds the same thing. In-season forecasts predict future performance extremely well. Even though pitchers can make all kinds of tweaks, the projections handle the outliers better than treating them as outliers and assuming the forecasts just missed something. Even at four or five month samples, the forecasts beat season-to-date performance.
This doesn’t mean the forecasts are correct 100% of the time, of course. There are exceptions. Cliff Lee and Jose Bautista do exist. But the existence of exceptions doesn’t mean that we can use seasonal performance to identify the next Cliff Lee or Jose Bautista. If you’re setting out to find the next Lee or Bautista, you have to use something other than seasonal performance in order to identify which guys are “for real” and which guys aren’t, because the seasonal numbers won’t help you find the true exceptions to the rule. If you’re going to assume that a forecast for a player is no longer valid, you need something other than performance-deviating-from-that-forecast to support your argument.
When a true breakout like that happens, the forecasts will be slow to identify it, and will lag months behind until they finally weight the new data heavily and decide “hey, this guy is good now.” But the alternative is reacting far too quickly to seasonal performance, changing your opinions like Tony La Russa changes relievers. Without perfect information, we’re going to be wrong on some guys. The evidence suggests the conservative path, leaning almost entirely on forecasts and putting little weight on seasonal performance, is the one that is wrong the least.
Dave is the Managing Editor of FanGraphs.
Interesting. Just a thought: how can we automatically blame the managers for giving the cold-starters fewer innings and at-bats? Maybe some people in the cold-starting group actually WERE dealing with injuries that contributed to their cold starts, and therefore were more likely to end up on the DL. For example, this year something didn’t seem right about Fielder, and it turned out injuries were behind his cold start. Same with others.
Yes, that is true to some extent. But surely managers ARE giving cold hitters and pitchers less playing time and the hot ones more playing time solely based on their stats. After all, virtually ALL managers believe in hot and cold players (just based on the stats) and NO manager looks at player projections.
We are sure Maddon isn’t looking at player projections? (just for an example) I wouldn’t expect ’em to ‘fess up to it, for a couple of reasons.
Some changes in performance are real; some are luck. If you were a manager and couldn’t tell whether a large change in performance was real or lucky, which way would you be biased? I would probably give the players the benefit of the doubt since the alternative would mean not rewarding or punishing performance. The players would have no incentive to improve if a manager relied exclusively on projections.
Excellent insight, thank you. (interjecting that players would still have long-run incentives, but we human types respond much more energetically to shorter-term ones)
The projections always hate the A’s, and I think a big part of the reason why is that the projection system doesn’t know how to properly evaluate what the team does with platoons. I think weighting the data toward the stats accumulated since they started batting in the platoon might be relevant.