Let’s Make Sure We’re Honest About Projections
by Jeff Sullivan
February 7, 2018

Over at Baseball Prospectus, it’s PECOTA day. Now, that means a whole lot of different things, but one of the things it means is that now BP gets to unveil its projected 2018 standings. Some years ago, I used to get extremely excited about seeing the first standings projections. Then FanGraphs decided to start hosting projections year-round. They’re always right here, and for months, that information has been based on the Steamer projection system. Pretty soon, well in advance of opening day, the ZiPS system will get folded in, to say nothing of remaining transactions. The point being, having projected standings is fun. They serve to keep the mind occupied with thoughts of baseball.

Projected standings aren’t just a toy for the public. I mean, the public ones are, but teams also have their own private projections, that might be better, or that might be basically the same. Team projections drive perceptions, and team projections drive transactions. We feel like we have a pretty good idea of which teams are situated to contend. We also feel like we have a pretty good idea of which teams are far away. Right now, in 2018, based in part on the projections and in part on what just happened a year ago, we have the sense we’re in an era of super-teams, which might be keeping the market slow. Other teams might not feel like they’re close enough to invest.

I love having access to team projections. I use them all the time for analysis and articles. But I feel like I should remind you of the limitations. This is something I could probably write every single year. I’m sure on some level you already know what I’m going to say. Projections are pretty good. They can also end up very, very far off.

I’ll get right to the point. As I’ve noted probably dozens of times, I have a spreadsheet of preseason team projections, going back to 2005.
The types of projections have changed over time, so in that sense I don’t have perfect consistency of sources, but all projections are fundamentally similar. So, 13 years of data, for 390 team-seasons. Here are actual wins, against preseason projected wins.

The linear relationship is obvious, and it had better be obvious, because, otherwise, boy, would we all have been wasting our time. It’s perfectly clear that projections can spot good teams and bad teams. But our R² is 0.36. I don’t know what mark would be “good enough,” but there’s clearly plenty of over- and under-performance.

And let’s say you’re of the opinion that it’s not fair to group everything together, since projections have gotten more advanced. Our overall R², again, is 0.36. It’s been higher than that just once over the past four seasons. The projections on my sheet actually peaked in 2007.

Looking at the overall sample, the average error has been about seven wins, with a median of six. Looking at just the last five years, the average error is still about seven wins, with a median of five. I don’t think we’re yet to the point where we can assign individual teams specific and individual error bars, but we know there are error bars, and they’re fairly long. It might be our own fault for not displaying any. That probably serves to suggest we have more confidence in our projected midpoints.

But take a look at our Steamer-projected standings. Imagine that every projection could be off by seven wins. There’s only a 12-win difference between, say, the Red Sox and the Rays. Same with the Indians and the Twins. Fifteen wins would presently separate the Dodgers from fourth place. We know that there are some obvious favorites. They’re just all more vulnerable than they appear.

I have a couple tables to get to, but first, consider that, from 2005–2017, there were 44 teams projected to win at least 90 games. Of those teams, 28 actually won at least 90 games, while 36 won at least 85.
Four teams fell apart and finished under .500: the 2005 Dodgers, the 2008 Tigers, the 2012 Red Sox, and the 2013 Angels. I’ll note that the 2015 Nationals didn’t quite finish under .500, but they did fall a dozen wins shy of their projection, after seeming like a super-team all offseason long. Things happen. That’s the whole point of baseball.

Sticking with the theme of under-performance, I looked at every team that fell short of the preseason projection by at least ten wins. This table breaks it down by year.

Under Projections By 10+ Wins

Year     Overall   Projected .500+
2005     3         2
2006     4         2
2007     1         1
2008     5         3
2009     6         3
2010     4         2
2011     6         4
2012     5         4
2013     4         3
2014     5         3
2015     8         5
2016     3         0
2017     4         4
Total    58        36

We’ve got 58 teams that meet the criteria, for an average of about four and a half per season. The worst year was 2015, when eight teams fell at least ten wins short. The best year was 2007.

Meanwhile, we’ve got 36 teams that meet the criteria, after being projected to play at least .500 baseball. That’s an average of nearly three teams per season. Only once has there been a season without such an underachiever. Just last year, there were four: the Giants, Tigers, Mets, and Blue Jays. The Giants’ projection wound up being off by a full 25 wins, which is so far the largest gap I’ve observed.

Flipping things around, we can take a similar look at over-performance. I looked at every team that exceeded its preseason projection by at least ten wins.

Beat Projections By 10+ Wins

Year     Overall   Projected sub-.500
2005     3         3
2006     3         2
2007     2         2
2008     6         4
2009     4         3
2010     5         3
2011     4         1
2012     6         3
2013     4         2
2014     2         1
2015     6         2
2016     2         2
2017     5         4
Total    52        32

We’ve got 52 teams that meet the criteria, for an average of four per season. In the worst years, for the projections, there have been six such over-achievers. There have yet to be fewer than two.

Meanwhile, we’ve got 32 teams that meet the criteria, after being projected to play sub-.500 baseball. That’s an average of about two and a half teams per season.
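The totals and per-season averages quoted for the two tables can be re-derived from the yearly counts. What follows is only a quick sketch of that arithmetic, not anything pulled from the spreadsheet itself: the counts are transcribed from the tables, the 390 team-seasons figure comes from the sample described earlier, and the names `UNDER_BY_10`, `BEAT_BY_10`, and `summarize` are made up for illustration.

```python
# Yearly counts transcribed from the two tables: (year, overall, subset).
# Subset = projected .500+ for the "under" table, projected sub-.500 for "beat".
UNDER_BY_10 = [
    (2005, 3, 2), (2006, 4, 2), (2007, 1, 1), (2008, 5, 3), (2009, 6, 3),
    (2010, 4, 2), (2011, 6, 4), (2012, 5, 4), (2013, 4, 3), (2014, 5, 3),
    (2015, 8, 5), (2016, 3, 0), (2017, 4, 4),
]
BEAT_BY_10 = [
    (2005, 3, 3), (2006, 3, 2), (2007, 2, 2), (2008, 6, 4), (2009, 4, 3),
    (2010, 5, 3), (2011, 4, 1), (2012, 6, 3), (2013, 4, 2), (2014, 2, 1),
    (2015, 6, 2), (2016, 2, 2), (2017, 5, 4),
]
TEAM_SEASONS = 390  # 13 seasons of 30 teams, as stated in the article


def summarize(rows):
    """Hypothetical helper: totals and per-season averages for one table."""
    seasons = len(rows)
    overall = sum(r[1] for r in rows)
    subset = sum(r[2] for r in rows)
    return {
        "overall_total": overall,
        "subset_total": subset,
        "overall_per_season": overall / seasons,
        "subset_per_season": subset / seasons,
        # Seasons in which the subset column is zero.
        "seasons_with_empty_subset": [r[0] for r in rows if r[2] == 0],
    }


under = summarize(UNDER_BY_10)
beat = summarize(BEAT_BY_10)

# Share of all team-seasons that missed by 10+ wins in either direction.
share = (under["overall_total"] + beat["overall_total"]) / TEAM_SEASONS

print(under)
print(beat)
print(f"10+ win misses, either direction: {share:.1%}")
```

Put the two tables together and that is 110 of 390 team-seasons, a bit more than a quarter, landing at least ten wins away from the preseason number.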
There hasn’t yet been a year without at least one such over-achiever. Just last year, there were four: the Diamondbacks, Brewers, Yankees, and Twins. The Rockies just missed being included. Have to draw a line somewhere.

Team projections serve a valuable purpose. I don’t want to stray too far from that message: they’re good, and they’re important, and they help to explain why the moves that happen happen. Projections inform the probabilities, and the teams with the strongest projections are indeed the teams most likely to keep playing in October. But for all we talk about how various teams are projected, we might not talk enough about how wide the error bars can be. How wide the error bars have been proven to be.

It can feel, sometimes, like the playoff picture is already decided in March. Like we know exactly whose Septembers will be interesting. Maybe that’s how it’ll work out this time. Based on the history, though: nope. We’re not as good at this as projected-standings pages might convey. Humans happen, humans play, and they tend to do very human things.
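For anyone who wants to run the same sort of check on their own sheet of projections, here is a rough sketch of the two measures used above. It rests on assumptions: that the R² is the squared correlation between projected and actual wins (which is what a simple straight-line fit gives you), and that "average error" means the average absolute miss. The function names are hypothetical, and the five sample rows are invented purely to exercise the code. They are not from the spreadsheet.

```python
from statistics import mean, median


def r_squared(projected, actual):
    """Squared Pearson correlation between projected and actual wins."""
    mp, ma = mean(projected), mean(actual)
    cov = sum((p - mp) * (a - ma) for p, a in zip(projected, actual))
    var_p = sum((p - mp) ** 2 for p in projected)
    var_a = sum((a - ma) ** 2 for a in actual)
    return cov * cov / (var_p * var_a)


def abs_errors(projected, actual):
    """Absolute miss, in wins, for each team-season."""
    return [abs(a - p) for p, a in zip(projected, actual)]


# Invented numbers for illustration only; NOT real projections or results.
projected = [95, 88, 81, 76, 70]
actual = [91, 97, 74, 80, 68]

errors = abs_errors(projected, actual)
r2 = r_squared(projected, actual)

print(f"R^2: {r2:.2f}")
print(f"average error: {mean(errors):.1f} wins, median: {median(errors)} wins")
```

Swap in real columns of preseason projected wins and final win totals, one pair per team-season, and the same two functions produce the headline numbers.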