Regression and Albert Pujols’ Slump
by Piper Slowinski
April 13, 2011

If you haven’t taken a statistics class, regression can be rather tricky to grasp at first. It’s a word you’ll hear bandied about frequently on sabermetrically inclined websites, especially at the beginning of the season:

“Oh, Albert Pujols is hitting .200, but it’s early so he’s bound to regress.”

“Nick Hundley is slugging over .700, but that’s sure to regress.”

This seems like a straightforward concept on the surface: good players who are underperforming are bound to improve, and over-performing scrubs will eventually cool down. But it leaves out an important piece of information: regress to what level?

The common mistake is to assume that if a good player has been underperforming, their “regression” will consist of hitting .400 and bringing their overall line up to the level of their preseason projections. I like to call this the “overcorrection fallacy”: the belief that players will somehow compensate for their hot or cold performances by reverting to the other extreme going forward. While that may happen in select instances, it’s not what “regression” actually means.

Instead, when someone says a player is likely to regress, they mean that the player should be expected to perform closer to their true talent level going forward. This makes sense when we put it on a personal level: if you normally make 5 out of 10 free throws, yet go on a cold streak and make 0 out of 10, does that mean you’re going to sink 10 out of 10 free throws your next time out to compensate? No, of course not: you’re much more likely to perform close to your true talent level and make about 5 out of 10 free throws.

So if we expected Albert Pujols to hit .320 before the season began, then we should expect him to hit at a .320 level going forward. His full-season batting average will then end up slightly below .320, since this slump is “in the bank” and can’t be changed.
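To make the “in the bank” arithmetic concrete, here is a minimal sketch in Python. The playing-time numbers (50 at-bats so far, 550 remaining) are hypothetical and chosen only for illustration, and both function names are my own. The sketch compares what a .320 hitter’s season line looks like after a .200 start with what he would have to hit the rest of the way to “make up” for the slump.

```python
def full_season_average(hits_so_far, at_bats_so_far, true_talent_avg, remaining_at_bats):
    """Combine banked results with expected rest-of-season production."""
    expected_future_hits = true_talent_avg * remaining_at_bats
    total_at_bats = at_bats_so_far + remaining_at_bats
    return (hits_so_far + expected_future_hits) / total_at_bats


def average_needed_to_reach(target_avg, hits_so_far, at_bats_so_far, remaining_at_bats):
    """Rest-of-season average required to finish the year at target_avg."""
    total_at_bats = at_bats_so_far + remaining_at_bats
    return (target_avg * total_at_bats - hits_so_far) / remaining_at_bats


# Hypothetical playing time: 10 hits in 50 at-bats (.200) so far, 550 at-bats left.
final_avg = full_season_average(10, 50, 0.320, 550)
needed = average_needed_to_reach(0.320, 10, 50, 550)

print(f"Expected full-season average: {final_avg:.3f}")
print(f"Average needed to finish at .320: {needed:.3f}")
```

Under these assumed numbers, regression means a season that ends around .310. Finishing at .320 would require hitting about .331 the rest of the way, which is above his true talent level, and that is the overcorrection fallacy in miniature.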

But wait a second: we can’t just ignore whether a player is slumping or streaking, right? What if their current performance, as outrageous as it may be, is rooted in some sort of change in skill level? In other words, if you sink 0 of 10 free throws one day, should we expect you to perform at your normal average (5 of 10), or slightly below that (4 of 10)?

There’s no simple answer to this question; it depends on a lot of different variables. How long was the streak? How old is the player? Are there reasons to believe their performance could be for real (a young player breaking out, an old player declining)? This is where the ZiPS in-season projections come in handy, as they take a player’s current production over the season and include it when projecting the player’s true talent level at the moment.

Albert Pujols has definitely been slumping so far this season; he’s posted a .245 wOBA and a .067 ISO, and has walked in only 8% of his plate appearances. Cardinals fans are no doubt concerned, as the Cards have already been hit with bad luck this season with Adam Wainwright’s season-ending injury and Matt Holliday’s freak appendectomy. If Pujols isn’t hitting, their overall offense becomes a lot less productive and their chances of making the playoffs decrease even further.

But thankfully, Pujols has a much longer history of being awesome, so ZiPS has barely adjusted Pujols’s talent level as a result of this slump:

                  BB%     ISO     wOBA
Preseason Proj.   14.9%   0.278   0.415
Updated Proj.     14.8%   0.270   0.411

If Pujols performs at his updated projection level for the remainder of the season, this initial slump would drag his overall wOBA down to .398, which would be the first time he’s ever posted a wOBA below .400. However, Cardinals fans should be encouraged that Pujols has so far shown no degradation of his batting eye: plate discipline statistics normalize quickly (around 50 PA) and Pujols is sitting right around his career norms in every category.
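
The .398 full-season figure can be roughly reproduced as a playing-time-weighted average of the slump and the updated projection. This is only a sketch: the plate appearance counts (50 so far, 600 remaining) are my assumptions rather than numbers from ZiPS, and weighting wOBA by plate appearances is an approximation of how the stat actually aggregates. The function name `blended_woba` is made up for this example.

```python
def blended_woba(woba_so_far, pa_so_far, projected_woba, remaining_pa):
    """Approximate full-season wOBA as a plate-appearance-weighted average."""
    total_pa = pa_so_far + remaining_pa
    return (woba_so_far * pa_so_far + projected_woba * remaining_pa) / total_pa


# .245 wOBA so far and .411 updated projection come from the article;
# the 50 and 600 plate appearance counts are assumed for illustration.
overall = blended_woba(0.245, 50, 0.411, 600)
print(f"Approximate full-season wOBA: {overall:.3f}")
```

With these assumed plate appearance counts the blend lands at roughly .398. Changing the split between banked and remaining plate appearances shows how little a two-week slump can move a full-season line.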

Separating slumps or hot streaks from talent level can be difficult with young players who have little exposure at the major league level, but the in-season projections work very well for the majority of players. So if you see a player streaking or underperforming early this season, take it with a grain of salt: they’ll likely regress, and now you’ll know how much.

For more on regression, see the FanGraphs Saber Library page on Regression to the Mean.