Playing the Regression Game the Right Way

July 9, 2014

Speculation is part of the fun of watching and discussing sports. “Seriously, if he can keep hitting like that and the other guy gets his act together at the plate, this offense is really going to be fun to watch in the second half.” This sort of casual speculation is not incompatible with trusting the projections. The line between speculation and analysis-based projection may not always be clear, though, and there are cases when it can be crossed to fit one’s own views.

It is all part of what I call the “Regression Game.” Okay, maybe that is not a great name, but it beats something like “The Process of Filtering Through Projections And Determining Collective True Talent.” Whatever the proper name, one can find it in various forms, particularly in the middle of the season. I am not saying one should never try to take into account various scenarios, that is, to play the game. Just like baseball, though, we need to make sure we are being consistent and playing the regression game the Right Way.

Projections are simply estimates of true talent. Projections, while they can be adjusted by recent performance during the season, cannot take everything into account. Most understand that some players are going to play better than their projections and some are going to play worse. Projections are rarely going to “nail” a player’s performance. One reason is uncertainty regarding the player’s true talent level, and another is random variation. These are well-known limitations of projections. Pointing them out is simply a way of noting that it is fine to try to fine-tune projections, provided one has good reasons. Those reasons need not just be numbers-based. Scouting opinions are obviously a big part of baseball, and although professional scouts are great, this is not to say amateur scouting opinions are necessarily invalid, either. After all, if there can be amateur saberists who turn out to be good enough to work for teams, the same can obviously be true of amateur scouts.

That is all a prelude to saying that one can have reasons to point to certain players as likely to be better or worse in the future than one might normally glean from public projections, or the last few seasons of performance, or their peripherals, or whatever else one might take as an indicator of true talent. Being more specific is a good thing. But while applying such techniques selectively might lead to interesting hypothetical scenarios, sometimes it is done in a manner that adds very little, analytically.

To illustrate the point, let’s look at the top three teams in the American League Central: Detroit, Kansas City, and Cleveland. Kansas City is four and a half games back at the moment, and Cleveland is six and a half, and their projected odds of catching Detroit are pretty low. Still, it could happen; we have seen crazier runs, and the sort of writing I am thinking of, which relies on selective regressions and projections, is often not all that concerned with these odds. It is often looking at “what ifs.”

For each of these three teams, I am going to present relatively brief write-ups. These are not my own personal analyses. They are simply examples of what someone might reasonably think based on recent performance, projections, peripherals, or something of the sort. Although the style of the write-ups has been “inspired” by things I have read, they should not be taken to reflect anyone’s work in particular. I think they are close enough to reality to illustrate the point. For each of the three teams, I will give a positive and negative take. The issue is not whether these particular tidbits are right or wrong, so try not to get hung up on that sort of thing. The idea is to see different ways the regression game can be played when combined in certain ways.

Tigers, Positive Version: “Victor Martinez is having the best year of his career. Miguel Cabrera is hitting great, too, but he has not yet approached the level of the previous three seasons, so it is not unreasonable to think he could be even better as the season continues. J.D. Martinez is showing that he is a major leaguer, perhaps a very good one. Torii Hunter may be old, but he has to be better than this given what he has done the last few seasons, and Austin Jackson will come around; he has never been this bad.”

“As for the pitching, Rick Porcello is finally showing why he was such a highly touted pitching prospect and Anibal Sanchez has been excellent since returning from the DL. As good as Max Scherzer has been, he could be even better if his ERA catches up to his FIP like it did last year. There is no getting around Justin Verlander’s lackluster performance so far this season, but he was one of the best pitchers in baseball from 2009 to 2012, and even last season’s performance was only bad compared to that lofty standard. It was still really good. I’ll stick with Verlander’s long history of greatness over what is probably just a half-season slump. Joe Nathan has been bad this year, but it is still just thirty-something innings, and I trust his long history as a dominating closer.”

Tigers, Negative Version: “Victor Martinez is hitting so far over his head it is not even funny; he has never hit for this kind of power before. Miguel Cabrera is still great, but he was not going to maintain a .400+ wOBA forever, and at 31, he might be merely a great hitter rather than an unstoppable one. Does anyone really think that J.D. Martinez is really the answer? The Tigers lucked out there while waiting for Andy Dirks to come back… eventually. Torii Hunter looks like he is toast. As for Austin Jackson, without his power and BABIP luck, his mediocre plate discipline may finally be catching up with him.”

“The rotation is still good, but it is not what it once was. Max Scherzer is still good, but last year he had a BABIP advantage that he has not really had over his career, so expecting his ERA to catch up with his FIP probably is not realistic. Anibal Sanchez has looked like himself since coming back from injury, but shades of his old health issues may come back to haunt the Tigers. Rick Porcello is finally having a good year, but after four previously disappointing seasons, one half of a good season is too soon to say he has figured out how to have a low BABIP. As for Justin Verlander — he looked like he was on the decline last year, and this year the strikeout and walk rates indicate that he may very well be done as anything more than a mid-rotation starter. Joe Nathan looks like another lesson in why teams should not blow big cash on so-called proven closers. He might just be finished.”

Kansas City, Positive Version: “The offense has been terrible, no doubt, but on the bright side, it can’t get much worse. In fact, with guys like Lorenzo Cain and Alcides Escobar having nice years at the plate, there is definite hope. Eric Hosmer and Billy Butler have to be better than this based on their past performances. That is even true of Mike Moustakas, who has been heating up a bit lately. Omar Infante is probably better than this, but even if he is not, Christian Colon might just be the answer at second base.”

“Even the pitching, which has more than held its own this season, could get better. Yordano Ventura has lived up to the hype, and Danny Duffy’s sub-three ERA has been impressive as well. Jeremy Guthrie and Jason Vargas continue to show that they can outperform their peripherals. James Shields has not pitched very well recently, but he is probably closer to the previous two or three seasons in true talent than his numbers so far in 2014.”

Kansas City, Negative Version: “The offense probably can’t be quite this bad (or can it?), but how much better can it really get? Billy Butler’s lack of athleticism might be catching up with him, and what little power he has displayed in the past is gone. Other than short stretches, Mike Moustakas and Eric Hosmer have both looked lost at the plate this year, with Moose being sent down and Hosmer looking like he needs to be. Sure, Lorenzo Cain has good numbers so far this year, but his plate discipline is still poor, and do you really think he became a .390-plus BABIP hitter overnight? Alcides Escobar has had runs like this before, but he still has a career 74 wRC+, so don’t get too excited. Omar Infante is 32, so at this point he may just be hitting like the part-time middle infielder he really is. As for Christian Colon, try not to jump on the bandwagon before looking at his minor league numbers and comparing them to Johnny Giavotella’s.”

“Obviously, the Royals’ pitching has been good, but it is far from clear how the rotation can keep this up. Yordano Ventura has been everything Kansas City hoped, but he is probably getting a bit lucky with runners on base. Moreover, Ventura never threw more than 135 innings in a single minor league season, so there are legitimate questions about how he might pitch down the stretch. Given Danny Duffy’s recent past, his ability to stay in the rotation for the rest of the season has to be questioned, and if you leave aside his numbers as a reliever this year and just look at his numbers as a starter, they are pretty mediocre. Guthrie and Vargas might have a history of outperforming their peripherals, but neither has really done it to quite this extent, and given their ages, one or both could be in for a big reality check in the near future. As for Shields, as good as he has been in the past, something really seems to be wrong. His strikeout rate is down, and that is rarely a good sign.”

Cleveland, Positive Version: “Michael Brantley and Lonnie Chisenhall are both having breakout seasons. Carlos Santana got off to a horrible start, but he has been coming around, and his line projects to improve as the season continues. Jason Kipnis is just rounding into form after coming back from injury. Nick Swisher is clearly aging, but there is almost no chance, given his past history, that he is going to finish this year with anything like his current line.”

“The pitching continues to be a question mark, but there are reasons to think it will get better. Corey Kluber was a surprise last year, and this year has been even better, showing that he is no flash in the pan. Justin Masterson’s ERA looks bad, but even with his control issues, his FIP indicates that he is likely going to pitch better than this going forward. Josh Tomlin has pitched well enough to be a mid-rotation starter this year, and even if Trevor Bauer is not blowing people away this year, he has pitched well enough to solidify his spot in the rotation.”

Cleveland, Negative Version: “Does anyone really buy into Michael Brantley and Lonnie Chisenhall after one half of a season? Neither has done anything like this before in their major-league careers. Yeah, Santana was just in a slump to start the year, but expectations for how much he will rebound should be lowered, since the data from the beginning of the year still counts. Jason Kipnis might get a little better, but if you look at this season’s line, it looks a lot like his 2012 line, so 2013 is likely just a career year. Nick Swisher looks like he is just about finished.”

“Corey Kluber has clearly proven he belongs, but come on, his true talent might be something like he showed in 2013, but there is nothing in his past to indicate he is a true-talent, sub-3 ERA pitcher. He is going to come back to earth. Justin Masterson has often had trouble getting his ERA to live up to his FIP; this is nothing completely new. This year, his walk rate is reaching the point of being scary. Josh Tomlin has been pretty impressive so far this year, but there is almost nothing in his history prior to this season that makes one think he can continue to strike out hitters at this rate. Trevor Bauer is holding his own, but just barely, and it is easy to see him being worse, perhaps much worse, as the season wears on.”

Try to leave aside your own thoughts on what you agree with or not here, which is not the issue. Nothing in the different takes is completely out to lunch, or at least not unusual for the sort of thing you might read in a mid-season report card. Notice two things. First, and more obviously, one could combine the positive take on any one team with the negative takes on the other two and get a loose scenario in which Detroit runs away with the division, or Kansas City or Cleveland makes a run at it.

Second, and closer to the point, is how each “take” mixes ways of evaluating a player. For some players, the current season is taken as their likely performance going forward for the rest of the season. Other players are assumed to be likely to play better or worse as the season goes on because of their past history, peripherals, or other reasons. The point here is not to decide which way is best, but to note the inconsistency. It may seem obvious when laid out like this, but when reading mid-season report cards, it happens all too often.
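To make the inconsistency concrete, here is a minimal sketch in Python. It is not how any actual projection system works. The two hitters, their numbers, and the weighting constant `k` are all made up for illustration, and the simple weighted blend of season-to-date performance and a prior projection is just an assumed stand-in for "regression." The consistent version applies the same rule to every player; the selective version keeps whichever number looks better for each player, the way a positive take does.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    name: str          # hypothetical player label
    pa: int            # plate appearances so far this season
    observed: float    # season-to-date rate stat (think wOBA)
    projected: float   # prior projection for the same stat


def consistent_estimate(h: Hitter, k: float = 400.0) -> float:
    """Blend observed and projected with the SAME rule for every player.

    k is an assumed constant: the number of plate appearances at which
    the current season and the prior projection get equal weight.
    """
    return (h.pa * h.observed + k * h.projected) / (h.pa + k)


def selective_estimate(h: Hitter) -> float:
    """The optimist's version: trust whichever number is more flattering."""
    return max(h.observed, h.projected)


def lineup_average(hitters, estimate) -> float:
    """Unweighted average of an estimate across a group of hitters."""
    return sum(estimate(h) for h in hitters) / len(hitters)


# Made-up hitters: one running hot, one running cold, same projection.
hitters = [
    Hitter("Hot Hitter", pa=350, observed=0.400, projected=0.340),
    Hitter("Cold Hitter", pa=350, observed=0.280, projected=0.340),
]

consistent = lineup_average(hitters, consistent_estimate)
selective = lineup_average(hitters, selective_estimate)

print(f"consistent rule: {consistent:.3f}")
print(f"selective rule:  {selective:.3f}")
```

With these made-up inputs, the consistent rule pulls the hot hitter down and the cold hitter up by the same amount, while the selective rule only ever moves the estimate in the flattering direction. Swapping `max` for `min` gives the matching negative take.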

Such reports are usually couched as hypothetical scenarios, and they are true enough as such. For example, if Cleveland’s pitching can hold on, Chisenhall and Brantley keep hitting like this, and Kipnis, Swisher, and Santana’s lines improve, while Detroit and Kansas City do not improve or even play worse as in the negative scenarios for those teams, then Cleveland has a good shot. One can pick whichever combination one wants. It is fun to talk about, but does it really add all that much to an analytical discussion? It seems to be just another way of saying something like this: “if Team A plays better, and Teams B and C play worse, then Team A will close the gap.” It is true, but in a somewhat trivial sense.

This does not mean that we should just be boring and always assume that some projection or set of them is the only thing that matters. Projection systems cannot “see” everything. One might have very good reason to think that, say, Rick Porcello has finally turned the corner, that James Shields is not what he used to be, or that Lonnie Chisenhall is now a superstar hitter in ways that are not taken into account by projections. Those reasons might be statistically based or not. But to add anything of analytical interest, the reasons need to be given. And such reasoning needs to be consistent, or the differing application of reasons needs to be explained. If one player’s past performance is ignored in favor of the current, we need a reason why that is appropriate in his specific case, especially when past performance is taken to be relevant to other players in the same breath (or paragraph).

This does not mean hypothetical speculation should be forbidden. But we should recognize its limits. Projections, too, have limits, but at least they are consistent in their methodology. But be wary of the temptation to confuse “if Team X reverts to the mean in those things they have done poorly, and does not in the areas in which they are overperforming, they will be better” with substantial analysis. I am sure I have done it, too, but it is something I want to avoid. Just as baseball players should strive to play their game properly, we need to play the regression game the right way.


jim S.

10 years ago

So, most people don’t know what they are talking about. We knew that already. Just keep in mind that most of those we are reading or listening to are being paid to do what they are doing. They have to write SOMETHING, right?


KK-Swizzle

Reply to jim S.

Yes, but Matt’s point is that there are better and worse ways to do it, and that writers should strive for improving the quality of their prognosticating.
