I don’t think I’m all that different from most fans who glance at stats: when I see them, I automatically tend to view them as a player’s real talent. But one thing I’ve taken away from my reading of baseball analysts far more intelligent than I (granted, that’s not a very high standard) is that there’s an important distinction to be made between observed performance and true talent. Past performance should certainly inform how we estimate future performance. But it isn’t enough on its own. One of the most important tools for estimating true talent relative to observed performance and its sample size is regression to the mean. A good place to start reading with reference to the current discussion is The Book.
One bad habit many of us might get into is looking at the platoon splits of two players at the same position, one with a career wOBA of .390 vs. RHP, the other with a career wOBA of .400 vs. LHP, and thinking, “Wow, that platoon would be almost as good as Ryan Braun!” It isn’t that simple. As in most other things, regression shows us that true talent is closer to average than observed performance makes it appear. Technical explanations aside, I’ll simply summarize what is relevant for estimating platoon skills.
How much we regress depends on the variation of skill in the relevant population. The less variation there is, the more likely deviations from the mean are random occurrences. Practically speaking, left-handed hitters display more variation in platoon skill than right-handed hitters, so in estimating the platoon skills of left-handed hitters, we use less regression.* According to The Book, we regress a lefty’s observed platoon split by adding 1000 PA vs. LHP of the league-average split for left-handed hitters, and a righty’s by adding 2200 PA vs. LHP of the league-average split for right-handed hitters. This means that when hitters have fewer than 1000/2200 PA vs. LHP, we estimate their platoon skill to be closer to league average than to their observed platoon performance. In practical terms, it also means that for righties, we’re usually safe in assuming they have near-average platoon skills.
* Switch-hitters display the most platoon skill variation as a population, but that is a can of worms for another day. The Book says that after 600 career PA against LHP, one has a pretty good idea of a switch-hitter’s platoon skill.
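To make the weighting concrete, here is a minimal Python sketch of the regression step. The function and variable names are made up for illustration (they are not from The Book); the only numbers taken from the text are the 1000 and 2200 PA regression amounts, and switch-hitters are left out entirely.

```python
# Regression amounts quoted in the text: PA vs. LHP of league-average
# split that get blended with a hitter's observed split.
REGRESSION_PA = {"L": 1000, "R": 2200}


def weight_on_observed(pa_vs_lhp, bats):
    """Share of the estimate that comes from the hitter's own observed split."""
    k = REGRESSION_PA[bats]
    return pa_vs_lhp / (pa_vs_lhp + k)


def regress_split(observed, pa_vs_lhp, league_avg, bats):
    """PA-weighted blend of the observed split and the league-average split."""
    k = REGRESSION_PA[bats]
    return (observed * pa_vs_lhp + league_avg * k) / (pa_vs_lhp + k)


if __name__ == "__main__":
    # Same number of PA vs. LHP, very different trust in the observed split.
    for bats in ("L", "R"):
        print(bats, round(weight_on_observed(500, bats), 3))
```

With exactly 1000 PA vs. LHP, a lefty’s estimate sits halfway between his observed split and the league average; a righty needs 2200 PA to reach that same halfway point.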
Some concrete examples might help. For my league average, I’ve taken MLB-wide splits from 2007 to 2009 from Baseball Reference and converted them to wOBA. This is just going to be a very basic demonstration, since, for example, I wasn’t able to exclude pitchers from the splits, or remove switch-hitters, or leave out steals, weighted, and so on, but I think it will give the general idea. From 2007 to 2009, the average wOBA split for left-handed hitters was about 8.6%, and for right-handed hitters, about 6.1% (following The Book [I think], I use a percentage split to avoid potential logical absurdities and to reflect the reality that better hitters usually have larger splits).
We’ll begin with everyone’s favorite example of a “big splits” guy: Curtis Granderson. For his career, Granderson is a .358 wOBA hitter. However, while he has hit a robust .380 vs. RHP, in 685 PA versus LHP, he’s been 2009 Yuniesky Betancourt with a .270 wOBA. That’s a whopping 110 points of wOBA difference, about 30.7% of his overall wOBA in observed performance.
But remember: skill is closer to average than it appears. Regressing Granderson’s 685 PA of 30.7% against 1000 PA of league average (8.6%), that is, (.307*685+.086*1000)/(685+1000), we get an estimated platoon skill of 17.6%. “Centering” the split is a bit of a challenge, but I weighted it by the share of the player’s career PA that have come against LHP (for Granderson, about 23.7%). For Granderson’s split, then, I have +4.2% vs. RHP, and -13.4% vs. LHP. Applying this to his 2010 CHONE projection of .359 wOBA, we’d forecast his 2010 wOBA against RHP as .374, and against LHP as .311. .311 is below average, but it’s far better than .270, and given Granderson’s skill in the field, you’d be hard-pressed to find a right-handed platoon partner that would offer an overall advantage over just playing Granderson. You’d also need a pretty good right-handed bench bat in order to overcome the “pinch-hitting penalty” when hitting for Granderson.
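The Granderson arithmetic can be retraced step by step in a few lines of Python. This is only a sketch of my reading of the steps above: the helper names are invented for illustration, and the centering rule (scale each side’s adjustment by the share of PA on the other side, so the PA-weighted adjustments cancel out) is an assumption about what “weighted by PA vs. LHP” means. It carries the unrounded observed split through rather than .307, which does not change the rounded results.

```python
def observed_split(woba_vs_rhp, woba_vs_lhp, woba_overall):
    """Observed platoon split as a fraction of overall wOBA (lefty convention)."""
    return (woba_vs_rhp - woba_vs_lhp) / woba_overall


def regress(observed, pa_vs_lhp, league_avg, regression_pa):
    """Blend the observed split with regression_pa PA of league-average split."""
    return (observed * pa_vs_lhp + league_avg * regression_pa) / (pa_vs_lhp + regression_pa)


def center_lefty(split, share_vs_lhp):
    """Split the total gap into a boost vs. RHP and a penalty vs. LHP.

    Assumed rule: the two adjustments, weighted by PA share, sum to zero.
    """
    boost_vs_rhp = split * share_vs_lhp
    penalty_vs_lhp = -split * (1.0 - share_vs_lhp)
    return boost_vs_rhp, penalty_vs_lhp


# Inputs quoted in the text for Granderson.
observed = observed_split(0.380, 0.270, 0.358)
skill = regress(observed, 685, 0.086, 1000)
adj_rhp, adj_lhp = center_lefty(skill, 0.237)

projection = 0.359  # 2010 CHONE projection quoted in the text
proj_vs_rhp = projection * (1 + adj_rhp)
proj_vs_lhp = projection * (1 + adj_lhp)

print(f"estimated platoon skill: {skill:.1%}")
print(f"vs. RHP: {proj_vs_rhp:.3f}  vs. LHP: {proj_vs_lhp:.3f}")
```

Run as written, this lands on the same 17.6% skill estimate and the same .374/.311 forecast as the hand calculation.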
For a right-handed example, let’s use Ryan Garko, recently acquired by the Mariners as a platoon 1B/DH. Garko’s career wOBA is .347, .332 vs. RHP in 1229 PA, and .382 vs. LHP in 485 PA — a 14.4% difference. But he’s a righty, so we regress toward 2200 PA of the average (6.1%): (.144*485+.0611*2200)/(485+2200) for an estimated platoon skill of 7.6%. Using the CHONE projection of .345 wOBA, we’d estimate Garko to be a .338 hitter versus RHP, and .364 versus LHP. That’s a good hitter versus lefties, and while the .338 isn’t great for a 1B/DH, it isn’t as if he’s helpless against RHP.
Before I call it a post, I thought it would be interesting to quickly estimate the platoon skills of two players who have “reverse” splits for their careers.
Right-handed hitting Matt Holliday has a career wOBA of .400, but has hit .402 vs. RHP (2793 PA) and .377 vs. LHP (845 PA), a -6.3% split (negative indicating “reverse”). After regression, we get a 2.7% estimated platoon skill. Given CHONE’s .389 wOBA forecast for Holliday, we’d estimate his skill as .387 wOBA vs. RHP, and .397 vs. LHP. Not quite a “reverse,” but you don’t really want to “burn” a ROOGY against Holliday, either.
Colorado’s Ian Stewart has a career .337 wOBA, .334 vs. RHP (655 PA) and .346 vs. LHP, a -3.6% split. After regression, it comes to a 6.7% split. Given CHONE’s .358 wOBA forecast, we’d expect Stewart to hit around .363 vs. RHP and .339 vs. LHP, a nice split for a lefty, but not a reverse one.
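A reverse split only changes the sign of the observed number; the procedure is the same. Here is a hedged Python sketch that handles both handedness cases and reruns Holliday. The function name, the dictionaries, and the sign convention (wOBA against opposite-handed pitching minus wOBA against same-handed pitching) are assumptions made for this illustration, and the PA share vs. LHP is computed from the career PA quoted above. Stewart is not rerun because his PA vs. LHP are not given here.

```python
# Figures quoted in the text: regression amounts and 2007-2009 average splits.
REGRESSION_PA = {"L": 1000, "R": 2200}
LEAGUE_SPLIT = {"L": 0.086, "R": 0.061}


def estimate_splits(bats, overall, vs_rhp, pa_rhp, vs_lhp, pa_lhp, projection):
    """Return (platoon skill, projected wOBA vs. RHP, projected wOBA vs. LHP)."""
    # Positive split = better against opposite-handed pitching (the usual case).
    if bats == "L":
        observed = (vs_rhp - vs_lhp) / overall
    else:
        observed = (vs_lhp - vs_rhp) / overall

    # Regress using PA vs. LHP for both lefties and righties, as in the text.
    k = REGRESSION_PA[bats]
    skill = (observed * pa_lhp + LEAGUE_SPLIT[bats] * k) / (pa_lhp + k)

    # Center so the PA-weighted adjustments cancel (assumed reading of the text).
    share_lhp = pa_lhp / (pa_rhp + pa_lhp)
    share_rhp = 1.0 - share_lhp
    if bats == "L":
        adj_rhp, adj_lhp = skill * share_lhp, -skill * share_rhp
    else:
        adj_rhp, adj_lhp = -skill * share_lhp, skill * share_rhp

    return skill, projection * (1 + adj_rhp), projection * (1 + adj_lhp)


# Holliday: right-handed, career reverse split, .389 CHONE forecast.
skill, proj_rhp, proj_lhp = estimate_splits("R", 0.400, 0.402, 2793, 0.377, 845, 0.389)
print(f"skill {skill:.1%}, vs. RHP {proj_rhp:.3f}, vs. LHP {proj_lhp:.3f}")
```

Because 2200 PA of league average outweigh Holliday’s 845 PA vs. LHP, the negative observed split flips back to a small positive one, which is why the forecast has him slightly better against lefties.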
Like all forecasts, these are estimations (and crude ones, at that). To be more thorough, we’d have to assign confidence intervals/reliability scores. We’re simply trying to minimize our error. But keep in mind that splits in the retrospective mirror are almost always smaller than they appear.
[Note: After completing this post, I realized that Tom Tango had already posted about this on his blog, using Granderson as an example. D’oh. Fortunately, my results are almost exactly the same]