# Estimating Hitter Platoon Skill

I don’t think I’m all that different from most fans who glance at stats: when I see them, I automatically tend to view them as a player’s real talent. But one thing I’ve taken away from my reading of baseball analysts far more intelligent than I (granted, that’s not a very high standard) is that there’s an important distinction to be made between observed performance and true talent. Past performance should certainly inform how we estimate future performance. But it isn’t enough on its own. One of the most important tools for estimating true talent from observed performance and its sample size is regression to the mean. A good place to start reading on this topic is *The Book*.

One bad habit many of us might get into is looking at the platoon splits of two players at the same position, one with a career wOBA of .390 vs. RHP, the other with a career wOBA of .400 vs. LHP, and thinking, “Wow, that platoon would be almost as good as Ryan Braun!” It isn’t that simple. As in most other things, regression shows us that true talent is closer to average than it appears. Technical explanations aside, I’ll simply summarize what is relevant for estimating platoon skills.

How much we regress depends on the variation of skill in the relevant population. The less variation there is, the more likely deviations from the mean are random occurrences. Practically speaking, left-handed hitters display more variation in platoon skill than right-handed hitters, so in estimating the platoon skills of left-handed hitters, we use less regression.\* According to The Book, we regress lefties’ platoon skills against **1000 PA against LHP** of league average splits for left-handed hitters, and righties against **2200 PA against LHP**. This means that when hitters have fewer than 1000/2200 PA vs. LHP, we estimate their platoon skill to be closer to league average than to their observed platoon performance. In practical terms, it also means that for righties, we’re usually safe in assuming they have near-average platoon skills.

\* *Switch-hitters display the most platoon skill variation as a population, but that is a can of worms for another day. The Book says that after 600 career PA against LHP, one has a pretty good idea of a switch-hitter’s platoon skill.*
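One way to see what the regression amounts imply is to look at the weight the hitter's own split receives, PA / (PA + regression PA). The sketch below is only an illustration: the helper name and the sample PA totals are made up, and the 1000 and 2200 figures are the ones quoted above from The Book.

```python
# Regression amounts quoted above from The Book (PA against LHP).
REGRESS_PA = {"L": 1000, "R": 2200}


def weight_on_observed(pa_vs_lhp, bats):
    """Share of the estimate that comes from the hitter's own observed split.

    Hypothetical helper: the remainder (1 - weight) goes to the
    league-average split for hitters of that handedness.
    """
    k = REGRESS_PA[bats]
    return pa_vs_lhp / (pa_vs_lhp + k)


# Illustrative PA totals (not real players).
for pa in (250, 500, 1000, 2200):
    lefty = weight_on_observed(pa, "L")
    righty = weight_on_observed(pa, "R")
    print(f"{pa:>5} PA vs LHP: lefty weight {lefty:.2f}, righty weight {righty:.2f}")
```

At exactly 1000 PA (lefties) or 2200 PA (righties) the weight is one half, which is why a hitter below those totals is estimated to be closer to league average than to his own observed split.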

Some concrete examples might help. For my league average, I’ve taken MLB-wide splits from 2007 to 2009 from Baseball Reference and converted them to wOBA. This is just going to be a very basic demonstration, since, for example, I wasn’t able to exclude pitchers from the splits, or remove switch-hitters, or leave out steals, weighted, and so on, but I think it will give the general idea. From 2007 to 2009, the average wOBA split for left-handed hitters was about **8.6%**, and for right-handed hitters, about **6.1%** (following *The Book* [I think], I use a percentage split to avoid potential logical absurdities and to reflect the reality that better hitters usually have larger splits).

We’ll begin with everyone’s favorite example of a “big splits” guy: Curtis Granderson. For his career, Granderson is a **.358 wOBA** hitter. However, while he has hit a robust **.380** vs. RHP, in 685 PA versus LHP, he’s been 2009 Yuniesky Betancourt with a **.270** wOBA. That’s a whopping 110 points of wOBA difference, about **30.7%** of his overall wOBA in observed performance.

But remember: skill is closer to average than it appears. Regressing Granderson’s 685 PA of 30.7% against 1000 PA of league average (8.6%), **(.307 × 685 + .086 × 1000) / (685 + 1000)**, we get an estimated platoon skill of 17.6%. “Centering” the split is a bit of a challenge, but I weighted it by the share of his career PA that have come against LHP (for Granderson, about 23.7%). For Granderson’s split, then, I have +4.2% vs. RHP, and -13.4% vs. LHP. Applying this to his 2010 CHONE projection of .359 wOBA, we’d forecast his 2010 wOBA against RHP as .374, and against LHP as .311. A .311 wOBA is below average, but it’s far better than .270, and given Granderson’s skill in the field, you’d be hard-pressed to find a right-handed platoon partner who would offer an overall advantage over just playing Granderson. You’d also need a pretty good right-handed bench bat in order to overcome the “pinch-hitting penalty” when hitting for Granderson.
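The arithmetic above can be sketched in a few lines of Python. The function names are hypothetical, and the centering step is one reading of "weighted by PA against LHP" that happens to reproduce the figures in the text; treat it as a sketch rather than a definitive implementation.

```python
def regress_split(observed_split, pa_vs_lhp, league_split, regress_pa):
    """PA-weighted average of the observed split and the league-average split."""
    return (observed_split * pa_vs_lhp + league_split * regress_pa) / (pa_vs_lhp + regress_pa)


def center_split(split, share_vs_lhp, bats):
    """Turn one percentage gap into (vs RHP, vs LHP) adjustments.

    The two adjustments are sized so they average out to zero over the
    hitter's mix of PA. `bats` is "L" or "R"; a positive split means the
    hitter is better against opposite-handed pitching.
    """
    if bats == "L":
        return split * share_vs_lhp, -split * (1 - share_vs_lhp)
    return -split * share_vs_lhp, split * (1 - share_vs_lhp)


# Granderson, using the figures from the text.
est = regress_split(0.307, 685, 0.086, 1000)
adj_rhp, adj_lhp = center_split(est, 0.237, "L")
projection = 0.359  # 2010 CHONE wOBA

print(f"estimated split: {est:.1%}")
print(f"vs RHP: {projection * (1 + adj_rhp):.3f}")
print(f"vs LHP: {projection * (1 + adj_lhp):.3f}")
```

Passing `"R"` for a right-handed hitter flips the signs, so the boost goes to the wOBA against LHP instead.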

For a right-handed example, let’s use Ryan Garko, recently acquired by the Mariners as a platoon 1B/DH. Garko’s career wOBA is **.347**, **.332** vs. RHP in 1229 PA, and **.382** vs. LHP in 485 PA, a 14.4% difference. But he’s a righty, so we regress toward 2200 PA of the average (6.1%): **(.144 × 485 + .0611 × 2200) / (485 + 2200)** for an estimated platoon skill of **7.6%**. Using the CHONE projection of **.345 wOBA**, we’d estimate Garko to be a **.338** hitter versus RHP, and **.364** versus LHP. That’s a good hitter versus lefties, and while the .338 isn’t great for a 1B/DH, it isn’t as if he’s helpless against RHP.

Before I call it a post, I thought it would be interesting to quickly estimate the platoon skills of two players who have “reverse” splits for their careers.

Right-handed hitting Matt Holliday has a career wOBA of **.400**, but has hit **.402** vs. RHP (2793 PA) and **.377** vs. LHP (845 PA), a **-6.3%** split (negative indicating “reverse”). After regression, we get a **2.7%** estimated platoon skill. Given CHONE’s .389 wOBA forecast for Holliday, we’d estimate his skill as **.387 wOBA** vs. RHP, and **.397 vs. LHP**. Not quite a “reverse,” but you don’t really want to “burn” a ROOGY against Holliday, either.

Colorado’s Ian Stewart has a career **.337 wOBA**, **.334** vs. RHP (655 PA) and **.346** vs. LHP, a **-3.6%** split. After regression, it comes to a **6.7%** split. Given CHONE’s **.358 wOBA** forecast, we’d expect Stewart to hit around **.363** vs. RHP and **.339** vs. LHP, a nice split for a lefty, but not a reverse one.
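A rough way to see why these reverse splits do not survive regression: under the same weighted-average formula, the estimate only reaches zero once the PA-weighted reverse split cancels out the league-average term. The helper below is hypothetical and simply rearranges that formula. It also assumes the observed reverse split would hold steady as PA pile up, which is a big assumption.

```python
def pa_to_flip(observed_split, league_split, regress_pa):
    """PA vs LHP at which the regressed estimate would reach zero.

    Solves observed_split * pa + league_split * regress_pa = 0 for pa.
    Only meaningful for a reverse (negative) observed split.
    """
    if observed_split >= 0:
        raise ValueError("observed split is not a reverse split")
    return league_split * regress_pa / -observed_split


# Observed splits recomputed from the career wOBA figures in the text.
holliday = (0.377 - 0.402) / 0.400  # righty: vs LHP minus vs RHP
stewart = (0.334 - 0.346) / 0.337   # lefty: vs RHP minus vs LHP

print(f"Holliday: about {pa_to_flip(holliday, 0.061, 2200):.0f} PA vs LHP")
print(f"Stewart:  about {pa_to_flip(stewart, 0.086, 1000):.0f} PA vs LHP")
```

Both answers come out above 2,000 PA against LHP, far more than Holliday's 845, so the estimates stay on the normal side of zero.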

Like all forecasts, these are estimations (and crude ones, at that). To be more thorough, we’d have to assign confidence intervals/reliability scores. We’re simply trying to minimize our error. But keep in mind that splits in the retrospective mirror are almost always smaller than they appear.

*[Note: After completing this post, I realized that Tom Tango had already posted about this on his blog, using Granderson as an example. D’oh. Fortunately, my results are almost exactly the same]*

Matt Klaassen reads and writes obituaries in the Greater Toronto Area. If you can't get enough of him, follow him on Twitter.

What about pitcher platoon splits? I have a few questions that I’ve been unable to answer myself, so perhaps you can help.

1. Do pitchers exhibit splits in a similar manner (similar effect, similar distribution of effect, etc.)?

2. When a pitcher with a strong split faces a hitter with a split, is the effect combined?

The only reason I could see the pitcher not exhibiting splits would be if the true effect actually resided in the hitter, i.e., the pitcher is handedness-neutral, while the hitter’s altered ability creates the delta in results. This almost makes the most sense to me, personally, what with what we know about things like pitcher BABIP, etc.

Other question: in your Granderson example, isn’t it possible that he exhibits a huge platoon split as HIS true talent, whereas your analysis is only in the aggregate? After all, if we were talking about regressing to the mean in the HR category, we wouldn’t pick out Albert Pujols to suggest anything, would we? I understand the sample size concerns, but it seems to me that while the aggregate split might be minimal, a guy like Granderson (or Ryan Howard is another one who comes to mind, although I don’t have the data handy) is a bad example to illustrate this, as they both look to be rarities in the platoon-split component.

Travis —

1. According to The Book, not only do pitchers exhibit splits, but they can be measured much more reliably: regress at around 700 PA (BFP) vs. LHH for RHP, and 450 vs. LHH for LHP. In The Book, they use wOBA generally. I’m not sure if the same numbers could be used for FIP or RA or whatever, but the general principle is the same.

2. I would assume that the effect would be combined. I’m not sure exactly how it would go, but my guess is that one would figure it using something like the Odds Ratio Method. I’m not an expert on that (or anything else), but see

http://www.insidethebook.com/ee/index.php/site/comments/the_odds_ratio_method/
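For what it's worth, a minimal sketch of how the Odds Ratio Method combines a hitter's rate and a pitcher's rate is below. It works on probabilities of a single event (say, reaching base); how best to carry it over to wOBA is not something this sketch answers. The function names and all three input rates are made up for illustration.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def odds_ratio_matchup(batter_rate, pitcher_rate, league_rate):
    """Expected rate for a batter/pitcher matchup via the Odds Ratio Method.

    All three inputs are probabilities of the same event, measured against
    the same league context.
    """
    matchup_odds = odds(batter_rate) * odds(pitcher_rate) / odds(league_rate)
    return matchup_odds / (1 + matchup_odds)


# Made-up rates: a lefty hitter who reaches base at a .300 clip vs LHP,
# facing a lefty who allows .290 to LHH, in a league where lefty-on-lefty
# matchups produce .310.
print(f"{odds_ratio_matchup(0.300, 0.290, 0.310):.3f}")
```

With both players below the league rate, the combined estimate lands below either individual rate, which is the "combined effect" being asked about. If either player is exactly league average, the result is just the other player's rate.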

As for the Granderson/Howard example: yes, it is POSSIBLE that Granderson, Howard, or any hitter (or any skill for any player) truly deviates greatly from the mean. But the question, as in any of these cases, is: how do you know (without hindsight) which player is the exception to the general rule? In the regression above, we still have Granderson with a larger than usual split. That accounts for the “rarity” as best we know how. Is it your “gut” telling you this when you look at the numbers?

I don’t mean any of that in a sarcastic manner, but that’s the whole point of looking at data as a whole, in group context. I’d need to know why we’d want to ignore the usual methods in these particular cases.

Again, note that this is an “estimation” of the skill. It’s making the best estimation based on the same methods we use for all players (and again, this does _not_ ignore Granderson’s numbers; that’s why we estimate a 17.6% split for him instead of an average 8.6%). By applying the same methods to all players, we try to minimize our error as a whole. But I’d like to know how people would suggest picking out, _ahead of time_, the exceptions who aren’t going to regress, then have them do that for a large number of players, and give reasons for why they picked the players they did.

How about using a learning model? In a sense you are doing it: you are putting less weight on the average (8.6) with every passing year (1000/1685 as of now), and more weight on the player’s own numbers (685/1685 as of now), but this seems to be somewhat arbitrary unless there is a good theoretical foundation behind it.

Why not use something like Bayesian learning? I.e., use the prior distribution of 8.6 percent (with the adjoining standard deviation) to begin with, but after each period, update the distribution for the player using Bayes’ rule to obtain the posterior distribution, and keep doing so after every year.
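A small sketch of what that could look like, under assumptions the thread does not state: a normal prior on the split and a constant amount of noise per PA. Under those assumptions the Bayes update of the mean is itself a PA-weighted average, with the prior "worth" a fixed number of PA, so year-by-year updating and the one-shot regression give the same answer. The class name and the season lines are made up.

```python
from dataclasses import dataclass


@dataclass
class SplitBelief:
    """Current belief about a hitter's platoon split.

    `strength` is the belief's weight expressed in PA: with a normal prior
    and constant per-PA noise, it equals (noise variance) / (belief variance).
    """
    mean: float
    strength: float

    def update(self, observed_split, pa):
        """Normal-normal Bayes update: a PA-weighted average of belief and data."""
        total = self.strength + pa
        new_mean = (self.mean * self.strength + observed_split * pa) / total
        return SplitBelief(new_mean, total)


# Prior for a left-handed hitter: league split 8.6%, worth 1000 PA (from the post).
prior = SplitBelief(mean=0.086, strength=1000)

# Made-up seasons: (observed split, PA vs LHP). Not real data.
seasons = [(0.35, 200), (0.25, 235), (0.32, 250)]

belief = prior
for split, pa in seasons:
    belief = belief.update(split, pa)

# Updating once with the pooled data gives the same answer.
total_pa = sum(pa for _, pa in seasons)
pooled_split = sum(split * pa for split, pa in seasons) / total_pa
batch = prior.update(pooled_split, total_pa)

print(f"year by year: {belief.mean:.4f}, all at once: {batch.mean:.4f}")
```

Seen this way, the 1000 PA figure plays the role of the prior's strength, so the weights are not arbitrary so long as that number reflects the real spread of platoon skill in the population.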

It would be hard to pick out those outliers, but I think that is where scouting can come into play. Or, at the very least, a scout’s recommendations can give you more things to consider and another perspective beyond looking at a player’s splits from a pure statistical viewpoint. I’m not entirely sure how they would do this, but I’m also not a scout. A stats guy is supposed to provide the research and stats, like you correctly did in this article. And then the scouts are supposed to give the GM their recommendations. Maybe Granderson has some mechanical flaw against lefties that prevents him from hitting them well and maybe his true talent level against them will end up being below a .300 wOBA? From a statistical perspective, it is too early to tell. But scouting and statistics/research should be, imo, split about 50/50 when making roster decisions, so the statistical point of view only gives one side of the argument.

Scottwood:

I completely agree that scouts (and Pitch f/x and Hit f/x “heat zone” sort of stuff) come into play here. Of course, the challenge would be to integrate that with a statistical analysis.

However, since almost all LHH are worse against LHP, I would guess that most of them also have more holes in their swing against LHP, etc., so one would need to show 1) how Granderson, Howard, or whoever has “worse” mechanical flaws — and it would need to be other than saying, “well, they just do, just look at their platoon splits numbers,” since the numbers are what we’re supposedly trying to get beyond, and 2) again, how that integrates into estimating the platoon skills.

The possibility isn’t denied by this method of estimating skills. But I would object to the claim that someone can tell “just from the numbers” that someone is an exception to the rule.

I would agree 100% that you cannot tell from the numbers that someone is an exception to the rule or an outlier. If I implied anything but that in my post, then I apologize b/c that was not my intention. My main point was that this is an area where a GM can lean on his scouts to see if they can see anything, b/c looking strictly at the numbers won’t let us conclude all that much.

In the grand scheme of things, it sucks b/c this is a tough area to get a read on. But, there will always be outliers and the best teams and front offices find ways to discover those outliers and use them to their advantage. A combination of looking at various data points from Hit F/x and Pitch F/x might help and leaning on team scouts is always a nice option. This is just another area where we need to grow and learn more about in the coming years.

Scottwood:

No need to apologize at all, even if we were disagreeing, since disagreement is part of rational life.

In any case, I don’t even think we’re disagreeing!

Thanks for the thoughtful comment.

Great reply, thanks!