On Tuesday, I discussed MLB’s expected wOBA (or xwOBA) metric and one of its problems: namely, that guys with great speed might be able to outperform their xwOBA on a regular basis. I also pointed out that, despite this drawback, xwOBA should have considerable utility. This post looks at one potential aspect of that utility: projecting future performance when only a small portion of the season has been completed.
Comparing xwOBA and wOBA for individual players over the course of a season, one finds a pretty strong relationship, a point I established in that Tuesday post. To take things a step further, I’d like to look here at these stats over the course of a couple of seasons and see how they correlate from year to year. In order to establish a baseline, let’s look at how players with at least 400 at-bats in both 2015 and 2016 fared by wOBA.
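For readers who want to reproduce this kind of baseline, the filter and the year-to-year comparison could be sketched roughly as follows. This is only an illustration, not the method used for the plots in this post: the function names, the dictionary layout, and the player rows are all made up, and a plain Pearson correlation is assumed as the measure of the relationship.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


def year_to_year_pairs(season_a, season_b, min_ab=400):
    """Pair up a stat for players who reach min_ab at-bats in both seasons.

    Each season is a dict mapping player name -> (at_bats, stat_value).
    This layout is a hypothetical one chosen for the example.
    """
    pairs = []
    for player, (ab_a, stat_a) in season_a.items():
        if player not in season_b:
            continue
        ab_b, stat_b = season_b[player]
        if ab_a >= min_ab and ab_b >= min_ab:
            pairs.append((stat_a, stat_b))
    return pairs


# Made-up rows for illustration only; these are not real player seasons.
woba_2015 = {"Player A": (550, 0.340), "Player B": (420, 0.310),
             "Player C": (390, 0.360), "Player D": (600, 0.380)}
woba_2016 = {"Player A": (530, 0.350), "Player B": (450, 0.300),
             "Player C": (500, 0.330), "Player D": (610, 0.365)}

pairs = year_to_year_pairs(woba_2015, woba_2016)
r = pearson_r([a for a, _ in pairs], [b for _, b in pairs])
print(f"{len(pairs)} qualifying players, r = {r:.3f}")
```

Swapping the wOBA values for xwOBA values in either dictionary gives the other comparisons discussed in this post.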
So we see a decent relationship between wOBA marks in consecutive seasons. It certainly would be strange if there weren’t some relationship between a player’s offensive statistics from year to year, as players generally don’t get a lot better or a lot worse in such a short span of time, even if the players who do change that much make for more interesting stories and analysis. So, from 2015 to 2016, wOBA shows a year-to-year relationship. What about xwOBA?
Now that’s interesting. We see a much tighter grouping, and the relationship with the previous year’s stat is much stronger for xwOBA than it was for wOBA. There are a few different potential reasons for this. One is that, by removing things like the ballpark in which a player plays (and many players switch teams in the offseason), we reduce one outside factor that (a) changes from year to year but (b) wouldn’t necessarily affect a player’s hitting skill. The other major factor is that, by stripping out defense in xwOBA, the influence of BABIP fluctuation or the loss of extra-base hits is removed. Hitters generally possess a signature BABIP, one that’s the product of skill, but there are variations in BABIP that are going to be out of a player’s control. Using xwOBA eliminates some of those variations.
Now we come to a much more interesting test. A player’s xwOBA in one season might do pretty well in projecting xwOBA the next season, but how well does it do projecting actual production, which might be of more interest to those who study the game? Below is a graph that shows 2015 xwOBA and 2016 wOBA. Keep in mind that the first plot in this piece compared wOBA to wOBA in those seasons.
Well, that’s pretty interesting, too. For at least one pair of seasons, which is the only pair we can measure thus far, the relationship between one season’s xwOBA and the next season’s wOBA is actually slightly stronger than the relationship between one season’s wOBA and the next season’s wOBA. There just might be some predictive value in xwOBA. One caveat here is that, as a comprehensive offensive stat, wOBA in general takes a pretty long time to become stable or predictive. Another caveat is that we’re dealing with some survivorship issues here, as the only players included in the sample were both good enough and healthy enough to play a majority of the time in two straight seasons. That said, let’s plow ahead.
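Before plowing ahead, here is one way the "which stat predicts next year’s wOBA better" comparison could be put into numbers. It is a rough sketch under stated assumptions: it uses the squared Pearson correlation (r²) as the measure of strength, the function names and row layout are hypothetical, and the sample rows are invented placeholders rather than real player data.

```python
from math import sqrt


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r^2 is undefined when one sequence is constant")
    r = cov / sqrt(var_x * var_y)
    return r * r


def compare_predictors(rows):
    """Compare prior-season wOBA and xwOBA as predictors of next-season wOBA.

    rows: iterable of (woba_prev, xwoba_prev, woba_next) tuples, one per
    qualifying player. This layout is an assumption made for the example.
    """
    rows = list(rows)
    woba_prev = [row[0] for row in rows]
    xwoba_prev = [row[1] for row in rows]
    woba_next = [row[2] for row in rows]
    return {
        "wOBA -> wOBA": r_squared(woba_prev, woba_next),
        "xwOBA -> wOBA": r_squared(xwoba_prev, woba_next),
    }


# Invented placeholder rows; not real 2015 or 2016 numbers.
sample_rows = [
    (0.300, 0.315, 0.320),
    (0.350, 0.345, 0.340),
    (0.330, 0.350, 0.360),
    (0.370, 0.365, 0.355),
    (0.310, 0.305, 0.300),
]

for label, value in compare_predictors(sample_rows).items():
    print(f"{label}: r^2 = {value:.3f}")
```

With only one pair of seasons available, a small gap between the two values should be read cautiously, which is the same caveat raised above.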
Here we are in the middle of May and players are only one-quarter into the season (the “three-quarter pole,” as it is sometimes called), so we’re generally going to rely on past years’ performance pretty heavily in projecting how a player will do the rest of the season. What I thought I would do is look at where a player was at this time last season, and compare that to how he finished up. I looked at all players with at least 100 at-bats through May 15 of last season and at least 300 at-bats from May 16 to the end of the season. Here’s how the wOBA-to-wOBA comparison looked.
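The sample selection just described (at least 100 at-bats through May 15, at least 300 from May 16 onward) could be sketched like this. Treat it as an illustration only: the record layout and function name are hypothetical, the rows are made up, and the year 2016 for "last season" is an assumption based on the seasons named earlier in this post.

```python
from collections import defaultdict
from datetime import date


def qualifying_players(logs, cutoff, min_early_ab=100, min_late_ab=300):
    """Return players with enough at-bats on both sides of a cutoff date.

    logs: iterable of (player, log_date, at_bats) records, where each record
    covers one game or one stretch of games ending on log_date. Records dated
    on or before the cutoff count as early season; later ones count as
    rest of season. This layout is an assumption made for the example.
    """
    early = defaultdict(int)
    late = defaultdict(int)
    for player, log_date, at_bats in logs:
        if log_date <= cutoff:
            early[player] += at_bats
        else:
            late[player] += at_bats
    return sorted(
        player
        for player, early_ab in early.items()
        if early_ab >= min_early_ab and late.get(player, 0) >= min_late_ab
    )


# Made-up aggregate rows for illustration; not real at-bat totals.
example_logs = [
    ("Player A", date(2016, 4, 30), 60),
    ("Player A", date(2016, 5, 15), 45),
    ("Player A", date(2016, 9, 30), 320),
    ("Player B", date(2016, 5, 10), 120),
    ("Player B", date(2016, 9, 30), 250),
    ("Player C", date(2016, 5, 16), 400),
]

# Assumed cutoff: May 15, 2016.
print(qualifying_players(example_logs, cutoff=date(2016, 5, 15)))
```

Each qualifying player then contributes one early-season value and one rest-of-season value for whichever stat pair is being compared.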
There might be something there, but there’s a reason we rely on projections from before the season starts, and a reason even the rest-of-season (ROS) projections are anchored to those preseason projections, generally moving only a little bit at a time in response to recent performance. Now let’s look at the same players, except with xwOBA. We saw a much stronger relationship when we went year to year with xwOBA. Will the same hold true after only one-quarter of the season?
Once again the relationship appears to be quite a bit stronger. What this likely means is that xwOBA becomes a reliable predictor at a faster rate than wOBA. Why would that be important? Well, if the data from above hold up and xwOBA has just as strong a relationship with future wOBA, we might be able to better predict wOBA when changes take place. We should expect early-season xwOBA to have a stronger relationship with rest-of-season wOBA than early-season wOBA did. Here is early-season xwOBA and rest-of-season wOBA.
Just as we suspected, the relationship here is stronger. There is still a bunch of noise and the relationship isn’t incredibly strong, but it appears from the data we have that xwOBA, in small sample sizes, might predict future wOBA better than wOBA itself does. A real comparison would be to look at projections alongside xwOBA, but that might be a task for another day. I suspect that projections would still do a lot better in sample sizes like these, and even in half-season samples (which also might be a task for another day). However, if the relationship gets stronger over time, xwOBA might be worth incorporating into future projections to make them even better than they already are.
Craig Edwards can be found on Twitter @craigjedwards.