Evaluating the Royals

The Kansas City Royals have lost 11 consecutive games, and at 3-13, they have the worst record in baseball. Even their most ardent supporters are walking away. Whatever optimism their farm system had generated before the season began has been washed away in a sea of losses, and now the Royals just look like the same old last-place team they’ve always been.

That’s the story if you just look at wins and losses, anyway. If you look a little deeper and ask why the Royals are currently 3-13, though, the story becomes a lot more interesting.

At the plate, the Royals are averaging just 3.56 runs per game, third worst in the American League and well short of the AL average of 4.47. They are just 0.03 runs per game better than the vaunted Mariners offense. So, the offense has been a problem, right? Well, sort of, but not in the way you might think. The Royals’ overall line on the season is .255/.315/.413, good for a .316 wOBA. The average AL team is hitting .253/.320/.412 with a .320 wOBA, so overall, KC has been just slightly below average at the plate in their first 16 games. So, how does a team with an average batting line score nearly a run less per game than expected?

Timing. With the bases empty, the Royals have posted a .333 wOBA, fourth best in the American League. With men on base, that’s fallen to .298 – third worst in the league. With runners in scoring position? .275, ahead of only the Oakland Athletics.

The Royals have been pretty good at starting rallies, and absolutely atrocious at getting anything out of them. If the Royals’ hits had come in a more normal distribution of situations, they’d have scored 68 runs instead of 57. This isn’t just a minor blip; the Royals’ lack of situational hitting has been a major factor in their overall record.
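To put the 57-versus-68 gap on the same per-game scale as the league numbers, here is a minimal sketch of the arithmetic. It uses only figures quoted in this article (16 games, 57 actual runs, the 68-run estimate, and the 4.47 AL average); the variable and function names are my own.

```python
GAMES = 16
AL_AVG_RPG = 4.47          # AL average runs per game, as quoted in the article

actual_runs = 57           # runs the Royals have actually scored
redistributed_runs = 68    # the article's estimate with a normal timing of hits


def runs_per_game(runs, games=GAMES):
    """Convert a season run total into a per-game rate."""
    return runs / games


actual_rpg = runs_per_game(actual_runs)
redistributed_rpg = runs_per_game(redistributed_runs)

# Gap to the league average under each scenario
actual_gap = AL_AVG_RPG - actual_rpg
redistributed_gap = AL_AVG_RPG - redistributed_rpg

print(f"actual:        {actual_rpg:.2f} R/G ({actual_gap:.2f} below AL average)")
print(f"redistributed: {redistributed_rpg:.2f} R/G ({redistributed_gap:.2f} below AL average)")
```

On these numbers, the timing of hits accounts for most of the shortfall against the league average, which is what you would expect from a slightly below-average batting line.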

But adding 11 runs doesn’t fix a 3-13 team on its own, right? So, let’s look at their run prevention. They’ve given up 5.06 runs per game, fourth worst in the American League. Most of the optimism that existed about the Royals before the season related to their young hitting, so it’s not a big surprise that the run prevention hasn’t been all that good. However, the pitching hasn’t actually been as bad as you might think.

BB%: 9.8%, 12th in AL (average is 8.3%)
K%: 18.6%, 8th in AL (average is 18.6%)
HR/9: 1.00, 7th in AL (average is 1.09)
FIP: 4.16, 10th in AL (average is 4.10)

They’ve issued way too many walks (I’m looking at you, Jonathan Sanchez), but they’ve been exactly average in getting strikeouts and slightly above average at keeping the ball in the yard. The walks help push their FIP slightly worse than average, but they’re a lot closer to the middle of the pack than they are to the teams whose pitching staffs have truly been costing them games. So, if they have a 4.16 FIP, why are they giving up 5.06 runs per game?
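For readers who want to see how walks, strikeouts, and home runs feed into FIP, here is a rough sketch using the standard public definition of the stat. The article does not give the formula or the Royals’ raw counting stats, so the staff line below is entirely hypothetical, and the 3.10 constant is only a typical value (the real constant is recalculated for each league and season).

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from counting stats.

    `constant` scales FIP to league ERA; 3.10 is an assumed,
    typical value, not the actual figure for this season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical staff line over 144 innings (NOT the Royals' real totals)
base = fip(hr=16, bb=55, hbp=5, k=105, ip=144.0)

# Same hypothetical staff with 10 fewer walks
fewer_walks = fip(hr=16, bb=45, hbp=5, k=105, ip=144.0)

print(f"hypothetical FIP:        {base:.2f}")
print(f"with 10 fewer walks:     {fewer_walks:.2f}")
print(f"FIP saved by the walks:  {base - fewer_walks:.2f}")
```

Because each walk carries a weight of 3 in the numerator, trimming walks lowers FIP directly, which is why a high walk rate can drag an otherwise average staff below the league mark.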

Let’s go back to the splits.

Bases Empty: 3.63 FIP, 5th in AL
Men On Base: 4.77 FIP, 12th in AL
RISP: 4.00 FIP, 9th in AL

It’s basically just the same story as on offense. With the bases empty, their pitchers have been pretty darn good. Put men on base, they’ve been a lot worse, even above and beyond the normal difference that pitchers generally face pitching out of the stretch. But FIP doesn’t even tell the whole story here. Back to the splits we go, but this time, we’ll look at BABIP instead of FIP:

Bases Empty: .295 (t-10th in AL)
Men On Base: .317 (11th in AL)
RISP: .369 (14th in AL)

The Royals have given up more hits on balls in play than an average team in each situation, but they’ve been far and away the worst in baseball in BABIP with RISP – the Red Sox are the next worst AL team, and even they are more than 30 points better than KC, coming in at .337. The AL average in that situation is .299.
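The size of those BABIP gaps is easier to judge in "points" (thousandths). The sketch below uses only the figures quoted above; the final step, which converts the gap into hits, assumes a purely hypothetical count of 100 balls in play with runners in scoring position, since the article does not give the real sample size.

```python
# BABIP allowed by Royals pitchers, by situation (from the splits above)
royals_babip = {"bases_empty": 0.295, "men_on": 0.317, "risp": 0.369}

RED_SOX_RISP = 0.337   # next-worst AL team with RISP
AL_AVG_RISP = 0.299    # AL average with RISP


def points(diff):
    """Express a BABIP difference in points (thousandths)."""
    return round(diff * 1000)


gap_vs_red_sox = points(royals_babip["risp"] - RED_SOX_RISP)
gap_vs_league = points(royals_babip["risp"] - AL_AVG_RISP)

# HYPOTHETICAL sample size, for illustration only
balls_in_play = 100
extra_hits = round((royals_babip["risp"] - AL_AVG_RISP) * balls_in_play)

print(f"gap to Red Sox:    {gap_vs_red_sox} points")
print(f"gap to AL average: {gap_vs_league} points")
print(f"extra hits per {balls_in_play} balls in play: {extra_hits}")
```

Under that assumed sample, the whole gap to the league average comes down to a handful of extra hits, which shows how easily a number like this can swing over 16 games.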

Put together the situational FIPs and BABIPs, and opposing hitters have posted just a .705 OPS against the Royals pitching staff when the bases are empty, but an .841 OPS with men on base and an .840 OPS with runners in scoring position. As with the hitting, the distribution of when the hits have been allowed has been a huge factor in the team’s lack of production.

As it stands, the Royals have scored 57 runs and allowed 81. With a more normal distribution on timing of hits, though, it’d be pretty close to 70 for both RS and RA. And, as everyone who has been beaten to death with Pythagorean expectation over the past 20 years knows, a team’s runs scored and runs allowed are a better evaluator of how a team has played than simple wins and losses. Pythag suggests that the 57/81 split in their RS/RA means that the Royals have played more like a .313 team than a .188 team, and their underlying components of run scoring and run prevention suggest that they’ve played more like a .500 team than a .313 team.
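For anyone who has not seen it written out, here is a minimal sketch of Pythagorean expectation. It assumes the classic exponent of 2; the article does not say which exponent or rounding was used, and other common variants (such as 1.83) give slightly different results. The function name and the `GAMES` constant are my own.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic form; it is an assumption here,
    not necessarily the value used in the article.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    if runs_scored + runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


GAMES = 16

actual = pythagorean_win_pct(57, 81)    # actual RS/RA
neutral = pythagorean_win_pct(70, 70)   # the article's timing-neutral estimate

for label, pct in [("actual RS/RA", actual), ("timing-neutral", neutral)]:
    print(f"{label}: {pct:.3f} win pct, about {pct * GAMES:.1f} wins in {GAMES} games")
```

With exponent 2, the 57/81 split works out to roughly five wins in 16 games, in the neighborhood of the 5-11 (.313) pace cited above, while equal runs scored and allowed gives exactly .500, or 8-8.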

So, how should we evaluate the Royals after 16 games? The results have been atrocious, but simply changing the unsustainable distribution of their performance would drastically alter their record. There’s a reason we all look at context-neutral statistics when trying to evaluate the worth of an individual or a team: context-specific performance contains wild fluctuations that generally hold no predictive value.

The fact that the Royals have been – by far – the worst team in baseball in the clutch during the first couple of weeks of the season doesn’t tell us anything about how they’re going to perform in those situations going forward. Looking beyond the standings reveals that the Royals have actually performed decently at times, and perhaps all this panic in Kansas City is an overreaction to events that just won’t continue. The standings tell us what has happened, but they generally don’t tell us anything about why those things have happened.

That’s why, when Sports Illustrated approached us about helping them come up with a different approach to the traditional weekly Power Rankings column, we agreed to team up with them and offer an alternative to the “re-write the standings in text form” approach. SI still wanted the column to reflect a snapshot of a team’s season performance to date, so just like with any other approach, small sample size is going to offer up some early season weirdness, such as the Royals coming in at 7th in the rankings released yesterday. No one thinks the Royals are really the seventh best team in baseball, including us. In fact, if the rankings were re-done to reflect last night’s game as well, they would have already fallen to 9th, and their .525 WAR Winning % puts them closer to 17th than to 8th. The ordinal ranking looks bizarre, of course, but what Team WAR is saying is that the Royals have played like a team that should have won roughly as many games as they lost.

That shouldn’t be all that controversial a statement, honestly. The Royals are 3-13 instead of 8-8 because of a lack of performance in the clutch. Pretty much every study ever done on the issue shows that clutch performance has no predictive value, and when evaluating a player or a team’s overall abilities, their clutch performance should almost always just be ignored. Because WAR is a context-neutral metric, it does exactly that, and so it will return results that do not line up perfectly with a team’s win-loss record.

This is a feature, not a bug. By rating teams simply based on how they hit, pitch, and field, the SI Power Rankings are going to highlight situations where a team’s win-loss record might not be telling the whole story. If you prefer a set of power rankings that simply reflects the current standings, there are a ton of places offering just such a write-up. With these rankings, you are very likely to see some rankings that look strange, but they’ll give you the chance to see something you might not have seen otherwise. And to me, that’s the entire point of advanced metrics – to shine a light on the non-obvious story, and help us understand the why behind the result.

Dave is the Managing Editor of FanGraphs.

11 years ago

You pretty much say no value should be placed on the results, so why hitch yourself to the SI wagon? First impressions and all that.

William Strunk Jr.
11 years ago
Reply to  pssguy

Cluster Luck plays havoc at small samples on uber-derivative functions like W-L and ERA, so it’s useful to examine the component parts of such functions to ferret out the Cluster Luck or other anomalies (which you would expect to smooth out with more inputs).

For that reason, most GMs probably ignore WAR, RC/G, ERA, W-L, ERA+ etc. The GM is not ranking an entire player pool; the GM’s first filters (need, budget, health, availability) shrink his player pool to a tiny group of players. Ranking a tiny group is subject to Cluster Luck and small sample anomalies, so the GM focuses on the base components (bb/k, bb%, K%, etc.) and scouting. Sabermetrics provides quantum metrics for the front office to supplement its scouts, and broad derivative models for the historians, fans and fantasy players to sift large helpings of disparate data.

I think.