Archive for Research

How Contact Ability Might Influence a Hitter’s Transition to the Majors

Back in February, there was some discussion about the transition from Triple-A to the majors, and whether that jump was getting any more difficult. It certainly seemed that way. Several highly-regarded minor leaguers completely flopped in their first tastes of big league action last year. Gregory Polanco, Jon Singleton, Xander Bogaerts, Jackie Bradley Jr. and the late Oscar Taveras all didn’t hit a lick after tearing it up in the minors. And perhaps worst of all, Javier Baez — a consensus top 10 prospect heading into the year — hit a putrid .169/.227/.324 with an unsightly 41% strikeout rate.

Jeff Sullivan and Ben Lindbergh both looked into the validity of this phenomenon, and wrote response articles more or less debunking it. Both concluded that the gap between Triple-A and the majors wasn’t growing after all, or at least not in any meaningful way. So much for that.

However, after thinking about it for a while, I started to wonder if there might be other ways to explain the initial failures of guys like Baez. Perhaps it might be more informative to look at these transitions from a different angle: Not across time, but across skill sets.

Baez’s flaws were easily identifiable. He struggled to make contact, and also showed a tendency to chase pitches out of the zone. But perhaps his rough transition wasn’t unique to him. Maybe his skill set — his poor plate discipline and/or poor bat-to-ball ability — just doesn’t play well against major league pitching. If that’s the case, it might help us be wary of the next Javier Baez.


Batted Balls: It’s All About Location, Location, Location

BABIP is a really hard thing to predict for pitchers. There have been plenty of attempts, sure, but nothing all that conclusive — probably because pitchers have a negligible amount of control over it. So naturally, when I found something that I thought might be able to model and estimate pitcher BABIP to a high degree of accuracy, I was very excited.

My original idea was to figure out the BABIP — as well as other batted ball stats — of individual pitches from details about the pitch itself. Velocity, movement, sequencing, and a multitude of other factors that are within the pitcher’s control play into the likelihood that a pitch will fall for a hit (even if to a very small degree). But much more than all of those, pitch location seems to be the most important factor (as well as one of the easiest to measure).

I got impressively meaningful results by plotting BABIP, GB%, FB%, wOBA on batted balls, and other stats based on horizontal and vertical location of the pitch. So I came up with models to find the probability that any batted ball would fall for a hit with the only inputs being the horizontal and vertical location (the models worked very well). I even gave different pitch types different models, since there were differences between, for example, fastballs and breaking balls. I found the “expected” BABIP of each of each pitcher’s pitches, and then I found the average of all of those expected BABIPs; theoretically, this is the BABIP that the pitcher should have allowed.
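
To make the averaging step concrete, here is a minimal sketch in Python. The `hit_probability` function and its coefficients are invented purely for illustration (the actual location models are not given in this excerpt), and the location units are assumed to be feet.

```python
import math

def hit_probability(px, pz, center_z=2.5, base=0.30, slope=0.03):
    """Hypothetical location-only model (NOT the article's model).

    px, pz: horizontal and vertical pitch location, assumed to be in feet.
    Pitches closer to the middle of the zone are assumed to fall for hits
    slightly more often.
    """
    distance = math.hypot(px, pz - center_z)
    # Clamp so the result is always a valid probability.
    return min(1.0, max(0.0, base + slope * (1.0 - distance)))

def expected_babip(pitches):
    """Average the per-pitch hit probabilities over a pitcher's batted balls."""
    if not pitches:
        raise ValueError("need at least one batted ball")
    return sum(hit_probability(px, pz) for px, pz in pitches) / len(pitches)

# Made-up locations for three balls in play.
balls_in_play = [(0.0, 2.5), (0.8, 1.7), (-0.5, 3.4)]
print(round(expected_babip(balls_in_play), 3))
```

A separate `hit_probability` per pitch type would slot in the same way; only the averaging is meant to mirror the text.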



The Non-Speed Components of Double Plays

Last week, we rolled out some minor tweaks to WAR, one of which was the addition of wGDP. If you haven’t read the primer, wGDP is a measure of double play runs above average and captures how many runs you save your team by staying out of double plays.

In general, it’s a minor piece of the overall puzzle with the best and worst players separated by less than a win of value over the course of a full season. Staying out of double plays helps your team, but even the best players don’t stay out of a large enough number to swing their value in a big way. Introducing wGDP makes WAR a better reflection of reality and that’s a good thing, but it also allows us to better measure the GIDP column we’ve all seen for years because it puts double plays in the context of double play opportunities.
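
As a rough sketch of what "double plays in the context of double play opportunities" could look like in code: the function below compares a hitter's GIDP total with what a league-average hitter would be expected to produce in the same number of opportunities. The league rate and the run value per double play are placeholder assumptions, not the values used in wGDP.

```python
def gdp_runs_above_average(gdp, opportunities, league_gdp_rate, runs_per_gdp):
    """Runs saved relative to average by avoiding double plays.

    league_gdp_rate and runs_per_gdp are assumed inputs; the real wGDP
    calculation may differ in its details.
    """
    expected_gdp = opportunities * league_gdp_rate
    return (expected_gdp - gdp) * runs_per_gdp

# Made-up example: 5 GIDP in 100 opportunities, assumed 11% league rate
# and an assumed cost of 0.35 runs per double play.
print(gdp_runs_above_average(5, 100, 0.11, 0.35))
```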

Dave and August have already looked at some surprising and obvious players who are great at staying out of double plays, but I wanted to consider this new statistic from another angle. For the most part, it seems like staying out of double plays should be a base running issue, as you have to be fast enough to get to first before the infield turns it.



Investigating the Idea of Scarce Right-Handed Power

I want to put to rest the discussion about the lack of right-handed power in Major League Baseball today. There has been a lot of anecdotal commentary about how scarce right-handed power has become, but there haven’t been too many analytical articles supporting this idea. If anything, the handful of articles that have been written question whether the problem even exists in the first place. There are two different arguments about this topic. The first is that right-handed power is scarce, that is to say, left-handed power is bountiful but right-handed power is not. The second, which I won’t address today, is that right-handed power hitters have declined in number relative to left-handed power hitters.

In a hypothetical choice between players of equal talent, you would almost always prefer a left-handed power hitter to a right-handed power hitter, since the lefty will have the platoon advantage more often and should be more productive as a result. There are valid arguments concerning rounding out line-ups, but right-handed batters are not scarce; good left-handed hitters are actually the scarce commodity.

For reference, the left-handed rate in the general population is estimated at 10%, while the left-handed rate among batters in baseball is about 33%; lefties are overrepresented in baseball.

This is a box plot of the various player-seasons from 2010 until 2014. I’ve chosen this time span since it’s recent and it falls after the implementation of PITCHf/x, which improved the measurement of the strike zone. I’ve excluded switch hitters for simplicity, and set a floor at 200 plate appearances.
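
The sample selection described above can be sketched as a simple filter. The `PlayerSeason` record and the sample rows are hypothetical stand-ins for whatever data source was actually used.

```python
from dataclasses import dataclass

@dataclass
class PlayerSeason:
    name: str
    year: int
    bats: str  # "L", "R", or "S" for switch hitters
    pa: int    # plate appearances
    hr: int    # home runs

def eligible(seasons, first_year=2010, last_year=2014, min_pa=200):
    """Keep non-switch-hitter seasons in the window with at least min_pa PA."""
    return [
        s for s in seasons
        if first_year <= s.year <= last_year
        and s.bats in ("L", "R")
        and s.pa >= min_pa
    ]

def hr_by_hand(seasons):
    """Group single-season HR totals by batting hand, ready for a box plot."""
    groups = {"L": [], "R": []}
    for s in eligible(seasons):
        groups[s.bats].append(s.hr)
    return groups

# Fictional rows, for illustration only.
sample = [
    PlayerSeason("Player A", 2012, "R", 650, 30),
    PlayerSeason("Player B", 2013, "L", 199, 8),   # below the PA floor
    PlayerSeason("Player C", 2011, "S", 600, 22),  # switch hitter, excluded
    PlayerSeason("Player D", 2009, "L", 620, 25),  # outside the window
    PlayerSeason("Player E", 2014, "L", 200, 12),
]
print(hr_by_hand(sample))
```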

2010-2014 Single Season HR



On the Consistency of ERA

We know that ERA isn’t a perfect indicator of a pitcher’s talent level. It depends a lot on the defense behind the pitcher in question. It depends a lot on luck in getting balls in play to fall where the fielders are. It depends a lot on luck in getting fly balls to land in front of the fence. It depends a lot on luck in sequencing — getting hits and walks at times where it doesn’t hurt too much.

That’s why we have DIPS. Stats like FIP, xFIP, SIERA, my recent SERA, and Jonathan Judge’s even more recent cFIP all attempt to more accurately measure a pitcher’s talent by stripping those things out. But what if there was an easy way to figure out how much ERA actually can vary? How likely a pitcher’s ERA was? What the spread of possible outcomes is? The aforementioned ERA estimators do not address that issue. They can tell you what the pitcher’s ERA should have been with all the luck taken away (or at least what they think the ERA should have been), but they can’t answer any of the questions I just posed.



Examining SERA’s Predictive Powers

SERA, my attempt to estimate ERA with simulation, started off as an estimator. Then, later, I laid out ways to make it more predictive. Well, here’s the new SERA: a more predictive, more accurate and better ERA estimator altogether.

First, a refresher: The first SERA worked by inputting a pitcher’s K%, BB%, HR% (or HR/TBF), GB%, FB%, LD% and IFFB%. Then, the simulator would simulate as many innings as specified, with each at bat having an outcome with a likelihood specified by the input. A strikeout, walk or home run was simple; a ground ball, fly ball, line drive or popup made the runners advance, score or get out with the same frequency as would happen in real life.
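
For readers who want a feel for the mechanics, here is a heavily simplified sketch of that kind of simulator. It is not the SERA code. The hit probabilities per batted-ball type are assumed round numbers, every hit is treated as a single that moves each runner up one base, outs never advance runners, and all runs count as earned.

```python
import random

# Assumed chance that each batted-ball type becomes a hit (illustrative only).
HIT_PROB = {"GB": 0.24, "FB": 0.14, "LD": 0.68}

def simulate_inning(rates, rng):
    """Simulate one inning and return the runs allowed.

    rates holds per-batter K, BB and HR probabilities, plus GB/FB/LD shares
    that split the remaining balls in play.
    """
    outs, runs = 0, 0
    bases = [False, False, False]  # first, second, third
    bip_types = ["GB", "FB", "LD"]
    bip_weights = [rates[t] for t in bip_types]
    while outs < 3:
        roll = rng.random()
        if roll < rates["K"]:
            outs += 1
        elif roll < rates["K"] + rates["BB"]:
            # Walk: only forced runners advance.
            if all(bases):
                runs += 1
            elif bases[0] and bases[1]:
                bases[2] = True
            elif bases[0]:
                bases[1] = True
            bases[0] = True
        elif roll < rates["K"] + rates["BB"] + rates["HR"]:
            # Home run: batter and every runner score.
            runs += 1 + sum(bases)
            bases = [False, False, False]
        else:
            kind = rng.choices(bip_types, weights=bip_weights)[0]
            if rng.random() < HIT_PROB[kind]:
                # Simplification: a single, every runner moves up one base.
                runs += int(bases[2])
                bases = [True, bases[0], bases[1]]
            else:
                outs += 1
    return runs

def simulated_era(rates, innings=10000, seed=0):
    """Runs per nine innings over the requested number of simulated innings."""
    rng = random.Random(seed)
    total_runs = sum(simulate_inning(rates, rng) for _ in range(innings))
    return 9.0 * total_runs / innings

# Made-up input line for a hypothetical pitcher.
example = {"K": 0.20, "BB": 0.08, "HR": 0.03, "GB": 0.45, "FB": 0.35, "LD": 0.20}
print(round(simulated_era(example, innings=2000), 2))
```

Swapping in realistic advancement frequencies for each batted-ball type is where a real simulator would spend most of its effort.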

To make SERA a better predictor of future ERA, I outlined a few major changes: exclude home runs as an input (since they are so dependent on HR/FB rate, over which pitchers have almost no control), exclude IFFB% for the same reason (it is extremely volatile and pitchers also have very little control over it), and regress K%, BB%, GB%, FB% and LD% based on the last three years of available data, or two or one if the player hadn’t been playing for three years. There were some other minor things, too.
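
One common way to do that kind of regression is sketched below: weight recent seasons more heavily, then blend in a fixed amount of league-average performance. The 3/2/1 weights and the 200-batter regression amount are assumptions chosen for illustration, not the values used for SERA.

```python
def regressed_rate(yearly, league_rate, weights=(3, 2, 1), regression_tbf=200):
    """Blend up to three seasons of a rate stat with the league average.

    yearly: list of (events, batters_faced) tuples, most recent season first.
            One or two seasons are fine for newer players.
    weights and regression_tbf are illustrative assumptions.
    """
    weighted_events = 0.0
    weighted_tbf = 0.0
    for (events, tbf), weight in zip(yearly, weights):
        weighted_events += weight * events
        weighted_tbf += weight * tbf
    # Add regression_tbf batters of league-average performance.
    return (weighted_events + regression_tbf * league_rate) / (weighted_tbf + regression_tbf)

# Made-up strikeout totals: 50 K in 200 batters faced, one season of data,
# regressed toward an assumed 20% league strikeout rate.
print(regressed_rate([(50, 200)], league_rate=0.20))
```

With less playing time the estimate sits closer to the league rate, which is the behavior the regression step is after.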



Towards a Better and More Predictive SERA

My last article introduced the concept of estimating a pitcher’s ERA using a simulation called SERA. As I pointed out throughout the article, SERA was strictly an estimator, not a predictor. That is, a pitcher’s SERA in one season wouldn’t do a great job predicting that pitcher’s ERA the next season. It’s more similar to FIP than it is to xFIP; descriptive rather than predictive.

But what if we want to create a simulator that predicts ERA for the future instead of just estimating what the ERA should’ve been? Some things are going to need to be changed — not just the code for the simulation, but also the inputs.



Testing the Lasting Effect of Concussions

Pitchers are expected to lose command after Tommy John surgeries. Prolific base stealers coming back from hamstring injuries are expected to take it slow for a week or two before regularly getting the green light on the basepaths. A broken finger for a slugger is blamed for the loss of power; a blister for a pitcher might mean a loss of feel on their breaking ball. What is not well publicized, however, is how a player recovers from and reacts to returning from a concussion. For an injury that has been talked about in the media so often in the last few years, we know very little about the actual long-term, statistical impacts that concussions have on players that experience them.

Players often talk about being “in a fog” for some time after suffering a concussion – often even after they return to play. The act of hitting is a mechanism that involves identifying, reacting, and deciding on a course of action within half a second. With that in mind, I wondered: do concussions change the quality of a batter’s eye and discipline at the plate? Do brain injuries add milliseconds to those individual steps? Even though each injury is different, do varying lengths of disabled list stints due to concussions change a player’s performance on the field after they return? The most direct route to answering those questions might be studying the impact of concussions on strikeout and walk rates.
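
The basic comparison could be set up along these lines. The sketch assumes we have a player's plate appearance, strikeout and walk counts for a window before the injury and a window after the return; the numbers shown are invented.

```python
def rate_change(before, after):
    """Change in K% and BB% (in percentage points) from before to after.

    before/after: dicts with 'pa', 'so' and 'bb' counts for each window.
    """
    def pct(window, key):
        return 100.0 * window[key] / window["pa"]

    k_change = pct(after, "so") - pct(before, "so")
    bb_change = pct(after, "bb") - pct(before, "bb")
    return k_change, bb_change

# Invented counts for a hypothetical player.
pre_injury = {"pa": 400, "so": 80, "bb": 40}
post_return = {"pa": 200, "so": 50, "bb": 16}
print(rate_change(pre_injury, post_return))
```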



Estimating ERA: A Simulated Approach

ERA, probably the single most cited reference for evaluating the performance of a pitcher, comes with a lot of problems. Neil does a good job outlining why in this FanGraphs Library entry. Over the last decade, plenty of research has cast a light on the variables within ERA that often have very little to do with the pitcher himself.

But what is the best way to use fielding-independent stats to estimate ERA? FIP is probably the most popular metric of this ilk, using only strikeouts, walks, hit batters, and home runs to create a linear equation that can be scaled to look like an expected ERA. Then there’s xFIP, which is based on the idea that pitchers have very little control over their HR/FB rate; to account for this, it estimates the number of home runs that a pitcher should have allowed by multiplying their fly balls allowed by the league average HR/FB rate.
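
In code, the two estimators look something like the sketch below. The 13/3/2 weights are the standard FIP coefficients; the constant and the league HR/FB rate change from season to season, so the values here are assumed placeholders.

```python
FIP_CONSTANT = 3.10      # assumed; recalculated each season in practice
LEAGUE_HR_PER_FB = 0.10  # assumed league-average HR/FB rate

def fip(hr, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """Fielding Independent Pitching from home runs, walks, hit batters and strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

def xfip(fb, bb, hbp, k, ip, league_hr_per_fb=LEAGUE_HR_PER_FB, constant=FIP_CONSTANT):
    """Same as FIP, but with home runs replaced by fly balls times the league HR/FB rate."""
    expected_hr = fb * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Made-up season line for a hypothetical pitcher.
print(round(fip(hr=20, bb=50, hbp=5, k=200, ip=200), 3))
print(round(xfip(fb=180, bb=50, hbp=5, k=200, ip=200), 3))
```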

For many people, however, these are too simple. FIP more or less ignores all balls in play completely; xFIP treats all fly balls equally. Neither one correctly accounts for the effects that any ball in play can have; we know that the wOBA on line drives is much higher than the wOBA on pop ups, but we don’t see that reflected in many ERA estimators. The estimators we use also are fully linear, and may break down at the extreme ends; FIP tells us that a pitcher who strikes out every batter should have an ERA around -5.70, which is, well you know, not going to happen.



Substitution Rules

In September of 2013, Billy Hamilton made his MLB debut. As a pinch runner. It’s wonderful to see Billy Hamilton be a pinch runner. A few days before his 20th birthday, Tim Raines made his MLB debut as a pinch runner, and all six of his games in his September callup were as a pinch runner. He played 15 games the next season, 7 of which were as a pinch runner. (The season after that, 1981, is when he established himself as a star.)

I see nothing but positives in this kind of development. The only reason we see it in September is because of the loosening of the roster rules. Now, rather than expanding the rosters during the season, or declaring a 15- or 20-man roster on a game-by-game basis, what would happen if we changed the substitution rules?

Say you bring in Hamilton to pinch-run for the catcher. At the end of the half-inning, the manager currently has two choices: (a) bring in a new catcher and knock Hamilton out of the game, or (b) double-switch Hamilton into the game, with a new catcher. In either case, the original catcher is knocked out of the game. But what if we add a third option: (c) allow the manager to keep the original catcher while knocking Hamilton out of the game? The substitution still knocks one player out of the game, but now the manager gets to decide which of the players involved gets the boot.

And you can extend that to pinch hitting as well. And if you do that, you can also do away with the DH. You can now bring in a pinch hitter for your starting pitcher in the second inning, without knocking out the pitcher (but you do lose the PH): You simply let the PH hit, and then that guy is out of the game, while the pitcher remains. And the same applies for a defensive substitution: at the end of the half-inning, you decide which of the two players is knocked out of the game.

You still have to worry about your bench and when to allow your pitcher to bat or not. But, you now give the manager a bit more flexibility. And if he wants to have a speedster on the roster, without thinking he’s burning through two roster spots (i.e., knocking out both the original catcher and Hamilton), he’s only burning through one.

What we have here is a rule that:
a. has no roster impact (you always lose a player)
b. is unobtrusive (no one is really going to notice anything)
c. allows the DH to disappear from the rulebook
d. gives the manager great flexibility

Downside? You tell me.

UPDATE: Based on one or two comments, I don’t think I was clear enough: at the end of the half inning, the manager chooses which player he loses for the game. In the above situation, if he goes for option c, he’d knock Hamilton out for the game, and the original catcher is allowed to stay. The same applies for a fielding sub: Hamilton could have come in as a defensive sub in the 8th inning in CF. At the end of the half-inning, the manager decides whether to knock Hamilton out of the game, or knock the guy he had replaced out of the game. There is NO “re-entry”. Someone will get knocked out of the game, just like it is today.