Archive for Research

Andrelton Simmons Is Avoiding Strikeouts Like Tony Gwynn

Andrelton Simmons draws comparisons to Ozzie Smith for his defensive prowess. Both players are recognized as once-in-a-generation all-time greats at their positions, though Simmons has yet to rival Smith’s Hall of Fame career.

Apart from the defensive skills, similarities have emerged between Smith and Simmons offensively, as well. Consider that, through the 2016 season, Simmons had taken roughly 2,500 plate appearances and put up a weak 85 wRC+. Compare that to Smith’s first seven seasons, through 1983, when he put up an even worse 74 wRC+ in more than 3,500 plate appearances.

Smith eventually turned his career around offensively, however, putting up a 103 wRC+ from 1984 through 1992 while producing 37 runs by means of the stolen base, a total which might even understate his total offensive value. Smith was bad on offense for quite some time, then he improved and was a good offensive player for a decent portion of his career. It’s possible we are seeing the same type of transformation from Simmons. The Angels shortstop put up a 103 wRC+ last season at 27 years old; thus far this season, he’s doing considerably better, with a 143 wRC+ on the strength of his .331/.402/.466 batting line. Most remarkable about Simmons’ hitting numbers are the strikeouts (or lack thereof, rather), as Simmons has struck out in just 10 of his 200 plate appearances.

In 1998, Tony Gwynn stepped up to bat 505 times and struck out on just 18 occasions. The league-average strikeout rate of 17% at that point was nearly five times Gwynn’s 3.6% mark. Preston Wilson made his debut that season and struck out more times than Gwynn despite receiving only 60 plate appearances. Gwynn’s 3.6% strikeout rate isn’t the greatest of all-time. Joe Sewell struck out in under 1% of his plate appearances five times, while 68 players between 1919 and 1951 had qualified seasons with rates lower than 2%. There were 413 seasons during that time where a player’s strikeout rate was lower than Gwynn’s in that 1998 campaign. Gwynn himself even had four seasons with a lower strikeout rate than 1998, but when considering the overall context of strikeouts in the game, Gwynn’s 1998 season is probably the best of all-time. If Andrelton Simmons can keep this up, his season is going to be better.
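
The arithmetic behind those rates is simple enough to check by hand. As a minimal sketch in Python, using only the counts quoted above (the function name `strikeout_rate` is just an illustrative choice):

```python
def strikeout_rate(strikeouts, plate_appearances):
    """Return strikeouts per plate appearance as a fraction."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return strikeouts / plate_appearances


# Counts quoted in the text above.
gwynn_1998 = strikeout_rate(18, 505)
simmons_so_far = strikeout_rate(10, 200)

# League-average rate quoted above for 1998.
league_1998 = 0.17

# How many times larger the league rate was than Gwynn's.
league_multiple = league_1998 / gwynn_1998

print(f"Gwynn 1998 K%: {gwynn_1998:.1%}")
print(f"Simmons K%: {simmons_so_far:.1%}")
print(f"League rate as a multiple of Gwynn's: {league_multiple:.2f}")
```

Dividing the league rate by the player's rate is one rough way to put seasons from different strikeout environments on the same footing.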



How I Use xwOBA

If you’ve spent any time observing some of the nerdier battles over baseball statistics in the last decade or two, you’re probably familiar with the arguments made for and against certain metrics. Beginning with the relatively simple matter of batting average versus on-base percentage, these debates tend to take the same shape. One recurring blind spot of such debates is that they dwell on what certain statistics don’t do instead of identifying what they do do.*

*Author’s note: /Nailed It

The last few years have seen the release, by MLB Advanced Media (MLBAM), of a flurry of new data and statistics, generally referred to as “Statcast data.” We’ve also seen advances in the measurement of catcher-framing by the people at Baseball Prospectus, who have also continued making improvements in the evaluation of pitchers in the form of Deserved Run Average (DRA). When new data and metrics emerge, there is inevitably a period of uncertainty that follows. What does this stat mean? What’s the best way to use this data set? Equally inevitable is the misapplication of new statistics. That aspect of potential statistical innovation is not really new.

Today, what is new is xwOBA, and, in part due to the wide proliferation of Statcast data by means of telecasts and MLB itself, more fans are finding and using stats like xwOBA than might have in previous generations. As with other new metrics, we are still attempting to identify how xwOBA might best be used.

One such study into the potential utility of xwOBA was recently published by Jonathan Judge at Baseball Prospectus. The study is a good one, with Judge focusing on xwOBA against pitchers. While not ultimately his point, Judge does, along the way, object to the “x” in xwOBA, as he feels that “expected” implies predictive power. While I have always interpreted the “expected” to mean “what might have been expected to happen given neutral park and defense” — that is, without assuming a predictive measure — it does appear that reasonable people can disagree on that interpretation.



FIP vs. xwOBA for Assessing Pitcher Performance

At a basic level, nearly every piece at FanGraphs represents an attempt to answer a question. What is the value of an opt-out in a contract? Why do the Brewers continue to fare so poorly in the projected standings? How do people behave in the eighth inning of a spring-training game? Those were the questions asked, either explicitly or implicitly, by Jeff Sullivan, Jay Jaffe, and Meg Rowley just yesterday.

This piece also begins with a question, probably one that has occurred to a number of readers. It concerns how we evaluate pitchers and how best to do so. I’ll present the question momentarily. First, a bit of background.

Fielding Independent Pitching, or FIP, is a well-known tool for estimating ERA. FIP attempts to isolate a pitcher’s contribution to run-prevention. It also serves as a better predictor of future ERA than ERA itself. The formula for FIP is elegant, including just three variables: strikeouts, walks, and homers. It does not include balls in play. That said, one would be mistaken to assume that FIP excludes any kind of measurement for what happens when the bat hits the ball. Let this be a gentle reminder that home runs both (a) are a type of batted ball and (b) represent a major component of FIP. There is, in other words, some consideration of contact quality in FIP.
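
For readers who want to see those three variables at work, here is a minimal sketch in Python of the commonly published form of FIP. Two things in it go beyond the description above: that version folds hit batters in with walks, and it adds a league constant to put the result on the ERA scale. The default constant of 3.10 is only a placeholder assumption (the real value is recalculated each season), and the pitcher line in the example is hypothetical.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching in its commonly published form.

    `innings` should be a true decimal (200 and one third = 200.333...),
    not the box-score shorthand of 200.1.
    `constant` is a placeholder assumption; the real value varies by season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    # Homers are weighted 13, walks and hit batters 3, strikeouts -2.
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + constant


# Hypothetical pitcher line: 20 HR, 50 BB, 5 HBP, 200 K in 200 innings.
example_fip = fip(20, 50, 5, 200, 200.0)
print(f"FIP: {example_fip:.2f}")
```

Note that the home-run term carries the largest weight, which is the sense in which contact quality sneaks into FIP.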

Expected wOBA, or xwOBA, is a newer metric, the product of Statcast data. xwOBA is calculated with run-value estimates derived from exit velocity and launch angle. Basically, xwOBA calculates the average run value of every batted ball for a hitter (or allowed by a pitcher), adds in the defense-independent numbers, and arrives at a wOBA-like figure. The advantage of xwOBA is that it removes the variance of batted-ball results and uses a “Platonic” value instead.
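
A rough sketch in Python of that bookkeeping follows. It is not Statcast’s actual implementation: the per-batted-ball values are taken as given (the model that maps exit velocity and launch angle to a value is not reproduced here), the walk and hit-by-pitch weights are approximate, illustrative stand-ins for season-specific wOBA weights, and the sample numbers are hypothetical.

```python
# Illustrative stand-ins for season-specific wOBA weights (assumptions).
WALK_WEIGHT = 0.69
HBP_WEIGHT = 0.72


def xwoba(batted_ball_values, walks, hit_by_pitch, strikeouts):
    """Combine estimated batted-ball values with defense-independent events.

    `batted_ball_values` holds one estimated wOBA value per batted ball,
    assumed to come from some exit-velocity/launch-angle model.
    """
    plate_appearances = len(batted_ball_values) + walks + hit_by_pitch + strikeouts
    if plate_appearances == 0:
        raise ValueError("need at least one plate appearance")
    # Strikeouts contribute nothing to the numerator but count as opportunities.
    total_value = (
        sum(batted_ball_values)
        + WALK_WEIGHT * walks
        + HBP_WEIGHT * hit_by_pitch
    )
    return total_value / plate_appearances


# Hypothetical sample: five batted balls, two walks, three strikeouts.
sample = xwoba([0.9, 0.1, 0.05, 1.4, 0.3], walks=2, hit_by_pitch=0, strikeouts=3)
print(f"xwOBA: {sample:.3f}")
```

The point of the sketch is the structure: what actually happened to each batted ball never enters the calculation, only the value typically associated with how it was hit.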

The introduction of Statcast’s batted-ball data is exciting and seems like it might help to better isolate a pitcher’s contributions. But does it? This is where I was compelled to ask my own, relatively simple question — namely, is xwOBA better for assessing pitcher performance than the more traditional FIP? What I found, however, is that the answer isn’t so simple.

The differences between FIP and xwOBA, as well as the similarities, deserve some exploration.



Players Don’t Become Terrible at 30

One of the oft-mentioned reasons for the slow free-agent market this winter is that teams are thinking on the same wavelength when it comes to evaluating players. One of the tenets of this theory is that free agents are bad bets because of the aging process. As players age, especially after 30, they get worse on the field, and teams don’t want to get stuck with those decline years.

There is a whole lot of reason in that explanation for the offseason’s lack of activity. There’s also a little bit of faulty logic regarding the aging process, particularly when it comes to this year’s free-agent class and the two biggest names out there, Eric Hosmer and J.D. Martinez.

The first flaw in this argument is based on a misunderstanding of how clubs are compensating players. All teams, and especially the “smart” ones, know and understand that the final year or years of a free-agent contract are unlikely to be valuable in terms of strict wins-per-dollar calculus. It’s generally accepted that those “out” years are going to be mostly dead weight. Players are typically signed to deals for which the total guarantee is equally distributed over the course of a deal. The team isn’t paying an equal amount every year expecting metronomic production over the life of a contract. They expect to receive a surplus of value in the early years and a deficit in later years. The hope is that the early years compensate for the later ones.

Teams could, in theory, compensate players a greater amount at the beginning of a contract than at the end, but most clubs choose not to do this because, by spreading the payments out, they get to keep more money in the present, which is more valuable to them. If a team doesn’t want to add, say, a seventh year at $25 million to Eric Hosmer’s offer, it isn’t because they believe he isn’t going to be worth $25 million in that seventh year. It’s because they believe he won’t be worth an extra $25 million over the first six years. The value in the seventh year is going to be close to zero in terms of expectations. That’s not the main point I’m trying to make in this post, but it does deserve a mention.
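
The preference for flat payment schedules comes down to the time value of money. As a minimal sketch in Python, with an entirely hypothetical six-year, $150 million guarantee and an assumed 5% discount rate, compare what a flat schedule and a front-loaded one cost the team in present-value terms:

```python
def present_value(payments, discount_rate):
    """Discount a list of year-end payments (year 1 first) back to today."""
    return sum(
        payment / (1 + discount_rate) ** year
        for year, payment in enumerate(payments, start=1)
    )


# Hypothetical schedules, in millions; both total $150 million.
flat_schedule = [25.0] * 6
front_loaded = [35.0, 30.0, 25.0, 25.0, 20.0, 15.0]

# Assumed discount rate, purely for illustration.
rate = 0.05

flat_cost = present_value(flat_schedule, rate)
front_cost = present_value(front_loaded, rate)

print(f"Flat schedule, present value: {flat_cost:.1f}")
print(f"Front-loaded, present value: {front_cost:.1f}")
```

Under any positive discount rate, the front-loaded schedule costs the club more today for the same nominal guarantee, which is why the payments tend to be spread evenly even though the production is not expected to be.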



Manny Margot and the Stickiness of a Launch-Angle Breakout

Manny Margot had a breakout within a breakout last year. After accounting for his offensive and defensive contributions, the Padres’ rookie center fielder was worth roughly two wins in slightly less than a full season’s worth of plate appearances. Even for a player who was highly touted as a prospect, producing league-average work at 22 years old represents, in itself, a kind of breakout.

Hidden within that strong end-of-year line was a drastic change in the second half, though. Margot started hitting the ball in the air. That’s a change that has powered many other breakouts. But before we book the skinny center fielder for all of the homers next year, we have to ask: what’s happened with launch-angle surgers in the past?



Neuroscience Can Project On-Base Percentages Now

I have an early, hazy memory of Benito Santiago explaining to a reporter the approach that had led to his game-winning hit moments earlier. “I see the ball, I hit it hard,” said Santiago in his deep accent. From which game, in what year, I can’t remember. Also, it isn’t really important: it’s a line we’ve heard before. Nevertheless, it contains multitudes.

We know, for example, that major-league hitters have to see well to hit well. Recent research at Duke University has once again made explicit the link between eyesight, motor control, and baseball outcomes. This time, though, they’ve split out some of the skills involved, and it turns out that Santiago’s deceptively simple description involves nuanced levels of neuromotor activity, each predictive of different aspects of a hitter’s abilities. Will our developing knowledge about those different skills help us better sort young athletes, or better develop them? That part’s to be determined.



Statcast’s Outs Above Average and UZR

Given the relative novelty of Statcast data, it remains unclear for the moment just how useful the information produced by it can and will be. As with any new metric or collection of metrics, it’s necessary to establish baselines for success. How good is an average exit velocity of 90 mph? What does a 10-degree launch angle mean for a hitter? How does sprint speed translate to stolen bases or defensive ability?

In an effort to begin answering such questions, the Statcast team has rolled out a few different metrics over the past few years that attempt to translate some of the raw material into more familiar terms. Hit probability uses launch angle and exit velocity to determine the likelihood that a batted ball will drop safely. Another metric, xwOBA, takes that idea a step further, using batted-ball data to estimate what a player should be hitting.

Another example is Outs Above Average. In the case of OAA, the Statcast team has accounted for all the balls that are hit into the outfield, determined how often catches are made based on a fielder’s distance from the ball, and then distilled those numbers down to find what an average outfielder would do. The final result: a single number above or below average.
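
One way to picture that distillation is as a running tally of catches made minus catches expected. The sketch below, in Python, takes each play’s catch probability as given (how Statcast derives it is not reproduced here) and assumes the fielder is credited with the difference between the outcome and that probability. Both the crediting rule as written and the sample plays are assumptions for illustration.

```python
def outs_above_average(plays):
    """Sum (outcome - catch probability) over a fielder's opportunities.

    `plays` is an iterable of (catch_probability, was_caught) pairs, where
    catch_probability is the share of such balls an average fielder catches.
    """
    total = 0.0
    for catch_probability, was_caught in plays:
        if not 0.0 <= catch_probability <= 1.0:
            raise ValueError("catch_probability must be between 0 and 1")
        # A made catch earns (1 - p); a miss costs p.
        total += (1.0 if was_caught else 0.0) - catch_probability
    return total


# Hypothetical opportunities for one outfielder.
sample_plays = [
    (0.95, True),   # routine catch, small credit
    (0.50, True),   # coin-flip play made
    (0.25, False),  # tough play missed, small debit
    (0.10, True),   # highlight catch, large credit
]
sample_oaa = outs_above_average(sample_plays)
print(f"Outs above average: {sample_oaa:+.2f}")
```

Under this scheme an exactly average outfielder nets out to zero over enough chances, which is what makes the final number readable as outs above or below average.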

At Reddit, mysterious user 903124 has published research showing that the year-to-year reliability for Outs Above Average has been considerably higher than the Range component for UZR. The user was kind enough (or foolish enough) to create a Twitter account to reproduce some graphs of his results, which are shown below. There is a subsequent tweet in the thread that shows left field.

For those who can’t see the charts and would prefer not to open up a new window, what you’d see here is that, for the group selected, the r-squared is much higher for Outs Above Average than for the Range component for UZR. If Statcast could produce something that is much more reliable and much more accurate than the Range component of UZR, that would be a pretty significant breakthrough and win for Statcast, potentially improving the way WAR is calculated and providing a better measure of a player’s talent and results on the field.
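
For anyone unfamiliar with the measure, year-to-year r-squared is the squared correlation between the same players’ values in consecutive seasons. The Python sketch below shows the calculation on made-up numbers; the two “metrics” are hypothetical and are not the Reddit user’s data, OAA, or UZR.

```python
def r_squared(first_year, second_year):
    """Squared Pearson correlation between paired season values."""
    n = len(first_year)
    if n != len(second_year) or n < 2:
        raise ValueError("need two equal-length samples of at least 2 players")
    mean_x = sum(first_year) / n
    mean_y = sum(second_year) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first_year, second_year))
    sxx = sum((x - mean_x) ** 2 for x in first_year)
    syy = sum((y - mean_y) ** 2 for y in second_year)
    if sxx == 0 or syy == 0:
        raise ValueError("values must vary within each season")
    return (sxy * sxy) / (sxx * syy)


# Made-up values for five fielders in year one.
year_one = [10, 5, 0, -5, -10]
# Year two under a hypothetical metric that tracks year one closely...
steady_metric_year_two = [8, 6, -1, -3, -9]
# ...and under a hypothetical metric that mostly does not.
noisy_metric_year_two = [2, -6, 4, 3, -4]

steady_r2 = r_squared(year_one, steady_metric_year_two)
noisy_r2 = r_squared(year_one, noisy_metric_year_two)
print(f"steady metric r-squared: {steady_r2:.3f}")
print(f"noisy metric r-squared: {noisy_r2:.3f}")
```

A higher value means last year’s number tells you more about this year’s. On its own, though, it speaks to reliability and not to whether the metric is measuring the right thing.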



What Statcast Says About the National League Cy Young

Over in the American League, there’s a clear two-horse race between Chris Sale and Corey Kluber for the Cy Young Award. Both are head and shoulders above the rest of the league and both have very strong cases for the honor, depending on what metrics you prefer.

Over in the National League, that isn’t quite the case. Max Scherzer is the clear front-runner at this point, with a host of other pitchers behind him all trying to make an argument why they might have had better seasons. Clayton Kershaw has a lower ERA. Zack Greinke pitches in a much tougher park. Teammate Stephen Strasburg has a lower FIP.

Those are just the stats that measure outcomes, though. Let’s see what Statcast has to say about the sort of contact the other candidates are allowing to see if anybody has a real case against Scherzer.



How 2017 Compares to the Steroid Era: Part II

Yesterday, I looked at how one of the last seasons of the steroid era (2002) compares to the present one in terms of home runs, total offense, and overall value by position. To summarize the findings of that post briefly: while corner outfielders account for less production now than they did in 2002, infielders and catchers are now responsible for more of it. Moreover, players at the top of the home-run leaderboards now are accounting for a lower percentage of total league-wide homers than in the steroid era. There’s a more even distribution of homers, in other words.



How 2017 Compares to the Steroid Era: Part I

Infielders account for a greater percentage of homers now than in 2002. (Photo: Ryan Claussen)

The 2017 season has seen offensive levels rise to a height unmatched in major-league baseball for quite some time. Overall this year, teams are averaging 4.65 runs per game, the highest mark since 2007 — though not quite the five runs per game teams averaged in 1999 and 2000. Most of the offensive increase can be traced to a juiced ball. There’s also been a lot of talk about the role of a fly-ball revolution of some sort or another in the establishment of a new league-wide seasonal home-run record.

An increase in PED use has now been raised as an issue, as well. MLB has administered both PED testing and PED-related suspensions since 2004; both have existed in the minors since 2001. Even with those measures in place, however, power continues to be associated with steroid use, and unfounded rumors have hounded the authors of every breakout season over the last decade. With the rise of power in recent years, the whole league is under suspicion. But how similar is this version of the league to the one now known as the “steroid era”? Let’s take a look at what the latter actually looked like and how it compares to now.

Our split tools are very expansive going back to 2002. This is convenient because 2002 was the last season that lacked PED testing of any kind. It might not have been quite the height of that period now regarded as the “steroid era” — that was probably 1999 and 2000 — but the league looked quite different before testing and suspensions were permitted.
