Let’s Build a Searchable Baseball-Text Archive

Baseball provides its audience with great data. We’re blessed with a large number of observations and very high-quality record keeping. Even before MLB announced a radar and video tracking system designed to capture seven terabytes of data per game, we had box scores going back a century and 40 years of play-by-play logs. Thanks to organizations like Retrosheet, SABR, and Baseball-Reference, baseball’s data is also very accessible.

Unsurprisingly, baseball fans are excited about the full implementation of Statcast because the system brings with it the promise of new and exciting data. Exit velocity! Time to the plate! Route tracking! While I share some of that excitement, Statcast has my attention because of something else it promises to offer: completeness.

Read the rest of this entry »


Dan Szymborski FanGraphs Chat – 3/17/17

1:59
Dan Szymborski: OK, let’s get the party going.

2:00
bl27: walk issue will be a problem for Blake Snell ? Similar to Edinson Volquez early in his career

2:01
Dan Szymborski: Given his minor league walk rates, I expect it to be his weakness.

2:01
DEF: Wait, today is only Wednesday? This week is really dragging.

2:01
Dan Szymborski: FRIDAY

2:02
Dr Morris: What do the D-backs know about Mitch Haniger that the rest of the world (and the world’s projection systems) don’t? On paper he’s at least an average-plus RF and he’s murdering the ball in ST.

Read the rest of this entry »


Adam Eaton’s Defensive Numbers Keep Getting Even Crazier

In 2017, I am probably more interested in Adam Eaton than I am in any other player in baseball. As the centerpiece of a controversial blockbuster, coming off a monster season where a lot of his value was tied to a huge swing in his defensive value, Eaton was always going to be a fascinating experiment in paying a perceived premium price for outfield defense. But it gets even more interesting, because the Nationals are switching him from right field back to center field, so we throw a position switch into the mix as well, and get another data point on whether his weird splits between RF and CF actually mean anything.

So when the MLBAM guys released their outfield catch probability leaderboard last weekend, Eaton was naturally one of the first players to examine. And when Jeff took an early look at the published 2015-2016 data, he found that Eaton ranked seventh in Catch+, or whatever we might want to call plays made above the averages of the buckets they had opportunities in. And when he looked at the catch data relative to the range portions of UZR and DRS, he actually found that the Statcast data showed that Eaton had the largest positive difference, suggesting that, by hang time and distance traveled variables, Eaton may have been even better defensively than the public defensive metrics thought.

So yeah, Eaton is really interesting. But the more I dig into Eaton’s defensive data, the more remarkable it all gets.

Read the rest of this entry »


Effectively Wild Episode 1033: Season Preview Series: Mets and Orioles


Ben Lindbergh and Jeff Sullivan banter about a beautiful Javier Baez tag, then preview the Mets’ 2017 season with Ted Berg of For The Win and the Orioles’ 2017 season with Jake Mintz of Cespedes Family BBQ.

Read the rest of this entry »


The Top College Players by (Maybe) Predictive Stats

Week: 1 / 2 / 3.

Over the last couple years, the author has published a periodic statistical report designed to serve as a mostly responsible shorthand for people who, like the author, possess more enthusiasm for collegiate baseball than expert knowledge of it. Those reports integrated concepts central to much of the analysis found at FanGraphs — regarding sample size and regression, for example — to provide something not unlike a “true talent” leaderboard for hitters and pitchers in select conferences.

What follows represents the most current such report for the 2017 college campaign.

As in the original edition of this same thing, what I’ve done here is to utilize principles introduced by Chris Mitchell on forecasting future major-league performance with minor-league stats.

Read the rest of this entry »


Jeff Sullivan FanGraphs Chat — 3/17/17

9:04
Jeff Sullivan: Hello friends

9:04
Jeff Sullivan: Welcome to Friday baseball chat

9:04
Bork: Hello, friend!

9:04
Jeff Sullivan: Hello friend

9:05
Jeff Sullivan: Your rare absences only serve to make me appreciate your presence all the more

9:05
Bork: Now that Mike Trout has been proven to be bad, will the Angels be trading him?

Read the rest of this entry »


Imagining the New Matt Harvey

Remember Vintage Matt Harvey? He sat 96 and didn’t walk anyone and could go to any of three plus secondary pitches. Sigh. That Matt Harvey was sweet. And it was only 2015 when we last saw him. We all had hope that thoracic-outlet surgery would bring that Matt Harvey back, but we’ve been hearing some bad news on that front recently.

“Harvey’s velocity hovered in the 92-mph range — just as it has in all three of his spring starts — as he got roughed up in a 6-2 loss to the Marlins,” wrote Marc Carig on Wednesday before a grumpy Harvey did his best to assuage concerns with the press afterwards. Given his rough season last year, however — when he was down to 94 from 96 the years before — those fears are justified.

“It’s going to be there or it’s not, and I have to go out and pitch,” Harvey told Carig. “And I think after today I feel really confident going into my next outing and moving forward.” He’s right to assert that he has to pitch with whatever he has, and the underlying assumption, that others have been fine at similar velocities, is also correct. But will this righty, with this fastball, be just as well off as, say, Tanner Roark or Ian Kennedy, two other righties who averaged 92 on their fastballs last year? What will his work look like if he’s healthy all year?

Read the rest of this entry »


A Lot Will Depend on Sandy Leon

Sandy Leon was one of the more remarkable stories last season. As Jeff chronicled back in August, Leon took over for the Sox when they really needed a hero. He was that hero, at least for a brief time. He was nothing if not fresh — although that was mostly the result of not having played much at the major-league level previously. Now, he might be one of the most important members of the 2017 Red Sox team.

There are a couple of reasons Leon has become so important. The first and most important is that he has one of the highest (if not the highest) betas on his probable outcomes this season. Is Leon the guy who ran a 158 wRC+ from June to August, or the guy who ran a 44 wRC+ in September (and a 53 wRC+ over the first 107 plate appearances of his career, from 2012 to 2014)? The consensus seems to be something in between, but toward the lower end of that range. ZiPS has him pegged for a 78 wRC+; Steamer, a 74 wRC+ mark. Our depth charts split the difference at 76. The FANS projections are usually wildly optimistic, and that can be useful for players who have odd or small samples or some other manner of extenuating circumstance that might throw off those mean old algorithms. But even the FANS aren’t that optimistic: they have Leon down for just an 80 wRC+.

On the other hand, Leon told Evan Drellich of the Boston Herald recently that when he was signed as a professional, he was signed “because I could hit.” He said that defense was the thing on which he needed to work the most. It’s probably fair to say that, when he was signed, he needed to work on everything. Neither Baseball America nor John Sickels placed Leon among their top-10 Nationals prospects for 2008; Sickels didn’t have him in his top 20. (Leon signed in 2007, but after both had compiled their Washington lists.) The same was true for the 2009 lists, and by that time, Derek Norris and Adrian Nieto were popping up on the Nats’ lists, so we can’t say that Leon was even the most highly rated catcher in the system.
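For anyone who wants to check the projection arithmetic quoted earlier, here is a minimal sketch. It assumes that "split the difference" means a simple unweighted mean of the ZiPS and Steamer figures; the variable names are illustrative only, and this is not a description of how the depth charts are actually computed.

```python
# wRC+ projections for Sandy Leon as quoted in the text.
projections = {"ZiPS": 78, "Steamer": 74}

# Assumption: "split the difference" is a plain unweighted average.
midpoint = sum(projections.values()) / len(projections)

# FANS projection quoted in the text, compared against that midpoint.
fans = 80
fans_gap = fans - midpoint

print(f"Midpoint of ZiPS and Steamer: {midpoint:.0f} wRC+")
print(f"FANS sit {fans_gap:.0f} points above that midpoint")
```

Under that assumption the midpoint matches the 76 wRC+ cited for the depth charts, and the FANS figure sits only a few points higher.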

Read the rest of this entry »


FanGraphs Audio: The Ongoing Matter of Remuneration in Baseball

Episode 725
Managing editor Dave Cameron is the guest on this edition of the pod, during which he discusses the topic of remuneration for, like, the whole time.

A reminder: FanGraphs’ Ad Free Membership exists. Click here to learn more about it and share some of your disposable income with FanGraphs.

Don’t hesitate to direct pod-related correspondence to @cistulli on Twitter.

You can subscribe to the podcast via iTunes or other feeder things.

Audio after the jump. (Approximately 34 min play time.)

Read the rest of this entry »


I Found a Statistic Where Mike Trout Is Bad

Mike Trout’s the best. You didn’t need the reminder, but there you go. He’s the best that there is. He won’t always be the best that there is, and maybe even in this coming season someone will emerge to be better, but given what we know right now, it’s Trout, then it’s the others. Spring training is a time for universal optimism. It’s a time for seeing the best in developing players. Take a young player on your favorite team. Imagine that player at his ceiling. Mike Trout is almost certainly better than that, and he has been for five years.

Trout is insanely talented, and that’s his foundation. But to be this good and stay this good, players also need to be able to adjust. They need to see weaknesses and work to eliminate them. Trout’s done that! He used to have a high-fastball problem. Been addressed. Relatedly, he used to have a strikeout problem. Been addressed. A couple years back he didn’t do enough on the bases. Been addressed. He used to run below-average arm ratings in the outfield. Been addressed. He’s so good.

Yet I’m a professional digger, in a sense. I’m always on the hunt for unknown strengths or weaknesses, and I’ve stumbled upon something I didn’t realize. There is an area where Mike Trout was bad. Last season, I mean. Who could’ve known? Even the best have their blemishes.

Read the rest of this entry »