Archive for March, 2013

Sloan Analytics: Rosenheck on BABIP

Last weekend’s MIT Sloan Sports Analytics Conference included a number of Evolution of Sport presentations. Among the best was a study of BABIP factors titled “Hitting ‘Em Where They Are,” by Dan Rosenheck. He is the sports editor of The Economist and a writer for The New York Times’s Keeping Score column on sports statistics, and he gave an overview of his study prior to presenting it on Day Two of the conference.

——

Dan Rosenheck: “It was a great surprise to find out that one of the distinguished presenters on the Baseball Analytics panel was Voros McCracken. His discovery, in 1999, was that BABIP allowed by starting pitchers is, at the very least, extremely noisy and hard to predict from year to year. It was a revolution in sabermetrics and opened the door to a vast amount of research. It changed the way many of us understand the game.

“The BABIP question has been the Great White Whale of the sabermetric enterprise. It is the mystery that, 14 years later, has continued to defy the best efforts of quantitative analysts using publicly available data. Tom Tango’s FIP assumes that all pitchers have exactly league-average BABIP ability. Even a small increase in predictive ability on that question leads to a huge increase in the accuracy with which you can predict how valuable players will be.

“I studied a bunch of variables I thought might have something to do with hit suppression on balls in play. I came up with two — both FanGraphs stats — that seem to have significant predictive power. The first is pop up rate. The second is z-contact, which is when batters swing at a strike — balls in the strike zone — thrown by a pitcher. What percent of those times does the batter make contact? It turns out that, just like inducing pop ups, it reduces BABIP and correlates consistently year to year. Getting batters to swing and miss at your strikes has strong predictive power on hit suppression.

“I came up with a simple model with two curved fits, using data from 2005-2011, with an R-squared of .15. It accounted for 15 percent of the variance in BABIP for starting pitchers relative to the rest of that team’s starting rotation. That factors out defense and ballpark.

“Fifteen percent might not sound like a lot, and the data is noisy, but it’s a lot relative to zero, which is what FIP will tell you. This little equation correctly identifies every single major BABIP outlier of the last decade. If you look at its leader boards, the guys who most often appear as being projected to have the lowest BABIPs relative to their team, using only data from prior seasons — no cheating — it is Tim Wakefield, Ted Lilly, Barry Zito, Johan Santana, Matt Cain. It is the famous exceptions, one right after the other, after the other.

“The second thing is that it works out of sample. I calculated this equation in March 2012 on data from 2005-2011, and when I applied it to the 2012 season, the R-squared actually went up. It predicted the out-of-sample data even better than the in-sample data. There’s no over-fitting, no cheating or spurious relationships. This is real.

“The third thing that works well is you could have a 15 percent R-squared with a very narrow range of predictions. Let’s say you have the best guy at five points below his teammates and the worst at five points above. That might marginally improve your forecast, but it’s not game changing. This equation gets the magnitudes right. It can forecast very big outliers. The guys who have the lowest BABIPs — Chris Young when he was with the Padres, Jered Weaver now, some of the Ted Lilly seasons — it’s projecting these guys for 30, 40 points of BABIP below their teammates. Huge magnitudes, far and above what you would see in any of the standard projection systems like ZIPS, Steamer or PECOTA. I don’t think any of them are projecting anything close to 40 points of differential. And it’s getting them right.

“The reason the R-squared went up last year is that it made a very bold prediction that Jered Weaver was going to have a BABIP over 40 points lower than his teammates. It got it right to .001 of accuracy. That’s lucky, and just one great prediction, but overall it’s not just improving your accuracy at the margins. It’s identifying big outliers to a big degree.

“I will post my data online, so if anyone wants to poke holes in it, all the better for our understanding of this troublesome phenomenon. I think the best avenue for future research is looking at this equation — at basically the favorite and least favorite pitchers — and asking, ‘What do they have in common?’ The guys who have high pop up rates and low z-contact rates are the guys projected to be good hit suppressors, so what do they throw? How hard do they throw? Are they deceptive? And vice versa for the pitchers the equation doesn’t like.

“I had two hypotheses. I thought tall pitchers, like Young and Weaver, might be good at this. I also thought guys who throw a lot of changeups might be good at this. Cole Hamels and Johan Santana come up very high and they’re great changeup artists. But, in fact, the height and changeup percentage in my high and low BABIP samples were identical.

“I don’t have any piercing insights as to what the guys who are good at this are doing to be good at this. Fortunately, the data is available to everybody and the internet has plenty of smart people who can move our understanding of this issue even farther forward.”
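
For readers who want to reproduce the inputs Rosenheck describes, here is a minimal sketch of the three quantities involved: BABIP, z-contact rate, and a pitcher's BABIP relative to the rest of his team's rotation. It is not Rosenheck's model (his fitted equation is not given above). The `PitcherSeason` type, its field names, and the choice to pool the teammates' totals are assumptions made for illustration, and the sample numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class PitcherSeason:
    # Hypothetical container; field names are illustrative, not a real data schema.
    name: str
    team: str
    hits: int
    home_runs: int
    at_bats: int
    strikeouts: int
    sac_flies: int
    zone_swings: int    # swings at pitches inside the strike zone
    zone_contacts: int  # those swings on which the batter made contact


def babip_parts(p: PitcherSeason) -> tuple[int, int]:
    """Numerator and denominator of BABIP: (H - HR) / (AB - K - HR + SF)."""
    return p.hits - p.home_runs, p.at_bats - p.strikeouts - p.home_runs + p.sac_flies


def babip(p: PitcherSeason) -> float:
    num, den = babip_parts(p)
    return num / den


def z_contact(p: PitcherSeason) -> float:
    """Share of in-zone swings on which the batter made contact."""
    return p.zone_contacts / p.zone_swings


def babip_vs_rotation(p: PitcherSeason, starters: list[PitcherSeason]) -> float:
    """Pitcher's BABIP minus the pooled BABIP of his team's other starters.

    Pooling the teammates' totals is an assumption; the text only says
    'relative to the rest of that team's starting rotation'.
    """
    others = [q for q in starters if q.team == p.team and q.name != p.name]
    num = sum(babip_parts(q)[0] for q in others)
    den = sum(babip_parts(q)[1] for q in others)
    return babip(p) - num / den


# Invented sample data, for illustration only.
rotation = [
    PitcherSeason("A", "XYZ", 170, 20, 770, 155, 5, 500, 400),
    PitcherSeason("B", "XYZ", 200, 20, 800, 185, 5, 480, 420),
    PitcherSeason("C", "XYZ", 190, 10, 795, 190, 5, 450, 405),
]

for p in rotation:
    print(p.name, round(babip(p), 3), round(z_contact(p), 3),
          round(babip_vs_rotation(p, rotation), 3))
```

Comparing each starter with his own teammates, rather than with the league, is what lets the comparison net out the shared defense and ballpark.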


Effectively Wild Episode 154: 2013 Season Preview Series: Arizona Diamondbacks

Ben and Sam preview the Diamondbacks’ season with Doug Thorburn, and Pete talks to AZCentral Sports Diamondbacks beat writer Nick Piecoro (at 19:39).


For Reference: The Dominican Team Has Average Patience

Tom McCarthy: Well, Wheels, that’s what you’re talking about. They’re not going to take many pitches.

Chris Wheeler: No. I was kidding with Juan Samuel about that behind the cage today… I was kidding around with Sammy. He says, “You want to see the lineup?” and he showed us the lineup card. I said, “Not many walks in that lineup, are there?” He said: “I told you a long time ago, we do not come off that island walking. We come off swinging.” And that was something that Sammy had said years and years ago, and it’s so true.

–From the second inning of Tuesday’s Dominican-Phillies exhibition game

During today’s game between the Dominican national team and the Phillies in Clearwater, Florida, the broadcasters for Comcast SportsNet Philadelphia (Tom McCarthy and Chris Wheeler, it appears) made a number of references to the aggressive approach of the Dominican team’s batters as a unit — the comment above being merely an example of a half-dozen or so made by McCarthy and Wheeler over the first two or three innings.

Certainly, this will not mark the first time that the reader has encountered the suggestion that Dominican players — or, perhaps, Latin players as a whole — exhibit less in the way of plate discipline than their American counterparts. To what degree there is or isn’t any truth to this generalization historically is one matter — and is decidedly outside the purview of this present post. The plate discipline of the lineup deployed by manager Tony Pena on Tuesday, however, is something that can be measured with some ease.



FanGraphs After Dark Chat – 3/5/13


What’s Required for a Paul Konerko Infield Hit

Who do you think is the slowest player in major league baseball? No fair guessing Bartolo Colon. Allow me to re-phrase. Who do you think is the slowest position player in major league baseball? You probably have a few names floating around in your mind. Many of them are probably catchers. I can tell you I don’t know if I’ve ever seen a worse runner than Jesus Montero. Montero wasn’t just slow, but his technique was so bad he had to spend the offseason learning how to run. Montero is a 23-year-old high-level professional athlete. That whole chapter was embarrassing. Maybe it’s still embarrassing; I haven’t seen the new, improved Montero in 2013.

Montero, then, is a candidate for MLB’s slowest player. So are other catchers, like Jose Molina. But allow me to direct your attention to Paul Konerko, who isn’t a catcher, but who is old and defensively unremarkable. Konerko has very quietly had an outstanding career, and Konerko has very quietly been perhaps the slowest player in the league. If he wasn’t the slowest player before, he certainly hasn’t gotten quicker with age.



Longer PED Suspensions: Deterrent or Retribution?

In the wake of the Biogenesis reports linking several more Major League players to a PED supplier, Bud Selig has begun to talk about enacting stiffer penalties for failed drug tests. From last week:

“The time has come to make meaningful adjustments to our penalties,” said Selig, according to CBSSports.com’s Jon Heyman. “We need to do everything possible to deter the use of performance enhancing drugs … [the recent Biogenesis investigation has] driven my intensity to increase the toughness of our PED penalties … Apparently the penalties haven’t deterred some players.”

And then, earlier this week, Selig made this statement:

“If people want to continue to do what they shouldn’t do, then the one thing that you have to do is you have to have stricter penalties,” Selig said. “It’s as simple as that.”

If only it really were as simple as that.



Daily Notes: Four Teams Qualify for Second Round of WBC

Table of Contents
Today’s edition of the Daily Notes has no table of contents, it appears.

Four Teams Qualify for Second Round of WBC
Pools A and B of this year’s edition of the World Baseball Classic began this past weekend in Japan and Taiwan, respectively. As noted in a semi-adequate preview of the Classic, many of the first games took place at a time when Americans are either (a) asleep or (b) engaged in some manner of illegal activity or, strangely, (c) both.

In any case, what follows is a record of what has taken place thus far.

Standings
In the first round, which comprises 16 countries, each team plays the other three teams in its pool once. The two teams with the highest winning percentages advance to Round Two. A series of tie-breaking rules exists which the author has no interest in reading even at all.

As of today, four nations (of a possible four from Pools A and B) have qualified for second-round play, where they will compete amongst themselves, in a double-elimination format, for two spots in the final: Chinese Taipei, Cuba, Japan, and the Netherlands.



Jeff Sullivan FanGraphs Chat – 3/5/13


Local Interests Stymie Cubs’ Wrigley Restoration Plans

Wrigley Field is falling apart. The Ricketts family, which bought the Cubs for $845 million in 2009, has a plan to spend $300 million of their money to renovate the 98-year-old ballpark. There will be structural upgrades, improved clubhouses, new underground batting cages, upgraded luxury suites and club facilities, more and better concessions and restrooms, and a new patio area in left field to serve the new upper deck. The Cubs also want to add new LED signage and billboards in the outfield. The classic Wrigley look will remain the same: the brick, the marquee outside the ballpark, the ivy and the old scoreboard. Cubs blog Bleacher Nation has conceptual drawings, which you can view here.

The Rickettses are prepared to spend an additional $200 million to develop a hotel across the street from Wrigley, an office building and an open-air plaza to be used for neighborhood and family activities. The open-air plaza will be developed in a triangular-shaped plot just west of Wrigley on Waveland and Clark avenues.

Neither the Cubs nor the Ricketts family is asking for a dime of public money. Instead, they expect the renovation plan to add significantly to public coffers. Julian Green, the Cubs’ vice president of communications, has said 800 new construction jobs will be created to complete the project and 1,300 new permanent jobs will be created with the new hotel, the office building and the open-air plaza. Green also estimates that, once completed, the new Wrigley complex will generate an additional $12 million in sales and property taxes for Chicago, plus an additional $3 million in sales tax for Cook County and an additional $4 million in sales tax for Illinois. Overall, Green said the renovation will result in an additional $1.2 billion in economic activity and taxes during a 30-year period.

Sounds perfect, doesn’t it? A privately-funded stadium project that will benefit the city, county and state in the short and long term?

Not so fast.



Sloan Analytics: Cuban, McCracken, Jedlovec

The SABR Analytics Conference has stolen some of its luster, but the MIT Sloan Sports Analytics Conference remains an important event for the sabermetrics community. The seventh annual edition took place this past weekend, in Boston, and featured a plethora of thought-leaders. The Baseball Analytics panel included Ben Jedlovec, Jonah Keri, Voros McCracken, Joe Posnanski and Farhan Zaidi. Other speakers included Mark Cuban, Michael Lewis, Sig Mejdal, Dan Rosenheck and Nate Silver.

Several of them took the time to answer questions between presentations. This installment includes conversations with Cuban, Jedlovec and McCracken. Later in the week we’ll hear from Mejdal, Rosenheck and Zaidi.

——