Author Archive

Expected BABIP for Pitchers

Recently on FanGraphs, we’ve been referring to a stat called xBABIP, or Expected Batting Average on Balls in Play, to help justify a pitcher’s current BABIP. There have been a few questions about what this stat means, so I thought it’d be as good a time as any to explain the ins and outs of this particular metric.

The initial concept of BABIP is that pitchers do not have control over what happens to balls once they are hit into the field of play.

BABIP typically fluctuates from year to year with a baseline of around .300. If a pitcher has a particularly high or low BABIP, we may say he’s been lucky or unlucky. Things are of course not quite this simple, but for the most part the rule holds true.

Enter ball-in-play data: we know how many line drives, fly balls, and ground balls a pitcher allows into play. Line drives fall for hits most often, and ground balls fall for hits more often than fly balls. The mix of batted balls a pitcher allows into play is going to affect his overall BABIP.

BABIP by Type (2007):
Fly Balls – .15
Ground Balls – .24
Line Drives – .73

Ideally, the formula is going to look something like this to find out a player’s expected BABIP:
expected BABIP = .15 * FB% + .24 * GB% + .73 * LD%
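
As a quick illustration, the formula above can be written as a small Python function. This is only a sketch: the function name and the sample pitcher's batted-ball mix are made up for the example, and the percentages are passed as fractions (0.20 for 20%).

```python
# Per-type BABIP rates quoted above (2007).
BABIP_BY_TYPE = {"FB": 0.15, "GB": 0.24, "LD": 0.73}


def expected_babip(fb_pct, gb_pct, ld_pct):
    """Weighted expected BABIP from batted-ball shares given as fractions."""
    return (BABIP_BY_TYPE["FB"] * fb_pct
            + BABIP_BY_TYPE["GB"] * gb_pct
            + BABIP_BY_TYPE["LD"] * ld_pct)


# Hypothetical pitcher (not real data): 36% fly balls, 44% ground balls,
# 20% line drives.
example = expected_babip(0.36, 0.44, 0.20)
print(f"{example:.3f}")
```

With that made-up mix the result lands just above the usual .300 baseline.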

For more accuracy you could remove home runs from the batted ball percentages at a rate of 92% from fly balls and 8% from line drives. You could even account for infield fly balls and remove them from total fly balls, but the formula above will get you pretty far.

Dave Studeman calculated a couple of years ago that adding .12 to LD% was good enough for a ballpark estimate of a player’s expected BABIP. This is what you’ll often see writers on FanGraphs refer to as xBABIP.

The best way to use this statistic is to attempt to validate a pitcher’s current BABIP. For instance, a pitcher might have a high line drive percentage and a high BABIP. This would give the pitcher a high xBABIP as well, and you could say: “Yes, his high line drive percentage is responsible for his high BABIP.”

While this is useful for looking at past performances, the difference between xBABIP and BABIP should not be used in an attempt to evaluate future performance. This is because LD% and BABIP are somewhat independent of each other. While there is some correlation between LD% and BABIP, it isn’t enough to suggest that they will always track each other.

LD% in itself is highly variable, and it would be difficult to say that a pitcher with a BABIP of .300 and a LD% of 22% (xBABIP of .340) should do considerably worse going forward, because you really don’t know what his LD% is going to be the rest of the season. His xBABIP of .340 describes what his BABIP was expected to be given the line drives he has already allowed; it is not his expected BABIP in the future. Typically a pitcher’s expected BABIP in the future will be around the original baseline of .300.
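
The ballpark version (LD% plus .12) and the example pitcher above can be checked in a few lines of Python. This is just a sketch; the function name is made up for illustration, and LD% is passed as a fraction.

```python
def quick_xbabip(ld_pct):
    """Ballpark xBABIP: line drive share (as a fraction) plus .12."""
    return ld_pct + 0.12


# The pitcher from the example above: BABIP of .300, LD% of 22%.
babip = 0.300
xbabip = quick_xbabip(0.22)
gap = xbabip - babip

print(f"xBABIP {xbabip:.3f}, gap vs. actual BABIP {gap:+.3f}")
```

The gap describes what has already happened; as noted above, it is not a forecast of where his BABIP goes from here.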


Couple of New Things

I just wanted to point out a couple of new things on FanGraphs:

– In the leaderboard section you can now filter the pitch type by NL/AL and Starters/Relievers. This can be done for the new plate discipline stats for batters and the pitch type data for pitchers.

– The game graphs now have a sidebar with all of the games for that day. Hopefully this will make for easier navigation. There are also two new items in the sidebar: aLI and aWE. These stand for Average Leverage Index and Average Win Expectancy over the course of the entire game. The higher the aLI, the more “exciting” the game was, and the closer the aWE is to .5, the “closer” the game was (throughout the entire game).

– The scoreboard, with all the game graphs on it for one day, should load much, much faster. Hurray for caching!
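
For anyone curious how to read aLI and aWE, here is a rough Python sketch. It assumes both are simple averages over every play in the game, which is an assumption for illustration and not necessarily how the site computes them; the per-play numbers are invented.

```python
from statistics import mean

# Hypothetical per-play values for one short game (not real data).
leverage_index = [0.9, 1.1, 1.4, 2.3, 0.6]
win_expectancy = [0.50, 0.55, 0.48, 0.62, 0.90]

# Assumption: aLI and aWE are plain, unweighted means over all plays.
ali = mean(leverage_index)
awe = mean(win_expectancy)

# The closer aWE sits to .5, the "closer" the game was throughout.
distance_from_even = abs(awe - 0.5)

print(f"aLI {ali:.2f}, aWE {awe:.2f}, distance from .5: {distance_from_even:.2f}")
```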

That’s all I can think of that’s really new, except for a few changes on the homepage, but you should expect to see a few more seasons of Win Probability rolled out later this week.


Foul Balls Again

In his excellent Ten Things I didn’t Know Last Week column, Dave Studeman speculates on the odds of two fans sitting next to each other catching a foul ball. I was asked about the LA Times article (where two fans sitting next to each other caught back-to-back foul balls) in an e-mail last week and the math to solve this problem became a huge topic of conversation over my weekend.

We previously calculated the odds of catching a foul ball/home run at about 1 in 1000, assuming that everyone in the stadium had access to all foul balls and home runs (which isn’t exactly true).

So let’s say that each foul ball hit into the stands is an independent event and they’re randomly distributed. And let’s say that where you’re sitting you have the ability to catch a foul ball. And let’s say there are 10,000 fans sitting in an area where you can catch a foul ball.

Your odds of catching any one foul ball hit into the stands are 1/10000. Now if there are 30 balls hit into the stands each game, your odds of catching a foul ball have increased to 1-((9999/10000)^30), which is about 1 in 333.
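
The single-game calculation above is easy to reproduce in Python. This is a minimal sketch using the same assumptions (10,000 fans, 30 balls, independent and random); the function name is made up for the example.

```python
def chance_of_at_least_one(p_single, attempts):
    """Probability of at least one success in independent attempts."""
    return 1 - (1 - p_single) ** attempts


fans = 10_000   # fans sitting where a foul ball can reach them
balls = 30      # balls hit into the stands per game

p_game = chance_of_at_least_one(1 / fans, balls)
print(f"about 1 in {1 / p_game:.0f}")
```

The answer comes out within a whisker of the 1 in 333 figure; the tiny difference is just rounding.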

Now your odds of catching 2 consecutive foul balls in a game are considerably worse, and we’re going to assume that both of these foul balls are catchable. (Dave Studeman in his evaluation does not assume that, and that’s a major difference.) Catching two consecutive foul balls would be (1/10000)^2, which is 1 in 100,000,000. But you have 30 chances, so the odds are 1-((99999999/100000000)^30), which is about 1 in 3,333,333.

Those are your odds of catching two foul balls in a row at any one particular game if all things were completely random.
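
Here is the back-to-back calculation as a Python sketch, using the same inputs as above (a 1 in 10,000 shot at each ball and 30 chances per game). The function name is invented for illustration. Because the per-pair probability is so tiny, the sketch uses `math.log1p` and `math.expm1`, which compute the same 1-(1-p)^n quantity without losing precision.

```python
import math


def chance_in_repeated_tries(p_single, tries):
    """1 - (1 - p)^n, computed stably for very small p."""
    return -math.expm1(tries * math.log1p(-p_single))


p_one_ball = 1 / 10_000
p_pair = p_one_ball ** 2      # two specific consecutive balls: 1 in 100,000,000
chances = 30                  # number of chances per game, as in the post

p_game = chance_in_repeated_tries(p_pair, chances)
print(f"about 1 in {1 / p_game:,.0f}")
```

The result is a little over 3.3 million to one, in line with the figure above.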

Update: The Numbers Guy over at the Wall Street Journal did a piece on this earlier in the week, and Carl Bialik (The Numbers Guy) is the one who sent me the initial e-mail.


Win Probability: 1981 – 1988

The site has been updated with Win Probability stats for 1981 – 1988.

Please make note of Dwight Gooden’s 1985 season, when he had a WPA of 9.94. The only player to have a higher WPA in the years that WPA has been run for is Barry Bonds. Bonds bested him in 2002 and 2004, but that’s it. Gooden’s BRAA that season was 75, which is 10 runs better than any other pitcher in a single season for all WPA years calculated so far.

The next best player that season in terms of WPA was Eddie Murray, who had 6.75 wins, more than 3 wins fewer than Gooden. And while Gooden unanimously won the Cy Young award that year, he was without a doubt the MVP that season too.


Today in FanGraphs: 5/8/08

Can I Get A Hitter? (Dave Cameron)
– Middle infielders beware: 2008 wrote you a letter and it’s not nice.

Zito…Pitches…Well? (Eric Seidman)
– In a drastic turn of events, it appears Zito did something right.

The Little Red Machine (Marc Hulet)
– Marc thinks the Reds might have found some hidden gems in the 2007 draft.

Designated What? (Dave Cameron)
– Things aren’t good when some team’s pitchers are hitting better than your team’s DH.

Should I Trade Roy Oswalt? (Eric Seidman)
– I’d trade 1 Roy Oswalt for 1 Chase Utley in a heartbeat.


Welcome to the Club, Lieber

Jon Lieber did something today that not a whole lot of pitchers can claim to have done: he gave up 4 home runs in a single inning. He started off the fateful second inning with back-to-back home runs by Adam Dunn and Joey Votto, but eventually settled in and got Edwin Encarnacion to pop out.

Lieber surely breathed a sigh of relief, only to have Paul Bako hit a home run on the very next pitch. Three batters later Jerry Hairston Jr. hit the final home run of the inning, tying what I believe is a record for most home runs allowed by one pitcher in a single inning.

The last pitcher to do it was Chase Wright of the Yankees, who gave up four home runs in a row on April 22nd last season. Luckily for Wright, it was a nationally televised game against the Red Sox.


Today in FanGraphs: 5/6/08

Win Probability: 1974-1980
– All your favorite players from the latter half of the 70’s now have WPA stats!

Angels Big Two (David Cameron)
– No, it’s not Lackey and Escobar. It’s those other guys: Saunders and Santana.

Furcal En Fuego (Eric Seidman)
– Does batting .366 with 5 home runs count as “en fuego”? Si.

Kerry Wood 20K: 10 Year Anniversary
– Look inside for the video. You have to see it to believe it.

Are the Reds Ready to Get a Little Greener? (Marc Hulet)
– Marc kicks off Reds week with a look at Homer Bailey and Jay Bruce.

Try Another Pitch, Fausto (Dave Cameron)
– We know you love your trusty old fastball, but….

Have No Fear, Church is Here (Eric Seidman)
– That trade isn’t looking so bad right now, is it?


Kerry Wood 20K: 10 Year Anniversary

The New York Daily News recaps Kerry Wood’s incredible 20 strikeout performance all the way back on May 6th, 1998. The SportsCenter highlights are a must watch: (hat tip: Baseball Think Factory)


Win Probability: 1974 – 1980

All win probability stats, game graphs, and play-by-play are now available for the years 1974 through 1980.

The most productive batter in terms of WPA over those 7 years was Rod Carew with 28.9 wins, closely followed by Mike Schmidt with 28.6 wins. Jim Palmer led all pitchers with 23.2 wins, with closer Goose Gossage not far behind with 22.1 wins.

I’m hoping to have 1981 – 2001 available soon.

On a side note: FIP is now correctly adjusted by year and FIP career totals are also weighted appropriately.


Plate Discipline Leaderboards

All the stats made available yesterday in the player pages are now on the leaderboards.

Interesting to see the aforementioned Nick Johnson with the second best O-Swing% (percent of pitches swung at outside the strike zone) this season. He truly does have great plate discipline.