Author Archive

Leaderboard Update

I’ve made some changes to the leaderboards. You can no longer display all the rows, but there is a handy export button which will let you export all the rows into either an Excel or a CSV file. If this is a real issue for you, let me know, but I have my reasons for removing the show all rows feature. I’m hoping the export option makes this a non-issue.

I’ve also added a month select feature which will let you filter for any particular month, as well as for the last 7 days, last 14 days, and last 30 days. Now you can know that for the last 7 days, Jay Bruce leads the majors with a 1.06 WPA.

There have also been some minor changes to the game log pages. I’ve repeated the headers so you can figure out what each column is further down the table. This has always been an issue for me and hopefully the game logs are now considerably more readable. I’ve also added the same export options that are available in the leaderboards.


Reviewing the 2007 Draft: The Series

If you somehow missed it, Marc Hulet has been recapping the 2007 draft for the past week in honor of tomorrow’s 2008 draft. The recaps are a must-read if you want to know how the first few rounds panned out one year later.

National League:

American League:


Player Linker Updated

For those of you who don’t browse our post archives from many years ago, you may be interested in knowing that we have a tool where you can enter the text of your article and it will put HTML tags around the player names linking them back to FanGraphs.

We mainly use it internally, but there are some sites such as Baseball Analysts that have been using it on a regular basis for years now.

I made a few updates to the Linkifyer to make it work a little better. It’s not perfect, but it will now link Minor League players in addition to Major League players.

Just copy and paste your article text in the text box, hit the “Link Players Now” button and presto! You now have linked players.


Stats Pages Updated

Couple quick updates to the stats pages. Each section now has a header and each section is now linkable. So if you’d like to send someone to the Plate Discipline stats or the Pitch Type stats, you no longer need to tell them to scroll to the bottom of the page. Just click on the header you’d like to link to in order to get the appropriate link.

And if things look a little weird, just reload the page by hitting F5 or the reload button on your browser.


Player Search Update

I’ve updated player searching. I think it works a lot better now and hopefully you will too. Minor league and major league players now appear on one page and the search algorithm is now much better at picking up misspelled player names and inexact matches.


Jay Bruce is Pretty OK

Jay Bruce arrives with a 3-3 night, including 2 walks. Reds fans rejoice:

“Did he actually touch the ground, or did he just glide from base to base?” – On Baseball and The Reds

“Jay Bruce will never record an out in his Major League career.” – The Reds Rocket

“Thats what we call a debut!” – Reds Minor Leagues

“Wow. That’s all I can say. Just, wow.” – Redleg Nation

“Corey Patterson better get comfy in his spot on the bench, center field is now occupied.” – Redlegs Rundown

“Bruuuuuuce, Bruuuuuuce” – The Real McCoy

“As a side note: Jay Bruce.” – Reds Pitchfx

“No, he’s not the savior, […]” – Dusty Baker


Those Home Run Blues

We’re about two months into the season, and it’s not a bad time to look at which pitchers are allowing too many home runs. Fortunately, there’s a useful metric on FanGraphs to do just that. It’s called HR/FB and while I’m sure many of you are familiar with it, here’s a brief summary of how it works.

There have been a number of studies done on HR/FB and for the most part, they conclude that pitchers do not have control over how many home runs they allow on outfield fly balls. Your typical starting pitcher should be expected to have a HR/FB of around 10% every year. Anything that deviates from 10% could be attributed to the park he pitches in, or to “luck”. So let’s look at who has been allowing an inordinate number of home runs this season:
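For anyone who wants to run the numbers themselves, here is a minimal sketch of the idea in Python. The function names and the 10-homers-on-50-fly-balls line are made up for illustration and are not any real pitcher’s totals; the only figure taken from above is the roughly 10% baseline.

```python
# Rough sketch: compare a pitcher's HR/FB to the ~10% baseline.
# The counts used below are hypothetical, not real player data.

BASELINE_HR_PER_FB = 0.10  # the "around 10%" figure quoted in the post


def hr_per_fb(home_runs, fly_balls):
    """Share of fly balls that left the park."""
    if fly_balls <= 0:
        raise ValueError("fly_balls must be positive")
    return home_runs / fly_balls


def excess_home_runs(home_runs, fly_balls, baseline=BASELINE_HR_PER_FB):
    """Home runs allowed beyond what the baseline rate would predict."""
    return home_runs - baseline * fly_balls


# Hypothetical pitcher: 10 home runs on 50 fly balls.
rate = hr_per_fb(10, 50)
extra = excess_home_runs(10, 50)
print(f"HR/FB: {rate:.1%}, home runs above baseline: {extra:.1f}")
```

A positive number of home runs above baseline is the part you would chalk up to the park or to “luck” under the reasoning above.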

Roy Oswalt (23.4%) – Oswalt leads baseball with a rather ridiculous HR/FB rate. Basically one in four of his fly balls have become home runs. I don’t care where he’s pitching, this is just Oswalt having some terrible luck. He’s never had a HR/FB above 12.9% to end the season. A couple weeks ago, Eric Seidman asked if you should trade Oswalt in your league; the answer is still no and now is another prime opportunity to go acquire him.

Brett Myers (21.4%) – Sure he plays half his games in Citizens Bank Park and he does have a career HR/FB of just over 15%, but 21% even for him seems quite high. He probably isn’t due for such a drastic adjustment as Oswalt, but I’d imagine it should start to trend towards his career average. He hasn’t allowed a home run in his last two starts either, so perhaps he’s well on his way to normalcy.

Carlos Villanueva (16.9%) – Currently, Villanueva leads baseball with 2.09 home runs per 9 innings. He’s about as much of a fly ball pitcher as he is a groundball pitcher so he really shouldn’t be tied for 5th in most home runs allowed. While Miller Park isn’t all that favorable to fly balls, he should be able to do considerably better in the home run department and decrease his ERA by more than a little come season’s end.

Johnny Cueto (16.4%) – It looks like the phenom has himself a bit of a home run problem. Since he hasn’t been around for very long, it’s a little tough to say if this is just a luck thing, or if it’s a real problem. I’d venture to say it has more to do with luck than anything else, even if he does play in a park that is prone to home runs. Unfortunately, Cueto is an extreme fly ball pitcher and isn’t expected to be particularly stingy with home runs in general.

Mike Mussina (16.4%) – We all know about Mussina’s decline in fastball velocity. John Walsh’s research suggests that mis-located fastballs of the slower variety could certainly cause an increase in home runs and it’s possible that could be happening to Mussina. I still think his HR/FB should drop as the season continues, but it’s hard for me to be enthusiastic about it.

Johan Santana (15.9%) – Santana developed a home run problem last year and it seems to have continued into this year. Shea Stadium is slightly worse for home runs than the Metrodome, but it really doesn’t explain such a high HR/FB. It’s hard to imagine it won’t decrease as the season goes on, but unless it drops back down to around 10% or lower, it will be difficult for him to return to sub-3 ERA levels.


Expected BABIP for Pitchers

Recently on FanGraphs, we’ve been referring to a stat called xBABIP or Expected Batting Average on Balls in Play to help justify a pitcher’s current BABIP. There have been a few questions about what this stat means, so I thought it’d be as good a time as any to try and explain the ins and outs of this particular metric.

The initial concept of BABIP is that pitchers do not have control over what happens to balls once they are hit into the field of play.

BABIP typically fluctuates from year to year with a baseline of around .300. If a pitcher has a particularly high or low BABIP, we may say he’s been lucky or unlucky. Things are of course not quite this simple, but for the most part the rule holds true.

Enter ball in play data: we know how many line drives, fly balls, and ground balls a pitcher allows into play. Line drives fall for hits the most often and ground balls fall for hits more often than fly balls. The types of batted balls a pitcher allows into play are going to affect his overall BABIP.

BABIP by Type (2007):
Fly Balls – .15
Ground Balls – .24
Line Drives – .73

Ideally, the formula is going to look something like this to find out a player’s expected BABIP:
expected BABIP = .15 * FB% + .24 * GB% + .73 * LD%

For more accuracy you could remove home runs from the batted ball percentages at a rate of 92% from fly balls and 8% from line drives. You could even account for infield fly balls and remove that from total fly balls, but the formula above will get you pretty far.
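Here is a minimal sketch of the basic formula in Python. The three hit rates are the 2007 figures listed above; the function name and the example batted ball mix are hypothetical, and the home run and infield fly adjustments are left out.

```python
# Rough sketch of the basic expected BABIP formula from the post.
# Hit rates by batted ball type are the 2007 figures quoted above.
HIT_RATE_FLY_BALL = 0.15
HIT_RATE_GROUND_BALL = 0.24
HIT_RATE_LINE_DRIVE = 0.73


def expected_babip(fb_pct, gb_pct, ld_pct):
    """Weight each batted ball share (as a fraction) by its hit rate."""
    return (HIT_RATE_FLY_BALL * fb_pct
            + HIT_RATE_GROUND_BALL * gb_pct
            + HIT_RATE_LINE_DRIVE * ld_pct)


# Hypothetical pitcher: 36% fly balls, 44% ground balls, 20% line drives.
example_xbabip = expected_babip(0.36, 0.44, 0.20)
print(f"expected BABIP: {example_xbabip:.3f}")
```

The percentages are passed as fractions (0.20 rather than 20), so the three inputs should add up to roughly 1.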

A couple of years ago, Dave Studeman calculated that adding .12 to LD% was good enough for a ballpark estimate of a player’s expected BABIP. This is what you’ll often see writers on FanGraphs refer to as xBABIP.

The best way to use this statistic is to attempt to validate a pitcher’s current BABIP. For instance, a pitcher might have a high line drive percentage and a high BABIP. This would give a pitcher a high xBABIP as well and you could say: “Yes, his high line drive percentage is responsible for his high BABIP.”

While this is useful for looking at past performances, the difference between xBABIP and BABIP should not be used in an attempt to evaluate future performance. This is because LD% and BABIP are somewhat independent of each other. While there is some correlation between LD% and BABIP, it isn’t enough to suggest that they will always track each other.

LD% itself is highly variable. It would be difficult to say that a pitcher with a BABIP of .300 and a LD% of 22% (xBABIP of .340) should do considerably worse going forward, because you really don’t know what his LD% is going to be the rest of the season. His xBABIP of .340 describes what his BABIP was expected to be given the line drives he has already allowed; it is not his expected BABIP in the future. Typically a pitcher’s expected BABIP in the future will be around the original baseline of .300.
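The numbers in that last example come straight from the LD% + .12 shortcut, and they are easy to check. This is only a sketch; the function and variable names are made up for illustration.

```python
# Rough sketch of the LD% + .12 shortcut, using the example from the post:
# a pitcher with a .300 BABIP and a 22% line drive rate.
LD_SHORTCUT_OFFSET = 0.12


def quick_xbabip(ld_pct):
    """Ballpark expected BABIP: line drive rate (as a fraction) plus .12."""
    return ld_pct + LD_SHORTCUT_OFFSET


actual_babip = 0.300
xbabip = quick_xbabip(0.22)
gap = xbabip - actual_babip
print(f"xBABIP: {xbabip:.3f}, gap vs. actual BABIP: {gap:+.3f}")
```

The gap describes what has already happened. Per the reasoning above, it shouldn’t be read as a forecast.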


Couple of New Things

I just wanted to point out a couple of new things on FanGraphs:

– In the leaderboard section you can now filter the pitch type by NL/AL and Starters/Relievers. This can be done for the new plate discipline stats for batters and the pitch type data for pitchers.

– The game graphs now have a sidebar with all of the games for that day. Hopefully this will make for easier navigation. There are also two new items in the sidebar: aLI and aWE. These stand for Average Leverage Index and Average Win Expectancy over the course of the entire game. The higher the aLI, the more “exciting” the game was, and the closer the aWE is to .5, the “closer” the game was (throughout the entire game).

– The scoreboard, with all the game graphs on it for one day, should load much, much faster. Hurray for caching!
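For the curious, here is a rough sketch of what aLI and aWE mean. It assumes both are plain averages over every play in the game, which is a guess at the calculation, and the play-by-play values are invented.

```python
# Rough sketch of aLI and aWE as simple per-play averages.
# Assumption: an unweighted mean over all plays; the values are made up.

def average(values):
    """Plain arithmetic mean of a non-empty list."""
    if not values:
        raise ValueError("need at least one play")
    return sum(values) / len(values)


# Hypothetical game: leverage index and win expectancy before each play.
leverage_by_play = [0.8, 1.2, 2.0, 1.0]
win_expectancy_by_play = [0.50, 0.55, 0.40, 0.55]

a_li = average(leverage_by_play)
a_we = average(win_expectancy_by_play)
closeness = abs(a_we - 0.5)  # smaller means a "closer" game
print(f"aLI: {a_li:.2f}, aWE: {a_we:.3f}, distance from .5: {closeness:.3f}")
```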

That’s all I can think of that’s really new, except for a few changes on the homepage, but you should expect to see a few more seasons of Win Probability rolled out later this week.


Foul Balls Again

In his excellent Ten Things I didn’t Know Last Week column, Dave Studeman speculates on the odds of two fans sitting next to each other catching a foul ball. I was asked about the LA Times article (where two fans sitting next to each other caught back-to-back foul balls) in an e-mail last week and the math to solve this problem became a huge topic of conversation over my weekend.

We previously calculated the odds of catching a foul ball/home run at about 1 in 1000, assuming that everyone in the stadium had access to all foul balls and home runs (which isn’t exactly true).

So let’s say that each foul ball hit into the stands is an independent event and they’re randomly distributed. And let’s say that where you’re sitting you have the ability to catch a foul ball. And let’s say there are 10,000 fans sitting in an area where you can catch a foul ball.

Your odds of catching any one foul ball hit into the stands are 1/10000. Now if there are 30 balls hit into the stands each game, your odds of catching a foul ball have increased to 1-((9999/10000)^30), which is about 1 in 333.
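If you’d like to check that arithmetic, here is a minimal sketch in Python. The 10,000 fans and 30 balls are the assumptions stated above; the function name is made up.

```python
# Rough sketch: chance of catching at least one of n foul balls,
# assuming each ball is independent and lands with a random fan.

def chance_of_at_least_one(p_single, n_balls):
    """1 minus the chance of missing every ball."""
    return 1 - (1 - p_single) ** n_balls


p_one_ball = 1 / 10000   # 10,000 fans in range of a foul ball
balls_per_game = 30      # balls hit into the stands per game

p_game = chance_of_at_least_one(p_one_ball, balls_per_game)
print(f"about 1 in {1 / p_game:.0f}")
```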

Now your odds of catching 2 consecutive foul balls in a game are considerably worse and we’re going to assume that both these foul balls are catchable. (Dave Studeman in his evaluation does not assume that and that’s a major difference.) Catching two consecutive foul balls would be (1/10000)^2, which is 1 in 100,000,000. But you have 30 chances, so the odds are 1-((99999999/100000000)^30), which is about 1 in 3,333,333.
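With probabilities this tiny, ordinary floating point subtraction starts to lose digits, so this sketch uses `math.log1p` and `math.expm1` to compute the same expression. It follows the calculation above, including treating the game as 30 chances; the variable names are made up.

```python
import math

# Rough sketch: chance of catching two consecutive foul balls in a game,
# following the post's assumptions (10,000 fans, 30 chances per game).

p_one_ball = 1 / 10000
p_two_in_a_row = p_one_ball ** 2   # 1 in 100,000,000
chances_per_game = 30

# 1 - (1 - p)^n, written with log1p/expm1 so tiny values keep precision.
p_game = -math.expm1(chances_per_game * math.log1p(-p_two_in_a_row))
print(f"about 1 in {1 / p_game:,.0f}")
```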

Those are your odds of catching two foul balls in a row at any one particular game if all things were completely random.

Update: The Numbers Guy over at the Wall Street Journal did a piece on this earlier in the week, and Carl Bialik (The Numbers Guy) is the one who sent me the initial e-mail.