Archive for Research

Predicting 2012’s Strikeout Improvements

If heteroscedasticity lasts longer than three hours, consult your physician immediately.

“Honey, I think I’ve got heteroscedasticity,” I said to my wife when she walked in the door. As a writer who works at home, I spend the majority of my time locked away in my windowless home office, concocting ways to frighten my dear wife who works all day.

“And it’s ruining my spreadsheets,” I finally added, after she had stood wide-eyed and wordless for a few moments.

On Tuesday, we examined the fantastic and bizarre case of rookie right-hander Jeremy Hellickson, whose high swinging-strike rate has not translated into an equally high strikeout rate (K%). Today, let’s expand the scope of that investigation.

Read the rest of this entry »


Umpires of the LDS

The list of umpires scheduled for the LDS has been released. As much as they should not be a factor in the games, several of their decisions will ultimately be scrutinized this postseason. The following is a look at which umpire strike zones are most likely to get noticed and affect the game.

I am not going to get into any discussion of whether the umpires and their strike zones are good or bad. They are their own individuals. The more I look into the subject, the more I think the differences can be some of the 2% that can be exploited to gain an advantage over other teams.

At the beginning of the season, I rated which of the umpires are the most hitter and pitcher friendly. Here is a look at each umpire, their rating and what series and game, for now, they are to umpire. I know there are only five games, but I included the last umpire in case there are any changes. The umpires at the top of the list are more hitter friendly and those at the bottom are more pitcher friendly:

Read the rest of this entry »


Crowdsourcing: Stadium Shadows

A couple days ago, Tom Tango did a study to show how hitters produced at different times of the day. One possible problem with the data is that shadows from the stadium could be skewing it. We would like to know whether, at any time of the day, there is a shadow between the pitcher and the hitter, and from what time to what time the shadow stays between the two combatants. Finally, if anyone knows the information on any closed stadium, that information would be helpful.

With this information, it can be seen if the shadow has an effect on the hitter-pitcher match-up. I would expect the number of strikeouts to increase as the batter would have a harder time picking up the ball, but it would be nice to put a number to the theory. Thanks for your time and I am sure someone will jump right in for the data on Tropicana Field.


How Should We Measure Power?

What exactly is “power”? Is it the ability to hit home runs? Doubles? Triples? Should we consider how far a player hits a ball, or are we just concerned with the outcome? How would you define it?

If we were to try and define power from the ground up, obviously you’d have to start with home runs. Power hitters are guys that mash lots of home runs, right? When I think power, I think of players like Jose Bautista, Babe Ruth, Hank Aaron, and Barry Bonds. Home runs are so flashy, they steal the show.

But there’s more to power than a player’s raw home run total. You can’t completely ignore other extra base hits, which is why there are statistics like Slugging Percentage and Isolated Power. Slugging Percentage measures a player’s total bases and Isolated Power measures a player’s extra bases*, so both statistics count doubles and triples as well as home runs.

*Quick refresher course for everyone. Slugging Percentage = Total Bases / At Bats ; Isolated Power = Extra Bases / At Bats

Or if you prefer to think about it another way, Jose Bautista has a .330 ISO this season. That means he averages nearly one extra base every three at bats. 
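For anyone who would rather see the arithmetic spelled out, here is a minimal sketch of the two formulas in Python. The function names and the stat line are made up for illustration (this is not any real player's season); the only things taken from the definitions above are Total Bases / At Bats and Extra Bases / At Bats.

```python
def slugging(singles, doubles, triples, home_runs, at_bats):
    """Slugging Percentage = Total Bases / At Bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def isolated_power(doubles, triples, home_runs, at_bats):
    """Isolated Power = Extra Bases / At Bats.

    Extra bases are the bases beyond first: a double adds one,
    a triple adds two, a home run adds three.
    """
    extra_bases = doubles + 2 * triples + 3 * home_runs
    return extra_bases / at_bats


# Hypothetical stat line (an assumption, not real data):
# 100 at bats, 15 singles, 6 doubles, 1 triple, 8 home runs.
slg = slugging(15, 6, 1, 8, 100)
iso = isolated_power(6, 1, 8, 100)
avg = (15 + 6 + 1 + 8) / 100

print(f"SLG: {slg:.3f}")              # SLG: 0.620
print(f"ISO: {iso:.3f}")              # ISO: 0.320
print(f"SLG - AVG: {slg - avg:.3f}")  # SLG - AVG: 0.320
```

The last line is just a sanity check: ISO works out to Slugging Percentage minus batting average, since every hit contributes exactly one base to both.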

Both of these stats have the same problem, though: not all bases are created equal. If a player has accumulated 30 extra bases in 100 at bats, isn’t there a big difference if those extra bases were accumulated through 10 home runs versus 30 doubles? Both players have the same Isolated Power, but which one has provided their team with more value through their power production?

Good question, I’m glad you asked.

Read the rest of this entry »


Fielding Independent Batting, A ShH Revolution

We have discovered something marvelous! We only need to know four simple measures to predict what a hitter can do: their walk rate, strikeout rate, home run rate and BABIP.

For weeks now, we’ve been exploring and playing with my initial discovery, and now we’re to the point where I feel comfortable calling it Fielding Independent Batting (FIB) — an homage to Fielding Independent Pitching (FIP), which predates it and partly inspired it.

Read the rest of this entry »


Five-Tool Players by the (Nerdiest Possible) Numbers

If there’s still such a thing as newsstands anymore, the issue of Baseball America at your local one (i.e. your local newsstand) is that publication’s annual “Tools” edition. No, it’s not (as you might suspect from the title) an issue dedicated entirely to relief pitchers with questionable taste in facial hair. Rather, it’s in this edition of the magazine that the editors of Baseball America attempt to isolate the players (major- and minor-leaguers) with the best baseballing tools (hitting for average, hitting for power, speed, etc.).

Beyond the results of a survey to which each of the league’s 30 managers responded, the issue also includes an attempt by author Matt Eddy to find five-tool players “by the numbers” (subscription required, I think).

After a brief discussion of what a “plus” tool might look like when quantified, and also some notes on the obvious limits of such an endeavor, Eddy suggests this as a methodology:

For the sake of this exercise, let’s identify an above-average hitter as one who bats at least .285/.360/.460 with an isolated power of .175. That’s a 110 percent bump across the board (and then rounded down slightly to please the eye).

To this, Eddy also adds a speed component (more than 20 stolen bases) and runs his criteria through Baseball Reference’s Play Index for all player seasons 2000-10, the results of which you can find here. (Note: it appears as though Eddy’s power criterion in that search is actually 20-plus homers and not a floor of a .175 ISO, but the results come out similarly.)
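As a rough sketch of what that kind of search amounts to, here is the filter written out in Python. To be clear, this is not Eddy's actual query or anything to do with the Play Index itself; the `Season` class, the `meets_criteria` function and the three player seasons are all hypothetical, and only the thresholds (.285/.360/.460, a .175 ISO, more than 20 steals) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Season:
    """One player season (hypothetical structure, for illustration only)."""
    player: str
    avg: float
    obp: float
    slg: float
    stolen_bases: int

    @property
    def iso(self):
        # Isolated power is slugging minus batting average;
        # rounded to three places to sidestep float noise.
        return round(self.slg - self.avg, 3)


def meets_criteria(s):
    """Apply the thresholds described in the text to one season."""
    return (
        s.avg >= 0.285
        and s.obp >= 0.360
        and s.slg >= 0.460
        and s.iso >= 0.175
        and s.stolen_bases > 20  # "more than 20 stolen bases"
    )


# Made-up seasons, not real players or real stat lines.
seasons = [
    Season("Player A", 0.300, 0.400, 0.500, 25),  # clears every bar
    Season("Player B", 0.310, 0.370, 0.450, 30),  # falls short on SLG
    Season("Player C", 0.290, 0.380, 0.520, 10),  # falls short on speed
]

qualifiers = [s.player for s in seasons if meets_criteria(s)]
print(qualifiers)  # ['Player A']
```

Swapping the ISO floor for a 20-homer floor, as the search apparently did, would mean adding a home run field and changing one line of the filter.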

The big winner using this methodology is Bobby Abreu, who meets all of Eddy’s criteria in seven of 11 possible seasons. Hanley Ramirez qualifies in four seasons, while Alex Rodriguez finishes third with three “five-tool” seasons.

Eddy’s experiment is an interesting one, both in and of itself and for its potential to be nerd-ified, which, this being FanGraphs, is what I’ve endeavored to do in what follows.

Read the rest of this entry »


This ShH Just Got Real!


Should Hit, or ShH — pronounced like: “Shh! Be quiet or the Nazis will hear!”

Last week, while rifling through the lump of cold numbers that is the 2011 season, I stumbled upon a self-illuminating chest of gold coins: A reliable, fielding independent hitting formula. Today, we’re going to take it to the next level and get nerdy up in this beach.

Before we proceed, let’s do some of the research I did not care to do the first time around. Here are some of Should Hit’s predecessors (though they did not directly influence the creation of ShH):

FIP for Hitters
In 2010, Matt Klaassen wrote “FIP for Hitters? Defense Independent Offense.” His work in the article, though it does stay truly defense independent and does not bring BABIP into the conversation, probably mirrors my work the most. However, his work is different both in process (he excludes BABIP and does not use wRC+) and intention (my tool focuses on regression, not really true talent levels).
Read the rest of this entry »


Defensive Independent Hitting, Or ShH

Maybe there is a better way to predict how well a hitter is doing? Rather than glancing at his OBP and SLG and OPS or his wOBA and wRC+ and then mentally calibrating that number according to an inflated or deflated BABIP, maybe we can find a simple means of combining the key elements into a single formula.

Well, I believe I have stumbled onto just such a formula.

Th’other day, when I was trying to solve the mystery of the Tampa Bay Rays and their utterly broken run expectancy chart, I began ruminating about the relationship between walks, strikeouts, and an ability to create runs. You see, the Rays tend towards true outcomes: lotsa walks, lotsa strikeouts. So, for some strange reason (be it bad luck or bad hitter-type chemistry), the Rays seem to have an inability to reach a standard run expectancy with the bases loaded.

Anyway, I began to investigate this trifle and produced an interesting comparison:


Read the rest of this entry »


Return on Over-Slot Signings (Part 2)

Earlier, we found that teams do indeed pay a premium to acquire early-round talent in the later rounds of the draft. (I would suggest reading the first part of the study before you read this, so that you are caught up on the methodology and some of the terminology.) Today, we’ll look at over-slot signees selected from the supplemental round through the tenth round, differences between position players and pitchers, and differences between high school and college players signed to over-slot deals.

While teams had to pay a fairly significant premium to acquire early-round talent in the later rounds, teams appear to have gotten solid, even good value signing players to over-slot bonuses in the supplemental round through the tenth round of the draft.

Read the rest of this entry »


Return on Over-Slot Signees in Amateur Draft

Since its creation, the August 15th deadline for teams to sign drafted players has become one of the most important days on the baseball calendar. Interspersed among the headlines of which early picks have signed are the reports of players drafted in later rounds, sometimes even outside of the first ten rounds, signing six- and seven-figure deals. These ‘over-slot’ signees are a bit of a mystery. Few outside of the scouting industry know much of anything about these players, and it’s difficult to judge the value of an over-slot signee relative to a team’s other draft picks. Is the 12th rounder who signed for $500,000 a better prospect than the third rounder who signed for $400,000? What type of premium do teams pay to sign players to over-slot deals? How much does it cost to sign a ‘second-round’ talent late in the draft? Can it even be done? These are the questions I sought to shed light on.

Unfortunately, reliable bonus data on players who sign late in the draft is tough to find before 2005, so our sample will be restricted to the 2005-2009 drafts. Because many of these players are just starting their major league careers or are still playing in the minors, we can’t know precisely how much value each draftee will ultimately provide, but using Erik Manning’s translation of Victor Wang’s research, we can come up with a fairly accurate approximation of each player’s value based on how they ranked on Baseball America’s Top 100 lists and in John Sickels’ team-by-team rankings.

Read the rest of this entry »