Some Thoughts for the New Year

January 4, 2010

This past week, while a lot of us were on hiatus, there was a good bit of discussion in the blogosphere about the role stats play in baseball. This eventually led to “The Mike Silva Chronicles” posts over at insidethebook.com, in which Mike Silva asked 10 questions and Tangotiger answered them. They’re without a doubt worth reading.

One of these posts in particular had to do with “Stats Saturation” in which Mike Silva asked:

Do you believe the advanced metric community is saturating the market with stats, to the point where progress that was initially made with the evolution of traditional stats is now minimal at best. Wouldn’t it be wiser to allow some of the current metrics to gain acceptance before you “advance the current advanced metrics” (corny phrase so to speak).

FanGraphs has been around for almost five years, and this question is frequently on my mind.

For the most part, FanGraphs does not create new statistics. It’s more of an aggregation of what I consider the best publicly available sabermetric work on the internet from the minds of people like Tom Tango, Mitchel Lichtman, Dan Szymborski, Dave Studeman, Sean Smith, with contributions from many others including our own staff.

There are certainly a lot of stats available on FanGraphs. By my count there are well over 100 different metrics available for batters and pitchers, generally broken into logical sections that make them a bit easier to handle. Some of these stats have caught on more in the mainstream, while others I’m sure will continue to be used only among the sabermetric crowd.

With so many stats, I can certainly understand the question of whether adding more is becoming overkill. The truth of the matter is that there really are different categories of stats, and even those who are not particularly interested in sabermetrics are very much interested in scouting statistics.

Four of the stat sections on FanGraphs are devoted entirely to scouting statistics. These include the Pitch Types, Plate Discipline, and Batted Ball sections. The data in these sections is based directly on what Baseball Info Solutions scouts see. There are no fancy calculations and everything is fairly intuitive. Was the ball in the strike zone or out of the strike zone? How fast was the pitch and what type of pitch was it? Is there room for some disagreement on these? Of course. There’s always going to be some disagreement between scouts, but you really can’t blame the formulas.

Which brings us to the stats which are based on formulas. Many of the metrics on FanGraphs are rooted in “linear weights”. wOBA, wRC, wRAA, wRC+, FIP, and the Batting component of our Value section are all entirely linear weight based. These all use linear weights to measure different things, or in some instances the same thing expressed in a different way. But at their heart they’re all measuring bucketed stats in runs.
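To make the idea of "bucketed stats in runs" concrete, here is a minimal sketch of a linear weights calculation. The run values below are rough, illustrative numbers chosen for this example, not the values FanGraphs actually uses (real weights are recalculated for each season), and the function and variable names are hypothetical.

```python
# Illustrative run values per event, relative to average.
# ASSUMPTION: these are rough example numbers, not FanGraphs' actual weights.
ILLUSTRATIVE_RUN_VALUES = {
    "1B": 0.47,
    "2B": 0.77,
    "3B": 1.04,
    "HR": 1.40,
    "BB": 0.31,
    "HBP": 0.34,
    "OUT": -0.27,
}


def linear_weight_runs(counts, run_values):
    """Sum each event count times its run value to get total runs."""
    return sum(run_values[event] * n for event, n in counts.items())


def runs_per_plate_appearance(counts, run_values):
    """Express the same total as a rate over all bucketed events."""
    plate_appearances = sum(counts.values())
    return linear_weight_runs(counts, run_values) / plate_appearances


# A made-up batting line, bucketed by event type.
example_season = {
    "1B": 100, "2B": 30, "3B": 5, "HR": 20,
    "BB": 60, "HBP": 5, "OUT": 400,
}

total = linear_weight_runs(example_season, ILLUSTRATIVE_RUN_VALUES)
rate = runs_per_plate_appearance(example_season, ILLUSTRATIVE_RUN_VALUES)
print(f"Runs above average: {total:.1f}")
print(f"Runs per plate appearance: {rate:.4f}")
```

The two functions measure the same thing expressed two ways, a counting total and a rate, which is roughly the relationship between stats like wRAA and wOBA, though the real formulas add scaling and league adjustments that this sketch leaves out.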

Now neither scouting stats nor linear weight based stats are, in my mind, particularly controversial. There are, however, controversies over how linear weights are adjusted, such as by park or by league. And when scouting stats and linear weights are combined, it seems to be a particularly hot button issue. UZR, for instance, starts with scouting data (where the ball landed, how hard it was hit, its trajectory), turns that data into linear weights depending on how the ball was classified, and then applies adjustments.

UZR and the Pitch Type values are really statistical scouting. They’re all about putting a value on what you see.

So getting back to Mike Silva’s question of whether it’s better to wait for acceptance before introducing new stats, I’ll say it depends. The standard linear weight based stats on FanGraphs aren’t going anywhere. I really don’t think it’s worth introducing an entirely new named set of statistics for what might be a very small increase in accuracy. The current ones may from time to time be tweaked slightly, but they’ll remain under the same familiar names and will represent the same things.

Though when it comes to stats that really do bring a new point of view, whether it be with new scouting data, or drastically different models, then I’ll say they’re welcome at FanGraphs. The other type of stat that I’m not opposed to adding is one that makes an existing model more accessible, which I felt was the case with wRC+.

With all that said, I’d just like to throw out a quick reminder for the new year about thinking. Please take the time to understand the stat you’re using before you use it in an argument and before you criticize it. It’s best when the stats available on this site are used in thoughtful, open-minded discussions that enhance your knowledge and, most importantly, your enjoyment of the game of baseball.

15 Comments

MLT

15 years ago

Great post, especially for us noobs still wading through the mounting heap of acronyms in the baseball world. I think the reason I’m drawn to places like this is that stat development and projection models seem like real bright spots in the midst of an otherwise phony era. I don’t think it’s coincidence that steroids and SABR are blowing up around the same time… Though I don’t have any numbers to back it up, I think fans will always be looking for authenticity in the game.

Juliana

Reply to MLT

Interesting observation, MLT. I suppose sabermetric analysis restores some of the historical continuity that may have been lost for some as a result of the Steroid Era. That’s not to say advanced metrics can account for PED-enhanced performance or determine who’s using and who isn’t (at least not yet!) but they provide more of a historical (warning: cliche to be used ahead) “level playing field” when it comes to side-by-side analysis of a player’s value.

While the idea of “baseball as a living narrative” has emanated from nostalgia and memory for some, advanced stats allow everyone to view that narrative as based in logic and objectivity instead. I think the decision, whether conscious or not, to embrace sabermetric analysis depends on what is more valuable to you: nostalgia (which can be nebulous) or historical objectivity. I just want to have both…