Player Variance by Run Environment
I’ll apologize in advance, because I don’t draw any stunning conclusions in this post. I am, however, going to present the data from my most recent toilings with baseball data. I was reading Wendy Thurm’s most recent article, and I noticed a commenter who pointed out that offensive output across positions has been more similar, and closer to the overall league average, in the last few years than over the past 20 years. My immediate reaction was to blame or credit the low run environment. My thinking is that a higher run environment would produce more random variation for each player, which in turn would produce more variance across the entire league.
Many factors besides run environment affect the talent distribution, such as training, performance-enhancing drugs, expansion, wars, and talent evaluation. In this quick exploration I was not able to take them into account, so the following data visualizations are exploratory rather than conclusive.
I found the variance of a handful of offensive stats among players with more than 500 PA and plotted it against the run environment. I included the seasons from 1900 to 2014 for the NL and 1901 to 2014 for the AL, but I removed the 1981 and 1994 seasons owing to the strikes in those years. You can change the stat and the league. I also included a histogram to show the shape of the distribution of the stat for a particular year.
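For anyone who wants to reproduce the filtering step, here is a minimal sketch of how the per-season variance could be computed. The row layout, the made-up numbers, and the names `variance_by_season`, `STRIKE_SEASONS`, and `MIN_PA` are my own illustrative assumptions, and so is the choice of population variance rather than sample variance; this is not necessarily the exact calculation behind the plots.

```python
from collections import defaultdict
from statistics import pvariance

# Hypothetical player-season rows: (season, league, plate_appearances, stat_value).
# These numbers are invented for illustration; they are not real player data.
rows = [
    (2000, "AL", 650, 0.380),
    (2000, "AL", 610, 0.340),
    (2000, "AL", 120, 0.450),  # dropped: not more than 500 PA
    (2000, "AL", 540, 0.300),
    (2014, "AL", 600, 0.340),
    (2014, "AL", 580, 0.320),
    (2014, "AL", 505, 0.310),
    (1994, "AL", 520, 0.400),  # dropped: strike season
]

STRIKE_SEASONS = {1981, 1994}  # seasons excluded from the analysis
MIN_PA = 500                   # players need more than this many PA


def variance_by_season(rows, league, min_pa=MIN_PA):
    """Return {season: variance of the stat} for qualifying players in one league."""
    grouped = defaultdict(list)
    for season, lg, pa, value in rows:
        if lg == league and pa > min_pa and season not in STRIKE_SEASONS:
            grouped[season].append(value)
    # Population variance (an assumption); seasons with fewer than two
    # qualifying players are skipped.
    return {s: pvariance(v) for s, v in sorted(grouped.items()) if len(v) >= 2}


al_variance = variance_by_season(rows, "AL")
```

The resulting dictionary maps each remaining season to a single variance figure, which is the quantity that would be plotted against that season's run environment.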
The time series plots point to a correlation between a high run environment and high variance among players. This has been especially true over the last several years, in which the variance of most stats has decreased as the run environment has decreased.
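I only eyeballed the relationship from the plots, but one way to put a number on it would be a Pearson correlation between the yearly run environment and the yearly variance. The sketch below assumes runs per game as the measure of run environment, and both lists are invented values used only to show the calculation; `pearson` is a hypothetical helper, not something from the original analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up seasons: runs per game and the variance of some stat in each season.
runs_per_game = [4.1, 4.4, 4.8, 5.1, 4.6]
stat_variance = [0.0009, 0.0010, 0.0013, 0.0015, 0.0012]

r = pearson(runs_per_game, stat_variance)
```

A value of `r` near 1 would support the pattern seen in the plots, though with all the confounders mentioned earlier it would still say nothing about cause.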
The histograms are interesting as well if you want to see how talent has been distributed. Overall, baseball talent is not normally distributed. It does become somewhat normal when you increase the plate appearance (PA) limit. I used only players over 500 PA in a given year, so most of these histograms (particularly BABIP) are roughly normal. Before applying the PA filter, I saw a lot of right-skewed distributions, particularly for WAR.
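To check the shape of a distribution without squinting at a histogram, one option is the skewness (the third standardized moment): positive values indicate a right skew, and values near zero indicate a roughly symmetric shape. The sketch below uses invented (PA, WAR) pairs chosen to exaggerate the effect, and `skewness` is a hypothetical helper rather than anything I used for the visualizations.

```python
def skewness(values):
    """Population skewness: third central moment over variance to the 3/2 power."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up (plate_appearances, WAR) pairs: many part-timers near zero WAR,
# plus a handful of regulars spread across higher values.
players = [
    (40, 0.0), (75, 0.1), (110, -0.1), (150, 0.2), (60, 0.0),
    (220, 0.3), (90, 0.1),
    (510, 1.0), (560, 2.0), (600, 3.0), (640, 4.0), (690, 5.0),
]

all_war = [war for _, war in players]
regular_war = [war for pa, war in players if pa > 500]

skew_all = skewness(all_war)          # positive: long right tail
skew_regulars = skewness(regular_war)  # zero here: the toy regulars are symmetric
```

Real data will not be as tidy as this toy sample, but the comparison of the two skewness values mirrors what happens in the histograms when the PA filter is applied.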