Player Variance by Run Environment
I’ll apologize in advance, because I don’t draw any stunning conclusions in this post. I am, however, going to present the results of my most recent toilings with baseball data. I was reading Wendy Thurm’s most recent article, and I noticed a commenter who pointed out that offensive output has been more similar across positions, and closer to the overall league average, in the last few years than over the past 20 years. My immediate reaction was to blame or credit the low run environment. My thinking is that a higher run environment would produce more random variation for each player, which in turn would produce more variance across the entire league.
Aside from run environment, a multitude of factors could affect the talent distribution, such as training, performance-enhancing drugs, expansion, wars, and talent evaluation. In my quick exploration, I was not able to take these into account, so the following data visualizations serve as exploratory analysis rather than anything conclusive.
I computed the variance of a handful of offensive stats among players with more than 500 PA in each season and plotted it against the run environment. I included the seasons from 1900 to 2014 for the NL and 1901 to 2014 for the AL, but I removed the 1981 and 1994 seasons owing to the strikes in those years. You can change the stat and the league. I also included a histogram to show the shape of the distribution of the stat for a particular year.
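For anyone who wants to try something similar, here is a minimal Python sketch of the filtering and variance step described above. It is not the code behind the visualization. The record layout (`season`, `league`, `pa`), the function name `variance_by_season`, the choice of `woba` as the example stat, and the use of sample variance are all assumptions for illustration, and the numbers are made up.

```python
from collections import defaultdict
from statistics import variance

STRIKE_SEASONS = {1981, 1994}  # seasons dropped because of the strikes
MIN_PA = 500                   # keep players with more than 500 PA


def variance_by_season(rows, stat):
    """Return {(season, league): variance of `stat`} among qualifying players.

    `rows` is a list of dicts with hypothetical keys "season", "league",
    "pa", and the stat of interest. Sample variance is an assumption here;
    swap in statistics.pvariance for population variance.
    """
    groups = defaultdict(list)
    for row in rows:
        if row["season"] in STRIKE_SEASONS:
            continue  # skip strike-shortened seasons
        if row["pa"] <= MIN_PA:
            continue  # skip players without enough plate appearances
        groups[(row["season"], row["league"])].append(row[stat])
    # Sample variance needs at least two players in a group.
    return {key: variance(vals) for key, vals in groups.items() if len(vals) >= 2}


# Made-up rows, purely to show the shape of the input.
rows = [
    {"season": 2014, "league": "AL", "pa": 600, "woba": 0.300},
    {"season": 2014, "league": "AL", "pa": 650, "woba": 0.340},
    {"season": 2014, "league": "AL", "pa": 510, "woba": 0.380},
    {"season": 2014, "league": "AL", "pa": 200, "woba": 0.450},  # too few PA
    {"season": 1994, "league": "AL", "pa": 520, "woba": 0.400},  # strike season
    {"season": 1994, "league": "AL", "pa": 530, "woba": 0.300},  # strike season
]
result = variance_by_season(rows, "woba")
```

Each resulting value would then be paired with that league-season's run environment to make one point on the scatter plot.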