If you’re a regular reader, you may have come across my previous article on the limitations of the StatCast batted-ball data. The data set is incomplete: it generated no velocity reading on just over 25% of batted balls. That wouldn’t be a huge deal if that substantial number of missed readings were randomly dispersed across BIP types, but they are not. Weakly hit balls are being missed at a much higher rate: over 56% of batted balls classified as popups were missed, as were over 28% of grounders, likely mostly of the weakly hit variety.
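For readers who want to reproduce this kind of check on their own download, the missed-reading rates boil down to a tally per batted-ball type. The snippet below is a minimal sketch: the record layout, the function name `missed_rate_by_type` and the toy records are all hypothetical, not the format of the actual StatCast export.

```python
from collections import Counter

def missed_rate_by_type(batted_balls):
    """batted_balls: iterable of (bip_type, exit_velocity_or_None) pairs.

    Returns {bip_type: share of balls with no velocity reading}.
    """
    total = Counter()
    missed = Counter()
    for bip_type, velocity in batted_balls:
        total[bip_type] += 1
        if velocity is None:  # no reading was captured for this ball
            missed[bip_type] += 1
    return {t: missed[t] / total[t] for t in total}

# Made-up records, for illustration only.
toy = [("POP", None), ("POP", 71.0), ("FLY", 98.4), ("FLY", 91.2),
       ("FLY", None), ("FLY", 88.0)]
rates = missed_rate_by_type(toy)
```

If the rates differ sharply by type, as they do here, the missing readings are not random and any analysis has to account for that.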
This limits the amount of detailed analysis that can be done, but it doesn’t eliminate it. Since the missed reading rate on fly balls (18%) and liners (17%) is much more manageable — and random — analysis of these two groups might yield some useful results. Today, let’s take a crack at calculating some park factors for the first half of the 2015 season.
First, a refresher on the sample that I used to test the StatCast data. During the All-Star break, I downloaded all of the granular batted-ball data available for each pitcher who qualified for the ERA title as of that date. This encompassed 98 starting pitchers and 31,557 batted balls allowed.
To determine whether this would indeed be a representative sample of the whole, I compared the batting average (AVG) and slugging percentage (SLG) of my sample to the MLB averages at the break; my sample’s .317 AVG and .494 SLG were a very close match to the MLB AVG and SLG on balls put in play through that date.
In 2014, I had full access to the pre-StatCast HITf/x data, which captured exit velocity/angle data for just over 95% of batted balls, a much higher rate. With that information, I was able to compile fairly detailed park factors. Basically, I put balls in play (BIP) in different “buckets” by exit velocity and horizontal and vertical angle, and compared each batted ball in each ballpark to the MLB average production for that “bucket.” Run values were applied to each event, scaled to 100, and voila, park factors.
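The bucket approach described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the exact method behind the numbers in this article: it buckets only by batted-ball type and 5-mph exit-velocity bands (no angles), it uses total bases as a stand-in for run values, and the record layout and function names are hypothetical.

```python
from collections import defaultdict

def velocity_bucket(mph, width=5):
    """Return the lower edge of the exit-velocity band, e.g. 97.3 -> 95."""
    return int(mph // width) * width

def park_factors(batted_balls):
    """batted_balls: list of (park, bip_type, mph, value) tuples.

    `value` stands in for the run value of the event (assumption: total bases).
    Returns {park: factor}; 100 means production matched the league-average
    expectation for the same mix of buckets.
    """
    # League-average value per (type, velocity band) bucket.
    totals = defaultdict(lambda: [0.0, 0])
    for park, bip_type, mph, value in batted_balls:
        key = (bip_type, velocity_bucket(mph))
        totals[key][0] += value
        totals[key][1] += 1
    league_avg = {k: s / n for k, (s, n) in totals.items()}

    # Compare each park's actual production to what its batted balls
    # would have produced at the league-average rate for their buckets.
    actual = defaultdict(float)
    expected = defaultdict(float)
    for park, bip_type, mph, value in batted_balls:
        actual[park] += value
        expected[park] += league_avg[(bip_type, velocity_bucket(mph))]

    return {p: 100.0 * actual[p] / expected[p] for p in actual if expected[p] > 0}

# Made-up batted balls in two placeholder parks, for illustration only.
toy = [("A", "FLY", 101, 4), ("B", "FLY", 102, 0),
       ("A", "FLY", 96, 0), ("B", "FLY", 97, 0)]
factors = park_factors(toy)
```

In the toy data, park A turned its 100-104 mph fly ball into a homer while park B got nothing from the same bucket, so A grades out well above 100 and B well below it.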
This year, it’s not that easy. Even the data that is publicly available lacks horizontal and vertical angle information, limiting its usefulness to an extent. This creates one additional complicating factor: prior to 2015, I classified every batted ball with an exit angle from 20 to 50 degrees as a fly ball, and every one over 50 degrees as a popup. This correlated very closely to box-score and major-website categorizations. In 2015, without the vertical angle information, one must rely on the StatCast classification of the play. StatCast classified as line drives many batted balls that would be considered fly balls by almost any measure. Of all the batted balls with a velocity reading through the All-Star break, StatCast categorized 29.0% of them as line drives. That is way too many.
This makes the calculation of separate fly-ball and line-drive park factors a bit dicey. The fly-ball sample is relatively small, and subject to significant fluctuation. The line-drive park factors are also significantly influenced by the additional fly balls within the sample. I concluded that the best course of action would be to calculate a combined fly-ball/line-drive park factor in addition to the overall, fly-ball and line-drive park factors. I also went back to 2014, and combined the fly-ball and liner park factors I had previously calculated. The table below lists each park’s overall, fly-ball, line-drive and combined FLY/LD park factors.
|PK FCT||15 ALL||15 FLY||15 LD||15 FLY/LD||14 ALL||14 FLY||14 LD||14 FLY/LD|
The table is listed in descending order by 2015 FLY/LD park factor. Ground-ball park factors were also calculated for both seasons, but are not listed.
To assess the reliability of the 2015 park factors, I calculated their correlation coefficients with the 2014 park factors. As I expected, there was virtually no correlation between the seasonal line-drive park factors; there is an awful lot of randomness to relative performance on line drives. If a fielder happens to be in the way, it’s caught. The overall park factors had a 0.39 correlation coefficient. The most closely correlated were the fly-ball (0.48) and FLY/LD (0.47) park factors. That gave me some comfort that, despite the limited sample sizes and the classification issues, these park factors possess some merit.
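For anyone who wants to run the same year-to-year check, here is a minimal sketch. I am assuming the standard Pearson correlation coefficient, computed across parks on each park's factor in the two seasons; the function name and the three-park toy values are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of park factors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either list has no variation at all.
    return cov / math.sqrt(var_x * var_y)

# Made-up factors for three placeholder parks, same park order in both lists.
factors_2014 = [90.0, 100.0, 110.0]
factors_2015 = [95.0, 100.0, 105.0]
r = pearson(factors_2014, factors_2015)
```

The toy parks keep the same ordering and spacing in both seasons, so they correlate perfectly; real park factors, with all their noise, land well short of that.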
By their very nature, park factors are quite variable. Traditional park factors, which do not incorporate batted-ball data, are usually averaged over multiple years to tone down the noise. The introduction of batted-ball data into the equation adds important context to the data being evaluated, and arguably makes a one-year sample reasonably reliable.
Taking a closer look at the table above, 23 of 30 clubs had FLY/LD park factors on the same side of 100 in both 2014 and 2015. In addition, for 23 of 30 clubs, the same one of the two factors (FLY or LD) was the higher in both 2014 and 2015. There are hitter’s parks which inflate production on both fly balls and liners, and there are pitcher’s parks which deflate production on both. In addition, there are parks which inflate/deflate production on fly balls much more so than on liners. In other words, there are ballparks that are particularly fly-ball or line-drive friendly, or even ballparks that are more conducive to relatively high or low fly balls. Many batted balls fitting into the latter category (low fly balls) are classified as liners by StatCast. This is quite important when you are evaluating a player for acquisition; a high exit angle guy like, say, Chris Carter is much more productive in Houston than he would be in, say, St. Louis, where line-drive power is of much greater relative value.
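The "same side of 100" count is a simple consistency check, sketched below. The function name, the dictionary layout and the placeholder parks are hypothetical, and a park sitting at exactly 100 in either year is treated as being on neither side.

```python
def same_side_of_100(factors_2014, factors_2015):
    """Both arguments: {park: park factor}.

    Returns the parks that were above 100 in both years or below 100
    in both years, in alphabetical order.
    """
    shared = factors_2014.keys() & factors_2015.keys()
    return sorted(
        park for park in shared
        # The product is positive only when both deviations share a sign.
        if (factors_2014[park] - 100) * (factors_2015[park] - 100) > 0
    )

# Made-up FLY/LD factors for three placeholder parks.
y14 = {"A": 110.0, "B": 92.0, "C": 104.0}
y15 = {"A": 125.0, "B": 97.0, "C": 96.0}
consistent = same_side_of_100(y14, y15)
```

Here parks A and B stay on the same side of average in both seasons, while C flips from hitter-friendly to pitcher-friendly.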
Let’s take a look at some ballparks that fit into some of these broad categories:
High-Fly-Ball Inflators: HOU, COL, NYY, CIN, LAD, CUB, MIN, MIL
These parks inflated fly-ball production much more than they did liner production in both 2014 and the first half of 2015. Coors Field, obviously, has altitude as its primary driver. A fly ball at 100-104 mph generated .547 AVG-1.919 SLG of production in MLB through the All-Star break; those balls inevitably leave the yard in Coors, yielding .818 AVG-3.091 SLG over that span. Interestingly, Dodger Stadium yielded almost the same production on 100-104 mph fly balls, at .833 AVG-3.167 SLG in the first half. Very quietly, hitting a ball out to the middle of the field isn’t very difficult at Chavez Ravine. If you hit a 100 mph fly ball at Miller Park, it’s likely to leave the yard; 83% of such fly balls in Milwaukee were homers, compared to 48% in all of MLB.
At Wrigley Field and Great American Ball Park, hitters get a better than average return on a 95-99 mph fly ball. MLB hitters batted .209 AVG-.652 SLG on such fly balls in the first half, but batted .257 AVG-.857 SLG and .250 AVG-.929 SLG on them in Chicago and Cincy, respectively. The return is even higher at Minute Maid, where 95-99 mph fly balls yielded a .242 AVG-.970 SLG line. As I wrote in my previous StatCast article, the fly ball “donut hole” or dead zone, now extends from 75-94 mph. In Houston, you can hit home runs, especially to left field, within that dead zone. Yankee Stadium, with its short porches down both lines, but especially to right field, is another place that will give you dead-zone homers. Less than 2% of 90-94 mph fly balls went over the wall through the break; in HOU and NYY, 6.7% and 7.5%, respectively, went for homers.
Believe it or not, neither HOU nor CIN is the best place to hit a 95-99 mph fly ball. That would be Target Field, which yielded .412 AVG-1.118 SLG within that bucket. This should probably be called the Brian Dozier Bucket, strategically located to his pull side. No player has matched his hitting style to what his ballpark gives him better than Dozier.
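The AVG-SLG lines and homer rates quoted for each bucket come from simple arithmetic on the batted balls in that bucket. Below is a minimal sketch, assuming every batted ball counts as one at-bat (sacrifice flies are not separated out); the function name and the ten toy outcomes are made up.

```python
def bucket_line(outcomes):
    """outcomes: total bases per batted ball (0 = out, 1 = single, 4 = homer).

    Assumption: each batted ball is one at-bat.
    Returns (AVG, SLG, home-run rate) for the bucket.
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("bucket is empty")
    hits = sum(1 for tb in outcomes if tb > 0)
    homers = sum(1 for tb in outcomes if tb == 4)
    return hits / n, sum(outcomes) / n, homers / n

# Ten made-up fly balls from one park in one velocity bucket.
toy_bucket = [4, 0, 0, 2, 0, 4, 0, 0, 1, 0]
avg, slg, hr_rate = bucket_line(toy_bucket)
```

With four hits, 11 total bases and two homers in ten balls, the toy bucket works out to .400 AVG-1.100 SLG with a 20% homer rate. Buckets this small swing wildly on one or two balls, which is worth keeping in mind with the single-park lines above.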
High-Fly-Ball Deflators: MIA, PHL, SEA, PIT, AZ, DET, OAK, ATL, STL
Marlins Park has supplanted Safeco Field as the leading fly-ball-killing park in 2014-15. Essentially, the fly-ball donut hole extends to the 95-99 mph bucket in Miami: hitters batted only .105 AVG-.316 SLG on such fly balls in the first half. Safeco’s fly-ball park factors have crept up a bit in the last couple of seasons, due first to a significant change in the dimensions of the park, especially to left-center and center field, and second, in 2015, to a very mild weather pattern in late spring/early summer. Talk about a dead zone, though: hitters batted .000 on fly balls between 80 and 94 mph at Safeco in the first half. Interestingly, both PNC Park and Turner Field have also yielded zero production on such fly balls through the break.
You have to hit a 100 mph fly ball to get anywhere at Busch Stadium; a 95-99 mph fly got you only .133 AVG-.400 SLG in the first half. Chase Field has a reputation as a hitters’ park, but on high fly balls at least, it is undeserved. Even in the 100-104 mph fly-ball bucket, hitters only managed .455 AVG-1.455 SLG. Comerica Park is a pretty close match for Chase, yielding .500 AVG-1.583 SLG on the same group of fly balls. The prize for lowest production within the 95-99 mph fly-ball bucket, however, goes to O.co Coliseum (.100 AVG-.250 SLG). OAK is not a pure fit in this category, as its results were much different in 2014, but that 35.6 fly-ball park factor for 2015 simply can’t be ignored.
Consistently Line-Drive/Low-Fly-Ball Friendly: COL, TOR
Only 6.0% of batted balls classified by StatCast as liners left the yard in the first half. At Coors, 9.5% of liners left the yard, including three in the 95-99 mph bucket. Rogers Centre gives Coors a run for its money with regard to liners, however: 9.2% of liners there left the yard, including four at 99 mph or lower.
Consistently Line-Drive/Low-Fly-Ball Unfriendly: OAK
Line-drive park factors, as you can see in the table above, have a much narrower range than fly-ball factors. O.co Coliseum stands out, with liner park factors of 83.8 and 91.5 in 2015 and 2014, respectively. In every bucket from 80 to 109 mph, liner production in OAK was much lower than the MLB average in the first half.
Friendly to Fly Balls and Line Drives: COL, NYY, TOR, BOS, CIN, CUB
Most of these have been discussed above. Overall, FLY, LD, and FLY/LD park factors are all above 100 for each of these parks in both 2014 and 2015. Fenway Park wasn’t listed with the high-fly-ball inflators above, as its FLY and LD factors are fairly equivalent in 2015. Fenway is an equal opportunity high- and low-fly-ball inflator, because of the Monster, which simply gets in the way of routine fly balls. Fenway’s SLG on fly balls isn’t very notable: it was .624 in the first half, compared to .637 with the same batted balls in a neutral park. Fenway’s actual AVG was .258, compared to a contextually adjusted .212. Plenty of would-be outs become doubles, but plenty of would-be homers become doubles as well.
Unfriendly to Fly Balls and Line Drives: MIA, OAK, KC, SEA
Again, most of these have been discussed above. Kauffman Stadium has stealthily limited offense of all kinds in recent years, a trend which has been exacerbated by the excellence of the Royals’ defense. Hard-hit balls just don’t get you far in KC. Production on 95+ mph fly balls and 100+ mph liners is well below MLB average in each bucket at Kauffman, usually by a significant margin.
U.S. Cellular Field has traditionally been a fly-ball-friendly but liner-unfriendly ballpark. This year, while its fly-ball park factor still exceeds its liner factor, both are down by a significant margin. It was a cool late spring/early summer in Chicago, but the weather didn’t seem to impact Wrigley.
Then there’s Petco Park. Would love to crowdsource some assistance on this one, as I’m out of ideas. In 2012 and 2014, Petco played as an extreme pitcher’s park. In 2013 and thus far in 2015, it’s played as an extreme hitter’s park. It can’t all be attributable to the Padres’ hideous current outfield defense, I would think. Is it some sort of even year marine layer thing? A harbinger of a near-term tectonic event? Statistical noise the equivalent of Led Zeppelin in their prime? Hit me up with your thoughts.