Archive for Research

Side by Side: Liriano and Hernandez

They say that 50 is the new 40, but I like to say that Francisco Liriano is the new Felix Hernandez. Seattle’s Hernandez made a splash in his major league debut last year, going 4-4 with a 2.67 ERA and displaying great velocity, control and the ability to keep the ball on the ground. Sadly, he has not performed as well in 2006, though his long-term potential remains intact.

This year, Minnesota rookie Liriano is even better than Hernandez was last year, with a 9-1 record and a 1.99 ERA. Like Hernandez, Liriano strikes out batters, doesn’t walk them and keeps the ball on the ground. But some of you may have forgotten that Liriano actually made his debut last year, when his ERA was very similar to Hernandez’s 2006 ERA (admittedly, in only 23.2 innings pitched). In fact, the two youngsters over the past two years form an “X” on this ERA graph:

ERA

Two pitchers, so similar in stuff, have been a study in contrasts over the past two years. Let’s take a closer, graphical look at some of their components to see if we can spot the key differences between them last year and this year.

Both pitchers are premier strikeout pitchers, though Liriano is particularly impressive. He leads the major leagues in strikeouts per nine innings and Hernandez is 14th.

K

Both pitchers also have great control — Liriano has particularly improved his control this year, a key to his great start…

BB

In general, pitchers have the most control over strikeouts and walks; after that, things start to break down a bit. For instance, a big difference between these two young studs in 2006 is the last of the “three true outcomes,” the home run. Hernandez and Liriano have formed another “X” on either side of the major league average, which partially accounts for Liriano’s improvement and Hernandez’s decline.

HR

Once a ball is hit in play, a pitcher is dependent on his fielders for help. Liriano’s fielding support has remained around the major-league average, but Hernandez’s hasn’t, as you can see in this graph of each pitcher’s Batting Average on Balls in Play.

BABIP

But the most dramatic difference between the two phenoms the past two years has taken place on the basepaths. Take a look at the percentage of baserunners each pitcher has left on base this year and last:

LOB%

Felix’s LOB% has gone down, and Liriano’s has gone WAY up. Now, this change isn’t entirely random; pitchers with a 1.99 ERA will almost always have an impressive LOB%. But it’s one more part of the equation for each pitcher, which might be summarized as follows:

Hernandez in 2006: more home runs, more hits falling in, more baserunners scoring.

Liriano in 2006: improved control, fewer home runs, fewer baserunners scoring.

These trends mean next to nothing regarding each one’s long-term promise. In fact, the most meaningful difference between the two is one I haven’t graphed: age. Hernandez is two-and-a-half years younger than Liriano, which makes his long-term career outlook a bit brighter.

Here are a couple of bonus graphs, the batted ball types for both pitchers. First up is Hernandez’s, and you can see that he hasn’t been as strong a groundball pitcher as he was last year (Note: the green line is the percent of batted balls that are groundballs, the blue line represents the flyball percentage and the red line represents the percentage that are line drives):

Hernanbballtype

Finally, here is Liriano’s graph. The key is that he is getting batters to pound the ball into the turf as often as Hernandez. In fact, the two kids rank fourth and fifth in highest groundball percentage among all league pitchers — an extremely impressive stat.

Liribballtype

Remember that combination: groundball pitchers with 95-100 mph fastballs and great control. You can’t beat it.


What’s Up with Mark Teixeira?

Tex ISO

Mark Teixeira hasn’t had a good first half. He’s batting .282 with an OBP of .366, figures that are very much in line with his performance the last three years. But he’s only hit eight home runs, compared to 38 and 43 the last two years. As a result, his power numbers are down. In the graph at left, you can see the key stat for Teixeira, his Isolated Power. Last year, his Slugging Percentage was fifth-best in the league. This year, his power has been just average.
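For readers who want the arithmetic, Isolated Power is conventionally slugging percentage minus batting average, which works out to extra bases per at-bat. Here is a minimal Python sketch of that calculation. The function name is my own, and the triples and at-bat totals in the usage line are made-up placeholders, not Teixeira’s actual 2006 figures.

```python
def isolated_power(doubles, triples, home_runs, at_bats):
    """Isolated Power (ISO) = SLG - AVG, i.e. extra bases per at-bat."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    # A double adds 1 extra base, a triple adds 2, a home run adds 3.
    extra_bases = doubles + 2 * triples + 3 * home_runs
    return extra_bases / at_bats


# 26 doubles and 8 home runs are quoted in this post; 1 triple and
# 300 at-bats are hypothetical values used only to make the example run.
example_iso = isolated_power(doubles=26, triples=1, home_runs=8, at_bats=300)
```

Because singles drop out of the subtraction, ISO moves only when extra-base hits do, which is why it isolates a power outage like this one.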

So, what happened? Well, the first thing to note is that Teixeira’s underlying profile hasn’t changed much at all. For instance, both his walk and strikeout rates are in line with career stats — if anything, he’s improved in these two areas.

graphs_1281_batter_season_4_full150225_20060625.png

graphs_1281_batter_season_5_full150225_20060625.png

When a player’s home run count drops suddenly, you might assume that his flyball rate has dropped, too. After all, pitchers’ home run rates are often a simple result of their flyball rates. However, batters don’t typically change their batted ball profiles very much, and Teixeira hasn’t really changed his this year either, as you can see in this graph:

batted_balls

Teixeira did hit a lot more flyballs in the beginning of the year, but his groundball rate has risen (not a good trend) and his average flyball, groundball and line drive rates are now in line with his previous years. So, what gives?

The simple answer appears to be that he’s not hitting the ball as hard as he used to. Over the last three years, 20% of his outfield flyballs were home runs. This year, he’s at 7%. The good news is that his out rate on outfield flies is holding steady around 77%, which means that last year’s home runs are falling for singles and doubles this year. In fact, he’s tied for second in the AL with 26 doubles; his career high is 41.
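To get a feel for the size of that drop, here is a rough back-of-the-envelope sketch in Python. It works backward from the eight home runs and the 7% rate to an implied number of outfield flies, then asks how many would have left the park at the old 20% rate. Both rates are rounded figures from the text, so treat the result as an estimate; the function names are mine.

```python
def implied_outfield_flies(home_runs, hr_per_fly_rate):
    """Outfield flies implied by a home run total and a HR-per-fly rate."""
    if not 0 < hr_per_fly_rate <= 1:
        raise ValueError("hr_per_fly_rate must be in (0, 1]")
    return home_runs / hr_per_fly_rate


def home_runs_at_rate(outfield_flies, hr_per_fly_rate):
    """Home runs expected from a number of outfield flies at a given rate."""
    return outfield_flies * hr_per_fly_rate


# 8 home runs at a 7% rate implies roughly 114 outfield flies so far.
flies_2006 = implied_outfield_flies(8, 0.07)

# The same flies at the 20% rate of the previous three years would be
# roughly 23 home runs.
hr_at_old_rate = home_runs_at_rate(flies_2006, 0.20)
```

In other words, the gap between this year and the last three is on the order of 15 home runs over the same number of outfield flies.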

Mark Teixeira is the same hitter he’s always been, just less powerful. Perhaps his timing is off, or perhaps pitchers know something about him. Perhaps he’s in a protracted power slump and he’ll turn it around the second half of the year. Perhaps he has a nagging injury. In this age of unsubstantiated rumors, I don’t want to make any other guesses. Let’s just watch and see what happens.


Graphic Grimsley

It’s the latest baseball scandal — Jason Grimsley took steroids, he took greenies, he took Human Growth Hormone. Once diligent steroid testing began (and after he tested positive for steroids in 2003), he quit the ‘roids. But he kept taking HGH, and he’s not the only one. Now that he’s been caught, he’s naming names.

I know it sounds terrible, but I take some pleasure in this latest news. Hasn’t it been obvious that lots of players besides Barry Bonds have taken illegal drugs for years to enhance their performance? Can’t we now spread our ire to the many instead of the few?

This is obviously just the beginning of the next black mark for baseball. In the meantime, I’ve been wondering if we could pick out when, exactly, Grimsley started using the stuff. According to the affidavit, he first took steroids in 2000 to help him recover from Tommy John surgery, but he may have been taking them sooner. Can we pinpoint a date? Did the drugs have an impact? Let’s see. First, here’s a graph of Grimsley’s ERA for every year of his career…

graphs_602_pitcher_season_1_blog_20060607.png

Whoa. Tom Verducci claims that Jason Grimsley was a different pitcher in 1999 and it sure looks like something happened between 1996 and 1999, doesn’t it? Of course, maybe he just got better, or maybe he performed better in relief (he was primarily a starter before 1999 and a reliever afterwards). Let’s take a closer look at some of his component stats. First, strikeouts per nine innings…

graphs_602_pitcher_season_2_blog_20060607.png

If steroids impact strikeout rates, it doesn’t appear that Grimsley’s medical routine had an impact until 2001 and 2002. And even then the evidence is sketchy. How about walks per nine innings?

graphs_602_pitcher_season_3_blog_20060607.png

This is surprising. It appears that Grimsley’s drug routine may have helped him locate the ball better. I’m sure the truth isn’t that simple, but this kind of finding is perplexing. How about home run rates, you ask?

graphs_602_pitcher_season_5_blog_20060607.png

Some evidence, but nothing too compelling here. Steroids and HGH might have helped Grimsley keep the ball in the park but, as a groundball pitcher, he didn’t give up a lot to begin with. For one last clue, let’s take a look at his Batting Average on Balls in Play (or BABIP):

graphs_602_pitcher_season_8_blog_20060607.png

This is, perhaps, most telling. Grimsley’s batted balls were fielded more often, for whatever reason, after 1998. As a groundball pitcher, Grimsley’s BABIP would tend to be above the average, but perhaps he gained an extra measure of movement or speed on his sinker, making his balls more fieldable and giving him more confidence to throw strikes.

This is all speculation, of course. Drugs may have had no impact on Grimsley’s performance at all. Perhaps he was just better suited to a relief role. If drugs did make an impact, it seems that he started taking them before 2000.

Uncovering the impact of performance-enhancing drugs for any single player is not going to be easy.


Research – Dissecting Plate Discipline: Part 2

In Part 1 of Dissecting Plate Discipline, I took a look at three stats that I felt dug a little deeper into a player’s plate discipline, and I showed how they correlated with walks, strikeouts or home runs. Since the correlations are fairly high, we should be able to come up with an expected number of walks, strikeouts and home runs based on the components of a player’s plate discipline.

Let’s start with expected walk percentage (xBB%). You’ll remember that two stats have a high correlation with walks. The first is ZRatio, which also has some correlation with home runs, and the second is OSwing, which correlates well only with walks. If you multiply a batter’s ZRatio and OSwing, you essentially come up with a proxy for how often a player should be walking.

BBD
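The post doesn’t spell out how the ZRatio-times-OSwing proxy gets turned into an actual xBB%, so the Python sketch below simply assumes an ordinary least-squares line fitted across a sample of batters. That fitting step, the function names and the three sample batters are all my own assumptions for illustration, not the method or data behind the tables here.

```python
def walk_proxy(z_ratio, o_swing):
    """Product of ZRatio and OSwing (OSwing given as a fraction, e.g. 0.0933)."""
    return z_ratio * o_swing


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def expected_bb_pct(z_ratio, o_swing, slope, intercept):
    """Expected walk percentage from the fitted line (assumed method)."""
    return slope * walk_proxy(z_ratio, o_swing) + intercept


# Hypothetical sample: (ZRatio, OSwing as a fraction, actual BB%).
sample = [(1.50, 0.35, 4.0), (1.10, 0.20, 9.0), (0.85, 0.10, 15.0)]
proxies = [walk_proxy(z, o) for z, o, _ in sample]
slope, intercept = fit_line(proxies, [bb for _, _, bb in sample])

# The discrepancy of interest is actual BB% minus expected BB%.
gaps = [bb - expected_bb_pct(z, o, slope, intercept) for z, o, bb in sample]
```

Since seeing more strikes and chasing more bad pitches should both mean fewer walks, the fitted slope comes out negative: the bigger the proxy, the lower the expected walk rate.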

What I find most interesting about xBB% is the large discrepancies with actual BB%. Among the players who walk more than their xBB% implies, you’ll see highly skilled players like Jason Giambi, Jim Thome, and Todd Helton. Our good friend Russell Branyan also shows up in the top 10. Perhaps behind his monster swing there lies some potential.

Walked More than xBB%		Walked Less than xBB%	
Jason Giambi	7.19%		Todd Linden	-3.35%
Jim Thome	6.92%		Carlos Baerga	-3.37%
Jose Valentin	5.70%		Chris Magruder	-3.45%
Adam Dunn	5.34%		Oscar Robles	-3.46%
Russell Branyan	5.10%		John Olerud	-3.59%
Todd Helton	5.05%		Alex Cora	-3.71%
Angel Berroa	4.58%		N. Garciaparra	-3.81%
John Rodriguez	4.49%		Orlando Hudson	-4.17%
Jim Edmonds	4.44%		Kenny Lofton	-4.18%
Craig Wilson	4.33%		Tike Redman	-4.38%

The players who walk less than their xBB% suggests are mostly contact hitters, but notice how Nomar Garciaparra shows up on the list. Is that a sign that he was just unfortunate in 2005 and we can expect a rebound in walks? Since I only have one season of data, it’d be silly to draw multi-season conclusions, but it certainly makes you wonder.

Moving on, the one stat which correlated extremely well with strikeouts was Contact, so we’ll use that to calculate a player’s expected strikeout percentage (xK%). Since striking out and walking appear to be two entirely different skills, I’m not going to use ZRatio in calculating xK%.

Struck Out More than xK%		Struck Out Less than xK%
Jayson Werth	9.90%		Brian Schneider	-5.38%
Frank Menechino	9.84%		Rondell White	-5.39%
Todd Linden	9.24%		Garret Anderson	-5.49%
Chip Ambres	9.23%		Sammy Sosa	-5.74%
Nick Punto	9.15%		Jorge Cantu	-5.86%
Mark Bellhorn	7.76%		Sal Fasano	-6.02%
Jason Dubois	6.96%		Mike Sweeney	-6.54%
Jamey Carroll	6.84%		Andruw Jones	-7.07%
Jason Giambi	6.83%		Moises Alou	-7.28%
Chris Woodward	6.38%		V. Guerrero	-7.29%

Once again, the large discrepancies are the most interesting. Vladimir Guerrero’s xK% suggests he should strike out a lot more than he does. Most of the best players in baseball are better than their xK% suggests. The top 10 players who strike out more than their xK% indicates are mostly lower-profile players, with the exception of Jason Giambi. Perhaps he was swinging a little harder than he usually would to prove he could still hit home runs?

Finally, there’s expected home runs (xHR), which we’ll calculate using both Contact and ZRatio, each of which correlated with home runs per fly ball. Multiply the two, and the correlation becomes much stronger. From that we can calculate a player’s expected home runs per fly ball, and then finally his expected home run total.

KD
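The chain described above (Contact times ZRatio, then expected home runs per fly ball, then expected home runs) can be sketched in a few lines of Python. The post doesn’t give the actual conversion from the product to a home-run-per-fly-ball rate, so the linear slope and intercept below are hypothetical placeholders, as are the function names and the sample numbers.

```python
def expected_home_runs(contact, z_ratio, fly_balls, slope, intercept):
    """Expected home runs from Contact, ZRatio and a fly ball count.

    contact is a fraction (0.80 means 80%). slope and intercept describe an
    assumed linear fit from the Contact * ZRatio product to HR per fly ball.
    """
    proxy = contact * z_ratio
    # A rate can't be negative, so clamp the fitted value at zero.
    hr_per_fly = max(0.0, slope * proxy + intercept)
    return hr_per_fly * fly_balls


def home_run_gap(actual_home_runs, expected):
    """Actual minus expected home runs, rounded to a whole number."""
    return round(actual_home_runs - expected)


# Hypothetical batter and hypothetical fit coefficients.
x_hr = expected_home_runs(contact=0.80, z_ratio=1.0, fly_balls=100,
                          slope=-0.5, intercept=0.5)
gap = home_run_gap(23, x_hr)
```

The negative slope in the example reflects the direction described in Part 1: batters who make less contact and see fewer strikes tend to be the ones with more power.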

The players who hit more home runs than expected are, unsurprisingly, many of the power hitters in baseball. What I find more interesting are those who hit fewer home runs than their xHR suggests. Brad Wilkerson (-13) and Vinny Castilla (-9) were definitely not helped by R.F.K. Stadium’s spacious outfield. It will be interesting to see if these players’ home run totals rebound this coming season.

More Home Runs than xHR		Fewer Home Runs than xHR
Alex Rodriguez	20		Brian Giles	-9
Derrek Lee	19		Marcus Giles	-9
Manny Ramirez	19		Vinny Castilla	-9
Andruw Jones	16		Mike Lowell	-9
Albert Pujols	16		Darin Erstad	-9
Tony Clark	15		J.T. Snow	-9
Mark Teixeira	14		Jeremy Reed	-11
Paul Konerko	14		Adam Kennedy	-11
Ken Griffey Jr.	14		Alex Gonzalez	-12
Jermaine Dye	14		Brad Wilkerson	-13

I honestly don’t know whether these differences in expected versus actual walks, strikeouts, and home runs will hold up from season to season, and all of these stats surely have some element of chance in them. Still, the correlations were too strong, in my opinion, not to do this exercise. Hopefully looking at baseball data on a more granular level will help us better weed out the fluke season.


Research – Dissecting Plate Discipline: Part 1

If you’ve been reading the recent Daily Graphing columns, you’ll have noticed that I’ve been talking about some not-so-common stats, such as the percentage of pitches outside the strike zone that a player swings at and actual contact rate. These oddball stats are derived from Baseball Info Solutions’ “pitch data,” which contains the location and result of each and every pitch, among other things. I’ve decided it’s probably a good idea to go over a bunch of these stats and examine why I think they’re meaningful.

When a pitcher throws the ball, it can land either in or out of the strike zone. Not all batters see the same share of pitches in the strike zone, and this does make a difference. Batters will see anywhere from 45% to as high as 60% of pitches inside the strike zone. I’ve found it most useful to express this as a ratio. Let’s call this Zone Ratio.

Zone Ratio (ZRatio) – the ratio of pitches inside the strike zone to pitches outside the strike zone

ZRatio
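As a quick illustration of the definition, here is a small Python sketch that computes ZRatio from pitch counts and also converts an in-zone share into the ratio. The function names are mine; the 45% and 60% inputs are the range quoted above.

```python
def zone_ratio(in_zone_pitches, out_of_zone_pitches):
    """ZRatio: pitches inside the strike zone per pitch outside it."""
    if out_of_zone_pitches <= 0:
        raise ValueError("out_of_zone_pitches must be positive")
    return in_zone_pitches / out_of_zone_pitches


def zone_ratio_from_share(in_zone_share):
    """Convert the fraction of pitches in the zone (0.45 = 45%) to ZRatio."""
    if not 0 <= in_zone_share < 1:
        raise ValueError("in_zone_share must be in [0, 1)")
    return in_zone_share / (1 - in_zone_share)


low = zone_ratio_from_share(0.45)   # about 0.82
high = zone_ratio_from_share(0.60)  # about 1.5
```

A ZRatio above 1 means a batter sees more strikes than balls, so the 45% to 60% range maps to ratios from a little over 0.8 to about 1.5.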

ZRatio correlates quite well with two stats that are frequently used: walks and home runs. It makes a lot of sense that batters who see more pitches outside the strike zone would walk more. Finding it correlated well with home runs was slightly surprising, but also made sense because pitchers would want to be more cautious with batters that can generate more power.

Top 5 ZRatio			Bottom 5 ZRatio
Aaron Miles	1.54		Ryan Klesko	0.89
W. Bloomquist	1.53		Carlos Pena	0.88
Tony Womack	1.49		Chipper Jones	0.86
Tike Redman	1.48		Russell Branyan	0.81
David Eckstein	1.46		V. Guerrero	0.80

If you look at the top and bottom 5 batters in ZRatio, it’s probably no surprise to see that Vladimir Guerrero receives the fewest pitches in the strike zone. On the high end, you don’t just have guys like Willie Bloomquist (0 HRs) and Tony Womack (0 HRs), but also David Eckstein (8 HRs). I wonder if David Eckstein will remain in the top 5 in 2006. Quickly shifting back to players with a low ZRatio, you’ll notice Russell Branyan, who strikes out more than pretty much anyone in baseball. In his case, unlike Guerrero’s, it’s more likely that pitchers are exploiting a weakness than respecting his power.

Moving along, after the pitch is thrown, there are two things a batter can do: take the pitch or swing at it. I’ve found that what seems to matter is not how often a batter swings at pitches inside the strike zone, but how often he swings at pitches outside the strike zone. I’m going to call this stat Outside Swing Percentage.

Outside Swing Percentage (OSwing) – The percentage of pitches outside the strike zone a batter swings at.

OSwing

Probably to no one’s surprise, OSwing correlates quite well with walks. Batters swing at pitches outside the strike zone as little as 8% of the time and as much as 37% of the time. Looking at the top and bottom 5 lists, you’ll see that batters typically considered to have great discipline, such as Chipper Jones and Jason Giambi, are in the bottom 5. At the top of the list you have batters who really like to swing the bat, such as Ivan Rodriguez and Jeff Francoeur.

Top 5 OSwing			Bottom 5 OSwing
Ivan Rodriguez	37.69%		Jason Giambi	10.03%
Bradley Eldred	37.50%		Dave Roberts	9.39%
Angel Berroa	37.01%		Chipper Jones	9.33%
Jeff Francoeur	34.91%		Eric Young	9.09%
Johnny Estrada	33.22%		Brian Giles	8.33%

Finally, if you do decide to swing the bat, one of two things can happen: you either hit the ball or miss the ball. Batters made contact with the ball anywhere between 54% and 94% of the time. Let’s call this stat simply Contact Percentage.

Contact Percentage (Contact) – The percentage of times a batter hits the ball when he swings the bat.

Contact
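Both swing-based stats defined above, OSwing and Contact, fall out of a pitch-by-pitch log. Here is a minimal Python sketch. The record layout (dictionaries with in_zone, swung and contact flags) is a hypothetical stand-in, not the actual Baseball Info Solutions format, and I’m assuming a foul ball counts as contact.

```python
def outside_swing_pct(pitches):
    """OSwing: percentage of pitches outside the zone that the batter swung at."""
    outside = [p for p in pitches if not p["in_zone"]]
    if not outside:
        raise ValueError("no pitches outside the strike zone")
    swings = sum(1 for p in outside if p["swung"])
    return 100.0 * swings / len(outside)


def contact_pct(pitches):
    """Contact: percentage of swings on which the batter hit the ball."""
    swings = [p for p in pitches if p["swung"]]
    if not swings:
        raise ValueError("no swings in the log")
    hits = sum(1 for p in swings if p["contact"])
    return 100.0 * hits / len(swings)


# Hypothetical six-pitch log, for illustration only.
pitch_log = [
    {"in_zone": True, "swung": True, "contact": True},
    {"in_zone": False, "swung": False, "contact": False},
    {"in_zone": False, "swung": True, "contact": False},
    {"in_zone": True, "swung": False, "contact": False},
    {"in_zone": False, "swung": False, "contact": False},
    {"in_zone": False, "swung": False, "contact": False},
]
o_swing = outside_swing_pct(pitch_log)  # 1 swing on 4 outside pitches
contact = contact_pct(pitch_log)        # 1 contact on 2 swings
```

Note that the two stats use different denominators: OSwing is out of pitches outside the zone, while Contact is out of swings, wherever the pitch was.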

There are two things that Contact correlates well with: strikeouts and home runs. The correlation with strikeouts isn’t a shock. If you can’t lay wood on the baseball, you’re going to be striking out a lot. The correlation with home runs, just like with ZRatio, was a little surprising, but once again it made sense. Typically, players who are swinging for the fences are going to swing and miss more often.

Top 5 Contact			Bottom 5 Contact	
David Eckstein	94.26%		Craig Wilson	63.23%
Luis Castillo	94.01%		Carlos Pena	63.02%
Oscar Robles	93.51%		Wily Mo Pena	59.83%
Chris Gomez	93.43%		Russell Branyan	59.21%
Kenny Lofton	92.66%		Bradley Eldred	54.57%

Once again, if we look at the top and bottom 5 players in Contact, you’ll see players like David Eckstein and Kenny Lofton at the top of the list. Just outside the top 5 are players you’d expect to see because they’re considered great contact hitters, such as Juan Pierre and Placido Polanco. Remember how Russell Branyan had a very low ZRatio? He is also one of the worst contact hitters in baseball, which pretty much confirms that pitchers are indeed exploiting his deficiencies.

So for each of these three stats, I talked about how they correlate well with other more mainstream stats. Based on that, we should be able to come up with the expected number of strikeouts, walks, and home runs for each player based on the components of their actual plate discipline. Later this week in Part 2, I’ll be exploring the discrepancies between what a batter is expected to do based on his plate discipline and what he actually does.