Archive for Research

Would Somebody Please Score For Harang?

Prior to this season I strongly contended that Aaron Harang was the most underrated starting pitcher in the National League. A little over a month into the season, that conviction has only grown. There’s a formula I like to use to determine underratedness:

Underrated = (Being On A Bad Team + Very Few, If Any, Nationally Televised Games)/(Never Posting Insane Conventional Stats, Like 20 Wins, That Would Force Somebody To Take Notice)

It’s a bit long but it gets the job done and sums up Harang. Some may argue, “But Eric… Harang does post great stats,” and I would agree; however, he has never won 20 games or posted a 2.40 ERA, or struck out 265 batters, which are numbers that would literally force Cy Young Award voters and more casual fans to take notice.

From 2005-2007, The Harangutan has made 32+ starts, pitched 210+ innings, walked no more than 56, and increased his strikeouts from 163 to 218. His WHIP has ranged from 1.27 to 1.14 and his ERA has decreased from 3.83 to 3.73. His FIPs of 3.67-3.71 imply his ERA has mostly painted an accurate portrait.

This year, Harang has the following line:

8 GS, 3.09 ERA, 3.17 FIP, 1.10 WHIP, 55.1 IP, 48 H, 13 BB, 47 K; opponents are hitting .235/.281/.382 (.663 OPS); and his average Game Score is 59.
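For anyone who wants to check the rate stats in that line, here is a minimal sketch. The helper names are my own, and it assumes the usual box-score convention that "55.1" innings means 55 and one-third.

```python
def innings_to_float(ip: str) -> float:
    """Convert box-score innings ('55.1' = 55 and 1/3) to a plain number."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def whip(hits: int, walks: int, ip: str) -> float:
    """Walks plus hits per inning pitched."""
    return (hits + walks) / innings_to_float(ip)


# Harang's line from the post: 55.1 IP, 48 H, 13 BB, .281 OBP, .382 SLG
print(f"{whip(48, 13, '55.1'):.2f}")   # 1.10
print(f"{0.281 + 0.382:.3f}")          # 0.663 (OPS = OBP + SLG)
```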

His record, you may ask? 1-5. One win and five losses despite posting the numbers above. Harang has given up more than three runs in just one start. If we remove that start he has a 2.55 ERA, a 1.03 WHIP, and an average Game Score of 61.

How has this happened? Well, run support! The Reds have scored a measly 14 runs for Harang. On top of that, four of those runs came in his lone victory. In his other seven starts he has received 10 runs of support in 47.1 innings. Jorge de la Rosa got seven runs from his team just yesterday.

I probed the B-R Play Index to find pitchers in recent years who have experienced a similar plight. Though this likely amounts to cherry-picking stats, the criteria used were:

a) In the team’s first 40 games
b) From 1999-2008 (ten years)
c) Game Scores of 55+
d) No more than 3 ER
e) Resulting in a Loss or No-Decision

Those criteria produced a five-way tie for first place, with each of the five recording 5 losses/no-decisions. The players were: Harang, Odalis Perez (2008), Roger Clemens (2005), Chris Carpenter (2006), and Jason Bergmann (2007).
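The five criteria amount to a simple filter over game logs. The sketch below is not how the Play Index works internally; the `Start` record and its field names are hypothetical stand-ins for whatever a game-log export would contain.

```python
from dataclasses import dataclass


@dataclass
class Start:
    pitcher: str
    season: int
    team_game: int      # the team's game number within the season (assumed field)
    game_score: int
    earned_runs: int
    decision: str       # "W", "L", or "ND"


def is_hard_luck(s: Start) -> bool:
    """Apply criteria (a) through (e) from the list above."""
    return (
        s.team_game <= 40
        and 1999 <= s.season <= 2008
        and s.game_score >= 55
        and s.earned_runs <= 3
        and s.decision in ("L", "ND")
    )


def count_hard_luck(starts):
    """Count qualifying starts per (pitcher, season)."""
    counts = {}
    for s in starts:
        if is_hard_luck(s):
            key = (s.pitcher, s.season)
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Sorting the resulting counts in descending order would give the leaderboard described above.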

If the Reds do not start picking up their ace he could very well fall into the 2007 Matt Cain category of extreme unluckiness.


Shutdown Innings

Primarily due to the MLB Extra Innings package and my sheer love of the sport I am a true baseball junkie. I don’t care if it’s the generic Red Sox/Tigers matchup on ESPN or the local feed of a Reds/Nationals game; watching various games helps me connect better with players who would otherwise be nothing more than names on a page. Something that comes with the territory of watching so much baseball is listening to different sets of broadcasters and their terminology or beliefs.

Though many differ in opinion over issues pertaining to clutch hitting, or Barry Bonds, one common weapon in their broadcasting arsenal involves some form of the following phrase: “Well, (insert pitcher) needs to just shut them down this inning to keep the momentum going.”

These assertions generally occur after their team has scored to either a) take the lead, b) tie the game, or c) make a significant effort to come back. Regardless of which takes place, the idea is that momentum has begun its shift into their dugout and, by shutting down the opponent in the following half-inning and preventing them from tacking on more runs, it can sustain its new position.

Hearing about these magical shutdown innings so much made me research which pitchers are truly the best at ensuring a change in momentum is not a fluke. Essentially, I’m defining a Shutdown Inning (SHIP) opportunity as any half-inning following one in which the pitcher’s team scores and, by doing so, ends up within four runs of the opponent, whether ahead or behind.

Trailing by nine runs and scoring two would not cause the following half-inning to qualify as a potential shutdown inning; trailing by nine runs and scoring five or six would. By the same token, leading by one run and then scoring five would not lead to a SHIP; however, leading by one and then scoring two or three would.
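The qualification rule can be written as a short check. This is a sketch of my reading of the definition; the function name and the signed-margin convention (negative means trailing) are assumptions for illustration.

```python
def is_potential_ship(margin_before: int, runs_scored: int, max_margin: int = 4) -> bool:
    """Does the next half-inning qualify as a potential shutdown inning?

    margin_before: the team's lead before it batted (negative if trailing).
    runs_scored:   runs the team scored in its half of the inning.
    """
    if runs_scored <= 0:
        return False  # the team has to score for momentum to be in play
    return abs(margin_before + runs_scored) <= max_margin


# The four examples from the paragraph above
print(is_potential_ship(-9, 2))  # False: still down seven
print(is_potential_ship(-9, 5))  # True: now down four
print(is_potential_ship(1, 5))   # False: now up six
print(is_potential_ship(1, 3))   # True: now up four
```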

Just looking at the National League, for now, I took every pitcher with 41+ innings (the top 25) and scanned their individual game logs to come up with the following top ten:

1) Todd Wellemeyer, 9-9, 1.000
2) Braden Looper, 10-11, .909
3) Scott Olsen, 7-8, .875
4) Ryan Dempster, 11-13, .846
5) Tim Lincecum, 10-12, .833
6) Jake Peavy, 9-11, .818
7) Mark Hendrickson, 13-16, .813
8) Jair Jurrjens, 12-15, .800
9) Dan Haren, 12-15, .800
10) Aaron Cook, 11-14, .786

Wellemeyer and Looper have provided 19 SHIP out of a possible 20. Of these 25 players, the worst three were:

23) Roy Oswalt, 6-11, .545
24) Adam Wainwright, 5-10, .500
25) Brett Myers, 8-16, .500

The next step of this, which I’ll get into later in the week, deals with just how much the failed SHIP attempts (trying really hard to resist a dock or anchor or sea metaphor) affected the teams; some players may be handed a four-run lead and give up just one run whereas others will be given a four-run lead and give up four runs. Shutdown Innings do not tell us which pitchers are better but rather who sustains the momentum discussed by broadcasters most often.


Maddux Not Alone in Milestone Struggles

After five innings of shutout work against the Dodgers on April 13th, Greg Maddux had won 349 games in his illustrious career. Since then, winning his 350th has proved to be quite difficult, as his last four starts have resulted in three losses and a no-decision. Even though one was a tough loss and his no-decision consisted of seven shutout innings, Maddux still sits on 349 wins. Seeing my idol fail to secure this milestone brought to mind which other pitchers struggled to reach win milestones of their own.

The last pitcher to 350 was Roger Clemens, who did so with his third win last season. Prior to that 8 IP-2 H performance against the Twins, Clemens had failed in three consecutive starts to get to 350.

Nolan Ryan won 324 games in his career but #320 did not come easily. He won his 319th game in July of 1992 but then failed in TEN consecutive starts closing out the season (6 losses, 4 no-decisions) to get his milestone. As fate would have it, he would get #320 in his first start of the 1993 season.

In an odd turn of events, Don Sutton lost his next start after winning his 319th game, but picked up #320 after going 4.1 IP in relief soon after.

Phil Niekro won his 299th game in 1985 but missed out on #300 in four subsequent starts. He would win his 300th in the final start of that 1985 season.

It took Early Wynn six tries to get his 300th win and, unfortunately for him, these attempts spanned two years. After missing out in his final three 1962 starts, he could not win in his first three 1963 starts. I guess you could say his 300th win came…early…in the 1963 season. (Note – I apologize to pun-haters). He finally won his 300th in his fourth start in 1963; however, it came with the ugly-in-1963 line of 5 IP, 6 H, 3 ER.

Maddux has struggled to win his 350th but, have no fear, he will reach this milestone even if it takes him a Nolan Ryan-esque ten attempts.


2008 Fan Save Values

One of the main reasons I got involved with sabermetrics was to show fans intimidated by statistical analysis that not all effective evaluative methods consist of complex mathematical formulas. My favorite statistics are ones like FIP, wOBA, and EqA: better indicators of skill and performance but scaled to look similar to the more commonly accepted barometers. In using similar scales more of a chance exists for widespread acceptance.

Through surveys I found that a major reason fans shy away from better evaluative methods is that they are unfamiliar with the baselines. They don’t necessarily know what a good VORP is, or a good WPA; however, they definitely know what constitutes a good or bad ERA or batting average.

One of these stats not mentioned too much is called “Fan Save Value,” and it was created by analyst Ari Kaplan while working for the Orioles in 1990. Essentially, as described in the book Baseball Hacks by Joseph Adler, FSV measures the difficulty level of each save by taking into account the lead with which the closer enters as well as the number of outs he must record to secure a win for his team. When all of the results are added together we are left with a number similar to the saves total but more indicative of how hard a closer had to work.

The formula for FSV is (X/Y)/2, where X = the number of outs to record and Y = the lead of his team. For instance, recording a one-inning save with a two-run lead would result in an FSV of 0.75; 3 outs divided by 2 runs ahead, then divided by 2. Using this statistic I decided to look at the current saves leaders and determine how hard each has had to work:

Francisco Rodriguez, LAA: 12 saves, 11.3 FSV
George Sherrill, Bal: 11 saves, 7.6 FSV
Joe Nathan, Min: 10 saves, 10.8 FSV
Jonathan Papelbon, Bos: 8 saves, 10.3 FSV
Mariano Rivera, NYY: 8 saves, 9.8 FSV
Huston Street, Oak: 8 saves, 8.3 FSV

Brian Wilson, SF: 10 saves, 10.5 FSV
Eric Gagne, Mil: 9 saves, 8.0 FSV
Jason Isringhausen, StL: 9 saves, 6.5 FSV
Brandon Lyon, Ari: 9 saves, 9.5 FSV
Brad Lidge, Phi: 7 saves, 7.5 FSV
Jon Rauch, Was: 7 saves, 6.1 FSV
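As a rough sketch of how totals like these could be tallied from the (X/Y)/2 formula given earlier: the function names are mine, and the sample list of saves is made up for illustration rather than taken from any closer's actual game log.

```python
def fan_save_value(outs: int, lead: int) -> float:
    """FSV for one save: (outs to record / lead on entry) / 2."""
    if lead <= 0:
        raise ValueError("a save situation requires a positive lead")
    return (outs / lead) / 2


def total_fsv(saves) -> float:
    """Sum FSV over an iterable of (outs, lead) pairs."""
    return sum(fan_save_value(outs, lead) for outs, lead in saves)


# The worked example from the post: one inning, two-run lead
print(fan_save_value(3, 2))  # 0.75

# Hypothetical season so far: a one-run save, a two-run save, a 0.2 IP three-run save
print(round(total_fsv([(3, 1), (3, 2), (2, 3)]), 2))  # 2.58
```

A standard one-inning, one-run save is worth 1.5 here, which is why a closer who works in tight games can post an FSV above his saves total.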

The saves leader in each league remains on top when using FSV but the rest of the leaderboard shifts. The biggest dropoffs come from Isringhausen and Sherrill: both have entered save situations in which their teams led by large margins, recorded a couple of 0.2 IP saves, or both.

Papelbon, on the other hand, has pitched more than one inning in a few of his saves. Since the overall result looks similar to the saves total, and saves are commonly used as an end-all when evaluating closers, the FSV is an easy-to-use and better evaluative tool because it adds context to a normally context-free statistic.


Buehrle Cruises But Loses

Following an Alex Rios lineout and an Aaron Hill strikeout, Mark Buehrle induced a grounder to third off the bat of Scott Rolen that should have ended the inning. A throwing error, ground-rule double, and single up the middle later, Buehrle found himself the victim of two unearned runs. He struck out Lyle Overbay to end the inning but the Blue Jays ultimately had all the run support they would need.

Despite posting a complete game line of 8 IP, 5 H, 0 ER, 0 BB, 7 K, Buehrle suffered the loss; the committee of Shaun Marcum, Jeremy Accardo, Jesse Carlson, Shawn Camp, and Scott Downs collaborated on a two-hit shutout.

Buehrle threw exactly 100 pitches and 74 of them were strikes. Ironically, ten of those 26 balls came on the first pitch to batters. Here is his pitch breakdown:

  • Fastball: 45, 87.8 mph
  • Curve: 7, 72.4 mph
  • Slider: 27, 83.5 mph
  • Changeup: 21, 79.4 mph

Nobody swung and missed at his fastball, though his slider proved difficult to make solid contact with; the Blue Jays fouled off nine of them and flat out missed six. Here are location charts showing where he threw his slider to righties and lefties:

[Chart: buehrle-slider-location.bmp]

While he tried to stay inside with the slider when facing lefties, he challenged righties in the zone more often with this offspeed pitch.

Facing a total of 29 batters, Buehrle threw a first-pitch strike 16 times; ten of the first pitches were balls and another three were put in play. Unlike Matt Cain, who essentially throws a fastball to start every hitter, Buehrle started the Blue Jays hitters with 18 fastballs and 11 breaking balls. With an 0-1 count Buehrle threw just two fastballs out of his 16 total pitches; with a 1-2 count he primarily threw changeups; on a 3-2 count he did not rely on his fastball, throwing two sliders and a changeup out of five pitches. Normally I would show this data in graphical or chart form but, when dealing with just one start, it would suggest patterns and tendencies that just cannot be determined with such a small sample.

Buehrle did not have an easily identifiable “out-pitch” last night as, with two strikes, he threw 13 fastballs, 2 curves, 7 sliders, and 8 changeups.

The aspect of pitch data that fascinates me most right now is sequencing: What does a pitcher throw after a certain pitch? Buehrle threw very few pitches to lefties but, against righties, it becomes clear that he successfully mixed up his pitches.

[Chart: buehrle-sequence.bmp]

When throwing his changeup to righties he definitely relied on additional offspeed pitches to accompany it rather than mixing speeds by throwing a fastball. In fact, there were six different plate appearances in which Buehrle threw three consecutive offspeed pitches. Never one to waste time on the mound, his quick work and mix of pitches and speeds successfully kept the Blue Jays off balance. Though his overall numbers suffer from an atrocious opening day start, Buehrle will have to pitch like this much more often for the White Sox to avoid the label of “pretender” and be in the race all year.


O-Swing% Correlations

As David announced earlier this week, FanGraphs now has swing data for hitters, giving us a breakdown of who swings at what and how often. This is just another great resource he’s added to the stat pages. And, as always, more data means a chance for more research.

One of the first things I wanted to look at in this data was the effects of swinging at pitches outside of the strike zone. It’s been a sabermetric credo for a while now that good hitters are selective at the plate and don’t chase pitches out of the zone, but players like Vladimir Guerrero, Nomar Garciaparra, Ivan Rodriguez, and Ichiro Suzuki have all had tremendous careers despite swinging at pitches that no sane person would think they could hit. If you can be a hall of fame hitter without being selective, how important is it?

So, I decided to take the list of qualified hitters for 2008 (197 in all) and look at the correlations between their O-Swing%, which measures how often they swing at pitches out of the strike zone, and their BB% and K% rates. Intuitively, I would have thought that the more often a player swings at pitches that would otherwise be called balls, the lower his walk rate and the higher his strikeout rate would be. Here are the results:

O-Swing%/BB% correlation: -0.67
O-Swing%/K% correlation: -0.10
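For readers who want to run this kind of check themselves, here is a minimal sketch of the standard Pearson correlation coefficient, which I am assuming is the measure behind the two numbers above. The five sample hitters are invented for illustration and are not the 197 qualified hitters.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical hitters: O-Swing% paired with BB%
o_swing = [15.0, 20.0, 25.0, 30.0, 37.0]
bb_rate = [14.0, 11.0, 9.0, 7.0, 4.0]
print(round(pearson(o_swing, bb_rate), 2))
```

A value near -1 means the two rates move in opposite directions almost in lockstep, while a value near 0 means knowing one tells you little about the other.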

The walk rate correlation matches up with expectations, as there’s a strong negative correlation between swinging at pitches outside the strike zone and walk rate. This, of course, makes sense – the more often you swing at pitches that would have otherwise been called balls, the less likely you are to draw four balls in any given plate appearance. Guys like Matt Diaz eschew the walk through sheer determination.

However, look at the correlation between O-Swing% and K%. I’d have expected a fairly strong positive correlation, as you tend to think of guys flailing at sliders in the dirt as more prone to strikeouts. However, the correlation barely exists at all and, to the extent that there is any, it is negative. We see this manifest in guys like Erick Aybar and A.J. Pierzynski, who both swing at 37% of pitches outside of the strike zone but strike out just 7% and 6% of the time, respectively.

It appears that players who swing at pitches outside of the zone do so because they can hit them – it is the ability to make contact that creates the aggressive approach, and not just a player’s desire to hack at anything.


28 Scoreless: Breaking Down Cliff Lee’s Streak

Indians lefty Cliff Lee recently put the finishing touches on an absolutely incredible April. Ironically, the 6 IP, 3 ER, 3 K performance on Wednesday night, one many pitchers would love to have, was far and away his worst of the month. He will likely garner AL Pitcher of the Month honors, in unanimous fashion, and rightly so: Anyone who posts a legitimate W-L of 5-0, a 0.96 ERA, and a 16.0 K/BB while surrendering just one home run in 37.2 innings truly deserves any and all recognition.

Oh, and Lee also pitched 28 consecutive scoreless innings.

It began in the fifth inning of his April 13th start and lasted all the way until the seventh inning of Wednesday’s start. Despite this, he did not even get halfway towards Orel Hershiser’s oft-underrated and mind-boggling 59 consecutive scoreless innings record. Seeing as Lee’s effort has come to an end I decided to break everything down and analyze what happened in these innings. Hmm… a summary of Cliff’s streak… wait for it… Cliffsnotes:

Overall Streak Line: 28 IP, 10 H, 0 R, 1 BB, 24 K

Of those 28 innings, 18 were perfect (no baserunners) and 4 more saw Lee face the minimum; in each, singles were erased by double plays.

He allowed 11 runners to reach base and only two found themselves at third base. In the fifth inning on April 24th, Jose Guillen doubled and advanced to third on a wild pitch. With two outs, Lee struck Miguel Olivo out to end the threat. In the fifth inning Wednesday night, Wladimir Balentien reached third before Ichiro Suzuki flied out.

The situation with the highest Leverage Index (1.95) occurred when John Buck led off the ninth inning on April 24th; Buck flied out.

He found himself in eight situations with a Leverage Index of 1.60+ and produced the following results: 4 flyouts, 2 groundouts (1 GIDP), and 2 strikeouts.

His WPA over the course of these four games (31 IP) accounted for 1.72 wins; his WPA during this streak (28 IP) accounted for 1.67 wins.

Using TangoTiger’s Marcel projections I classified those with expected slugging percentages of .445+ as power hitters and under that mark as contact. Against Lee, power hitters went: 5-25, 2B, 4 GIDP, 5 K. Contact hitters went: 5-69, BB, 19 K.

The blog Defensive-Indifference does not think Lee will be able to sustain his current pace and I happen to agree. One of the reasons he is unlikely to continue dominating is that his current BABIP of .195 is far lower than his line drive frequency of 13.7% would suggest; with that frequency it should be closer to .257. Additionally, despite posting a 0.96 ERA, his FIP implies it is closer to the 2.01 mark; still great, but not superhuman. Regardless of what happens from May until September, though, Lee had an absolutely phenomenal opening month.
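The .257 figure is consistent with the common rule of thumb that expected BABIP is roughly line drive rate plus .120. The post does not spell out its method, so treat the sketch below as an assumption about where the number comes from; the function name and default offset are mine.

```python
def expected_babip(line_drive_rate: float, offset: float = 0.120) -> float:
    """Rule-of-thumb expected BABIP: line drive rate plus a fixed offset.

    line_drive_rate is a fraction (13.7% is passed as 0.137).
    The .120 offset is the rule of thumb assumed here, not something the post states.
    """
    return line_drive_rate + offset


lee_babip = 0.195
lee_expected = expected_babip(0.137)
print(f"{lee_expected:.3f}")              # 0.257
print(f"{lee_expected - lee_babip:.3f}")  # 0.062 gap between expected and actual
```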


Roy “The Complete Game” Halladay

Blue Jays righty Roy Halladay has earned his reputation as being a top-tier pitcher, one of the best this decade. Predominantly utilizing a combo of fastballs, cutters, and curveballs, Halladay keeps the ball on the ground, generating grounders just about 60% of the time. Though he has suffered some injuries in the last few years, when healthy, going late into games is not an issue; he has pitched 220+ innings in four of the last six years. This durability has led to 32 complete games since 2002, seven more than closest competitor Livan Hernandez. The entire AL had 64 complete games last year and Halladay accounted for seven of them.

There have been 16 complete games this year and Halladay currently has four of them. Additionally, his other two starts have seen him go for seven and eight innings respectively. Oddly enough, despite pitching a complete game in each of his last four starts, his W-L in that span is 1-3. His three consecutive complete game losses tie a Blue Jays record set in 1980 by Jim Clancy. The last pitcher to lose consecutive complete games is another Roy: Roy Oswalt, who did so in 2006.

The last major league pitcher to throw four consecutive complete games? Well, that would be Halladay back in 2003. The all time record for consecutive complete games belongs to Jack Lynch, with 198; his record was compiled between 1881 and 1887. Halladay would need an average of 33 starts in each of his next six seasons just to make 198 starts, let alone go the distance.

What is most interesting about his complete games is that he is not overextending himself by throwing a ton of pitches. Through 49.2 innings he has thrown 648 pitches, which amounts to an average of 13 per inning. His complete games have consisted of the following pitch counts: 110, 117, 107, 112. The Blue Jays have scored 115 runs this year, just 17 of which have come in Halladay’s six starts. The team has provided an average of 4.67 runs/gm in non-Halladay games and just 2.83 runs/gm for their ace. Hopefully the team can start scoring runs or else this former Cy Young Award winner may find himself qualifying for next month’s edition of unluckiest pitchers.
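The run support and pitch efficiency figures in that paragraph are straightforward to reproduce. In this quick sketch, the number of non-Halladay games is not given in the post, so it is backed out from the stated 4.67 runs/gm; the variable and function names are my own.

```python
def per_unit(total: float, units: float) -> float:
    """Simple rate: total divided by number of games or innings."""
    return total / units


team_runs = 115
halladay_runs = 17
halladay_starts = 6

# Run support in Halladay's starts
print(f"{per_unit(halladay_runs, halladay_starts):.2f}")  # 2.83

# Implied number of non-Halladay games, given the stated 4.67 runs/gm
other_runs = team_runs - halladay_runs
print(round(other_runs / 4.67))  # 21

# Pitches per inning: 49.2 IP means 49 and two-thirds innings
innings = 49 + 2 / 3
print(f"{per_unit(648, innings):.1f}")  # 13.0
```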


Dave Bush Sent Down Due to…Um…

The Milwaukee Brewers optioned Dave Bush to their AAA Nashville club this week following an 0-3 start with a 6.75 ERA. This irked many in the statistical community, especially MGL, because not only is Bush pitching better than the conventional numbers indicate, he is pitching right on par with normal #3 or #4 starters.

The problem, after looking into it more, is not necessarily that Bush’s optioning stems from his slow start but that it may have to do with his Jekyll-and-Hyde transformation once runners get on base. In 1649 career PA, batters are posting a .253/.307/.420 clip against Bush while the bases are empty. When occupied, though, this jumps to .314/.355/.546 in 1121 PA. Despite these numbers, if the decision to option Bush stemmed from a kneejerk reaction to a small sample size, this argument does not even hold water.

This year, with 57 PA, opponents are hitting .320/.404/.560 with the bases empty. In 47 PA, they are hitting just .263/.340/.368 with runners on base. In the early going he has allowed a higher percentage of runners to reach base but has actually held opponents to a lower slash line once they get to their respective bases.

Ned Yost attributed the move to control issues. In his four 2008 starts, Bush has thrown 57.6% of his pitches for strikes. In 2007 he threw 65.6% strikes out of 2972 pitches. In 2006 he threw 66.4% strikes out of 3021 pitches. He has thrown a lower percentage of strikes, putting himself into more “Hafta’ Counts” and forcing himself to pitch from behind. The Brewers had a pitching surplus entering the season, often fostering rumors of trading Bush and/or Claudio Vargas.

Seemingly happy with the production from Ben Sheets, Yovani Gallardo, Jeff Suppan, Carlos Villanueva, and Manny Parra, the team felt they could handle letting Bush get his control back in a few minor league starts. The decision was definitely based on a small sample size and, in that regard, makes little sense, but I hope they realize it is way too early to sell their stock on Bush.


The Unluckiest April Pitchers

When discussing the early season productivity of a player it is important to remember that the conventional statistics analyzed by those on television often have better evaluative indicators beneath the surface. Many of these indicators deal with luck, or a lack of luck, and provide a more telling window into what should be expected from the player in the coming weeks and months. Dave Cameron wrote earlier today about Nick Johnson and how, based on his high percentage of line drives but low BABIP, Johnson has experienced an unlucky streak this month. Using a slightly expanded version of the methods currently used to determine luck (relative to GB/FB/LD and BABIP) I decided to figure out the unluckiest pitchers of the month.

Initially the method consisted solely of BABIP and the difference between FIP and ERA (FIP-ERA) but I then extended it to include their current W-L records. Though anyone at least casually versed in statistical analysis will explain the faults of a W-L record, fans that simply love the game without analyzing anything equate W-L records to quality.

A prerequisite also found its way into the criteria: the FIP could not be above 4.30, the cutoff I usually use to separate good from average. This way pitchers like CC Sabathia and Barry Zito could be avoided; pitchers who, despite having large FIP and ERA discrepancies, still had very high FIPs and ERAs. In summation, the players I considered to be the unluckiest were those with high FIP and ERA discrepancies, a high BABIP against, and a W-L record that does not do a good job of representing the quality of games pitched. This left me with the following four candidates:

    Nate Robertson: 6.91 ERA, 4.18 FIP, -2.73 FIP-ERA, .333 BABIP, 0-3 W-L
    Nick Blackburn: 3.45 ERA, 2.72 FIP, -0.73 FIP-ERA, .347 BABIP, 1-1 W-L
    Ian Snell: 4.45 ERA, 3.03 FIP, -1.42 FIP-ERA, .358 BABIP, 2-1 W-L
    Zach Duke: 5.34 ERA, 3.97 FIP, -1.36 FIP-ERA, .369 BABIP, 0-2 W-L

So, of these guys, whom do you consider to be the unluckiest this month?
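To help with the vote, here is a small sketch that applies the FIP cutoff and ranks the four candidates by FIP-ERA. The tuple layout and function names are my own, and because it works from the rounded ERA and FIP figures listed above, a gap can differ from the listed one by 0.01 (Duke comes out at -1.37 rather than -1.36).

```python
# (name, ERA, FIP, BABIP against) as listed in the post
candidates = [
    ("Nate Robertson", 6.91, 4.18, 0.333),
    ("Nick Blackburn", 3.45, 2.72, 0.347),
    ("Ian Snell", 4.45, 3.03, 0.358),
    ("Zach Duke", 5.34, 3.97, 0.369),
]


def fip_minus_era(era: float, fip: float) -> float:
    """Negative values mean the ERA is worse than the FIP suggests."""
    return round(fip - era, 2)


def rank_unlucky(rows, max_fip: float = 4.30):
    """Keep pitchers at or under the FIP cutoff, most negative FIP-ERA first."""
    kept = [r for r in rows if r[2] <= max_fip]
    return sorted(kept, key=lambda r: fip_minus_era(r[1], r[2]))


for name, era, fip, babip in rank_unlucky(candidates):
    print(f"{name}: {fip_minus_era(era, fip):+.2f} FIP-ERA, {babip:.3f} BABIP")
```

Ranking by FIP-ERA alone puts Robertson first, but sorting on BABIP instead would favor Duke, which is part of why the question is open.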