Author Archive

Hudson & Perez: BIS & Pitchf/x

I thought it’d be interesting to see how Baseball Info Solutions (BIS) pitch type data matches up with Pitchf/x’s pitch type identification data for just the two starters in the Nationals’ opener. Before I begin, note that Pitchf/x has a new field in its data called “type_confidence”, which appears to be a measure of how accurate its classification is. All the Pitchf/x aggregates were found on Josh Kalk’s player cards.

Tim Hudson:

Fastball - BIS: 70.5% (89.3mph) |  Pitchf/x: 71.8% (91.9mph)
Slider   - BIS: 19.2% (84.5mph) |  Pitchf/x: 18.0% (86.9mph)
Changeup - BIS:  5.1% (78.2mph) |  Pitchf/x:  7.7% (80.5mph)
Cutter   - BIS:  2.6% (86.5mph) |  Pitchf/x:  2.6% (89.1mph)
Splitter - BIS:  2.6% (78.0mph) |  -------------------------

Upon closer inspection, the pitch logs are incredibly similar for Hudson, with BIS and Pitchf/x disagreeing on a mere 3 pitches. Two pitches classified as changeups by Pitchf/x were classified as splitters by BIS. The third was classified as a fastball by Pitchf/x, but as a slider by BIS. That third pitch had the lowest Pitchf/x type_confidence of any pitch in the game.

Odalis Perez:

Fastball - BIS: 59.7% (86.0mph) |  Pitchf/x: 78.8% (88.7mph)
Cutter   - BIS: 28.4% (83.6mph) |  -------------------------
Changeup - BIS: 10.4% (79.3mph) |  Pitchf/x:  4.4% (81.5mph)
Curve    - BIS:  1.5% (74.0mph) |  -------------------------
Slider   - -------------------- |  Pitchf/x: 14.5% (86.5mph)
Splitter - -------------------- |  Pitchf/x:  2.9% (85.3mph)

Needless to say, things were not as rosy for Perez. There were 27 pitches in disagreement, not counting 3 pitches that disagree because of non-identification on either BIS’s or Pitchf/x’s part. Nine pitches that BIS classified as cutters were classified as sliders by Pitchf/x. The average type_confidence for the other 18 pitches in disagreement was .56, compared with .68 for pitches in agreement.
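The comparison above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the method actually used for this post: the function name, the tuple layout, and the sample log are all made up, and the sample values are not the real Hudson or Perez data.

```python
from collections import Counter


def compare_pitch_logs(pitches):
    """Summarize agreement between two pitch-type sources.

    pitches: iterable of (bis_type, pfx_type, type_confidence) tuples,
    where a type of None means that source did not identify the pitch.
    (This layout is an assumption made for the sketch.)
    """
    agree, disagree = [], []
    mismatches = Counter()
    unidentified = 0
    for bis_type, pfx_type, confidence in pitches:
        if bis_type is None or pfx_type is None:
            # Set aside pitches that one source failed to identify.
            unidentified += 1
        elif bis_type == pfx_type:
            agree.append(confidence)
        else:
            disagree.append(confidence)
            mismatches[(bis_type, pfx_type)] += 1

    def mean(values):
        return sum(values) / len(values) if values else None

    return {
        "disagreements": len(disagree),
        "unidentified": unidentified,
        "mismatches": mismatches,
        "avg_conf_agree": mean(agree),
        "avg_conf_disagree": mean(disagree),
    }


# Made-up sample log, NOT the actual pitch data from the game.
sample_log = [
    ("FB", "FB", 0.90),
    ("FB", "FB", 0.70),
    ("CT", "SL", 0.60),
    ("CT", "SL", 0.50),
    ("CH", "FB", 0.40),
    (None, "FB", 0.80),
]

summary = compare_pitch_logs(sample_log)
print(summary["disagreements"], summary["mismatches"].most_common(1))
```

Counting the mismatches by (BIS type, Pitchf/x type) pair is what surfaces patterns like the cutter/slider split.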

For what it’s worth, Perez claims to throw a fastball, cutter, changeup and curveball.

That’s all I got for now. I haven’t had time to take a hard look into why the pitches might have been classified differently by using any of the break data, and nothing is standing out as different about those pitches at first glance.

Update: Josh Kalk over at The Hardball Times took an in-depth look at Pitchf/x’s pitch classification on Tim Hudson’s first start.


Welcome to the Majors: 3/30/08

Gregor Blanco made his major league debut last night as a pinch runner. He replaced Brayan Pena as the runner at first in the top of the 8th and did not return to the field in the bottom of the inning. Blanco had a good spring, batting .326/.464/.442, which impressed enough to win him a roster spot as a potential outfielder. Last year in the International League (AAA) he had the 10th best OBP among qualified players.

Not exactly a player, but Nationals Park needs to be welcomed to the majors as well. I was in attendance during last night’s game and have to say the stadium is really nice. It reminds me a bit of Citizens Bank Park, which happens to be one of my favorite ballparks. Security was extremely tight, including mandatory metal detectors, due to President Bush’s attendance.

Much to my surprise, getting to the game was a cinch by Metro. I expected to have considerable wait times between transfers, but there was zero wait time and relatively short lines. Without a parking spot, I guess I’ll get to see if Metro can continue this level of competence all season long.


Get to Know: Win and Loss Advancement

+WPA (win advancement): The amount of positive wins a player contributed to his team, including only the plays where he increased his team’s win expectancy.

-WPA (loss advancement): The amount of negative wins a player contributed to his team, including only the plays where he decreased his team’s win expectancy.

How it’s calculated: It’s calculated exactly the same as WPA, but it only includes the positive (+WPA) or negative (-WPA) results.

Why you should care: It further breaks down WPA letting you understand how big a positive or negative contribution a player made to his team. A player who has a WPA of 1.25 could have made both huge positive and huge negative contributions to his team.
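As a rough sketch of the split described above, assuming you already have a list of per-play WPA values for one player (the function name and the sample values are made up for illustration):

```python
def split_wpa(play_wpas):
    """Return (+WPA, -WPA) from a list of per-play WPA values."""
    win_advancement = sum(w for w in play_wpas if w > 0)   # +WPA
    loss_advancement = sum(w for w in play_wpas if w < 0)  # -WPA
    return win_advancement, loss_advancement


# Made-up per-play values for one player.
plays = [0.25, -0.5, 0.125, -0.125]
plus_wpa, minus_wpa = split_wpa(plays)
print(plus_wpa, minus_wpa, plus_wpa + minus_wpa)
```

The two pieces always add back up to total WPA, which is why a modest total can hide large swings in both directions.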

Links and Resources:

Win Shares and Loss Shares


Get to Know: Win Expectancy

WE (win expectancy): The percent chance a particular team will win based on the score, inning, outs, runners on base, and the run environment.

Assumptions: Win expectancy as it’s currently calculated assumes that each team has an equal chance of winning at the start of a game.

Specifics: FanGraphs uses Tangotiger’s most current win expectancy tables, which are available for 3.0 to 6.5 run environments in increments of .5 runs. The league average run environment is used to calculate win expectancy. When the run environment falls between two .5 increments, the two neighboring tables are weighted accordingly to achieve the correct win expectancy.
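One way to picture the weighting is straight linear interpolation between the two nearest tables. That is an assumption on my part for the sake of the sketch; the table layout, the game-state key, and the win expectancy values below are all hypothetical and are not taken from Tangotiger’s tables.

```python
import math

STEP = 0.5  # tables come in .5-run increments


def interpolated_we(tables, run_env, state):
    """Blend the two tables that bracket run_env (linear weighting assumed).

    tables: dict mapping a run environment (3.0, 3.5, ..., 6.5) to a dict
    of game state -> win expectancy. This layout is hypothetical.
    """
    if not 3.0 <= run_env <= 6.5:
        raise ValueError("run environment outside the 3.0 to 6.5 table range")
    lower = math.floor(run_env / STEP) * STEP
    if lower == run_env:
        # Exactly on an increment: no weighting needed.
        return tables[lower][state]
    upper = lower + STEP
    weight_upper = (run_env - lower) / STEP
    return ((1 - weight_upper) * tables[lower][state]
            + weight_upper * tables[upper][state])


# Made-up values for a single game state in two neighboring tables.
state = ("bottom", 8, 0, "empty", 1)  # half, inning, outs, bases, lead
sample_tables = {
    4.5: {state: 0.60},
    5.0: {state: 0.58},
}

print(interpolated_we(sample_tables, 4.75, state))
```

A 4.75 run environment sits halfway between the 4.5 and 5.0 tables, so each table gets half the weight.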

Links and Resources:

Hardball Times: The One About Win Probability
Walk Off Balk: Win Expectancy Finder
The Book Wiki: Win Expectancy


Get to Know: Clutch

Clutch: A measurement of how much better or worse a player does in high leverage situations than he would have done in a context neutral environment.

How it’s calculated: (WPA / pLI) - WPA/LI

Why you should care: Unlike traditional clutch statistics (close & late), Clutch is a much more comprehensive statistic, taking into account all situations that may or may not have been high leverage. Additionally, instead of comparing a player to the rest of the field, it compares a player to himself. A player who hits .300 in high leverage situations when he’s an overall .300 hitter is not considered Clutch.
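Here is a minimal sketch of the arithmetic, assuming Clutch is a player’s WPA divided by his pLI, minus his WPA/LI. The function name and the season totals are made up for illustration.

```python
def clutch(wpa, pli, wpa_li):
    """Clutch = (WPA / pLI) - WPA/LI, using season totals for one player."""
    if pli <= 0:
        raise ValueError("pLI must be positive")
    return wpa / pli - wpa_li


# Made-up season totals.
print(clutch(wpa=2.0, pli=1.25, wpa_li=1.5))  # did a bit better when it mattered
print(clutch(wpa=1.5, pli=1.0, wpa_li=1.5))   # same player in every situation
```

The second call comes out to zero: a player who performs the same regardless of leverage is neither clutch nor unclutch by this measure, which matches the .300 hitter example.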

Links and Resources:

All About Clutch
Baseball Fever Forum: SABR Matt


Notes From the Opener

The Japan series is over and no longer do I have to wake up at 6am to catch my daily dose of baseball. I like the idea of having a few games in Japan, but I don’t like having the season opener completely disjointed from the “other” season opener that will take place this Sunday night. It’s kind of a tease. Which leads me to Rich Harden.

Oh Rich Harden, I took a chance on you in all my fantasy drafts last year, but completely forgot about you this year! Needless to say, it looks like this could have been a huge mistake on my part. You struck out 9 in 6 innings of work while only allowing 3 hits and 1 run. The last time you struck out 9 was all the way back in 2005. I pleaded with you last year to stay healthy and it didn’t work so I’ll try again: Please stay healthy this year so you can dazzle us with your pitching ability.

Keith Foulke, who retired in 2007 due to health reasons, pitched an inning in both games and struck out 2 without allowing a run. He entered both games as the setup man and was used in a fairly high leverage situation (LI of 2.17) in the first game. So far so good for the 35-year-old.

Emil Brown made a big base-running error in the first game, costing the A’s .25 wins (only -.064 wins overall). He tried to remedy his error in the second game by leading all batters with a WPA of .12 wins, courtesy of his 3-run homer in the 3rd inning.

And finally, Manny Ramirez now leads everyone in WPA and has already eclipsed his 2007 WPA total of .31. Looks like he’s already started to rebound from his worst season ever.

And don’t forget to check out Studes’ latest “Ten Things” column over at The Hardball Times where he delves into the merits of WPA/LI among other things.


2008 Stats: Now Updated

The site is now up to date with 2008 stats and will of course be updated nightly. The new pitch data usually runs a day or two behind and that’s why all the pitchers in yesterday’s game have undefined pitch type and velocity data.

Couple quick reminders:

– We’ll have live win probability data all season long, including the All-Star Game and playoffs.

– The live play-by-play data is different from the data that becomes “official” in the nightly loads, which often causes WPA values to change slightly from those found in the live data.


The Great Clutch Project

Tangotiger is running his Clutch Project to see if you know which players on your favorite team are clutch. All you need to do is vote for which player you’d want to have at bat when the game is on the line.

Once the votes are in, we’ll be compiling the results on a daily basis during the season. In the meantime, go vote!


Live Play Logs & Box Scores

Just as the title says, there are now play logs and box scores for live data. Unlike the graphs, they do not update automatically, so you’ll have to refresh the page if you want the latest data. I might change that if enough people would prefer to have it update automatically every 30 seconds to a minute.


Get to Know: WPA/LI

WPA/LI (context neutral wins / game state linear weights): How many wins a player contributes to his team with the Leverage Index aspect removed. It was invented by Tom Tango.

Calculating WPA/LI: WPA is divided by LI for each individual play attributed to a specific player, and the WPA/LI values for the individual plays are then added up to create WPA/LI for an entire season. This is considerably different than taking a player’s WPA and dividing it by pLI.
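To see why the two calculations differ, here is a small sketch. It assumes each play is stored as a (WPA, LI) pair and that pLI is the average LI over a player’s plays; the function names and the sample plays are made up for illustration.

```python
def wpa_li(plays):
    """Sum of per-play WPA / LI (the WPA/LI stat)."""
    return sum(wpa / li for wpa, li in plays)


def wpa_over_pli(plays):
    """Total WPA divided by average LI (NOT the same thing as WPA/LI)."""
    total_wpa = sum(wpa for wpa, _ in plays)
    pli = sum(li for _, li in plays) / len(plays)
    return total_wpa / pli


# Made-up plays as (WPA, LI) pairs.
plays = [(0.10, 2.0), (-0.02, 0.5), (0.03, 1.0)]
print(wpa_li(plays))        # 0.05 - 0.04 + 0.03
print(wpa_over_pli(plays))  # 0.11 / 1.1667
```

Dividing play by play shrinks the high leverage hit and inflates the low leverage out before anything is added up, so the two results only line up when every play has the same LI.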

Why you should care: Unlike standard linear weights, WPA/LI does take into account the situation. So at times when a walk would be just as valuable as a home run, WPA/LI accurately weights the walk and the home run, whereas linear weights would still give .13 wins to the home run and .03 wins to the walk.

Links and Resources:

Unleveraging Win Probability
The Book Wiki: Linear Weights