Hudson & Perez: BIS & Pitchf/x

I thought it’d be interesting to see how Baseball Info Solutions pitch type data matches up with Pitchf/x’s pitch type identification data for just the two starters in the Nationals opener. Before I begin, note that Pitchf/x has a new field in its data called “type_confidence”, which appears to be a measure of how accurate each classification is. All the Pitchf/x aggregates were found on Josh Kalk’s player cards.

Tim Hudson:

Fastball - BIS: 70.5% (89.3mph) |  Pitchf/x: 71.8% (91.9mph)
Slider   - BIS: 19.2% (84.5mph) |  Pitchf/x: 18.0% (86.9mph)
Changeup - BIS:  5.1% (78.2mph) |  Pitchf/x:  7.7% (80.5mph)
Cutter   - BIS:  2.6% (86.5mph) |  Pitchf/x:  2.6% (89.1mph)
Splitter - BIS:  2.6% (78.0mph) |  -------------------------

Upon closer inspection, the pitch logs for Hudson are incredibly similar, with BIS and Pitchf/x disagreeing on a mere 3 pitches. Two pitches classified as changeups by Pitchf/x were classified as splitters by BIS. The third was classified as a fastball by Pitchf/x, but as a slider by BIS. That third pitch had the lowest Pitchf/x type_confidence of any pitch in the game.

Odalis Perez:

Fastball - BIS: 59.7% (86.0mph) |  Pitchf/x: 78.8% (88.7mph)
Cutter   - BIS: 28.4% (83.6mph) |  -------------------------
Changeup - BIS: 10.4% (79.3mph) |  Pitchf/x:  4.4% (81.5mph)
Curve    - BIS:  1.5% (74.0mph) |  -------------------------
Slider   - -------------------- |  Pitchf/x: 14.5% (86.5mph)
Splitter - -------------------- |  Pitchf/x:  2.9% (85.3mph)

Needless to say, things were not as rosy for Perez. BIS and Pitchf/x disagreed on 27 pitches, not counting 3 pitches that one source or the other left unidentified. Nine pitches that BIS classified as cutters were classified as sliders by Pitchf/x. The average type_confidence for the other 18 pitches in disagreement was .56, compared with .68 for pitches in agreement.
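For anyone who wants to run the same kind of comparison, here is a minimal sketch of the bookkeeping in Python. It assumes you already have the two pitch logs lined up pitch by pitch. The function name, the pitch-type abbreviations and the toy data are all made up for illustration, and this is not necessarily how the numbers above were produced.

```python
from statistics import mean

def compare_pitch_logs(bis_types, pfx_types, confidences):
    """Compare two per-pitch classification lists of equal length.

    A pitch type of None means that source did not identify the pitch;
    those pitches are counted separately and left out of the averages.
    """
    if not (len(bis_types) == len(pfx_types) == len(confidences)):
        raise ValueError("all three lists must have one entry per pitch")

    agree, disagree, unidentified = [], [], 0
    for bis, pfx, conf in zip(bis_types, pfx_types, confidences):
        if bis is None or pfx is None:
            unidentified += 1
            continue
        # Collect the type_confidence value in the matching bucket.
        (agree if bis == pfx else disagree).append(conf)

    return {
        "agree": len(agree),
        "disagree": len(disagree),
        "unidentified": unidentified,
        "avg_conf_agree": mean(agree) if agree else None,
        "avg_conf_disagree": mean(disagree) if disagree else None,
    }

# Hypothetical toy data, not the actual pitch logs from the game.
bis = ["FB", "CT", "CH", "FB", None, "CT"]
pfx = ["FB", "SL", "CH", "FB", "FB", "SL"]
conf = [0.9, 0.5, 0.7, 0.8, 0.6, 0.6]

summary = compare_pitch_logs(bis, pfx, conf)
print(summary)
```

Keeping the unidentified pitches in their own bucket mirrors the way the 3 non-identified pitches were left out of the disagreement count above.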

For what it’s worth, Perez claims to throw a fastball, cutter, changeup and curveball.

That’s all I’ve got for now. I haven’t had time to take a hard look at why the pitches might have been classified differently using any of the break data, and nothing stands out as different about those pitches at first glance.

Update: Josh Kalk over at The Hardball Times took an in-depth look at Pitchf/x’s pitch classification in Tim Hudson’s first start.





David Appelman is the creator of FanGraphs.

John Beamer
16 years ago

Great analysis, Dave. It is really important that we can cross-reference the data sources. Is the Gameday pitch analysis done by sight or through an algorithm?