Hudson & Perez: BIS & Pitchf/x

I thought it’d be interesting to see how Baseball Info Solutions (BIS) pitch type data matches up with Pitchf/x’s pitch type identification data for just the two starters in the Nationals’ opener. Before I begin, note that Pitchf/x has a new field in its data called “type_confidence”, which appears to be a measure of how confident the system is in each classification. All the Pitchf/x aggregates were found on Josh Kalk’s player cards.

Tim Hudson:

Fastball - BIS: 70.5% (89.3mph) |  Pitchf/x: 71.8% (91.9mph)
Slider   - BIS: 19.2% (84.5mph) |  Pitchf/x: 18.0% (86.9mph)
Changeup - BIS:  5.1% (78.2mph) |  Pitchf/x:  7.7% (80.5mph)
Cutter   - BIS:  2.6% (86.5mph) |  Pitchf/x:  2.6% (89.1mph)
Splitter - BIS:  2.6% (78.0mph) |  -------------------------

Upon closer inspection, the pitch logs are incredibly similar for Hudson, with BIS and Pitchf/x disagreeing on a mere 3 pitches. Two pitches classified as changeups by Pitchf/x were classified as splitters by BIS. The third was classified as a fastball by Pitchf/x but as a slider by BIS. That third pitch had the lowest Pitchf/x type_confidence of any pitch in the game.
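
A pitch-by-pitch comparison like this amounts to lining up the two logs and tallying the label pairs that differ. Here is a minimal Python sketch, assuming each log is simply a list of pitch-type labels in pitch order. The function name and the sample labels are made up for illustration and are not the actual game logs.

```python
from collections import Counter


def tally_disagreements(bis_labels, pfx_labels):
    """Count (BIS label, Pitchf/x label) pairs where the two sources differ."""
    if len(bis_labels) != len(pfx_labels):
        raise ValueError("logs must cover the same pitches in the same order")
    return Counter(
        (bis, pfx)
        for bis, pfx in zip(bis_labels, pfx_labels)
        if bis != pfx
    )


# Illustrative labels only; these are NOT the real pitch logs.
bis = ["Fastball", "Splitter", "Slider", "Fastball", "Splitter"]
pfx = ["Fastball", "Changeup", "Fastball", "Fastball", "Changeup"]

disagreements = tally_disagreements(bis, pfx)
for (bis_type, pfx_type), count in disagreements.most_common():
    print(f"BIS {bis_type} / Pitchf/x {pfx_type}: {count}")
```

Keeping the pairs ordered as (BIS, Pitchf/x) makes it easy to see which way a disagreement runs, such as splitters on one side showing up as changeups on the other.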

Odalis Perez:

Fastball - BIS: 59.7% (86.0mph) |  Pitchf/x: 78.8% (88.7mph)
Cutter   - BIS: 28.4% (83.6mph) |  -------------------------
Changeup - BIS: 10.4% (79.3mph) |  Pitchf/x:  4.4% (81.5mph)
Curve    - BIS:  1.5% (74.0mph) |  -------------------------
Slider   - -------------------- |  Pitchf/x: 14.5% (86.5mph)
Splitter - -------------------- |  Pitchf/x:  2.9% (85.3mph)

Needless to say, things were not as rosy for Perez. There were 27 pitches in disagreement, not counting 3 pitches that disagree because either BIS or Pitchf/x failed to identify them. Nine pitches that BIS classified as cutters were classified as sliders by Pitchf/x. The average type_confidence for the other 18 pitches in disagreement was .56, compared with .68 for pitches in agreement.
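
The confidence comparison can be sketched the same way: drop the pitches that either source failed to identify, split the rest by whether the labels agree, and average type_confidence within each group. The record layout below (tuples of BIS label, Pitchf/x label and type_confidence, with None marking an unidentified pitch) is an assumption for illustration, and the values are invented rather than taken from Perez’s start.

```python
from statistics import mean


def confidence_by_agreement(pitches):
    """Return (mean confidence when labels agree, mean when they disagree).

    Pitches that either source left unidentified (None) are skipped.
    A group with no pitches yields None instead of an average.
    """
    agree, disagree = [], []
    for bis, pfx, conf in pitches:
        if bis is None or pfx is None:
            continue  # non-identification is not counted as a disagreement
        (agree if bis == pfx else disagree).append(conf)
    return (
        mean(agree) if agree else None,
        mean(disagree) if disagree else None,
    )


# Illustrative records only; these are NOT the real Pitchf/x values.
sample = [
    ("Fastball", "Fastball", 0.80),
    ("Fastball", "Fastball", 0.60),
    ("Cutter", "Slider", 0.50),
    ("Changeup", "Fastball", 0.40),
    (None, "Fastball", 0.90),
]

agree_avg, disagree_avg = confidence_by_agreement(sample)
print(f"agree: {agree_avg:.2f}, disagree: {disagree_avg:.2f}")
```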

For what it’s worth, Perez claims to throw a fastball, cutter, changeup and curveball.

That’s all I’ve got for now. I haven’t had time to use the break data to take a hard look at why the pitches might have been classified differently, and nothing stands out as different about those pitches at first glance.

Update: Josh Kalk over at The Hardball Times took an in-depth look at Pitchf/x’s pitch classification of Tim Hudson’s first start.





David Appelman is the creator of FanGraphs.

6 Comments
John Beamer
17 years ago

Great analysis, Dave. It is really important that we can cross-reference the data sources. Is the Gameday pitch analysis done by sight or through an algorithm?