When making any prediction for a young player, dealing with minor league data is an absolute necessity. This still remains a relatively new thing in baseball’s history, with little attention given to minor league stats until Bill James introduced his method of Major League Equivalency in the 1985 Baseball Abstract. Twenty-five years ago, I wrote one of the first things of mine to ever hit the broader internet, a quick primer on how to calculate James’ MLEs. Working with the data was immensely difficult at the time, and even worse when James was developing MLEs. There was no central repository of minor league stats, and just getting the current year was highly difficult; on the young internet of the time, you basically had to copy and paste from Baseball America’s basic data. For past years, there was just about nothing outside of what you could get from STATS. As a youngster, I pretty much spidered the data off of STATS on AOL, which surprisingly had the most data available publicly at the time.
Sabermetrics was a more difficult task back then. Even when Baseball-Reference initially became the first actually usable website, powered by the Lahman database, for the first few years, stats were updated after the season. There was no minor league data there, or anywhere, really. That improved in subsequent seasons, and with more data than James had to work with, people such as Clay Davenport, Voros McCracken, and myself were able to put together our own systems. ZiPS never becomes a thing without minor league data to work on to make the inputs properly. Since James is the one that broke ground, I still call the ZiPS translations zMLEs. These days, I have minor league translations going all the way back to the 1950s.
As we approach midseason, many of the current minor league translations in the upper minors have become more interesting as we get farther from Small Sample Shenanigans. I wanted to take the opportunity to highlight some of the numbers with relevance to the rest of the major league season. Remember: minor league translations are not actual predictions but should be treated like any other line of play, with the same possible pitfalls, the same need for context, and the same opportunity to be misleading in certain ways, such as freak BABIP totals (though ZiPS tries to adjust for the last one). All these lines are adjusted to the context of the parent club’s home park and 2023’s level of offense in the majors. All translations are through Monday’s games.
As a measure to improve baseball for the average fan — or even the decidedly non-average fans who frequent our pages — I think the pitch clock has been a resounding success. Trimming almost half an hour from the length of games hasn’t diminished baseball itself, with the cutting room floor mainly littered with the things that take place in between the action. Now, you can argue that we’ve also eliminated some of the dramatic tension from crucial situations in important games. But for every high-stakes matchup between two great players in a big moment, there were a multitude of unimportant ones stretched out endlessly by a parade of uniform readjustments and crotch reconfigurations. I enjoy having a leisurely Campari and soda with a friend while waiting for dinner, but I certainly don’t want to do that for every meal, and if I could chop down cocktail hour to get my food more quickly, I’d happily find other moments for social bonding.
Of course, game length isn’t the only consideration when assessing the pitch clock. I’m frequently asked in my chats if I think a given pitcher’s underperformance relative to expectation can be attributed to the clock. It can’t feel great to do a job for a number of years and suddenly experience such a monumental change in how you go about executing it. Steve Trachsel ain’t punching no time clock!
Another big question is whether the pitch clock, which can result in mechanical changes, could have an effect on injuries, a subject Will Sammon, Brittany Ghiroli and Eno Sarris explored for The Athletic after a high injury rate in April. While we obviously don’t have enough data to reach a verdict on the long-term effects of the clock (and things like Tommy John surgery count are still going to involve relatively small samples), as we near the halfway point of the season, we do have enough information to look at how the data are shaking out and arrive at some kind of preliminary conclusion about what’s going on.
One of the strange things about projecting baseball players is that even results themselves are small sample sizes. Full seasons result in specific numbers that have minimal predictive value, such as BABIP for pitchers. The predictive value isn’t literally zero — individual seasons form much of the basis of projections, whether math-y ones like ZiPS or simply our personal opinions on how good a player is — but we have to develop tools that improve our ability to explain some of these stats. It’s not enough to know that the number of home runs allowed by a pitcher is volatile; we need to know how and why pitchers allow homers beyond a general sense of pitching poorly or being Jordan Lyles.
Data like that which StatCast provides gives us the ability to get at what’s more elemental, such as exit velocities and launch angles and the like, things that are in themselves more predictive than their end products (the number of homers). StatCast has its own implementation of this kind of exercise in its various “x” stats. ZiPS uses slightly different models with a similar purpose, which I’ve dubbed zStats. (I’m going to make you guess what the z stands for!) The differences in the models can be significant. For example, when talking about grounders, balls hit directly toward the second base bag became singles 48.7% of the time from 2012 to 2019, with 51.0% outs and 0.2% doubles. But grounders hit 16 degrees to the “left” of the bag only became hits 10.6% of the time over the same stretch, while toward the second base side, it was 9.8%. ZiPS uses data like sprint speed when calculating hitter BABIP, because how fast a player is has an effect on BABIP and extra-base hits.
The prospect ranks are as high as an elephant’s eye at Castellini Farms. The Reds may have entered this rebuilding cycle with all the grace of an angry cat trying to get a cereal box off its head (as opposed to the awkward toe-dipping of the last go-around), but through trades and their own scouting, they’ve accumulated an impressive amount of talent in the minors. By our in-progress farm system rankings, only the Baltimore Orioles place higher for the 2023 season. Mean ol’ Grandpa ZiPS agrees; the Reds had seven prospects on the preseason ZiPS Top 100, a total that trailed only the Guardians and the O’s. Baltimore and Cincinnati combined seem to have about 80% of the shortstop prospects in baseball.
Whether you go by human or machine, no Red ranked more highly this winter than Elly De La Cruz, who was no. 6 (60 FV) on the prospect team’s Top 100 and no. 15 on the ZiPS list. After an impressive 2021 full-season debut, De La Cruz cranked things up a notch in 2022, hitting 28 homers and slugging .586 combined across High- and Double-A despite only being 20 years old. Questions still remain about his long-term defensive position, but his bat has proved to be even more potent for Triple-A Louisville, as he hit 12 home runs in a mere 38 games and is already two-thirds of the way to last year’s walk total. He’s responsible for the International League’s ERA going up by nearly half a run a game from 2022! OK, I made that last bit up, but you had to actually think about it for a full second before you smelled burning khaki.
The Pirates are having a relatively successful 2023, as the team has defied expectations and is currently in second place in the NL Central, just a half-game behind the first-place Brewers. The record on the field isn’t the only thing coming up Pirate this year: the team successfully signed Bryan Reynolds to an eight-year, $106.75 million contract extension, ending the eternal and well-founded speculation about which team the Bucs would trade him to and when. With Ke’Bryan Hayes already signed to a $70 million extension — then the largest dollar figure for a contract in franchise history — Pittsburgh has discussed locking up two other foundational talents, Mitch Keller and Oneil Cruz.
Despite the Pirates signing a nine-figure deal with Reynolds, it would be a mistake to assume that it foreshadows a new era in team spending that gets them into the next tier up in the spending ranks. The last time they finished even 20th in baseball in payroll was 20 years ago, in 2003, and they’re usually in the bottom five. There they will stay, but if they spend a good proportion of that self-limited budget on their best young talent, they get their best shots at the NL Central and don’t explicitly look like a stop for young players between Triple-A and the majors. To manage this, Pittsburgh has to sign its young players sooner rather than later, and absorb additional risk.
Of these two players, Keller’s extension is probably the more urgent matter to attend to. The least expensive time to sign him would have been a few years ago, when he was struggling to adjust to the majors and the Pirates could, as noted above, defray some of the cost by taking on that additional risk that he’d never develop. Keller is eligible to hit free agency after the 2025 season, so there’s a real ticking clock here; the longer the Pirates take to come to an agreement, the less financial reason their ace has to take one and the less talent would come to Pittsburgh in the event of a trade. Now that Keller’s breakout appears to be a reality and not a fluke or merely speculation about the future, he has a lot more financial leverage than he did a year ago.
While Keller worked out most of his remaining command issues last season, he still suffered a bit from having strikeout stuff but not being great at actually collecting those Ks. The full version of ZiPS still sees his improved swinging-strike rate not supporting the impressive 50% bump in his overall strikeout rate, but it does agree that his performance in 2023 in this department represents real and significant improvement. As such, Keller has one of the biggest bumps among pitchers from his preseason long-term projection. Even the simpler in-season projection version of ZiPS still has him finishing in the top five in the NL in WAR, behind just Zac Gallen, Zack Wheeler, Spencer Strider, and Logan Webb.
Perhaps the most striking example of Keller’s continued breakout is just how improved his cohort of most similar past pitchers is. Here are the top 50 pitchers in ZiPS similarity (with the specific year at which their baseline is similar) for Keller both before 2023 and now:
You will note that I didn’t label which column was which, because I’m just that confident that you’ll know in about a half-second of glancing which list is the better one!
In sum, ZiPS suggests that a fair deal right now would be six years, $116 million:
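ZiPS Projection – Mitch Keller

| Year | W | L | ERA | G | GS | IP | H | ER | HR | BB | SO | ERA+ | WAR |
|------|---|---|-----|---|----|----|---|----|----|----|----|------|-----|
| 2024 | 10 | 8 | 3.39 | 28 | 28 | 167.3 | 147 | 63 | 16 | 39 | 187 | 123 | 3.7 |
| 2025 | 9 | 8 | 3.48 | 27 | 27 | 160.3 | 142 | 62 | 15 | 37 | 175 | 119 | 3.4 |
| 2026 | 8 | 9 | 3.60 | 26 | 26 | 157.3 | 142 | 63 | 15 | 36 | 168 | 115 | 3.2 |
| 2027 | 8 | 9 | 3.73 | 26 | 26 | 152.0 | 141 | 63 | 15 | 35 | 157 | 111 | 2.9 |
| 2028 | 8 | 9 | 3.80 | 26 | 26 | 149.3 | 142 | 63 | 15 | 35 | 150 | 109 | 2.6 |
| 2029 | 7 | 9 | 3.96 | 24 | 24 | 145.3 | 143 | 64 | 16 | 34 | 141 | 105 | 2.3 |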
This reflects the fact that he has two more years of arbitration; a projected offer as a free agent would be six years, $153 million, or seven years, $171 million.
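As a rough back-of-the-envelope check on those figures, the sketch below sums the projected WAR from the table and divides each contract total by it. This is not how ZiPS prices a contract; the flat dollars-per-WAR rate, the function name, and the assumption that the six-year free-agent figure covers the same 2024 to 2029 seasons are all mine, for illustration only.

```python
# Projected WAR by season, copied from the ZiPS table above.
keller_war = {2024: 3.7, 2025: 3.4, 2026: 3.2, 2027: 2.9, 2028: 2.6, 2029: 2.3}

# Contract totals quoted in the text, in millions of dollars.
EXTENSION_MILLIONS = 116.0   # six years, with two arbitration years remaining
FREE_AGENT_MILLIONS = 153.0  # six years on the open market (assumed same seasons)


def dollars_per_war(total_millions, war_by_year):
    """Flat $M-per-WAR rate implied by a contract total (a simplification)."""
    return total_millions / sum(war_by_year.values())


total_war = sum(keller_war.values())
extension_rate = dollars_per_war(EXTENSION_MILLIONS, keller_war)
free_agent_rate = dollars_per_war(FREE_AGENT_MILLIONS, keller_war)
arbitration_discount = FREE_AGENT_MILLIONS - EXTENSION_MILLIONS

print(f"Projected WAR, 2024-2029: {total_war:.1f}")
print(f"Implied rate on the extension: ${extension_rate:.2f}M per WAR")
print(f"Implied rate as a free agent:  ${free_agent_rate:.2f}M per WAR")
print(f"Gap attributable to the arbitration years: ${arbitration_discount:.0f}M")
```

The gap between the two totals is the part of the price that Keller's remaining arbitration years knock off, which is why the extension gets more expensive the longer the team waits.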
Despite being a shortstop — for now at least — rather than a pitcher, Cruz is the riskier of the pair. He’s less established in the majors than Keller and is currently on the IL with a fractured ankle, making a projection that much trickier. But when agreeing to a mutually beneficial contract, you basically have to pay either in currency or risk, and if the Pirates don’t want to give up a ton of the former, they’ll have to be willing to pay by taking on a bunch of the latter. Even if the current injury makes the atmosphere a little too much like gambling for either side of the negotiations, a healthy Cruz — which is expected to be a thing sometime around August — should be enough to kickstart talks.
Cruz was one of my breakout picks this year, and while the ankle means that’s one that I’m unlikely to get right, he still has a great deal of upside with his game-changing power. And his contact issues, while concerning, are at least a problem that you can pinpoint; it only takes a few percentage points of a bump in contact rate for ZiPS to start projecting him along the lines of Javier Báez’s prime. Cruz had already shown an uptick in his nine games this year, walking seven times as his contact rate hit 70%. Nine games is a pitifully small sample size, even for less volatile numbers, but it’s certainly better than those numbers going in the opposite direction!
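ZiPS Projection – Oneil Cruz

| Year | BA | OBP | SLG | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | OPS+ | DR | WAR |
|------|----|-----|-----|----|---|---|----|----|----|-----|----|----|----|------|----|-----|
| 2024 | .249 | .314 | .457 | 394 | 66 | 98 | 18 | 5 | 18 | 64 | 36 | 121 | 12 | 109 | -4 | 2.1 |
| 2025 | .250 | .317 | .459 | 412 | 70 | 103 | 19 | 5 | 19 | 68 | 39 | 121 | 13 | 111 | -4 | 2.3 |
| 2026 | .253 | .321 | .463 | 430 | 75 | 109 | 20 | 5 | 20 | 71 | 42 | 122 | 12 | 113 | -4 | 2.5 |
| 2027 | .254 | .324 | .465 | 437 | 77 | 111 | 21 | 4 | 21 | 72 | 44 | 121 | 12 | 114 | -4 | 2.7 |
| 2028 | .254 | .324 | .456 | 441 | 77 | 112 | 21 | 4 | 20 | 72 | 45 | 120 | 11 | 112 | -4 | 2.5 |
| 2029 | .251 | .322 | .442 | 439 | 76 | 110 | 21 | 3 | 19 | 69 | 45 | 118 | 9 | 108 | -5 | 2.2 |
| 2030 | .250 | .321 | .444 | 428 | 74 | 107 | 20 | 3 | 19 | 67 | 44 | 116 | 9 | 108 | -5 | 2.1 |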
Cruz has more upside than Hayes does, but the latter is healthy and closer to his potential than the former is to his higher potential, and the contract projection comes out similarly to the extension that Hayes signed: a seven-year, $67 million extension that delays Cruz’s free agency by two years.
Will contracts like these single-handedly make the Pirates a perennial contender? No, but signing young players to long-term deals at least gives them the path to long-term relevance in the NL Central and gives a fanbase that’s been beaten up for 30 years some hope that it’ll get to see PNC Park be the long-term home for the team’s core rather than a set of turnstiles.
One of the strange things about projecting baseball players is that even results themselves are small sample sizes. Full seasons result in specific numbers that have minimal predictive value, such as BABIP for pitchers. The predictive value isn’t literally zero — individual seasons form much of the basis of projections, whether math-y ones in something like ZiPS or simply personal opinions on how good a player is — but we have to develop tools that improve our ability to explain some of these stats. It’s not enough to know that the number of homers allowed by a pitcher is volatile; we need to know how and why pitchers allow homers beyond a general sense of pitching poorly or being Jordan Lyles.
Data like that which StatCast provides gives us the ability to get at what is more elemental, such as exit velocities and launch angles and the like, things that are in themselves more predictive than their end products (the number of homers). StatCast has its own implementation of this kind of exercise in the various “x” stats. ZiPS uses slightly different models with a similar purpose, which I’ve dubbed zStats. (I’m going to make you guess what the z stands for!) The differences in the models can be significant. For example: when talking about grounders, balls hit directly toward the second base bag became singles 48.7% of the time from 2012 to ’19, with 51.0% outs and 0.2% doubles. But grounders hit 16 degrees to the “left” of the bag, over the same time period, only became hits 10.6% of the time, and toward the second base side, 9.8%. ZiPS uses data like sprint speed when calculating hitter BABIP, because how fast a player is has an effect on BABIP and extra-base hits.
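To make the grounder example concrete, here is a minimal sketch of why direction matters, using only the hit rates quoted above. It is a toy lookup table, not the zStats model: the bucket labels, the function name, and the two hypothetical hitters are my own assumptions, and a real model would use continuous angles plus inputs such as exit velocity and sprint speed.

```python
# Hit rates for grounders by direction, 2012-2019, as quoted in the text.
# The bucket labels are hypothetical names chosen for this illustration.
GROUNDER_HIT_RATE = {
    "at_second_base_bag": 0.487 + 0.002,  # singles plus doubles
    "16_deg_left_of_bag": 0.106,
    "toward_second_base_side": 0.098,
}


def expected_hits(grounders_by_bucket):
    """Expected hits for a set of grounders, given counts per direction bucket."""
    return sum(
        GROUNDER_HIT_RATE[bucket] * count
        for bucket, count in grounders_by_bucket.items()
    )


# Two made-up hitters with the same number of grounders (30) but a different mix.
hitter_a = {"at_second_base_bag": 20, "16_deg_left_of_bag": 10}
hitter_b = {"at_second_base_bag": 5, "16_deg_left_of_bag": 25}

hits_a = expected_hits(hitter_a)
hits_b = expected_hits(hitter_b)

print(f"Hitter A expected hits on 30 grounders: {hits_a:.2f}")
print(f"Hitter B expected hits on 30 grounders: {hits_b:.2f}")
```

Same count of ground balls, roughly twice the expected hits for the first hitter, which is the kind of difference a raw grounder total cannot show.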
ZiPS doesn’t discard actual stats; the actual models all improve from knowing the actual numbers in addition to the zStats. For data on how zStats relate to actual stats, I’ve talked more about this here and here.
But you’re here to see the numbers themselves, not the exposition, so let’s star wipe to the main storyline.
May was a successful month for the Tigers, a franchise which in recent years has been lacking in happily remembered calendar pages. Detroit’s .577 winning percentage in May (15–11) is its best full month since a .615 mark (16–10) all the way back in July 2016. And while it would be a stretch to say that everything has been coming up bengal, given that the team’s run differential is still slightly in the negative for the month, the bleakness of the AL Central has allowed the Tigers to come within a game of the division lead. Even Spencer Torkelson, whose bat disappeared in 2022, has been playing better baseball, putting up a 119 wRC+ in May. Unfortunately, fate wasn’t even kind enough to give Detroit the whole month; a couple of days before the calendar flip, injuries to Eduardo Rodriguez and Riley Greene have ended May on a decidedly sour note.
These Tigers certainly aren’t strangers to injury. Every team faces injuries sooner or later, but Detroit managed to win in May despite an entire rotation’s worth of promising pitchers (Tarik Skubal, Matt Manning, Spencer Turnbull, Casey Mize, and Beau Brieske) out with injuries. The contributions of Rodriguez and Greene had a lot to do with that. The former’s hot April start continued in May with a 2.03 ERA/2.61 FIP over five outings; the latter hit a star-level .365/.435/.573 for the month. That came crashing down on Tuesday with two bits of very unwelcome news.
Well-Beered Englishman: What’s a weird historical hot take you had that you can fess up to? This morning, I was remembering when I used to think, “How will the Angels make room for Mike Trout when they already have Peter Bourjos?”
12:00
Dan Szymborski: I liked the Soriano-Wilkerson trade. The Wilkerson part.
12:00
Blake: Is Mitch Haniger going to get it going?
12:00
Dan Szymborski: To an extent.
12:00
Ray: When does Shane McClanahan get universally recognized as a top five pitcher in baseball?