A Primer on a New and Improved KATOH
My role here at FanGraphs is to write about minor league players. Nearly all of my articles focus on the output from my KATOH projection system, which produces long-term forecasts for players who are still in the minor league phase of their careers. Today, I’m unveiling some updates to my model that will be reflected in my analysis from this point forward.
I’ve been meaning to work these updates into KATOH for quite some time, but didn’t have the chance to finish until now. Some pieces of this took a bit longer than expected, and day job stuff, along with this year’s onslaught of prospect debuts, pushed things to the backseat a bit. But I’m all caught up and ready to unveil my new and improved KATOH. Here we go!
Rather than just putting out a straight leaderboard, I thought I’d use this as an opportunity to explain some of the inner workings of KATOH. I wanted to say something more insightful than “These are the best prospects because math.” That’s why this piece runs 2,000+ words without reference to a single baseball player. If you’re just interested in the output rather than the nitty-gritty, check back after Thanksgiving for KATOH’s top 100 list. I just wanted to get all of this background stuff down in one place, rather than cluttering future pieces with extra information.
*****
Obligatory Technical Details
The general framework of my model is largely the same as it’s always been. As I did in the past, I deployed a series of probit regressions to see what factors are most predictive of major league performance. For each player, I generated probabilities that he would achieve certain benchmarks through his age-28 season: play in the major leagues, earn at least 1 WAR, earn at least 2 WAR, etc. These percentages gave me a probabilistic outlook for each player, and enabled me to calculate an “expected value” for his WAR through age 28.
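For readers who like to see the arithmetic, here is a minimal Python sketch of how a set of benchmark probabilities can be rolled up into an expected value. It is an illustration under stated assumptions, not KATOH's actual code: the function names, the coefficients, and the example probabilities are all made up, and the roll-up uses a simple tail-sum approximation (add up P(WAR ≥ t) times the gap between thresholds) that ignores negative WAR and will tend to understate the true expectation.

```python
from statistics import NormalDist


def probit_probability(features, coefficients, intercept):
    """Probit link: P(benchmark reached) = Phi(intercept + sum(coef * feature)).

    The feature and coefficient values are hypothetical inputs, not
    anything taken from KATOH itself.
    """
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return NormalDist().cdf(z)  # standard normal CDF


def expected_war(threshold_probs):
    """Approximate expected WAR from (threshold, P(WAR >= threshold)) pairs.

    Assumption: uses the tail-sum identity E[X] = integral of P(X >= t) dt
    for non-negative X, approximated by a step function that holds each
    probability over the gap back to the previous threshold (starting at 0).
    """
    total = 0.0
    previous = 0.0
    for threshold, prob in sorted(threshold_probs):
        total += prob * (threshold - previous)
        previous = threshold
    return total


# Made-up probabilities for an imaginary prospect, for illustration only.
example = [(1, 0.40), (2, 0.30), (3, 0.20), (4, 0.10)]
print(round(expected_war(example), 2))
```

In this toy setup, each benchmark would get its own probit fit, and `expected_war` simply weights each one-win step by the chance of clearing it. A finer grid of thresholds would tighten the approximation.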