From the Majors to the NPB: Analyzing MLB Expats
by Bradley Woodrum, October 11, 2012

On Monday, I took a look at how the new ball in Japan’s NPB league may be affecting the predictability of Japanese talent. The first signs are good: It looks like the new, standardized baseball in Japan (a ball which mimics the MLB design, starting in 2011) may be resulting in more direct skill translations into the MLB. This means MLB franchises will be able to identify priority athletes better and extend more appropriate contracts and posting fees.

In the comments of that article, KJOK of Seamheads.com suggested, quite rightly, that the conversion conversation should go two ways:

“…I think it is important to examine the players that went TO Japan since the new ball was introduced, and see how they also performed against expectations.”

Let’s do just that.

First of all, let’s discuss the major problem we have. More than just rampant sample size issues, we have the overarching issue of survivor bias leftovers, as I call them. Usually when we deal with survivor bias, it has a positive skew; here it is negative.

What do I mean? When we look at, say, aging data, we usually want the biggest sample sizes possible, which can mean taking only players who had more than a few thousand plate appearances. The net result is we are pulling data of only good players; bad players or players who peaked too early will not keep receiving plate appearances (usually). So, we have a bias towards the surviving careers, careers which may in fact defy the typical aging odds to begin with.

Here, we have kind of the sludge from survivor bias. Most MLB players who go to Japan do so because they are trying to either resuscitate their careers or finish out their days against weaker competition.
We have some guys like Matt Murton, who did well in the US before going to Japan, but most of the expats in the NPB are like Wladimir Balentien (guys who mashed in Triple-A, but struggled at the MLB level) or Kaz Matsui (guys who were useful in their time, but are on the wrong end of the aging curve). So the NPB is a mixed bag. And that is how you end up with this:

[Chart: MLB performance vs. NPB performance for MLB expats]

In no reasonable universe should there be a negative relationship between MLB performance and NPB performance (I am surprised to even see the R-squared as high as 0.69). A negative relationship implies the worse a player does domestically, the better he will do overseas. This data suggests the worst hitter in NPB history would be someone of Albert Pujols’ ilk.

If, however, we limit the data to players with at least 2000 PA combined (career MLB PA plus NPB PA from the last two seasons), we get a more reasonable relationship (well, at least it’s positive) but still a complicated one. For instance, we are comparing age 30 through 33 Tadahito Iguchi against age 37 and 38 Iguchi.

The comparisons are not clean. Several players spent time in Triple-A or Mexico or other foreign leagues before earning a roster spot in Japan. The aforementioned Balentien swung at his last MLB pitch at age 24; now he is 27 and dominating the NPB. Odds are, he got his 559 MLB plate appearances before his physical peak.

So this is still very much a work in progress. In general, the best comparison will come from the other direction, NPB stars joining the MLB, because those particular players tend to get steady playing time on both sides of the transition, and there are not as many intermission years. Also, as more years of new ball data accrue, more players like Wily Mo Pena, who joined the NPB just this year, will offer better transition samples.

But, if the current data outlay can be trusted even a little, it is not a great sign for the new NPB ball increasing the predictive powers of Japanese hitting statistics.
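
For anyone who wants to rerun this kind of check as more new-ball seasons accrue, here is a minimal sketch of the comparison described above: fit a line of NPB performance against MLB performance, once for everybody and once for only the players with at least 2000 combined PA. Everything in it is an assumption made for illustration. The `Expat` fields, the function names, and above all the `TOY_PLAYERS` numbers are invented to show the mechanics; they are not the data behind the chart, and the rate stat could be whatever metric you prefer.

```python
from dataclasses import dataclass


@dataclass
class Expat:
    """One hitter on both sides of the move. Hypothetical structure, not the article's dataset."""
    name: str
    mlb_pa: int      # career MLB plate appearances
    npb_pa: int      # NPB plate appearances over the last two seasons
    mlb_rate: float  # some rate stat in the MLB (placeholder metric)
    npb_rate: float  # the same rate stat in the NPB


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two players to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in one of the stats")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, R-squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def mlb_to_npb_fit(players, min_combined_pa=0):
    """Fit NPB rate on MLB rate, keeping players at or above the combined PA cutoff."""
    kept = [p for p in players if p.mlb_pa + p.npb_pa >= min_combined_pa]
    return fit_line([p.mlb_rate for p in kept], [p.npb_rate for p in kept])


# Made-up numbers purely to exercise the code; these are NOT real players.
TOY_PLAYERS = [
    Expat("Hitter A", 150, 900, 0.250, 0.400),
    Expat("Hitter B", 300, 800, 0.270, 0.380),
    Expat("Hitter C", 2500, 700, 0.320, 0.330),
    Expat("Hitter D", 3500, 600, 0.340, 0.350),
    Expat("Hitter E", 1800, 500, 0.330, 0.335),
]

if __name__ == "__main__":
    for cutoff in (0, 2000):
        slope, intercept, r2 = mlb_to_npb_fit(TOY_PLAYERS, min_combined_pa=cutoff)
        print(f"min combined PA {cutoff}: slope={slope:.3f}, R^2={r2:.3f}")
```

The toy numbers are rigged so the slope flips sign once the low-PA hitters are dropped, which is the pattern described above; with real data, the thing to watch is whether the sign and the R-squared hold up as the cutoff moves.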