Job Posting: Pirates Data Architect

Position: Data Architect

Location: Pittsburgh, PA

Description:
The Data Architect is responsible for putting into place and maintaining the processes and systems to efficiently integrate and effectively make available baseball related data from both external and internal sources in order to provide the backbone of evidence based decision making.

Responsibilities

Primary:

  • Responsible for the daily operation, performance, and maintenance of the data assets used within Baseball Operations. Makes use of Pirates’ standards and industry best practices to implement efficient and high performance access to data.
  • Design, create, and extend processes for data extraction, transformation, cleansing, and load to and from internal and external data sources for both structured and unstructured data.
  • Evaluate potential data providers and design and implement data models for storage and access to new types of data that integrate with the existing data and application architectures.
  • Design, create and maintain reporting structures using SQL Reporting Services and Tableau and participate in one-off research projects to answer specific questions.
  • Design and implement data mining processes as a part of predictive modeling in conjunction with the Quantitative Analyst and other staff members.

Secondary:

  • Departmentally: Participate in gathering and documenting user requirements for existing and new systems. Understand business processes and required outcomes of the system and create a requirements definition document defining the business use cases.
  • Organizationally: Act as a resource for database and SQL coding projects within the organization. Assist other staff developing SQL scripts, stored procedures, and other database objects where required.
  • Industry: Act as the point of contact with MLB in understanding and planning for future infrastructure needs and changes as the structure and breadth of information changes over time.

Position Requirements

Required:

  • Bachelor’s Degree or higher in Computer Science, Information Systems, or equivalent.
  • Two years’ experience administering enterprise level data structures using SQL Server technologies including SQL Reporting Services and SQL Server Integration Services.
  • Expert knowledge of SQL and database administration tools. Knowledge of SQL Server replication topologies. Understanding of database documentation and design tools.
  • Experience with cloud based architectures and tools including Amazon S3, EC2, and Databricks required.
  • Experience participating in multiple aspects of the software development life cycle including requirements definition, design, development, testing, and implementation.
  • Demonstrated ability to work with users to understand business processes, document system requirements, and develop data structures that meet business objectives.

Desired:

  • Experience with Python, .NET.
  • Experience with statistical analysis software such as R, SAS, SPSS.
  • An understanding of sabermetric techniques for player evaluation strongly preferred.

The Pirates are an equal employment opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, national origin, disability status, protected veteran status or any other characteristic protected by law.

To Apply:
Please apply here.


Daily Prospect Notes: 7/11

Notes on prospects from lead prospect analyst Eric Longenhagen. Read previous installments here.

Andres Gimenez, SS, New York Mets (Profile)
Level: Hi-A   Age: 19   Org Rank: 3   FV: 50
Line: 3-for-5, 2B, 3B

Notes
Gimenez is a 19-year-old shortstop slashing .280/.350/.430 in the Florida State League. That’s good for a 107 wRC+ in the FSL. Big-league shortstops with similar wRC+ marks are Trea Turner (a more explosive player and rangier defender than Gimenez) and Jurickson Profar, who have both been two-win players or better this year ahead of the break. Also of note in the Mets system last night was Ronny Mauricio, who extended his career-opening hitting streak to 19 games.



Futures Game Scouting Reports on THE BOARD

Major League Baseball announced the rosters for its Futures Game last week, the annual “prospect All-Star Game” which takes place the Sunday before the All-Star Game itself. We’ve made a page on THE BOARD specifically for the Futures Game representatives. It has tool grades, scouting reports, video, and other stuff with which you can play around.

Click here to see Futures Game participants on THE BOARD.

There may be some discrepancies between the player evaluations on this page and those that appear elsewhere on THE BOARD. The grades on the Futures Game section of THE BOARD represent our most current evaluations of these players. The evaluations on the team lists, meanwhile, will all be updated together in the near future. Kiley and I will be in Washington D.C. a week from now to see these guys in person. The Futures Game is Sunday, July 15 in Washington D.C.



In Which Mike Leake Uses a Paper Cup

I’ve touched on it before, but I don’t think we talk nearly enough about how weird it must be to be a major-league baseball player. Sure, they’re wealthy and get annoyed with their boss like anyone else, but they’re also frequently naked around their work friends. Their workplace is so strange. People spit in the office, and those spitters don’t always aim well, which means flecks of spit on their shoes. We watch them be bad at their jobs, but they never get to see us botch a sale or split an infinitive.

Sometimes, their pants rip just a little bit, near the butt.

This happens to the literal best players. They continue to wear those pants for several hours. They then get new pants. They don’t have to pay for them.

And they litter. Gum wrappers, seeds, the sticky leftovers of athletic tape. They use a bunch of paper cups — seemingly a different one every inning! — and then just throw those cups on the floor of the dugout. It’s a move that must strike your average desk-job worker as odd; insurance adjusters are generally expected to recycle.

Perhaps Mike Leake secretly aspires to more traditional office work. In the top of the fifth inning of his July 4th tilt against the Angels, Leake was pulled in favor of Nick Rumbelow. He looked hot and frustrated. The camera finds him as he takes a long, grumpy drink of water. Here we are, watching him after he just failed. He gives the cup an angry little flick. Runners on the corners. Might be time to litter. He stares into the middle distance. Grumble, grumble, bad day. But then…

https://gfycat.com/AfraidNextDikkops

He puts the cup back! He drinks from the cup, puts his lips full on it, and then puts it back on top of the stack. Presented with this less typical baseball workplace weirdness, we are left to conclude that Leake either has a stack of cups for his own personal use throughout the game — the result of considered cup tastes — or cares deeply about the environment, or else is thoroughly untroubled by the idea of a teammate grabbing his previously full-lipped cup off the top of the stack. It might be an act of care, but it could also be sort of awful.

I suppose, in that respect, Mike Leake has shown that the dugout is actually just like every other workplace, defined as they often are by drudgery, and grumpy types, and small bits of heedlessness born of opaque personal agenda or indifference to washing the mug you just used before putting it back in the break room dish drainer.

Maybe being a baseball player isn’t so weird after all. Except for when you’re naked with your fellows. That remains strange.


Scouting the Reds’ Return for Dylan Floro

On Wednesday, the Reds sent righties Dylan Floro and Zach Neal, as well as international pool space, to the Dodgers for RHPs James Marinan and Aneurys Zabala. Marinan, Los Angeles’s fourth-round pick in 2017 out of Park Vista High School in Florida, made three starts in the AZL before the trade. Zabala, whom the Dodgers originally acquired from Seattle for Chase De Jong, was pitching in the Low-A Great Lakes bullpen.

Both pitchers have size and big-league arm strength. Marinan is 6-foot-5, 220, while Zabala (though listed at 175) is closer to 250. Zabala was throwing 96-100 while he was with Seattle, but his conditioning wavered after the trade and the fastball was in the low 90s when I saw him last year. He also had, and still has, issues repeating his delivery, which leads to scattershot fastball command. His velocity is back up into the upper 90s this year, and he can spin a breaking ball. He has above-average relief stuff, but is a high-risk prospect because of how far the command needs to come — and because the stuff has roller coastered over the last two years.

Marinan is a bit more stable. He has a four-pitch mix glued together by a low-90s sinker and average change that flashes above. He can throw an average curveball for strikes, and the slider can miss bats away from righties when located. He could end up with a bunch of 50s, maybe a 55 changeup and command, and become a solid No. 4/5 starter. Both players are likely three years away from the majors, at least, though Zabala will essentially be ready as soon as his fastball command improves, if it does.


NERD Scores Now Available from Internet Robot

For five or seven years or whatever, the author of this post published a daily collection of so-called NERD Scores — ratings, that is, intended to summarize, in one number, the appeal of a particular game to the sort of people who visit FanGraphs. Most readers ignored this daily service. Others were compelled to note its flaws with some regularity. A small minority suggested that it was of some benefit.

Due to a combination of influences — the birth of a child, a slight change in roles here at the site, ungovernable sloth — the author decided this spring not to publish the daily NERD posts this year. This has allowed me to dedicate extra time to my favorite pastime — namely, adding hyphens to the compound adjectives utilized by other authors in their work at this site. Unsurprisingly, FanGraphs has survived the absence of this content.

Recently, however, one interested party has proven what I suspected all along — that the NERD scores are the province of unthinking and -feeling robots. Enterprising millennial John Edwards has identified the least profitable use for his programming and coding skills — namely by crafting a Twitter bot that publishes the NERD scores automatically.

Here, for example, is today’s (somewhat uninspiring) slate of games:

And an apology that suggests that alerts will mark the start of each game:

Edwards himself has suggested that the feed is a work in progress. For those who prefer some guidance for their daily viewing habits, this seems like a promising resource.


Jay Jaffe and Keith Law in Washington, DC, July 14

Whether you’re a Washington, DC-area resident or a visitor who will be in town for the All-Star Futures Game, you’re invited to the great Politics and Prose Bookstore on Saturday, July 14 at 6 pm, where ESPN’s Keith Law and I will discuss and sign our respective books, Smart Baseball (William Morrow, 2017, now out in paperback) and The Cooperstown Casebook (Thomas Dunne Books, 2017). During my years at Baseball Prospectus, I was part of half-a-dozen bookstore events at P&P, and they always drew great crowds for engaging discussions and enjoyable interactions with readers, and I have no doubt that this event will deliver the same.

As a prospect-focused senior baseball writer for ESPN Insider, a former member of the Toronto Blue Jays’ front office, and an alumnus of Baseball Prospectus, Keith is uniquely positioned to explain the evolution of talent evaluation within baseball and to examine the trends that are shaping the field in the 21st century. Whether you’re a long-time stathead or a newcomer to this little corner of the game, you’ll enjoy his book (full title Smart Baseball: The Story Behind the Old Stats That Are Ruining the Game, the New Ones That Are Running It, and the Right Way to Think About Baseball) and you can order a signed, personalized copy ahead of time here.

The Cooperstown Casebook is the culmination of a decade and a half of my research and writing about the National Baseball Hall of Fame and Museum — the forces that shaped the Hall and its honorees, and the battles to recognize and honor the game’s greats and its history, with an emphasis on how advanced statistics have changed our understanding of what makes a Hall of Famer. You can read plenty more about the book here and at my publisher’s page as well as Paul Swydan’s review, and there’s an excerpt from my chapter about Bert Blyleven and Jack Morris here. To order a signed, personalized copy to pick up at the event, click here.

This event is free to attend with no reservation required. Seating is available on a first come, first served basis.

Politics & Prose Bookstore
Saturday, July 14, 6 pm
5015 Connecticut Ave NW
Washington, DC 20008


Max Muncy’s Home Run Hit Albert Almora on the Head

https://gfycat.com/BriskEnchantedHalicore

Dodgers infielder Max Muncy is an instrument of the Absurd, nor is there much evidence to the contrary. He owns, for example, a name that has traditionally been the province exclusively of mid-century private detectives. He’s also a former fifth-round pick who entered the season with roughly -1 WAR and yet who, somehow, is currently leading his club by that same measure. Muncy’s bona fides wherein the ridiculous is concerned are beyond reproach.

It should come as no surprise, then, that Max Muncy has once again had a rendezvous with the improbable. Batting in the first inning tonight against the Cubs, Muncy drove a fastball from Kyle Hendricks to center field — over the center-field wall, in fact. Instead of remaining over the center-field wall, however, what Muncy’s home run did instead was to re-enter the field of play and strike innocent bystander Albert Almora on the head. Did it kill or even just injure Almora? Signs point to “No.” But did it cause him a moment’s indignity? Yes, not unlike the sort one experiences just by living.

https://gfycat.com/MeekComplexCrab


Scouting the Royals’ Return for Kelvin Herrera

On Monday, Washington sent a three-player package of middling talents back to Kansas City in exchange for reliever Kelvin Herrera. Those prospects are 3B Kelvin Gutierrez, CF Blake Perkins and RHP Yohanse Morel.

Perkins and Gutierrez were each on our Nationals team write-up as 40 FVs. Gutierrez has a strong contact/defense profile. (He was bad at third base in my extended look at him last fall and received some playing time at first in anticipation of Ryan Zimmerman’s continued health problems.) He lacks corner-worthy power, however. Perkins is a glove-first center-field prospect with premium strike-zone awareness (he has a 12% career walk rate) and very little power.

We have each of them evaluated as big-league role players. Gutierrez is probably a low-end regular or bench/platoon option at third base and, down the line, a couple other positions. If he alters his approach in a way that coaxes out more of his average raw power in games, he could be more than that. Perkins has a bit more variability because he hasn’t been switch-hitting for very long (he only started in 2016) and might yet grow into some competency as a left-handed hitter, but his lack of in-game power might also undercut his walk rate at upper levels of the minors — and in the big leagues, too — because pitchers are going to attack him without fear that he’ll do any real damage on his own. He also might become such a great defensive center fielder that he plays every day despite providing little offensive value.



Trout, Davis, and the Largest Seasonal WAR Differentials

As I noted earlier today, Orioles slugger Chris Davis, who through the Orioles’ first 67 games has already dug himself a -1.9 WAR hole, is on pace for the worst season ever by that measure, -4.6 WAR. At the other end of the spectrum, Mike Trout is having not just the best season of his already amazing career, but one for the pantheon. His 5.7 WAR through the Angels’ first 69 games prorates to 13.4 over a full season, which would rank third all-time, behind the 1923 and 1921 seasons of Babe Ruth (15.0 and 13.9 WAR, respectively), making Trout’s season “only” the best in the past 95 years. What a slacker.
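The pace figures above are plain proration: WAR to date, divided by team games played, multiplied by a full schedule. Here is a minimal sketch of that arithmetic, assuming a 162-game season and a constant WAR-per-game rate. The function name `prorate_war` is hypothetical and not anything FanGraphs publishes.

```python
# Sketch: prorate a partial-season WAR total to a full schedule.
# Assumes a 162-game season and a constant WAR-per-game rate.

FULL_SEASON_GAMES = 162


def prorate_war(war_to_date: float, team_games_played: int,
                season_games: int = FULL_SEASON_GAMES) -> float:
    """Scale WAR accumulated so far to a full-season pace."""
    if team_games_played <= 0:
        raise ValueError("team_games_played must be positive")
    return war_to_date / team_games_played * season_games


trout_pace = prorate_war(5.7, 69)    # Trout through the Angels' first 69 games
davis_pace = prorate_war(-1.9, 67)   # Davis through the Orioles' first 67 games

print(round(trout_pace, 1))               # 13.4
print(round(davis_pace, 1))               # -4.6
print(round(trout_pace - davis_pace, 1))  # 18.0
```

A straight-line pace like this ignores regression, which is why a projection-based estimate can land well short of it.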

Even if that’s the case, Trout and Davis could combine for the largest WAR differential between two position players in one season, a chasm wider than the Grand Canyon. Below are the 20 largest single-season gaps, with some player-seasons, such as Ruth’s 1920, included more than once. I’ve also included two hypothetical end-of-season figures for Trout and Davis: their WAR differential based both on current pace and also our Depth Chart projections.

Largest Single-Season WAR Differentials Since 1901
Season  Player 1         Team         WAR  Player 2          Team          WAR    Dif
2018    Mike Trout PACE  Angels      13.4  Chris Davis PACE  Orioles      -4.6   18.0
1923    Babe Ruth        Yankees     15.0  Shano Collins     Red Sox      -2.5   17.5
1920    Babe Ruth+       Yankees     13.3  Ivy Griffin       Athletics    -2.8   16.1
1920    Babe Ruth+       Yankees     13.3  Chick Galloway    Athletics    -2.4   15.7
2002    Barry Bonds      Giants      12.7  Neifi Perez+      Royals       -2.9   15.6
1927    Babe Ruth        Yankees     13.0  Ski Melillo+      Browns       -2.5   15.5
1924    Babe Ruth        Yankees     12.5  Milt Stock+       Robins       -2.7   15.2
1924    Rogers Hornsby   Cardinals   12.5  Milt Stock+       Robins       -2.7   15.2
1927    Lou Gehrig       Yankees     12.5  Ski Melillo+      Browns       -2.5   15.0
2001    Barry Bonds      Giants      12.5  Peter Bergeron    Expos        -2.4   14.9
1931    Babe Ruth        Yankees     10.7  Jim Levey         Browns       -3.3   14.0
1993    Barry Bonds+     Giants      10.5  David McCarty     Twins        -3.1   13.6
1929    Rogers Hornsby   Cubs        11.1  Tommy Thevenow    Phillies     -2.4   13.5
1905    Honus Wagner     Pirates     10.8  Fred Raymer       Beaneaters   -2.4   13.2
1912    Tris Speaker     Red Sox     10.6  Frank O’Rourke    Braves       -2.6   13.2
1928    Babe Ruth        Yankees     10.6  Doc Farrell       Braves       -2.6   13.2
1930    Babe Ruth        Yankees     10.5  Fresco Thompson   Phillies     -2.7   13.2
1993    Barry Bonds+     Giants      10.5  Ruben Sierra      Athletics    -2.6   13.1
1993    Barry Bonds+     Giants      10.5  Luis Polonia      Angels       -2.6   13.1
1927    Rogers Hornsby   Giants      10.4  Ski Melillo+      Browns       -2.5   12.9
2002    Alex Rodriguez   Rangers     10.0  Neifi Perez+      Royals       -2.9   12.9
2018    Mike Trout PROJ  Angels      10.8  Chris Davis PROJ  Orioles      -1.7   12.5
+ = Player-season appears more than once.
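Each differential is just the better player's WAR minus the worse player's, so a negative WAR on the low end widens the gap. A small sketch of that calculation follows, using three rows copied from the table. The tuple layout and the helper name `war_gap` are assumptions made for illustration.

```python
# Sketch: compute and rank WAR gaps for a few player-season pairs.
# Rows are copied from the table above; the tuple layout is an assumption.

pairs = [
    # (season, best player, best WAR, worst player, worst WAR)
    (2002, "Barry Bonds", 12.7, "Neifi Perez", -2.9),
    (1923, "Babe Ruth", 15.0, "Shano Collins", -2.5),
    (1931, "Babe Ruth", 10.7, "Jim Levey", -3.3),
]


def war_gap(best_war: float, worst_war: float) -> float:
    """Differential between two WAR totals, rounded to one decimal."""
    return round(best_war - worst_war, 1)


# Sort from the widest gap to the narrowest.
ranked = sorted(pairs, key=lambda row: war_gap(row[2], row[4]), reverse=True)
for season, best, best_war, worst, worst_war in ranked:
    print(season, best, worst, war_gap(best_war, worst_war))
```

Run as written, the rows come out in the same order they hold in the table: 1923, then 2002, then 1931.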

Seven separate Ruth seasons are represented here, along with three apiece from Hornsby and Bonds. Aside from the projection of Trout, only one other post-World War II player besides Bonds is represented above on the good side of things, namely A-Rod in 2002. That’s just one small set of data points related to the fact that the spread of talent between the best and worst players is much less now than it was 75 or 100 years ago and that leagues today are stronger than the ones of decades past.

Ruth hit “only” 41 homers during his 15.0-WAR 1923 season, but via his .393/.545/.764 (231 wRC+) line, he set career highs in the first two categories even while somehow failing to win a batting title. (The Tigers’ Harry Heilmann hit .403.) His dance partner from the 1923 season was Collins, a light-hitting outfielder who batted .231/.265/.289 for a lousy 43 wRC+ that year and was six runs below average on defense. To the extent that Collins has any other claim to fame, it’s apparently that he was the only player in the White Sox’ starting lineup for Game One of the 1919 World Series who didn’t wind up either banned for life as part of the Black Sox scandal or elected to the Hall of Fame (as Eddie Collins and Ray Schalk were). Ruth’s 1931 season (.373/.495/.700, 46 HR) is paired with Levey, the Browns shortstop who actually had an even worse season (1933, -4.0 WAR) that represents the record Davis is trying to avoid.

As for Trout, to date, the largest WAR gap of his career is 11.9 WAR, from 2013, when he set a career best with 10.1 WAR and Yuniesky Betancourt turned in a -1.8 WAR clunker. Even if Davis didn’t play another game this year, Trout would only need to add another 4.5 WAR over the Angels’ 93 remaining games to surpass that previous high. While he and Davis don’t have much margin for error in surpassing Ruth and Collins, it still boggles the mind that we could be seeing such extremes in the same season.