Job Posting: Rockies Research and Development Data Engineer
Position: Data Engineer
Description:
The Rockies are looking for a Data Engineer to join their Research and Development team. The successful candidate will be responsible for expanding and optimizing their data warehouse and data pipeline architecture, with a focus on collecting, cleaning, transforming, managing and validating data using distributed computing and storage systems. The goal of the Data Engineer is to democratize data, support data initiatives, ensure consistent data delivery and empower Rockies personnel to derive powerful and actionable insights.
Responsibilities and Duties:
- Create, maintain and optimize data ETL/ELT pipelines
- Document data and pipelines
- Ensure data ingestion and error handling proceed without interruption
- Process and securely store extremely sensitive data for callback and future use
- Prepare distributed, disjoint, multi-formatted data sets for data scientists
- Research and investigate new and interesting datasets to include in our data warehouse
- Perform quantitative research related to baseball strategy and player evaluation
- Collaborate with coaches, scouts and baseball operations to suggest process improvements
Requirements:
Education and Work Experience
- Bachelor’s degree in Computer Science/Engineering
- Candidates still in school (junior or senior level) with extensive work towards such degree will be considered
 
- SQL knowledge and experience working with a variety of relational databases such as MySQL, PostgreSQL, or SQL Server
- Experience with a variety of structured, semi-structured and unstructured data formats including delimited files, XML, JSON and natural language text
- Ability to effectively use multiple programming languages, including at least one of the major data science languages (Python, R or Scala)
- Experience with or working knowledge of “Big Data” tools such as Hadoop, Hive, Spark or Presto is a plus
- Experience with AWS Cloud services such as EC2, RDS, and S3 is a plus
- Experience with data workflow tools such as Luigi or Airflow is a plus
- Knowledge and understanding of baseball and baseball statistics
Functional Skills
- Ability to work evenings and weekends required
- Passion for the intersection of baseball and data
- Passion for quality data
- Strong organizational skills and ability to self-start
- Strong intellectual curiosity
- Desire to learn and contribute
- Ability to work in a collaborative and open team environment
- Ability to develop and maintain successful working relationships with members of the Front Office
To Apply:
Qualified candidates should send their letter of interest and resume to BaseballJobs@rockies.com no later than June 3, 2018.
Meg is the editor-in-chief of FanGraphs and the co-host of Effectively Wild. Prior to joining FanGraphs, her work appeared at Baseball Prospectus, Lookout Landing, and Just A Bit Outside. You can follow her on Bluesky @megrowler.fangraphs.com.
I’m just curious. Do you ever get feedback from these listing agencies indicating how much value presenting this information to the FanGraphs community adds to the eventual candidate class? I’m assuming it adds something; otherwise I imagine that they wouldn’t work with you in providing the job post information. I’m just kind of curious regarding the nature of the relationship.
I’m not certain, but I think most of these were posted at MLB.com and maybe that’s where the poster got the information.
Teams actually reach out to us with postings. Most are also publicly posted on team sites and job boards, but FanGraphs is a place where the postings can be centralized and easily accessed for the job seekers among our readers.
I stand corrected!
Meg, occasionally a job is posted here but not elsewhere (MLB.com, BP’s website, Teamwork, etc.). Do you/FanGraphs have good relationships with some teams and that’s how this could happen, or are the teams trying to hire quickly and reduce the number of applicants to sift through?