The Lead Data Engineer will be part of a team building Stanford Health Care's (SHC) solutions that incorporate Artificial Intelligence (AI), including health care solutions in the areas of patient care, medical research, and administrative services. This group is designed to bring AI and other emerging machine learning (ML) based innovations in data science into healthcare, and it will partner closely with individuals across clinical specialties and operational areas to deploy algorithms that can lead to better patient outcomes. Reporting to the Data Science Director and working closely with Stanford Medicine's inaugural Chief Data Scientist, this role will be responsible for building, scaling, and maintaining the compute frameworks, analysis tooling, model implementations, and agentic solutions that form our core AI platform.
Job Responsibilities:
Build end-to-end data pipelines and infrastructure for ML models used by the Data Science team and others at SHC
Understand the requirements of data processing and analysis pipelines and make appropriate technical design and interface decisions
Understand data flows among the SHC applications and use this knowledge to make recommendations and design decisions for languages, tools, and platforms used in software and data projects
Troubleshoot and debug environment and infrastructure problems found in production and non-production environments for Data Science team projects
Work with other groups at SHC and the Technology and Digital Solutions (TDS) group to ensure that servers and systems are maintained in line with updates, system requirements, data usage, and security requirements
Requirements:
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field, or equivalent work experience
5+ years of experience building data infrastructure for analytics teams, including the ability to write code in SQL, R, or Python for processing large datasets in distributed cloud environments
Experience with cloud deployment strategies and CI/CD
Experience building and working with data infrastructure in a SaaS environment
Experience overseeing, developing, or implementing machine learning operations (MLOps) processes
Experience mentoring junior engineers and enforcing best practices around code quality
Knowledge of multiple programming languages, commitment to choosing languages based on project-specific requirements, and willingness to learn new programming languages as necessary
Knowledge of resource management and automation approaches such as workflow runners
Collaborative mindset and enthusiasm for iterative design in close partnership with the Data Science team