As a Principal Machine Learning Engineer, you will lead the architecture and development of a core AI platform capability that enables researchers and engineers across Amgen to build, deploy, and operate advanced ML and Generative AI systems at scale. You will operate as the technical lead for a small engineering team and own the design and evolution of a platform that simplifies the lifecycle management of complex ML workloads including LLMs, fine-tuned SLMs, and next-generation AI systems. This platform powers a self-service ML ecosystem that enables researchers to move from experimentation to production quickly, with built-in MLOps, observability, and governance capabilities.
Job Responsibilities:
Architect and build a scalable ML platform for training, deployment, and lifecycle management of ML, LLM, and Generative AI models
Lead development of infrastructure that supports production hosting of complex AI systems, including large-scale inference workloads
Design developer-friendly abstractions and automation that make it easy for researchers to build and deploy models within the Amgen ecosystem
Implement and evolve MLOps capabilities including experiment tracking, model versioning, CI/CD for ML, monitoring, and reproducibility using tools such as Databricks and MLflow
Build platform capabilities supporting Generative AI and emerging Agentic AI systems
Serve as the technical leader for a team of engineers, guiding architecture, design reviews, and engineering best practices
Partner with AI researchers, data scientists, and platform teams to translate cutting-edge AI research into reliable production systems
Evaluate and adopt emerging technologies across the modern AI stack including foundation models, vector databases, agent frameworks, and model serving systems
Champion AI-native engineering practices, leveraging tools like GitHub Copilot, Codex, and AI-assisted development workflows
Contribute to the broader strategy and evolution of the Enterprise AI Platforms ecosystem
Requirements:
Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field with 12 to 17 years of total experience
8+ years of experience in software engineering, machine learning engineering, or ML infrastructure
Strong experience building production ML systems or ML platforms
Hands-on experience with MLOps and model lifecycle management frameworks and tools such as MLflow or an equivalent
Strong programming experience in Python and modern software engineering practices such as API-driven architecture and event-based systems
Experience designing scalable distributed systems or cloud-native architectures
Experience deploying and operating machine learning models in production environments
Solid understanding of modern ML workflows including training, evaluation, deployment, monitoring, and retraining
Nice to have:
Advanced degree (Master’s) in Computer Science, AI/ML, Data Science, or a related discipline
Experience building infrastructure for LLMs, Generative AI, or foundation models
Understanding of Agentic AI systems and orchestration frameworks
Experience with LLM/SLM fine-tuning and production deployment
Familiarity with modern AI ecosystem technologies such as Retrieval-Augmented Generation (RAG), vector databases, model serving frameworks, and agent frameworks
Experience building internal ML platforms used by researchers or data scientists
Experience operating large-scale inference or GPU-based workloads
Strong technical leadership and mentoring ability
Ability to drive architecture and technical direction
Excellent cross-team collaboration and communication
Strong ownership mindset and bias toward execution
Passion for staying current with emerging AI technologies