The Associate Machine Learning Engineer position offers a unique opportunity to join a fun, innovative engineering team within the AI & Data Science (AI&D) organization. You’ll work on next-generation capabilities and services in Applied AI & Automation using innovative COTS products, open-source software, frameworks, tools, and cloud computing services. The role also emphasizes demonstrating these capabilities to support critical business operations and initiatives, ensuring quality, compliance, and performance across Amgen’s Applied AI & Automation footprint.
The role is responsible for building and scaling our AI and machine learning solutions from development to production. Your expertise in MLOps will be essential in creating efficient and reliable ML pipelines. You will work within the solution engineering and DevOps teams to lead the transition from Intelligent Automation to Agentic Process Automation, ensuring that strategy and implementation remain connected throughout the value stream. The ideal candidate has experience in building AI & ML solutions, excellent communication skills, and an understanding of Agile methodologies.
This role focuses on supporting the development, deployment, operation, and reliability of AI and LLM-based solutions in production environments, working under the guidance of senior engineers and platform teams. The position emphasizes operational excellence, observability, and compliance for applied AI and GenAI systems.
Job Responsibilities
Collaborate with data scientists to develop, train, and evaluate machine learning models
Build and maintain MLOps pipelines, including data ingestion, feature engineering, model training, deployment, and monitoring
Leverage cloud platforms (AWS, Databricks) for ML model development, training, and deployment
Develop solutions using a DevSecOps framework that are secure, scalable, reliable, and aligned with enterprise architecture standards
Evaluate model performance using appropriate metrics and optimize models for accuracy and efficiency
Develop and execute unit tests, integration tests, and other testing strategies to ensure the quality of the software
Create and maintain documentation on software architecture, design, deployment, disaster recovery, and operations
Identify and resolve technical challenges effectively
Provide ongoing support and maintenance for applications, ensuring that they operate smoothly and efficiently
Analyze customer feedback and support data to identify pain points and opportunities for improvement
Evaluate and recommend technologies and tools that best fit the solution requirements
Support operationalization of machine learning and GenAI models developed by data scientists and solution teams
Assist in evaluating model and LLM performance using metrics related to reliability, efficiency, and response quality
Support deployment and operation of LLM-based workflows, including prompt configurations, retrieval-augmented generation (RAG) pipelines, and agent-based automations
Assist with monitoring AI and LLM systems for availability, latency, error rates, and quality degradation
Support model, prompt, and pipeline versioning across development, test, and production environments
Participate in incident triaging, root cause analysis, and rollback or mitigation activities for AI services
Assist with evaluation runs for LLM outputs, including grounding, reliability, and safety checks
Follow established AI governance, security, and compliance standards when operating AI and GenAI solutions
Monitor AI and LLM endpoints for availability, latency, throughput, and error rates using enterprise monitoring tools
Assist with dashboards, alerts, runbooks, and operational documentation to support reliable AI system operations
Requirements
Any degree and 3 to 5 years of experience in Computer Science, IT, or a related field
Strong foundations in machine learning algorithms and techniques
Experience in MLOps practices and tools (e.g., MLflow, Kubeflow, Airflow)
Experience in model monitoring, including model observability and explainability
Proficiency in Python (or R) and relevant ML libraries (e.g., TensorFlow, PyTorch, Scikit-learn)
Experience with big data technologies (e.g., Spark, Hadoop), and performance tuning in query and data processing
Understanding of the LLM inference lifecycle, including prompt execution, retrieval, and response generation
Awareness of common LLM failure modes such as hallucinations, prompt injection, and data leakage
Nice to have
Good understanding of cloud platforms (e.g., AWS, Databricks) and containerization technologies (e.g., Docker, Kubernetes)
Experience with monitoring and logging tools (e.g., Prometheus, Grafana, Splunk)
Experience with data processing tools like Hadoop, Spark, or similar
Knowledge of GenAI tooling: vector databases, RAG pipelines, prompt-engineering DSLs and agent frameworks (e.g., LangChain, Semantic Kernel)
Ability to analyze client requirements and translate them into solutions
Exposure to LLM evaluation techniques beyond accuracy, including grounding, faithfulness, latency, and reliability metrics