Principal Machine Learning Engineer

India, Hyderabad · Job Posted March 19, 2026

Job Description

As a Principal Machine Learning Engineer, you will lead the architecture and development of a core AI platform capability that enables researchers and engineers across Amgen to build, deploy, and operate advanced ML and Generative AI systems at scale. You will operate as the technical lead for a small engineering team and own the design and evolution of a platform that simplifies the lifecycle management of complex ML workloads including LLMs, fine-tuned SLMs, and next-generation AI systems. This platform powers a self-service ML ecosystem that enables researchers to move from experimentation to production quickly, with built-in MLOps, observability, and governance capabilities.

Job Responsibility

  • Architect and build a scalable ML platform for training, deployment, and lifecycle management of ML, LLM, and Generative AI models
  • Lead development of infrastructure that supports production hosting of complex AI systems, including large-scale inference workloads
  • Design developer-friendly abstractions and automation that make it easy for researchers to build and deploy models within the Amgen ecosystem
  • Implement and evolve MLOps capabilities including experiment tracking, model versioning, CI/CD for ML, monitoring, and reproducibility using tools such as Databricks and MLflow
  • Build platform capabilities supporting Generative AI and emerging Agentic AI systems
  • Serve as the technical leader for a team of engineers, guiding architecture, design reviews, and engineering best practices
  • Partner with AI researchers, data scientists, and platform teams to translate cutting-edge AI research into reliable production systems
  • Evaluate and adopt emerging technologies across the modern AI stack including foundation models, vector databases, agent frameworks, and model serving systems
  • Champion AI-native engineering practices, leveraging tools like GitHub Copilot, Codex, and AI-assisted development workflows
  • Contribute to the broader strategy and evolution of the Enterprise AI Platforms ecosystem

Requirements

  • Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field with 12 to 17 years of total experience
  • 8+ years of experience in software engineering, machine learning engineering, or ML infrastructure
  • Strong experience building production ML systems or ML platforms
  • Hands-on experience with MLOps and model lifecycle management frameworks and tools such as MLflow or equivalent
  • Strong programming experience in Python and modern software engineering practices such as API-driven architecture and event-based systems
  • Experience designing scalable distributed systems or cloud-native architectures
  • Experience deploying and operating machine learning models in production environments
  • Solid understanding of modern ML workflows including training, evaluation, deployment, monitoring, and retraining

Nice to have

  • Advanced degree (Master’s) in Computer Science, AI/ML, Data Science, or related discipline
  • Experience building infrastructure for LLMs, Generative AI, or foundation models
  • Understanding of Agentic AI systems and orchestration frameworks
  • Experience with LLM/SLM fine-tuning and production deployment
  • Familiarity with modern AI ecosystem technologies such as Retrieval-Augmented Generation (RAG), vector databases, model serving frameworks, and agent frameworks
  • Experience building internal ML platforms used by researchers or data scientists
  • Experience operating large-scale inference or GPU-based workloads
  • Strong technical leadership and mentoring ability
  • Ability to drive architecture and technical direction
  • Excellent cross-team collaboration and communication
  • Strong ownership mindset and bias toward execution
  • Passion for staying current with emerging AI technologies

Similar Jobs for Principal Machine Learning Engineer

8 matching positions

Principal Machine Learning Engineer

VideoAmp is on a mission to create the best employee and workplace experience wh...
Location
United States, Los Angeles; New York; Boulder; Chicago; Dallas; St. Petersburg
Salary
184000.00 - 200000.00 USD / Year
VideoAmp
Expiration Date
Until further notice
Requirements
  • 7+ years of experience in Machine Learning Engineering, Data Engineering, or a similar technical role
  • Expert-level proficiency in Python, ML frameworks, Temporal, and distributed data processing (Spark, Hive)
  • Experience making models reproducible as needed
  • Deep understanding of data quality methodologies, fault-tolerant data systems, and validation frameworks
  • Strong experience designing and scaling ML infrastructure, automated pipelines, and production-grade deployment workflows
  • Hands-on experience with cloud-native architectures, ideally on AWS
  • Expertise with CI/CD, version control, and modern DevOps practices
  • Strong communicator with the ability to translate complex technical concepts into clear, actionable insights
  • Demonstrated ability to lead cross-functional technical initiatives with minimal guidance
Job Responsibility
  • Architect and lead development of advanced machine learning models, quality frameworks, and large-scale data validation systems that power VideoAmp’s measurement and optimization products
  • Design and optimize ML infrastructure, including scalable distributed pipelines, model lifecycle tooling, and automated validation frameworks
  • Lead experimentation strategies, including model benchmarking, reproducibility, evaluation methodologies, and statistical validation
  • Drive data quality standards across the organization by partnering with Data Engineering, Core Engineering, and Measurement Science teams
  • Own cross-functional initiatives end-to-end, from requirements definition through production deployment, monitoring, and iteration
  • Influence technical direction, contributing to architectural decisions, design reviews, and long-term platform strategy
  • Mentor and uplevel engineers, providing guidance on ML best practices, system design, data quality, and code excellence
  • Communicate complex findings, insights, and recommendations to technical and non-technical stakeholders
What we offer
  • Discretionary and flexible paid time off
  • In addition to standard US holidays off, VideoAmp employees also partake in Spring, Summer and Winter breaks
  • Comprehensive medical, dental, and vision benefits for you and your dependents—including multiple options fully covered by VideoAmp
  • Unlimited financial wellness sessions with Origin financial advisors
  • 401k Plan with matching
  • HSA & FSA
  • Commuter Benefits
  • Cell Phone Reimbursement
  • Paid Maternity and Parental Leave for All Family Additions
  • Equity
Full-time

Principal Machine Learning Engineer

As a Principal Machine Learning Engineer in ZMS, you will be the tech lead worki...
Location
Germany, Berlin
Salary
Not provided
Zalando
Expiration Date
Until further notice
Requirements
  • Excellent software development engineering skills to design computationally effective solutions for machine learning operationalization and maintenance (MLOps/MLaaS) in large-scale production environments (data engineering, data version control, model serving, continuous monitoring & alerting)
  • Strong verbal and written communication and presentation abilities when discussing complex ideas with both technical and non-technical stakeholders
  • Hands-on professional experience in programming, using Python, Java Flink, pySpark, PyTorch, and TensorFlow
  • Strong programming skills in a high-performance language (Java, Scala, Go, etc.) and experience working with Python in production
  • Experience building, deploying and operating data-driven systems in a cloud environment, including experience with feature stores & feature engineering pipelines, data ingestion & transformation, machine learning workflow orchestration
  • Thrive on coaching and mentoring senior engineers, and on working closely with applied scientists, senior machine learning engineers and data scientists
Job Responsibility
  • Drive the operationalization of solutions deployed in production, and help the team grow and cultivate best practices in software development and MLOps
  • Architect and lead the development of machine learning solutions that can handle low latency, high availability and high volume scenarios
  • Mentor engineers and provide technical guidance across multiple projects simultaneously while managing competing priorities effectively within agreed-upon timelines
  • Apply techniques and create processes to optimise deployed models for better performance, latency, and memory usage
  • Work closely with applied science and engineering teams, product managers and other business stakeholders to bring our state-of-the-art solutions to customers and to discover and identify new opportunities
What we offer
  • 27 days of holiday a year to start for full-time employees (+1 day for every calendar year up to 30 days)
  • 2 paid volunteering days a year
  • Hybrid working model with up to 60% remote per week
  • Work from abroad for up to 30 working days a year
  • Employee shares program
  • 40% off fashion and beauty products sold and shipped by Zalando, 30% off Lounge by Zalando, discounts from external partners
  • Relocation assistance available (subject to prior agreement)
  • Family services, including counseling and support
  • Health and wellbeing options (including Wellhub, formerly Gympass)
  • Mental health support and coaching available
Full-time

Principal Machine Learning Engineer

As a Principal Machine Learning Engineer in ZMS, you will be the tech lead worki...
Location
Germany, Berlin
Salary
Not provided
Zalando Sverige
Expiration Date
Until further notice
Requirements
  • Excellent software development engineering skills to design computationally effective solutions for machine learning operationalization and maintenance (MLOps/MLaaS) in large-scale production environments (data engineering, data version control, model serving, continuous monitoring & alerting)
  • Strong verbal and written communication and presentation abilities when discussing complex ideas with both technical and non-technical stakeholders
  • Hands-on professional experience in programming, using Python, Java Flink, pySpark, PyTorch, and TensorFlow
  • Strong programming skills in a high-performance language (Java, Scala, Go, etc.) and experience working with Python in production
  • Experience building, deploying and operating data-driven systems in a cloud environment, including experience with feature stores & feature engineering pipelines, data ingestion & transformation, machine learning workflow orchestration
  • Thrive on coaching and mentoring senior engineers, and on working closely with applied scientists, senior machine learning engineers and data scientists
Job Responsibility
  • Drive the operationalization of solutions deployed in production, and help the team grow and cultivate best practices in software development and MLOps
  • Architect and lead the development of machine learning solutions that can handle low latency, high availability and high volume scenarios
  • Mentor engineers and provide technical guidance across multiple projects simultaneously while managing competing priorities effectively within agreed-upon timelines
  • Apply techniques and create processes to optimise deployed models for better performance, latency, and memory usage
  • Work closely with applied science and engineering teams, product managers and other business stakeholders to bring our state-of-the-art solutions to customers and to discover and identify new opportunities
What we offer
  • 27 days of holiday a year to start for full-time employees (+1 day for every calendar year up to 30 days)
  • 2 paid volunteering days a year
  • Hybrid working model with up to 60% remote per week
  • Work from abroad for up to 30 working days a year
  • Employee shares program
  • 40% off fashion and beauty products sold and shipped by Zalando, 30% off Lounge by Zalando, discounts from external partners
  • Relocation assistance available (subject to prior agreement)
  • Family services, including counseling and support
  • Health and wellbeing options (including Wellhub, formerly Gympass)
  • Mental health support and coaching available
Full-time

Principal Machine Learning Engineer

This is a high-leverage leadership role that spans architecture, execution, and ...
Location
India, Bengaluru
Salary
Not provided
Ema
Expiration Date
Until further notice
Requirements
  • Bachelor’s or Master’s (or PhD) degree in Computer Science, Machine Learning, Statistics, or a related field
  • A strong track record (usually 10-12+ years) of applied experience with ML techniques, especially in large-scale settings
  • Experience building production ML systems that operate at scale (latency / throughput / cost constraints)
  • Experience in the knowledge retrieval and search space
  • Exposure to building agentic systems and frameworks
  • Proficiency in relevant programming languages (e.g. Python, C++, Java) and ML frameworks (TensorFlow, PyTorch, etc.)
  • Strong understanding of the full ML lifecycle: data pipelines, feature engineering, model training, serving, monitoring, maintenance
  • Experience designing systems for monitoring, diagnostics, logging, model versioning, etc.
  • Deep knowledge of computational trade-offs: distributed training, inference, optimizations (e.g. quantization, pruning, batching)
  • Excellent communication skills
Job Responsibility
  • Lead the technical direction of GenAI and agentic ML systems that power enterprise-grade AI agents — spanning reasoning, retrieval, tool use, and integrations across various SaaS products
  • Architect, design, and implement scalable production pipelines for model training, fine-tuning, retrieval (RAG), agent orchestration, and evaluation — ensuring robustness, latency efficiency, and continuous learning
  • Define and own the multi-year ML roadmap for GenAI infrastructure — including agent frameworks, RAG systems, world-class evaluation loops, and integration with MCP, browser, and vision pipelines
  • Identify and integrate cutting-edge ML methods / research (deep learning, large models, recommender systems, LLMs, etc.) into Ema’s products or infrastructure
  • Research, prototype, and integrate cutting-edge ML and LLM advancements (reasoning, memory architectures, multi-modal perception, long-context models, autonomous agents) into the platform
  • Optimize trade-offs between accuracy, latency, cost, interpretability, and real-world reliability across the agent lifecycle — from prompt design to orchestration and execution
  • Champion engineering excellence — drive observability, reproducibility, versioning, testing, and bias-aware development across ML and agentic systems
  • Mentor and elevate senior engineers and researchers, fostering a culture of scientific rigor, experimentation, and system-level thinking
  • Collaborate cross-functionally with product, infra, and research teams to align ML innovation with enterprise needs — enabling secure integrations, privacy-aware deployments, and scalable use cases
  • Influence data strategy — guide how retrieval indices, embeddings, structured/unstructured corpora, and feedback loops evolve to improve grounding, factuality, and reasoning depth
Full-time

Principal Machine Learning Engineer

With Prisma AIRS, Palo Alto Networks is building the world's most comprehensive ...
Location
United States, Santa Clara
Salary
185200.00 - 299475.00 USD / Year
Palo Alto Networks
Expiration Date
Until further notice
Requirements
  • BS/MS or Ph.D. in Computer Science, a related technical field, or equivalent practical experience
  • Extensive professional experience in software engineering with a deep focus on MLOps, ML systems, or productionizing machine learning models at scale
  • Expert-level programming skills in Python are required
  • Deep, hands-on experience designing and building large-scale distributed systems on a major cloud platform (GCP, AWS, Azure, or OCI)
  • Proven track record of leading the architecture of complex ML systems and MLOps pipelines using technologies like Kubernetes and Docker
  • Mastery of ML frameworks (TensorFlow, PyTorch) and extensive experience with advanced inference optimization tools (ONNX, TensorRT)
  • A strong understanding of popular model architectures (e.g., Transformers, CNNs, GNNs) is a must
  • Demonstrated expertise with modern LLM inference engines (e.g., vLLM, SGLang, TensorRT-LLM) is required
Job Responsibility
  • Lead the architectural design of a highly scalable, low-latency, and resilient ML inference platform capable of serving a diverse range of models for real-time security applications
  • Define technical approaches to less-defined product requirements, ensuring the best fit between product features and technical implementation
  • Explore new product opportunities by maintaining a deep understanding of LLM and Generative AI research trends
  • Provide technical leadership and mentorship to the team, driving best practices in MLOps, software engineering, and system design
  • Drive the strategy for model and system performance, guiding research and implementation of advanced optimization techniques like custom kernels, hardware acceleration, and novel serving frameworks
  • Establish and enforce engineering standards for automated model deployment, robust monitoring, and operational excellence for all production ML systems
  • Act as a key technical liaison to other principal engineers, architects, and product leaders to shape the future of the Prisma AIRS platform and ensure end-to-end system cohesion
  • Tackle the most ambiguous and challenging technical problems in large-scale inference, from mitigating novel security threats to achieving unprecedented performance goals
What we offer
  • Restricted stock units
  • Bonus
  • Employee benefits
Full-time

Principal Machine Learning Engineer

Lime is hiring a Principal Machine Learning Engineer to join the Data Science & ...
Location
Canada
Salary
192000.00 - 264000.00 CAD / Year
Lime
Expiration Date
Until further notice
Requirements
  • 8+ years of professional experience in software engineering or applied ML, with a record of delivering production-level systems
  • Fluency in Python
  • Experience with modern ML frameworks (e.g., PyTorch, TensorFlow) and data tools (e.g., SQL, pandas, Spark, Airflow)
  • Strong foundation in ML fundamentals, including model evaluation, experimentation, optimization, production deployment, and operations
  • Strong system design skills and comfort working with distributed systems
  • Track record of influencing ML architecture and practices across multiple teams
Job Responsibility
  • Drive alignment across teams on ML strategy, standards, and long-term technical direction by serving as a technical leader for Lime’s ML Center of Excellence
  • Guide recommendations for ML infrastructure, tooling, and architecture (training, serving, feature stores, experimentation, monitoring)
  • Define and evolve ML development processes, including model review, experimentation rigor, deployment, optimization, and operations
  • Establish best practices for ML monitoring, observability, alerting, and model performance health in production
  • Drive reusable feature development patterns and shared ML capabilities that enable teams to move faster and more safely
  • Partner with platform, data, and product engineering teams to ensure ML systems are reliable, scalable, and cost-effective
  • Identify and prioritize opportunities where ML will improve Lime’s product, operations, or efficiency
  • Act as a force multiplier by mentoring data scientists and machine learning engineers, raising the quality bar for machine learning across Lime
What we offer
  • Offers Equity
  • Offers Bonus
Full-time

Principal Machine Learning Engineer

Lime is hiring a Principal Machine Learning Engineer to join the Data Science & ...
Location
United States
Salary
240000.00 - 330000.00 USD / Year
Lime
Expiration Date
Until further notice
Requirements
  • 8+ years of professional experience in software engineering or applied ML, with a record of delivering production-level systems
  • Fluency in Python and experience with modern ML frameworks (e.g., PyTorch, TensorFlow) and data tools (e.g., SQL, pandas, Spark, Airflow)
  • Strong foundation in ML fundamentals, including model evaluation, experimentation, optimization, production deployment, and operations
  • Strong system design skills and comfort working with distributed systems
  • Track record of influencing ML architecture and practices across multiple teams
Job Responsibility
  • Drive alignment across teams on ML strategy, standards, and long-term technical direction by serving as a technical leader for Lime’s ML Center of Excellence
  • Guide recommendations for ML infrastructure, tooling, and architecture (training, serving, feature stores, experimentation, monitoring)
  • Define and evolve ML development processes, including model review, experimentation rigor, deployment, optimization, and operations
  • Establish best practices for ML monitoring, observability, alerting, and model performance health in production
  • Drive reusable feature development patterns and shared ML capabilities that enable teams to move faster and more safely
  • Partner with platform, data, and product engineering teams to ensure ML systems are reliable, scalable, and cost-effective
  • Identify and prioritize opportunities where ML will improve Lime’s product, operations, or efficiency
  • Act as a force multiplier by mentoring data scientists and machine learning engineers, raising the quality bar for machine learning across Lime
What we offer
  • A choice of medical, dental, and vision plans
  • company-paid life and disability insurance
  • company-funded mental health benefits
  • 401(k) plan with both pre-tax and Roth options
  • access to a Health Savings Account (HSA) with a monthly company contribution
  • Paid parental leave for birthing and non-birthing parents
  • fertility and family-forming benefits
  • Unlimited vacation
  • paid leaves
  • 10 company holidays
Full-time

Principal Machine Learning Engineer

Health Futures is a Research and Incubation team working at the intersection of ...
Location
United States, Redmond
Salary
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C++, C#, Java, JavaScript, or Python, OR equivalent experience
  • Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience including significant work in machine learning or applied AI, OR equivalent experience
  • Proven track record of designing and deploying large-scale ML or MLOps systems in research or product settings
  • Hands-on experience with large-scale distributed training of ML models
  • Deep expertise in ML algorithms, model optimization, and frameworks (e.g., PyTorch, TensorFlow)
  • Experience with one or more of: optimizing data mixes, mid-training, post-training, model merging, or model distillation
  • Familiarity with security and compliance standards for enterprise and health data
  • Demonstrated ability to communicate effectively and solve problems in a collaborative, research-driven environment
Job Responsibility
  • Lead the design and development of machine learning models and systems for health and life sciences applications, ensuring scalability and reliability
  • Define technical strategy and architecture for ML pipelines, including data ingestion, feature engineering, model training, evaluation, and deployment
  • Collaborate with interdisciplinary teams (including scientists, researchers, and software engineers) to envision and develop AI-augmented scientific systems
  • Mentor engineers and researchers, promoting best practices in ML development, experimentation, and responsible AI principles
  • Ensure security, privacy, and regulatory compliance across ML workflows and data handling
Full-time