Principal Engineer Software (Machine Learning) Job at Palo Alto Networks (Santa Clara)

Senior Software Engineer and Principal Software Engineer - PowerPoint AI Team

The PowerPoint team is embarking on an exciting new chapter - evolving a product...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
8+ years of experience in backend service engineering, including work on high-scale infrastructures
Proficiency in one or more systems programming languages such as C#, C++
1+ years of experience in software engineering, designing and developing systems (and APIs) that deploy and integrate with AI models
2+ years of experience working with rich telemetry, making data-driven decisions, and carrying out rapid experimentation
2+ years of experience building software for scale, performance, and reliability
Academic or industry experience building, fine-tuning, or deploying models (any category), or building eval-driven systems that use them

Job Responsibility

Lead design and delivery of complex, scalable AI features ensuring resilience and exceptional user experience
Drive technical strategy and architecture decisions across multiple services, influencing partner teams and aligning with compliance and security requirements
Champion modern engineering practices, including AI-driven approaches, automation, and cloud-native patterns, across the full development lifecycle
Mentor and guide engineers, fostering technical excellence and continuous improvement in security, reliability, and performance
Collaborate cross-org to solve challenging technical problems, streamline processes, and reduce operational costs while improving live-site health
Design and implement scalable backend services optimized for machine learning workflows and large language model integration
Develop and maintain evaluation-driven systems that leverage text and multimodal inputs (e.g., images) to power visual-creation experiences
Build and optimize APIs and infrastructure to support high-performance model inference and experimentation at scale
Collaborate with product, ML, and design teams to integrate models into user-facing features, ensuring seamless functionality and performance
Conduct model evaluations and experiments, analyze results, and iterate on improvements to enhance accuracy and user experience

Fulltime

Principal Machine Learning Engineer - Forecasting

We are seeking a Principal Machine Learning Engineer to join the Forecasting tea...

Location

India, Hyderabad

Salary:

Not provided

Amgen

Expiration Date

Until further notice

Requirements

Degree and 12+ years of experience in machine learning engineering, software engineering, data science engineering, or a related quantitative discipline.
10+ years of professional experience building, deploying, and operating production ML, AI, data, or software systems, including significant experience as a technical lead on complex, cross-functional initiatives.
Demonstrated track record of designing or architecting new and existing systems with emphasis on reliability, scale, security, maintainability, and operational excellence.
Deep hands-on experience with the full ML engineering lifecycle, including data pipelines, feature engineering, experimentation, model training, model integration, testing, deployment, monitoring, evaluation, observability, and continuous improvement.
Strong experience deploying forecasting, probabilistic, Bayesian, predictive, NLP, deep learning, or LLM-based systems in production environments.
Experience building or integrating AI systems, including LLM-powered applications, agentic workflows, retrieval or information-retrieval systems, evaluation frameworks, and human-in-the-loop review patterns.
Strong object-oriented programming skills in Python and SQL, with experience using modern ML and software development frameworks such as scikit-learn, PyTorch, TensorFlow/JAX, Spark, Ray, MLflow, Airflow/Prefect/Dagster, FastAPI, or equivalent technologies.
Experience with cloud platforms and distributed systems, including containerization, CI/CD, infrastructure-as-code, model serving, workflow orchestration, batch and streaming data processing, and production support.
Strong software engineering fundamentals, including system design, architecture trade-off analysis, testing strategies, code reviews, source control, build and release processes, performance optimization, and maintainability.
Demonstrated ability to communicate technical strategy, system tradeoffs, and delivery risks to technical and non-technical stakeholders, including senior leaders, product/program owners, scientists, and business partners.

Job Responsibility

Define and drive the technical strategy for enterprise forecasting and AI decision systems, aligning architecture, reusable platforms, and delivery roadmaps to Amgen's planning, supply, commercial, manufacturing, operations, and patient-focused priorities.
Partner with data scientists, product and program leaders, operations, commercial, manufacturing, supply chain, finance, and other business stakeholders to translate ambiguous requirements into shipped software and measurable business outcomes.
Architect, build, and scale production ML, LLM, and agentic AI systems that combine forecasting, predictive analytics, simulation, optimization, and autonomous or semi-autonomous workflow automation.
Productionize advanced statistical, Bayesian, deep learning, and machine learning models, including training, validation, inference, serving, evaluation, lifecycle management, and governed deployment.
Lead development of AI agent components that automate complex forecasting and operational workflows across multiple systems, decision points, datasets, and user groups while preserving appropriate human-in-the-loop review and escalation patterns.
Design secure integrations across enterprise APIs, databases, analytics platforms, workflow systems, cloud services, and AI orchestration patterns to enable multi-system decision support and scalable automation.
Establish robust MLOps and AI engineering capabilities, including model versioning, CI/CD, automated retraining, performance monitoring, observability, drift detection, service-level reliability, rollback strategies, and operational runbooks.
Implement guardrails, model and agent evaluation frameworks, auditability, explainability, responsible AI controls, and human-in-the-loop operating models for production AI systems in high-impact and regulated business contexts.
Research and evaluate state-of-the-art open-source, vendor, and internal tools related to forecasting, LLMs, AI agents, MLOps, model optimization, model serving, and scalable AI infrastructure for potential application to Amgen business problems.
Provide principal-level technical mentorship, design review leadership, and engineering standard-setting across teams, promoting code quality, documentation, reproducibility, testing, security, privacy, maintainability, and operational excellence.

Fulltime

Principal Engineer (Machine Learning)

We are seeking a highly experienced Sr Principal ML Engineer with a good underst...

Location

United States, Santa Clara

Salary:

185200.00 - 299475.00 USD / Year

Palo Alto Networks

Expiration Date

Until further notice

Requirements

4+ years of experience using Python to build complex backend systems; ML experience is preferred
10+ years of experience in software development
Strong background in machine learning and ML frameworks (e.g., TensorFlow, PyTorch)
Experience with cloud-native service development stack on GCP is a plus
Solid grasp of RESTful API design and microservices architecture
Skilled in diagnosing and solving complex problems while providing detailed technical analysis
Attention to detail and high behavioral standards
Team player with a can-do attitude toward difficult problems who inspires the team to do the same
High energy and the ability to work in a fast-paced environment
Excellent collaboration and communication with multiple teams
Fast learner, eager to absorb emerging technologies

Job Responsibility

Architect and implement new ML models and pipelines to support efficient model training, validation, and real-time inference
Optimize the existing ML models and pipelines
Ensure smooth integration of ML solutions into production systems, focusing on performance, reliability, and scalability
Build automation tools for continuous integration, delivery, and deployment of backend and ML components
Work closely with cross-functional teams (product, QA, DevOps, and customer support) to align development efforts with business needs
Troubleshoot and resolve complex issues that arise within both the backend infrastructure and ML models
Ensure code quality, security, and data privacy by following industry best practices
Maintain clear and concise documentation for system architecture, API endpoints, and ML model integration processes

What we offer

restricted stock units
bonus

Fulltime

Principal Machine Learning Engineer - Evisort AI

Principal Machine Learning Engineer - Evisort AI at Workday. As a Principal Mach...

Location

USA, Seattle; Atlanta

Salary:

210600.00 - 316000.00 USD / Year

Workday

Expiration Date

Until further notice

Requirements

10+ years of experience as a member of a data science, machine learning engineering, or other relevant software development team building applied machine learning products at scale, including taking products through applied research, design, implementation, production, and production-based evaluation
4+ years of professional experience in machine learning and deep learning frameworks and toolkits such as PyTorch and TensorFlow
6+ years of professional experience in building services to host machine learning models in production at scale
3+ years of demonstrated experience working with large language models (LLMs), text generation models, and/or graph neural network models for real-world use cases
6+ years of proven experience with cloud computing platforms (e.g., AWS, GCP)
Proven track record of successfully leading, mentoring, and/or managing ML Engineering teams, taking ownership of the development lifecycle and sprint planning, and fostering a culture of collaboration, transparency, innovation, and continuous improvement
Bachelor’s (Master’s or PhD preferred) degree in engineering, computer science, physics, math or equivalent

Job Responsibility

Develop tailored user experiences using advanced LLMs, Knowledge Graphs, personalization, and predictive analysis
Collaborate with other engineers to deliver ML solutions across Workday’s product ecosystem and utilize software and data engineering stacks to enable training, deployment, and lifecycle management of various ML models
Develop and deploy new products at scale and leverage Workday’s vast computing resources on rich datasets to deliver transformative value to our customers

Fulltime

Principal Machine Learning Engineer

As a Principal Machine Learning Engineer in ZMS, you will be the tech lead worki...

Location

Germany, Berlin

Salary:

Not provided

Zalando

Expiration Date

Until further notice

Requirements

Excellent software development engineering skills to design computationally effective solutions for machine learning operationalization and maintenance (MLOps/MLaaS) in large-scale production environments (data engineering, data version control, model serving, continuous monitoring & alerting)
Strong verbal and written communication and presentation abilities when discussing complex ideas with both technical and non-technical stakeholders
Hands-on professional experience in programming, using Python, Java Flink, PySpark, PyTorch, and TensorFlow
Strong programming skills with a high-performance language (Java, Scala, Go, etc.) and experience working with Python in production
Experience building, deploying and operating data-driven systems in a cloud environment, including experience with feature stores & feature engineering pipelines, data ingestion & transformation, machine learning workflow orchestration
Thrive on coaching and mentoring senior engineers, and on working closely with applied scientists, senior machine learning engineers, and data scientists

Job Responsibility

Drive the operationalization of solutions deployed in production, and help the team grow and cultivate best practices in software development and MLOps
Architect and lead the development of machine learning solutions that can handle low-latency, high-availability, and high-volume scenarios
Mentor engineers and provide technical guidance across multiple projects simultaneously while managing competing priorities effectively within agreed-upon timelines
Apply techniques and create processes to optimise deployed models for better performance, latency, and memory usage
Work closely with applied science and engineering teams, product managers and other business stakeholders to bring our state-of-the-art solutions to customers and to discover and identify new opportunities

What we offer

27 days of holiday a year to start for full-time employees (+1 day for every calendar year up to 30 days)
2 paid volunteering days a year
Hybrid working model with up to 60% remote per week
Work from abroad for up to 30 working days a year
Employee shares program
40% off fashion and beauty products sold and shipped by Zalando, 30% off Lounge by Zalando, discounts from external partners
Relocation assistance available (subject to prior agreement)
Family services, including counseling and support
Health and wellbeing options (including Wellhub, formerly Gympass)
Mental health support and coaching available

Fulltime

Principal Machine Learning Engineer

As a Principal Machine Learning Engineer in ZMS, you will be the tech lead worki...

Location

Germany, Berlin

Salary:

Not provided

Zalando Sverige

Expiration Date

Until further notice

Requirements

Excellent software development engineering skills to design computationally effective solutions for machine learning operationalization and maintenance (MLOps/MLaaS) in large-scale production environments (data engineering, data version control, model serving, continuous monitoring & alerting)
Strong verbal and written communication and presentation abilities when discussing complex ideas with both technical and non-technical stakeholders
Hands-on professional experience in programming, using Python, Java Flink, PySpark, PyTorch, and TensorFlow
Strong programming skills with a high-performance language (Java, Scala, Go, etc.) and experience working with Python in production
Experience building, deploying and operating data-driven systems in a cloud environment, including experience with feature stores & feature engineering pipelines, data ingestion & transformation, machine learning workflow orchestration
Thrive on coaching and mentoring senior engineers, and on working closely with applied scientists, senior machine learning engineers, and data scientists

Job Responsibility

Drive the operationalization of solutions deployed in production, and help the team grow and cultivate best practices in software development and MLOps
Architect and lead the development of machine learning solutions that can handle low-latency, high-availability, and high-volume scenarios
Mentor engineers and provide technical guidance across multiple projects simultaneously while managing competing priorities effectively within agreed-upon timelines
Apply techniques and create processes to optimise deployed models for better performance, latency, and memory usage
Work closely with applied science and engineering teams, product managers and other business stakeholders to bring our state-of-the-art solutions to customers and to discover and identify new opportunities

What we offer

27 days of holiday a year to start for full-time employees (+1 day for every calendar year up to 30 days)
2 paid volunteering days a year
Hybrid working model with up to 60% remote per week
Work from abroad for up to 30 working days a year
Employee shares program
40% off fashion and beauty products sold and shipped by Zalando, 30% off Lounge by Zalando, discounts from external partners
Relocation assistance available (subject to prior agreement)
Family services, including counseling and support
Health and wellbeing options (including Wellhub, formerly Gympass)
Mental health support and coaching available

Fulltime

Principal Machine Learning Engineer

As a Principal Machine Learning Engineer, you will lead the architecture and dev...

Location

India, Hyderabad

Salary:

Not provided

Amgen

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field with 12 to 17 years of total experience
8+ years of experience in software engineering, machine learning engineering, or ML infrastructure
Strong experience building production ML systems or ML platforms
Hands-on experience with MLOps frameworks and tools such as MLflow or equivalent model lifecycle management frameworks
Strong programming experience in Python and modern software engineering practices such as API-driven architecture and event-based systems
Experience designing scalable distributed systems or cloud-native architectures
Experience deploying and operating machine learning models in production environments
Solid understanding of modern ML workflows including training, evaluation, deployment, monitoring, and retraining

Job Responsibility

Architect and build a scalable ML platform for training, deployment, and lifecycle management of ML, LLM, and Generative AI models
Lead development of infrastructure that supports production hosting of complex AI systems, including large-scale inference workloads
Design developer-friendly abstractions and automation that make it easy for researchers to build and deploy models within the Amgen ecosystem
Implement and evolve MLOps capabilities including experiment tracking, model versioning, CI/CD for ML, monitoring, and reproducibility using tools such as Databricks and MLflow
Build platform capabilities supporting Generative AI and emerging Agentic AI systems
Serve as the technical leader for a team of engineers, guiding architecture, design reviews, and engineering best practices
Partner with AI researchers, data scientists, and platform teams to translate cutting-edge AI research into reliable production systems
Evaluate and adopt emerging technologies across the modern AI stack including foundation models, vector databases, agent frameworks, and model serving systems
Champion AI-native engineering practices, leveraging tools like GitHub Copilot, Codex, and AI-assisted development workflows
Contribute to the broader strategy and evolution of the Enterprise AI Platforms ecosystem

Principal Machine Learning Engineer

With Prisma AIRS, Palo Alto Networks is building the world's most comprehensive ...

Location

United States, Santa Clara

Salary:

185200.00 - 299475.00 USD / Year

Palo Alto Networks

Expiration Date

Until further notice

Requirements

BS/MS or Ph.D. in Computer Science, a related technical field, or equivalent practical experience
Extensive professional experience in software engineering with a deep focus on MLOps, ML systems, or productionizing machine learning models at scale
Expert-level programming skills in Python are required
Deep, hands-on experience designing and building large-scale distributed systems on a major cloud platform (GCP, AWS, Azure, or OCI)
Proven track record of leading the architecture of complex ML systems and MLOps pipelines using technologies like Kubernetes and Docker
Mastery of ML frameworks (TensorFlow, PyTorch) and extensive experience with advanced inference optimization tools (ONNX, TensorRT)
A strong understanding of popular model architectures (e.g., Transformers, CNNs, GNNs) is a must
Demonstrated expertise with modern LLM inference engines (e.g., vLLM, SGLang, TensorRT-LLM) is required

Job Responsibility

Lead the architectural design of a highly scalable, low-latency, and resilient ML inference platform capable of serving a diverse range of models for real-time security applications
Define technical approaches to less-defined product requirements, ensuring the best fit between product features and technical implementation
Explore new product opportunities by maintaining a deep understanding of LLM and Generative AI research trends
Provide technical leadership and mentorship to the team, driving best practices in MLOps, software engineering, and system design
Drive the strategy for model and system performance, guiding research and implementation of advanced optimization techniques like custom kernels, hardware acceleration, and novel serving frameworks
Establish and enforce engineering standards for automated model deployment, robust monitoring, and operational excellence for all production ML systems
Act as a key technical liaison to other principal engineers, architects, and product leaders to shape the future of the Prisma AIRS platform and ensure end-to-end system cohesion
Tackle the most ambiguous and challenging technical problems in large-scale inference, from mitigating novel security threats to achieving unprecedented performance goals

What we offer

restricted stock units
bonus
employee benefits

Fulltime
