Principal Engineer Software (Machine Learning)

United States, Santa Clara · Employment contract · 147000.00 - 237500.00 USD / Year · Job Posted July 05, 2026

Job Description

As a Principal Software Engineer, you will provide technical leadership in designing and delivering robust, next-generation cloud security solutions. You will drive the development of scalable cloud security architecture through hands-on coding, manage the full product lifecycle, and collaborate across teams to simplify complex technical issues and deliver high-quality security-as-a-service offerings.

Job Responsibility

  • Provide technical leadership for end-to-end solution delivery, collaborating with cross-functional teams (Product, SRE, QA, and Support) to align engineering efforts with business objectives
  • Drive the development of scalable cloud security architecture through a balance of strategic planning and hands-on coding
  • Establish and evangelize best practices for model versioning, reproducibility, auditing, and compliance to ensure code quality and data privacy across the organization
  • Architect and lead the entire ML lifecycle, from initial development and training to production deployment and real-time inference
  • Build and maintain automated, resilient systems for continuous integration and delivery (CI/CD) and for monitoring backend and machine learning components
  • Continuously evaluate and integrate cutting-edge MLOps tools and frameworks to enhance system scalability, reliability, and efficiency
  • Design and implement robust, next-generation cloud security solutions to resolve complex backend infrastructure and ML model challenges
  • Strategically manage and optimize ML infrastructure and pipelines to improve performance, ensure smooth production integration, and reduce operational costs

Requirements

  • Strong background in machine learning and ML frameworks (e.g., TensorFlow, PyTorch)
  • Experience with Infrastructure-as-Code (IaC) tools like Terraform or CloudFormation
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
  • 10+ years of software development experience, with a focus on cloud-native and SaaS applications
  • Proven experience in designing and building large-scale, distributed systems on public cloud platforms (AWS, GCP, Azure)
  • Strong proficiency in at least one modern programming language such as Python, Go, or Java
  • Demonstrated experience with the full machine learning lifecycle, including model deployment and MLOps

Nice to have

  • Master's or PhD in Computer Science or a related technical field
  • Experience in the cybersecurity domain or with network security products
  • Expertise with containerization and orchestration technologies, particularly Docker and Kubernetes
  • Experience with real-time data processing and streaming technologies (e.g., Kafka, Flink)
  • Contributions to open-source projects in the cloud-native or MLOps space

Similar Jobs for Principal Engineer Software (Machine Learning)

8 matching positions

Senior Software Engineer and Principal Software Engineer - PowerPoint AI Team

The PowerPoint team is embarking on an exciting new chapter - evolving a product...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • 8+ years of experience in backend service engineering, including work on high-scale infrastructures
  • Proficiency in one or more systems programming languages such as C#, C++
  • 1+ years of experience in software engineering, designing and developing systems (and APIs) that deploy and integrate with AI models
  • 2+ years of experience working with rich telemetry, making data driven decisions, and carrying out rapid experimentation
  • 2+ years of experience building software for scale, performance, and reliability
  • Academic or industry experience building, fine-tuning, or deploying models (any category), or building eval-driven systems that use them
Job Responsibility
  • Lead design and delivery of complex, scalable AI features ensuring resilience and exceptional user experience
  • Drive technical strategy and architecture decisions across multiple services, influencing partner teams and aligning with compliance and security requirements
  • Champion modern engineering practices, including AI-driven approaches, automation, and cloud-native patterns, across the full development lifecycle
  • Mentor and guide engineers, fostering technical excellence and continuous improvement in security, reliability, and performance
  • Collaborate cross-org to solve challenging technical problems, streamline processes, and reduce operational costs while improving live-site health
  • Design and implement scalable backend services optimized for machine learning workflows and large language model integration
  • Develop and maintain evaluation-driven systems that leverage text and multimodal inputs (e.g., images) to power visual-creation experiences
  • Build and optimize APIs and infrastructure to support high-performance model inference and experimentation at scale
  • Collaborate with product, ML, and design teams to integrate models into user-facing features, ensuring seamless functionality and performance
  • Conduct model evaluations and experiments, analyze results, and iterate on improvements to enhance accuracy and user experience
  • Fulltime

Principal Machine Learning Engineer - Forecasting

We are seeking a Principal Machine Learning Engineer to join the Forecasting tea...
Location: India, Hyderabad
Salary: Not provided
Amgen
Expiration Date: Until further notice
Requirements
  • Degree and 12+ years of experience in machine learning engineering, software engineering, data science engineering, or a related quantitative discipline.
  • 10+ years of professional experience building, deploying, and operating production ML, AI, data, or software systems, including significant experience as a technical lead on complex, cross-functional initiatives.
  • Demonstrated track record of designing or architecting new and existing systems with emphasis on reliability, scale, security, maintainability, and operational excellence.
  • Deep hands-on experience with the full ML engineering lifecycle, including data pipelines, feature engineering, experimentation, model training, model integration, testing, deployment, monitoring, evaluation, observability, and continuous improvement.
  • Strong experience deploying forecasting, probabilistic, Bayesian, predictive, NLP, deep learning, or LLM-based systems in production environments.
  • Experience building or integrating AI systems, including LLM-powered applications, agentic workflows, retrieval or information-retrieval systems, evaluation frameworks, and human-in-the-loop review patterns.
  • Strong object-oriented programming skills in Python and SQL, with experience using modern ML and software development frameworks such as scikit-learn, PyTorch, TensorFlow/JAX, Spark, Ray, MLflow, Airflow/Prefect/Dagster, FastAPI, or equivalent technologies.
  • Experience with cloud platforms and distributed systems, including containerization, CI/CD, infrastructure-as-code, model serving, workflow orchestration, batch and streaming data processing, and production support.
  • Strong software engineering fundamentals, including system design, architecture trade-off analysis, testing strategies, code reviews, source control, build and release processes, performance optimization, and maintainability.
  • Demonstrated ability to communicate technical strategy, system tradeoffs, and delivery risks to technical and non-technical stakeholders, including senior leaders, product/program owners, scientists, and business partners.
Job Responsibility
  • Define and drive the technical strategy for enterprise forecasting and AI decision systems, aligning architecture, reusable platforms, and delivery roadmaps to Amgen's planning, supply, commercial, manufacturing, operations, and patient-focused priorities.
  • Partner with data scientists, product and program leaders, operations, commercial, manufacturing, supply chain, finance, and other business stakeholders to translate ambiguous requirements into shipped software and measurable business outcomes.
  • Architect, build, and scale production ML, LLM, and agentic AI systems that combine forecasting, predictive analytics, simulation, optimization, and autonomous or semi-autonomous workflow automation.
  • Productionize advanced statistical, Bayesian, deep learning, and machine learning models, including training, validation, inference, serving, evaluation, lifecycle management, and governed deployment.
  • Lead development of AI agent components that automate complex forecasting and operational workflows across multiple systems, decision points, datasets, and user groups while preserving appropriate human-in-the-loop review and escalation patterns.
  • Design secure integrations across enterprise APIs, databases, analytics platforms, workflow systems, cloud services, and AI orchestration patterns to enable multi-system decision support and scalable automation.
  • Establish robust MLOps and AI engineering capabilities, including model versioning, CI/CD, automated retraining, performance monitoring, observability, drift detection, service-level reliability, rollback strategies, and operational runbooks.
  • Implement guardrails, model and agent evaluation frameworks, auditability, explainability, responsible AI controls, and human-in-the-loop operating models for production AI systems in high-impact and regulated business contexts.
  • Research and evaluate state-of-the-art open-source, vendor, and internal tools related to forecasting, LLMs, AI agents, MLOps, model optimization, model serving, and scalable AI infrastructure for potential application to Amgen business problems.
  • Provide principal-level technical mentorship, design review leadership, and engineering standard-setting across teams, promoting code quality, documentation, reproducibility, testing, security, privacy, maintainability, and operational excellence.
  • Fulltime

Principal Engineer (Machine Learning)

We are seeking a highly experienced Sr Principal ML Engineer with a good underst...
Location: United States, Santa Clara
Salary: 185200.00 - 299475.00 USD / Year
Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • 4+ years of experience using Python to build complex backend systems (ML experience preferred); 10+ years of experience in software development
  • Strong background in machine learning and ML frameworks (e.g., TensorFlow, PyTorch)
  • Experience with cloud-native service development stack on GCP is a plus
  • Solid grasp of RESTful API design and microservices architecture
  • Skilled in diagnosing and solving complex problems while providing detailed technical analysis
  • Attention to detail and high behavioral standards
  • Team player with a can-do attitude who tackles difficult problems and inspires the team to do the same
  • High energy and the ability to work in a fast-paced environment
  • Excellent collaboration and communication with multiple teams
  • Fast learner and eager to absorb new emerging technologies
Job Responsibility
  • Architect and implement new ML models and pipelines to support efficient model training, validation, and real-time inference
  • Optimize the existing ML models and pipelines
  • Ensure smooth integration of ML solutions into production systems, focusing on performance, reliability, and scalability
  • Build automation tools for continuous integration, delivery, and deployment of backend and ML components
  • Work closely with cross-functional teams (product, QA, DevOps, and customer support) to align development efforts with business needs
  • Troubleshoot and resolve complex issues that arise within both the backend infrastructure and ML models
  • Ensure code quality, security, and data privacy by following industry best practices
  • Maintain clear and concise documentation for system architecture, API endpoints, and ML model integration processes
What we offer
  • restricted stock units
  • bonus
  • Fulltime

Principal Machine Learning Engineer - Evisort AI

Principal Machine Learning Engineer - Evisort AI at Workday. As a Principal Mach...
Location: USA, Seattle; Atlanta
Salary: 210600.00 - 316000.00 USD / Year
Workday
Expiration Date: Until further notice
Requirements
  • 10+ years of experience as a member of a data science, machine learning engineering, or other relevant software development team building applied machine learning products at scale, including taking products through applied research, design, implementation, production, and production-based evaluation
  • 4+ years of professional experience with machine learning and deep learning frameworks & toolkits such as PyTorch and TensorFlow
  • 6+ years of professional experience in building services to host machine learning models in production at scale
  • 3+ years of demonstrated experience working with large language models (LLMs), text generation models, and/or graph neural network models for real-world use cases
  • 6+ years of proven experience with cloud computing platforms (e.g. AWS, GCP, etc.)
  • Proven track record of successfully leading, mentoring, and/or managing ML Engineering teams, taking ownership of the development lifecycle and sprint planning, and fostering a culture of collaboration, transparency, innovation, and continuous improvement
  • Bachelor’s (Master’s or PhD preferred) degree in engineering, computer science, physics, math or equivalent
Job Responsibility
  • Develop tailored user experiences using advanced LLMs, Knowledge Graphs, personalization, and predictive analysis
  • Collaborate with other engineers to deliver ML solutions across Workday’s product ecosystem and utilize software and data engineering stacks to enable training, deployment, and lifecycle management of various ML models
  • Develop and deploy new products at scale and leverage Workday’s vast computing resources on rich datasets to deliver transformative value to our customers
  • Fulltime

Principal Machine Learning Engineer

As a Principal Machine Learning Engineer in ZMS, you will be the tech lead worki...
Location: Germany, Berlin
Salary: Not provided
Zalando
Expiration Date: Until further notice
Requirements
  • Excellent software development engineering skills to design computationally effective solutions for machine learning operationalization and maintenance (MLOps/MLaaS) in large-scale production environments (data engineering, data version control, model serving, continuous monitoring & alerting)
  • Strong verbal and written communication and presentation abilities when discussing complex ideas with both technical and non-technical stakeholders
  • Hands-on professional experience in programming, using Python, Java Flink, PySpark, PyTorch, and TensorFlow
  • Strong programming skills with a high-performance language (Java, Scala, Go, etc.) and experience working with Python in production
  • Experience building, deploying and operating data-driven systems in a cloud environment, including experience with feature stores & feature engineering pipelines, data ingestion & transformation, and machine learning workflow orchestration
  • Drive to coach and mentor senior engineers, and to work closely with applied scientists, senior machine learning engineers and data scientists
Job Responsibility
  • Drive the operationalization of solutions deployed in production, and help the team grow and cultivate best practices in software development and MLOps
  • Architect and lead the development of machine learning solutions that can handle low latency, high availability and high volume scenarios
  • Mentor engineers and provide technical guidance across multiple projects simultaneously while managing competing priorities effectively within agreed-upon timelines
  • Apply techniques and create processes to optimise deployed models for better performance, latency, and memory usage
  • Work closely with applied science and engineering teams, product managers and other business stakeholders to bring our state-of-the-art solutions to customers and to discover and identify new opportunities
What we offer
  • 27 days of holiday a year to start for full-time employees (+1 day for every calendar year up to 30 days)
  • 2 paid volunteering days a year
  • Hybrid working model with up to 60% remote per week
  • Work from abroad for up to 30 working days a year
  • Employee shares program
  • 40% off fashion and beauty products sold and shipped by Zalando, 30% off Lounge by Zalando, discounts from external partners
  • Relocation assistance available (subject to prior agreement)
  • Family services, including counseling and support
  • Health and wellbeing options (including Wellhub, formerly Gympass)
  • Mental health support and coaching available
  • Fulltime

Principal Machine Learning Engineer

As a Principal Machine Learning Engineer in ZMS, you will be the tech lead worki...
Location: Germany, Berlin
Salary: Not provided
Zalando Sverige
Expiration Date: Until further notice
Requirements
  • Excellent software development engineering skills to design computationally effective solutions for machine learning operationalization and maintenance (MLOps/MLaaS) in large-scale production environments (data engineering, data version control, model serving, continuous monitoring & alerting)
  • Strong verbal and written communication and presentation abilities when discussing complex ideas with both technical and non-technical stakeholders
  • Hands-on professional experience in programming, using Python, Java Flink, PySpark, PyTorch, and TensorFlow
  • Strong programming skills with a high-performance language (Java, Scala, Go, etc.) and experience working with Python in production
  • Experience building, deploying and operating data-driven systems in a cloud environment, including experience with feature stores & feature engineering pipelines, data ingestion & transformation, and machine learning workflow orchestration
  • Drive to coach and mentor senior engineers, and to work closely with applied scientists, senior machine learning engineers and data scientists
Job Responsibility
  • Drive the operationalization of solutions deployed in production, and help the team grow and cultivate best practices in software development and MLOps
  • Architect and lead the development of machine learning solutions that can handle low latency, high availability and high volume scenarios
  • Mentor engineers and provide technical guidance across multiple projects simultaneously while managing competing priorities effectively within agreed-upon timelines
  • Apply techniques and create processes to optimise deployed models for better performance, latency, and memory usage
  • Work closely with applied science and engineering teams, product managers and other business stakeholders to bring our state-of-the-art solutions to customers and to discover and identify new opportunities
What we offer
  • 27 days of holiday a year to start for full-time employees (+1 day for every calendar year up to 30 days)
  • 2 paid volunteering days a year
  • Hybrid working model with up to 60% remote per week
  • Work from abroad for up to 30 working days a year
  • Employee shares program
  • 40% off fashion and beauty products sold and shipped by Zalando, 30% off Lounge by Zalando, discounts from external partners
  • Relocation assistance available (subject to prior agreement)
  • Family services, including counseling and support
  • Health and wellbeing options (including Wellhub, formerly Gympass)
  • Mental health support and coaching available
  • Fulltime

Principal Machine Learning Engineer

As a Principal Machine Learning Engineer, you will lead the architecture and dev...
Location: India, Hyderabad
Salary: Not provided
Amgen
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field with 12 to 17 years of total experience
  • 8+ years of experience in software engineering, machine learning engineering, or ML infrastructure
  • Strong experience building production ML systems or ML platforms
  • Hands-on experience with MLOps frameworks and tools such as MLflow or equivalent model lifecycle management frameworks
  • Strong programming experience in Python and modern software engineering practices such as API-driven architecture and event-based systems
  • Experience designing scalable distributed systems or cloud-native architectures
  • Experience deploying and operating machine learning models in production environments
  • Solid understanding of modern ML workflows including training, evaluation, deployment, monitoring, and retraining
Job Responsibility
  • Architect and build a scalable ML platform for training, deployment, and lifecycle management of ML, LLM, and Generative AI models
  • Lead development of infrastructure that supports production hosting of complex AI systems, including large-scale inference workloads
  • Design developer-friendly abstractions and automation that make it easy for researchers to build and deploy models within the Amgen ecosystem
  • Implement and evolve MLOps capabilities including experiment tracking, model versioning, CI/CD for ML, monitoring, and reproducibility using tools such as Databricks and MLflow
  • Build platform capabilities supporting Generative AI and emerging Agentic AI systems
  • Serve as the technical leader for a team of engineers, guiding architecture, design reviews, and engineering best practices
  • Partner with AI researchers, data scientists, and platform teams to translate cutting-edge AI research into reliable production systems
  • Evaluate and adopt emerging technologies across the modern AI stack including foundation models, vector databases, agent frameworks, and model serving systems
  • Champion AI-native engineering practices, leveraging tools like GitHub Copilot, Codex, and AI-assisted development workflows
  • Contribute to the broader strategy and evolution of the Enterprise AI Platforms ecosystem

Principal Machine Learning Engineer

With Prisma AIRS, Palo Alto Networks is building the world's most comprehensive ...
Location: United States, Santa Clara
Salary: 185200.00 - 299475.00 USD / Year
Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • BS/MS or Ph.D. in Computer Science, a related technical field, or equivalent practical experience
  • Extensive professional experience in software engineering with a deep focus on MLOps, ML systems, or productionizing machine learning models at scale
  • Expert-level programming skills in Python are required
  • Deep, hands-on experience designing and building large-scale distributed systems on a major cloud platform (GCP, AWS, Azure, or OCI)
  • Proven track record of leading the architecture of complex ML systems and MLOps pipelines using technologies like Kubernetes and Docker
  • Mastery of ML frameworks (TensorFlow, PyTorch) and extensive experience with advanced inference optimization tools (ONNX, TensorRT)
  • A strong understanding of popular model architectures (e.g., Transformers, CNNs, GNNs) is a must
  • Demonstrated expertise with modern LLM inference engines (e.g., vLLM, SGLang, TensorRT-LLM) is required
Job Responsibility
  • Lead the architectural design of a highly scalable, low-latency, and resilient ML inference platform capable of serving a diverse range of models for real-time security applications
  • Define technical approaches to less-defined product requirements, ensuring the best fit between product features and technical implementation
  • Explore new product opportunities by maintaining a deep understanding of LLM and Generative AI research trends
  • Provide technical leadership and mentorship to the team, driving best practices in MLOps, software engineering, and system design
  • Drive the strategy for model and system performance, guiding research and implementation of advanced optimization techniques like custom kernels, hardware acceleration, and novel serving frameworks
  • Establish and enforce engineering standards for automated model deployment, robust monitoring, and operational excellence for all production ML systems
  • Act as a key technical liaison to other principal engineers, architects, and product leaders to shape the future of the Prisma AIRS platform and ensure end-to-end system cohesion
  • Tackle the most ambiguous and challenging technical problems in large-scale inference, from mitigating novel security threats to achieving unprecedented performance goals
What we offer
  • restricted stock units
  • bonus
  • employee benefits
  • Fulltime