Engineering Manager - Machine Learning Infrastructure

Plaid

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

241200.00 - 400000.00 USD / Year

Job Description:

We build simple yet innovative consumer products and developer APIs that shape how everybody interacts with money and the financial system. Plaid is evolving into an AI-first company, where data and machine learning are the key enablers of smarter, more secure insight products built on top of Plaid’s vast financial data network. The Machine Learning Infrastructure team sits at the center of this transformation. We build the platforms that enable model developers to experiment, train, deploy, and monitor machine learning systems reliably and at scale — from feature stores and pipelines, to deployment frameworks and inference tooling. We are in the midst of a pivotal shift: replacing legacy systems with a modern feature store, and establishing a standardized ML Ops “golden path.” Our mission is to enable Plaid’s product teams to move faster with trustworthy insights, deploy models with confidence, and unlock the next generation of AI-powered financial experiences.

Job Responsibility:

  • Lead and support the ML Infra team, driving project execution and ensuring delivery on key commitments
  • Build and launch Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Define and drive adoption of an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines, deployment tooling, and inference systems
  • Partner with ML product teams to understand requirements and deliver solutions that accelerate model development and iteration
  • Recruit, mentor, and develop engineers, fostering a collaborative and high-performing team culture

Requirements:

  • 8–10 years of experience in ML infrastructure, including direct hands-on expertise as an engineer (IC/TL)
  • 2+ years of experience managing infrastructure or ML platform engineers
  • Proven experience delivering and operating ML or AI infrastructure at scale
  • Solid technical depth across ML/AI infrastructure domains (e.g., feature stores, pipelines, deployment, inference, observability)
  • Demonstrated ability to drive execution on complex technical projects with cross-team stakeholders
  • Strong communication and stakeholder management skills

What we offer:

  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission

Additional Information:

Job Posted:
December 11, 2025

Employment Type:
Fulltime

Similar Jobs for Engineering Manager - Machine Learning Infrastructure

Senior Principal Machine Learning Engineer

You’ll form a new team of passionate engineers dedicated to building and scaling...
Location
United States
Salary:
222300.00 - 348975.00 USD / Year
Atlassian
Expiration Date
Until further notice
Requirements
  • Bachelor’s, Master’s, or PhD in Computer Science, Statistics, Mathematics, or a related field, or equivalent practical experience
  • 12+ years of industry experience in machine learning, data science, or AI, with a proven track record of delivering production-grade ML systems
  • Deep expertise in Python, Go, or Java, with the ability to write performant, production-quality code
  • Familiarity with SQL, Spark, and cloud data environments (e.g., AWS, GCP, Databricks)
  • Experience building and scaling ML models for business-critical applications, ideally in security, privacy, anti-abuse, or compliance domains
  • Strong communication skills, able to explain complex ML concepts to diverse audiences and influence stakeholders
  • Demonstrated ability to solve ambiguous, complex problems and drive projects from ideation to production
  • Agile development mindset, with a focus on iterative improvement and business impact
Job Responsibility
  • Lead AI/ML Strategy for Trust: Drive the development and implementation of advanced machine learning algorithms and AI systems for Trust, Security, Product Abuse, and Compliance use cases (e.g., threat detection, vulnerability management, privacy automation, AI safety)
  • Architect and Scale ML Platforms: Design and build scalable, secure, and reliable ML infrastructure and pipelines, ensuring compliance with privacy and regulatory requirements
  • AI Safety and Responsible AI: Develop and champion AI safety practices, including output moderation, explainability, and alignment with evolving regulatory frameworks
  • Cross-Functional Collaboration: Partner with product, engineering, security, privacy, and analytics teams to deliver transformative AI/ML solutions that enhance Atlassian’s trust posture
  • Mentorship and Leadership: Mentor and guide ML engineers and data scientists, fostering a culture of technical excellence, innovation, and continuous improvement
  • Innovation and Research: Stay at the forefront of AI/ML research, evaluating and applying the latest techniques (e.g., LLMs, anomaly detection, privacy-preserving ML) to real-world Trust challenges
  • Platform Enablement: Build reusable ML services and APIs that empower other teams to integrate AI/ML into their products and workflows
  • Operational Excellence: Ensure high availability, reliability, and security of all ML-powered Trust platforms and services
What we offer
  • health and wellbeing resources
  • paid volunteer days
  • benefits, bonuses, commissions, and equity
  • Fulltime

Senior Machine Learning Engineer

Join Axon and be a Force for Good. At Axon, we’re on a mission to Protect Life. ...
Location
United States, Seattle
Salary:
150750.00 - 221000.00 USD / Year
Axon
Expiration Date
Until further notice
Requirements
  • Bachelor’s Degree in Computer Science, Engineering, Electronics, Mathematics or an equivalent highly technical field
  • 6+ years of software engineering experience and a proven track record of successfully deploying AI models to the cloud
  • Experience with Infrastructure-as-code and cloud architecture
  • Proficiency in Python and C++
  • Familiarity with ML frameworks such as TensorFlow or PyTorch
  • Advanced knowledge and hands-on experience with Linux
  • Excellent problem solving skills and ability to dive deep into system architecture
  • Excellent software design skills
  • Comfort communicating and interacting with scientists, engineers and product managers
Job Responsibility
  • Collaborate with scientists and product managers to build proof-of-concepts (POCs) contributing to shaping the Axon of tomorrow
  • Architect and develop secure, privacy-preserving solutions to enable the continuous improvement of existing AI models
  • Architect platforms that accelerate research and AI product development
  • Collaborate with scientists in architecting and implementing state-of-the-art training techniques
  • Set high standards for ethical and responsible AI development
What we offer
  • Competitive salary and 401k with employer match
  • Discretionary paid time off
  • Paid parental leave for all
  • Medical, Dental, Vision plans
  • Fitness Programs
  • Emotional & Mental Wellness support
  • Learning & Development programs
  • Snacks in our offices
  • Fulltime

Senior Machine Learning Engineer (Infrastructure)

We are looking for an experienced MLOps Engineer to join our team as a Senior Ma...
Location
United States, Boston
Salary:
152800.00 - 224100.00 USD / Year
SimpliSafe
Expiration Date
Until further notice
Requirements
  • 5+ years of experience in software engineering, data engineering, or a related field, with at least 3 years focused on MLOps or ML infrastructure
  • Deep hands-on experience with AWS or similar public clouds, including compute, networking, container orchestration, and observability stacks
  • Hands-on experience with CI/CD pipelines, Docker, Kubernetes, and infrastructure-as-code tools (e.g., Terraform, CloudFormation)
  • Proficiency in programming languages like Python, and familiarity with machine learning frameworks (e.g., TensorFlow, PyTorch)
  • Solid understanding of ML lifecycle management, including experiment tracking, versioning, and monitoring
  • LLM application development, including prompt engineering and evaluation
  • Strong communication skills for partnering with cross-functional technical and non-technical teams
Job Responsibility
  • Lead the architecture, deployment, and optimization of scalable ML model serving systems for real-time and batch use cases
  • Collaborate with data scientists, engineers, and stakeholders to operationalize ML models
  • Develop CI/CD pipelines for ML models enabling rapid, safe, and consistent model releases
  • Design, implement, and own comprehensive production monitoring for ML models/systems
  • Manage cloud infrastructure, primarily in AWS or other major public clouds, to support ML workloads
  • Drive best practices in model versioning, observability, reproducibility, and deployment reliability
  • Serve in an on-call rotation as a first responder for software owned by your team
What we offer
  • A mission- and values-driven culture and a safe, inclusive environment where you can build, grow and thrive
  • A comprehensive total rewards package that supports your wellness and provides security for SimpliSafers and their families
  • Free SimpliSafe system and professional monitoring for your home
  • Employee Resource Groups (ERGs) that bring people together, give opportunities to network, mentor and develop, and advocate for change
  • Participation in our annual bonus program, equity, and other forms of compensation
  • A full range of medical, retirement, and lifestyle benefits
  • Fulltime

Senior Machine Learning Engineer

As an ML Engineer at Axon, you will contribute to developing AI solutions transf...
Location
United States, Seattle
Salary:
150750.00 - 221000.00 USD / Year
Axon
Expiration Date
Until further notice
Requirements
  • Bachelor’s Degree in Computer Science, Engineering, Electronics, Mathematics or an equivalent highly technical field
  • 6+ years of software engineering experience and a proven track record of successfully deploying AI models to the cloud
  • Experience with Infrastructure-as-code and cloud architecture
  • Proficiency in Python and C++
  • Familiarity with ML frameworks such as TensorFlow or PyTorch
  • Advanced knowledge and hands-on experience with Linux
  • Excellent problem solving skills and ability to dive deep into system architecture
  • Excellent software design skills
  • Comfort communicating and interacting with scientists, engineers and product managers
Job Responsibility
  • Collaborate with scientists and product managers to build proof-of-concepts (POCs) contributing to shaping the Axon of tomorrow
  • Architect and develop secure, privacy-preserving solutions to enable the continuous improvement of existing AI models
  • Architect platforms that accelerate research and AI product development
  • Collaborate with scientists in architecting and implementing state-of-the-art training techniques
  • Set high standards for ethical and responsible AI development
What we offer
  • Competitive salary and 401k with employer match
  • Discretionary paid time off
  • Paid parental leave for all
  • Medical, Dental, Vision plans
  • Fitness Programs
  • Emotional & Mental Wellness support
  • Learning & Development programs
  • Snacks in our offices
  • Fulltime

Engineering Manager (Python + Machine Learning)

We are seeking a hands-on Machine Learning Engineering Manager to lead cross-fun...
Location
India, Noida
Salary:
Not provided
AquSag Technologies
Expiration Date
Until further notice
Requirements
  • 9+ years of experience in Machine Learning, NLP, and modern deep learning architectures (Transformers, LLMs)
  • Hands-on experience with frameworks such as PyTorch, TensorFlow, Hugging Face, or DeepSpeed
  • 2+ yrs of proven experience managing teams delivering ML/LLM models in production environments
  • Knowledge of distributed training, GPU/TPU optimization, and cloud platforms (AWS, GCP, Azure)
  • Familiarity with MLOps tools like MLflow, Kubeflow, or Vertex AI for scalable ML pipelines
  • Excellent leadership, communication, and cross-functional collaboration skills
  • Bachelor’s or Master’s in Computer Science, Engineering, or related field
Job Responsibility
  • Lead and mentor a cross-functional team of ML engineers, data scientists, and MLOps professionals
  • Oversee the full lifecycle of LLM and ML projects — from data collection to training, evaluation, and deployment
  • Collaborate with Research, Product, and Infrastructure teams to define goals, milestones, and success metrics
  • Provide technical direction on large-scale model training, fine-tuning, and distributed systems design
  • Implement best practices in MLOps, model governance, experiment tracking, and CI/CD for ML
  • Manage compute resources, budgets, and ensure compliance with data security and responsible AI standards
  • Communicate progress, risks, and results to stakeholders and executives effectively

LLM - Senior Staff Engineer - Python + Machine Learning

AquSag is seeking a hands-on Machine Learning Senior Staff Engineer to lead cros...
Location
Salary:
40.00 - 60.00 USD / Hour
AquSag Technologies
Expiration Date
Until further notice
Requirements
  • 9+ years of experience in Machine Learning, NLP, and modern deep learning architectures (Transformers, LLMs)
  • Hands-on experience with frameworks such as PyTorch, TensorFlow, Hugging Face, or DeepSpeed
  • Hands-on experience in Docker for Production deployment
  • Proven experience managing teams delivering ML/LLM models in production environments
  • Knowledge of distributed training, GPU/TPU optimization, and cloud platforms (AWS, GCP, Azure)
  • Familiarity with MLOps tools like MLflow, Kubeflow, or Vertex AI for scalable ML pipelines
  • Excellent leadership, communication, and cross-functional collaboration skills
  • Bachelor’s or Master’s in Computer Science, Engineering, or related field (PhD preferred)
  • Overlap of 6 hours with PST time zone is mandatory
  • Commitments Required: 8 hours per day with overlap of 6 hours with PST
Job Responsibility
  • Lead and mentor a cross-functional team of ML engineers, data scientists, and MLOps professionals
  • Oversee the full lifecycle of LLM and ML projects — from data collection to training, evaluation, and deployment
  • Collaborate with Research, Product, and Infrastructure teams to define goals, milestones, and success metrics
  • Provide technical direction on large-scale model training, fine-tuning, and distributed systems design
  • Implement best practices in MLOps, model governance, experiment tracking, and CI/CD for ML
  • Manage compute resources, budgets, and ensure compliance with data security and responsible AI standards
  • Communicate progress, risks, and results to stakeholders and executives effectively
  • Fulltime

Manager, Machine Learning - Community Support Engineering

The Community Support Platform (CSP) at Airbnb is a critical system that drives ...
Location
United States
Salary:
204000.00 - 255000.00 USD / Year
Airbnb
Expiration Date
Until further notice
Requirements
  • Expertise in various machine learning and AI methodologies, including LLMs and non-LLMs, tailored for user-facing products
  • Proven experience in leading teams that develop large-scale ML models and systems to improve online user experiences
  • Strong leadership skills with a track record of nurturing an innovative and collaborative team environment
  • Exceptional verbal and written communication abilities, with a keen eye for detail
  • Demonstrated capability to work effectively with stakeholders at all organizational levels, both internally and externally
  • Skilled in navigating and resolving ambiguous challenges through proactive and strategic approaches
  • PhD or Master's degree in Computer Science, Mathematics, Statistics, or related technical field
  • 10+ years of experience in building and shipping AI models and products, including 2+ years of experience with LLMs
  • 5+ years managing machine learning teams that deliver large impact
  • Expert knowledge of machine learning algorithms and techniques
Job Responsibility
  • Lead and mentor a dynamic team of highly skilled applied scientists and machine learning engineers in the research, design and optimization of AI models and services
  • Develop and refine the overarching strategy for the ML and AI aspects of our community support products, focusing on scalability, quality, safety, performance, and reliability
  • Foster rapid development cycles without sacrificing quality, collaborating closely with platform, backend, and frontend engineers to engineer robust ML models and systems that enhance community support initiatives
  • Evaluate technical trade-offs in key decisions, ensuring optimal outcomes through data-backed strategies
  • Conduct thorough design and architecture reviews to continually elevate our standards of technical excellence
What we offer
  • bonus
  • equity
  • benefits
  • Employee Travel Credits
  • Fulltime

Staff Machine Learning Engineer

As a Staff Machine Learning Engineer at Aignostics, you will play a crucial role...
Location
Germany, Berlin
Salary:
Not provided
Aignostics
Expiration Date
Until further notice
Requirements
  • Advanced degree in a relevant field or extensive work experience
  • 8+ years of industry experience, with at least 2 years as Staff Engineer or an equivalent role
  • Proven track record of driving technical excellence and innovation
  • Solid background in data-intensive systems and software architecture, design patterns and clean coding
  • Expert Python programming and fluency in C/C++ or other low-level language(s)
  • Experience with designing and implementing large-scale, distributed ML systems and platforms
  • Proven track record of deploying ML models into production environments
  • Strong knowledge of machine learning fundamentals
  • Experience with deep learning frameworks (e.g. PyTorch and TensorFlow) and state-of-the-art techniques (e.g. generative models)
  • Deep understanding of cloud technologies (e.g. GCP, AWS), containerization and orchestration (Kubernetes)
Job Responsibility
  • Define and drive the technical architecture and system design principles for our AI platform and infrastructure
  • Work in close collaboration with engineering leads to build flexible frameworks and systems for model training, evaluation and inference across different pathology applications
  • Guide the CTO office, product management and fellow engineering leads through complex decisions by providing expert consultation on feasibility, architecture, trade-offs and risk mitigation strategies, while ensuring alignment with our technical vision
  • Foster technical alignment across teams by establishing shared architectural principles and best practices, facilitating cross-team design reviews to enable consistent decision-making across domains
  • Champion technical excellence by leading strategic initiatives that modernize our architecture and reduce technical debt while measuring and improving our technical health metrics
  • Elevate the technical capabilities of our engineering staff through structured mentoring, workshops and establishing comprehensive technical guidelines that enable teams to make better design decisions
  • Drive innovation by evaluating emerging technologies, leading proof-of-concept initiatives and building support for strategic technical investments that advance our engineering capabilities while ensuring measurable business value
What we offer
  • Cutting-edge AI research and development, with involvement of Charité, TU Berlin and our other partners
  • Work with a welcoming, diverse and highly international team of colleagues
  • Opportunity to take responsibility and grow your role within the startup
  • Expand your skills by benefitting from our Learning & Development yearly budget of 1,000 € (plus 2 L&D days), language classes and internal development programs
  • Mentoring program: you’ll learn from great experts
  • Flexible working hours and teleworking policy
  • 30 paid vacation days per year
  • We are family & pet friendly and support flexible parental leave options
  • Pick a subsidized membership of your choice among public transport, sports and well-being
  • Enjoy our social gatherings, lunches and off-site events for a fun and inclusive work environment