Senior ML Operations Engineer

Whoop

Location:
United States, Boston

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

150000.00 - 210000.00 USD / Year

Job Description:

At WHOOP, we're on a mission to unlock human performance and healthspan. WHOOP empowers members to perform at a higher level through a deeper understanding of their bodies and daily lives. We are looking for a highly skilled Senior Software Engineer to join our MLOps team, focusing on the development and optimization of ML cloud infrastructure. In this role, you will play a critical part in supporting our Data Science and AI teams by building robust, scalable systems for the productionalization of machine learning models. Your work will be at the heart of bringing advanced AI solutions into production, ensuring they are reliable, scalable, and ready to drive value across WHOOP.

Job Responsibility:

  • Design, develop, and maintain cloud-based infrastructure to support the deployment and scaling of machine learning models
  • Implement automated pipelines for continuous integration and continuous deployment (CI/CD) of ML models, ensuring seamless transitions from development to production environments
  • Collaborate closely with Data Scientists and AI teams to understand model requirements and facilitate the transition from prototype to production
  • Develop APIs, microservices, and other components necessary to integrate ML models into existing systems, enabling real-time inference and decision-making
  • Leverage cloud services to optimize the deployment and performance of machine learning models and associated infrastructure
  • Utilize services such as AWS SageMaker, Lambda, and ECS to build scalable, cost-effective solutions that support real-time ML/AI workloads
  • Monitor and optimize the performance of ML models in production, addressing issues related to latency, scalability, and resource utilization
  • Act as a key technical partner to Data Scientists, providing guidance on best practices for model deployment, versioning, and infrastructure design
  • Support AI teams by troubleshooting and resolving technical challenges related to model deployment and performance in production
  • Stay up-to-date with the latest advancements in ML infrastructure, cloud computing, and AI deployment strategies
  • Proactively suggest and implement improvements to enhance the efficiency, reliability, and scalability of ML operations within the organization

Requirements:

  • Bachelor’s Degree: A degree in Computer Science, Software Engineering, or a related field, or equivalent practical experience
  • 5+ years of experience in software engineering, with a significant focus on building and maintaining ML infrastructure in cloud environments
  • Deep expertise in AWS services, including but not limited to SageMaker, Lambda, ECS, S3, and IAM, with the ability to design and optimize cloud-based ML infrastructure
  • Strong programming skills in languages such as Python or Java, with a focus on building robust, maintainable code
  • Proven experience in productionalizing ML models, including building APIs and services that enable real-time inference
  • Expertise in designing scalable, resilient cloud architectures that support large-scale ML operations
  • Strong understanding of microservices, distributed systems, and the challenges of deploying and maintaining ML models in production environments
  • Excellent collaboration skills, with the ability to work closely with Data Scientists, AI and Software teams, and other cross-functional stakeholders
  • Agile Methodologies: Experience working in Agile/Scrum environments, with a focus on rapid iteration and continuous improvement
What we offer:
  • competitive base salaries
  • meaningful equity
  • benefits
  • generous equity package

Additional Information:

Job Posted:
December 13, 2025

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Senior ML Operations Engineer

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility:
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer:
  • medical, dental, vision, and 401(k)
Employment Type:
Full-time

Senior Platform Engineer, ML Data Systems

We’re looking for an ML Data Engineer to evolve our eval dataset tools to meet t...
Location:
United States, Mountain View
Salary:
137871.00 - 172339.00 USD / Year
Khan Academy
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
  • 5 years of Software Engineering experience with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face
  • Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
  • Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems
  • Familiarity with machine learning workflows — from training data preparation to evaluation
  • Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations
  • Attention to detail and an obsession with data quality and reproducibility
  • Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
  • Proven cross-cultural competency skills demonstrating self-awareness, awareness of others, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.
Job Responsibility:
  • Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets
  • Clean, normalize, and enrich data while preserving semantic meaning and consistency
  • Prepare and format datasets for human labeling, and integrate results into ML datasets
  • Develop and maintain scalable ETL pipelines using Airflow, DBT, Go, and Python running on GCP
  • Implement automated tests and validation to detect data drift or labeling inconsistencies
  • Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring
  • Contribute to shared tools and documentation for dataset management and AI evaluation
  • Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.
What we offer:
  • Competitive salaries
  • Ample paid time off as needed
  • 8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
  • Remote-first culture that caters to your time zone, with flexibility as needed
  • Generous parental leave
  • An exceptional team that trusts you and gives you the freedom to do your best
  • The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
  • Opportunities to connect through affinity, ally, and social groups
  • 401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life.
Employment Type:
Full-time

Senior Support and Operations Engineer

Senior Support and Operations Engineer to join Data Management team, taking owne...
Location:
Greece, Athens
Salary:
Not provided
Metlen Group
Expiration Date:
Until further notice
Requirements:
  • BSc or MSc in Computer Science or related technical field
  • 4+ years of experience in Operations or IT roles (Data Engineering, ML Engineering, Software Engineering, or similar)
  • Experience in system monitoring, technical support, and incident handling
  • Hands-on experience with Cloud platforms
  • Practical exposure to MLOps and DevOps frameworks (Azure DevOps/MLOps, Docker, Kubernetes, AWS DevOps/MLOps)
  • Experience managing CI/CD pipelines, especially in Azure, is considered a plus
  • Experience with GitHub is an advantage
  • Solid hands-on experience in SQL and Python
  • Experience managing enterprise-scale Data/ML workflows
  • Strong analytical and problem-solving abilities
Job Responsibility:
  • Oversee and optimize DevOps and MLOps operations for model deployment, monitoring, and automation
  • Execute, maintain, and improve CI/CD pipelines for Data Engineering and ML deployments
  • Collaborate closely with Data Engineers to strengthen deployment processes and operational efficiency
  • Monitor, troubleshoot, and ensure smooth execution of daily Corporate Data Warehouse workflows
  • Handle technical support requests efficiently, ensuring SLA compliance
  • Maintain high system availability and reliability through proactive monitoring
  • Implement minor enhancements, bug fixes, and performance optimizations
  • Apply version control best practices and ensure proper deployment governance
  • Collaborate cross-functionally to streamline deployment processes across environments
  • Identify opportunities for automation, observability, and improved monitoring
What we offer:
  • Competitive remuneration package
  • Ticket Restaurant Card
  • Group Health Insurance Plan
  • Preferential Protergia household energy plan
  • Pension Plan
Employment Type:
Full-time

Senior ML Platform Engineer

At WHOOP, we're on a mission to unlock human performance and healthspan. WHOOP e...
Location:
United States, Boston
Salary:
150000.00 - 210000.00 USD / Year
Whoop
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s Degree in Computer Science, Engineering, or a related field, or equivalent practical experience
  • 5+ years of experience in software engineering with a focus on ML infrastructure, cloud platforms, or MLOps
  • Strong programming skills in Python, with experience in building distributed systems and REST/gRPC APIs
  • Deep knowledge of cloud-native services and infrastructure-as-code (e.g., AWS CDK, Terraform, CloudFormation)
  • Hands-on experience with model deployment platforms such as AWS SageMaker, Vertex AI, or Kubernetes-based serving stacks
  • Proficiency in ML lifecycle tools (MLflow, Weights & Biases, BentoML) and containerization strategies (Docker, Kubernetes)
  • Understanding of data engineering and ingestion pipelines, with ability to interface with data lakes, feature stores, and streaming systems
  • Proven ability to work cross-functionally with Data Science, Data Platform, and Software Engineering teams, influencing decisions and driving alignment
  • Passion for AI and automation to solve real-world problems and improve operational workflows
Job Responsibility:
  • Architect, build, own, and operate scalable ML infrastructure in cloud environments (e.g., AWS), optimizing for speed, observability, cost, and reproducibility
  • Create, support, and maintain core MLOps infrastructure (e.g., MLflow, feature store, experiment tracking, model registry), ensuring reliability, scalability, and long-term sustainability
  • Develop, evolve, and operate MLOps platforms and frameworks that standardize model deployment, versioning, drift detection, and lifecycle management at scale
  • Implement and continuously maintain end-to-end CI/CD pipelines for ML models using orchestration tools (e.g., Prefect, Airflow, Argo Workflows), ensuring robust testing, reproducibility, and traceability
  • Partner closely with Data Science, Sensor Intelligence, and Data Platform teams to operationalize and support model development, deployment, and monitoring workflows
  • Build, manage, and maintain both real-time and batch inference infrastructure, supporting diverse use cases from physiological analytics to personalized feedback loops for WHOOP members
  • Design, implement, and own automated observability tooling (e.g., for model latency, data drift, accuracy degradation), integrating metrics, logging, and alerting with existing platforms
  • Leverage AI-powered tools and automation to reduce operational overhead, enhance developer productivity, and accelerate model release cycles
  • Contribute to and maintain internal platform documentation, SDKs, and training materials, enabling self-service capabilities for model deployment and experimentation
  • Continuously evaluate and integrate emerging technologies and deployment strategies, influencing WHOOP’s roadmap for AI-driven platform efficiency, reliability, and scale
What we offer:
  • equity
  • benefits
Employment Type:
Full-time

Senior Software Engineer - Network Enablement (Applied ML)

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • Strong software engineering skills including systems design, APIs, and building reliable backend services (Go or Python preferred)
  • Production experience with batch and streaming data pipelines and orchestration tools such as Airflow or Spark
  • Experience building or operating real-time scoring and online feature-serving systems, including feature stores and low-latency model inference
  • Experience integrating model outputs into product flows (APIs, feature flags) and measuring impact through experiments and product metrics
  • Experience with model lifecycle and operations: model registries, CI/CD for models, reproducible training, offline & online parity, monitoring and incident response
Job Responsibility:
  • Embed model inference into Network Enablement product flows and decision logic (APIs, feature flags, backend flows)
  • Define and instrument product + ML success metrics (fraud reduction, retention lift, false positives, downstream impact)
  • Design and run experiments and rollout plans (backtesting, shadow scoring, A/B tests, feature-flagged releases) to validate product hypotheses
  • Build and operate offline training pipelines and production batch scoring for bank intelligence products
  • Ship and maintain online feature serving and low-latency model inference endpoints for real-time partner/bank scoring
  • Implement model CI/CD, model/version registry, and safe rollout/rollback strategies
  • Monitor model/data health: drift/regression detection, model-quality dashboards, alerts, and SLOs targeted to partner product needs
  • Ensure offline and online parity, data lineage, and automated validation / data contracts to reduce regressions
  • Optimize inference performance and cost for real-time scoring (batching, caching, runtime selection)
  • Ensure fairness, explainability and PII-aware handling for partner-facing ML features
What we offer:
  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission
Employment Type:
Full-time

Senior Engineering Manager - Risk

Our mission is to build the intelligent, automated systems and operational tools...
Location:
United States; Canada, San Francisco; New York; Portland
Salary:
239000.00 - 298800.00 USD / Year
Mercury
Expiration Date:
Until further notice
Requirements:
  • 9+ years of software development experience
  • 3–5+ years of engineering management in a high-scale tech environment
  • AI/ML expertise—you’ve built and launched applied AI products (from LLMs to traditional ML models), shipping them from 0→1 and scaling 1→10 in production environments
  • Proven success building large-scale backend distributed systems, ideally involving integrations and decision automation
  • Experience with or curiosity about KYC, AML, risk, or compliance systems in financial services or fintech
  • A track record of raising the bar for quality and reliability, balancing shipping speed with technical excellence
  • Strong communication and leadership skills—you can inspire engineers, partner across functions, and adapt your management style to the moment
  • The ability to hire, retain, and develop exceptional technical talent
  • A pragmatic builder’s mindset: you believe beautiful systems are those that work, adapt, and last
Job Responsibility:
  • Lead teams (4–8 engineers each) responsible for account onboarding, KYC/KYB, AML, and fraud detection decisioning and workflows, and operational tooling
  • Apply AI/ML—from traditional models to large language models—to unlock faster, real-time bank account application approvals. This work sits on the critical business path, directly driving efficiency and revenue growth
  • Partner with Product, Risk, and Data teams to design and deliver scalable systems that balance user experience with compliance rigor
  • Shape the next generation of our KYC and risk platforms—reliable, resilient, and easy to extend as regulations and business needs evolve
  • Create a strong culture of operational excellence, with measurable improvements to uptime, accuracy, and system quality
  • Build, mentor, and grow engineering talent; help managers and senior engineers level up technically and organizationally
  • Drive clarity amid complexity: translating between regulatory nuance and technical execution
  • Foster collaboration across teams to align on priorities, simplify interfaces, and make the whole system more maintainable and elegant
What we offer:
  • base salary
  • equity
  • benefits
Employment Type:
Full-time

Senior Data Engineer

Fospha is dedicated to building the world's most powerful measurement solution f...
Location:
India, Mumbai
Salary:
Not provided
Blenheim Chalcot
Expiration Date:
Until further notice
Requirements:
  • Excellent knowledge of PostgreSQL and SQL technologies
  • Fluent in Python
  • Understanding of data architecture, pipelines, and ELT flows, technologies, and methodologies
  • Understanding of agile methodologies and practices
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
Job Responsibility:
  • Implement and maintain ELT (Extract, Load, Transform) processes using scalable data pipelines and data architecture
  • Collaborate with cross-functional teams to understand data requirements and deliver effective solutions
  • Ensure data integrity and quality across various data sources
  • Support data-driven decision-making by providing clean, reliable, and timely data
  • Define the standards for high-quality data for Data Science and Analytics use-cases and help shape the data roadmap for the domain
  • Design, develop, and maintain the data models used by ML Engineers, Data Analysts and Data Scientists to access data
  • Conduct exploratory data analysis to uncover data patterns and trends
  • Identify opportunities for process improvement and drive continuous improvement in data operations
  • Stay updated on industry trends, technologies, and best practices in data engineering
What we offer:
  • Competitive salary
  • Be part of a leading global venture builder, Blenheim Chalcot, and learn from the incredible talent in BC
  • Be exposed to the right mix of challenges and learning and development opportunities
  • Flexible Benefits including Private Medical and Dental, Gym Subsidies, Life Assurance, Pension scheme, etc.
  • 25 days of paid holiday + your birthday off
  • Free snacks in the office
  • Quarterly team socials
Employment Type:
Full-time

Senior Engineering Manager, Computer Vision

Hover helps people design, improve, and protect the properties they love. With p...
Location:
United States, San Francisco/New York
Salary:
247000.00 - 305000.00 USD / Year
HOVER
Expiration Date:
Until further notice
Requirements:
  • 2+ years of managing high impact CV/ML teams (or tech lead / staff+ leadership) with a track record of building high-performing teams
  • 5+ years of hands-on experience in computer vision or ML (ideally 3D reconstruction, multi-view geometry, or ML-based reconstruction)
  • Proven track record partnering with product teams to scope features, run experiments, and iterate based on customer feedback and data
  • Familiarity with modern MLOps stacks (cloud GPUs, CI/CD, monitoring) and a passion for measurable reliability and cost control
  • Ability to articulate complex trade-offs to executives, engineers, and customers alike
  • Bachelor’s, Master’s, or PhD in CS, ML, or related field
Job Responsibility:
  • Leading the Team: Build and nurture a high-performing, diverse team of senior ICs and emerging leaders. From hiring and onboarding to coaching and career-pathing, you’ll make talent development your first priority
  • Owning a Scaling Product Line: Take end-to-end ownership of a critical computer vision product area, ensuring our research breakthroughs translate into production systems that delight customers at scale
  • Shaping the Roadmap: Partner with Product and Design to translate market opportunities and research advances into a sequenced plan. You’ll balance innovation with operational excellence, driving projects from data strategy and experimentation through to reliable production deployment
  • Driving Technical Excellence: Set engineering standards for accuracy, latency, cost control, and reliability. Model strong cross-functional collaboration and ensure your team’s work integrates smoothly into Hover’s larger platform
  • Communicating Impact: Clearly articulate progress, trade-offs, and technical choices to executives, stakeholders, and the broader team, earning trust at every level
What we offer:
  • Compensation - Competitive salary and meaningful equity in a fast-growing company
  • Healthcare - Comprehensive medical, dental, and vision coverage for you and dependents
  • Paid Time Off - Unlimited and flexible vacation policy
  • Paid Family Leave - We support work/life balance and offer generous paid parental and new child bonding leave
  • Mandatory Self-Care Days - A day set aside each month to allow employees to recharge
  • Remote Wellbeing Resources - We provide recurring fitness classes, meditation/mindfulness tools, virtual therapy, and family planning assistance
  • Learning - We encourage continued education and will help cover the cost of management training, conferences, workshops, or certifications
Employment Type:
Full-time