Senior ML Ops Engineer

EdTech Jobs

Location:
United States, Philadelphia

Contract Type:
Not provided

Salary:

95300.00 - 158800.00 USD / Year

Job Description:

Join Elsevier as a Senior ML Ops Engineer to lead the development of impactful AI-based features within health platforms while bridging the gap between data science and engineering. You will work on AI-based features (GenAI, Agentic AI, RAG, etc.), search/ranking quality, and knowledge-graph-aware retrieval while enforcing content rights and editorial confidentiality.

Job Responsibility:

  • Automate and orchestrate machine learning workflows across major cloud and AI platforms (AWS, Azure, Databricks, and foundation model APIs such as OpenAI)
  • Maintain and version model registries and artifact stores to ensure reproducibility and governance
  • Develop and manage CI/CD for ML, including automated data validation, model testing, and deployment
  • Implement ML Engineering solutions using popular MLOps platforms such as AWS SageMaker, MLflow, Azure ML
  • Scale end-to-end custom SageMaker pipelines
  • Design and implement the engineering components of GAR+RAG systems (e.g., query interpretation and reflection, chunking, embeddings, hybrid retrieval, semantic search), manage prompt libraries, guardrails and structured output for LLMs hosted on Bedrock/SageMaker or self-hosted
  • Design and implement ML pipelines that utilize Elasticsearch/OpenSearch/Solr, vector DBs, and graph DBs
  • Build evaluation pipelines: offline IR metrics (NDCG, MAP, MRR), LLM quality metrics (faithfulness, grounding), and A/B testing
  • Optimize infrastructure costs through monitoring, scaling strategies, and efficient resource utilization
  • Stay current with the latest research in GAI, NLP, and RAG, and apply the state of the art in our experiments and systems
  • Partner with Subject-Matter Experts, Product Managers, Data Scientists and Responsible AI experts to translate business problems into cutting-edge data science solutions
  • Collaborate and interface with Operations Engineers who deploy and run production infrastructure

Requirements:

  • Current experience in ML Engineering, MLOps platforms, shipping ML or search/GenAI systems to production
  • Strong Python, Java, and/or Scala experience
  • Hands-on experience with major cloud vendor solutions (AWS, Azure and/or Google)
  • Experience with Search/vector/graph technologies (e.g., Elasticsearch / OpenSearch / Solr / Neo4j)
  • Experience evaluating LLMs
  • A strong understanding of the Data Science Life Cycle including feature engineering, model training, and evaluation metrics
  • Familiarity with ML frameworks, e.g., PyTorch, TensorFlow, PySpark
  • Experience with large-scale data processing systems, e.g., Spark
  • Experience with statistical analysis, machine learning theory and natural language processing

Nice to have:

  • Background in health technology and/or medical content workflows

What we offer:
  • Annual incentive bonus
  • Country specific benefits
  • Fair and accessible hiring process with accommodation support

Additional Information:

Job Posted:
January 13, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work
Similar Jobs for Senior ML Ops Engineer

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility:
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer:
  • medical, dental, vision, and 401(k)
Employment Type:
Fulltime

Senior Software Engineer – ML Model Compliance & Automation

We are seeking a highly skilled and motivated Senior Software Engineer to lead t...
Location:
India, Jaipur
Salary:
Not provided
InfoObjects
Expiration Date:
Until further notice
Requirements:
  • Experience Required: 3 - 7 yrs
  • GoLang (preferred)
  • Python (preferred)
  • Bash
  • MLOps Tools: KitOps, MLModelCI, MLflow, ONNX, TensorFlow, PyTorch, Docker
  • SBOM & Security: Syft, Grype, Trivy, CycloneDX, SPDX
  • CI/CD: GitHub Actions, GitLab CI, Jenkins, ArgoCD
  • Infra: Kubernetes, Docker, Helm, Terraform
  • Cloud: AWS, GCP, Azure (EKS/GKE/ECS preferred)
  • Version Control: Git, GitOps
Job Responsibility:
  • Model Packaging & Artifact Management: Design and implement workflows for packaging ML models using KitOps, ONNX, MLflow, or TensorFlow SavedModel
  • Manage model artifact versioning, registries, and reproducibility
  • Ensure artifact integrity, consistency, and traceability across CI/CD pipelines
  • Model Profiling & Optimization: Automate model profiling (latency, size, ops) using MLModelCI, TorchServe, or ONNX Runtime
  • Apply quantization, pruning, and format conversions (e.g., FP32→INT8) for optimization
  • Embed profiling and optimization checks into CI/CD pipelines to assess deployment readiness
  • Compliance & SBOM Generation: Develop pipelines to generate and validate SBOMs for ML models
  • Implement compliance checks for licensing, vulnerabilities, and security using CycloneDX, SPDX, Syft, or Trivy
  • Validate schema, dependencies, and runtime environments for production readiness
  • Cloud Integration & Deployment: Automate model registration, endpoint creation, and monitoring setup in AWS/GCP/Azure
Employment Type:
Fulltime

Senior AI Engineer

We are seeking a Senior AI Engineer (L4, Individual Contributor) to design, buil...
Location:
India, Chennai
Salary:
Not provided
Arcadia
Expiration Date:
Until further notice
Requirements:
  • 12+ years of professional software engineering experience
  • 3+ years in AI/ML development
  • Strong expertise in Python, PyTorch/TensorFlow, scikit-learn, and ML tooling (MLflow, LangChain)
  • Proficiency with SQL, cloud services (AWS), containers (Docker, Kubernetes), and distributed systems
  • Understanding of modern AI research (LLMs, diffusion models, transformers)
  • Experience deploying ML models in production with CI/CD
  • Strong analytical skills, ability to balance speed and rigor in experimentation
  • A passion for sustainability and the clean-energy mission
  • Experience building agentic pipelines with the latest models from Anthropic, Google, OpenAI, and more
Job Responsibility:
  • Integrate with LLMs and be an expert in prompt engineering to derive the right results from the models with limited hallucination
  • Design and train ML/AI models (forecasting, NLP, graph learning, generative AI) to improve data quality, cost effectiveness, and system scalability
  • Deploy and optimize models for large-scale production workloads using Python-based services in AWS/Kubernetes environments
  • Build robust, automated data pipelines and ML Ops workflows for continuous training and deployment
  • Research and experiment with modern AI methods (transformers, foundation models, reinforcement learning) and adapt them to energy-sector challenges, including but not limited to utility statements
  • Drive performance improvements in model accuracy, latency, and cost efficiency
  • Collaborate with Product, SRE, and Analytics teams to deliver AI-enabled features across Arcadia’s platform
  • Write clean, maintainable code, contribute to architecture reviews, and mentor junior engineers
  • Build true agentic workflows with multi-step processing incorporating RAG pipelines and MCPs
What we offer:
  • Competitive compensation and employee stock options
  • Hybrid/remote-first working model (India-based role, with global collaboration)
  • Flexible leave policy
  • Comprehensive medical insurance (self + family members)
  • Annual performance cycle + quarterly recognition awards
  • A supportive, diverse engineering culture grounded in empathy, teamwork, and innovation
Employment Type:
Fulltime

Senior Full Stack Software Engineer

Tutor Intelligence builds software to enable ordinary robots to achieve extraord...
Location:
United States, Watertown
Salary:
140000.00 - 190000.00 USD / Year
Tutor Intelligence
Expiration Date:
Until further notice
Requirements:
  • Strong programming skills in Python
  • Software engineering tooling: git, unix shell, etc
  • Collaborative nature and social skill set
  • Interest in robotics, AI, solving hard problems, or improving the future of humanity
  • Passion for building things (and just getting stuff done)
Job Responsibility:
  • Architecting and engineering core software across one or more of: robot software, backend services, ML services, cloud infrastructure / dev-ops
  • Involvement in new project planning
What we offer:
  • generous equity
  • fully covered health + dental
  • unlimited PTO
Employment Type:
Fulltime

Senior Platform Machine Learning Engineer

Machine learning is the crucial enabler for every financial service EarnIn provi...
Location:
United States, Mountain View
Salary:
232200.00 - 283800.00 USD / Year
EarnIn
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field, or relevant equivalent experience
  • 4+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms like AWS SageMaker, Databricks, or GCP Vertex AI
  • Experience with LLM Ops, foundation model APIs, and AI engineering
  • Familiarity with data pipeline and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest machine learning and platform engineering industry trends
Job Responsibility:
  • Design, build, and maintain the ML and AI platform and tools to support the end-to-end machine learning lifecycle
  • Work closely with other machine learning engineers to understand their workflows, optimize model training and deployment processes, and ensure the reproducibility of results
  • Ensure scalability, reliability, cost efficiency, and ease of use of the machine learning platform
  • Contribute to evaluating and adopting new technologies and tools to enhance our machine-learning capabilities
  • Set examples of outstanding operational excellence. Be the catalyst for step-jump changes
What we offer:
  • equity and benefits
Employment Type:
Fulltime

Senior Principal Technical Program Manager - ML Platform

Salary:
231300.00 - 301975.00 USD / Year
Atlassian
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience on software teams as Development Manager, Technical Product Manager or TPM leading technical platforms areas
  • Deep domain experience in AI and/or Search. Example: Model Inference, Model Evaluation, Model Training, LLM Ops, Semantic Search, Search Relevance, etc.
  • Partner with Engineering in defining direction, strategy and execution at Platform level
  • Strategic thinking and ability to understand business objectives to translate them into technical problems and programs.
  • Technical understanding of systems involved. Willingness to develop domain expertise in the area they operate - storage, networking, authentication, capacity management, service deployments, etc.
  • TPMs are not expected to write or read code, but are expected to understand system flows, block architectures, APIs and such.
  • Experience defining and running end-to-end complex technical programs
  • Strong leadership, organizational, and communication skills
Job Responsibility:
  • Understand and stay up-to-date on latest innovations in AI and Search. Partner closely with engineering teams to translate these into practical platform evolution for Atlassian bringing value to our customers.
  • Analyze business objectives, customer needs, product adoption inhibitors and opportunities, industry trends, and based on these, in close collaboration with your stakeholders, define a long-term strategy and roadmap for your platform and product components.
  • Understand business objectives and translate them into technical systems problems that need to be prioritized and solved in the current business environment.
  • Define specific systems programs and create a plan of action for realizing those programs. Such programs could be around capacity planning, migration efforts, high availability, network architecture, performance optimization, reliability improvements and more.
  • Use your technical understanding of Atlassian and related systems to partner with and influence engineers and architects in making progress on these problems.
  • Responsible for taking a systematic approach to engineering problems. This includes: prioritizing tasks, scoping out the project, defining objectives, and making consistent progress against each of these.
  • Be accountable for the success of these technical programs by managing the entire lifecycle from initiation to forecasting, budgeting, scheduling, etc.
  • Manage complex dependencies and projects with a broad scope across the company
What we offer:
  • health and wellbeing resources
  • paid volunteer days

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, New York
Salary:
190800.00 - 286800.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility:
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
Employment Type:
Fulltime

Senior ML Ops Engineer - Architecture & Strategy

We own the platform blueprint for our ML infrastructure: designing systems that ...
Location:
Germany, Munich
Salary:
Not provided
BMW
Expiration Date:
Until further notice
Requirements:
  • University degree in Computer Science, Computer/Electrical Engineering or related subjects
  • 5–8+ years in ML platform or infrastructure engineering, with at least two years in a tech lead or architect role
  • Deep expertise in AWS, Azure, or Google Cloud, ideally with multi-region or multi-account setups
  • Proven track record designing systems for PB-scale data and hundreds of concurrent training jobs as well as understanding of large vision models and the challenges of compressing them for automotive-grade SoCs
  • Strong knowledge of Kubernetes platform design, GitOps, and infrastructure-as-code
  • Excellent communication skills to align ML researchers, embedded engineers, data teams, and executives
  • Familiarity with edge model compilation toolchains for Qualcomm (QNN, AIMET) and/or NVIDIA (TensorRT, Triton) and experience with automotive data at scale, such as MDF4, MCAP, ROS bags, and multi-sensor synchronisation
Job Responsibility:
  • You design the reference architecture for the ML platform end-to-end: data ingestion, PB-scale data lake, heterogeneous training clusters, model registry, and deployment-ready artefacts
  • You design the data-format backbone, setting standards for data flows, ingestion, cataloguing, transcoding, and partitioning at PB scale, integrated with dataset management tooling
  • You define the platform component topology and integration contracts for pipeline orchestration, experiment tracking, hyperparameter optimisation, dataset management, observability, and metadata
  • You establish model lifecycle governance, including experiment tracking, approval gates, validation criteria, and clear handoff contracts to deployment teams
  • You drive cost governance at PB scale, including accelerator spot strategies, S3 tiering, cross-AZ traffic reduction, and Kubernetes cluster right-sizing
  • You partner with Security, Legal, and Functional-Safety teams on ISO 26262, ISO 8800, and data-protection compliance
What we offer:
  • Challenging projects with which we shape the mobility of tomorrow together
  • Wide range of personal and professional development opportunities
  • Attractive, fair and performance-related remuneration
  • High level of job security
  • Annual special payments such as vacation pay, Christmas bonus, and profit sharing
  • Flexible working hours including six weeks annual leave and overtime compensation
  • Discounted BMW & MINI conditions
  • Many other benefits at bmw.jobs/benefits
Employment Type:
Fulltime