Senior ML Ops Engineer

EdTech Jobs

Location:
United States, Philadelphia

Contract Type:
Not provided

Salary:

95300.00 - 158800.00 USD / Year

Job Description:

Join Elsevier as a Senior ML Ops Engineer to lead the development of impactful AI-based features within health platforms while bridging the gap between data science and engineering. You will work on AI-based features (GenAI, Agentic AI, RAG, etc.), search/ranking quality, and knowledge-graph-aware retrieval while enforcing content rights and editorial confidentiality.

Job Responsibility:

  • Automate and orchestrate machine learning workflows across major cloud and AI platforms (AWS, Azure, Databricks, and foundation model APIs such as OpenAI)
  • Maintain and version model registries and artifact stores to ensure reproducibility and governance
  • Develop and manage CI/CD for ML, including automated data validation, model testing, and deployment
  • Implement ML Engineering solutions using popular MLOps platforms such as AWS SageMaker, MLflow, Azure ML
  • Scale end-to-end custom SageMaker pipelines
  • Design and implement the engineering components of GAR+RAG systems (e.g., query interpretation and reflection, chunking, embeddings, hybrid retrieval, semantic search), manage prompt libraries, guardrails and structured output for LLMs hosted on Bedrock/SageMaker or self-hosted
  • Design and implement ML pipelines that utilize Elasticsearch/OpenSearch/Solr, vector DBs, and graph DBs
  • Build evaluation pipelines: offline IR metrics (NDCG, MAP, MRR), LLM quality metrics (faithfulness, grounding), and A/B testing
  • Optimize infrastructure costs through monitoring, scaling strategies, and efficient resource utilization
  • Stay current with the latest GenAI, NLP, and RAG research and apply the state of the art in our experiments and systems
  • Partner with Subject-Matter Experts, Product Managers, Data Scientists and Responsible AI experts to translate business problems into cutting-edge data science solutions
  • Collaborate and interface with Operations Engineers who deploy and run production infrastructure

Requirements:

  • Current experience in ML Engineering and MLOps platforms, including shipping ML or search/GenAI systems to production
  • Strong Python, Java, and/or Scala experience
  • Hands-on experience with major cloud vendor solutions (AWS, Azure and/or Google)
  • Experience with Search/vector/graph technologies (e.g., Elasticsearch / OpenSearch / Solr / Neo4j)
  • Experience evaluating LLMs
  • A strong understanding of the Data Science Life Cycle including feature engineering, model training, and evaluation metrics
  • Familiarity with ML frameworks, e.g., PyTorch, TensorFlow, PySpark
  • Experience with large-scale data processing systems, e.g., Spark
  • Experience with statistical analysis, machine learning theory and natural language processing

Nice to have:

  • Background in health technology and/or medical content workflows

What we offer:
  • Annual incentive bonus
  • Country specific benefits
  • Fair and accessible hiring process with accommodation support

Additional Information:

Job Posted:
January 13, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Senior ML Ops Engineer

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer
  • medical, dental, vision, and 401(k)
  • Fulltime

Senior Software Engineer – ML Model Compliance & Automation

We are seeking a highly skilled and motivated Senior Software Engineer to lead t...
Location
India, Jaipur
Salary:
Not provided
InfoObjects
Expiration Date
Until further notice
Requirements
  • Experience Required: 3 - 7 yrs
  • GoLang (preferred)
  • Python (preferred)
  • Bash
  • MLOps Tools: KitOps, MLModelCI, MLflow, ONNX, TensorFlow, PyTorch, Docker
  • SBOM & Security: Syft, Grype, Trivy, CycloneDX, SPDX
  • CI/CD: GitHub Actions, GitLab CI, Jenkins, ArgoCD
  • Infra: Kubernetes, Docker, Helm, Terraform
  • Cloud: AWS, GCP, Azure (EKS/GKE/ECS preferred)
  • Version Control: Git, GitOps
Job Responsibility
  • Model Packaging & Artifact Management: Design and implement workflows for packaging ML models using KitOps, ONNX, MLflow, or TensorFlow SavedModel
  • Manage model artifact versioning, registries, and reproducibility
  • Ensure artifact integrity, consistency, and traceability across CI/CD pipelines
  • Model Profiling & Optimization: Automate model profiling (latency, size, ops) using MLModelCI, TorchServe, or ONNX Runtime
  • Apply quantization, pruning, and format conversions (e.g., FP32→INT8) for optimization
  • Embed profiling and optimization checks into CI/CD pipelines to assess deployment readiness
  • Compliance & SBOM Generation: Develop pipelines to generate and validate SBOMs for ML models
  • Implement compliance checks for licensing, vulnerabilities, and security using CycloneDX, SPDX, Syft, or Trivy
  • Validate schema, dependencies, and runtime environments for production readiness
  • Cloud Integration & Deployment: Automate model registration, endpoint creation, and monitoring setup in AWS/GCP/Azure
  • Fulltime

Senior AI Engineer

We are seeking a Senior AI Engineer (L4, Individual Contributor) to design, buil...
Location
India, Chennai
Salary:
Not provided
Arcadia
Expiration Date
Until further notice
Requirements
  • 12+ years of professional software engineering experience
  • 3+ years in AI/ML development
  • Strong expertise in Python, PyTorch/TensorFlow, scikit-learn, and ML tooling (MLflow, LangChain)
  • Proficiency with SQL, cloud services (AWS), containers (Docker, Kubernetes), and distributed systems
  • Understanding of modern AI research (LLMs, diffusion models, transformers)
  • Experience deploying ML models in production with CI/CD
  • Strong analytical skills, ability to balance speed and rigor in experimentation
  • A passion for sustainability and the clean-energy mission
  • Experience building agentic pipelines with the latest models from Anthropic, Google, OpenAI, and more
Job Responsibility
  • Integrate with LLMs and be an expert in prompt engineering to derive the right results from the models with limited hallucination
  • Design and train ML/AI models (forecasting, NLP, graph learning, generative AI) to improve data quality, cost effectiveness, and system scalability
  • Deploy and optimize models for large-scale production workloads using Python-based services in AWS/Kubernetes environments
  • Build robust, automated data pipelines and ML Ops workflows for continuous training and deployment
  • Research and experiment with modern AI methods (transformers, foundation models, reinforcement learning) and adapt them to energy-sector challenges, including but not limited to utility statements
  • Drive performance improvements in model accuracy, latency, and cost efficiency
  • Collaborate with Product, SRE, and Analytics teams to deliver AI-enabled features across Arcadia’s platform
  • Write clean, maintainable code, contribute to architecture reviews, and mentor junior engineers
  • Build true agentic workflows with multi-step processing incorporating RAG pipelines and MCPs
What we offer
  • Competitive compensation and employee stock options
  • Hybrid/remote-first working model (India-based role, with global collaboration)
  • Flexible leave policy
  • Comprehensive medical insurance (self + family members)
  • Annual performance cycle + quarterly recognition awards
  • A supportive, diverse engineering culture grounded in empathy, teamwork, and innovation
  • Fulltime

Senior Full Stack Software Engineer

Tutor Intelligence builds software to enable ordinary robots to achieve extraord...
Location
United States, Watertown
Salary:
140000.00 - 190000.00 USD / Year
Tutor Intelligence
Expiration Date
Until further notice
Requirements
  • Strong programming skills in Python
  • Software engineering tooling: git, unix shell, etc
  • Collaborative nature and social skill set
  • Interest in robotics, AI, solving hard problems, or improving the future of humanity
  • Passion for building things (and just getting stuff done)
Job Responsibility
  • Architecting and engineering core software across one or more of: robot software, backend services, ML services, cloud infrastructure / dev-ops
  • Involvement in new project planning
What we offer
  • generous equity
  • fully covered health + dental
  • unlimited PTO
  • Fulltime

Senior Platform Machine Learning Engineer

Machine learning is the crucial enabler for every financial service EarnIn provi...
Location
United States, Mountain View
Salary:
232200.00 - 283800.00 USD / Year
EarnIn
Expiration Date
Until further notice
Requirements
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field, or relevant equivalent experience
  • 4+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms like AWS Sagemaker, Databricks, or GCP Vertex AI
  • Experience with LLM Ops, foundation model APIs, and AI engineering
  • Familiarity with data pipeline and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest machine learning and platform engineering industry trends
Job Responsibility
  • Design, build, and maintain the ML and AI platform and tools to support the end-to-end machine learning lifecycle
  • Work closely with other machine learning engineers to understand their workflows, optimize model training and deployment processes, and ensure the reproducibility of results
  • Ensure scalability, reliability, cost efficiency, and ease of use of the machine learning platform
  • Contribute to evaluating and adopting new technologies and tools to enhance our machine-learning capabilities
  • Set examples of outstanding operational excellence. Be the catalyst for step-jump changes
What we offer
  • equity and benefits
  • Fulltime

Senior Principal Technical Program Manager - ML Platform

Salary:
231300.00 - 301975.00 USD / Year
Atlassian
Expiration Date
Until further notice
Requirements
  • 8+ years of experience on software teams as Development Manager, Technical Product Manager or TPM leading technical platforms areas
  • Deep domain experience in AI and/or Search. Example: Model Inference, Model Evaluation, Model Training, LLM Ops, Semantic Search, Search Relevance, etc.
  • Partner with Engineering in defining direction, strategy and execution at Platform level
  • Strategic thinking and ability to understand business objectives to translate them into technical problems and programs.
  • Technical understanding of systems involved. Willingness to develop domain expertise in the area they operate - storage, networking, authentication, capacity management, service deployments, etc.
  • TPMs are not expected to write or read code, but are expected to understand system flows, block architectures, APIs and such.
  • Experience defining and running end-to-end complex technical programs
  • Strong leadership, organizational, and communication skills
Job Responsibility
  • Understand and stay up-to-date on latest innovations in AI and Search. Partner closely with engineering teams to translate these into practical platform evolution for Atlassian bringing value to our customers.
  • Analyze business objectives, customer needs, product adoption inhibitors and opportunities, industry trends, and based on these, in close collaboration with your stakeholders, define a long-term strategy and roadmap for your platform and product components.
  • Understand business objectives and translate them into technical systems problems that need to be prioritized and solved in the current business environment.
  • Define specific systems programs and create a plan of action for realizing those programs. Such programs could be around capacity planning, migration efforts, high availability, network architecture, performance optimization, reliability improvements and more.
  • Use your technical understanding of Atlassian and related systems to partner with and influence engineers and architects in making progress on these problems.
  • Responsible for taking a systematic approach to engineering problems. This includes: prioritizing tasks, scoping out the project, defining objectives, and making consistent progress against each of these.
  • Be accountable for the success of these technical programs by managing the entire lifecycle from initiation to forecasting, budgeting, scheduling, etc.
  • Manage complex dependencies and projects with a broad scope across the company
What we offer
  • health and wellbeing resources
  • paid volunteer days

Senior Applied AI Developer

Intellectsoft is a software development company delivering innovative solutions ...
Location
India
Salary:
Not provided
Intellectsoft
Expiration Date
Until further notice
Requirements
  • 5+ years of practical experience in Machine Learning
  • Solid understanding of ML and Deep Learning fundamentals, including model fine-tuning (especially LLMs)
  • Proficient in Python; experience with PySpark or Scala is a plus
  • Background in backend API development and working with vector databases
  • Experience with ML Ops practices such as model monitoring, tracking KPIs, and managing model drift (preferably using MLflow)
  • Hands-on experience with NLP or computer vision projects
  • Familiarity with ML frameworks such as Keras or HuggingFace
  • Experience in building feature engineering pipelines, inference workflows, and deploying models for real-time predictions
Job Responsibility
  • Develop and maintain ML engineering platforms and reusable components
  • Deploy ML models and implement feedback loops to monitor and improve their performance in production
  • Build robust, testable, and scalable code that meets high quality standards and accounts for potential edge cases
  • Collaborate with client-facing teams to gather technical requirements and align solutions with business needs
  • Participate in agile development ceremonies (e.g., scrum meetings) and clearly communicate progress, blockers, and dependencies
  • Contribute to code reviews, version control practices, and bug tracking processes
  • Conduct research and develop prototypes to evaluate emerging tools, frameworks, and architectures
What we offer
  • Awesome projects with an impact
  • Udemy courses of your choice
  • Team-buildings, events, marathons & charity activities to connect and recharge
  • Workshops, trainings, expert knowledge-sharing that keep you growing
  • Clear career path
  • Absence days for work-life balance
  • Flexible hours & work setup - work from anywhere and organize your day your way
  • Fulltime

Senior Platform Engineer - Infrastructure

Kalepa is looking for smart engineers who love to code, seek challenging problem...
Location
United States
Salary:
170000.00 - 210000.00 USD / Year
Kalepa
Expiration Date
Until further notice
Requirements
  • 5+ years of software engineering or dev-ops experience, ideally with time spent working on CI/CD pipelines, security, performance and infrastructure
  • Embody hustle and grit: you're excellent at your craft and relentless in pursuing it
  • Take full ownership of your work and drive projects to completion with minimal oversight
  • Communicate proactively, surfacing blockers, asking the right questions, keeping stakeholders in the loop
  • Thrive on experimentation: you're comfortable testing ideas, measuring results, and iterating quickly on problems no one has solved before
  • Have strong fundamentals in system design, debugging, and problem-solving
  • Know Python (our main stack), though we're open to strong candidates from other OO backgrounds
  • Have experience with serverless technologies (Lambda, EC2), asynchronous workflows, and cloud environments
  • Are comfortable with relational databases, ideally PostgreSQL
  • Know Terraform, IaC best practices, Docker or similar tools
Job Responsibility
  • Work on the infrastructure and tooling that power our core platform
  • Maintain CI/CD pipelines, automate operational workflows, and ensure we stay secure and compliant with industry standards
  • Improve infrastructure performance and reliability, monitor and optimize system cost, and collaborate with engineering teams to standardize environments and deployment patterns
  • Help to maintain observability and incident response processes, and ensure the scalability of our cloud-based systems as we continue to grow
  • Collaborate with a global team of full-stack, data, ML, and DevOps engineers to build scalable backend solutions
What we offer
  • Significant equity options package
  • 20 days of PTO a year
  • Global team offsites
  • 100% covered PPO medical, 100% covered vision and dental for individuals and families
  • Healthy living/gym stipend
  • Mobile phone bill stipend
  • Continuing education credits
  • 401(k) plan with employer contribution (regardless of employee contribution)
  • Fulltime