CrawlJobs Logo

Ai Ops Platform Engineer

barclays.co.uk Logo

Barclays

Location Icon

Location:
United Kingdom , London

Category Icon

Job Type Icon

Contract Type:
Not provided

Salary Icon

Salary:

Not provided

Job Description:

Join us as an AI Ops Engineer, to build and run an enterprise AI Factory within our Card Merchant Services organisation, enabling AI‑driven change across the merchant payments lifecycle. This role focuses on acquiring, risk and fraud, and merchant servicing, delivering a secure, scalable, and well‑governed AI platform that operates effectively in a highly regulated payments environment. You will be accountable for the end‑to‑end operationalisation of AI, spanning model, prompt, and agent lifecycles; deployment and monitoring; guardrails; and cost optimisation, ensuring AI solutions are production‑ready, auditable, compliant, and scalable across merchant payment use cases. You will also be accountable for the end‑to‑end engineering of GenAI and ML platforms, embedding governance, observability and operational resilience by design, hile enabling teams to deploy and run AI solutions with clarity, assurance and accountability at scale.

Job Responsibility:

  • Build and run an enterprise AI Factory within our Card Merchant Services organisation, enabling AI‑driven change across the merchant payments lifecycle
  • Accountable for the end‑to‑end operationalisation of AI, spanning model, prompt, and agent lifecycles
  • deployment and monitoring
  • guardrails
  • and cost optimisation, ensuring AI solutions are production‑ready, auditable, compliant, and scalable across merchant payment use cases
  • Accountable for the end‑to‑end engineering of GenAI and ML platforms, embedding governance, observability and operational resilience by design, while enabling teams to deploy and run AI solutions with clarity, assurance and accountability at scale
  • Lead and manage engineering teams, providing technical guidance, mentorship, and support to ensure the delivery of high-quality software solutions
  • Oversee timelines, team allocation, risk management and task prioritization
  • Mentor and support team members' professional growth, conduct performance reviews, provide actionable feedback, and identify opportunities for improvement
  • Evaluation and enhancement of engineering processes, tools, and methodologies
  • Collaboration with business partners, product managers, designers, and other stakeholders to translate business requirements into technical solutions
  • Enforcement of technology standards, facilitate peer reviews, and implement robust testing practices

Requirements:

  • LLMOps / MLOps at production scale, operating the full Generative AI lifecycle including models, prompts and agents, CI/CD pipelines, structured evaluation, drift and hallucination monitoring, and controlled, auditable release processes suitable for banking environments
  • Cloud‑native AI platform engineering on AWS, with hands‑on delivery using services such as Amazon Bedrock for foundation models, agent orchestration patterns, Lambda and Step Functions, alongside demonstrated Python engineering capability and secure microservices and API design
  • AI governance, observability and cost optimisation, embedding governance by design through policy as code, alignment to model risk framework expectations, lifecycle traceability and audit‑ready evidence, supported by SRE‑grade monitoring and ongoing optimisation of token usage and compute cost across AI workloads

Nice to have:

  • Retrieval Augmented Generation (RAG) and vector database implementation, with practical experience using technologies such as OpenSearch, FAISS or similar to support scalable, production‑ready retrieval workflows
  • Data pipeline engineering, building and operating AI‑ready pipelines using AWS Glue, S3 and related services to support model training, inference and evaluation
  • Advanced observability and reliability engineering, including experience with CloudWatch, OpenTelemetry and established production resilience patterns for AI workloads in critical banking systems
What we offer:
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution

Additional Information:

Job Posted:
May 05, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work
Job Link Share:

Looking for more opportunities? Search for other job offers that match your skills and interests.

Briefcase Icon

Similar Jobs for Ai Ops Platform Engineer

ML Ops Engineer

As an MLOps Engineer, you will be responsible for building, maintaining, and opt...
Location
Location
India , Hyderabad
Salary
Salary:
Not provided
nstarxinc.com Logo
NStarX
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 4 to 10 years of experience in MLOps, DevOps, or ML Engineering
  • Strong proficiency with cloud platforms such as AWS, Azure, or GCP
  • Experience with containerization and orchestration tools like Docker and Kubernetes
  • Hands-on experience with ML model deployment, monitoring, and scaling
  • Proficiency with CI/CD tools such as Jenkins or GitLab CI
  • Familiarity with data versioning and management tools such as DVC
  • Strong coding skills in Python with knowledge of ML libraries like TensorFlow or PyTorch
  • Strong problem-solving skills and ability to work in a collaborative environment
  • Effective communication skills for cross-functional teamwork
Job Responsibility
Job Responsibility
  • Develop and manage infrastructure for end-to-end ML workflows including model training, deployment, monitoring, and maintenance
  • Implement CI/CD pipelines for ML models and data workflows
  • Collaborate with cross-functional teams to build scalable and robust ML infrastructure on cloud and on-premises environments
  • Monitor and optimize model performance and infrastructure to ensure efficient resource usage
  • Manage data versioning and model versioning across multiple environments
  • Implement security, governance, and compliance protocols in ML deployment and data pipelines
  • Support troubleshooting, debugging, and incident management for ML infrastructure issues
What we offer
What we offer
  • Competitive compensation
  • Opportunity to work with a dynamic team on cutting-edge AI and ML solutions
  • Professional growth and development opportunities
  • Fulltime
Read More
Arrow Right

Senior AI Engineer

We are seeking a Senior AI Engineer (L4, Individual Contributor) to design, buil...
Location
Location
India , Chennai
Salary
Salary:
Not provided
arcadia.com Logo
Arcadia
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 12+ years of professional software engineering experience
  • 3+ years in AI/ML development
  • Strong expertise in Python, PyTorch/TensorFlow, scikit-learn, and ML tooling (MLflow, LangChain)
  • Proficiency with SQL, cloud services (AWS), containers (Docker, Kubernetes), and distributed systems
  • Understanding of modern AI research (LLMs, diffusion models, transformers)
  • Experience deploying ML models in production with CI/CD
  • Strong analytical skills, ability to balance speed and rigor in experimentation
  • A passion for sustainability and the clean-energy mission
  • Experienced with building agentic pipelines with the latest models from Anthropic, Google, OpenAI, and more
Job Responsibility
Job Responsibility
  • Integrate with LLMs and be an expert in prompt engineering to derive the right results from the models with limited hallucination
  • Design and train ML/AI models (forecasting, NLP, graph learning, generative AI) to improve data quality, cost effectiveness, and system scalability
  • Deploy and optimize models for large-scale production workloads using Python-based services in AWS/Kubernetes environments
  • Build robust, automated data pipelines and ML Ops workflows for continuous training and deployment
  • Research and experiment with modern AI methods (transformers, foundation models, reinforcement learning) and adapt them to energy-sector challenges not limited to utility statements
  • Drive performance improvements in model accuracy, latency, and cost efficiency
  • Collaborate with Product, SRE, and Analytics teams to deliver AI-enabled features across Arcadia’s platform
  • Write clean, maintainable code, contribute to architecture reviews, and mentor junior engineers
  • Build true agentic workflows with multi-step processing incorporating RAG pipelines and MCPs
What we offer
What we offer
  • Competitive compensation and employee stock options
  • Hybrid/remote-first working model (India-based role, with global collaboration)
  • Flexible leave policy
  • Comprehensive medical insurance (self + family members)
  • Annual performance cycle + quarterly recognition awards
  • A supportive, diverse engineering culture grounded in empathy, teamwork, and innovation
  • Fulltime
Read More
Arrow Right

Senior Platform Machine Learning Engineer

Machine learning is the crucial enabler for every financial service EarnIn provi...
Location
Location
United States , Mountain View
Salary
Salary:
232200.00 - 283800.00 USD / Year
earnin.com Logo
EarnIn
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field, or relevant equivalent experience
  • 4+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms like AWS Sagemaker, Databricks, or GCP Vertex AI
  • Experience with LLM Ops, foundation model APIs, and AI engineering
  • Familiarity with data pipeline and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest machine learning and platform engineering industry trends
Job Responsibility
Job Responsibility
  • Design, build, and maintain the ML and AI platform and tools to support the end-to-end machine learning lifecycle
  • Work closely with other machine learning engineers to understand their workflows, optimize model training and deployment processes, and ensure the reproducibility of results
  • Ensure scalability, reliability, cost efficiency, and ease of use of the machine learning platform
  • Contribute to evaluating and adopting new technologies and tools to enhance our machine-learning capabilities
  • Set examples of outstanding operational excellence. Be the catalyst for step-jump changes
What we offer
What we offer
  • equity and benefits
  • Fulltime
Read More
Arrow Right

Staff Machine Learning Engineer

Machine learning is the crucial enabler for every financial service EarnIn provi...
Location
Location
United States , Mountain View
Salary
Salary:
272700.00 - 333300.00 USD / Year
earnin.com Logo
EarnIn
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field, or relevant equivalent experience
  • 7+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms like AWS Sagemaker, Databricks, or GCP Vertex AI
  • Experience with LLM Ops, foundation model APIs, and AI engineering
  • Familiarity with data pipeline and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest machine learning and platform engineering industry trends
Job Responsibility
Job Responsibility
  • Design, build, and maintain the ML and AI platform and tools to support the end-to-end machine learning lifecycle
  • Work closely with other machine learning engineers to understand their workflows, optimize model training and deployment processes, and ensure the reproducibility of results
  • Ensure scalability, reliability, cost efficiency, and ease of use of the machine learning platform
  • Contribute to evaluating and adopting new technologies and tools to enhance our machine-learning capabilities
  • Set examples of outstanding operational excellence. Be the catalyst for step-jump changes
  • Be a mentor, coach, leader and inspiration to those around you
What we offer
What we offer
  • equity and benefits
  • Fulltime
Read More
Arrow Right

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location
Location
United States , San Francisco
Salary
Salary:
180000.00 - 270000.00 USD / Year
plaid.com Logo
Plaid
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility
Job Responsibility
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer
What we offer
  • medical, dental, vision, and 401(k)
  • Fulltime
Read More
Arrow Right

Senior Principal Technical Program Manager - ML Platform

Location
Location
Salary
Salary:
231300.00 - 301975.00 USD / Year
https://www.atlassian.com Logo
Atlassian
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 8+ years of experience on software teams as Development Manager, Technical Product Manager or TPM leading technical platforms areas
  • Deep domain experience in AI and/or Search. Example: Model Inference, Model Evaluation, Model Training, LLM Ops, Semantic Search, Search Relevance, etc.
  • Partner with Engineering in defining direction, strategy and execution at Platform level
  • Strategic thinking and ability to understand business objectives to translate them into technical problems and programs.
  • Technical understanding of systems involved. Willingness to develop domain expertise in the area they operate - storage, networking, authentication, capacity management, service deployments, etc.
  • TPMs are not expected to write or read code, but are expected to understand system flows, block architectures, APIs and such.
  • Experience defining and running end-to-end complex technical programs
  • Strong leadership, organizational, and communication skills
Job Responsibility
Job Responsibility
  • Understand and stay up-to-date on latest innovations in AI and Search. Partner closely with engineering teams to translate these into practical platform evolution for Atlassian bringing value to our customers.
  • Analyze business objectives, customer needs, product adoption inhibitors and opportunities, industry trends, and based on these, in close collaboration with your stakeholders, define a long-term strategy and roadmap for your platform and product components.
  • Understand business objectives and translate them into technical systems problems that need to be prioritized solved in the current business environment.
  • Define specific systems programs and create a plan of action for realizing those programs. Such programs could be around capacity planning, migration efforts, high availability, network architecture, performance optimization, reliability improvements and more.
  • Use your technical understanding of Atlassian and related systems to partner with and influence engineers and architects in making progress on these problems.
  • Responsible for taking a systematic approach to engineering problems. This includes: prioritizing tasks, scoping out the project, defining objectives, and making consistent progress against each of these.
  • Be accountable for the success of these technical programs by managing the entire lifecycle from initiation to forecasting, budgeting, scheduling, etc.
  • Manage complex dependencies and projects with a broad scope across the company
What we offer
What we offer
  • health and wellbeing resources
  • paid volunteer days
Read More
Arrow Right

Principal Engineering Manager, Core Platform & AI Systems

We are looking for a Principal Engineering Manager, Core Platform & AI Systems t...
Location
Location
United States , Seattle
Salary
Salary:
208000.00 - 313000.00 USD / Year
highspot.com Logo
Highspot
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 10+ years of software engineering experience with 4+ years in engineering leadership roles
  • Experience managing senior/principal engineers, ideally across multiple functional areas
  • Strong technical background in cloud-native distributed systems, platform engineering, or AI/ML infrastructure
  • Proven track record of scaling SaaS platforms and leading teams responsible for mission-critical backend systems
  • Experience working closely with cross-functional teams such as Product, Infrastructure, AI/ML, and Security
  • Deep understanding of reliability, operational excellence, and cost optimization in cloud environments (AWS, Azure, GCP)
  • Excellent communication, collaboration, and executive stakeholder management skills
  • Passion for developing people and building strong, healthy engineering teams
Job Responsibility
Job Responsibility
  • Lead and grow the Core Platform & AI Systems team
  • Drive the technical roadmap for the platform, ensuring scalability, performance, availability, and cost-efficiency
  • Partner closely with product engineering teams to deliver platform capabilities that unlock business features while simplifying the developer experience
  • Collaborate with Data Science, ML, and AI teams to provide robust ML Ops and AI infrastructure that enables rapid experimentation and production-grade AI deployments
  • Own platform-wide reliability and operational health, continuously investing in observability, incident management, and system resilience
  • Contribute to architectural decisions that shape the long-term direction of Highspot’s SaaS platform
  • Attract, retain, and develop top engineering talent, building a high-performing and inclusive team culture
  • Communicate effectively with senior leadership, providing visibility into roadmap progress, technical trade-offs, and organizational needs
What we offer
What we offer
  • Comprehensive medical, dental, vision, disability, and life benefits
  • Health Savings Account (HSA) with employer contribution
  • 401(k) Matching with immediate vesting on employer match
  • Flexible PTO
  • 8 paid holidays and 5 paid days for Annual Holiday Week
  • Quarterly Recharge Fridays (paid days off for mental health recharge)
  • 18 weeks paid parental leave
  • Access to Coaches and Therapists through Modern Health
  • 2 volunteer days per year
  • Commuting benefits
  • Fulltime
Read More
Arrow Right

Principal Engineer

The Principal AI/ML Operations Engineer leads the architecture, automation, and ...
Location
Location
United States , Pleasanton, California
Salary
Salary:
251000.00 - 314500.00 USD / Year
blackline.com Logo
BlackLine
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field
  • 10+ years in ML infrastructure, DevOps, and software system architecture
  • 4+ years in leading MLOps or AI Ops platforms
  • Strong programming skills in languages such as Python, Java, or Scala
  • Expertise in ML frameworks (TensorFlow, PyTorch, scikit-learn) and orchestration tools (Airflow, Kubeflow, Vertex AI, MLflow)
  • Proven experience operating production pipelines for ML and LLM-based systems across cloud ecosystems (GCP, AWS, Azure)
  • Deep familiarity with LangChain, LangGraph, ADK or similar agentic system runtime management
  • Strong competencies in CI/CD, IaC, and DevSecOps pipelines integrating testing, compliance, and deployment automation
  • Hands-on with observability stacks (Prometheus, Grafana, Newrelic) for model and agent performance tracking
  • Understanding of governance frameworks for Responsible AI, auditability, and cost metering across training and inference workloads
Job Responsibility
Job Responsibility
  • Define enterprise-level standards and reference architectures for ML-Ops and AIOps systems
  • Partner with data science, security, and product teams to set evaluation and governance standards (Guardrails, Bias, Drift, Latency SLAs)
  • Mentor senior engineers and drive design reviews for ML pipelines, model registries, and agentic runtime environments
  • Lead incident response and reliability strategies for ML/AI systems
  • Lead the deployment of AI models and systems in various environments
  • Collaborate with development teams to integrate AI solutions into existing workflows and applications
  • Ensure seamless integration with different platforms and technologies
  • Define and manage MCP Registry for agentic component onboarding, lifecycle versioning, and dependency governance
  • Build CI/CD pipelines automating LLM agent deployment, policy validation, and prompt evaluation of workflows
  • Develop and operationalize experimentation frameworks for agent evaluations, scenario regression, and performance analytics
What we offer
What we offer
  • short-term and long-term incentive programs
  • robust offering of benefit and wellness plans
  • Fulltime
Read More
Arrow Right