AI Engineer - Platform

United Kingdom, London · Job Posted March 20, 2026

Job Description

At hx, AI is central to how we build software and make decisions across the company. From large-scale document intelligence to a domain-specific AI peer-programmer used by actuaries and underwriters, hx is redefining how the insurance industry uses generative AI in production. The AI Platform team provides the platform that makes all of this possible. We design, build, and operate the shared systems that allow teams to train, deploy, evaluate, and monitor AI safely at scale — including model serving, data pipelines, observability, deployment automation, and compute orchestration.

Job Responsibility

  • Designing and operating scalable AI infrastructure for LLM inference, prompt management, and evaluation pipelines, supporting billions in premium flow
  • Building self-service tools, SDKs, and APIs that empower product teams to move from prototype to production 30% faster
  • Instrumenting production AI/ML workloads with standardised logging, tracing, and evaluation metrics, increasing observability coverage to 100% of deployed models
  • Implementing intelligent routing, caching, and provider optimisation via the LLM gateway, reducing AI compute costs by up to 25%
  • Driving adoption of shared platform services (LLM gateway, evaluation frameworks, monitoring) to replace bespoke solutions, increasing platform adoption across new AI features
  • Championing developer experience by delivering comprehensive documentation and responsive support, resulting in higher internal customer satisfaction

Requirements

  • Built and deployed production AI infrastructure that scaled to support enterprise-grade reliability and observability
  • Delivered self-service tools or APIs that enabled multiple product teams to accelerate their AI/ML development cycles
  • Implemented evaluation frameworks, A/B testing infrastructure, or monitoring solutions that measured and improved model performance, latency, cost, and quality in production
  • Led initiatives to reduce AI compute costs through optimisation strategies such as intelligent routing or caching
  • Successfully migrated teams from bespoke AI solutions to shared platform services, driving measurable adoption
  • Prioritised and improved developer experience through documentation, support, or workflow enhancements

What we offer

  • £5,000 training and conference budget for individual and group development
  • 25 days of holiday plus 8 bank holidays (33 days total)
  • Company pension scheme via Penfold
  • Mental health support and therapy via Spectrum.life
  • Individual wellbeing allowance via Juno
  • Private healthcare insurance through AXA
  • Income protection and life insurance
  • Cycle to Work Scheme
  • Top-spec equipment (laptop, screens, adjustable desks, etc.)
  • Regular remote and in-person hackathons, lunch and learns, socials, and game nights
  • Team breakfasts and lunches, snacks, drinks fridge, and a fun office at The Ministry
  • Exceptional opportunities for personal development and growth as we build something remarkable together

Similar Jobs for AI Engineer - Platform (8 matching positions)

AI Engineer - Platform

At hyperexponential, we’re building the AI-powered platform that enables the wor...
Location: Poland, Warsaw
Salary: Not provided
Company: hyperexponential
Expiration Date: Until further notice
Requirements
  • Built and deployed production AI infrastructure that scaled to support enterprise-grade reliability and observability
  • Delivered self-service tools or APIs that enabled multiple product teams to accelerate their AI/ML development cycles
  • Implemented evaluation frameworks, A/B testing infrastructure, or monitoring solutions that measured and improved model performance, latency, cost, and quality in production
  • Led initiatives to reduce AI compute costs through optimisation strategies such as intelligent routing or caching
  • Successfully migrated teams from bespoke AI solutions to shared platform services, driving measurable adoption
  • Prioritised and improved developer experience through documentation, support, or workflow enhancements
Job Responsibility
  • Designing and operating scalable AI infrastructure for LLM inference, prompt management, and evaluation pipelines, supporting billions in premium flow
  • Building self-service tools, SDKs, and APIs that empower product teams to move from prototype to production 30% faster
  • Instrumenting production AI/ML workloads with standardised logging, tracing, and evaluation metrics, increasing observability coverage to 100% of deployed models
  • Implementing intelligent routing, caching, and provider optimisation via the LLM gateway, reducing AI compute costs by up to 25%
  • Driving adoption of shared platform services (LLM gateway, evaluation frameworks, monitoring) to replace bespoke solutions, increasing platform adoption across new AI features
  • Championing developer experience by delivering comprehensive documentation and responsive support, resulting in higher internal customer satisfaction
What we offer
  • Share Options at a highly successful Series B company
  • 25 non-working days + Polish bank holidays (B2B) / 26 days of holiday + Polish bank holidays (UoP)
  • £5,000 budget for Learning & Development
  • Mental Health Support and Therapy via Spectrum Life
  • Optional: access to Private Healthcare via Luxmed + Multisport (self-funded for B2B contractors)
  • Top-spec laptop (macOS or Windows)
  • Company pension (UoP)
  • Company Sick Pay for 10 days at 100% salary (UoP)
  • Monthly Wellbeing allowance via Juno (UoP)
  • Private Healthcare Insurance via Luxmed (UoP)
Employment type: Full-time

GCP AI Platform Architect / Lead AI Platform Engineer

Our client is an innovative technology company specializing in the development o...
Location: Poland, Kraków
Salary: Not provided
Company: TeamQuest Sp. z o. o.
Expiration Date: Until further notice
Requirements
  • GCP Expertise (verifiable - ask for production examples): GCP is their primary cloud, not secondary experience alongside AWS/Azure. Production deployments across most of: Vertex AI, Cloud Run or GKE, Pub/Sub, BigQuery, Secret Manager, VPC Service Controls, IAM + Workload Identity. Has designed for GCP from scratch with end-to-end ownership, not migrated from another cloud.
  • AI / Backend Engineering: Python is the primary language - production-grade service/API development, not scripting or data science only. Strong track record building distributed systems and integrating LLMs.
  • Agentic Architecture (must be production, not PoC): Hands-on production experience with at least one: LangGraph, Google ADK, CrewAI, or custom multi-agent orchestration layer. RAG pipelines shipped to production. Google ADK: candidate must be able to explain what it is, when to use it, and how it compares to LangGraph and custom orchestration. AI agent workflows, ReAct prompting, and Function Calling in production environments
  • Multi-Tenant Architecture: Has designed a multi-tenant SaaS platform end-to-end - not just contributed. Can articulate tenant isolation strategies: IAM boundary design, data isolation per tenant, VPC controls.
  • API Design & Integrations: Proven ability to create secure, high-performance APIs capable of asynchronously managing traffic and communication between multiple decoupled services.
  • Enterprise Security: Practical knowledge of data isolation in multi-tenant SaaS architectures, IAM, and securing cloud-based environments.
  • Vector Databases: Hands-on experience with semantic search and at least one of: Pinecone, Weaviate, pgvector, or Vertex Matching Engine.
Job Responsibility
  • System Architecture: Design and develop a scalable, cloud-native architecture on Google Cloud Platform (GCP) that meets enterprise security and multi-tenant data isolation requirements for a SaaS environment
  • AI Agent Orchestration: Architect and implement autonomous, multi-step AI workflows with a clear separation of agent responsibilities (retrieval, analysis, reasoning, response generation)
  • Hands-on Core Development: Actively contribute to core system development: coding orchestration logic, designing services, optimizing performance, and building secure API integrations for routing queries across internal and external agents
  • Frontend Enablement: Design the backend layer, streaming protocols, and APIs to seamlessly support and integrate with advanced conversational UIs
  • Data Management & Extensibility: Build a robust backend capable of processing qualitative and social data, ensuring the platform is easily extensible to incorporate new data sources
What we offer
  • Attractive salary
  • Full remote work
  • Social benefits: sport card, healthcare insurance
Employment type: Full-time

GCP AI Platform Architect / Lead AI Platform Engineer

Our client is an innovative technology company specializing in the development o...
Location: Poland, Katowice
Salary: Not provided
Company: TeamQuest Sp. z o. o.
Expiration Date: Until further notice
Requirements
  • GCP Expertise (verifiable - ask for production examples): production deployments across most of: Vertex AI, Cloud Run or GKE, Pub/Sub, BigQuery, Secret Manager, VPC Service Controls, IAM + Workload Identity
  • Has designed for GCP from scratch, not migrated from another cloud, end-to-end ownership
  • AI / Backend Engineering: Python is the primary language - production-grade service/API development, not scripting or data science only
  • Strong track record building distributed systems and integrating LLMs
  • Agentic Architecture (must be production, not PoC): Hands-on production experience with at least one: LangGraph, Google ADK, CrewAI, or custom multi-agent orchestration layer
  • RAG pipelines shipped to production
  • Google ADK: candidate must be able to explain what it is, when to use it, and how it compares to LangGraph and custom orchestration
  • AI agent workflows, ReAct prompting, and Function Calling in production environments
  • Multi-Tenant Architecture: Has designed a multi-tenant SaaS platform end-to-end - not just contributed
  • Can articulate tenant isolation strategies: IAM boundary design, data isolation per tenant, VPC controls
Job Responsibility
  • System Architecture: Design and develop a scalable, cloud-native architecture on Google Cloud Platform (GCP) that meets enterprise security and multi-tenant data isolation requirements for a SaaS environment
  • AI Agent Orchestration: Architect and implement autonomous, multi-step AI workflows with a clear separation of agent responsibilities (retrieval, analysis, reasoning, response generation)
  • Hands-on Core Development: Actively contribute to core system development: coding orchestration logic, designing services, optimizing performance, and building secure API integrations for routing queries across internal and external agents
  • Frontend Enablement: Design the backend layer, streaming protocols, and APIs to seamlessly support and integrate with advanced conversational UIs
  • Data Management & Extensibility: Build a robust backend capable of processing qualitative and social data, ensuring the platform is easily extensible to incorporate new data sources
What we offer
  • Attractive salary
  • Full remote work
  • Social benefits: sport card, healthcare insurance
Employment type: Full-time

Senior ML Platform Engineer, AI Platform

We are seeking a skilled and passionate ML Platform Engineer to join our team an...
Location: Singapore, Singapore
Salary: Not provided
Company: Airwallex
Expiration Date: Until further notice
Requirements
  • 5+ years in backend software development
  • at least 2 years focused on AI/ML Platform or MLOps infrastructure
  • deep expertise in MLOps practices, including automated deployment pipelines, model optimization, and production lifecycle management
  • proven experience designing and implementing low-latency model serving solutions
  • proficiency in Python
  • skill in writing high-quality, maintainable code
  • experience in design and development of large-scale distributed, high concurrency, low-latency inference, high availability systems
  • excellent communication and mentoring abilities
  • a relevant degree in Computer Science, Mathematics or related fields
Job Responsibility
  • Platform Development: Design, build, and maintain the end-to-end MLOps platform using Kubernetes and Cloud Services
  • Infrastructure as Code (IaC): Use Terraform or similar tools to manage, provision, and scale all ML-related infrastructure securely and efficiently
  • Pipeline Automation: Implement and optimize CI/CD/CT (Continuous Integration, Delivery, Training) pipelines to automate model training, testing, packaging, and deployment using tools like Argo and Kubeflow Pipelines
  • Serving Infrastructure: Build highly available, low-latency, and high-throughput model serving infrastructure
  • Observability: Implement robust monitoring, alerting, and logging solutions to track infrastructure health, model performance, and data/model drift
  • Tooling & Support: Evaluate, integrate, and support ML tools such as Feature Stores and distributed model training pipelines
  • Security & Compliance: Ensure platform security, implement RBAC (Role-Based Access Control), and manage secrets for sensitive data and production environments
  • Collaboration: Work closely with Data Scientists and ML Engineers to understand their needs and provide technical guidance on best practices for scaling their models
Employment type: Full-time

Senior Lead AI Engineer (Gen AI Platform Services)

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location: United States, San Jose, California; New York, New York
Salary: 250800.00 - 286200.00 USD / Year
Company: Capital One
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 6 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 4 years of experience developing AI and ML algorithms or technologies
  • At least 6 years of experience programming with Python, Go, Scala, or Java
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products that change how our associates work and how our customers interact with Capital One
  • Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability, etc.
  • Leverage a broad stack of Open Source and SaaS AI technologies such as AWS Ultraclusters, Huggingface, VectorDBs, Nemo Guardrails, PyTorch, and more
  • Invent and introduce state-of-the-art LLM optimization techniques to improve the performance — scalability, cost, latency, throughput — of large scale production AI systems
  • Contribute to the technical vision and the long term roadmap of foundational AI systems at Capital One
What we offer
  • Performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial and other benefits
Employment type: Full-time

Sr. Distinguished AI Engineer (Agentic AI Platform)

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location: United States, San Jose, California; San Francisco, California
Salary: 343400.00 - 392000.00 USD / Year
Company: Capital One
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Engineering, or AI plus at least 10 years of experience developing AI and ML algorithms or technologies, or Master's degree plus at least 8 years of experience developing AI and ML algorithms or technologies
  • At least 10 years of experience programming with Python, Go, Scala, or Java
  • 9 years of experience deploying scalable and responsible AI solutions on cloud platforms
  • 2+ years of experience supporting Agentic Frameworks
  • 2+ years of experience with LLMOps
  • 8+ years of experience designing mission-critical machine learning platforms
  • 2+ years of experience architecting, designing, developing, integrating, delivering, and supporting complex AI systems
  • Demonstrated ability to lead and mentor multiple engineering teams and influence cross-functional stakeholders up to the VP level
  • Experience developing AI and ML algorithms or technologies using Python, C++, C#, Java, or Golang
  • Master's degree in Computer Science, Computer Engineering, or relevant technical field
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products
  • Contribute to the north star platform architecture, continuously publishing and refining living diagrams and canonical APIs
  • Standardize and automate agentic workflows
  • Contribute to crafting an end to end GenAI SDK, CLI and starter kits
  • Help bring together a vision of central guardrail services
  • Collaborate with cross organization architects to drive end to end performance
  • Accelerate innovation by incubating proof of concepts and driving RFCs
  • Own central Helm charts, operators and CRDs that auto scale agents to hit tenant SLAs
  • Coach and evangelize: host architecture office hours, mentor Staff, Principal and Senior engineers, author technical design documents and blogs, and represent Capital One at Tier 1 AI conferences
What we offer
  • Performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial and other benefits
Employment type: Full-time

Senior Lead AI Engineer (Gen AI Platform Services)

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location: United States, San Jose; San Francisco; New York; Cambridge; McLean
Salary: 229900.00 - 286200.00 USD / Year
Company: Capital One
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 6 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 4 years of experience developing AI and ML algorithms or technologies
  • At least 6 years of experience programming with Python, Go, Scala, or Java
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products that change how our associates work and how our customers interact with Capital One
  • Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability, etc.
  • Leverage a broad stack of Open Source and SaaS AI technologies such as AWS Ultraclusters, Huggingface, VectorDBs, Nemo Guardrails, PyTorch, and more
  • Invent and introduce state-of-the-art LLM optimization techniques to improve the performance — scalability, cost, latency, throughput — of large scale production AI systems
  • Contribute to the technical vision and the long term roadmap of foundational AI systems at Capital One
What we offer
  • Performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
Employment type: Full-time

Staff Software Engineer, Managed AI - AI Platform

Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'...
Location: United States, San Francisco, CA; Sunnyvale, CA
Salary: 208725.00 - 253000.00 USD / Year
Company: Crusoe
Expiration Date: Until further notice
Requirements
  • Advanced degree in Computer Science/Engineering
  • 8-10+ years of industry experience with demonstrated history of consistent success leading a varied portfolio of initiatives across your function
  • Experience with distributed systems, cloud services (compute, storage, networking, database), and delivering early-stage projects quickly
  • Experience with Generative AI (LLMs, Multimodal) and familiarity with AI infrastructure (training, inference, ETL pipelines)
  • Proficient with container runtimes (e.g., Kubernetes), microservices, REST APIs, gRPC, and the full software development lifecycle including CI/CD
Job Responsibility
  • Lead the design and implementation of core AI services, including resilient fault-tolerant queues for efficient task distribution, model catalogs for managing and versioning AI models, and scheduling mechanisms optimized for cost and performance
  • Architect and scale infrastructure to handle millions of API requests per second
  • Implement robust monitoring and alerting to ensure system health and 24/7 availability
  • Collaborate closely with product management, business strategy, and other engineering teams to define the AI platform roadmap
  • Influence the long-term vision and architectural decisions of the platform
  • Contribute to open-source AI frameworks and actively participate in the AI community
  • Prototype and rapidly iterate on emerging technologies and new features
What we offer
  • Restricted Stock Units in a fast growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
Employment type: Full-time