AI Inference Engineer Job at Perplexity (London)

Director of AI Engineering

We are entering a hyper-growth phase of AI innovation and are hiring a Director ...

Location

Canada; United States

Salary:

300000.00 - 450000.00 USD / Year

Apollo.io

Expiration Date

Until further notice

Requirements

10–15+ years in software engineering, with significant leadership experience owning AI/ML or applied LLM systems at scale
Proven history shipping LLM-powered features, agentic workflows, or AI assistants used by real customers in production
Deep understanding of LLM orchestration frameworks (LangChain, LlamaIndex), RAG pipelines, vector search, embeddings, and prompt engineering
Expert in backend & distributed systems (Python strongly preferred) and cloud infrastructure (AWS/GCP)
Strong experience with telemetry, observability, and cost-aware real-time inference optimizations
Demonstrated ability to lead senior engineers, define technical roadmaps, and deliver outcomes aligned to business metrics
Experience building or scaling teams working on experimentation, optimization, personalization, or ML-powered growth systems
Exceptional ability to simplify complex problems, set clear standards, and drive alignment across Product, Data, Design, and Engineering
Strong product sense, ability to weigh novelty vs. impact, focus on user value, and prioritize speed with guardrails
Fluent in integrating AI tools into engineering workflows for code generation, debugging, delivery velocity, and operational efficiency

Job Responsibility

Define the multi-year technical vision for Apollo’s AI stack, spanning agents, orchestration, inference, retrieval, and platformization
Prioritize high-impact AI investments by partnering with Product, Design, Research, and Data leaders to align engineering outcomes with business goals
Establish technical standards, evaluation criteria, and success metrics for every AI-powered feature shipped
Lead the architecture and deployment of long-horizon autonomous agents, multi-agent workflows, and API-driven orchestration frameworks
Build reusable, scalable agentic components that power GTM workflows like research, enrichment, sequencing, lead scoring, routing, and personalization
Own the evolution of Apollo’s internal LLM platform for high-scale, low-latency, cost-optimized inference
Oversee model-driven experiences for natural-language interfaces, RAG pipelines, semantic search, personalized recommendations, and email intelligence
Partner with Product & Design to build intuitive conversational UX that hides underlying complexity while elevating user productivity
Implement rigorous evaluation frameworks, including offline benchmarking, human-in-the-loop review, and online A/B experimentation
Ensure robust observability, monitoring, and safety guardrails for all AI systems in production

What we offer

Equity
Company bonus or sales commissions/bonuses
401(k) plan
At least 10 paid holidays per year
Flex PTO
Parental leave
Employee assistance program and wellbeing benefits
Global travel coverage
Life/AD&D/STD/LTD insurance
FSA/HSA

Fulltime

Data Engineer – AI Insights

We are looking for an experienced Data Engineer with AI Insights to design and d...

Location

United States

Salary:

Not provided

Thirdeye Data

Expiration Date

Until further notice

Requirements

5+ years of Data Engineering experience with exposure to AI/ML workflows
Advanced expertise in Python programming and SQL
Hands-on experience with Snowflake (data warehousing, schema design, performance tuning)
Experience building scalable ETL/ELT pipelines and integrating structured/unstructured data
Familiarity with LLM and RAG workflows, and how data supports these AI applications
Experience with reporting/visualization tools (Tableau)
Strong problem-solving, communication, and cross-functional collaboration skills

Job Responsibility

Develop and optimize ETL/ELT pipelines using Python, SQL, and Snowflake to ensure high-quality data for analytics, AI, and LLM workflows
Build and manage Snowflake data models and warehouses, focusing on performance, scalability, and security
Collaborate with AI/ML teams to prepare datasets for model training, inference, and LLM/RAG-based solutions
Automate data workflows, validation, and monitoring for reliable AI/ML execution
Support RAG pipelines and LLM data integration, enabling AI-driven insights and knowledge retrieval
Partner with business and analytics teams to transform raw data into actionable AI-powered insights
Contribute to dashboarding and reporting using Tableau, Power BI, or equivalent tools

Fulltime

AI Software Engineer

Join Qargo as an AI Software Engineer and help build intelligent, user-centric A...

Location

Belgium, Ghent

Salary:

Not provided

Qargo

Expiration Date

Until further notice

Requirements

Min. 2 years of experience in software engineering, applied AI, or similar technical roles
Strong programming skills (preferably Python and/or modern backend languages)
Experience with AI/ML tools and frameworks such as PyTorch, Hugging Face, LangChain/LangGraph, vector databases, and inference tooling
Proven experience deploying and operating AI/ML systems in a production environment
Ability to experiment quickly, iterate fast, and validate assumptions
Strong problem-solving skills and the ability to work autonomously in a fast-paced environment
Clear communication skills and the ability to collaborate with engineers, product managers, and domain experts

Job Responsibility

Evaluate and prototype with new AI models and techniques to solve document, workflow, and conversational tasks
Bring AI prototypes to production, ensuring quality, scalability, and observability
Monitor and maintain AI systems running in production, optimising cost, latency, and reliability
Collaborate with cross-functional teams to define clear AI tasks (e.g., document classification, summarisation, task prediction)
Develop and enhance AI-driven features such as document extraction, matching flows, quality checks, chatbots, and automated bookings
Stay up to date with advancements in AI and identify opportunities to improve the product

What we offer

Real impact and ownership in a growing international scale-up
A supportive and collaborative team culture
Hybrid working setup with flexibility and trust
Opportunities to learn, grow, and expand your technical knowledge
Competitive salary and benefits package

Senior Software Engineer – AI

NStarX is seeking a highly skilled Senior Software Engineer – AI with a strong f...

Location

India, Hyderabad

Salary:

Not provided

NStarX

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field (PhD is a plus)
9+ years of experience in AI/ML engineering or related roles
3+ years of experience in Generative AI with team leadership responsibilities
Proven track record of production-grade ML and GenAI model development and deployment
Programming: Python (preferred)
GenAI Frameworks: Hugging Face Transformers, Diffusers, LangChain, TGI
Serving & Inference: FastAPI, gRPC, NVIDIA Triton, TorchServe
Cloud Platforms: AWS (SageMaker, EKS), GCP (Vertex AI, GKE), Azure (Azure ML, AKS)
MLOps & DevOps: Kubeflow, MLflow, GitHub Actions, Jenkins, Helm, Terraform
Optimization Techniques: Model quantization, distillation, pipeline and tensor parallelism

Job Responsibility

Design, develop, and deploy machine learning models and AI algorithms to address complex business challenges
Lead and mentor a team of AI/ML engineers, ensuring quality and scalability in solution design and implementation
Collaborate closely with cross-functional teams including data scientists, software engineers, product managers, and UX designers
Lead the development and deployment of Generative AI applications across text, code, image, and audio modalities using state-of-the-art LLMs
Design and implement CI/CD pipelines for the GenAI model lifecycle including training, validation, packaging, and deployment
Apply best practices for model performance tuning, cost optimization, and scalable deployment in cloud and hybrid environments
Develop prompt engineering, fine-tuning strategies (LoRA, QLoRA, PEFT), and evaluation protocols tailored to business use cases
Stay current with emerging trends in AI, ML, and Generative AI and drive adoption across teams
Document processes, model architectures, and deployment strategies for traceability and knowledge sharing
Work closely with cross-functional teams to gather requirements and deliver high-quality solutions

What we offer

Competitive salary aligned with market standards
Opportunities for professional development and skill enhancement
A collaborative and innovative work environment

Fulltime

AI Software Engineer III

Planet DDS is a leading provider of a platform of cloud-based solutions that emp...

Location

United Kingdom, Glasgow

Salary:

Not provided

Planet DDS

Expiration Date

Until further notice

Requirements

5-7 years of professional software engineering experience
At least 4 years in AI/ML-focused roles
Bachelor’s or Master’s degree in Computer Science, Machine Learning, Artificial Intelligence, or related field
Experience working in a SaaS or enterprise software environment
Publications or contributions to open-source AI/ML projects
Exposure to reinforcement learning, generative AI (LLMs, diffusion models), or real-time inference systems

Job Responsibility

Design, develop, and deploy AI and machine learning models in production environments
Architect scalable solutions that integrate AI capabilities into our products and services
Collaborate with data scientists, product managers, and backend/front-end engineers to translate prototypes into reliable, maintainable code
Own end-to-end development of AI systems, including data ingestion, model training, evaluation, and deployment
Implement best practices in model versioning, monitoring, and continuous improvement
Contribute to the evolution of our AI/ML infrastructure, including CI/CD pipelines and MLOps tools
Stay current on advancements in AI, ML, and deep learning and assess their applicability to business needs
Ensure AI solutions are ethical, interpretable, and aligned with regulatory requirements

Fulltime

Principal AI Engineer

We are looking for a Principal AI Engineer to lead the design and deployment of ...

Location

United States

Salary:

200000.00 - 300000.00 USD / Year

Apollo.io

Expiration Date

Until further notice

Requirements

10+ years of software engineering experience
At least 3 years in applied LLM or agentic AI systems (2023–present)
Proven success in deploying LLM-powered products used by real users at scale
Deep backend & systems engineering expertise with Python, distributed systems, and scalable APIs
Familiarity with LangChain, LlamaIndex, or similar orchestration frameworks
Experience with RAG pipelines, vector DBs, embedding models, and semantic search tuning
Experience managing performance across cloud and model providers (e.g., AWS Bedrock, OpenAI, Anthropic)
Demonstrated experience building multi-step agents, planning workflows, chaining reasoning steps, and integrating APIs with agent memory/state
Comfort with advanced prompting strategies, few-shot and chain-of-thought reasoning, and embedding retrieval setups
Strong understanding of AI system evaluation, human ratings, A/B experimentation, and feedback loop pipelines

Job Responsibility

Architect and lead the development of multi-agent systems capable of long-horizon planning, reasoning, and API orchestration
Build reusable agentic components that integrate deeply into sales and marketing processes
Own and evolve our in-house platform for scalable, low-latency, and cost-efficient LLM and agent deployments
Lead design of interfaces powered by natural language understanding and retrieval-augmented generation (RAG)
Build embedding-based, intent-aware search and personalization systems tuned to business user needs
Drive innovation in personalized outreach generation using context-aware generation pipelines
Tune inference pipelines, caching layers, and model selection logic for high-scale, cost-aware performance
Define and drive robust offline and online testing methodologies (A/B, sandboxing, human evals) across agents and LLM flows
Architect human-in-the-loop systems and telemetry to improve accuracy, UX, and explainability over time

What we offer

Equity
Company bonus or sales commissions/bonuses
401(k) plan
At least 10 paid holidays per year
Flex PTO
Parental leave
Employee assistance program
Wellbeing benefits
Global travel coverage
Life/AD&D/STD/LTD insurance

Fulltime

Senior Devops & AI Engineer

This role presents a unique opportunity to contribute to the future of impactful...

Location

India, Hyderabad

Salary:

Not provided

Fission Labs

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, Engineering, or related field
6+ years of experience in infrastructure management roles, with a focus on cloud platforms (Azure and AWS preferred)
Hands-on experience with operations (DevSecOps) principles and best practices
Proficiency in scripting languages such as Python, PowerShell, or Bash
Excellent communication and collaboration skills
In-depth knowledge of Linux operating systems, including CentOS, Ubuntu, and Red Hat, with expertise in shell scripting, package management, and system administration
Hands-on experience with a wide range of AWS and Azure services
Experience developing and maintaining Infrastructure as Code (IaC) templates using tools such as Terraform or AWS CloudFormation
Experience setting up cloud infrastructure stacks, databases, and service endpoints, including scaling and optimization of GPU and CPU resources
Prior experience with AIOps/MLOps

Job Responsibility

Configure and optimize Linux-based servers for performance, security, and resource utilization, including kernel tuning, file system management, and network configuration
Architect cloud solutions leveraging best practices and services offered by AWS and Azure, optimizing for scalability, reliability, and cost-effectiveness
Implement and manage hybrid cloud environments, facilitating seamless integration and interoperability between AWS and Azure services
Establish version control practices for IaC templates, ensuring traceability, auditability, and reproducibility of infrastructure changes

What we offer

Opportunity to work on impactful technical challenges with global reach
Vast opportunities for self-development, including online university access and knowledge sharing opportunities
Sponsored Tech Talks & Hackathons to foster innovation and learning
Generous benefits packages including health insurance, retirement benefits, flexible work hours, and more
Supportive work environment with forums to explore passions beyond work

Fulltime

Artificial Intelligence (AI) Engineer

VELOX is hiring an AI Developer to help design and implement intelligent systems...

Location

United States, Boise

Salary:

Not provided

VELOX Media

Expiration Date

Until further notice

Requirements

Strong proficiency in Python (Pandas, NumPy, scikit-learn, etc.)
Experience with deep learning frameworks such as TensorFlow or PyTorch
Hands-on experience with natural language processing, retrieval-augmented generation (RAG), or LLMs (e.g., OpenAI, Claude, Mistral)
Understanding of data pipelines, model deployment, and performance monitoring
Experience working with APIs and integrating ML models into production systems
Familiarity with vector databases (e.g., Pinecone, Weaviate, FAISS) and embedding generation
Comfort working in cloud environments (GCP, AWS, or Azure)
Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field
3+ years of experience in applied AI/ML roles
Track record of launching AI tools or systems into production

Job Responsibility

Research, design, and deploy AI/ML models that drive value across client-facing and internal applications
Build tools that support predictive analytics, natural language querying, and campaign automation
Collaborate with product and engineering teams to integrate AI functionality into web platforms
Integrate AI solutions with our PHP/Laravel backend and MySQL databases via REST APIs or microservices
Write clean, scalable code for inference pipelines, model training, and testing environments
Monitor model performance and retrain or refine when necessary
Stay ahead of LLMs, vector DBs, and open-source innovations to enhance our AI roadmap
Contribute to a long-term AI strategy that makes VELOX more automated, intelligent, and insightful

What we offer

Competitive compensation and performance bonuses
Health insurance & 401(k) options
Paid vacation and holidays
Casual dress and regular team events
On-site gym and personal trainer access
Kombucha on tap

Fulltime

AI Inference Engineer

Perplexity

Location:
United Kingdom, London

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Additional Information:

Job Posted:
February 21, 2026

