Engineering Manager - Inference

Perplexity

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

300000.00 - 385000.00 USD / Year

Job Description:

We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities. You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes. You will help architect and scale the deployment of machine learning models behind Perplexity's Comet, Sonar, Search, and Deep Research products.

Job Responsibility:

  • Lead and grow a high-performing team of AI inference engineers
  • Develop APIs for AI inference used by both internal and external customers
  • Architect and scale our inference infrastructure for reliability and efficiency
  • Benchmark and eliminate bottlenecks throughout our inference stack
  • Drive large sparse/MoE model inference at rack scale, including sharding strategies for massive models
  • Push the frontier by building inference systems that support sparse attention, disaggregated prefill/decode serving, and similar techniques
  • Improve the reliability and observability of our systems and lead incident response
  • Own technical decisions around batching, throughput, latency, and GPU utilization
  • Partner with ML research teams on model optimization and deployment
  • Recruit, mentor, and develop engineering talent
  • Establish team processes, engineering standards, and operational excellence

Requirements:

  • 5+ years of engineering experience with 2+ years in a technical leadership or management role
  • Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)
  • Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers
  • Experience with inference optimizations: batching, quantization, kernel fusion, FlashAttention
  • Familiarity with GPU characteristics, roofline models, and performance analysis
  • Experience deploying reliable, distributed, real-time systems at scale
  • Track record of building and leading high-performing engineering teams
  • Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
  • Strong technical communication and cross-functional collaboration skills

Nice to have:

  • Experience with CUDA, Triton, or custom kernel development
  • Background in training infrastructure and RL workloads
  • Experience with Kubernetes and container orchestration at scale
  • Published work or contributions to inference optimization research

What we offer:

  • Equity
  • Health
  • Dental
  • Vision
  • Retirement
  • Fitness
  • Commuter and dependent care accounts

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time
Similar Jobs for Engineering Manager - Inference

Engineering Manager - Machine Learning

We’re looking for an experienced Engineering Manager to lead the ML Soundtrack t...
Location:
Sweden, Stockholm
Salary:
Not provided
Epidemic Sound
Expiration Date:
Until further notice
Requirements:
  • Deep ML engineering background with hands-on experience in generative diffusion models for audio/music (including PyTorch and modern training stacks)
  • Proven experience deploying ML systems into production at scale, with a focus on latency, stability, and cost
  • Strong ML system design and architecture skills across the full machine learning lifecycle
  • Track record of managing engineering teams
  • Demonstrated ability to set clear goals, manage performance, and grow engineers through mentorship and feedback
Job Responsibility:
  • Own the technical roadmap and model strategy for generative music, including diffusion and transformer-based approaches
  • Lead the full lifecycle from research to production, championing training, evaluation, and deployment for real-time inference
  • Drive the productionisation of inference through model optimisation (distillation, quantisation), caching, and cost controls
  • Build and maintain team health through effective rituals, 1:1s, and fostering a psychologically safe, high-ownership culture
  • Manage cross-team dependencies and delivery with data, MLOps, and product engineering teams
Employment Type:
Full-time

Engineering Manager - Machine Learning Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
241200.00 - 400000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 8–10 years of experience in ML infrastructure, including direct hands-on experience as an engineer (IC or TL)
  • 2+ years of experience managing infrastructure or ML platform engineers
  • Proven experience delivering and operating ML or AI infrastructure at scale
  • Solid technical depth across ML/AI infrastructure domains (e.g., feature stores, pipelines, deployment, inference, observability)
  • Demonstrated ability to drive execution on complex technical projects with cross-team stakeholders
  • Strong communication and stakeholder management skills
Job Responsibility:
  • Lead and support the ML Infra team, driving project execution and ensuring delivery on key commitments
  • Build and launch Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Define and drive adoption of an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines, deployment tooling, and inference systems
  • Partner with ML product teams to understand requirements and deliver solutions that accelerate model development and iteration
  • Recruit, mentor, and develop engineers, fostering a collaborative and high-performing team culture
What we offer:
  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission
Employment Type:
Full-time

Senior Machine Learning Engineering Manager, Gen AI

We're seeking a Senior Machine Learning Manager (M60) to lead a cross-functional...
Location:
United States
Salary:
193500.00 - 303150.00 USD / Year
Atlassian
Expiration Date:
Until further notice
Requirements:
  • 8+ years in ML, search, or backend engineering roles, with 3+ years leading teams
  • Strong track record of shipping ML-powered or LLM-integrated user-facing products
  • Experience with RAG systems (vector search, hybrid retrieval, LLM orchestration)
  • Deep experience in either modeling (e.g., LLMs, search, NLP) or engineering (e.g., backend infra, full-stack), with the ability to lead end-to-end
  • Deep understanding of LLM ecosystems (OpenAI, Claude, Mistral, OSS), orchestration frameworks (LangChain, LlamaIndex), and vector databases (Weaviate, Pinecone, FAISS, etc.)
  • Strong product intuition and ability to translate complex tech into valuable user features
  • Familiarity with GenAI evaluation methods: hallucination detection, groundedness scoring, and human-in-the-loop feedback loops
  • Master’s or PhD in Computer Science, Machine Learning, or related field preferred—or equivalent practical experience
Job Responsibility:
  • Lead the vision, design, and execution of LLM-powered AI products, leveraging advanced AI modeling (e.g., SLM post-training/fine-tuning), RAG architectures, and hybrid ranking systems
  • Define system architecture across retrievers, rankers, orchestration layers, prompt templates, and feedback mechanisms
  • Work closely with product and design teams to ensure delightful, fast, and grounded user experiences
  • Build and manage a cross-disciplinary team including ML engineers, backend/frontend engineers, and applied scientists
  • Foster a culture of E2E ownership — empowering the team to move from prototype to production quickly and iteratively
  • Mentor individuals to grow in both technical depth and product acumen
  • Shape the technical roadmap and long-term strategy for GenAI search across Atlassian’s product suite
  • Partner with platform and infra teams to scale inference, evaluate performance, and integrate usage signals for continuous improvement
  • Champion data quality, grounding, and responsible AI practices in all deployed features
What we offer:
  • health and wellbeing resources
  • paid volunteer days
Employment Type:
Full-time

Staff Product Manager, Managed Inference

As a core member of the Crusoe Managed AI Services team, you will own the comple...
Location:
United States, San Francisco; Sunnyvale
Salary:
204000.00 - 247000.00 USD / Year
Crusoe
Expiration Date:
Until further notice
Requirements:
  • 6+ years of experience in technical product management or engineering roles with product responsibilities
  • Experience building and launching cloud infrastructure, platform, or AI/ML services used in production
  • Strong understanding of cloud infrastructure (e.g., AWS, GCP, Azure) and modern compute architectures
  • Familiarity with the machine learning lifecycle, particularly model deployment, inference, and monitoring
  • Strong communication and collaboration skills, with experience working across engineering, product, and business teams
  • Demonstrated ability to operate independently with strong product judgment and a bias for action
  • Bachelor’s degree in Computer Science or a related technical field (or equivalent experience)
Job Responsibility:
  • Own the end-to-end product lifecycle for Crusoe’s Managed Inference services, including roadmap definition, execution, and iteration
  • Translate customer needs, market signals, and technical constraints into clear product requirements and prioritization
  • Partner closely with Engineering, Infrastructure, and Platform teams to deliver scalable, reliable inference services
  • Drive product decisions across performance, reliability, cost efficiency, and developer experience
  • Define and track success metrics for inference services in production environments
  • Collaborate with go-to-market teams to support product launches, positioning, and customer adoption
  • Communicate product strategy and tradeoffs clearly to cross-functional partners and leadership
What we offer:
  • Restricted Stock Units
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
Employment Type:
Full-time

Staff Product Manager, Managed Inference

As a core member of the Crusoe Managed AI Services team, you will own the comple...
Location:
United States, San Francisco; Sunnyvale; New York
Salary:
204000.00 - 247000.00 USD / Year
Crusoe
Expiration Date:
Until further notice
Requirements:
  • 6+ years of experience in technical product management or engineering roles with product responsibilities
  • Experience building and launching cloud infrastructure, platform, or AI/ML services used in production
  • Strong understanding of cloud infrastructure (e.g., AWS, GCP, Azure) and modern compute architectures
  • Familiarity with the machine learning lifecycle, particularly model deployment, inference, and monitoring
  • Strong communication and collaboration skills, with experience working across engineering, product, and business teams
  • Demonstrated ability to operate independently with strong product judgment and a bias for action
  • Bachelor’s degree in Computer Science or a related technical field (or equivalent experience)
Job Responsibility:
  • Own the end-to-end product lifecycle for Crusoe’s Managed Inference services, including roadmap definition, execution, and iteration
  • Translate customer needs, market signals, and technical constraints into clear product requirements and prioritization
  • Partner closely with Engineering, Infrastructure, and Platform teams to deliver scalable, reliable inference services
  • Drive product decisions across performance, reliability, cost efficiency, and developer experience
  • Define and track success metrics for inference services in production environments
  • Collaborate with go-to-market teams to support product launches, positioning, and customer adoption
  • Communicate product strategy and tradeoffs clearly to cross-functional partners and leadership
What we offer:
  • Restricted Stock Units in a fast-growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
Employment Type:
Full-time

Director of Engineering, Cloud Availability

As the Director of Engineering, Cloud Availability, you will lead our engineerin...
Location:
Ireland, Dublin
Salary:
Not provided
Crusoe
Expiration Date:
Until further notice
Requirements:
  • 10+ years of engineering leadership experience with a proven track record of managing high-performing technical teams
  • Deep technical knowledge of public cloud infrastructure and experience building or operating large-scale platforms (Public, Private, or Hybrid)
  • Expert-level understanding of availability, observability, SLIs/SLOs, and modern incident management frameworks
  • Proven ability to lead remote teams and successfully collaborate with US-based engineering organizations
  • Demonstrated success navigating and leading within a matrix organizational structure
  • Strong familiarity with virtual and managed Kubernetes platforms, such as EKS, GKE, or AKS
  • The ability to balance long-term organizational strategy with the immediate tactical needs of a fast-growing engineering site
Job Responsibility:
  • Organizational Leadership: Partner closely with Data Center, Network, and SRE teams to build and scale a world-class engineering organization in Dublin
  • Site Leadership & Culture: Serve as the primary point of contact and face of Crusoe leadership in Dublin, proactively managing office sentiment and ensuring the team remains focused on high-impact objectives
  • Global Strategic Alignment: Build high-trust partnerships with US-based leadership to ensure local priorities are perfectly synchronized with the global business roadmap
  • Operational Excellence: Implement and refine "follow-the-sun" protocols to enable smooth hand-offs between time zones, ensuring zero customer disruption and 24/7 reliability
  • Unified Team Vision: Foster a "one-team" mindset across geographic boundaries, breaking down silos and promoting deep collaboration between Dublin and US offices
  • Talent Development: Level up the Dublin engineering team by identifying individual strengths and establishing a culture of mentorship to grow the next generation of Engineering Leads and ICs
  • Reliability Initiatives: Lead the development of SRE functions for IaaS and managed services, including Inference, SLURM, and automated cluster management
What we offer:
  • pension contributions
  • private health and dental insurance
  • income protection
  • life assurance
Employment Type:
Full-time

Engineering Manager, GenAI Platform

Our generative AI-powered products are bringing joy back to the practice of medi...
Location:
United States, San Francisco
Salary:
220000.00 - 270000.00 USD / Year
Abridge
Expiration Date:
Until further notice
Requirements:
  • 5+ years as a software engineer
  • 2+ years managing software engineering teams
  • Experience building and leading teams in the AI infrastructure domain: LLM workflows, agentic systems, retrieval, and evaluation
  • Comfortable giving constructive feedback on technical designs and code reviews
  • Skilled in building secure, compliant systems in major cloud platforms (GCP preferred, AWS experience welcome)
  • Familiar with containers, databases, and modern frontend frameworks
  • Skilled at hiring and mentorship, with a track record of helping engineers grow their skills and careers
  • Excited about being hands-on in a fast-moving, productive, and supportive environment
  • Has thrived in a fast-growing startup and knows how to operate in that environment
Job Responsibility:
  • Recruit and mentor engineers with backgrounds in LLM-driven workflows and agentic systems
  • Provide regular feedback, create opportunities for career growth, and foster a culture of collaboration and excellence
  • Work closely with ML/AI Research, Inference, Data, and Product teams to plan, execute, and support multiple projects simultaneously
  • Take responsibility for the team's engineering process and output
  • Help set business context for the team and apply your knowledge of business and software engineering to guide architectural discussions, anticipate and resolve technical challenges, and advocate for technology projects
  • Set a high standard for your team, including software quality
What we offer:
  • Generous Time Off: 14 paid holidays, flexible PTO for salaried employees, and accrued time off for hourly employees
  • Comprehensive Health Plans: Medical, Dental, and Vision coverage for all full-time employees and their families
  • Generous HSA Contribution: If you choose a High Deductible Health Plan, Abridge makes monthly contributions to your HSA
  • Paid Parental Leave: Generous paid parental leave for all full-time employees
  • Family Forming Benefits: Resources and financial support to help you build your family
  • 401(k) Matching: Contribution matching to help invest in your future
  • Personal Device Allowance: Tax free funds for personal device usage
  • Pre-tax Benefits: Access to Flexible Spending Accounts (FSA) and Commuter Benefits
  • Lifestyle Wallet: Monthly contributions for fitness, professional development, coworking, and more
  • Mental Health Support: Dedicated access to therapy and coaching to help you reach your goals
Employment Type:
Full-time