Senior Backend Engineer, Inference Platform

Together AI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

160000.00 - 250000.00 USD / Year

Job Description:

Together AI is building the Inference Platform that brings the most advanced generative AI models to the world. Our platform powers multi-tenant serverless workloads and dedicated endpoints, enabling developers, enterprises, and researchers to harness the latest LLMs, multimodal models, image, audio, video, and speech models at scale.

Job Responsibility:

  • Build and optimize global and local request routing, ensuring low-latency load balancing across data centers and model engine pods
  • Develop auto-scaling systems to dynamically allocate resources and meet strict SLOs across dozens of data centers
  • Design systems for multi-tenant traffic shaping, tuning both resource allocation and request handling — including smart rate limiting and regulation — to ensure fairness and consistent experience across all users
  • Engineer trade-offs between latency and throughput to serve diverse workloads efficiently
  • Optimize prefix caching to reduce model compute and speed up responses
  • Collaborate with ML researchers to bring new model architectures into production at scale
  • Continuously profile and analyze system-level performance to identify bottlenecks and implement optimizations

Requirements:

  • 5+ years of demonstrated experience building large-scale, fault-tolerant, distributed systems and API microservices
  • Strong background in designing, analyzing, and improving efficiency, scalability, and stability of complex systems
  • Excellent understanding of low-level OS concepts: multi-threading, memory management, networking, and storage performance
  • Expert-level programming in one or more of: Rust, Go, Python, or TypeScript
  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience

Nice to have:

  • Knowledge of modern LLMs and generative models and how they are served in production
  • Experience working with the open source ecosystem around inference
  • Familiarity with SGLang, vLLM, or NVIDIA Dynamo
  • Experience with Kubernetes or container orchestration
  • Familiarity with GPU software stacks (CUDA, Triton, NCCL) and HPC technologies (InfiniBand, NVLink, MPI)

What we offer:

  • Competitive compensation
  • Equity
  • Health insurance
  • Other competitive benefits

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Full-time

Work Type:
Hybrid work

Similar Jobs for Senior Backend Engineer, Inference Platform

Senior LLM Backend Engineer

We are looking for a Senior Backend Engineer with a strong focus on Large Langua...
Location:
Spain
Salary:
Not provided
Bark
Expiration Date
Until further notice
Requirements
  • Extensive production experience with Python in backend engineering
  • Proven experience integrating LLMs into applications via APIs or SDKs
  • Strong experience building and maintaining APIs for LLM-based features
  • Strong experience building and maintaining event-driven workflows
  • Strong experience building and maintaining business logic that consumes AI outputs
  • Strong experience building and maintaining integrations with 3rd party AI/ML platforms
  • Solid SQL and NoSQL experience (especially in AI data pipelines)
  • Production experience with Docker, ideally with Kubernetes or AWS Fargate/ECS/EKS
  • Experience deploying and maintaining AI services in cloud environments
  • Strong organisational skills and ability to deliver in a fast-paced, product-focused environment
Job Responsibility
  • Work with product managers to understand user needs and translate them into AI-powered functionality
  • Design and build APIs, services, and workflows that integrate LLMs (both proprietary and open-source)
  • Implement prompt engineering, RAG pipelines, and model fine-tuning where required
  • Optimise AI inference performance, scalability, and cost-effectiveness
  • Ensure AI features meet high standards for security, reliability, and maintainability
  • Collaborate with other engineers to integrate AI features seamlessly into the wider system
  • Stay on top of emerging LLM technologies and best practices, running experiments and sharing knowledge across the team
What we offer
  • Fully remote working
  • Personal annual L&D budget of €600 to spend on your development
  • Being at the forefront of an industry with new and exciting problems to solve
Employment Type:
Full-time

Senior Software Engineer - Network Enablement (Applied ML)

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements
  • Strong software engineering skills including systems design, APIs, and building reliable backend services (Go or Python preferred)
  • Production experience with batch and streaming data pipelines and orchestration tools such as Airflow or Spark
  • Experience building or operating real-time scoring and online feature-serving systems, including feature stores and low-latency model inference
  • Experience integrating model outputs into product flows (APIs, feature flags) and measuring impact through experiments and product metrics
  • Experience with model lifecycle and operations: model registries, CI/CD for models, reproducible training, offline & online parity, monitoring and incident response
Job Responsibility
  • Embed model inference into Network Enablement product flows and decision logic (APIs, feature flags, backend flows)
  • Define and instrument product + ML success metrics (fraud reduction, retention lift, false positives, downstream impact)
  • Design and run experiments and rollout plans (backtesting, shadow scoring, A/B tests, feature-flagged releases) to validate product hypotheses
  • Build and operate offline training pipelines and production batch scoring for bank intelligence products
  • Ship and maintain online feature serving and low-latency model inference endpoints for real-time partner/bank scoring
  • Implement model CI/CD, model/version registry, and safe rollout/rollback strategies
  • Monitor model/data health: drift/regression detection, model-quality dashboards, alerts, and SLOs targeted to partner product needs
  • Ensure offline and online parity, data lineage, and automated validation / data contracts to reduce regressions
  • Optimize inference performance and cost for real-time scoring (batching, caching, runtime selection)
  • Ensure fairness, explainability and PII-aware handling for partner-facing ML features
What we offer
  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission
Employment Type:
Full-time

Senior Software Engineer (TypeScript) - AI/ML

We are looking for a Senior Software Engineer to drive the development of AI/ML-...
Location:
United States
Salary:
131000.00 - 185000.00 USD / Year
ClickHouse
Expiration Date
Until further notice
Requirements
  • 5+ years of software engineering experience in production environments
  • Exposure to working directly with AI/ML technologies
  • Strong frontend skills with TypeScript/JavaScript and React
  • Backend development experience in TypeScript or Python, with a focus on API design and service architecture
  • You have a high level of ownership and can drive features from concept to production with minimal supervision
  • You thrive in collaborative environments and can effectively communicate technical concepts to diverse stakeholders
Job Responsibility
  • Feature Development: Design and implement AI-powered features across the full stack, from backend inference services to intuitive frontend interfaces within the ClickHouse Cloud platform
  • API Architecture: Create robust, scalable APIs that connect ClickHouse's database capabilities with modern AI/ML inference systems and external/internal AI services
  • UI/UX Implementation: Build responsive, intuitive user interfaces that make complex AI functionalities accessible and valuable to users of all technical backgrounds
  • Ecosystem Integrations: Implement and maintain integrations with the broader AI/ML ecosystem and standards, ensuring that ClickHouse as a technology works seamlessly with popular frameworks and tools
  • Technical Integration: Integrate models into production systems with proper monitoring, versioning, observability, and evaluation
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Full-time

Senior Software Engineer (TypeScript) - AI/ML

We are looking for a Senior Software Engineer to drive the development of AI/ML-...
Location:
The Netherlands
Salary:
Not provided
ClickHouse
Expiration Date
Until further notice
Requirements
  • 5+ years of software engineering experience in production environments
  • Exposure to working directly with AI/ML technologies
  • Strong frontend skills with TypeScript/JavaScript and React
  • Backend development experience in TypeScript or Python, with a focus on API design and service architecture
  • You have a high level of ownership and can drive features from concept to production with minimal supervision
  • You thrive in collaborative environments and can effectively communicate technical concepts to diverse stakeholders
Job Responsibility
  • Feature Development: Design and implement AI-powered features across the full stack, from backend inference services to intuitive frontend interfaces within the ClickHouse Cloud platform
  • API Architecture: Create robust, scalable APIs that connect ClickHouse's database capabilities with modern AI/ML inference systems and external/internal AI services
  • UI/UX Implementation: Build responsive, intuitive user interfaces that make complex AI functionalities accessible and valuable to users of all technical backgrounds
  • Ecosystem Integrations: Implement and maintain integrations with the broader AI/ML ecosystem and standards, ensuring that ClickHouse as a technology works seamlessly with popular frameworks and tools
  • Technical Integration: Integrate models into production systems with proper monitoring, versioning, observability, and evaluation
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Senior Machine Learning Engineering Manager, Gen AI

We're seeking a Senior Machine Learning Manager (M60) to lead a cross-functional...
Location:
United States
Salary:
193500.00 - 303150.00 USD / Year
Atlassian
Expiration Date
Until further notice
Requirements
  • 8+ years in ML, search, or backend engineering roles, with 3+ years leading teams
  • Strong track record of shipping ML-powered or LLM-integrated user-facing products
  • Experience with RAG systems (vector search, hybrid retrieval, LLM orchestration)
  • Deep experience in either modeling (e.g., LLMs, search, NLP) or engineering (e.g., backend infra, full-stack), with the ability to lead end-to-end
  • Deep understanding of LLM ecosystems (OpenAI, Claude, Mistral, OSS), orchestration frameworks (LangChain, LlamaIndex), and vector databases (Weaviate, Pinecone, FAISS, etc.)
  • Strong product intuition and ability to translate complex tech into valuable user features
  • Familiarity with GenAI evaluation methods: hallucination detection, groundedness scoring, and human-in-the-loop feedback loops
  • Master’s or PhD in Computer Science, Machine Learning, or related field preferred—or equivalent practical experience
Job Responsibility
  • Lead the vision, design, and execution of LLM-powered AI products, leveraging advanced AI modeling (e.g. SLM post-training/fine-tuning), RAG architectures, and hybrid ranking systems
  • Define system architecture across retrievers, rankers, orchestration layers, prompt templates, and feedback mechanisms
  • Work closely with product and design teams to ensure delightful, fast, and grounded user experiences
  • Build and manage a cross-disciplinary team including ML engineers, backend/frontend engineers, and applied scientists
  • Foster a culture of E2E ownership — empowering the team to move from prototype to production quickly and iteratively
  • Mentor individuals to grow in both technical depth and product acumen
  • Shape the technical roadmap and long-term strategy for GenAI search across Atlassian’s product suite
  • Partner with platform and infra teams to scale inference, evaluate performance, and integrate usage signals for continuous improvement
  • Champion data quality, grounding, and responsible AI practices in all deployed features
What we offer
  • health and wellbeing resources
  • paid volunteer days
Employment Type:
Full-time

Senior Machine Learning Engineer (Health)

WHOOP is an advanced health and fitness wearable, on a mission to unlock human p...
Location:
United States, Boston
Salary:
150000.00 - 210000.00 USD / Year
Whoop
Expiration Date
Until further notice
Requirements
  • Bachelor’s Degree in Computer Science, Data Science, Applied Mathematics, or a related field. Master’s preferred
  • 5+ years of professional experience as a Machine Learning Engineer or Software Engineer with focus on ML systems
  • Proven expertise working with time series data (wearable, physiological, or high-frequency sensor data strongly preferred)
  • Experience designing and deploying ML inference systems at scale: both real-time streaming and large-scale batch pipelines
  • Strong coding skills in Python (scientific stack) and SQL, with a track record of writing clean, production-quality code
  • Strong communication skills to collaborate across engineering, research, and product teams
  • Proven experience deploying and maintaining ML systems on cloud platforms (AWS or GCP)
  • Working familiarity with MLOps best practices: model versioning, CI/CD for ML, observability, and monitoring for inference systems
  • Ability to reason about and design for performance trade-offs (latency vs. throughput vs. cost) when building ML inference systems
  • Strong understanding of backend service development (APIs and service reliability) as it applies to serving ML models at scale
Job Responsibility
  • Create, improve, and maintain production services that provide analysis for health features in collaboration with Data Scientists and MLOps Engineers
  • Collaborate with Data Engineers to improve ML data pipelines, tooling, and validation systems that support robust model performance
  • Work alongside data scientists to translate research prototypes into production ML systems optimized for scale, latency, and cost efficiency
  • Collaborate with researchers and product teams to align model development with health insights and member impact
  • Participate in on-call rotations for data science services, ensuring uptime and performance in production environments
What we offer
  • equity
  • benefits
Employment Type:
Full-time

Senior Staff Machine Learning Engineer

Help design our AI platform and develop our next generation of machine learning ...
Location:
United States, San Francisco
Salary:
216500.00 - 324500.00 USD / Year
GoFundMe
Expiration Date
Until further notice
Requirements
  • 9+ years of hands-on experience in machine learning engineering, AI development, software engineering, or related fields
  • Experience emphasizing secure, large-scale, distributed system design, AI/ML pipeline development, and implementation
  • Extensive experience designing, developing, and operating scalable backend systems
  • Experience applying software engineering best practices such as domain-driven design, event-driven architectures, and microservices
  • Deep expertise in agentic workflows, AI evaluation solutions, prompt management, and secure AI development and testing practices
  • Strong knowledge of relational and document-based databases, data storage paradigms, and efficient RESTful API design
  • Experience establishing robust CI/CD pipelines, automated testing (unit and integration), and deployment practices
  • Strong leadership skills, including effective planning and management of complex projects, mentoring of team members, and fostering a collaborative, high-performing engineering culture
  • Excellent communicator, able to articulate complex technical concepts clearly to both technical and non-technical stakeholders
  • Bachelor's degree in Computer Science, Software Engineering, or a related technical field (preferred)
Job Responsibility
  • Design and implement AI platforms to enable scalable and secure access to LLMs from multiple model providers for diverse use cases
  • Design and implement agentic workflows, agentic tool ecosystems, and LLM prompt management solutions
  • Design, build, and optimize scalable model training, fine tuning, and inference pipelines, ensuring robust integration with production systems
  • Influence technical strategy and approach to developing embedding stores, vector databases, and other reusable assets
  • Lead initiatives to streamline ML and AI workflows, improve operational efficiency, and establish standardized procedures to achieve consistent, high-quality results across our AI systems
  • Design and develop backend services and RESTful APIs using Python and FastAPI, integrating seamlessly with ML pipelines and services
  • Take operational responsibility for team-owned services, including performance monitoring, optimization, troubleshooting, and participation in an on-call rotation
  • Collaborate with both technical and non-technical colleagues, including data and applied scientists, software engineers, product managers, and business stakeholders, to deliver reliable and scalable ML-driven products
  • Coach and mentor fellow ML engineers, promoting a culture of collaboration, continuous improvement, and engineering excellence within the team
  • Employ a diverse set of tools and platforms including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, Coralogix, and GitHub to build, deploy, and maintain scalable, highly available machine learning infrastructure
What we offer
  • Competitive pay
  • Comprehensive healthcare benefits
  • Financial assistance for things like hybrid work, family planning
  • Generous parental leave
  • Flexible time-off policies
  • Mental health and wellness resources
  • Learning, development, and recognition programs
Employment Type:
Full-time

Senior Software Engineer, Backend Platform

At Harvey, we’re transforming how legal and professional services operate — not ...
Location:
United States, San Francisco
Salary:
200000.00 - 260000.00 USD / Year
Harvey
Expiration Date
Until further notice
Requirements
  • 5+ years of software engineering experience (post-BS/MS), including building scalable backend systems or internal developer platforms
  • Proficiency in Python (or similar languages) and deep knowledge of backend development fundamentals (APIs, data stores, concurrency, distributed systems)
  • Hands-on experience with web frameworks and service architectures (e.g. Flask/FastAPI, bounded-context services, microservices) and an understanding of designing clean, versioned APIs
  • Familiarity with caching, messaging, and database technologies (Redis, Kafka, SQL/NoSQL databases, vector databases, etc.) and how to use them effectively for high performance and reliability
  • A track record of writing high-quality, well-tested code and using tools (unit/integration testing, static typing, CI) to catch issues early and ensure reliability
  • Strong problem-solving skills and a passion for improving developer experience — you enjoy creating tools or frameworks that make other engineers more productive
  • Excellent collaboration and communication skills, with the ability to work across teams and incorporate feedback
Job Responsibility
  • Develop and maintain Harvey’s internal backend frameworks and libraries that provide common capabilities (API routing, service lifecycle management, caching and messaging primitives, error handling interfaces, etc.), so product teams don’t have to reinvent them
  • Create and improve APIs, service templates, and versioned interfaces that establish consistent patterns for building new services and features
  • Introduce and champion modern backend architecture patterns like asynchronous I/O (asyncio) and streaming data processing, continually evolving our platform for better performance and scalability
  • Design Harvey-specific abstractions and domain-specific frameworks—covering cross-cutting concerns (e.g., authorization, streaming) and areas like data governance and event processing—to provide product engineers with these capabilities out of the box
  • Embed reliability and observability into the platform by building in tracing, metrics, and automated tests (shift-left), ensuring services built on our foundation are robust and easy to monitor
  • Collaborate with the Model Infrastructure team to tackle challenges unique to GenAI-native applications, such as supporting high-throughput model inference, managing streaming and long-running API interactions, and designing abstractions for retrieval, context handling, and prompt lifecycle
  • Collaborate with the Developer Experience and Infrastructure teams (who own CI/CD pipelines, build tools, and release infrastructure) to integrate our platform components seamlessly into the deployment and monitoring ecosystem
  • Work closely with product engineering teams to gather feedback, evangelize best practices, and make the “paved road” approach a reality — providing strong defaults and clear documentation so teams can move fast with confidence
What we offer
  • Offers Equity
  • Offers Bonus
  • Comprehensive health, dental and vision coverage
  • Retirement benefits (401k match up to 4%)
  • Flexible PTO
Employment Type:
Full-time