Staff Software Engineer, Inference Infrastructure Job at Cohere (San Francisco, Toronto, London, New York, Montreal)

Senior Staff Machine Learning Engineer

Help design our AI platform and develop our next generation of machine learning ...

Location

United States , San Francisco

Salary:

216500.00 - 324500.00 USD / Year

GoFundMe

Expiration Date

Until further notice

Requirements

9+ years of hands-on experience in machine learning engineering, AI development, software engineering, or related fields
Experience emphasizing secure, large-scale, distributed system design, AI/ML pipeline development, and implementation
Extensive experience designing, developing, and operating scalable backend systems
Experience applying software engineering best practices such as domain-driven design, event-driven architectures, and microservices
Deep expertise in agentic workflows, AI evaluation solutions, prompt management, and secure AI development and testing practices
Strong knowledge of relational and document-based databases, data storage paradigms, and efficient RESTful API design
Experience establishing robust CI/CD pipelines, automated testing (unit and integration), and deployment practices
Strong leadership skills, including effective planning and management of complex projects, mentoring of team members, and fostering a collaborative, high-performing engineering culture
Excellent communicator, able to articulate complex technical concepts clearly to both technical and non-technical stakeholders
Bachelor's degree in Computer Science, Software Engineering, or a related technical field (preferred)

Job Responsibility

Design and implement AI platforms to enable scalable and secure access to LLMs from multiple model providers for diverse use cases
Design and implement agentic workflows, agentic tool ecosystems, and LLM prompt management solutions
Design, build, and optimize scalable model training, fine tuning, and inference pipelines, ensuring robust integration with production systems
Influence technical strategy and approach to developing embedding stores, vector databases, and other reusable assets
Lead initiatives to streamline ML and AI workflows, improve operational efficiency, and establish standardized procedures to achieve consistent, high-quality results across our AI systems
Design and develop backend services and RESTful APIs using Python and FastAPI, integrating seamlessly with ML pipelines and services
Take operational responsibility for team-owned services, including performance monitoring, optimization, troubleshooting, and participation in an on-call rotation
Collaborate with both technical and non-technical colleagues, including data and applied scientists, software engineers, product managers, and business stakeholders, to deliver reliable and scalable ML-driven products
Coach and mentor fellow ML engineers, promoting a culture of collaboration, continuous improvement, and engineering excellence within the team
Employ a diverse set of tools and platforms including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, Coralogix, and GitHub to build, deploy, and maintain scalable, highly available machine learning infrastructure

What we offer

Competitive pay
Comprehensive healthcare benefits
Financial assistance for things like hybrid work, family planning
Generous parental leave
Flexible time-off policies
Mental health and wellness resources
Learning, development, and recognition programs

Fulltime

Member of Technical Staff, Cloud Infrastructure

As a Software Engineer on our Cloud Infrastructure team, you'll be at the forefr...

Location

United States , New York, NY; San Mateo, CA; Redwood City, CA

Salary:

175000.00 - 220000.00 USD / Year

Fireworks AI

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
5+ years of experience designing and building backend infrastructure in cloud environments (e.g., AWS, GCP, Azure)
Proven experience in ML infrastructure and tooling (e.g., PyTorch, TensorFlow, Vertex AI, SageMaker, Kubernetes, etc.)
Strong software development skills in languages like Python, or C++
Deep understanding of distributed systems fundamentals: scheduling, orchestration, storage, networking, and compute optimization

Job Responsibility

Architect and build scalable, resilient, and high-performance backend infrastructure to support distributed training, inference, and data processing pipelines
Lead technical design discussions, mentor other engineers, and establish best practices for building and operating large-scale ML infrastructure
Design and implement core backend services (e.g., job schedulers, resource managers, autoscalers, model serving layers) with a focus on efficiency and low latency
Drive infrastructure optimization initiatives, including compute cost reduction, storage lifecycle management, and network performance tuning
Collaborate cross-functionally with ML, DevOps, and product teams to translate research and product needs into robust infrastructure solutions
Continuously evaluate and integrate cloud-native and open-source technologies (e.g., Kubernetes, Ray, Kubeflow, MLFlow) to enhance our platform’s capabilities and reliability
Own end-to-end systems from design to deployment and observability, with a strong emphasis on reliability, fault tolerance, and operational excellence

What we offer

Meaningful equity in a fast-growing startup
Competitive salary
Comprehensive benefits package

Fulltime

Staff Machine Learning Engineer

As a Staff Machine Learning Engineer at Aignostics, you will play a crucial role...

Location

Germany , Berlin

Salary:

Not provided

Aignostics

Expiration Date

Until further notice

Requirements

Advanced degree in a relevant field or extensive work experience
8+ years of industry experience, with at least 2 years as Staff Engineer or an equivalent role
Proven track record of driving technical excellence and innovation
Solid background in data-intensive systems and software architecture, design patterns and clean coding
Expert Python programming and fluency in C/C++ or other low-level language(s)
Experience with designing and implementing large-scale, distributed ML systems and platforms
Proven track record of deploying ML models into production environments
Strong knowledge of machine learning fundamentals
Experience with deep learning frameworks (e.g. Pytorch and Tensorflow) and state-of-the-art techniques (e.g. generative models)
Deep understanding of cloud technologies (e.g. GCP, AWS), containerization and orchestration (Kubernetes)

Job Responsibility

Define and drive the technical architecture and system design principles for our AI platform and infrastructure
Work in close collaboration with engineering leads to build flexible frameworks and systems for model training, evaluation and inference across different pathology applications
Guide the CTO office, product management and fellow engineering leads through complex decisions by providing expert consultation on feasibility, architecture, trade-offs and risk mitigation strategies, while ensuring alignment with our technical vision
Foster technical alignment across teams by establishing shared architectural principles and best practices, facilitating cross-team design reviews to enable consistent decision-making across domains
Champion technical excellence by leading strategic initiatives that modernize our architecture and reduce technical debt while measuring and improving our technical health metrics
Elevate the technical capabilities of our engineering staff through structured mentoring, workshops and establishing comprehensive technical guidelines that enable teams to make better design decisions
Drive innovation by evaluating emerging technologies, leading proof-of-concept initiatives and building support for strategic technical investments that advance our engineering capabilities while ensuring measurable business value

What we offer

Cutting-edge AI research and development, with involvement of Charité, TU Berlin and our other partners
Work with a welcoming, diverse and highly international team of colleagues
Opportunity to take responsibility and grow your role within the startup
Expand your skills by benefitting from our Learning & Development yearly budget of 1,000 € (plus 2 L&D days), language classes and internal development programs
Mentoring program, you’ll learn from great experts
Flexible working hours and teleworking policy
30 paid vacations days per year
We are family & pet friendly and support flexible parental leave options
Pick a subsidized membership of your choice among public transport, sports and well-being
Enjoy our social gatherings, lunches and off-site events for a fun and inclusive work environment

Staff Product Security Engineer

We’re looking for a Staff Product Security Engineer to lead the design and imple...

Location

United States

Salary:

184000.00 - 252000.00 USD / Year

AlphaSense

Expiration Date

Until further notice

Requirements

7+ years of experience in product, application, or cloud security engineering
Deep understanding of secure SDLC, threat modeling, and secure architecture design
Proven expertise with AWS cloud security concepts and best practices
Strong experience with container security, orchestration, and runtime protection
Proficiency in Python, Java, and/or JavaScript for security automation, code review, and tooling
Experience securing AI/ML pipelines, data workflows, or model-serving infrastructure
Familiarity with DevSecOps and continuous integration/deployment environments

Job Responsibility

Embed robust security practices throughout the software and AI development lifecycle (SDLC)
Lead secure design reviews, threat modeling, and risk assessments for AI-driven products, APIs, and backend services
Partner with engineering and product teams to ensure security, privacy, and compliance by design
Build and maintain security automation and governance frameworks that integrate seamlessly into development workflows
Architect and enforce security controls for AI/ML systems, including model training, data pipelines, and inference environments
Identify and mitigate AI-specific attack vectors such as data poisoning, model inversion, prompt injection, and model theft
Collaborate with governance and compliance teams to align with ethical AI principles and frameworks like NIST AI RMF and the EU AI Act
Implement model provenance, integrity, and auditability controls to ensure responsible and secure AI operations
Partner with DevOps and SRE teams to secure service meshes, container networking, and secrets management
Drive software supply chain security, including artifact integrity, dependency management, and vulnerability reduction

What we offer

Competitive compensation, benefits, and career growth opportunities
Opportunity to shape and drive product security strategy
Collaborative and security-minded engineering culture
Work on cutting-edge security challenges in a fast-growing company
Performance-based bonus, equity, and a generous benefits program

Fulltime

New

Staff Infrastructure Software Engineer, Enterprise AI

Scale GP is building the next generation of enterprise-grade Generative AI produ...

Location

United States , New York; San Francisco

Salary:

216200.00 - 270250.00 USD / Year

Scale

Expiration Date

Until further notice

Requirements

Proven experience in a senior role
5+ years of full-time software engineering experience
Deep understanding of modern infrastructure practices, including CI/CD, IaC (e.g., Terraform, Helm Charts), container orchestration (e.g., Kubernetes) and observability platforms (e.g., Datadog, Prometheus, Grafana)
Extensive experience with at least one major cloud provider (AWS, Azure, or GCP)
Strong knowledge of security and compliance in enterprise environments, with a focus on access management, data isolation, and customer-specific VPC setups
Proficiency in Python or JavaScript/TypeScript, and SQL

Job Responsibility

Define the architectural patterns for our multi-cloud infrastructure to support secure, reliable, and scalable Agentic workflows for enterprise customers
Lead the infrastructure roadmap with a strong focus on compliance, privacy, and security standards, including designing change management and data isolation strategies
Own the development and maintenance of our best-in-class Agentic observability platform (logging, metrics, tracing, and analytics) to proactively ensure system health and enable rapid incident response
Drive developer efficiency by building automated tooling and championing Infrastructure-as-Code (IaC) paradigms throughout the engineering organization
Solve the toughest engineering problems related to multi-tenancy, data isolation, and high-performance inference at a massive scale, taking end-to-end ownership across the full product lifecycle

What we offer

Comprehensive health, dental and vision coverage
retirement benefits
a learning and development stipend
generous PTO
equity based compensation
additional benefits such as a commuter stipend

Fulltime

Staff ML Infrastructure Engineer

We are seeking a Staff / Principal ML Infrastructure Engineer to lead the design...

Location

United States , San Francisco

Salary:

Not provided

Darwin Recruitment GmbH

Expiration Date

Until further notice

Requirements

7+ years of software engineering experience, including 3+ years building production ML systems
Deep experience with distributed training and inference frameworks (e.g., PyTorch, JAX, TensorFlow)
Familiarity with model serving technologies and orchestration (e.g., Triton, Ray, Kubernetes)
Strong understanding of GPU/TPU infrastructure, performance optimization, and scalability challenges
Proven experience solving reliability, latency, and cost trade-offs in production ML systems
Excellent collaboration, communication, and problem-solving skills

Job Responsibility

Design, implement, and maintain high-performance infrastructure for training and serving LLMs
Optimize model pipelines for efficiency, latency, and cost at scale
Collaborate with ML researchers, platform engineers, and product teams to deploy models safely into production
Build monitoring, alerting, and tooling to ensure reliability and observability of large-scale ML systems
Evaluate and integrate new frameworks, tools, and architectures to improve ML workflows
Provide technical leadership and mentorship to other engineers on the team

What we offer

Flexible work arrangements and competitive compensation

Fulltime

Staff ML Infrastructure Engineer

We are seeking a Staff / Principal ML Infrastructure Engineer to lead the design...

Location

United States , New York

Salary:

Not provided

Darwin Recruitment GmbH

Expiration Date

Until further notice

Requirements

7+ years of software engineering experience
3+ years building production ML systems
Deep experience with distributed training and inference frameworks (e.g., PyTorch, JAX, TensorFlow)
Familiarity with model serving technologies and orchestration (e.g., Triton, Ray, Kubernetes)
Strong understanding of GPU/TPU infrastructure, performance optimization, and scalability challenges
Proven experience solving reliability, latency, and cost trade-offs in production ML systems
Excellent collaboration, communication, and problem-solving skills

Job Responsibility

Design, implement, and maintain high-performance infrastructure for training and serving LLMs
Optimize model pipelines for efficiency, latency, and cost at scale
Collaborate with ML researchers, platform engineers, and product teams to deploy models safely into production
Build monitoring, alerting, and tooling to ensure reliability and observability of large-scale ML systems
Evaluate and integrate new frameworks, tools, and architectures to improve ML workflows
Provide technical leadership and mentorship to other engineers on the team

What we offer

Flexible work arrangements
competitive compensation

Fulltime

New

Staff Software Engineer - AI Applications

Vanilla is seeking a Staff Software Engineer - AI Applications with a strong bac...

Location

United States

Salary:

190000.00 - 210000.00 USD / Year

Vanilla Technologies

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, Engineering, a related field, or equivalent practical experience
8+ years relevant work experience
Proficiency in modern programming languages such as Python or Javascript
Experience with OpenAI, Anthropic, or similar for both chat and API interfaces
Deep understanding of machine learning and AI technologies, including the ability to design, train, and implement machine learning models and use natural language processing techniques for automation
Production experience with scalability and best-practices of AI infrastructure
Must have experience with AI observability, monitoring, and signaling using tools like LangChain or LangGraph
Hands-on experience using RAG and chunking to tune LLM performance
Experienced with LLM orchestration tooling and decision frameworks
Experience or exposure building agentic capabilities and workflows

Job Responsibility

Machine learning and AI: You are passionate and knowledgeable about the current and future state of AI
You will be utilizing existing Large Language Models to build applied AI applications focused on producing high accuracy rates. Your software engineer skills will come into play here as you'll take ownership in constructing services to ingest results
You will work with product, and engineering teams and build models/services that can ingest data, extract key information and surface insights
You can build tooling to support model training, evaluation, inference serving, monitoring and alerting
You want to use the latest ML frameworks and open source tools to develop new model training pipelines
Hands On Coding: You have direct experience with software engineering and are familiar with modern languages like Python, Javascript, Go, Rust
You have experience building microservices and understand the tradeoffs of the approach
Data handling: You can identify, extract, transform, and load data from disparate sources into a centralized system. You are able to normalize, cleanse, and validate this data
Database management: You are able to design and implement schemas, optimize queries, and manage database performance
Project management: You must be an effective self-organizer: prioritize tasks, manage resources, and communicate effectively with non-technical stakeholders

What we offer

Flexible paid time off policy and 10 company-wide paid holidays
Parental leave, 4 weeks for all full-time employees and up to 12 weeks for birthing parents
Medical, dental, and vision benefits coverage for employees and their families
401K eligibility after one month of employment
Free estate planning documents
Budget for learning & development and home office setup
Paid parking or transit for hybrid and in office employees

Fulltime

Staff Software Engineer, Inference Infrastructure

Cohere

Location:

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Additional Information:

Job Posted:
February 20, 2026

Looking for more opportunities? Search for other job offers that match your skills and interests.

Similar Jobs for Staff Software Engineer, Inference Infrastructure

Senior Staff Machine Learning Engineer

Member of Technical Staff, Cloud Infrastructure

Staff Machine Learning Engineer

Staff Product Security Engineer

Staff Infrastructure Software Engineer, Enterprise AI

Staff ML Infrastructure Engineer

Staff ML Infrastructure Engineer

Staff Software Engineer - AI Applications

Staff Software Engineer, Inference Infrastructure

Cohere

Location:

Category:IT - Software Development

Contract Type:Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Additional Information:

Job Posted:February 20, 2026

Looking for more opportunities? Search for other job offers that match your skills and interests.

Similar Jobs for Staff Software Engineer, Inference Infrastructure

Senior Staff Machine Learning Engineer

Member of Technical Staff, Cloud Infrastructure

Staff Machine Learning Engineer

Staff Product Security Engineer

Staff Infrastructure Software Engineer, Enterprise AI

Staff ML Infrastructure Engineer

Staff ML Infrastructure Engineer

Staff Software Engineer - AI Applications

Category:
IT - Software Development

Contract Type:
Not provided

Job Posted:
February 20, 2026