Member of Technical Staff, Model Efficiency

Cohere

Location:
United States; Canada; France; South Korea; United Kingdom, New York

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Our team is a fast-growing group of researchers and engineers focused on building reliable ML systems and pushing the boundaries of LLM inference efficiency. We develop techniques that improve how models execute in production, driving lower latency, higher throughput, and consistent quality across diverse workloads.

Job Responsibility:

  • Work across the inference stack to improve core performance metrics by diving deep into model execution, identifying bottlenecks, and developing innovative optimizations
  • Collaborate closely with modeling and systems teams to experiment, measure, and ship improvements that meaningfully accelerate inference
  • Build expertise in advanced performance techniques, including GPU/CUDA optimizations, kernel-level improvements, and model execution strategies for MoE and large-scale architectures

Requirements:

  • 5+ years of experience writing high-performance, production-quality code
  • Strong programming skills in C++ or Python (Rust/Go also welcome)
  • Experience working with large language models and familiarity with the LLM inference ecosystem (e.g., vLLM, SGLang, etc.)
  • Ability to diagnose and resolve performance bottlenecks across the model execution stack
  • A strong bias for action — you ship fast, measure impact, and iterate

Nice to have:

  • GPU programming, CUDA, or low-level systems optimization
  • Language modeling with transformers (MoE, speculative decoding, KV-cache optimizations)
  • Scaling performance-critical distributed systems (e.g., computation, search, storage)

What we offer:

  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)

Additional Information:

Job Posted:
February 20, 2026

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for Member of Technical Staff, Model Efficiency

Member of Technical Staff – Model Training

At Inflection AI, our public benefit mission is to harness the power of AI to im...
Location:
United States, Palo Alto
Salary:
175000.00 - 350000.00 USD / Year
Inflection AI
Expiration Date:
Until further notice
Requirements:
  • Have hands-on experience training and fine-tuning large transformer models on multi-GPU / multi-node clusters
  • Are fluent in PyTorch and its ecosystem tools (Torchtune, FSDP, DeepSpeed) and enjoy digging into distributed-training internals, mixed precision, and memory-efficiency tricks
  • Have shipped or published work in RLHF, DPO, GRPO, or RLAIF and understand their practical trade-offs
  • Care deeply about training tools, pipelines, and reproducibility—you automate the boring parts so you can iterate on the fun parts
  • Balance research curiosity with product pragmatism—you know when to run an ablation and when to ship
  • Communicate crisply with both technical and non-technical teammates
  • Have a bachelor’s degree or equivalent in a field related to the position
Job Responsibility:
  • Contribute to end-to-end post-training workflows—dataset curation, hyper-parameter search, evaluation, and rollout—using PyTorch, Torchtune, FSDP/DeepSpeed, and our internal orchestration stack
  • Prototype and compare alignment techniques (e.g., curriculum RL, multi-objective reward modeling, tool-use fine-tuning) and push the best ideas into production
  • Automate training at scale: build robust pipeline components, tools, scripts, and dashboards so experiments are reproducible and easy to trace
  • Define the metrics that matter; run A/B tests and iterate quickly to meet aggressive quality targets
  • Collaborate with inference, safety, and product teams to land improvements in customer-facing systems
What we offer:
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Competitive stock options

Member of Technical Staff, Performance Optimization

We're looking for a Software Engineer focused on Performance Optimization to hel...
Location:
United States, San Mateo
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience
  • 5+ years of experience working on performance optimization or high-performance computing systems
  • Proficiency in CUDA or ROCm and experience with GPU profiling tools (e.g., Nsight, nvprof, CUPTI)
  • Familiarity with PyTorch and performance-critical model execution
  • Experience with distributed system debugging and optimization in multi-GPU environments
  • Deep understanding of GPU architecture, parallel programming models, and compute kernels
Job Responsibility:
  • Optimize system and GPU performance for high-throughput AI workloads across training and inference
  • Analyze and improve latency, throughput, memory usage, and compute efficiency
  • Profile system performance to detect and resolve GPU- and kernel-level bottlenecks
  • Implement low-level optimizations using CUDA, Triton, and other performance tooling
  • Drive improvements in execution speed and resource utilization for large-scale model workloads (LLMs, VLMs, and video models)
  • Collaborate with ML researchers to co-design and tune model architectures for hardware efficiency
  • Improve support for mixed precision, quantization, and model graph optimization
  • Build and maintain performance benchmarking and monitoring infrastructure
  • Scale inference and training systems across multi-GPU, multi-node environments
  • Evaluate and integrate optimizations for emerging hardware accelerators and specialized runtimes
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
Employment Type:
Fulltime

Member of Technical Staff, Cloud Infrastructure

As a Software Engineer on our Cloud Infrastructure team, you'll be at the forefr...
Location:
United States, New York, NY; San Mateo, CA; Redwood City, CA
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 5+ years of experience designing and building backend infrastructure in cloud environments (e.g., AWS, GCP, Azure)
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, TensorFlow, Vertex AI, SageMaker, Kubernetes, etc.)
  • Strong software development skills in languages such as Python or C++
  • Deep understanding of distributed systems fundamentals: scheduling, orchestration, storage, networking, and compute optimization
Job Responsibility:
  • Architect and build scalable, resilient, and high-performance backend infrastructure to support distributed training, inference, and data processing pipelines
  • Lead technical design discussions, mentor other engineers, and establish best practices for building and operating large-scale ML infrastructure
  • Design and implement core backend services (e.g., job schedulers, resource managers, autoscalers, model serving layers) with a focus on efficiency and low latency
  • Drive infrastructure optimization initiatives, including compute cost reduction, storage lifecycle management, and network performance tuning
  • Collaborate cross-functionally with ML, DevOps, and product teams to translate research and product needs into robust infrastructure solutions
  • Continuously evaluate and integrate cloud-native and open-source technologies (e.g., Kubernetes, Ray, Kubeflow, MLFlow) to enhance our platform’s capabilities and reliability
  • Own end-to-end systems from design to deployment and observability, with a strong emphasis on reliability, fault tolerance, and operational excellence
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
Employment Type:
Fulltime

Member of Technical Staff, AI Training Infrastructure

As a Training Infrastructure Engineer, you'll design, build, and optimize the in...
Location:
United States, San Mateo
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience
  • 3+ years of experience with distributed systems and ML infrastructure
  • Experience with PyTorch
  • Proficiency in cloud platforms (AWS, GCP, Azure)
  • Experience with containerization and orchestration (Docker, Kubernetes)
  • Knowledge of distributed training techniques (data parallelism, model parallelism, FSDP)
Job Responsibility:
  • Design and implement scalable infrastructure for large-scale model training workloads
  • Develop and maintain distributed training pipelines for LLMs and multimodal models
  • Optimize training performance across multiple GPUs, nodes, and data centers
  • Implement monitoring, logging, and debugging tools for training operations
  • Architect and maintain data storage solutions for large-scale training datasets
  • Automate infrastructure provisioning, scaling, and orchestration for model training
  • Collaborate with researchers to implement and optimize training methodologies
  • Analyze and improve efficiency, scalability, and cost-effectiveness of training systems
  • Troubleshoot complex performance issues in distributed training environments
What we offer:
  • Meaningful equity in a fast-growing startup
  • Comprehensive benefits package
Employment Type:
Fulltime

Member of Technical Staff, Research

As a Member of Technical Staff on the Research team, you’ll push the boundaries ...
Location:
United States, San Mateo
Salary:
175000.00 - 240000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Research background in Artificial Intelligence, Machine Learning, Physics, or similar field
  • Experience solving analytical problems using quantitative approaches
  • Experience communicating research to audiences with different backgrounds
  • Experience coding in C/C++, Python, or other similar languages
Job Responsibility:
  • Conduct foundational research to advance the capabilities, efficiency, and reliability of LLMs and multimodal systems
  • Design, implement, and evaluate novel model architectures, training methods, and optimization techniques
  • Collaborate with engineering teams to transition research prototypes into production-grade systems
  • Analyze empirical results, identify performance bottlenecks, and iterate quickly to improve model quality
  • Contribute to internal research strategy by identifying high-impact opportunities and emerging trends in AI
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
Employment Type:
Fulltime

Staff Software Engineer, Ops Efficiency

The Ops Efficiency team focuses on driving operational excellence by automating ...
Location:
Singapore, Singapore
Salary:
Not provided
Airwallex
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science or related fields
  • At least 5 years of software engineering experience
  • Strong ability to quickly understand complex business processes and propose effective technical solutions
  • Excellent organizational and communication skills, with proven ability to collaborate across different teams and stakeholders
  • Solid software engineering skills, with experience in backend development using Kotlin or Java
  • Hands-on experience in building scalable, robust automation or workflow systems
  • Self-driven and proactive in identifying problems and driving improvements
Job Responsibility:
  • Design, build, and maintain scalable automation tools and systems that improve internal operational efficiency and customer onboarding experience
  • Develop and implement AI and data-driven solutions to predict user risks and prevent fraudulent/malicious activities
  • Collaborate with stakeholders (product managers, operations, compliance, and other engineering teams) to define requirements and deliver impactful solutions
  • Participate in architectural discussions and code reviews, contributing to high-quality engineering standards
  • Continuously improve system reliability, performance, and scalability
  • Serve as a technical role model for junior team members and support their technical growth when needed
Employment Type:
Fulltime

Member of Technical Staff - Responsible AI

Core AI is at the forefront of Microsoft’s mission to reinvent how software is b...
Location:
United States, New York
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check
  • Demonstrated track record of high-impact innovation, open-source contributions or publications
  • Experience working on safety, alignment, or Responsible AI
  • 6+ years of technical engineering experience designing and delivering highly available, large-scale cloud services
  • 4+ years of technical engineering experience with machine learning models
  • Ability to navigate the company, and influence and inspire peers in engineering and broad product development
Job Responsibility:
  • Prototype, build, and iterate on new Responsible AI tools and capabilities that make it simple for developers to build and deploy AI responsibly at scale
  • Apply subject-matter expertise in cross-product features, collaborating with appropriate stakeholders to drive project plans, release plans, and deliverables across multiple groups
  • Proactively seek out new knowledge and adapt to new trends, technical solutions, and patterns that improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale, and share knowledge with other engineers
  • Work effectively in a fast-paced product environment where rapid prototyping, tight feedback loops, and iterative learning are core to how we build
Employment Type:
Fulltime

SFCC Technical Lead

The SFCC Technical Lead is primarily responsible for producing quality, on-budge...
Location:
Not provided
Salary:
Not provided
Grinteq
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience with software development
  • 5+ years of experience with Salesforce Commerce Cloud (formerly Demandware), ideally in a leadership role on at least 2 full-scale projects
  • Strong knowledge and experience with integrations to back-end systems, in particular other systems in the Salesforce landscape
  • Ability to come up with accurate development estimates based on high-level business and/or technical requirements
  • Excellent knowledge of design patterns, OOP, coding standards, algorithm performance & optimization
  • Good understanding of data structures, JavaScript, RESTful JSON, browser-based DOM manipulation
  • Extensive experience with debugging, reuse, source code management strategies (e.g., forking, branching), and release management
  • Knowledge of interactions with enterprise 3PL solutions (ERP, CRM, OMS, PIM) using web services & jobs
  • Experience with production launch readiness and cloud-based deployment models
  • Excellent knowledge of performance optimization techniques
Job Responsibility:
  • Work with client’s IT organization to establish technology strategy at an application level
  • Facilitate group discussions and lead client requirement activities; translate user requirements into functional specifications for development teams
  • Establish high, mid and micro level plans and set technical direction for a small team; lead the estimation effort for projects; work to identify and manage risk and control scope
  • Strong knowledge and expertise regarding SFCC Platform gained in direct interaction with our projects
  • Lead teams of 2-5 members to deliver to the highest quality, exceeding customer expectations
  • Work closely with a local team to create high quality e-commerce sites built on the SFCC platform
  • Analyze client business needs and recommend innovative solutions that leverage technology to provide market differentiation, efficiency improvements, and better user experiences
What we offer:
  • A decent salary level which allows you to think about our mutual success and not about tomorrow
  • Flexible working hours. You create your own schedule
  • Possibility to work remotely. You prefer home office or traveling around? Easy, that's exactly how we operate