Senior Research Engineer - Inference ML

Cerebras Systems

Location:
United States; Canada, Sunnyvale

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras, to deploy 750 megawatts of scale, transforming key workloads with ultra high-speed inference.

Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

Job Responsibility:

  • Design, implement, and optimize state-of-the-art transformer architectures for NLP and computer vision on Cerebras hardware
  • Research and prototype novel inference algorithms and model architectures that exploit the unique capabilities of Cerebras hardware, with emphasis on speculative decoding, pruning/compression, sparse attention, and sparsity
  • Train models to convergence, perform hyperparameter sweeps, and analyze results to inform next steps
  • Bring up new models on the Cerebras system, validate functional correctness, and troubleshoot any integration issues
  • Profile and optimize model code using Cerebras tools to maximize throughput and minimize latency
  • Develop diagnostic tooling or scripts to surface performance bottlenecks and guide optimization strategies for inference workloads
  • Collaborate across teams, including software, hardware, and product, to drive projects from inception through delivery

Requirements:

  • Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Electrical Engineering, or a related technical field AND 7+ years of ML software development experience
  • OR Master’s degree in Computer Science or related technical field AND 4+ years of software development experience
  • OR PhD in Computer Science or related technical field with 2+ years of relevant research or industry experience
  • OR Equivalent practical experience
  • 4+ years of experience testing, maintaining, or launching software products, including 2+ years of experience with software design and architecture
  • 3+ years of experience in software development focused on machine learning (e.g., deep learning, large language models, or computer vision)
  • Strong programming skills in Python and/or C++
  • Experience with Generative AI and Machine Learning systems
  • Evidence of research impact in machine learning, such as publications at top conferences (NeurIPS, ICLR, ICML, ACL, EMNLP, MLSys) or comparable contributions to widely used open-source projects or high-quality preprints

Nice to have:

  • Master’s degree or PhD in Computer Science, Computer Engineering, or a related technical field
  • Experience independently driving complex ML or inference projects from prototype to production-quality implementations
  • Hands-on experience with relevant ML frameworks such as PyTorch, Transformers, vLLM, or SGLang
  • Experience with large language models, mixture-of-experts models, multimodal learning, or AI agents
  • Experience with speculative decoding, neural network pruning and compression, sparse attention, quantization, sparsity, post-training techniques, and inference-focused evaluations
  • Familiarity with large-scale model training and deployment, including performance and cost trade-offs in production systems
  • Triton/CUDA experience is a big plus

What we offer:

  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open source your cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • Work in a simple, non-corporate culture that respects individual beliefs

Additional Information:

Job Posted:
February 17, 2026

Work Type:
Hybrid work

Similar Jobs for Senior Research Engineer - Inference ML

Senior Research Engineer

We are seeking a highly skilled Senior Research Engineer to collaborate closely ...
Location:
United States
Salary:
210000.00 - 309000.00 USD / Year
Assembly
Expiration Date:
Until further notice
Requirements:
  • Strong expertise in the Python ecosystem and major ML frameworks (PyTorch, JAX)
  • Experience with lower-level programming (C++ or Rust preferred)
  • Deep understanding of GPU acceleration (CUDA, profiling, kernel-level optimization)
  • TPU experience is a strong plus
  • Proven ability to accelerate deep learning workloads using compiler frameworks, graph optimizations, and parallelization strategies
  • Solid understanding of the deep learning lifecycle: model design, large-scale training, data processing pipelines, and inference deployment
  • Strong debugging, profiling, and optimization skills in large-scale distributed environments
  • Excellent communication and collaboration skills, with the ability to clearly prioritize and articulate impact-driven technical solutions
Job Responsibility:
  • Investigate and mitigate performance bottlenecks in large-scale distributed training and inference systems
  • Develop and implement both low-level (operator/kernel) and high-level (system/architecture) optimization strategies
  • Translate research models and prototypes into highly optimized, production-ready inference systems
  • Explore and integrate inference compilers such as TensorRT, ONNX Runtime, AWS Neuron and Inferentia, or similar technologies
  • Design, test, and deploy scalable solutions for parallel and distributed workloads on heterogeneous hardware
  • Facilitate knowledge transfer and bidirectional support between Research and Engineering teams, ensuring alignment of priorities and solutions
What we offer:
  • competitive equity grants
  • 100% employer-paid benefits
  • flexibility of being fully remote
  • Fulltime

Senior Machine Learning Engineer (Health)

WHOOP is an advanced health and fitness wearable, on a mission to unlock human p...
Location:
United States, Boston
Salary:
150000.00 - 210000.00 USD / Year
Whoop
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s Degree in Computer Science, Data Science, Applied Mathematics, or a related field. Master’s preferred
  • 5+ years of professional experience as a Machine Learning Engineer or Software Engineer with focus on ML systems
  • Proven expertise working with time series data (wearable, physiological, or high-frequency sensor data strongly preferred)
  • Experience designing and deploying ML inference systems at scale: both real-time streaming and large-scale batch pipelines
  • Strong coding skills in Python (scientific stack) and SQL, with a track record of writing clean, production-quality code
  • Strong communication skills to collaborate across engineering, research, and product teams
  • Proven experience deploying and maintaining ML systems on cloud platforms (AWS or GCP)
  • Working familiarity with MLOps best practices: model versioning, CI/CD for ML, observability, and monitoring for inference systems
  • Ability to reason about and design for performance trade-offs (latency vs. throughput vs. cost) when building ML inference systems
  • Strong understanding of backend service development (APIs and service reliability) as it applies to serving ML models at scale
Job Responsibility:
  • Create, improve, and maintain production services that provide analysis for health features in collaboration with Data Scientists and MLOps Engineers
  • Collaborate with Data Engineers to improve ML data pipelines, tooling, and validation systems that support robust model performance
  • Work alongside data scientists to translate research prototypes into production ML systems optimized for scale, latency, and cost efficiency
  • Collaborate with researchers and product teams to align model development with health insights and member impact
  • Participate in on-call rotations for data science services, ensuring uptime and performance in production environments
What we offer:
  • equity
  • benefits
  • Fulltime

Senior Machine Learning Engineer, Personalization and Recommendations

As a Senior Machine Learning Engineer on the Personalization & Recommendations t...
Location:
United States, San Francisco
Salary:
183360.00 - 248000.00 USD / Year
EdTech Jobs
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in applied machine learning or ML-heavy software engineering, with a strong focus on personalization, ranking, or recommendation systems
  • Demonstrated impact improving key metrics such as CTR, retention, or engagement through recommender or search systems in production
  • Strong hands-on skills in Python and PyTorch, with expertise in data and feature engineering, distributed training and inference on GPUs, and familiarity with modern MLOps practices — including model registries, feature stores, monitoring, and drift detection
  • Deep understanding of retrieval and ranking architectures, such as Two-Tower models, deep cross networks, Transformers, or MMoE, and the ability to apply them to real-world problems
  • Experience with large-scale embedding models and vector search, including FAISS, ScaNN, or similar systems
  • Proficiency in experiment design and evaluation, connecting offline metrics (AUC, NDCG, calibration) with online A/B test outcomes to drive product decisions
  • Clear, effective communication, collaborating well with product managers, data scientists, engineers, and cross-functional partners
  • A growth and mentorship mindset, helping elevate team quality in modeling, experimentation, and reliability
  • Commitment to responsible and inclusive personalization, ensuring our systems respect learner privacy, fairness, and diverse goals
Job Responsibility:
  • Design and implement personalization models across candidate retrieval, ranking, and post-ranking layers, leveraging user embeddings, contextual signals and content features
  • Develop scalable retrieval and serving systems using architectures such as Two-Tower models, deep ranking networks, and ANN-based vector search for real-time personalization
  • Build and maintain model training, evaluation, and deployment pipelines, ensuring reliability, training–serving consistency, observability, and robust monitoring
  • Partner with Product and Data Science to translate learner objectives (engagement, retention, mastery) into measurable modeling goals and experiment designs
  • Advance evaluation methodologies, contributing to offline metric design (e.g., NDCG, CTR, calibration) and supporting rigorous A/B testing to measure learner and business impact
  • Collaborate with platform and infrastructure teams to optimize distributed training, inference latency, and serving cost in production environments
  • Stay informed on industry and research trends, evaluating opportunities to meaningfully apply them within Quizlet’s ecosystem
  • Mentor junior and mid-level engineers, supporting technical growth, experimentation rigor, and responsible ML practices
  • Champion collaboration, inclusion, curiosity, and data-driven problem solving, contributing to a healthy and productive team culture
What we offer:
  • 20 vacation days
  • Competitive health, dental, and vision insurance (100% employee and 75% dependent PPO, Dental, VSP Choice)
  • Employer-sponsored 401k plan with company match
  • Access to LinkedIn Learning and other resources to support professional growth
  • Paid Family Leave, FSA, HSA, Commuter benefits, and Wellness benefits
  • 40 hours of annual paid time off to participate in volunteer programs of choice
  • Fulltime

Senior Staff Machine Learning Engineer

Join the Affirm team as a Senior Staff Machine Learning Engineer and become a pi...
Location:
United States
Salary:
232000.00 - 310000.00 USD / Year
Affirm
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience researching, designing, deploying, and operating large-scale, real-time machine learning systems
  • Experience leading end-to-end ML system design, from data architecture and feature pipelines to model training, evaluation, and production deployment
  • Proficient in Python and ML frameworks, including PyTorch and XGBoost
  • Strong understanding of representation learning and embedding-based modeling
  • Deep expertise in neural network-based sequence modeling, including architectures such as Transformers, recurrent, or attention-based models, and multi-task learning systems
  • Deep hands-on experience with large-scale distributed ML infrastructure, including streaming or batch data ingestion, feature stores, feature engineering, training pipelines, model serving and inference infrastructure, monitoring, and automated retraining
  • Strong technical leadership: defining long-term strategy, guiding research direction, and aligning work across teams
  • Exceptional judgment, collaboration, and communication skills
  • Strong verbal and written communication skills that support effective collaboration across our global engineering organization
  • Equivalent practical experience or a Bachelor’s degree in a related field
Job Responsibility:
  • Define and drive multi-year, multi-team technical strategy for machine learning across Affirm
  • Lead the design, implementation, and scaling of advanced ML systems
  • Partner deeply with ML Platform, product, engineering, and risk leadership to shape long-term modeling capabilities
  • Provide broad technical leadership across the ML organization, mentoring senior engineers
  • Drive clarity and alignment on ambiguous, high-stakes technical decisions
  • Champion operational and system excellence at the area level
What we offer:
  • Equity rewards
  • Monthly stipends for health, wellness and tech spending
  • 100% subsidized medical coverage, dental and vision for you and your dependents
  • Flexible Spending Wallets - generous stipends for spending on Technology, Food, various Lifestyle needs, and family forming expenses
  • Competitive vacation and holiday schedules
  • Employee stock purchase plan enabling you to buy shares of Affirm at a discount
  • Fulltime

Senior Staff Machine Learning Engineer

Join the Affirm team as a Senior Staff Machine Learning Engineer and become a pi...
Location:
Canada
Salary:
206000.00 - 256000.00 CAD / Year
Affirm
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience researching, designing, deploying, and operating large-scale, real-time machine learning systems
  • Experience leading end-to-end ML system design, from data architecture and feature pipelines to model training, evaluation, and production deployment
  • Proficiency in Python and ML frameworks, including PyTorch and XGBoost
  • Experience with ML tooling for training orchestration, experimentation, and model monitoring, such as Kubeflow, MLflow, or equivalent
  • Strong understanding of representation learning and embedding-based modeling
  • Deep expertise in neural network-based sequence modeling, including architectures such as Transformers, recurrent, or attention-based models, and multi-task learning systems
  • Deep hands-on experience with large-scale distributed ML infrastructure, including streaming or batch data ingestion, feature stores, feature engineering, training pipelines, model serving and inference infrastructure, monitoring, and automated retraining
  • Strong technical leadership: defining long-term strategy, guiding research direction, and aligning work across teams
  • Exceptional judgment, collaboration, and communication skills
  • Strong verbal and written communication skills
Job Responsibility:
  • Define and drive multi-year, multi-team technical strategy for machine learning across Affirm
  • Lead the design, implementation, and scaling of advanced ML systems
  • Partner deeply with ML Platform, product, engineering, and risk leadership to shape long-term modeling capabilities
  • Provide broad technical leadership across the ML organization
  • Drive clarity and alignment on ambiguous, high-stakes technical decisions
  • Champion operational and system excellence at the area level
What we offer:
  • Health care coverage - Affirm covers all premiums for all levels of coverage for you and your dependents
  • Flexible Spending Wallets - generous stipends for spending on Technology, Food, various Lifestyle needs, and family forming expenses
  • Time off - competitive vacation and holiday schedules
  • ESPP - An employee stock purchase plan enabling you to buy shares of Affirm at a discount
  • Fulltime

Senior Machine Learning Infrastructure Engineer

As a Senior ML Infrastructure Engineer at Plus, you will design scalable archite...
Location:
United States, Santa Clara
Salary:
160000.00 - 200000.00 USD / Year
PlusAI
Expiration Date:
Until further notice
Requirements:
  • PhD or MS in Computer Science, Electrical Engineering, or related field
  • Good oral and written communication skills
  • PhD new grad, or Master's with 3+ years of software engineering experience with a focus on ML infrastructure or distributed systems
  • Proficiency in Python, C++, and SQL
  • Deep understanding of containerization, orchestration technologies, distributed ML workloads, and experiment tracking tools (e.g., Docker, Kubernetes, multiprocessing, Kubeflow, and MLflow)
  • Experience deploying and managing resources across multiple cloud platforms (AWS, GCP, or on-prem environments)
  • Proficiency in at least one deep learning framework, such as PyTorch, and in data pipeline tools (e.g., Apache Airflow, Prefect)
  • Strong knowledge of distributed systems, databases, and storage solutions
  • Extensive software design and development skills
  • Ability to learn and adapt to new technologies and contribute in a productive environment
Job Responsibility:
  • Design and develop scalable, high-performance systems for training, inference, deploying, and monitoring ML models at scale
  • Build and maintain efficient data pipelines, model versioning systems, and experiment tracking frameworks
  • Collaborate with cross-functional teams, including ML researchers and engineers, to identify bottlenecks and improve platform usability
  • Implement distributed systems and storage solutions optimized for machine learning workloads
  • Drive improvements in CI/CD workflows for ML models and infrastructure
  • Ensure high availability and reliability of the ML platform by implementing robust monitoring, logging, and alerting systems
  • Stay current with industry trends and integrate relevant tools and frameworks to enhance the platform
  • Mentor junior engineers and contribute to a culture of technical excellence
  • Ensure that your work is performed in accordance with the company’s Quality Management System (QMS) requirements and contribute to continuous improvement efforts
  • Ensure team compliance with QMS, monitor quality, and drive process improvements
What we offer:
  • Work, learn and grow in a highly future-oriented, innovative and dynamic field
  • Wide range of opportunities for personal and professional development
  • Catered free lunch, unlimited snacks and beverages
  • Highly competitive salary and benefits package, including 401(k) plan
  • Fulltime

Principal Engineer - Marketplace

Principal Engineer role in the Marketplace Engineering team to lead breakthrough...
Location:
United States, San Francisco; Sunnyvale
Salary:
302000.00 - 336000.00 USD / Year
Uber
Expiration Date:
Until further notice
Requirements:
  • PhD in Computer Science, Machine Learning, Operations Research, or related quantitative field OR Master’s degree with 12+ years of industry experience
  • 10+ years of experience building and deploying ML models in large-scale production environments
  • Expert-level proficiency in modern ML frameworks (TensorFlow, PyTorch, JAX) and distributed computing platforms (Spark, Ray)
  • Deep expertise across multiple areas including: Deep Learning, Causal Inference, Reinforcement Learning, Multi-objective Optimization, Algorithmic Game Theory, and Large-scale Ads Ranking/Auction Systems
  • Proven track record of leading complex ML projects from research through production with significant measurable business impact
  • Strong programming skills in Python, Java, or Go with experience building production ML systems
  • Experience with feature engineering, model serving, and ML infrastructure at scale (handling millions of predictions per second)
  • Technical leadership experience including mentoring senior engineers and driving cross-team technical initiatives
  • Advanced Deep Learning and Neural Network architectures
  • Scalable ML architecture and distributed model training
Job Responsibility:
  • Lead the design and implementation of advanced ML systems for dynamic pricing algorithms serving millions of drivers across 70+ countries around the world
  • Architect real-time ML infrastructure handling 1M+ pricing decisions per second with sub-50ms latency requirements
  • Drive breakthrough research in causal ML, reinforcement learning, algorithmic game theory, and multi-objective optimization for marketplace optimization with strategic agents
  • Own end-to-end ML model lifecycle from research through production deployment and continuous optimization
  • Develop and enforce best practices in system design, ensuring data integrity, security, and optimal performance
  • Serve as a representative for the Marketplace organization to the broader internal and external technical community
  • Contribute to the engineering brand for Marketplace and serve as a talent magnet to help attract and retain talent for the team
  • Stay abreast of industry trends and emerging technologies in software engineering, focused particularly on ML/AI, to enhance our systems and processes continually
  • Build scalable ML architecture and feature management systems supporting Driver Pricing and broader Marketplace teams
  • Design experimentation frameworks enabling rapid testing of pricing algorithms using A/B, Switchback, Synthetic Control, and other experimental methodologies
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award and other types of compensation
  • Eligible to participate in a 401(k) plan
  • Eligible for various benefits (details at provided link)
  • Fulltime

Senior Staff Machine Learning Engineer - Driver Pricing & Marketplace Optimization

We’re seeking an exceptional Senior Staff ML Engineer to lead breakthrough ML in...
Location:
United States, Sunnyvale, California; San Francisco, California
Salary:
267000.00 - 297000.00 USD / Year
Uber
Expiration Date:
Until further notice
Requirements:
  • PhD in Computer Science, Machine Learning, Operations Research, or related quantitative field OR Master’s degree with 12+ years of industry experience
  • 10+ years of experience building and deploying ML models in large-scale production environments
  • Expert-level proficiency in modern ML frameworks (TensorFlow, PyTorch, JAX) and distributed computing platforms (Spark, Ray)
  • Deep expertise across multiple areas including: Deep Learning, Causal Inference, Reinforcement Learning, Multi-objective Optimization, Algorithmic Game Theory, and Large-scale Ads Ranking/Auction Systems
  • Proven track record of leading complex ML projects from research through production with significant measurable business impact
  • Strong programming skills in Python, Java, or Go with experience building production ML systems
  • Experience with feature engineering, model serving, and ML infrastructure at scale (handling millions of predictions per second)
  • Technical leadership experience including mentoring senior engineers and driving cross-team technical initiatives
Job Responsibility:
  • Lead the design and implementation of advanced ML systems for dynamic pricing algorithms serving millions of drivers across 70+ countries around the world
  • Architect real-time ML infrastructure handling 1M+ pricing decisions per second with sub-50ms latency requirements
  • Drive breakthrough research in causal ML, reinforcement learning, algorithmic game theory, and multi-objective optimization for marketplace optimization with strategic agents
  • Own end-to-end ML model lifecycle from research through production deployment and continuous optimization
  • Build scalable ML architecture and feature management systems supporting Driver Pricing and broader Marketplace teams
  • Design experimentation frameworks enabling rapid testing of pricing algorithms using A/B, Switchback, Synthetic Control, and other experimental methodologies
  • Establish ML engineering best practices, monitoring, and operational excellence across the organization
  • Create platform abstractions that enable other ML engineers to iterate faster on pricing algorithms
  • Partner with Product, Operations, and Earner Experience teams to translate complex business requirements into ML solutions
  • Collaborate with Marketplace Engineering and Science teams to productionize cutting-edge ML research
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award and other types of compensation
  • Eligible for various benefits
  • Fulltime