Research Engineer, Core ML

Location: United States, San Francisco
Salary: 200000.00 - 280000.00 USD / Year
Job Posted: March 10, 2026

Job Description

This is a research engineering role with direct production impact. You will translate new RL algorithms, scheduling methods, and inference optimizations into production-grade systems that power Together’s API. Success means shipping measurable improvements in latency, throughput, cost, and model quality at scale. The Core ML (Turbo) team sits at the intersection of efficient inference and post‑training / RL systems, building and operating the systems behind Together’s API.

Job Responsibility

  • Advance inference efficiency end‑to‑end
  • Design and prototype algorithms, architectures, and scheduling strategies for low‑latency, high‑throughput inference
  • Implement and maintain changes in high‑performance inference engines
  • Profile and optimize performance across GPU, networking, and memory layers
  • Unify inference with RL / post‑training
  • Design and operate RL and post‑training pipelines
  • Make RL and post‑training workloads more efficient with inference‑aware training loops
  • Co‑design algorithms and infrastructure
  • Run ablations and scale‑up experiments to understand trade‑offs
  • Own critical systems at production scale
  • Profile, debug, and optimize inference and post-training services under real production workloads
  • Drive roadmap items that require real engine modification
  • Establish metrics, benchmarks, and experimentation frameworks
  • Provide technical leadership (Staff level)
  • Set technical direction for cross‑team efforts
  • Mentor other engineers and researchers

Requirements

  • 3+ years of experience working on ML systems, large‑scale model training, inference, or adjacent areas (or equivalent experience via research / open source)
  • Advanced degree in Computer Science, EE, or a related field, or equivalent practical experience
  • Demonstrated experience owning complex technical projects end‑to‑end
  • Strong expertise in at least one of the following:
      • Large‑scale inference systems (e.g., SGLang, vLLM, FasterTransformer, TensorRT, custom engines, or similar), GPU performance, distributed serving
      • RL / post‑training for LLMs or large models (e.g., GRPO, RLHF/RLAIF, DPO‑like methods, reward modeling)
      • Model architecture design for Transformers or other large neural nets
      • Distributed systems / high‑performance computing for ML
  • Strong coding ability in Python
  • Experience profiling and optimizing performance across GPU, networking, and memory layers
  • Track record of impactful work in ML systems, RL, or large‑scale model training (papers, open‑source projects, or production systems)

Nice to have

  • Bias toward implementation and shipping
  • Comfortable working from algorithms to engines
  • Able to take a new sampling method, scheduler, or RL update and turn it into a production‑grade implementation
  • Solid research foundation in your area(s) of depth
  • Can read new RL / post‑training papers, understand their implications on the stack, and design minimal, correct changes
  • Operate well as a full‑stack problem solver
  • Enjoy collaborating with infra, research, and product teams

What we offer

  • Startup equity
  • Health insurance
  • Competitive benefits

Similar Jobs for Research Engineer, Core ML

8 matching positions

Member of Technical Staff - ML Research Engineer, Multi-Modal - Audio

Our Audio team is building frontier speech-language models that handle STT, TTS,...
Location: United States, San Francisco, Boston
Salary: Not provided
Company: Liquid AI
Expiration Date: Until further notice
Requirements
  • Strong programming fundamentals with demonstrated ability to write clean, maintainable, production-grade code
  • Experience building and shipping production ML systems beyond model training (data pipelines, evals, serving infrastructure)
  • Proficiency in PyTorch and familiarity with distributed training frameworks (DeepSpeed, FSDP, or similar)
  • Track record of collaborating effectively in shared codebases with high engineering standards
Job Responsibility
  • Build and scale data pipelines for audio model training, including preprocessing, augmentation, and quality filtering at scale
  • Design, implement, and maintain evaluation systems that measure multimodal performance across internal and public benchmarks
  • Fine-tune and adapt audio models for customer-specific use cases, owning delivery from requirements through deployment
  • Contribute production code to the core audio repository, collaborating with infrastructure and research teams
  • Support experimentation under real hardware constraints, shifting between customer work and core development as priorities evolve
What we offer
  • Competitive base salary with equity in a unicorn-stage company
  • We pay 100% of medical, dental, and vision premiums for employees and dependents
  • 401(k) matching up to 4% of base pay
  • Unlimited PTO plus company-wide Refill Days throughout the year
  • Full-time

AI Research Engineer, FB Media Core - Video

80+% of all internet traffic is video and images. The Media Core - Video team is...
Location: United States, Bellevue
Salary: 181000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
  • 1+ year of industry experience as a researcher/specialist in computer vision, neural compression, or related AI/ML domain
  • Extensive experience working with video and/or image models
  • Programming experience in Python and hands-on experience with frameworks like PyTorch or Spark
Job Responsibility
  • Use expertise in image/video models to accelerate the frontiers of AI-based video compression and enhancement
  • Design and implement ML-based quality metrics and metadata for traditional and novel audio/video processing use cases, pre/post processing and GenAI videos
  • Build and optimize smart composition (ML, CV, and AI) and rendering for Calling, Creators (Edits) and native/Ads videos
  • Create and refine intelligent algorithms to improve video quality for calling and video conferencing
  • Drive cross-functional impact across the end-to-end stack to maximize multimedia quality, from ingest to delivery
  • Push the state of the art in video enhancement (super-resolution, restoration, SDR-HDR, frame-rate conversion, etc.)
What we offer
  • bonus
  • equity
  • benefits

AI/ML Research Engineer

Build the systems that expand human capability. At Blackrock Neurotech, we've sp...
Location: United States, Salt Lake City
Salary: Not provided
Company: Blackrock Neurotech
Expiration Date: Until further notice
Requirements
  • 5+ years of hands-on experience building and training deep learning models, or a PhD in Machine Learning, Computer Science, Computational Neuroscience, or related field with applied industry experience
  • Strong experience with PyTorch (or similar modern ML frameworks) and fluency in Python
  • Solid software engineering practices including version control, testing, code review, and reproducibility
  • Experience designing model architectures and understanding training dynamics, optimization, and compute tradeoffs at scale
  • Ability to design clean experiments, analyze results rigorously, and make data-driven decisions
  • Comfortable working in ambiguous, research-oriented environments with imperfect or evolving datasets
  • Strong written and verbal communication skills across technical and non-technical stakeholders
  • Demonstrated ownership, follow-through, and intellectual honesty in problem solving
Job Responsibility
  • Own substantial pieces of our core modeling work end-to-end, from preparing and curating large neural datasets to designing and running training experiments to analyzing results and turning findings into the next round of model improvements
  • Write and review model and pipeline code, launch and monitor training runs, debug issues that surface at scale, and analyze results to understand not just whether a model works but why
  • Shape initiatives spanning dataset curation, training infrastructure, model architecture, and evaluation methodology, with room to lead specific experimental threads as you build context
  • Full-time

Staff ML Engineer, Inference Platform

The ML Inference Platform is part of the AI Compute Platforms organization withi...
Location: United States, Sunnyvale
Salary: 185500.00 - 270000.00 USD / Year
Company: General Motors
Expiration Date: Until further notice
Requirements
  • 8+ years of industry experience, with focus on machine learning systems or high performance backend services
  • Expertise in Go, Python, C++, or another relevant programming language
  • Expertise in ML inference and model serving frameworks (Triton, Ray Serve, vLLM, etc.)
  • Strong communication skills and a proven ability to drive cross-functional initiatives
  • Experience working with cloud platforms such as GCP, Azure, or AWS
  • Ability to thrive in a dynamic, multi-tasking environment with ever-evolving priorities
Job Responsibility
  • Design and implement core platform backend software components
  • Collaborate with ML engineers and researchers to understand critical workflows, parse them to platform requirements, and deliver incremental value
  • Lead technical decision-making on model serving strategies, orchestration, caching, model versioning, and auto-scaling mechanisms
  • Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization of inference services
  • Proactively research and integrate state-of-the-art model serving frameworks, hardware accelerators, and distributed computing techniques
  • Lead large-scale technical initiatives across GM’s ML ecosystem
  • Raise the engineering bar through technical leadership, establishing best practices
  • Contribute to open source projects and represent GM in relevant communities
What we offer
  • medical
  • dental
  • vision
  • Health Savings Account
  • Flexible Spending Accounts
  • retirement savings plan
  • sickness and accident benefits
  • life insurance
  • paid vacation & holidays
  • tuition assistance programs
  • Full-time

AI Research Engineer

AI Research Engineers at Hex partner with product teams to build industry-leadin...
Location: United States, San Francisco or New York
Salary: 214000.00 - 285000.00 USD / Year
Company: Hex
Expiration Date: Until further notice
Requirements
  • Experience getting AI/ML capabilities into production and serving real users
  • Understanding of core MLOps/SW Architecture concepts for modern ML-based applications
  • Comfortable working in both Python & JS/TS
  • Experimentalist mindset
  • Interest in the data space, and a love of shipping great products and building tools that empower end users to do more
  • Experience maintaining a high quality bar for design, correctness, and testing
Job Responsibility
  • Building features and experiences from 0 to 1
  • Partnering on determining the architecture and stack for our AI-enabled capabilities
  • Shipping product experiences that fundamentally change the way that Data Scientists and Analysts operate
  • Working at the cutting edge of production AI applications
  • Driving Hex's context engine and pushing forward the capabilities of our Notebook Agent
What we offer
  • Market-benched salary & equity
  • Comprehensive health benefits
  • Flexible paid time off
  • Full-time

AI Research Engineer, Search and Context

AI Research Engineers at Hex partner with product teams to build industry-leadin...
Location: United States, SF, NYC, or Remote
Salary: 225000.00 - 285000.00 USD / Year
Company: Hex
Expiration Date: Until further notice
Requirements
  • Experience building and measuring high quality search and recommendation systems
  • Experience getting AI/ML capabilities into production and serving real users
  • A lot of enthusiasm for applications of AI to real business problems
  • Understanding of core MLOps/SW Architecture concepts for modern ML-based applications
  • Comfortable working in both Python & JS/TS
  • Experimentalist mindset
  • Interest in the data space, and a love of shipping great products and building tools that empower end users to do more
  • Experience maintaining a high quality bar for design, correctness, and testing
Job Responsibility
  • Experimenting with new agentic techniques for search, discovery, and context management
  • Designing and implementing the architecture for our scalable search and indexing pipelines
  • Working at the cutting edge of production AI applications deployed to real customers
What we offer
  • Market-benched salary & equity
  • Comprehensive health benefits
  • Flexible paid time off
  • Full-time

ML Engineer, Training Infrastructure

You’ll take on challenging engineering tasks crucial to the development of tabul...
Location: Germany; United States, Berlin; Freiburg; New York; San Francisco
Salary: Not provided
Company: Prior Labs
Expiration Date: Until further notice
Requirements
  • Exceptional software engineering fundamentals and expert-level Python proficiency, with 5+ years of hands-on industry experience building and operating production systems
  • Proven track record of designing and building complex, scalable software, preferably for data processing or distributed systems
  • Deep, practical knowledge of the modern ML ecosystem (PyTorch, scikit-learn, etc.) and a genuine interest in applying systems thinking to solve hard problems in AI
  • Core MLOps Concepts: Strong understanding of the entire machine learning lifecycle (MLLC) from data ingestion and preparation to model deployment, monitoring, and retraining. Familiarity with MLOps principles and best practices (e.g., reproducibility, versioning, automation, continuous integration/delivery for ML)
Job Responsibility
  • Training & research compute infrastructure: Own our cloud GPU cluster (operations, reliability, and cost/performance) currently based on Slurm. Design and implement future versions as our compute needs scale and we expand across multiple cloud/HPC providers
  • Training & inference performance: Work closely with researchers to identify and resolve performance bottlenecks in distributed training and inference. Support high hardware utilization and efficient memory usage through systems-level debugging, profiling, and infrastructure improvements
  • Developer productivity: Manage our internal repositories on GitHub and keep their CI and other pipelines speedy. Ensure our experiment tracking, model registry, data processing pipelines are working smoothly
  • Try out your own ideas! We operate an open environment. If you’ve got the next SOTA tabular architecture up your sleeve, go ahead and train it
What we offer
  • Competitive compensation package with meaningful equity
  • 30 days of paid vacation + public holidays
  • Comprehensive benefits including healthcare, transportation, and fitness
  • Work with state-of-the-art ML architecture, substantial compute resources and with a world-class team
  • Full-time

Research Engineer

As a Research Scientist on our team, you will work on real production use cases ...
Location: United States, New York
Salary: 170000.00 - 300000.00 USD / Year
Company: Tandem
Expiration Date: Until further notice
Requirements
  • Strong programming skills and general Computer Science knowledge
  • Strong research background in ML/NLP, demonstrated through publications in top-tier conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP) or significant open-source contributions
  • Experience working on complex ML problems (e.g., data-efficient learning, reasoning agents, or multi-step workflows) and deploying those solutions in production environments
  • Deep understanding of modern ML methods, including transformer architectures, attention mechanisms, reinforcement learning, and multimodal models, with proficiency in deep learning frameworks such as PyTorch, TensorFlow, or JAX
  • Strong written and verbal communication that allows you to be an effective participant in both internal debates and external relationships
  • Track record of moving quickly, finding shortcuts, and going to unreasonable lengths to deliver on goals
  • High NPS with your former teammates
Job Responsibility
  • Scope and spearhead AI augmentation and automation projects across our product surface area, including: Unintuitive classifications, Data extraction and summarization, Precise content generation, Reference-based search and question answering, Process outcome prediction, Probabilistic triggering of workflows, and Multimodal LLM-powered bots
  • Stay on top of emerging AI methods and guide decisions around which models and techniques to adopt, including evaluating when to use open-source models, proprietary models, and custom fine-tuning approaches
  • Establish research strategies for various AI methods, including rigorous experimentation and evaluation protocols that account for accuracy, consistency, interpretability, and real-world impact
  • Develop novel algorithms and techniques to address core research problems in natural language processing, data extraction, and autonomous reasoning (e.g., few-shot learning, agentic reasoning, and multi-modal interaction)
  • Participate actively in client engagements, working directly with customers to understand requirements and deliver innovative solutions
  • Work closely with the rest of our team and CEO to make business decisions as we balance speed of growth and long-term profitability
What we offer
  • Fully covered medical, vision, and dental insurance
  • Memberships for One Medical, Talkspace, Teladoc, and Kindbody
  • Unlimited paid time off (PTO) and 16 weeks of parental leave
  • 401K plan setup, FSA option, commuter benefits, and DashPass
  • Lunch at the office every day and dinner at the office after 7 pm
  • Offers Equity
  • Full-time