Research Engineer, Core ML

Together AI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

200000.00 - 280000.00 USD / Year

Job Description:

The Core ML (Turbo) team sits at the intersection of efficient inference and post‑training / RL systems, building and operating the systems behind Together's API. This is a research engineering role with direct production impact: you will translate new RL algorithms, scheduling methods, and inference optimizations into production-grade systems that power Together's API. Success means shipping measurable improvements in latency, throughput, cost, and model quality at scale.

Job Responsibility:

  • Advance inference efficiency end‑to‑end
  • Design and prototype algorithms, architectures, and scheduling strategies for low‑latency, high‑throughput inference
  • Implement and maintain changes in high‑performance inference engines
  • Profile and optimize performance across GPU, networking, and memory layers
  • Unify inference with RL / post‑training
  • Design and operate RL and post‑training pipelines
  • Make RL and post‑training workloads more efficient with inference‑aware training loops
  • Co‑design algorithms and infrastructure
  • Run ablations and scale‑up experiments to understand trade‑offs
  • Own critical systems at production scale
  • Profile, debug, and optimize inference and post-training services under real production workloads
  • Drive roadmap items that require real engine modification
  • Establish metrics, benchmarks, and experimentation frameworks
  • Provide technical leadership (Staff level)
  • Set technical direction for cross‑team efforts
  • Mentor other engineers and researchers

Requirements:

  • 3+ years of experience working on ML systems, large‑scale model training, inference, or adjacent areas (or equivalent experience via research / open source)
  • Advanced degree in Computer Science, EE, or a related field, or equivalent practical experience
  • Demonstrated experience owning complex technical projects end‑to‑end
  • Strong expertise in at least one of the following:
      • Large‑scale inference systems (e.g., SGLang, vLLM, FasterTransformer, TensorRT, custom engines, or similar), GPU performance, or distributed serving
      • RL / post‑training for LLMs or large models (e.g., GRPO, RLHF/RLAIF, DPO‑like methods, reward modeling)
      • Model architecture design for Transformers or other large neural nets
      • Distributed systems / high‑performance computing for ML
  • Strong coding ability in Python
  • Experience profiling and optimizing performance across GPU, networking, and memory layers
  • Track record of impactful work in ML systems, RL, or large‑scale model training (papers, open‑source projects, or production systems)

Nice to have:

  • Bias toward implementation and shipping
  • Comfortable working from algorithms to engines
  • Able to take a new sampling method, scheduler, or RL update and turn it into a production‑grade implementation
  • Solid research foundation in your area(s) of depth
  • Can read new RL / post‑training papers, understand their implications on the stack, and design minimal, correct changes
  • Operate well as a full‑stack problem solver
  • Enjoy collaborating with infra, research, and product teams

What we offer:
  • Startup equity
  • Health insurance
  • Competitive benefits

Additional Information:

Job Posted:
March 10, 2026

Employment Type:
Full-time
Work Type:
Hybrid work


Similar Jobs for Research Engineer, Core ML

AI Researcher, Core ML

As an AI Researcher, you will be pushing the frontier of foundation model resear...
Location: United States, San Francisco
Salary: 160000.00 - 230000.00 USD / Year
Together AI
Expiration Date: Until further notice
Requirements:
  • Strong background in Machine Learning
  • Experience in building state-of-the-art models at large scale
  • Experience in developing algorithms in areas such as optimization, model architecture, and data-centric optimizations
  • Passion for contributing to the open model ecosystem and pushing the frontier of open models
  • Excellent problem-solving and analytical skills
  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field
Job Responsibility:
  • Develop novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that significantly improve on the state of the art
  • Take advantage of the computational infrastructure of Together to create the best open models in their class
  • Understand and improve the full lifecycle of building open models
  • Release and publish your insights (blogs, academic papers, etc.)
  • Collaborate with cross-functional teams to deploy your models and make them available to a wider community and customer base
  • Stay up-to-date with the latest advancements in machine learning
What we offer:
  • Competitive compensation
  • Startup equity
  • Health insurance
  • Other competitive benefits
Employment Type: Full-time

Staff / Principal Machine Learning Engineer

Our intelligent runtime must seamlessly connect to foundational models - whether...
Location: United States, Mountain View
Salary: 240000.00 - 385000.00 USD / Year
Inworld AI
Expiration Date: Until further notice
Requirements:
  • A PhD in a relevant technical field, or a BA/BS degree with equivalent research and/or engineering experience
  • 5+ years of combined experience in software development (Python, C++) and applied ML engineering
  • Demonstrated experience applying or researching ML in domains such as natural language processing, speech processing, and/or action planning
  • Strong foundation in data structures, algorithms, and neural network architectures
  • Proficiency with ML frameworks such as PyTorch
Job Responsibility:
  • Experiment with and implement cutting-edge ML models and techniques to advance our core AI capabilities
  • Train, evaluate, and optimize production-scale models and systems, focusing on quality, latency, cost, and on-device constraints
  • Collaborate with product and backend teams to translate novel ideas and research findings into robust, production-ready solutions
What we offer:
  • Bonus
  • Equity
  • Benefits
  • Relocation assistance
Employment Type: Full-time

AI Content Engineer

Join us and help shape the future of AI by architecting next-generation knowledg...
Location: United States, San Francisco
Salary: Not provided
LlamaIndex
Expiration Date: Until further notice
Requirements:
  • Experience in software engineering (ML engineering + research a bonus)
  • Strong software engineering fundamentals with production Python experience
  • Understanding of modern ML techniques, particularly in computer vision, NLP, or multimodal learning
  • Demonstrated ability to write clearly, quickly, and authentically about technical topics
  • Bias toward shipping - comfortable publishing at blog pace, not paper pace
  • Ability to read, understand, and synthesize research papers rapidly
  • Scrappy and self-directed - can identify what's worth writing about and execute end-to-end
  • Track record of high-velocity output in fast-paced environments
Job Responsibility:
  • Design, build, and maintain comprehensive benchmarks for document parsing and understanding
  • Publish high-quality technical content at a weekly cadence (blog posts, benchmark reports, technical comparisons, tutorials)
  • Stay deeply current with the document AI landscape - new models, papers, competitors, techniques
  • Run experiments and translate findings into publishable artifacts quickly
  • Produce technical analyses that demonstrate our capabilities against alternatives
  • Contribute to open-source examples, notebooks, and documentation
  • Collaborate with the core ML team to surface improvements and capabilities worth highlighting
  • Engage authentically with the developer community through technical content (not conferences/events)
What we offer:
  • Shape the Narrative: Your content will define how developers think about document understanding. You'll have direct influence on market perception
  • Technical Credibility: Work with cutting-edge document AI systems processing millions of documents. Your benchmarks and analyses will be grounded in real capabilities
  • High Autonomy: Significant freedom to identify what matters and publish quickly. No lengthy approval chains
  • Growth Opportunity: Help build this function from the ground up as we scale
Employment Type: Full-time

Member of Technical Staff - ML Research Engineer, Multi-Modal - Audio

Our Audio team is building frontier speech-language models that handle STT, TTS,...
Location: United States, San Francisco, Boston
Salary: Not provided
Liquid AI
Expiration Date: Until further notice
Requirements:
  • Strong programming fundamentals with demonstrated ability to write clean, maintainable, production-grade code
  • Experience building and shipping production ML systems beyond model training (data pipelines, evals, serving infrastructure)
  • Proficiency in PyTorch and familiarity with distributed training frameworks (DeepSpeed, FSDP, or similar)
  • Track record of collaborating effectively in shared codebases with high engineering standards
Job Responsibility:
  • Build and scale data pipelines for audio model training, including preprocessing, augmentation, and quality filtering at scale
  • Design, implement, and maintain evaluation systems that measure multimodal performance across internal and public benchmarks
  • Fine-tune and adapt audio models for customer-specific use cases, owning delivery from requirements through deployment
  • Contribute production code to the core audio repository, collaborating with infrastructure and research teams
  • Support experimentation under real hardware constraints, shifting between customer work and core development as priorities evolve
What we offer:
  • Competitive base salary with equity in a unicorn-stage company
  • We pay 100% of medical, dental, and vision premiums for employees and dependents
  • 401(k) matching up to 4% of base pay
  • Unlimited PTO plus company-wide Refill Days throughout the year
Employment Type: Full-time

Tech Lead Manager - Behaviour Learning for Embodied AI

The Science organisation at Wayve advances foundational research in embodied AI ...
Location: United Kingdom, London
Salary: Not provided
Wayve
Expiration Date: Until further notice
Requirements:
  • Years of experience in applied ML/AI roles with strong hands-on contributions
  • Demonstrated track record of impactful technical work in one or more of: multimodal learning, reinforcement learning, generative models, latent action modelling, optimisation, or planning
  • Experience building large-scale ML infrastructure and working with high-dimensional temporal data (e.g., video, multi-sensor inputs)
  • Deep understanding of the end-to-end lifecycle of ML research and deployment
  • Strong Python and PyTorch engineering fundamentals, with experience developing research-grade, production-oriented tools
  • Proven ability to shape technical strategy and lead architectural design for ML systems
  • Publications at top-tier ML conferences such as NeurIPS, ICML, CoRL or ICLR
  • Clear and thoughtful communicator, capable of influencing technical direction and mentoring others without formal reporting lines
Job Responsibility:
  • Architect the future – Design and evolve models for efficient, robust, and adaptable autonomy, setting a high technical bar for quality and innovation
  • Accelerate research impact – Partner with team members to test, scale, and productionise research ideas - from architecture design to data strategy. Provide technical guidance and feedback on research design, implementation, and evaluation. Implement scalable, high-throughput training pipelines for models with temporal context and develop and evaluate novel data sampling strategies to accelerate training and generalisation
  • Get hands-on when it matters – Lead from the front by contributing directly to key system components, codebases, and experiments, especially during high-leverage moments. Contribute directly as an IC on core research and development tasks (~60-70% of time)
  • Disrupt thoughtfully – Challenge assumptions, ask sharp questions, and champion bold ideas that push us beyond incremental gains and toward breakthrough advances
  • Make things happen – Lead a high-performing, cross-functional team of applied scientists and ML engineers working across ML, RL, representation learning, planning, among many more. Work closely with the team manager to drive quarterly planning and execution of research-engineering initiatives, enabling rapid iteration and delivery in high-ambiguity environments. Translate ambiguity into action and ensure technical progress tracks with our mission
  • Champion change – Lead through ambiguity. Balance structure and adaptability to help your team navigate evolving priorities, novel research, and complex organisational change

Senior ML Infrastructure Engineer, Inference Platform

About the Team: The ML Inference Platform is part of the AV ML Infrastructure or...
Location: United States, Austin, Texas; Mountain View, California; Sunnyvale, California
Salary: 155420.00 - 395900.00 USD / Year
General Motors
Expiration Date: Until further notice
Requirements:
  • 5+ years of industry experience, with focus on machine learning systems or high performance backend services
  • Expertise in either Python, C++ or other relevant coding languages
  • Expertise in ML inference and model serving frameworks (Triton, Ray Serve, vLLM, etc.)
  • Strong communication skills and a proven ability to drive cross-functional initiatives
  • Ability to thrive in a dynamic, multi-tasking environment with ever-evolving priorities
Job Responsibility:
  • Design and implement core platform backend software components
  • Collaborate with ML engineers and researchers to understand critical workflows, parse them to platform requirements, and deliver incremental value
  • Lead technical decision-making on model serving strategies, orchestration, caching, model versioning, and auto-scaling mechanisms for highly optimized use of accelerators
  • Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization of inference services
  • Proactively research and integrate state-of-the-art model serving frameworks, hardware accelerators, and distributed computing techniques
  • Lead technical initiatives across GM’s ML ecosystem
  • Raise the engineering bar through technical leadership, establishing best practices
  • Contribute to open-source projects and represent GM in relevant communities
What we offer:
  • medical
  • dental
  • vision
  • Health Savings Account
  • Flexible Spending Accounts
  • retirement savings plan
  • sickness and accident benefits
  • life insurance
  • paid vacation & holidays
  • tuition assistance programs
Employment Type: Full-time

Data Engineer

As a Data Engineer, you’ll build and refine the pipelines, data models, and serv...
Location: United States, Redmond
Salary: 155000.00 - 175000.00 USD / Year
2A Consulting
Expiration Date: Until further notice
Requirements:
  • Proven ability to design and build end-to-end data systems, from ingestion through cleaning, structuring, storage, and serving
  • Experience building and shipping data products that deliver practical value
  • Demonstrated impact using AI models in data workflows (applied use, not ML research)
  • 5+ years of software or data engineering experience, including at least 2 years of hands-on work with data pipelines
  • Comfortable defining architecture and starting systems from scratch, working independently in a small cross-functional team
  • Proficiency in Python, SQL, or similar languages used in data engineering workflows
Job Responsibility:
  • Build and maintain core data pipelines
  • Build and maintain end-to-end ingestion pipelines for documents, datasets, code repositories, videos, transcripts, and internal knowledge sources
  • Clean, normalize, structure, and store data in formats that support both web applications and AI-driven use cases
  • Use “out of the box” Microsoft tools—such as Fabric, Azure services, Cosmos DB, or Copilot Studio—to create reliable, maintainable systems
  • Enrich and model research data
  • Use AI models to transform unstructured content into structured metadata and durable knowledge assets
  • Design the architecture and foundational data systems, establishing the patterns and infrastructure for a new, scalable environment
  • Develop and refine embeddings, vector indexes, and retrieval components to support semantic search and grounding scenarios
  • Build backend and data services
  • Build data services, APIs, and backend components that power internal applications and agent-supported workflows
What we offer:
  • Flexible time-off plan
  • 100% employer-paid medical, dental, and vision insurance
  • Employer-paid life insurance for those enrolled in medical coverage
  • 401(k) plan with company match
  • Fertility, surrogacy, and adoption benefits
  • Fitness and caregiver benefits
  • Employee Assistance Program
  • 100% employer-paid short- and long-term disability coverage
Employment Type: Full-time

AI/ML Infrastructure Engineer

Zensors is the spatial intelligence platform for the physical world. Our AI plat...
Location: United States, San Francisco
Salary: 150000.00 - 240000.00 USD / Year
Helpcare AI
Expiration Date: Until further notice
Requirements:
  • BS/MS or Ph.D. in Computer Science, Electrical Engineering, or a related discipline
  • Strong programming skills in C/C++ and Python
  • Experience with model optimization, quantization, and efficient deep learning techniques (e.g., knowledge distillation, pruning)
  • Deep understanding of GPU hardware performance, including execution models, thread hierarchy, memory/cache management, and the cost/performance trade-offs of video processing
  • Experience with profiling and benchmarking tools (e.g., Nsight Systems, Nsight Compute) to validate performance on complex architectures
  • Experience identifying and resolving compute and data flow bottlenecks, particularly in high-bandwidth video processing pipelines
  • Strong communication skills and the ability to work cross-functionally between research and infrastructure teams
Job Responsibility:
  • Optimizing Core ML Pipelines: Identifying key bottlenecks in our current video analytics pipeline and performing in-depth analysis to ensure the best possible performance on current server and edge compute architectures
  • Cross-Stack Collaboration: Collaborating closely with AI research and platform engineering teams to optimize core parallel algorithms and influence the design of our next-generation inference infrastructure
  • Model Acceleration: Applying advanced model optimization techniques—such as quantization (Int8/FP16), pruning, and layer fusion—to our Vision Transformers (ViTs) and CNNs to maximize throughput and minimize latency
  • Building Efficient Operators: Working across the entire ML framework/compiler stack (e.g., PyTorch, CUDA, TensorRT, and NVIDIA DeepStream) to write custom optimized ML operator libraries
  • Resource Efficiency: Reducing the compute cost per video stream to enable massive scalability of our SaaS product
  • Data Management: Building, improving, maintaining, and operating systems to facilitate the collection, labeling, and use of visual data for ML training
Employment Type: Full-time