Research Engineer, Scaling Job at 1X Technologies (Palo Alto)

AI Research Engineer, Scaling

As a Research Engineer focused on Scaling, you will design and build robust infr...

Location

United States, Palo Alto

Salary:

180000.00 - 300000.00 USD / Year

1X Technologies

Expiration Date

Until further notice

Requirements

Strong programming experience in Python and/or C++
Deep intuitive understanding of training and inference speed bottlenecks and scaling laws
A mindset aligned with extremely high scaling: belief that scale is foundational to enabling humanoid robotics
Degree in Computer Science or a related field
Experience with distributed training frameworks (e.g., TorchTitan, DeepSpeed, FSDP/ZeRO), multi-node debugging, and experiment management
Proven skills in optimizing inference performance using graph compilers, batching/scheduling, and serving systems like TensorRT or equivalents
Familiarity with quantization strategies (PTQ, QAT, INT8/FP8) and tools such as TensorRT and bitsandbytes
Experience developing or tuning CUDA or Triton kernels with understanding of hardware-level optimization (vectorization, tensor cores, memory hierarchies)

Job Responsibility

Own and lead scaling of distributed training and inference systems
Ensure compute resources are optimized to make data the primary constraint
Enable massive training runs (1000+ GPUs) using robot data, with robust fault tolerance, experiment tracking, and distributed operations
Optimize inference throughput for datacenter use cases such as world models and diffusion engines
Reduce latency and enhance performance for on-device robot policies using techniques such as quantization, scheduling, and distillation

What we offer

Equity
Health, dental, and vision insurance
401(k) with company match
Paid time off and holidays

Fulltime

Research Scientist / Engineer – Pre-training / Scaling

At Luma, the Pre-Training / Scaling team is responsible for building the core mu...

Location

United States, Palo Alto

Salary:

187500.00 - 395000.00 USD / Year

Luma AI

Expiration Date

Until further notice

Requirements

Expertise in Python and PyTorch with experience building ML models from scratch
Deep understanding of multimodal generative models and deep learning architectures
(Preferred) Strong research track record in generative AI with published work in top-tier venues
(Preferred) Experience with large-scale distributed training systems

Job Responsibility

Lead cutting-edge research in multimodal foundation models spanning video, image, text, and audio
Design and implement novel algorithms, architectures, and techniques for large-scale generative AI models
Develop training methodologies for foundation models across thousands of GPUs
Research and implement state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and / or Diffusion Models
Collaborate with cross-functional teams to transition research into production systems

Fulltime

Research Engineer / Research Scientist - Foundations Retrieval Lead

The Foundations Research team works on high-risk, high-reward ideas that could s...

Location

United States, San Francisco

Salary:

445000.00 - 555000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

Proven experience leading high-performance teams of researchers or engineers in ML infrastructure or foundational research
Deep technical expertise in representation learning, embedding models, or vector retrieval systems
Familiarity with transformer-based LLMs and how embedding spaces can interact with language model objectives
Research experience in areas such as contrastive learning, supervised or unsupervised embedding learning, or metric learning
A track record of building or scaling large machine learning systems, particularly embedding pipelines in production or research contexts
A first-principles mindset for challenging assumptions about how retrieval and memory should work for large models

Job Responsibility

Lead research into embedding models and retrieval systems optimized for grounding, relevance, and adaptive reasoning
Manage a team of researchers and engineers building end-to-end infrastructure for training, evaluating, and integrating embeddings into frontier models
Drive innovation in dense, sparse, and hybrid representation techniques, metric learning, and learning-to-retrieve systems
Collaborate closely with Pretraining, Inference, and other Research teams to integrate retrieval throughout the model lifecycle
Contribute to OpenAI’s long-term vision of AI systems with memory and knowledge access capabilities rooted in learned representations

What we offer

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible

Fulltime

Research Engineer / Software Engineer (platform/core infrastructure)

Build the future of offensive security with XBOW. Attackers are already using AI...

Location

United States

Salary:

150000.00 - 350000.00 USD / Year

Xbow

Expiration Date

Until further notice

Requirements

Strong experience building and operating scalable, distributed systems on cloud infrastructure such as AWS or similar
Comfortable working with infrastructure as code (e.g., Terraform, CDK)
A track record of performance tuning across cloud services, databases, and compute layers
Eager to learn new tools, languages, and technologies as needed
A thoughtful communicator who values clarity and simplicity and is comfortable working in a fast-paced startup and navigating ambiguity
Strong problem-solving skills and the ability to work with incomplete information
Curious, practical, and eager to work across layers of the stack when needed
You think proactively about failure modes and bring experience implementing disaster recovery and business continuity plans that keep critical systems running

Job Responsibility

Design and implement infrastructure systems that scale reliably and securely and can be deployed across multiple cloud environments (AWS, Azure, OCI, etc.) and contexts (SaaS, on-prem)
Tune and optimize cloud services across compute, storage, networking, and observability to drive performance, reliability and maintainability of core services
Develop our core services, written in TypeScript, Kotlin and Go
Support large-scale systems with event driven architectures
Own problems end-to-end—from design through deployment to production support
Navigate ambiguity and help define how we build as much as what we build
Partner closely with other engineers, AI researchers and Security researchers to enable high-quality, high-velocity product development
Design for resilience by implementing disaster recovery and business continuity strategies that ensure uptime, even when things break
Improve how we build, deploy, and monitor services at scale

What we offer

Competitive salary and a generous equity package
Career Growth: Shape your role, lead the function, and grow with the company
Meaningful Work: You will tackle technically complex challenges and play a pivotal role in the growth of our business

Fulltime

Research Engineer / Software Engineer (backend)

Build the future of offensive security with XBOW. Attackers are already using AI...

Location

United States

Salary:

150000.00 - 350000.00 USD / Year

Xbow

Expiration Date

Until further notice

Requirements

Experience building and operating scalable, distributed systems
Comfort working in a fast-moving, early-stage environment
Strong problem-solving skills and the ability to work with incomplete information
Familiarity with AWS or similar cloud platforms
Comfort working with infrastructure as code (e.g., Terraform or CDK)
Eager to learn new tools, languages, and technologies as needed
A thoughtful communicator who values clarity and simplicity

Job Responsibility

Design and build distributed backend systems that scale reliably and securely
Work in TypeScript, Kotlin and Go
Deploy and operate services in AWS and other cloud providers
Own problems end-to-end—from design through deployment to production support
Navigate ambiguity and help define how we build as much as what we build
Collaborate closely with teammates across the stack, including AI researchers, Security researchers and frontend engineers

What we offer

Competitive salary and a generous equity package
Career growth
Meaningful work
Remote work with support to travel to collaborate with colleagues in person

Fulltime

Machine Learning Research Scientist / Research Engineer, Post-Training

Scale works with the industry’s leading AI labs to provide high quality data and...

Location

United States, San Francisco; Seattle; New York

Salary:

252000.00 - 315000.00 USD / Year

Scale

Expiration Date

Until further notice

Requirements

Ph.D. or Master's degree in Computer Science, Machine Learning, AI, or a related field
Deep understanding of deep learning, reinforcement learning, and large-scale model fine-tuning
Experience with post-training techniques such as RLHF, preference modeling, or instruction tuning
Excellent written and verbal communication skills
Published research in areas of machine learning at major conferences (NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, etc.) and/or journals
Previous experience in a customer facing role

Job Responsibility

Research and develop novel post-training techniques, including SFT, RLHF, and reward modeling, to enhance LLM core capabilities in both text and multimodal modalities
Design and experiment with new approaches to preference optimization
Analyze model behavior, identify weaknesses, and propose solutions for bias mitigation and model robustness
Publish research findings in top-tier AI conferences

What we offer

Comprehensive health, dental, and vision coverage
Retirement benefits
A learning and development stipend
Generous PTO
Equity-based compensation
Commuter stipend

Fulltime

Research Engineer, Text Data Research - MSL FAIR

Meta is seeking AI research engineers to help us build the data foundation for M...

Location

United States, Menlo Park

Salary:

257000.00 USD / Year

Research Engineer, Media Data Research - MSL FAIR

Meta is seeking AI research engineers to help us build the data foundation for M...

Location

United States, Menlo Park

Salary:

217000.00 USD / Year
