Research Engineer, Scaling Jobs (On-site work)

2 Job Offers

Research Engineer, Scaling
Join 1X in Palo Alto as a Research Engineer, Scaling. You will design and build production-grade infrastructure for large-scale robot training and efficient inference. Your work optimizing distributed systems and on-device performance will directly impact our fleet. We offer competitive benefits ...
Location
United States, Palo Alto
Salary
180,000.00 - 300,000.00 USD / Year
1X Technologies
Expiration Date
Until further notice
AI Research Engineer, Scaling
Join 1X as an AI Research Engineer, Scaling in Palo Alto. You will design robust infrastructure for large-scale training and inference across our humanoid robot fleet. This role requires expertise in distributed systems (e.g., TorchTitan, TensorRT) and optimizing performance from datacenter to ed...
Location
United States, Palo Alto
Salary
180,000.00 - 300,000.00 USD / Year
1X Technologies
Expiration Date
Until further notice
A Research Engineer, Scaling, is a specialized technical role at the intersection of machine learning research, systems engineering, and high-performance computing. Professionals in these jobs are the architects of scale, responsible for transforming cutting-edge AI prototypes and research models into robust, production-grade systems that operate efficiently at massive scale. Their core mission is to remove computational bottlenecks, ensuring that the primary constraints on progress are data and algorithmic innovation, not hardware limitations. These roles are critical in organizations pushing the boundaries of AI, where the ability to train larger models on bigger datasets and deploy them reliably is a fundamental competitive advantage.

The responsibilities of a Research Engineer, Scaling, are multifaceted. On the training side, they design, build, and maintain distributed training infrastructure that enables seamless large-scale runs spanning hundreds or thousands of accelerators such as GPUs. This involves deep work on fault tolerance, experiment tracking, data pipeline optimization, and frameworks for parallelized training. On the inference side, they optimize model deployment for both datacenter and edge environments, maximizing throughput and minimizing latency through techniques like model quantization, kernel optimization, efficient scheduling, and the use of advanced compilers and serving systems. A unifying thread is a relentless focus on performance: understanding compute architectures, memory hierarchies, and network communication to extract maximum efficiency from every cycle.

The skill set required for these highly technical jobs is demanding and interdisciplinary. A strong foundation in computer science is essential, typically evidenced by an advanced degree, and proficiency in programming languages such as Python and C++ is mandatory.
Candidates must possess a deep, intuitive understanding of distributed systems principles, training scaling laws, and the full stack from algorithmic code to hardware execution. Hands-on experience with distributed training frameworks (e.g., tools in the PyTorch ecosystem such as TorchTitan), inference optimization toolkits (e.g., TensorRT), and performance profiling is standard. Crucially, a Research Engineer, Scaling, must have a mindset geared toward extreme scalability, treating it not as an operational detail but as a foundational enabler of breakthrough AI capabilities. They are the engineers who build the runway from which AI research can take flight, making them pivotal in the most ambitious technology jobs today.
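To make one of the inference-side techniques mentioned above concrete, here is a minimal sketch of symmetric int8 quantization: reducing weight precision to shrink a model and speed up inference. The function names are illustrative only; production systems use toolkits such as TensorRT rather than hand-rolled code like this.

```python
def quantize_int8(weights):
    """Map float weights to int8 using a single symmetric scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    # Round each weight to the nearest representable int8 step.
    return [max(-128, min(127, round(w / scale))) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.51, -1.27, 0.03, 0.89]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored weight is within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, restored))
```

The trade-off a scaling engineer manages here is accuracy versus footprint: int8 storage is 4x smaller than float32, at the cost of rounding error bounded by the scale factor.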
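The training scaling laws mentioned above are typically paired with back-of-envelope compute estimates. A widely used rule of thumb for dense transformers is roughly 6 FLOPs per parameter per token of training (forward plus backward pass). The sketch below uses entirely hypothetical run parameters and an assumed utilization figure:

```python
def training_flops(n_params, n_tokens):
    """Rule-of-thumb training cost for a dense transformer:
    about 6 FLOPs per parameter per token (forward + backward)."""
    return 6 * n_params * n_tokens

def days_on_cluster(total_flops, n_gpus, flops_per_gpu, utilization=0.4):
    """Wall-clock days given peak per-GPU throughput and an assumed
    sustained utilization (40% is a plausible, not universal, figure)."""
    effective_rate = n_gpus * flops_per_gpu * utilization
    return total_flops / effective_rate / 86_400  # seconds per day

# Hypothetical run: 7e9 parameters, 1e12 tokens, 1024 GPUs at 1e15 FLOP/s peak.
cost = training_flops(7e9, 1e12)  # about 4.2e22 FLOPs
print(f"{days_on_cluster(cost, 1024, 1e15):.1f} days")
```

Estimates like this are how scaling engineers decide whether a bottleneck is worth attacking: halving the utilization gap can matter as much as doubling the cluster.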
