Member of Technical Staff (RL systems)

Microsoft Corporation

Location:
Switzerland, Zürich

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Microsoft AI is looking for a Member of Technical Staff – Reinforcement Learning Systems to help build the world’s most advanced reinforcement learning systems. We are responsible for designing, developing, and operating the large-scale reinforcement learning systems that power several use cases across the Superintelligence team – from training trustworthy and capable agents and powerful reasoning models to helpful and conversational assistants.

Job Responsibility:

  • Develop and tune scalable pretraining software for NVIDIA GB200 NVL72 CX8 and AMD MIxxx architectures
  • Benchmark GB200 and AMD MIxxx GPU clusters
  • Gather data and insights to develop the pretraining compute roadmap
  • Care deeply about conversational AI and its deployment
  • Actively contribute to the development of AI models that are powering our innovative products
  • Find a path to get things done despite roadblocks to get your work into the hands of users quickly and iteratively
  • Enjoy working in a fast-paced, design-driven, product development cycle
  • Embody our Culture and Values

Requirements:

  • Bachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
  • Master’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
  • Experience with generative AI
  • Experience with distributed computing
  • Experience in leading technical projects and supporting architectural decisions with data

Nice to have:

  • A background in machine learning is preferred but not required
  • Backgrounds in mathematics, competitive programming, and related domains are a plus

Additional Information:

Job Posted:
March 22, 2026

Employment Type:
Fulltime
Work Type:
On-site work

Similar Jobs for Member of Technical Staff (RL systems)

Member of Technical Staff – Model Training

At Inflection AI, our public benefit mission is to harness the power of AI to im...
Location:
United States, Palo Alto
Salary:
175000.00 - 350000.00 USD / Year
Inflection AI
Expiration Date:
Until further notice
Requirements:
  • Have hands-on experience training and fine-tuning large transformer models on multi-GPU / multi-node clusters
  • Are fluent in PyTorch and its ecosystem tools (Torchtune, FSDP, DeepSpeed) and enjoy digging into distributed-training internals, mixed precision, and memory-efficiency tricks
  • Have shipped or published work in RLHF, DPO, GRPO, or RLAIF and understand their practical trade-offs
  • Care deeply about training tools, pipelines, and reproducibility—you automate the boring parts so you can iterate on the fun parts
  • Balance research curiosity with product pragmatism—you know when to run an ablation and when to ship
  • Communicate crisply with both technical and non-technical teammates
  • Have a bachelor’s degree or equivalent in a field related to the position
Job Responsibility:
  • Contribute to end-to-end post-training workflows—dataset curation, hyper-parameter search, evaluation, and rollout—using PyTorch, Torchtune, FSDP/DeepSpeed, and our internal orchestration stack
  • Prototype and compare alignment techniques (e.g., curriculum RL, multi-objective reward modeling, tool-use fine-tuning) and push the best ideas into production
  • Automate training at scale: build robust pipeline components, tools, scripts, and dashboards so experiments are reproducible and easy to trace
  • Define the metrics that matter; run A/B tests and iterate quickly to meet aggressive quality targets
  • Collaborate with inference, safety, and product teams to land improvements in customer-facing systems
What we offer:
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Competitive stock options

Member of technical staff - Research - Agent

About H: H exists to push the boundaries of superintelligence with agentic AI. B...
Location:
France; United Kingdom, Paris; London
Salary:
Not provided
H Company
Expiration Date:
Until further notice
Requirements:
  • Senior Experience: Previous demonstrable role(s) as a Staff, Principal, or Senior Engineer (or equivalent Research Scientist) in a Frontier AI Lab with a proven track record of leading complex, end-to-end AI/ML projects from conception to production
  • Education / Publication: Preferably PhD (or equivalent research experience) in Machine Learning, Computer Science, or a related field, preferably with a strong publication record (e.g., NeurIPS, ICML, ICLR) in Computer Science
  • Core Expertise: Deep theoretical and practical expertise in Agentic AI and proven experience building, scaling, and shipping solutions involving foundation models (LLMs/VLMs)
  • Soft Skills: Collaborative: Enjoys collaboration and thrives in a teamwork-oriented, fast-paced research environment
  • High-Impact Communicator: Possesses impactful communication skills, with the ability to bridge the gap between research and engineering and articulate complex ideas clearly
  • Mission-Driven: Genuinely eager to explore and solve the new engineering and research challenges at the frontier of agentic AI
Job Responsibility:
  • Research & Leadership: Design and develop new agents, proposing new research directions, e.g., combining state-of-the-art RL with foundation models (LLMs/VLMs)
  • Algorithm & Systems Design: Design, implement, and scale complex, high-performance systems for training large-scale agents. This includes both the foundational infrastructure and the novel algorithms, reward models, and sophisticated training environments
  • Research-to-Production: Collaborate closely with researchers and engineers to implement, test, and productionize new agent logics, learning algorithms, and system architectures
  • Evaluation & Reliability: Create, manage, and scale massive benchmarks and evaluation systems to rigorously track agent capabilities. You will own system reliability, scalability, and observability for our entire research infrastructure
  • Mentorship & Standards: Mentor and guide other engineers and researchers on the team, fostering technical excellence. You will establish and enforce engineering standards, tooling, and best practices for both code and research design
  • Innovation: Conduct thorough code and design reviews, champion technical innovation, and proactively address technical debt to accelerate the R&D lifecycle
What we offer:
  • Join the exciting journey of shaping the future of AI, and be part of the early days of one of the hottest AI startups
  • Collaborate with a fun, dynamic, and multicultural team, working alongside world-class AI talent in a highly collaborative environment
  • Enjoy a competitive salary
  • Unlock opportunities for professional growth, continuous learning, and career development
Employment Type:
Fulltime

Member of Technical Staff, Applied Research

The Applied Researcher role is designed for engineers who love working across ML...
Location:
United States, San Mateo
Salary:
Not provided
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • BS/MS in Computer Science, Electrical Engineering, Machine Learning, or a related field, or equivalent practical experience; open to all levels of experience
  • Strong experience with PyTorch and modern Transformer architectures
  • Solid computer science fundamentals: data structures, algorithms, concurrency, distributed systems, networking
  • Hands-on experience training, fine-tuning, or evaluating machine learning models, preferably LLMs
  • Familiarity with recent developments in the LLM research domain, including model architectures, training methods, and evaluation strategies
  • Passion for partnering with customers: understanding their constraints, co-designing solutions, and iterating based on real-world feedback
  • Curiosity and enthusiasm for exploring a wide range of problem domains and project types - from quick experiments to long-running, complex engagements
  • Ability to operate in a fast-paced, ambiguous environment and drive projects independently
Job Responsibility:
  • Sit at the intersection of ML research, systems engineering, and customer-facing problem solving
  • Work hands-on with customers and customer data to tune, evaluate and deploy models using various techniques such as SFT / DPO / RL
  • Help customers build competitive models using their unique data tailored to their unique products
  • Be the technical bridge between customer needs, customer data, and our tuning and serving infrastructure
What we offer:
  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation
Employment Type:
Fulltime

Member of Technical Staff, Next Generation Agents

Agentic LLM systems are being deployed widely across enterprise companies includ...
Salary:
Not provided
Cohere
Expiration Date:
Until further notice
Requirements:
  • Strong software engineering skills
  • Proficiency in Python and some experience with ML-related code (e.g., PyTorch, NumPy)
  • Experience with LLMs and agentic frameworks
  • Experience with post-training LLMs (SFT, PEFT, or RL*)
  • Experience with building synthetic data generation pipelines
Job Responsibility:
  • Design and develop novel agentic solutions
  • Improve upon SOTA on hard agentic tasks
  • Research the next generation of online learning-from-experience self-improvement
  • Work with partner teams (Reasoning, Post-training, Pre-training, etc.) to improve the performance of agentic systems
  • Work with an amazing team of researchers and engineers pushing the boundaries
What we offer:
  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)
Employment Type:
Fulltime

Member of Technical Staff, Agents Modeling

We’re looking for an experienced machine learning researcher / engineer who can ...
Location:
United States, New York
Salary:
Not provided
Cohere
Expiration Date:
Until further notice
Requirements:
  • Have a PhD in computer science or related field or similar industry research experience
  • Strong software engineering skills
  • Proficiency in Python and experience with ML-related code (e.g., PyTorch, NumPy)
  • Experience with LLMs and agentic frameworks
  • Experience with post-training LLMs (SFT, PEFT, or RL*)
  • Experience with building synthetic data generation pipelines
Job Responsibility:
  • Design and develop novel agentic solutions
  • Improve upon SOTA on hard agentic tasks
  • Research the next generation of online learning-from-experience self-improvement
  • Work with partner teams (Reasoning, Post-training, Pre-training, etc.) to improve the performance of agentic systems
  • Work with an amazing team of researchers and engineers pushing the boundaries
What we offer:
  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)
Employment Type:
Fulltime

Member of Technical Staff - Inference

Prime Intellect is building the open superintelligence stack - from frontier age...
Location:
United States, San Francisco
Salary:
Not provided
Prime Intellect
Expiration Date:
Until further notice
Requirements:
  • 3+ years building and running large‑scale ML/LLM services with clear latency/availability SLOs
  • Hands‑on with at least one of vLLM, SGLang, TensorRT‑LLM
  • Familiarity with distributed and disaggregated serving infrastructure such as NVIDIA Dynamo
  • Deep understanding of prefill vs. decode, KV‑cache behavior, batching, sampling, speculative decoding, parallelism strategies
  • Comfortable debugging CUDA/NCCL, drivers/kernels, containers, service mesh/networking, and storage, owning incidents end‑to‑end
  • Python: Systems tooling and backend services
  • PyTorch: LLM Inference engine development and integration, deployment readiness
  • AWS/GCP service experience, cloud deployment patterns
  • Running infrastructure at scale with containers on Kubernetes
  • Architecture, CUDA runtime, NCCL, InfiniBand
Job Responsibility:
  • Build a multi-tenant LLM serving platform that operates across our cloud GPU fleets
  • Design placement and scheduling algorithms for heterogeneous accelerators
  • Implement multi‑region/zone failover and traffic shifting for resilience and cost control
  • Build autoscaling, routing, and load balancing to meet throughput/latency SLOs
  • Optimize model distribution and cold-start times across clusters
  • Integrate and contribute to LLM inference frameworks such as vLLM, SGLang, TensorRT‑LLM
  • Optimize configurations for tensor/pipeline/expert parallelism, prefix caching, memory management and other axes for maximum performance
  • Profile kernels, memory bandwidth and transport; apply techniques such as quantization and speculative decoding
  • Develop reproducible performance suites (latency, throughput, context length, batch size, precision)
What we offer:
  • Competitive compensation with significant equity incentives
  • Flexible work arrangement (remote or San Francisco office)
  • Full visa sponsorship and relocation support
  • Professional development budget
  • Regular team off-sites and conference attendance
  • Opportunity to shape decentralized AI and RL at Prime Intellect
Employment Type:
Fulltime

Air Conditioning Engineer

The role involves the planned and reactive maintenance of air conditioning and H...
Location:
United Kingdom, Irvine
Salary:
32000.00 - 42000.00 GBP / Year
Randstad
Expiration Date:
March 24, 2026
Requirements:
  • Proven experience working with VRV, VRF, and split air conditioning systems
  • Strong background in PPMs, remedials, and fault finding
  • Experience working in a static site environment (ideally within a regulated setting)
  • Full UK driving licence required
  • Relevant HVAC/AC qualifications required
  • F-Gas Category 1 certification
  • NVQ Level 2 or Level 3 in Air Conditioning & Refrigeration (or equivalent)
  • Proven experience working as an Air Conditioning Engineer
  • Strong PPM and fault-finding experience across commercial systems
  • Full UK driving licence
Job Responsibility:
  • Carry out Planned Preventative Maintenance (PPMs) on air conditioning systems
  • Diagnose faults and undertake effective fault finding
  • Complete remedial and reactive maintenance works
  • Maintain and service VRV and VRF systems, including split systems
  • Ensure all works are completed safely and documented accurately
  • Liaise with site management and clients to ensure minimal disruption to operations
Employment Type:
Fulltime

QA Engineer

Location:
India, Ahmedabad
Salary:
Not provided
Randstad
Expiration Date:
April 04, 2026
Requirements:
  • Bachelor’s degree in Engineering, Quality, or related technical field
  • 3–5 years of experience in Quality Assurance / Quality Control in a product manufacturing or health-tech company
  • Hands-on experience with QMS implementation, ISO standards, CAPA, and audits
  • Strong understanding of quality tools such as FMEA, Root Cause Analysis, and 5 Whys
  • Excellent analytical, documentation, and communication skills
  • Attention to detail with a problem-solving mindset
Job Responsibility:
  • Drive Consistent Excellence in Final Product Quality
  • Conduct in-process and final product inspections to ensure product conformance
  • Manage product testing, calibration, and validation processes
  • Investigate product deviations and implement corrective and preventive actions (CAPA)
  • Quality Systems & Process Control
  • Establish, implement, and maintain Quality Management Systems (QMS) in compliance with ISO, CE, FDA, and other regulatory standards
  • Define and monitor quality control checkpoints across the production process
  • Drive continuous improvement in quality and process reliability
  • Supplier Quality Management
  • Evaluate and audit suppliers to ensure compliance with company quality standards