
Research Engineer, Language Model Pre-Training


Zyphra


Location:
Palo Alto, United States


Contract Type:
Not provided


Salary:
Not provided

Job Description:

As a Research Engineer, Language Model Pre-training, you will shape our language model roadmap through end-to-end pretraining development. You will work closely with our pretraining team, who will integrate your insights into our next-generation models.

Job Responsibilities:

  • Shape our language model roadmap through end-to-end pretraining development
  • Large-scale training runs and model parallelization
  • Performance optimization of our pretraining stack
  • Dataset collection, processing, and evaluation
  • Architecture and methodology research, including optimizer ablations

Requirements:

  • Strong engineering aptitude for rapidly implementing reliable and robust systems
  • Ability to rapidly learn new fields and enthusiasm for implementing new ideas
  • Excellent communication and collaboration skills, with the ability to work effectively on both research and engineering implementation at scale
  • Deep expertise and intuition for solving machine learning problems and training models
  • Experience with training on large-scale (multi-node) GPU clusters
  • Deep understanding of model training pipelines – including model/data parallelism, distributed optimizers, etc.
  • Strong grasp of proper experimental methodology for running rigorous ablations and other hypothesis testing
  • Understanding of large-scale, highly parallel data processing pipelines
  • High proficiency with PyTorch and Python
  • Strong ability to dive into large pre-existing codebases and rapidly get up to speed
  • Postgraduate degree in a scientific subject (Computer Science, EE/EECS, Math, Physics)

Nice to have:

Published machine learning research in well-respected venues is a plus

What we offer:
  • Comprehensive medical, dental, vision, and FSA plans
  • Competitive compensation and 401(k)
  • Relocation and immigration support on a case-by-case basis
  • On-site meals prepared by a dedicated culinary team
  • Thursday Happy Hours
  • In-person team in Palo Alto, CA, with a collaborative, high-energy environment

Additional Information:

Job Posted:
January 13, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Research Engineer, Language Model Pre-Training

Research Engineer, VLA Models

As a Research Engineer, Vision-Language Action (VLA) Models, you will train the ...
Location:
Palo Alto, United States
Salary:
180000.00 - 300000.00 USD / Year
1X Technologies
Expiration Date:
Until further notice
Requirements:
  • Strong programming experience in Python (and familiarity with tools like Bazel)
  • Experience with frameworks like PyTorch
  • Experience with simulation environments (e.g., Isaac Sim, MuJoCo)
  • Deep understanding of how autonomous systems generalize to new environments
  • Experience designing evaluation metrics and validating models in real or simulated settings
  • Ability to coordinate with cross‑functional teams (controls, QA, data) to bring models into production
Job Responsibilities:
  • Take extreme ownership over autonomous capabilities: reviewing data, designing model architectures, shipping models, and maintaining performance across the fleet
  • Train NEO for whole‑body manipulation and navigation tasks in unseen environments
  • Design robust evaluation metrics to support scaling of model pre‑training
  • Experiment with state‑of‑the‑art techniques from vision–language models and generative model literature to predict actions
  • Collaborate with controls, QA, and data collection teams to deploy reinforcement learning policies to the production fleet
What we offer:
  • Health, dental, and vision insurance
  • 401(k) with company match
  • Paid time off and holidays
  • Full-time

AI Research Engineer, VLA Models

As a Research Engineer on the Vision-Language Action (VLA) team, you will be res...
Location:
Palo Alto, United States
Salary:
180000.00 - 300000.00 USD / Year
1X Technologies
Expiration Date:
Until further notice
Requirements:
  • Strong programming skills in Python and familiarity with build systems like Bazel
  • Experience using deep learning frameworks such as PyTorch
  • Proficiency in simulation environments like Isaac Sim or MuJoCo
  • Deep understanding of generalization in autonomous systems
  • Experience designing and validating evaluation metrics in real or simulated environments
  • Ability to work cross-functionally with controls, QA, and data teams to operationalize models
Job Responsibilities:
  • Take end-to-end ownership of autonomous capability development: data review, model design, deployment, and fleet performance monitoring
  • Train NEO to perform whole-body manipulation and navigation tasks in unfamiliar environments
  • Design robust evaluation metrics to support scalable model pre-training
  • Experiment with cutting-edge vision-language and generative model techniques to predict robot actions
  • Collaborate with controls, QA, and data teams to deploy reinforcement learning policies to the production fleet
What we offer:
  • Equity
  • Health, dental, and vision insurance
  • 401(k) with company match
  • Paid time off and holidays
  • Full-time

Distinguished Applied Researcher

At Capital One, we are creating trustworthy and reliable AI systems, changing ba...
Location:
United States: McLean; San Francisco; New York; Cambridge; San Jose
Salary:
278400.00 - 381300.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements:
  • PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 4 years of experience in Applied Research or M.S. in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 6 years of experience in Applied Research
  • PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
  • LLM
  • PhD focus on NLP or Masters with 10 years of industrial NLP research experience
  • Core contributor to team that has trained a large language model from scratch (10B + parameters, 500B+ tokens) or through continued pre-training, post training pipeline for alignment and reasoning, LLM optimizations, complex reasoning with multi-agentic LLMs
  • Numerous publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
  • Has worked on an LLM (open source or commercial) that is currently available for use
  • Demonstrated ability to guide the technical direction of a large-scale model training team
  • Experience with common training optimization frameworks (DeepSpeed, NeMo)
Job Responsibilities:
  • Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money
  • Leverage a broad stack of technologies — PyTorch, AWS UltraClusters, Hugging Face, Lightning, VectorDBs, and more — to reveal the insights hidden within huge volumes of numeric and textual data
  • Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
  • Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences
  • Flex your interpersonal skills to translate the complexity of your work into tangible business goals
What we offer:
  • Comprehensive, competitive, and inclusive set of health, financial, and other benefits that support your total well-being
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • Full-time

Applied Researcher II

At Capital One, we are creating trustworthy and reliable AI systems, changing ba...
Location:
United States: New York; San Francisco; San Jose; Cambridge; McLean
Salary:
262500.00 - 326800.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields, with an exception that required degree will be obtained on or before the scheduled start date plus 2 years of experience in Applied Research or M.S. in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 4 years of experience in Applied Research
  • PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
  • LLM
  • PhD focus on NLP or Masters with 5 years of industrial NLP research experience
  • Multiple publications on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
  • Member of team that has trained a large language model from scratch (10B + parameters, 500B+ tokens)
  • Publications in deep learning theory
  • Publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR
  • Optimization (Training & Inference)
  • PhD focused on topics related to optimizing training of very large deep learning models
Job Responsibilities:
  • Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products
  • Leverage a broad stack of technologies — PyTorch, AWS UltraClusters, Hugging Face, Lightning, VectorDBs, and more — to reveal the insights hidden within huge volumes of numeric and textual data
  • Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
  • Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences
  • Flex your interpersonal skills to translate the complexity of your work into tangible business goals
What we offer:
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial, and other benefits that support your total well-being
  • Full-time

Applied Researcher II

At Capital One, we are creating trustworthy and reliable AI systems, changing ba...
Location:
United States: McLean; San Francisco; New York; San Jose; Cambridge
Salary:
262500.00 - 326800.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields, with an exception that required degree will be obtained on or before the scheduled start date plus 2 years of experience in Applied Research or M.S. in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 4 years of experience in Applied Research
  • PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
  • LLM
  • PhD focus on NLP or Masters with 5 years of industrial NLP research experience
  • Multiple publications on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
  • Member of team that has trained a large language model from scratch (10B + parameters, 500B+ tokens)
  • Publications in deep learning theory
  • Publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR
  • PhD focused on topics related to optimizing training of very large deep learning models
  • Multiple years of experience and/or publications on one of the following topics: Model Sparsification, Quantization, Training Parallelism/Partitioning Design, Gradient Checkpointing, Model Compression
Job Responsibilities:
  • Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products
  • Leverage a broad stack of technologies — PyTorch, AWS UltraClusters, Hugging Face, Lightning, VectorDBs, and more — to reveal insights hidden within huge volumes of numeric and textual data
  • Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
  • Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences
  • Flex your interpersonal skills to translate the complexity of your work into tangible business goals
What we offer:
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial, and other benefits that support your total well-being
  • Full-time

Applied Researcher II (AI Foundations)

At Capital One, we are creating trustworthy and reliable AI systems, changing ba...
Location:
United States: New York; San Francisco; San Jose; Cambridge; McLean
Salary:
262500.00 - 326800.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields, with an exception that required degree will be obtained on or before the scheduled start date plus 2 years of experience in Applied Research or M.S. in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 4 years of experience in Applied Research
  • PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
  • PhD focus on NLP or Masters with 5 years of industrial NLP research experience
  • Multiple publications on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
  • Member of team that has trained a large language model from scratch (10B + parameters, 500B+ tokens)
  • Publications in deep learning theory
  • Publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR
  • PhD focused on topics related to optimizing training of very large deep learning models
  • Multiple years of experience and/or publications on one of the following topics: Model Sparsification, Quantization, Training Parallelism/Partitioning Design, Gradient Checkpointing, Model Compression
  • Experience optimizing training for a 10B+ model
Job Responsibilities:
  • Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products
  • Leverage a broad stack of technologies — PyTorch, AWS UltraClusters, Hugging Face, Lightning, VectorDBs, and more — to reveal insights from data
  • Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
  • Engage in high impact applied research to take the latest AI developments into the next generation of customer experiences
  • Translate the complexity of your work into tangible business goals
What we offer:
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial, and other benefits that support total well-being
  • Full-time

Applied Researcher II (AI Foundations)

At Capital One, we are creating trustworthy and reliable AI systems, changing ba...
Location:
United States: New York; McLean; San Jose; Cambridge
Salary:
262500.00 - 326800.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields, with an exception that required degree will be obtained on or before the scheduled start date plus 2 years of experience in Applied Research or M.S. in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 4 years of experience in Applied Research
  • PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
  • PhD focus on NLP or Masters with 5 years of industrial NLP research experience
  • Multiple publications on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
  • Member of team that has trained a large language model from scratch (10B + parameters, 500B+ tokens)
  • Publications in deep learning theory
  • Publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR
  • PhD focused on topics related to optimizing training of very large deep learning models
  • Multiple years of experience and/or publications on one of the following topics: Model Sparsification, Quantization, Training Parallelism/Partitioning Design, Gradient Checkpointing, Model Compression
  • Experience optimizing training for a 10B+ model
Job Responsibilities:
  • Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money
  • Leverage a broad stack of technologies — PyTorch, AWS UltraClusters, Hugging Face, Lightning, VectorDBs, and more — to reveal the insights hidden within huge volumes of numeric and textual data
  • Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
  • Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences
  • Flex your interpersonal skills to translate the complexity of your work into tangible business goals
What we offer:
  • Comprehensive, competitive, and inclusive set of health, financial, and other benefits that support your total well-being
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • Full-time

Applied Researcher II (AI Foundations)

At Capital One, we are creating trustworthy and reliable AI systems, changing ba...
Location:
United States: New York; San Francisco; San Jose; Cambridge; McLean
Salary:
262500.00 - 326800.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields, with an exception that required degree will be obtained on or before the scheduled start date plus 2 years of experience in Applied Research or M.S. in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 4 years of experience in Applied Research
  • PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
  • PhD focus on NLP or Masters with 5 years of industrial NLP research experience
  • Multiple publications on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
  • Member of team that has trained a large language model from scratch (10B + parameters, 500B+ tokens)
  • Publications in deep learning theory
  • Publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR
  • PhD focused on topics related to optimizing training of very large deep learning models
  • Multiple years of experience and/or publications on one of the following topics: Model Sparsification, Quantization, Training Parallelism/Partitioning Design, Gradient Checkpointing, Model Compression
  • Experience optimizing training for a 10B+ model
Job Responsibilities:
  • Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money
  • Leverage a broad stack of technologies — PyTorch, AWS UltraClusters, Hugging Face, Lightning, VectorDBs, and more — to reveal the insights hidden within huge volumes of numeric and textual data
  • Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
  • Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences
  • Flex your interpersonal skills to translate the complexity of your work into tangible business goals
What we offer:
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial, and other benefits that support your total well-being
  • Full-time