Member of Technical Staff, Synthetic Data

Job Posted February 20, 2026

Job Description

As a Machine Learning Engineer specializing in synthetic data, you will play a pivotal role in developing the synthetic data pipeline that is crucial to Cohere’s advanced language models. Your responsibilities will encompass the end-to-end management of synthetic data, including maintaining and optimizing the synthetic data pipeline, data analysis and generation, as well as conducting data ablations and model evaluation to gauge data quality. You will work with diverse web data and code data and transform them using generative models to improve token efficiency and model quality. By combining research and engineering, you will bridge the gap between raw data and cutting-edge AI models, directly contributing to improvements in critical training metrics like throughput and accelerator utilization.

Job Responsibility

  • Design and build scalable inference pipelines that run on large GPU clusters
  • Conduct data ablations to assess data quality and experiment with data mixtures to enhance model performance
  • Research and implement innovative synthetic data curation methods, leveraging Cohere’s infrastructure to drive advancements in natural language processing
  • Collaborate with cross-functional teams, including researchers and engineers, to ensure data pipelines meet the demands of cutting-edge language models

Requirements

  • Strong software engineering skills, with proficiency in Python and experience building data pipelines
  • Familiarity with data processing frameworks such as Apache Spark, Apache Beam, Pandas, or similar tools
  • Experience working with LLMs through work projects, open-source contributions or personal experimentation
  • Familiarity with LLM inference frameworks such as vLLM and TensorRT
  • Experience working with large-scale datasets, including web data, code data, and multilingual corpora
  • A passion for bridging research and engineering to solve complex data-related challenges in AI model training

Nice to have

Bonus: papers published at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP)

What we offer

  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)

Similar Jobs for Member of Technical Staff, Synthetic Data

8 matching positions

Member of Technical Staff - ML Research Engineer, Data

Our Data team powers Liquid Foundation Models across pre-training, vision, audio...
Location
United States, San Francisco; Boston
Salary
Not provided
Liquid AI
Expiration Date
Until further notice
Requirements
  • Strong Python skills with the ability to quickly comprehend problems and translate them into clean, working code
  • Solid ML fundamentals: experience training, evaluating, and iterating on models (PyTorch preferred)
  • Track record of learning new technical domains quickly
  • 3+ years relevant experience with an M.S., or 1+ year with a Ph.D. (5+ years with a B.S.)
Job Responsibility
  • Build and maintain data processing, filtering, and selection pipelines at scale
  • Create pipelines for pretraining, midtraining, SFT, and preference optimization datasets
  • Design synthetic data generation systems using LLMs, structured prompting, and domain-specific generators
  • Design and run evaluations and ablations to measure each dataset's impact on model performance
  • Monitor public datasets across text, vision, and audio domains
  • Collaborate with pre-training, vision, and audio teams on modality-specific data needs
What we offer
  • Competitive base salary with equity in a unicorn-stage company
  • We pay 100% of medical, dental, and vision premiums for employees and dependents
  • 401(k) matching up to 4% of base pay
  • Unlimited PTO plus company-wide Refill Days throughout the year
  • Fulltime

Member of Technical Staff - Forward Deployed Engineer

You will work directly on customer engagements that generate revenue. This is ha...
Location
United States, San Francisco; Boston
Salary
Not provided
Liquid AI
Expiration Date
Until further notice
Requirements
  • Hands-on fine-tuning experience with modern LLMs (last 12-18 months): LoRA, PEFT, DPO, instruction tuning, or similar
  • Strong ML fundamentals: you understand how models learn, fail, and improve
  • Experience generating or curating training data to address model gaps
  • Autonomous coding and debugging skills in Python and PyTorch
  • Proficiency with open-source ML ecosystem (Hugging Face transformers, datasets, accelerate)
  • Fine-tunes models: You have hands-on experience with techniques like LoRA, PEFT, DPO, instruction tuning, or RLHF. You've written training loops, not just API calls
  • Works with modern architectures: Your experience includes models released in the last 12-18 months (Llama 3.x, Mistral, Gemma, Qwen, etc.), not just BERT or classical ML
  • Generates and curates data: You've created synthetic training data to address specific model failure modes. You understand how data quality drives model performance
  • Debugs methodically: When a model underperforms, you diagnose whether it's a data problem, architecture problem, or training problem, and you fix it
  • Ships to customers: You can translate ambiguous customer requirements into concrete technical specs and deliver against quality metrics
Job Responsibility
  • Fine-tune LFMs on customer data to hit quality and latency targets for on-device and edge deployments
  • Generate and curate training data to address specific model failure modes
  • Run experiments, track metrics, and iterate until customer success criteria are met
  • Translate ambiguous customer requirements into concrete technical specifications
  • Provide analytics to commercial teams for contract structuring and pricing
  • Work across text, vision, and audio modalities as customer needs require
What we offer
  • Competitive base salary with equity in a unicorn-stage company
  • We pay 100% of medical, dental, and vision premiums for employees and dependents
  • 401(k) matching up to 4% of base pay
  • Unlimited PTO plus company-wide Refill Days throughout the year
  • Fulltime

Member of Technical Staff, Next Generation Agents

Agentic LLM systems are being deployed widely across enterprise companies includ...
Salary
Not provided
Cohere
Expiration Date
Until further notice
Requirements
  • Strong software engineering skills
  • Proficiency in Python and some experience with ML-related code (e.g., PyTorch, NumPy)
  • Experience with LLMs and agentic frameworks
  • Experience with post-training LLMs (SFT, PEFT, or RL*)
  • Experience with building synthetic data generation pipelines
Job Responsibility
  • Design and develop novel agentic solutions
  • Improve upon SOTA on hard agentic tasks
  • Research the next generation of online learning-from-experience self-improvement
  • Work with partner teams (Reasoning, Post-training, Pre-training, etc.) to improve the performance of agentic systems
  • Work with an amazing team of researchers and engineers pushing the boundaries
What we offer
  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)
  • Fulltime

Member of Technical Staff, Agents Modeling

We’re looking for an experienced machine learning researcher / engineer who can ...
Location
United States, New York
Salary
Not provided
Cohere
Expiration Date
Until further notice
Requirements
  • Have a PhD in computer science or related field or similar industry research experience
  • Strong software engineering skills
  • Proficiency in Python and experience with ML-related code (e.g., PyTorch, NumPy)
  • Experience with LLMs and agentic frameworks
  • Experience with post-training LLMs (SFT, PEFT, or RL*)
  • Experience with building synthetic data generation pipelines
Job Responsibility
  • Design and develop novel agentic solutions
  • Improve upon SOTA on hard agentic tasks
  • Research the next generation of online learning-from-experience self-improvement
  • Work with partner teams (Reasoning, Post-training, Pre-training, etc.) to improve the performance of agentic systems
  • Work with an amazing team of researchers and engineers pushing the boundaries
What we offer
  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)
  • Fulltime

Member of Technical Staff - ML Engineer / Scientist (JP Localization)

At Liquid, we’re not just building AI models—we’re redefining the architecture o...
Location
Japan, Tokyo
Salary
Not provided
Liquid AI
Expiration Date
Until further notice
Requirements
  • Deep understanding of the Japanese model evaluation landscape and familiarity with Japanese pre-training data sources
  • Experience using modeling and inference tools such as Hugging Face inference, vLLM, and cloud APIs
Job Responsibility
  • Identify, collect, and curate diverse high-quality Japanese text, audio, and multimodal datasets
  • Design methods to synthetically generate or augment Japanese training data when needed
  • Ensure datasets meet enterprise-grade quality, coverage, and compliance requirements
  • Train and fine-tune language and vision models to achieve state-of-the-art performance for Japanese enterprise use cases
  • Adapt existing LFMs for Japanese language, culture, and enterprise-specific workflows
  • Implement evaluation frameworks to benchmark model quality on Japanese datasets
  • Design evaluation datasets and metrics for Japanese enterprise applications
  • Conduct thorough error analysis and iteratively improve model performance
  • Ensure robustness, fairness, and reliability in Japanese-language outputs
What we offer
  • Hands-on experience with state-of-the-art technology at a leading AI company
  • The opportunity to directly shape foundation model performance in one of the world’s most complex and nuanced languages
  • A collaborative, fast-paced environment where your work drives the next generation of LFMs
  • Fulltime

Sales Assistant / Driver

At Crown Paints, our Sales Assistants are the heartbeat of the store — helping c...
Location
United Kingdom, Northampton
Salary
12.71 GBP / Hour
Crown Paints
Expiration Date
June 30, 2026
Requirements
  • A friendly, confident approach with customers
  • Strong communication skills
  • Retail or sales experience (helpful but not essential)
  • Attention to detail and safe working practices
  • Full driving licence
Job Responsibility
  • Delivering brilliant customer service in store and over the phone
  • Mixing and tinting paint to customer requirements
  • Driving the store van to deliver stock locally
  • Processing sales and handling POS transactions
  • Supporting stock control, deliveries and store standards
  • Building relationships with customers and opening new trade accounts
What we offer
  • Up to 25% quarterly performance bonus
  • 36 days holiday (including bank holidays + Christmas closure)
  • Generous pension with company match
  • Employee Assistance Programme (24/7 support)
  • Health, dental and optical benefits
  • Cycle to Work scheme
  • Retail and leisure discounts
  • No nights. No Sundays.
  • Ongoing training and development
  • Fulltime

Store Manager

Crown Paints are seeking to recruit a Store Manager to join our fantastic team bas...
Location
United Kingdom, Warrington
Salary
30000.00 - 33000.00 GBP / Year
Crown Paints
Expiration Date
June 30, 2026
Requirements
  • Hands-on, people-focused
  • relationship building
  • confident prospecting
  • setting and achieving challenging targets
  • problem-solving
  • driven by results
  • leadership
  • managing and motivating a small team
  • coaching
  • stock management
Job Responsibility
  • Take the lead from the front of the store
  • build relationships with customers
  • win new business through confident prospecting
  • set and achieve challenging targets
  • shape sales performance and customer growth
  • manage and motivate a small team
  • coach others
  • support team performance
  • take ownership of stock
  • maintain control and forecast needs
What we offer
  • 36 days annual leave
  • up to 25% performance bonus each quarter
  • significantly discounted paint for personal use
  • pension plan with company match and double contribution
  • Employee Assistance Programme (EAP) 24/7
  • health & wellbeing perks
  • excellent work-life balance (no night shifts or Sundays)
  • eating out, retail and leisure discounts
  • Cycle to Work Scheme
  • training and development
  • Fulltime

Sales Assistant / Driver

At Crown Paints, our Sales Assistants are the heartbeat of the store — helping c...
Location
United Kingdom, Blackburn
Salary
12.71 GBP / Hour
Crown Paints
Expiration Date
July 01, 2026
Requirements
  • A friendly, confident approach with customers
  • Strong communication skills
  • Retail or sales experience (helpful but not essential)
  • Attention to detail and safe working practices
  • Full driving licence
Job Responsibility
  • Delivering brilliant customer service in store and over the phone
  • Mixing and tinting paint to customer requirements
  • Driving the store van to deliver stock locally
  • Processing sales and handling POS transactions
  • Supporting stock control, deliveries and store standards
  • Building relationships with customers and opening new trade accounts
What we offer
  • Up to 25% quarterly performance bonus
  • 36 days holiday (including bank holidays + Christmas closure)
  • Generous pension with company match
  • Employee Assistance Programme (24/7 support)
  • Health, dental and optical benefits
  • Cycle to Work scheme
  • Retail and leisure discounts
  • No nights
  • No Sundays
  • Ongoing training and development
  • Parttime