Research Engineer - Evaluations

Luma AI

Location:
United States; United Kingdom , Palo Alto

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Luma is pushing the boundaries of generative AI, building tools that redefine how visual content is created. We're seeking a Research Engineer to design and scale the infrastructure that powers our model evaluation efforts. This role is about building the pipelines, metrics, and automated systems that close the loop between model output, evaluation, and improvement. You'll work across research, engineering, and product teams to ensure our models are measured rigorously, consistently, and in ways that directly inform development.

Job Responsibility:

  • Design and implement scalable pipelines for automated evaluation of generative models, with a focus on visual and multimodal outputs (image, video, text, audio)
  • Develop novel metrics and evaluation models that capture qualities like fidelity, coherence, temporal consistency, and alignment with human intent
  • Integrate evaluation signals into training loops (including reinforcement learning and reward modeling) to continuously improve model performance
  • Build infrastructure for large-scale regression testing, benchmarking, and monitoring of multimodal generative models
  • Collaborate with researchers running human studies to translate human evaluation frameworks into automated or semi-automated systems
  • Partner with model researchers to identify failure cases and build targeted evaluation harnesses
  • Maintain dashboards, reporting tools, and alerting systems to surface evaluation results to stakeholders
  • Stay current with emerging evaluation techniques in generative AI, multimodal LLMs, and perceptual quality assessment

Requirements:

  • Master's or PhD in Computer Science, Machine Learning, or a related technical field (or equivalent industry experience)
  • 3+ years of experience building ML evaluation systems, model pipelines, or large-scale infrastructure
  • Hands-on experience working with visual data (images and/or video), including evaluation, modeling, or data preparation
  • Proficiency in Python and ML frameworks (PyTorch, JAX, or TensorFlow)
  • Familiarity with human-in-the-loop evaluation workflows and how to scale them with automation
  • Strong background in machine learning, with experience in generative models (diffusion, LLMs, multimodal architectures)
  • Strong software engineering skills (CI/CD, testing, data pipelines, distributed systems)

Nice to have:

  • Experience with reinforcement learning or reward modeling
  • Prior work on perceptual metrics, multimodal evaluation benchmarks, or retrieval-based evaluation
  • Background in large-scale model training or evaluation infrastructure
  • Experience designing metrics for perceptual quality
  • Familiarity with creative media workflows (film, VFX, animation, digital art)
  • Contributions to open-source evaluation libraries or benchmarks

Additional Information:

Job Posted:
January 13, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Research Engineer - Evaluations

AI Research Engineer

We're seeking a Research Engineer to conduct innovative research in key AI areas...
Location:
United Kingdom
Salary:
Not provided
Prolific
Expiration Date:
Until further notice
Requirements:
  • 5+ years of engineering experience with significant AI/ML focus
  • Demonstrated research experience through publications, open-source contributions, or impactful projects
  • Strong engineering fundamentals and experience implementing AI systems in production environments
  • Deep knowledge of LLM evaluation methodologies, alignment techniques, and model optimization approaches
  • Experience with model fine-tuning, adapters, quantization, and distillation frameworks
  • Self-motivation and ability to define and pursue research directions independently
  • Excellent understanding of current challenges in AI safety, reliability, and alignment
  • Strong communication skills and ability to explain complex research concepts clearly
  • Passion for staying current with the rapidly evolving AI research landscape
Job Responsibility:
  • Lead independent research projects in AI evaluation methodologies, alignment techniques, and synthetic data generation
  • Design and implement novel evaluation frameworks for LLMs and agent systems that are grounded in human data
  • Contribute to the academic AI community through publications and open-source contributions
  • Stay at the forefront of AI research and pioneer innovative approaches to tackle pressing open challenges in the field
  • Design and conduct rigorous experiments to study AI models and systems with sound methodological approaches
  • Develop scalable frameworks for systematic evaluation of model behaviours and capabilities
  • Create tools and frameworks that transform research insights into practical applications
  • Build infrastructure to support large-scale research experiments when needed
  • Apply knowledge of model fine-tuning, optimization techniques, distillation, and other ML engineering practices to support research goals
  • Work closely with ML engineers, data scientists, and product teams to translate research insights into practical applications
What we offer:
  • Competitive salary
  • Benefits
  • Remote working
  • Impactful, mission-driven culture

Research Engineering Internship

Location:
United States , Valley View, OH
Salary:
Not provided
Peak Nano
Expiration Date:
Until further notice
Requirements:
  • Must be located in the Cleveland, OH area
  • Enrollment in a B.A. or B.S. program in Materials/Polymer Science, Plastics Engineering, Chemical Engineering, or Mechanical Engineering
  • Basic knowledge or interest in polymeric materials
  • Demonstrated eagerness to learn new concepts, tasks, and applications
  • Ability to handle up to fifty (50) pounds with physical dexterity to assemble and operate test equipment
  • Ability to work independently or well in a team atmosphere
  • Intermediate PC skills with MS Office (Excel, PowerPoint, Word, Outlook, etc.)
  • US Citizenship
Job Responsibility:
  • Set up and tear down polymer process experiments
  • Participate in film extrusion or basic polymer materials fabrication trials
  • Prepare film samples for a range of characterization and testing
  • Conduct, record, and report results of characterization of polymer films or products
  • Conduct analytical test methods to solve problems with polymer films
  • Assist senior technical staff members in developing/modifying techniques or analytic equipment for the evaluation of new products and processes
Employment Type:
Full-time

Research Engineer, Hardware

Join the NEXT team shaping the foundations of the future humanoid robot. Tackle ...
Location:
United States , Palo Alto
Salary:
Not provided
1X Technologies
Expiration Date:
Until further notice
Requirements:
  • Strong grasp of core engineering principles including math, mechanics, and materials
  • Experience designing actuation systems—motors, drivetrains, joints, structures
  • Background in Mechanical Engineering, Electrical Engineering, Physics, or similar
  • Demonstrated ability to go full-stack in R&D: from concept to prototype to evaluation
  • Skilled at assessing feasibility and filtering high-potential ideas from theoretical noise
Job Responsibility:
  • Research, design, and prototype fundamental humanoid technologies including actuation systems, robotic joints, and structural components
  • Model and simulate systems to evaluate performance and feasibility
  • Design experiments and analyze results to validate hypotheses and inform engineering tradeoffs
  • Build and iterate on physical prototypes using machining, rapid fabrication, and embedded systems
  • Evaluate academic research and patents for technical inspiration and practical application
  • Distill high-concept ideas into actionable engineering workstreams
  • Collaborate closely with other domain experts to bridge theory and application
Employment Type:
Full-time

Machine Learning/AI Research Engineer

Machine Learning/AI Research Engineer position focusing on advancing renewable o...
Location:
Ireland , Galway; Dublin
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • PhD degree (or Master's with equivalent research and innovation experience) in a relevant discipline (e.g., computer science, software engineering, electrical engineering, math, physics, statistics, mechanical engineering, etc.)
  • Proven record of innovation with Deep Learning or with scientific and engineering computation involving the application of Machine Learning
  • Demonstrated experience with innovative solution development, developing proofs-of-concept, first-of-a-kind solutions, and technology transfer
  • Strong software development skills in Python and PyTorch
  • Strong application experience of Machine Learning with physical systems
  • Good understanding of digital twins, use of ML with Digital Twins, applications to sustainability
  • A strong science or engineering background with aptitude for system level analysis and modeling
  • Deep understanding of the relevant environment, ecosystem, trends, and literature
  • Excellent research and development skills
  • Ability to innovate, make research contributions, and bring ideas to reality in compelling ways
Job Responsibility:
  • Develop and program integrated software algorithms to structure, analyze and leverage structured and unstructured data in product and systems applications
  • Work with large scale computing frameworks, data analysis systems, and modeling environments
  • Use machine learning and statistical modeling techniques to improve product/system performance
  • Formulate descriptive, diagnostic, predictive and prescriptive insights/algorithms and translate technical specifications into code
  • Apply, optimize and scale deep learning technologies and algorithms to give computers the capability to visualize, learn and respond to complex situations
  • Document procedures for installation and maintenance, complete programming, perform testing and debugging, define and monitor performance metrics
  • Provide thought leadership and technical influence internally and externally
  • Take innovative ideas and make them real – contributing along the full range from conception, to design, development, implementation, evaluation, and technology transfer
  • Collaborate with Hewlett Packard Labs' research teams and external partners
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
Employment Type:
Full-time

Research Engineer

Fortune 500 clients and government agencies trust eGain AI knowledge solution to...
Location:
United States , Sunnyvale
Salary:
9000.00 USD / Month
eGain
Expiration Date:
Until further notice
Requirements:
  • Pursuing a Ph.D. in Computer Science, Data Science, or a related field from a top-tier US university
Job Responsibility:
  • Explore, evaluate, and experiment with open-source and commercial XLM/ML-based tools and algorithms
  • Prototype and test new AI Knowledge capabilities and expert-guided AI tools

Research Engineer

We are looking for a Research Engineer to join the research team at ElevenLabs. ...
Location:
Poland
Salary:
Not provided
ElevenLabs
Expiration Date:
Until further notice
Requirements:
  • 3+ years industry experience as a Machine Learning Engineer, with a key emphasis on constructing data pipelines, as well as developing and implementing machine learning models
  • Demonstrated ability to independently evaluate new ideas or improve existing machine learning projects, potentially contributing to published work
  • Extensive background in exploratory research to improve the quality of collected data, particularly in the audio and text-to-speech domains
Job Responsibility:
  • Build and maintain a reliable, scalable data management system for text-to-speech projects, including guidelines for versioning and data quality
  • Establish a streamlined process for automatically training, evaluating, and deploying text-to-speech models, including procedures for dynamic learning and routines for fine-tuning and refreshing validation data
  • Investigate cutting-edge approaches in machine learning, deep learning, and algorithms for text-to-speech technology
What we offer:
  • Innovative culture
  • Growth paths
  • Learning & development: ElevenLabs proactively supports professional development through an annual discretionary stipend
  • Social travel: We also provide an annual discretionary stipend to meet up with colleagues each year, however you choose
  • Annual company offsite
  • Co-working: If you’re not located near one of our main hubs, we offer a monthly co-working stipend
Employment Type:
Full-time

Research Engineer, GenAI

You will be part of Kiddom’s Data Science team, building the foundation of our s...
Location:
United States , San Francisco; New York
Salary:
175000.00 - 250000.00 USD / Year
Kiddom
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience applying machine learning to solve real-world problems with large, complex datasets
  • 1–2 years in a technical leadership role
  • Proven track record designing, evaluating, and deploying ML/AI systems in production environments that drive measurable business impact, ideally in recommendation, personalization, search, or workflow optimization
  • Strong programming skills in Python
  • Fluency in data manipulation (SQL, Pandas) and common ML toolkits (scikit-learn, XGBoost, TensorFlow/PyTorch)
  • Strong analytical skills and ability to break down complex problems into measurable hypotheses and experiments
  • Excellent communication skills with a history of cross-functional collaboration with product, design, and engineering stakeholders
Job Responsibility:
  • Architect and scale machine learning systems for search, personalization, and recommendations that power Kiddom’s teacher helper and insight engine
  • Develop evaluation-first development workflows to measure how models improve teaching efficiency, lesson planning, and student learning outcomes
  • Fine-tune machine learning models with feedback signals from teachers and students to align outputs with instructional goals and classroom needs
  • Design intelligent discovery pipelines that combine semantic retrieval, curriculum alignment, and real-time personalization
  • Build agentic assistants that help teachers plan lessons, adapt instruction, and reduce repetitive tasks
  • Collaborate closely with product managers, designers, and curriculum experts to translate high-level educational goals into scalable ML-powered systems
  • Coach and mentor junior ML engineers and data scientists, fostering technical and professional growth
What we offer:
  • Meaningful equity
  • Health insurance benefits: medical (various PPO/HMO/HSA plans), dental, vision, disability and life insurance
  • One Medical membership (in participating locations)
  • Flexible vacation time policy (subject to internal approval). Average use is 4 weeks off per year
  • 10 paid sick days per year (prorated depending on start date)
  • Paid holidays
  • Paid bereavement leave
  • Paid family leave after birth/adoption. Minimum of 16 paid weeks for birthing parents, 10 weeks for caretaker parents. Meant to supplement benefits offered by the state
  • Commuter and FSA plans
Employment Type:
Full-time

Machine Learning Research Engineer

You will be part of Kiddom’s Data Science team, building the foundation of our s...
Location:
United States , San Francisco; New York
Salary:
175000.00 - 250000.00 USD / Year
Kiddom
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience applying machine learning to solve real-world problems with large, complex datasets, with 1–2 years in a technical leadership role
  • Proven track record designing, evaluating, and deploying ML/AI systems in production environments that drive measurable business impact, ideally in recommendation, personalization, search, or workflow optimization
  • Strong programming skills in Python and fluency in data manipulation (SQL, Pandas) and common ML toolkits (scikit-learn, XGBoost, TensorFlow/PyTorch)
  • Strong analytical skills and ability to break down complex problems into measurable hypotheses and experiments
  • Excellent communication skills with a history of cross-functional collaboration with product, design, and engineering stakeholders
Job Responsibility:
  • Architect and scale machine learning systems for search, personalization, and recommendations that power Kiddom’s teacher helper and insight engine
  • Develop evaluation-first development workflows to measure how models improve teaching efficiency, lesson planning, and student learning outcomes
  • Fine-tune machine learning models with feedback signals from teachers and students to align outputs with instructional goals and classroom needs
  • Design intelligent discovery pipelines that combine semantic retrieval, curriculum alignment, and real-time personalization
  • Build agentic assistants that help teachers plan lessons, adapt instruction, and reduce repetitive tasks
  • Collaborate closely with product managers, designers, and curriculum experts to translate high-level educational goals into scalable ML-powered systems
  • Coach and mentor junior ML engineers and data scientists, fostering technical and professional growth
What we offer:
  • Competitive salary
  • Meaningful equity
  • Health insurance benefits: medical (various PPO/HMO/HSA plans), dental, vision, disability and life insurance
  • One Medical membership (in participating locations)
  • Flexible vacation time policy (subject to internal approval). Average use is 4 weeks off per year
  • 10 paid sick days per year (prorated depending on start date)
  • Paid holidays
  • Paid bereavement leave
  • Paid family leave after birth/adoption. Minimum of 16 paid weeks for birthing parents, 10 weeks for caretaker parents. Meant to supplement benefits offered by the state
  • Commuter and FSA plans
Employment Type:
Full-time