Machine Learning Eval Engineer Job at Reducto (San Francisco)

Job Description

As an ML Eval Engineer, you’ll play a key role in building the evaluation systems and benchmarks that make Reducto’s models better over time. You’ll collaborate closely with our ML, platform, and GTM teams to identify model weaknesses, design strong benchmarks, and create metrics and tooling that surface new failure modes as we scale. This is a high-impact role where you’ll help define how model quality is measured at Reducto and shape the systems we use to improve it.

Job Responsibilities

Design, build, and maintain evaluation benchmarks that reveal where our models perform well and where they fail
Develop metrics, heuristics, and workflows to automatically identify new failure modes across large and messy real-world datasets
Partner closely with other ML engineers to turn evaluation insights into model improvements and better training priorities
Work hands-on with unstructured enterprise data, including PDFs, spreadsheets, and other difficult document formats, to uncover edge cases and hard examples
Build lightweight internal and user-facing tools, including simple interfaces in Python frameworks like Flask, to help teams inspect results, analyze model behavior, and communicate evaluation outcomes
Collaborate with customers and internal teams to understand real-world data needs and create bespoke benchmarks that highlight Reducto’s strengths

Requirements

Hold yourself to a high bar for quality and precision
Enjoy solving complex problems and building from first principles
Have strong Python skills and the ability to independently build clean, reliable technical solutions
Are comfortable working with data infrastructure such as AWS S3 and OLAP or analytics systems like Tinybird
Love getting your hands dirty with unstructured data and chasing down difficult failure cases
Operate well in fast-changing, high-growth environments
Collaborate effectively across technical and non-technical teams
Take full ownership from strategy through execution

Nice to have

Have product and frontend experience
Have experience at an early-stage or high-growth startup
Have some background in product thinking and the ability to build simple, polished user-facing interfaces
Are comfortable working directly with customers to understand their workflows and data needs
Have experience in AI/ML, data infrastructure, enterprise software, or document understanding systems
Care deeply about combining technical excellence with business impact

What we offer

Unlimited PTO
Lunch: Receive a free lunch to eat with your teammates daily at the office
Reimbursed Transportation: Provide us with your receipts and we’ll take care of the costs
Insurance: Generous health insurance covering medical, dental, and vision
Health and Wellness Budget: We provide up to $150/mo reimbursement for health and wellness spending, such as gym memberships, fitness classes, or similar
Parental Leave: Work with us to build a leave schedule that works for you and your family
