As an ML Eval Engineer, you’ll play a key role in building the evaluation systems and benchmarks that make Reducto’s models better over time. You’ll collaborate closely with our ML, platform, and GTM teams to identify model weaknesses, design strong benchmarks, and create metrics and tooling that surface new failure modes as we scale. This is a high-impact role where you’ll help define how model quality is measured at Reducto and shape the systems we use to improve it.
Job Responsibilities:
Design, build, and maintain evaluation benchmarks that reveal where our models perform well and where they fail
Develop metrics, heuristics, and workflows to automatically identify new failure modes across large and messy real-world datasets
Partner closely with other ML engineers to turn evaluation insights into model improvements and better training priorities
Work hands-on with unstructured enterprise data, including PDFs, spreadsheets, and other difficult document formats, to uncover edge cases and hard examples
Build lightweight internal and user-facing tools, including simple interfaces in Python frameworks like Flask, to help teams inspect results, analyze model behavior, and communicate evaluation outcomes
Collaborate with customers and internal teams to understand real-world data needs and create bespoke benchmarks that highlight Reducto’s strengths
Requirements:
Hold yourself to a high bar for quality and precision
Enjoy solving complex problems and building from first principles
Have strong Python skills and can independently build clean, reliable technical solutions
Are comfortable working with data infrastructure such as AWS S3 and OLAP or analytics systems like Tinybird
Love getting your hands dirty with unstructured data and chasing down difficult failure cases
Operate well in fast-changing, high-growth environments
Collaborate effectively across technical and non-technical teams
Take full ownership from strategy through execution
Nice to have:
Have product and frontend experience
Have experience at an early-stage or high-growth startup
Have some background in product thinking and can build simple, polished user-facing interfaces
Are comfortable working directly with customers to understand their workflows and data needs
Have experience in AI/ML, data infrastructure, enterprise software, or document understanding systems
Care deeply about combining technical excellence with business impact
What we offer:
Unlimited PTO
Lunch: Receive a free lunch to eat with your teammates daily at the office
Reimbursed Transportation: Provide us with your receipts and we’ll take care of the costs
Insurance: Generous health insurance covering medical, dental, and vision
Health and Wellness Budget: We provide up to $150/mo reimbursement for health and wellness spending, such as gym memberships, fitness classes, or similar
Parental Leave: Work with us to build a leave schedule that works for you and your family