As an Agentic Evaluation Specialist, you will work directly with researchers at a top-8 frontier large language model company to improve LLM agent performance, specifically for agentic storage management.
Job Responsibilities:
Work on data migration workflows across cloud platforms (AWS, GCP, Azure)
Assess data migration strategies, weighing trade-offs (online vs. offline, performance, scalability)
Create and maintain data pipelines and ETL/ELT workflows in distributed environments
Integrate and evaluate LLM/Generative AI components within workflows
Perform system testing and validation: execute test scenarios; identify edge cases and failure points; debug and analyze issues across systems; ensure data integrity, consistency, and performance across systems
Document findings, insights, and recommendations clearly for stakeholders
Collaborate with cross-functional teams across engineering, data, and product
Requirements:
5+ years of relevant experience with cloud platforms (GCP, AWS, or Azure) and data/storage systems
Strong understanding of data migration processes and associated trade-offs
Experience with data pipelines, ETL/ELT workflows, or distributed systems
Hands-on exposure to LLMs / Generative AI (prompting, evaluation, or integration)
Experience in system testing, validation, or QA
Strong debugging and problem-solving skills
Excellent written communication skills
Nice to have:
Background in storage administration, SRE, or cloud infrastructure operations
Experience with MLOps, model evaluation frameworks, or AI testing workflows
Familiarity with Google Cloud Storage (GCS) or similar platforms
Experience working in Agile environments with cross-functional teams