You'll own the core models and prompts that power Gamma. We weave together text, image, and layout generation to automate all the drudgery of building presentations and websites, and we use AI throughout our product. Your job is to elevate quality, evaluate new models, and push the frontier with new features and modalities. This role is about productizing existing models, not training new ones. You'll focus on prompting, evaluating, and fine-tuning foundation models for maximum performance. With over 1 million AI-generated presentations and 6 million AI images created daily, you'll work at massive scale.
Job Responsibilities:
Own our existing LLM and image prompts, measuring and continuously improving quality at scale
Develop complex prompts for new features using AI JSX, balancing creativity with reliability
Build evaluation frameworks for our prompts and models, monitoring metrics and qualitative feedback to create better test sets
Drive the roadmap based on quality gaps, constantly evaluating new frontier models and methods
Curate datasets for fine-tuning open-source models and launch new modalities like voice and video
Build analytics and tracking systems while owning uptime, latency, and costs across our AI infrastructure
Requirements:
Prompt hacker: A tinkerer who loves seeing how far you can push the limits of a foundation model, with experience building and evaluating prompts at scale
Software engineer: Experienced developer comfortable in TypeScript and Python, excited about mixing prompt engineering with traditional software engineering
Data-driven: You embrace using data to raise the bar of AI quality, with skills in writing evals, designing metrics, and turning qualitative feedback into quantitative measures
Self-sufficient: You can gather and clean data on your own to inform prompt improvements and model evaluations
Nice to have:
Experience working with modern LLMs, plus image models like Flux and Imagen
Familiarity with AI tooling like AIJSX for prompting and Braintrust for evaluations