Instrumentl automates grant discovery and management for nonprofits. We’re a mission-driven startup helping the nonprofit sector drive impact, and we’re well on our way to becoming the #1 most-loved grant discovery and management tool. Instrumentl is a hyper-growth, YC-backed startup with over 4,000 nonprofit clients, from local homeless shelters to larger organizations like the San Diego Zoo and the University of Alaska. We are building the future of fundraising automation, helping nonprofits discover, track, and manage grants efficiently through our SaaS platform.
Job Responsibilities:
Design agentic systems & ship AI to production: Turn prototypes into resilient, observable services with clear SLAs, rollback/fallback strategies, and cost/latency budgets. Build tool‑using LLM “agents” (task planning, function/tool calling, multi‑step workflows, guardrails) for tasks like grant discovery, application drafting, and research assistance (a minimal agent-loop sketch follows this list)
Own RAG end‑to‑end: Ingest and normalize content, choose chunking/embedding strategies, and implement hybrid retrieval, re‑ranking, citations, and grounding. Continuously improve recall/precision while managing index health (a hybrid-retrieval sketch also follows this list)
Manage embeddings at scale: Select, evaluate, and migrate embedding models
Fine‑tune & build evaluation: Run SFT/LoRA or instruction‑tuning on curated datasets, evaluate the ROI vs. prompt engineering/model selection, and manage data versioning and reproducibility. Create offline and online eval harnesses (helpfulness, groundedness, hallucination, toxicity, latency, cost), synthetic test sets, red‑teaming, and human‑in‑the‑loop review
Collaborate cross‑functionally while raising engineering standards: Work side by side with Product, Design, and GTM on scoping, UX, and measurement; run experiments (A/B tests, canaries), interpret results, and iterate. Write clear, maintainable code, add tests and docs, and contribute to reliability practices (alerts, dashboards, incident response)
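To give candidates a concrete flavor of the agent work above, here is a minimal sketch of a bounded tool-calling loop, assuming the OpenAI Python SDK (openai>=1.0); the search_grants tool, model name, and step budget are illustrative assumptions, not a description of Instrumentl’s actual stack.

import json
from openai import OpenAI

client = OpenAI()

def search_grants(query: str) -> str:
    """Hypothetical placeholder for a grant-discovery search backend."""
    return json.dumps([{"title": "Example Grant", "deadline": "2025-01-01"}])

TOOLS = [{
    "type": "function",
    "function": {
        "name": "search_grants",
        "description": "Search the grant index for funding opportunities.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def run_agent(user_msg: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):  # bounded step budget as a simple guardrail
        resp = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages, tools=TOOLS
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content  # the model answered directly
        messages.append(msg)  # keep the assistant's tool-call turn in context
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            result = search_grants(**args)  # dispatch to the single registered tool
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )
    return "Step budget exhausted."

In production, this loop would be wrapped with the guardrails, observability, and fallback paths described in the responsibilities above.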
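Likewise, here is a minimal sketch of hybrid retrieval (keyword + vector) with reciprocal rank fusion, assuming a Postgres connection (e.g., psycopg2) and a grant_chunks(id, content, embedding vector) table with the pgvector extension; the table, columns, and embed() helper are illustrative assumptions.

def embed(query: str) -> list[float]:
    """Placeholder for an embedding-model call (OpenAI, open-source, etc.)."""
    raise NotImplementedError

def hybrid_search(conn, query: str, k: int = 10) -> list[tuple[int, str]]:
    with conn.cursor() as cur:
        # Lexical leg: Postgres full-text search, ranked by ts_rank.
        cur.execute(
            """
            SELECT id, content FROM grant_chunks
            WHERE to_tsvector('english', content) @@ plainto_tsquery('english', %s)
            ORDER BY ts_rank(to_tsvector('english', content),
                             plainto_tsquery('english', %s)) DESC
            LIMIT %s
            """,
            (query, query, k),
        )
        keyword_hits = cur.fetchall()

        # Vector leg: cosine distance via pgvector's <=> operator.
        cur.execute(
            """
            SELECT id, content FROM grant_chunks
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (str(embed(query)), k),
        )
        vector_hits = cur.fetchall()

    # Reciprocal rank fusion: reward chunks that rank well in either leg.
    scores: dict[int, float] = {}
    docs: dict[int, str] = {}
    for hits in (keyword_hits, vector_hits):
        for rank, (chunk_id, content) in enumerate(hits):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (60 + rank)
            docs[chunk_id] = content
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return [(chunk_id, docs[chunk_id]) for chunk_id in ranked]

A re-ranker or cross-encoder would typically sit on top of the fused candidate list before grounding and citation.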
Requirements:
5+ years of professional software engineering experience
2+ years working with modern LLMs (as an IC)
Startup experience and comfort operating in fast, scrappy environments are a plus
Proven production impact: You’ve taken LLM/RAG systems from prototype to production, owned reliability/observability, and iterated post‑launch based on evals and user feedback
LLM agentic systems: Experience building tool/function‑calling workflows, planning/execution loops, and safe tool integrations (e.g., with LangChain/LangGraph, LlamaIndex, Semantic Kernel, or custom orchestration)
RAG expertise: Strong grasp of document ingestion, chunking/windowing, embeddings, hybrid search (keyword + vector), re‑ranking, and grounded citations. Experience with re‑rankers/cross‑encoders, hybrid retrieval tuning, or search/recommendation systems
Embeddings & vector stores: Hands‑on with embedding model selection/versioning and vector DBs (e.g., pgvector, FAISS, Pinecone, Weaviate, Milvus, Qdrant)
Evaluation mindset: Comfort designing eval suites (RAG/QA, extraction, summarization) using automated and human‑in‑the‑loop methods, plus familiarity with frameworks like Ragas/DeepEval/OpenAI Evals or equivalent (a minimal eval-harness sketch follows this list)
Infrastructure & languages: Proficiency in Python (FastAPI, Celery) and TypeScript/Node; experience with AWS/GCP, Docker, CI/CD, and observability (logs/metrics/traces)
Data chops: Comfortable with SQL, schema design, and building/maintaining data pipelines that power retrieval and evaluation
Collaborative approach: You thrive in a cross‑functional environment and can translate researchy ideas into shippable, user‑friendly features
Results‑driven: Bias for action and ownership with an eye for speed, quality, and simplicity
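For the evaluation mindset called out above, here is a minimal sketch of an offline retrieval eval that computes recall@k over a synthetic test set; the TestCase shape and retrieve() signature are assumptions rather than any specific framework’s API.

from dataclasses import dataclass
from typing import Callable

@dataclass
class TestCase:
    question: str
    relevant_ids: set[int]  # chunk ids a grounded answer must draw on

def recall_at_k(
    retrieve: Callable[[str, int], list[int]],  # returns ranked chunk ids
    cases: list[TestCase],
    k: int = 10,
) -> float:
    """Fraction of cases with at least one relevant chunk in the top k."""
    hits = 0
    for case in cases:
        if set(retrieve(case.question, k)) & case.relevant_ids:
            hits += 1
    return hits / len(cases) if cases else 0.0

# Example: track recall_at_k(my_retriever, synthetic_cases, k=10) on every
# index, chunking, or prompt change and alert on regressions.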
Nice to have:
Fine‑tuning: Practical experience with SFT/LoRA or instruction‑tuning (and good intuition for when fine‑tuning vs. prompting vs. model choice is the right lever)
Exposure to open‑source LLMs (e.g., Llama) and providers (e.g., OpenAI, Anthropic, Google, Mistral)
Familiarity with responsible AI, red‑teaming, and domain‑specific safety policies
What we offer:
100% covered health, dental, and vision insurance for employees, 50% for dependents
Generous PTO policy, including parental leave
401(k)
Company laptop + stipend to set up your home workstation
Company retreats for in-person time with your colleagues
Work with awesome nonprofits around the US. We partner with incredible organizations doing meaningful work, and you get to help power their success