Senior Software Engineer, AI Eval Job at Sentry (San Francisco)

Senior Software Engineer - Studio - Java, AI

As a Senior Software Engineer, you’ll build the backend that powers AI features ...

Location

United States, New York

Salary:

175000.00 - 240000.00 USD / Year

Clear Street

Expiration Date

Until further notice

Requirements

7+ years of strong proficiency in enterprise Java
Experience designing and deploying AI/ML or LLM-backed systems in production
Familiarity with LLM tooling and patterns (e.g., tool calling, RAG pipelines and knowledge bases, evals, cost/latency tradeoffs, basic red-teaming)
Experience in supporting and running systems in a production environment
Comfortable working in a dynamic environment, partnering with cross-functional teams, and moving from prototype to reliable production

Job Responsibility

Design, implement, and productionize reliable AI workflows to augment the Studio trading platform
Build tooling to monitor, tune, and evaluate models and workflows, as well as applicable guardrails to ensure outputs meet quality and regulatory requirements
Collaborate with technical and non-technical teams across the firm to identify high ROI AI opportunities
Build rapid prototypes and translate them into production-grade systems. Utilize the latest AI-powered development tools to iterate quickly
Create reusable libraries, SDKs and tooling to enable AI development throughout the firm
Stay current on the latest in applied AI. Read papers, evaluate new models, test out new tools
Participate in code review and architecture design, manage deployments, and support and contribute to the success of the overall Studio platform

What we offer

Competitive compensation, benefits, and perks
Company equity
401k matching
Gender neutral parental leave
Full medical, dental and vision insurance
Lunch stipends
Fully stocked kitchens
Happy hours

Fulltime

Senior Platform Engineer, AI Evaluation

We’re looking for an AI Platform Engineer to evolve and extend our internal eval...

Location

United States, Mountain View

Salary:

137871.00 - 172339.00 USD / Year

Khan Academy

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
5 years of Software Engineering experience with 2+ of those years working on the evaluation of generative AI systems
Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
Familiarity with the architecture of large language models and their industry-standard APIs

Job Responsibility

Evolve and extend our internal evaluation framework for assessing the quality of our AI-driven experiences
Work closely with ML data engineers and platform developers to help internal teams adopt an eval-driven development process incorporating offline benchmark tests and online experiments
Gather internal requirements, get buy-in for changes, and then develop documentation and training materials

What we offer

Competitive salaries
Ample paid time off as needed
8 pre-scheduled Wellness Days in 2026
Remote-first culture
Generous parental leave
401(k) + 4% matching
Comprehensive insurance, including medical, dental, vision, and life

Fulltime

Senior Software Engineer, AI Product

As a Senior Applied AI Engineer at Vanta, you will play a crucial role in shapin...

Location

United States

Salary:

207000.00 - 244000.00 USD / Year

Vanta

Expiration Date

Until further notice

Requirements

At least 7 years of industry experience as a software engineer
You’ve shipped LLM-backed products and have experience with prompting, RAG, and/or agent frameworks
You have experience designing, building, and scaling full-stack applications, including backend systems, APIs, and frontend interfaces
You have familiarity with TypeScript, React, and Node.js, or a willingness to learn
You have experience improving AI systems, creating eval sets, and driving quality hill-climbing
You have experience mentoring other engineers and collaborating with product and design
You have worked at rapidly scaling startups and large companies, especially in environments that prioritize a bias for action
You are action-driven, willing to roll up your sleeves and engage directly with users
You aren’t afraid to put on your product hat
While you bring strong opinions, you prioritize building a platform that meets users where they are

Job Responsibility

Work cross-functionally to design and implement AI-powered features to deliver customer value and integrate LLMs with Vanta’s existing products and systems
Instrument evaluations, guardrails, and monitoring, and review customer usage to continually improve quality
Collaborate with AI Platform engineers shaping foundational AI systems and tooling that accelerate product teams
Make pragmatic tradeoffs that consider business priorities, user experience, and a sustainable technical foundation
Mentor engineers, champion good technical and product instincts, and model a collaborative, high-ownership engineering culture

What we offer

Equity
Medical benefits
401(k) plan
Other company perk programs
Comprehensive medical, dental, and vision coverage, with 100% of employee-only benefit premiums covered for most medical plans
16 weeks fully-paid Parental Leave for all new parents
Health & wellness stipend
Remote workspace, internet, and cellphone stipend
Commuter benefits for team members who report to the SF and NYC office
Family planning benefits

Fulltime

Senior Software Engineer

As a Senior Research Engineer at Microsoft, you will advance Microsoft’s mission...

Location

India, Hyderabad

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Engineering, Mathematics, Statistics, Physics, or a related field and 4 or more years in applied ML or AI research and product engineering
Master’s degree and 3 or more years in applied ML or AI research and product engineering
PhD in a relevant field and 2 or more years with generative AI, LLMs, or related ML algorithms
Proficiency in Python and at least one deep learning framework such as PyTorch, JAX, or TensorFlow
Experience deploying fine-tuned LLMs or multimodal models in live production environments
Experience shipping and maintaining production AI systems
Ability to meet Microsoft, customer, and government security screening requirements
Microsoft Cloud Background Check upon hire or transfer and every two years thereafter

Job Responsibility

Bringing State-of-the-Art Research to Products
Design and implement AI systems using foundation models, prompt engineering, retrieval-augmented generation, multi-agent architectures, and classic ML
Fine-tune large language models on domain-specific data and evaluate via offline and online methods such as A/B testing, telemetry, and shadow deployments
Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities
Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces
End-to-End System Development
ML Design & Architecture: Own the end-to-end pipeline, from data prep, training, and evaluation through deployment and feedback loops

Fulltime

Senior AI Frontend Engineer (Developer Productivity)

We're seeking a Senior Frontend Engineer with a strong React/TypeScript backgrou...

Location

United Kingdom, Belfast

Salary:

Not provided

Citi

Expiration Date

Until further notice

Requirements

Strong expertise (5–10+ years) building modern frontend applications with React and TypeScript
Proficiency in JavaScript, React (or another UI framework), and TypeScript
Experience with state management libraries (Redux, Context API, Zustand) for building well-structured applications
Experience with Storybook or componentised development
Proficiency in implementing streaming and real-time experiences (e.g., word/token streaming, live updates, progress/status indicators)
Strong understanding of frontend architectures, state management, performance optimisation, and responsive design
Hands-on experience with tools such as LangChain, LangGraph, Vercel AI SDK, or Google ADK (Agent Development Kit)
Familiarity with CI/CD tools (e.g., Jenkins, Tekton, Argo CD, Harness)

Job Responsibility

Own the user-facing layer of our next-generation Developer Productivity platform at Citi, transforming complex AI capabilities, from chat interfaces to rich data visualizations, into intuitive, trustworthy experiences
Collaborate closely with other AI and Software Engineers and the Product team to leverage bleeding-edge Generative AI
Challenge, change, modernise & enhance the experience of our 50,000 engineers globally throughout Citi's SDLC (Software Development Life Cycle)
Release to production a small new or enhanced AI-first user interface that will have positively impacted the lives of thousands of Software Engineers and Business Analysts working in Software Requirements Engineering
Start raising the bar in our React codebase by introducing better componentisation, testing, and Storybook
Establish a network of UI engineers across the organisation to contribute to and learn about best practices
Get buy-in from the team on architectural principles, ways of working, and system requirements
Own and champion the implementation of best practices for interaction design within the team, establishing clear guidelines for AI-specific UX patterns
Mentor junior engineers on best practices for designing and implementing AI-driven user interfaces
Design & implement production-grade features for AI solutions

What we offer

27 days annual leave (plus bank holidays)
A discretionary annual performance-related bonus
Private Medical Care & Life Insurance
Employee Assistance Program
Pension Plan
Paid Parental Leave
Special discounts for employees, family, and friends
Access to an array of learning and development resources

Fulltime

Senior Software Engineer, AI Evals

As a Senior Software Engineer on Sentry’s AI/ML team, you’ll be responsible for ...

Location

United States, San Francisco

Salary:

240000.00 - 280000.00 USD / Year

Sentry

Expiration Date

Until further notice

Requirements

5+ years of professional experience with a Bachelor’s degree in computer science, machine learning, or a related field
Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred)
Comfort writing production-quality code (we use Python and TypeScript)
Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines
Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts)

Job Responsibility

Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria
Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring

What we offer

Equity
Incentive compensation
Equity grants
Paid time off
Group health insurance coverage

Fulltime

Senior Research Engineer

As a Research Engineer at Microsoft, you will set the technical vision and lead ...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Proven track record leading large-scale AI systems and cross-org initiatives that shipped
Solid software engineering foundations and hands-on depth in Python plus deep-learning frameworks (PyTorch/TensorFlow) and modern MLOps/tooling
Experience shipping and maintaining production AI systems
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Architect and deliver complex AI systems across model development, data, infra, evaluation, and deployment spanning multiple product lines
Set technical direction for large programs; drive alignment across Research, Engineering, and Product
Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
Integrate LLMs, multimodal models, multi-agent architectures, and RAG into Microsoft’s ecosystem
Establish best practices for MLOps, governance, and Responsible AI, compliant with Microsoft principles and industry standards
Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities
Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces

Fulltime

Staff Software Engineer – Applied AI

Lead the design and delivery of end-to-end AI applications, from discovery and p...

Location

United Arab Emirates, Dubai

Salary:

Not provided

Orbis Consultants

Expiration Date

Until further notice

Requirements

7+ years’ experience building production-grade software
Strong backend capability (Python preferred, but stack-agnostic mindset)
Hands-on experience with or strong interest in LLMs / GenAI (LangChain, vector DBs, model tooling, eval frameworks, etc.)
Comfortable owning projects end-to-end and interacting directly with technical stakeholders
Startup mentality – high ownership, adaptable, and excited by ambiguity

Job Responsibility

Architecting and deploying custom AI solutions (automation, agents, evaluation frameworks, internal AI tooling)
Working directly with senior stakeholders (including CTO-level) on requirements and trade-offs
Leading technical direction across projects
Shaping engineering standards and culture as the team scales

What we offer

Front-row seat to real-world enterprise AI deployment
Exposure to a wide range of industries and use cases
Senior, high-calibre engineering environment
Opportunity to shape a new regional presence in Dubai

Fulltime

Senior Software Engineer, AI Eval

Sentry

Location:
United States, San Francisco

Category:
IT - Software Development

Contract Type:
Not provided

Job Posted:
January 22, 2026

Similar Jobs for Senior Software Engineer, AI Eval