Program Lead: Product Operations - AI Observability

Location: United States, Sunnyvale
Salary: 162000.00 - 180000.00 USD / Year
Job Posted: April 12, 2026

Job Description

The AI Observability Program Leader will own the end-to-end strategy, design, and implementation of the frameworks used to monitor, understand, and improve Uber’s GenAI-powered agentic systems. This role sits within the Global Digital Experience team, the operational arm of Uber’s customer support tech organization, and is a critical driver of accuracy, safety, and reliability across Uber’s next-generation AI solutions. This leader will bridge the gap between raw AI logs and actionable product insights.

Job Responsibility

  • Architect Observability Frameworks: Own the strategy for understanding AI agentic reasoning, enabling deep analysis of step-by-step agent decision-making
  • Drive Autoeval Strategy: Design and roll out automated evaluation systems (LLM-as-a-judge) to provide a scalable, high-confidence "pulse" on AI performance across conversational and voice interfaces
  • Define Micrometrics: Develop granular signals within agentic activity—identifying latent failures, reasoning loops, or tool-calling inefficiencies—to drive product improvements
  • Lead Pre-Launch Simulation: Partner with Product & Engineering to build and maintain simulation environments that test AI agents against edge cases before deployment, and democratise these tools with Operations teams
  • Cross-Functional Technical Partnership: Act as the primary liaison between Product, Engineering, and Data Science to ensure observability tooling is integrated into the development lifecycle and directly informs release "Go/No-Go" decisions
  • Insight Synthesis: Package complex technical observability data into clear, actionable narratives for leadership, highlighting specific failure patterns and opportunities for CX improvement
  • Operational Excellence: Establish the standards and tooling for how AI performance is reported globally, ensuring consistency across different regions and support modalities.

Requirements

  • 5+ years of experience in Technical Program Management, Product Operations, AI Quality, or Observability
  • Bachelor’s degree in Engineering, Computer Science, Data Science, or a related technical field.

Nice to have

  • AI Literacy: Deep understanding of GenAI systems, including LLM orchestration, agentic workflows, and the nuances of reasoning chains (e.g., Chain of Thought)
  • Systems Thinking: Proven experience designing technical frameworks or evaluation pipelines (e.g., autoevals, RAG evaluation, or model benchmarking)
  • Analytical Rigor: Ability to define and track complex technical metrics (micrometrics) and correlate them with high-level business KPIs
  • Influence without Authority: Demonstrated ability to drive complex initiatives in an IC capacity by building strong partnerships with Engineering and Product teams
  • Advanced AI Expertise: Experience with "LLM-as-a-judge" frameworks, prompt engineering for evaluations, and fine-tuning feedback loops
  • Simulation & Testing: Background in building simulators, "digital twins," or robust A/B testing frameworks for conversational AI or autonomous agents
  • Tooling Proficiency: Familiarity with AI observability tools
  • Problem Solving: Exceptional ability to turn "noisy" AI logs into structured failure pattern analysis
  • Communication: Strong ability to translate highly technical agent behaviors into business-relevant insights for non-technical stakeholders
  • Domain Knowledge: Experience in Customer Support technology, Voice UX, or high-volume automated workflows.

What we offer

  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • All full-time employees are eligible to participate in a 401(k) plan
  • Eligible for various benefits (details at link).

Similar Jobs for Program Lead: Product Operations - AI Observability

8 matching positions

AI Product Manager

We’re scaling AI and machine learning across our products, devices, and operatio...
Location: United States, Boston
Salary: 121300.00 - 177900.00 USD / Year
Company: SimpliSafe
Expiration Date: Until further notice
Requirements
  • 8+ years of product management experience, including significant ownership of AI/ML or data-intensive products
  • Clear track record of shipping production ML systems (not just integrating third-party AI APIs), in close partnership with data science, ML engineering, and MLOps
  • Principal-level impact: leading cross-team initiatives, shaping strategy, and influencing senior stakeholders
  • Strong understanding of core ML concepts and lifecycle: data, labeling, training/validation, evaluation metrics, deployment, monitoring, and retraining
  • ML experience with at least one of following: computer vision or sensor data, LLM-powered applications (prompting, RAG, fine-tuning, evaluation) and/or hardware or edge products (e.g., on-device models, connectivity/latency trade-offs)
  • Familiarity with modern ML infrastructure (cloud platforms, model serving, CI/CD for ML, monitoring/alerting)
  • Comfortable going deep into data, metrics, and model behavior—not just the UX layer
  • Excellent communicator who can make complex AI topics clear to diverse audiences
  • Strong alignment with our values: customer-obsessed, low ego, highly collaborative, comfortable with ambiguity, and biased toward learning and iteration.
Job Responsibility
  • Define and communicate the multi-year roadmap for key AI/ML capabilities across SimpliSafe
  • Identify and prioritize AI opportunities where models and data can materially improve safety, customer experience, or efficiency—on both devices and cloud services
  • Make build-vs-buy decisions for AI capabilities in partnership with data science and engineering
  • Partner with data scientists, ML engineers, and MLOps to design and deliver end-to-end ML solutions—from problem framing through data, training, evaluation, deployment, and monitoring
  • Work with hardware and embedded teams to shape edge AI/ML experiences (e.g., on-device detection, low-latency decisions, bandwidth-aware designs)
  • Define model-level requirements (metrics, latency, cost, guardrails) and connect them to business outcomes (e.g., false alarm reduction, detection accuracy, handle time, CSAT)
  • Translate product needs into requirements for ML platform capabilities (model serving, observability, experiment tracking, human-in-the-loop tools)
  • Lead product direction for LLM and multimodal use cases (e.g., text, vision, sensor data)
  • Decide when to use prompt engineering, RAG, fine-tuning, or traditional ML—and how to evaluate quality, safety, and hallucinations
  • Design workflows that incorporate human review and escalation where needed
What we offer
  • A mission- and values-driven culture and a safe, inclusive environment where you can build, grow, and thrive
  • A comprehensive total rewards package that supports your wellness and provides security for SimpliSafers and their families
  • Free SimpliSafe system and professional monitoring for your home
  • Employee Resource Groups (ERGs) that bring people together, give opportunities to network, mentor and develop, and advocate for change
  • Participation in our annual bonus program, equity, and other forms of compensation, in addition to a full range of medical, retirement, and lifestyle benefits.
  • Full-time

Senior AI Engineer

Guidepoint is seeking an experienced Senior AI Engineer to join our Toronto-base...
Location: Canada, Toronto
Salary: Not provided
Company: Modoras Accounting Syd
Expiration Date: Until further notice
Requirements
  • 6+ years of professional experience (or 5+ with a Master’s degree) designing, building, and scaling distributed, production-grade backend systems
  • 2+ years building and operating Generative AI and agentic systems in production
  • Strong software engineering fundamentals in Python, including building and scaling REST APIs using frameworks such as FastAPI, with experience in asynchronous programming and microservices
  • Hands-on experience building enterprise AI agents and workflows using LLM platforms such as OpenAI, Anthropic (Claude), or Google Gemini, and frameworks like LangChain or agent SDKs
  • Experience building and operating within the enterprise AI ecosystem, including custom GPTs or agents, agent builders, connectors/apps, and application or agent SDKs (e.g., OpenAI Apps SDK, ChatKit, or equivalents)
  • Experience designing and operating agent integration layers (e.g., MCP servers or similar) that connect AI agents to internal APIs, tools, and services, with secure authentication and authorization using enterprise identity platforms such as Okta, Microsoft Entra ID, or OAuth-based systems
  • Strong understanding of AI governance, compliance, and responsible AI practices, including access control, auditability, data handling, and secure deployment of AI systems in enterprise environments
  • Direct experience with RAG, vector search using databases such as Elasticsearch, multi-agent AI systems, tool-calling agents, prompt engineering, and agent evaluation in production environments
  • Cloud-native experience deploying and operating containerized applications on Azure (preferred) or AWS/GCP using Docker and Kubernetes
  • Proven ability to lead complex technical initiatives, make sound architectural decisions, and mentor engineers building production-ready AI systems
Job Responsibility
  • Design, build, and operate scalable, low-latency backend services and REST APIs that power Generative AI capabilities, including retrieval-augmented generation (RAG) pipelines, vector search, and enterprise-grade agentic systems
  • Own the full lifecycle of AI applications and agents, from system architecture and development to CI/CD, deployment, agent evaluation, monitoring, and ongoing optimization in production
  • Build production-grade research agents and enterprise AI workflows that integrate LLMs with proprietary knowledge, vector databases (e.g., Elasticsearch), internal tools, external APIs, and real-time data
  • Design and operate multi-agent AI systems, including tool-calling agents and agent orchestration patterns, to support complex research and enterprise workflows
  • Apply AIOps best practices for building, evaluating, deploying, and operating AI agents with strong observability, reliability, and quality controls
  • Continuously improve retrieval and generation quality using prompt engineering, retrieval tuning, re-ranking, advanced chunking strategies, and hallucination reduction techniques
  • Provide technical leadership through design discussions, code reviews, and mentorship, and partner closely with product and business stakeholders to influence the AI roadmap
What we offer
  • Paid Time Off
  • Comprehensive benefits plan
  • Company RRSP Match
  • Development opportunities through the LinkedIn Learning platform

AI Transformation Leader - Culture & Employee Experience

Microsoft Security is one of the world's largest security organizations — spanni...
Location: United States, Redmond
Salary: 130900.00 - 277200.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Business, Operations, Finance, or related field AND 6+ years experience in program management, process management, or process improvement OR equivalent experience
  • Master's Degree in Business, Operations, Finance, or related field AND 8+ years experience in program management, process management, or process improvement OR Bachelor's Degree in Business, Operations, Finance, or related field AND 12+ years experience in program management, process management, or process improvement OR equivalent experience
  • Proven ability to drive adoption through scalable resources and community engagement
  • Proven ability to partner closely with matrixed teams and navigate across complex systems
  • Demonstrated storytelling and content strategy skills
  • Demonstrated experience driving large-scale behavior or technology adoption in an engineering organization
  • Working knowledge of AI tools and workflows, and the adoption dynamics specific to engineering teams
  • Proven program design and execution skills — from concept through measurable, sustained outcome
  • Ability to influence engineering leaders without direct authority
  • Ability to translate complex change into clear expectations and observable behaviors
Job Responsibility
  • Activate the New Way of Working
  • Drive activation of the new AI-first operating model end-to-end: from readiness through team-level adoption, measurement, and iteration, in close partnership with Engineering Executive Sponsor and Engineering Leaders
  • Partner with engineering managers and VPs to embed AI-first behaviors into operating rhythms, expectations, and management practice
  • Identify and systematically remove barriers to adoption — organizational, tooling, and behavioral
  • Build feedback loops that connect adoption reality on the ground back to operating model design and product teams
  • Build and Run AI Enablement Programs
  • Design and run the Security Frontier Enablement Lab — a structured environment where engineering teams experiment with AI tools, agents, and workflows in low-risk, high-signal settings
  • Produce and scale the AI & Agents Hackathon series to surface new use cases, accelerate adoption, and build community around AI-first practice
  • Run Demo Day as a recurring, visible forum that makes AI-first work celebrated and shared across Security Engineering
  • Drive Behavior Change at Scale
  • Full-time

Director TPM, Release Management and Developer Experience

As Director of Technical Program Management (TPM) at SingleStore, you will be re...
Location: India
Salary: Not provided
Company: SingleStore
Expiration Date: Until further notice
Requirements
  • 10+ years of experience in program/project management, with at least 5+ years in senior leadership roles (Director or equivalent)
  • Proven track record of leading release management and developer experience programs in high-growth, cloud-native, or SaaS organizations
  • Strong expertise in cloud platforms (AWS, GCP, Azure) and engineering productivity tooling (CI/CD, observability, source control, developer platforms)
  • Demonstrated ability to manage complex, cross-functional programs across engineering, product, QA, and operations
  • Experience building, scaling, and mentoring program management and release functions within engineering
  • Skilled in agile methodologies, portfolio planning, and Jira/Atlassian ecosystem administration
  • Strong executive communication and stakeholder management skills, with the ability to influence senior leadership and drive organizational alignment
  • Track record of balancing strategic vision with hands-on program delivery
Job Responsibility
  • Release Management Leadership: Own and scale the release management process across engineering, ensuring consistent, predictable, and high-quality software releases that align with product and business goals
  • Developer Experience Programs: Lead initiatives that improve developer productivity, workflows, and tooling (e.g., CI/CD, observability, internal platforms). Drive adoption of best practices and optimize engineering systems for efficiency and ease of use
  • Program & Portfolio Oversight: Define and govern a portfolio of cross-functional programs spanning release management, QA, DevOps, and developer experience. Ensure programs deliver measurable business value and outcomes
  • Strategic Alignment: Partner with senior engineering and product leaders to align roadmaps, priorities, and dependencies. Act as a connector across teams to ensure clarity of vision and execution
  • Frameworks & Best Practices: Establish scalable planning, governance, and reporting mechanisms (e.g., agile frameworks, Jira administration, dashboards) that enhance visibility and accountability across engineering programs
  • Leadership & Mentorship: Build, lead, and mentor a high-performing program management and release management function, fostering career growth and organizational impact
  • AI & Automation Enablement: Champion AI-driven automation and analytics in release pipelines, developer tooling, and program management to improve efficiency, predictability, and decision-making
  • Executive Communication: Deliver concise, actionable updates to executives and stakeholders, surfacing progress, risks, and opportunities across releases and programs
  • Continuous Improvement: Identify bottlenecks, risks, and opportunities for process improvement across engineering; drive adoption of innovative practices and tools that improve reliability, speed, and developer experience

Principal Engineer

The Principal AI/ML Operations Engineer leads the architecture, automation, and ...
Location: United States, Pleasanton, California
Salary: 251000.00 - 314500.00 USD / Year
Company: BlackLine
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field
  • 10+ years in ML infrastructure, DevOps, and software system architecture
  • 4+ years in leading MLOps or AI Ops platforms
  • Strong programming skills in languages such as Python, Java, or Scala
  • Expertise in ML frameworks (TensorFlow, PyTorch, scikit-learn) and orchestration tools (Airflow, Kubeflow, Vertex AI, MLflow)
  • Proven experience operating production pipelines for ML and LLM-based systems across cloud ecosystems (GCP, AWS, Azure)
  • Deep familiarity with LangChain, LangGraph, ADK or similar agentic system runtime management
  • Strong competencies in CI/CD, IaC, and DevSecOps pipelines integrating testing, compliance, and deployment automation
  • Hands-on experience with observability stacks (Prometheus, Grafana, New Relic) for model and agent performance tracking
  • Understanding of governance frameworks for Responsible AI, auditability, and cost metering across training and inference workloads
Job Responsibility
  • Define enterprise-level standards and reference architectures for MLOps and AIOps systems
  • Partner with data science, security, and product teams to set evaluation and governance standards (Guardrails, Bias, Drift, Latency SLAs)
  • Mentor senior engineers and drive design reviews for ML pipelines, model registries, and agentic runtime environments
  • Lead incident response and reliability strategies for ML/AI systems
  • Lead the deployment of AI models and systems in various environments
  • Collaborate with development teams to integrate AI solutions into existing workflows and applications
  • Ensure seamless integration with different platforms and technologies
  • Define and manage MCP Registry for agentic component onboarding, lifecycle versioning, and dependency governance
  • Build CI/CD pipelines automating LLM agent deployment, policy validation, and prompt evaluation of workflows
  • Develop and operationalize experimentation frameworks for agent evaluations, scenario regression, and performance analytics
What we offer
  • short-term and long-term incentive programs
  • robust offering of benefit and wellness plans
  • Full-time

Data and AI Solution Architect

The Solution Architect is responsible for translating the client’s business requ...
Location: Singapore, Singapore
Salary: Not provided
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science, Information Technology, Engineering, or related field AND 12+ years experience in technology solutions, practice development, architecture, consulting, and/or technology domain (e.g., Security)
  • OR Master's Degree in Computer Science, Information Technology, Engineering, or related field AND 8+ years experience in technology solutions, practice development, architecture, consulting, and/or technology domain (e.g., Security)
  • OR equivalent experience
  • 6+ years technical sales experience
  • 6+ years project management experience
  • 3+ years people management experience, including managing consultant practice managers, technical sales managers, and/or technical architect managers
  • Technical or Professional Certification in Domain (e.g., Security)
Job Responsibility
  • Gather customer/partner insights from a broad range of stakeholders as well as the main business sponsor to shape and form both the definition and ongoing execution of projects and work with customer/partner stakeholders and business sponsor to socialize both the business solution and the project approach to determine if changes are needed to the business solution
  • Use evidence-based approach to represent the customer/partner as a customer advocate and share insights with Product Engineering teams and the business with a view to improving Microsoft technologies, products, services and offerings such that they can better meet customer/partner needs across a territory
  • Define and document the Architecture through an architecture description document, an architecture decisions log, and a requirements/constraints traceability matrix to communicate the value proposition of the business solution along with the project approach
  • Work with customers to understand and demonstrate business value (e.g., release of revenue, cost savings) that the business solution realizes; manage and resolve ambiguity in the requirements and constraints, and document assumptions and implications where it cannot be resolved
  • Generate new intellectual property and improvements to existing intellectual property. Connect gaps and patterns across business and technology areas to drive changes
  • Identify which ideas should be culled, with consideration for scale across customers; drive the re-use of intellectual property and recommended practices in both pre-sales and delivery; and participate in and contribute to internal/external communities
  • Lead virtual teams around technologies and customer/partner challenges by sharing ideas, insight, and strategic, technical input with technical teams, internal communities across the field and the larger virtual team across Microsoft using knowledge of Microsoft architectures and their context in the competitive landscape
  • Design and lead end‑to‑end enterprise data, GenAI, Copilot, and agentic AI architectures, translating business outcomes into scalable, secure, and governed solutions across cloud data platforms and AI services
  • Own data platform design (lakehouse/data lake/warehouse, streaming, analytics, semantic models) to ensure the enterprise data estate is AI‑ready (quality, lineage, governance, observability)
  • Lead data migration and modernization from on‑premises environments to cloud, including planning and executing migrations with minimal disruption, and aligning migration choices to modernization goals
  • Full-time

Data and AI Solution Architect

The Solution Architect is responsible for translating the client’s business requ...
Location: Thailand, Bangkok
Salary: Not provided
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science, Information Technology, Engineering, or related field AND 12+ years experience in technology solutions, practice development, architecture, consulting, and/or technology domain (e.g., Security)
  • OR Master's Degree in Computer Science, Information Technology, Engineering, or related field AND 8+ years experience in technology solutions, practice development, architecture, consulting, and/or technology domain (e.g., Security)
  • OR equivalent experience
  • 6+ years technical sales experience
  • 6+ years project management experience
  • 3+ years people management experience, including managing consultant practice managers, technical sales managers, and/or technical architect managers
  • Technical or Professional Certification in Domain (e.g., Security)
Job Responsibility
  • Gather customer/partner insights from a broad range of stakeholders as well as the main business sponsor to shape and form both the definition and ongoing execution of projects and work with customer/partner stakeholders and business sponsor to socialize both the business solution and the project approach to determine if changes are needed to the business solution
  • Use evidence-based approach to represent the customer/partner as a customer advocate and share insights with Product Engineering teams and the business with a view to improving Microsoft technologies, products, services and offerings such that they can better meet customer/partner needs across a territory
  • Define and document the Architecture through an architecture description document, an architecture decisions log, and a requirements/constraints traceability matrix to communicate the value proposition of the business solution along with the project approach
  • Work with customers to understand and demonstrate business value (e.g., release of revenue, cost savings) that the business solution realizes; manage and resolve ambiguity in the requirements and constraints, and document assumptions and implications where it cannot be resolved
  • Generate new intellectual property and improvements to existing intellectual property
  • Connect gaps and patterns across business and technology areas to drive changes
  • Identify which ideas should be culled, with consideration for scale across customers; drive the re-use of intellectual property and recommended practices in both pre-sales and delivery; and participate in and contribute to internal/external communities
  • Lead virtual teams around technologies and customer/partner challenges by sharing ideas, insight, and strategic, technical input with technical teams, internal communities across the field and the larger virtual team across Microsoft using knowledge of Microsoft architectures and their context in the competitive landscape
  • Design and lead end‑to‑end enterprise data, GenAI, Copilot, and agentic AI architectures, translating business outcomes into scalable, secure, and governed solutions across cloud data platforms and AI services
  • Own data platform design (lakehouse/data lake/warehouse, streaming, analytics, semantic models) to ensure the enterprise data estate is AI‑ready (quality, lineage, governance, observability)
  • Full-time

Product Manager

We are looking for a Product Manager to lead the strategy, measurement, and cont...
Location: United States, Irvine
Salary: Not provided
Company: Robert Half
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in technical product management, product ownership, or a similar role focused on applied AI or machine learning products
  • Proven success launching at least one product from concept through production, with direct accountability for adoption, outcomes, and performance measurement
  • Practical experience creating and optimizing production prompts for AI-driven applications such as chat, voice, agent assistance, or related tools
  • Strong SQL skills, including the ability to work with joins, common table expressions, and analytical queries to independently explore and interpret data
  • Solid analytical capability with experience in funnel analysis, cohort evaluation, dashboard creation, and evidence-based decision-making
  • Technical understanding of APIs, webhooks, integrations, and system interactions sufficient to collaborate effectively with engineers
  • Excellent written communication and organizational discipline, with experience defining requirements, documenting success measures, and managing execution in structured environments
  • Familiarity with Agile practices and tools such as Jira, along with experience handling backlogs, prioritization, and issue tracking
Job Responsibility
  • Establish performance measures for AI solutions and deliver regular reporting on business and operational outcomes such as resolution effectiveness, completion rates, conversion trends, abandonment patterns, and revenue impact
  • Lead recurring reviews of live AI products, investigate performance declines across prompts, models, data inputs, and connected systems, and coordinate corrective actions through completion
  • Create quality review programs for conversations and interactions, synthesize observations from sampling and audits, and convert findings into improvements for prompts and user flows
  • Drive new AI initiatives from initial idea to production deployment by defining the product approach, setting measurable goals, coordinating build efforts, and overseeing successful releases
  • Manage proof-of-concept programs with clear test objectives, structured evaluation methods, decision criteria, and recommendations for expansion or discontinuation
  • Oversee third-party AI vendor assessments by gathering requirements, comparing options, validating references, supporting contract evaluation, and guiding onboarding within governance standards
  • Develop, refine, and maintain prompts for voice, chat, and assistive AI experiences across customer-facing and employee-facing journeys
  • Design evaluation methods for prompt and model output using real interaction data, apply scoring frameworks, and iterate systematically to improve quality and consistency
  • Partner with data and engineering teams to define tracking, reporting structures, and integrations needed to measure product performance, investigate issues, and support decision-making
  • Act as the primary owner for assigned AI products by aligning stakeholders across business, compliance, legal, operations, and engineering while ensuring adherence to governance and regulatory expectations
What we offer
  • medical, vision, dental, and life and disability insurance
  • 401(k) plan
  • Full-time