Program Lead: Product Operations - AI Observability

Uber

Location:
United States, Sunnyvale

Contract Type:
Not provided

Salary:

162000.00 - 180000.00 USD / Year

Job Description:

The AI Observability Program Leader will own the end-to-end strategy, design, and implementation of the frameworks used to monitor, understand, and improve Uber’s GenAI-powered agentic systems. This role sits within the Global Digital Experience team, the operational arm of Uber’s customer support tech organization, and is a critical driver of accuracy, safety, and reliability across Uber’s next-generation AI solutions. This leader will bridge the gap between raw AI logs and actionable product insights.

Job Responsibility:

  • Architect Observability Frameworks: Own the strategy for understanding AI agentic reasoning, enabling deep analysis of step-by-step agent decision-making
  • Drive Autoeval Strategy: Design and roll out automated evaluation systems (LLM-as-a-judge) to provide a scalable, high-confidence "pulse" on AI performance across conversational and voice interfaces
  • Define Micrometrics: Develop granular signals within agentic activity—identifying latent failures, reasoning loops, or tool-calling inefficiencies—to drive product improvements
  • Lead Pre-Launch Simulation: Partner with Product & Engineering to build and maintain simulation environments that test AI agents against edge cases before deployment, and make these tools accessible to Operations teams
  • Cross-Functional Technical Partnership: Act as the primary liaison between Product, Engineering, and Data Science to ensure observability tooling is integrated into the development lifecycle and directly informs release "Go/No-Go" decisions
  • Insight Synthesis: Package complex technical observability data into clear, actionable narratives for leadership, highlighting specific failure patterns and opportunities for CX improvement
  • Operational Excellence: Establish the standards and tooling for how AI performance is reported globally, ensuring consistency across different regions and support modalities.

Requirements:

  • 5+ years of experience in Technical Program Management, Product Operations, AI Quality, or Observability
  • Bachelor’s degree in Engineering, Computer Science, Data Science, or a related technical field.

Nice to have:

  • AI Literacy: Deep understanding of GenAI systems, including LLM orchestration, agentic workflows, and the nuances of reasoning chains (e.g., Chain of Thought)
  • Systems Thinking: Proven experience designing technical frameworks or evaluation pipelines (e.g., autoevals, RAG evaluation, or model benchmarking)
  • Analytical Rigor: Ability to define and track complex technical metrics (micrometrics) and correlate them with high-level business KPIs
  • Influence without Authority: Demonstrated ability to drive complex initiatives in an IC capacity by building strong partnerships with Engineering and Product teams
  • Advanced AI Expertise: Experience with "LLM-as-a-judge" frameworks, prompt engineering for evaluations, and fine-tuning feedback loops
  • Simulation & Testing: Background in building simulators, "digital twins," or robust A/B testing frameworks for conversational AI or autonomous agents
  • Tooling Proficiency: Familiarity with AI observability tools
  • Problem Solving: Exceptional ability to turn "noisy" AI logs into structured failure pattern analysis
  • Communication: Strong ability to translate highly technical agent behaviors into business-relevant insights for non-technical stakeholders
  • Domain Knowledge: Experience in Customer Support technology, Voice UX, or high-volume automated workflows.
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award and other types of compensation
  • All full-time employees are eligible to participate in a 401(k) plan
  • Eligible for various benefits (details at link).

Additional Information:

Job Posted:
April 12, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Program Lead: Product Operations - AI Observability

AI Product Manager

We’re scaling AI and machine learning across our products, devices, and operatio...
Location:
United States, Boston
Salary:
121300.00 - 177900.00 USD / Year
SimpliSafe
Expiration Date:
Until further notice
Requirements:
  • 8+ years of product management experience, including significant ownership of AI/ML or data-intensive products
  • Clear track record of shipping production ML systems (not just integrating third-party AI APIs), in close partnership with data science, ML engineering, and MLOps
  • Principal-level impact: leading cross-team initiatives, shaping strategy, and influencing senior stakeholders
  • Strong understanding of core ML concepts and lifecycle: data, labeling, training/validation, evaluation metrics, deployment, monitoring, and retraining
  • ML experience with at least one of the following: computer vision or sensor data, LLM-powered applications (prompting, RAG, fine-tuning, evaluation), and/or hardware or edge products (e.g., on-device models, connectivity/latency trade-offs)
  • Familiarity with modern ML infrastructure (cloud platforms, model serving, CI/CD for ML, monitoring/alerting)
  • Comfortable going deep into data, metrics, and model behavior—not just the UX layer
  • Excellent communicator who can make complex AI topics clear to diverse audiences
  • Strong alignment with our values: customer-obsessed, low ego, highly collaborative, comfortable with ambiguity, and biased toward learning and iteration.
Job Responsibility:
  • Define and communicate the multi-year roadmap for key AI/ML capabilities across SimpliSafe
  • Identify and prioritize AI opportunities where models and data can materially improve safety, customer experience, or efficiency—on both devices and cloud services
  • Make build-vs-buy decisions for AI capabilities in partnership with data science and engineering
  • Partner with data scientists, ML engineers, and MLOps to design and deliver end-to-end ML solutions—from problem framing through data, training, evaluation, deployment, and monitoring
  • Work with hardware and embedded teams to shape edge AI/ML experiences (e.g., on-device detection, low-latency decisions, bandwidth-aware designs)
  • Define model-level requirements (metrics, latency, cost, guardrails) and connect them to business outcomes (e.g., false alarm reduction, detection accuracy, handle time, CSAT)
  • Translate product needs into requirements for ML platform capabilities (model serving, observability, experiment tracking, human-in-the-loop tools)
  • Lead product direction for LLM and multimodal use cases (e.g., text, vision, sensor data)
  • Decide when to use prompt engineering, RAG, fine-tuning, or traditional ML—and how to evaluate quality, safety, and hallucinations
  • Design workflows that incorporate human review and escalation where needed
What we offer:
  • A mission- and values-driven culture and a safe, inclusive environment where you can build, grow, and thrive
  • A comprehensive total rewards package that supports your wellness and provides security for SimpliSafers and their families
  • Free SimpliSafe system and professional monitoring for your home
  • Employee Resource Groups (ERGs) that bring people together, give opportunities to network, mentor and develop, and advocate for change
  • Participation in our annual bonus program, equity, and other forms of compensation, in addition to a full range of medical, retirement, and lifestyle benefits.
Employment Type:
Full-time

Head of AI Transformation & Operational Excellence

Lead Valtech’s end-to-end AI transformation and operational excellence for a maj...
Location:
France, Paris
Salary:
Not provided
Valtech
Expiration Date:
Until further notice
Requirements:
  • Proven track record in leading complex, data-driven or AI-enabled transformation programs
  • Demonstrated ability to move initiatives from POC to industrialized, production-grade solutions
  • Experience leading distributed delivery teams (onshore/offshore)
  • Strong understanding of e-commerce, retail, or omnichannel ecosystems
  • Ability to operate at both strategic and operational levels
  • Fully bilingual French / English is mandatory
  • Strong transformation leader with a clear, forward-looking vision
  • Clear, direct, and structured communicator, able to simplify complexity
  • Pragmatic, execution-focused, and value-driven
Job Responsibility:
  • Lead Valtech’s end-to-end AI transformation and operational excellence for the client account
  • Own the definition, orchestration, and execution of AI initiatives, ensuring they move from POC to industrialized, production-grade capabilities delivering tangible value
  • Ensure AI-driven transformation is consistently embedded across the four core pillars of delivery: Agile Operating Model, Engineering, Quality & Testing, Production & Support
  • Drive innovation and continuous improvement of delivery and support operations, leveraging AI, automation, process optimization, and new operating practices
  • Define and own the global AI transformation roadmap across eRetail and omnichannel delivery
  • Ensure AI initiatives are aligned with the client’s business priorities and Valtech’s delivery and operational objectives
  • Drive AI adoption beyond experimentation, with clear industrialization paths, ownership models, and value tracking
  • Position AI as a structural enabler of delivery performance, quality, and operational efficiency
  • Work in close partnership with the AI Transformation Program Manager
  • Act as Valtech’s AI and operational excellence reference toward the client
What we offer:
  • Flexibility, with remote and hybrid work options (country-dependent)
  • Career advancement, with international mobility and professional development programs
  • Learning and development, with access to cutting-edge tools, training and industry experts
Employment Type:
Full-time

Senior AI Engineer

Guidepoint is seeking an experienced Senior AI Engineer to join our Toronto-base...
Location:
Canada, Toronto
Salary:
Not provided
Modoras Accounting Syd
Expiration Date:
Until further notice
Requirements:
  • 6+ years of professional experience (or 5+ with a Master’s degree) designing, building, and scaling distributed, production-grade backend systems
  • 2+ years building and operating Generative AI and agentic systems in production
  • Strong software engineering fundamentals in Python, including building and scaling REST APIs using frameworks such as FastAPI, with experience in asynchronous programming and microservices
  • Hands-on experience building enterprise AI agents and workflows using LLM platforms such as OpenAI, Anthropic (Claude), or Google Gemini, and frameworks like LangChain or agent SDKs
  • Experience building and operating within the enterprise AI ecosystem, including custom GPTs or agents, agent builders, connectors/apps, and application or agent SDKs (e.g., OpenAI Apps SDK, ChatKit, or equivalents)
  • Experience designing and operating agent integration layers (e.g., MCP servers or similar) that connect AI agents to internal APIs, tools, and services, with secure authentication and authorization using enterprise identity platforms such as Okta, Microsoft Entra ID, or OAuth-based systems
  • Strong understanding of AI governance, compliance, and responsible AI practices, including access control, auditability, data handling, and secure deployment of AI systems in enterprise environments
  • Direct experience with RAG, vector search using databases such as Elasticsearch, multi-agent AI systems, tool-calling agents, prompt engineering, and agent evaluation in production environments
  • Cloud-native experience deploying and operating containerized applications on Azure (preferred) or AWS/GCP using Docker and Kubernetes
  • Proven ability to lead complex technical initiatives, make sound architectural decisions, and mentor engineers building production-ready AI systems
Job Responsibility:
  • Design, build, and operate scalable, low-latency backend services and REST APIs that power Generative AI capabilities, including retrieval-augmented generation (RAG) pipelines, vector search, and enterprise-grade agentic systems
  • Own the full lifecycle of AI applications and agents, from system architecture and development to CI/CD, deployment, agent evaluation, monitoring, and ongoing optimization in production
  • Build production-grade research agents and enterprise AI workflows that integrate LLMs with proprietary knowledge, vector databases (e.g., Elasticsearch), internal tools, external APIs, and real-time data
  • Design and operate multi-agent AI systems, including tool-calling agents and agent orchestration patterns, to support complex research and enterprise workflows
  • Apply AIOps best practices for building, evaluating, deploying, and operating AI agents with strong observability, reliability, and quality controls
  • Continuously improve retrieval and generation quality using prompt engineering, retrieval tuning, re-ranking, advanced chunking strategies, and hallucination reduction techniques
  • Provide technical leadership through design discussions, code reviews, and mentorship, and partner closely with product and business stakeholders to influence the AI roadmap
What we offer:
  • Paid Time Off
  • Comprehensive benefits plan
  • Company RRSP Match
  • Development opportunities through the LinkedIn Learning platform

Director TPM, Release Management and Developer Experience

As Director of Technical Program Management (TPM) at SingleStore, you will be re...
Location:
India
Salary:
Not provided
SingleStore
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in program/project management, with at least 5+ years in senior leadership roles (Director or equivalent)
  • Proven track record of leading release management and developer experience programs in high-growth, cloud-native, or SaaS organizations
  • Strong expertise in cloud platforms (AWS, GCP, Azure) and engineering productivity tooling (CI/CD, observability, source control, developer platforms)
  • Demonstrated ability to manage complex, cross-functional programs across engineering, product, QA, and operations
  • Experience building, scaling, and mentoring program management and release functions within engineering
  • Skilled in agile methodologies, portfolio planning, and Jira/Atlassian ecosystem administration
  • Strong executive communication and stakeholder management skills, with the ability to influence senior leadership and drive organizational alignment
  • Track record of balancing strategic vision with hands-on program delivery
Job Responsibility:
  • Release Management Leadership: Own and scale the release management process across engineering, ensuring consistent, predictable, and high-quality software releases that align with product and business goals
  • Developer Experience Programs: Lead initiatives that improve developer productivity, workflows, and tooling (e.g., CI/CD, observability, internal platforms). Drive adoption of best practices and optimize engineering systems for efficiency and ease of use
  • Program & Portfolio Oversight: Define and govern a portfolio of cross-functional programs spanning release management, QA, DevOps, and developer experience. Ensure programs deliver measurable business value and outcomes
  • Strategic Alignment: Partner with senior engineering and product leaders to align roadmaps, priorities, and dependencies. Act as a connector across teams to ensure clarity of vision and execution
  • Frameworks & Best Practices: Establish scalable planning, governance, and reporting mechanisms (e.g., agile frameworks, Jira administration, dashboards) that enhance visibility and accountability across engineering programs
  • Leadership & Mentorship: Build, lead, and mentor a high-performing program management and release management function, fostering career growth and organizational impact
  • AI & Automation Enablement: Champion AI-driven automation and analytics in release pipelines, developer tooling, and program management to improve efficiency, predictability, and decision-making
  • Executive Communication: Deliver concise, actionable updates to executives and stakeholders, surfacing progress, risks, and opportunities across releases and programs
  • Continuous Improvement: Identify bottlenecks, risks, and opportunities for process improvement across engineering; drive adoption of innovative practices and tools that improve reliability, speed, and developer experience

Principal Engineer

The Principal AI/ML Operations Engineer leads the architecture, automation, and ...
Location:
United States, Pleasanton, California
Salary:
251000.00 - 314500.00 USD / Year
BlackLine
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field
  • 10+ years in ML infrastructure, DevOps, and software system architecture
  • 4+ years in leading MLOps or AI Ops platforms
  • Strong programming skills in languages such as Python, Java, or Scala
  • Expertise in ML frameworks (TensorFlow, PyTorch, scikit-learn) and orchestration tools (Airflow, Kubeflow, Vertex AI, MLflow)
  • Proven experience operating production pipelines for ML and LLM-based systems across cloud ecosystems (GCP, AWS, Azure)
  • Deep familiarity with LangChain, LangGraph, ADK or similar agentic system runtime management
  • Strong competencies in CI/CD, IaC, and DevSecOps pipelines integrating testing, compliance, and deployment automation
  • Hands-on experience with observability stacks (Prometheus, Grafana, New Relic) for model and agent performance tracking
  • Understanding of governance frameworks for Responsible AI, auditability, and cost metering across training and inference workloads
Job Responsibility:
  • Define enterprise-level standards and reference architectures for MLOps and AIOps systems
  • Partner with data science, security, and product teams to set evaluation and governance standards (Guardrails, Bias, Drift, Latency SLAs)
  • Mentor senior engineers and drive design reviews for ML pipelines, model registries, and agentic runtime environments
  • Lead incident response and reliability strategies for ML/AI systems
  • Lead the deployment of AI models and systems in various environments
  • Collaborate with development teams to integrate AI solutions into existing workflows and applications
  • Ensure seamless integration with different platforms and technologies
  • Define and manage MCP Registry for agentic component onboarding, lifecycle versioning, and dependency governance
  • Build CI/CD pipelines automating LLM agent deployment, policy validation, and prompt evaluation of workflows
  • Develop and operationalize experimentation frameworks for agent evaluations, scenario regression, and performance analytics
What we offer:
  • Short-term and long-term incentive programs
  • Robust offering of benefit and wellness plans
Employment Type:
Full-time

Data and AI Solution Architect

The Solution Architect is responsible for translating the client’s business requ...
Location:
Singapore, Singapore
Salary:
Not provided
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science, Information Technology, Engineering, or related field AND 12+ years experience in technology solutions, practice development, architecture, consulting, and/or technology domain (e.g., Security)
  • OR Master's Degree in Computer Science, Information Technology, Engineering, or related field AND 8+ years experience in technology solutions, practice development, architecture, consulting, and/or technology domain (e.g., Security)
  • OR equivalent experience
  • 6+ years technical sales experience
  • 6+ years project management experience
  • 3+ years people management experience, including managing consultant practice managers, technical sales managers, and/or technical architect managers
  • Technical or Professional Certification in Domain (e.g., Security)
Job Responsibility:
  • Gather customer/partner insights from a broad range of stakeholders as well as the main business sponsor to shape and form both the definition and ongoing execution of projects and work with customer/partner stakeholders and business sponsor to socialize both the business solution and the project approach to determine if changes are needed to the business solution
  • Use an evidence-based approach to represent the customer/partner as a customer advocate and share insights with Product Engineering teams and the business with a view to improving Microsoft technologies, products, services and offerings such that they can better meet customer/partner needs across a territory
  • Define and document the Architecture through an architecture description document, an architecture decisions log, and a requirements/constraints traceability matrix to communicate the value proposition of the business solution along with the project approach
  • Work with customers to understand and demonstrate business value (e.g., release of revenue, cost savings) that the business solution realizes, manage and resolve ambiguity in the requirements and constraints, and document assumptions and implications where it cannot be resolved
  • Generate new intellectual property and/or improve existing intellectual property. Connect gaps and patterns across business and technology areas to drive changes
  • Identify which ideas should be culled, with consideration for scale across customers, drive the re-use of intellectual property, recommend practices in both pre-sales and delivery, and participate in and contribute to internal/external communities
  • Lead virtual teams around technologies and customer/partner challenges by sharing ideas, insight, and strategic, technical input with technical teams, internal communities across the field and the larger virtual team across Microsoft using knowledge of Microsoft architectures and their context in the competitive landscape
  • Design and lead end‑to‑end enterprise data, GenAI, Copilot, and agentic AI architectures, translating business outcomes into scalable, secure, and governed solutions across cloud data platforms and AI services
  • Own data platform design (lakehouse/data lake/warehouse, streaming, analytics, semantic models) to ensure the enterprise data estate is AI‑ready (quality, lineage, governance, observability)
  • Lead data migration and modernization from on‑premises environments to cloud, including planning and executing migrations with minimal disruption, and aligning migration choices to modernization goals
Employment Type:
Full-time

Data and AI Solution Architect

The Solution Architect is responsible for translating the client’s business requ...
Location:
Thailand, Bangkok
Salary:
Not provided
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science, Information Technology, Engineering, or related field AND 12+ years experience in technology solutions, practice development, architecture, consulting, and/or technology domain (e.g., Security)
  • OR Master's Degree in Computer Science, Information Technology, Engineering, or related field AND 8+ years experience in technology solutions, practice development, architecture, consulting, and/or technology domain (e.g., Security)
  • OR equivalent experience
  • 6+ years technical sales experience
  • 6+ years project management experience
  • 3+ years people management experience, including managing consultant practice managers, technical sales managers, and/or technical architect managers
  • Technical or Professional Certification in Domain (e.g., Security)
Job Responsibility:
  • Gather customer/partner insights from a broad range of stakeholders as well as the main business sponsor to shape and form both the definition and ongoing execution of projects and work with customer/partner stakeholders and business sponsor to socialize both the business solution and the project approach to determine if changes are needed to the business solution
  • Use an evidence-based approach to represent the customer/partner as a customer advocate and share insights with Product Engineering teams and the business with a view to improving Microsoft technologies, products, services and offerings such that they can better meet customer/partner needs across a territory
  • Define and document the Architecture through an architecture description document, an architecture decisions log, and a requirements/constraints traceability matrix to communicate the value proposition of the business solution along with the project approach
  • Work with customers to understand and demonstrate business value (e.g., release of revenue, cost savings) that the business solution realizes, manage and resolve ambiguity in the requirements and constraints, and document assumptions and implications where it cannot be resolved
  • Generate new intellectual property and/or improve existing intellectual property
  • Connect gaps and patterns across business and technology areas to drive changes
  • Identify which ideas should be culled, with consideration for scale across customers, drive the re-use of intellectual property, recommend practices in both pre-sales and delivery, and participate in and contribute to internal/external communities
  • Lead virtual teams around technologies and customer/partner challenges by sharing ideas, insight, and strategic, technical input with technical teams, internal communities across the field and the larger virtual team across Microsoft using knowledge of Microsoft architectures and their context in the competitive landscape
  • Design and lead end‑to‑end enterprise data, GenAI, Copilot, and agentic AI architectures, translating business outcomes into scalable, secure, and governed solutions across cloud data platforms and AI services
  • Own data platform design (lakehouse/data lake/warehouse, streaming, analytics, semantic models) to ensure the enterprise data estate is AI‑ready (quality, lineage, governance, observability)
Employment Type:
Full-time

Principal Software Engineering Manager - Data Analytics

Build and operate core infrastructure services that power Fabric Data Engineerin...
Location:
Canada, Vancouver
Salary:
142400.00 - 257500.00 CAD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role.
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
  • 4+ years people management experience.
  • Software engineering foundation (data structures, algorithms, testing, debugging, performance) with the ability to guide and review technical decisions.
  • Demonstrated experience leading teams that build and ship production infrastructure (backend services, distributed systems, platform components) in cloud environments.
  • Understanding of distributed systems concepts, including fault tolerance, scaling, scheduling, and resource management, and ability to apply them to team-level architecture.
  • Proficiency in at least one backend/system language (e.g., Java, Scala, C#, C++, Python) and the ability to stay hands-on enough to unblock teams and assess designs.
  • Proven ability to ramp up quickly in new domains, tools, and codebases; growth mindset and learning agility.
Job Responsibility:
  • Lead the design and delivery of world-class experiences for a new big data cloud offering, with emphasis on scale, reliability, and performance.
  • Manage and grow a team building core infrastructure services for data engineering and analytics workloads (compute, runtime services, job/session management, configuration, platform integrations).
  • Own technical strategy and execution end-to-end: translate product requirements into architecture, milestones, and high-quality production outcomes.
  • Drive operational excellence by establishing troubleshooting practices (logs, metrics, traces), guiding root-cause analysis, and converting operational learnings into engineering improvements.
  • Improve platform scalability, resiliency, and observability, including automation to reduce operational toil; ensure best practices are adopted consistently across the team.
  • Partner cross-functionally with product and engineering leaders to deliver end-to-end features, align priorities, and continuously raise the quality bar.
  • Coach and mentor engineers, provide technical guidance and performance feedback, and foster a culture of ownership, high standards, and continuous learning.
Employment Type:
Full-time