Lead Principal Technical Program Manager - ML Platform

United States · 225900.00 - 354850.00 USD / Year · Job Posted December 27, 2025

Job offer has expired

Job Responsibility

  • Analyze business objectives, customer needs, product adoption inhibitors and opportunities, industry trends, and based on these, in close collaboration with your stakeholders, define a long-term strategy and roadmap for your platform and product components.
  • Understand business objectives and translate them into technical systems problems that need to be prioritized and solved in the current business environment.
  • Define specific systems programs and create a plan of action for realizing those programs. Such programs could be around capacity planning, migration efforts, high availability, network architecture, performance optimization, reliability improvements and more.
  • Use your technical understanding of Atlassian and related systems to partner with and influence engineers and architects in making progress on these problems.
  • Take a systematic approach to engineering problems: prioritize tasks, scope out the project, define objectives, and make consistent progress against each of these.
  • Be accountable for the success of these technical programs by managing the entire lifecycle from initiation to forecasting, budgeting, scheduling, etc.
  • Manage complex dependencies and projects with a broad scope across the company

Requirements

  • 12+ years of experience on software teams as Development Manager or TPM
  • Strategic thinking and ability to understand business objectives to translate them into technical problems and programs.
  • Technical understanding of the systems involved. Willingness to develop domain expertise in the area in which they operate: storage, networking, authentication, capacity management, service deployments, etc.
  • TPMs are not expected to write or read code, but are expected to understand system flows, block architectures, APIs and such.
  • Experience defining and running end-to-end complex technical programs
  • Strong leadership, organizational, and communication skills

What we offer

  • health and wellbeing resources
  • paid volunteer days

Similar Jobs for Lead Principal Technical Program Manager - ML Platform

8 matching positions

Principal ML Engineer, ML Platform Engineering

Xometry is seeking a Principal Machine Learning Engineer to join our core machin...
Location: United States, North Bethesda
Salary: 140000.00 - 182000.00 USD / Year
Cherry Ventures
Expiration Date: Until further notice
Requirements
  • At least 7 years of experience in machine learning engineering, software engineering, data science, or similar technical role
  • A bachelor’s degree is required, but an advanced degree (M.S. or PhD) in computer science, machine learning, AI, or a related field is preferred and may substitute for some years of experience
  • Demonstrated experience designing and deploying cloud infrastructure (AWS preferred) to support machine learning, and machine learning models, with considerations for scale, reliability and security
  • Deep understanding of the machine learning lifecycle and related infrastructure needs - feature stores, a/b testing, model registration, drift detection, automated retraining, etc
  • Strong technical expertise. You will need to either have or demonstrate the ability to quickly build technical expertise in the following: Software engineering principles, including parallel and distributed computing, version control, reproducibility, and continuous integration
  • Machine learning techniques and algorithms, with emphasis on their impact on infrastructure implementation, including large-scale language and vision models (Transformers, GPT, VLMs, LLMs) and deep learning (PyTorch, TensorFlow)
  • Infrastructure as Code (IaC), especially Terraform
  • REST API design and implementation
  • Object oriented and functional programming in Python
  • Multimodal data processing (e.g., combining text, image, and 3D data)
Job Responsibility
  • Hands-On Technical Leadership: Adopt a 'lead by example' approach by actively coding and troubleshooting, as well as creating documentation and technical diagrams
  • Teaching & Mentorship: You will serve as a mentor and guide to engineers across the organization, teaching and mentoring them to grow their skills
  • Code Review: You will do code review and mentor others within the organization regarding best practices in ML Engineering
  • Operational Excellence: Guarantee the delivery of superior infrastructure and software that not only meets but exceeds customer expectations, while aligning with the strategic business timelines
  • Collaborative Strategy: Forge strong partnerships with product managers, data scientists, and company leadership to promote a culture of open communication and integrated team dynamics
  • Guide Innovation: Champion the adoption of cutting-edge technologies, methodologies, and practices to enhance problem-solving efficiency and effectiveness across the AI/ML organization.
What we offer
  • 401(k) match
  • medical, dental and vision insurance
  • life and disability insurance
  • generous paid time off including vacation, sick leave, floating and fixed holidays, maternity and bonding leave
  • EAP, other wellbeing resources
  • Fulltime

Member of Technical Staff, Principal Engineering Manager

As Microsoft continues to push the boundaries of AI, we are on the lookout for s...
Location: United States, Redmond
Salary: 139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Demonstrated track record of building and scaling engineering organizations (hiring teams from scratch, structuring orgs, growing managers)
  • Experience delivering large-scale software systems in AI, machine learning, or related fields
  • Experience managing organizations of 30+ engineers across multiple teams and workstreams
  • Deep expertise in LLM evaluation, AI quality measurement, or ML infrastructure at scale
  • Track record of partnering with senior leadership (VP/CVP level) to set strategy and drive cross-organizational programs
  • Experience recruiting and developing senior engineering talent (principal engineers, engineering managers) in a competitive market
  • Proven ability to operate effectively in fast-paced, ambiguous environments — comfortable making decisions with incomplete information and course-correcting quickly
  • Strong technical judgment: ability to evaluate architectural tradeoffs, assess technical risk, and guide teams toward sound engineering decisions without needing to write the code yourself
  • Experience leading distributed or multi-site engineering teams.
Job Responsibility
  • Build and lead a multi-team engineering organization (30+ engineers across multiple teams), including hiring and developing engineering managers who lead their own teams
  • Set the technical and organizational strategy for Copilot AI Evaluation and response quality, aligning with MAI's broader product and engineering vision
  • Partner with senior Eng and Product leadership (Partner+ level) to define priorities, influence roadmaps, and drive cross-organizational initiatives
  • Own end-to-end delivery of evaluation platforms, novel evaluation techniques, and agentic solutions for measuring and improving Copilot quality at scale
  • Recruit, develop, and retain world-class engineering talent — building a culture of technical excellence, accountability, and continuous learning
  • Drive operational rigor: establish engineering processes, quality bars, and delivery cadences that enable predictable, high-quality execution across multiple concurrent workstreams
  • Navigate ambiguity and make high-judgment tradeoff decisions on technology, staffing, and investment priorities in a fast-moving AI landscape
  • Foster a diverse, inclusive team culture where engineers at all levels can do their best work and grow their careers
  • Embody our Culture and Values.
  • Fulltime

Senior Principal Engineering Manager

Microsoft Research (MSR) is working to transform the future of artificial intell...
Location: United States, Redmond
Salary: 163000.00 - 296400.00 USD / Year
Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • 5+ years of people management experience leading software engineering teams, including managing principal engineers
  • Experience building or operating infrastructure for large-scale distributed systems, cloud platforms, or artificial intelligence (AI)/machine learning (ML) workloads
  • Track record of driving execution on complex, multi-workstream infrastructure projects with clear milestones and accountability
  • Technical fluency in one or more of: large-scale compute clusters, GPU infrastructure, scheduling and orchestration (Kubernetes, Volcano), or High-Performance Compute (HPC) environments
  • Experience with GPU programming (CUDA, NCCL) and frameworks such as PyTorch
  • Expertise in networking (InfiniBand, NVLink), storage systems, or distributed training parallelisms
  • A track record of strong cross-functional partnerships, including the ability to align on strategic direction, deliver joint accountabilities, and develop relationships with staff members with widely varied expertise
  • Experience scaling engineering teams through significant growth phases (hiring, onboarding, and integrating new engineers into a high-performing team)
  • Master's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 15+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Job Responsibility
  • Lead, mentor, and grow the engineering team that builds MSR’s AI research infrastructure
  • Recruit and develop exceptional engineering talent, building a diverse team - including hiring, onboarding, career development, and performance management
  • Drive execution across the team by setting clear goals, tracking milestones, managing dependencies, and ensuring accountability for delivering complex infrastructure projects on time and at high quality
  • Lead team culture and process changes, cultivating an AI-first mentality that accelerates our progress through agentic coding, automation, and skills development
  • Provide technical vision and judgment on the team's architecture, strategy, and roadmap — spanning supercomputer GPU clusters, high performance networking, workload optimization, researcher tools, and agentic workflows — while empowering engineers to own deep technical details
  • Collaborate closely cross-discipline with engineers, program managers, and research and science teams to align priorities, resolve dependencies, and build better solutions together
  • Foster a team culture of operational excellence, continuous improvement, and high psychological safety where engineers are empowered to take ownership and innovate
  • Fulltime

Principal Product Manager- CISO

The Cloud & AI organization accelerates Microsoft’s mission and bold ambitions t...
Location: United States, Redmond
Salary: 139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree AND 8+ years experience in product/service/program management or software development OR equivalent experience
  • The ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
  • Bachelor's Degree AND 12+ years experience in product/service/program management or software development OR equivalent experience
  • 4+ years experience taking a product, feature, or experience to market (e.g., design, addressing product market fit, and launch, internal tool/framework)
  • 6+ years experience improving product metrics for a product, feature, or experience in a market (e.g., growing customer base, expanding customer usage, avoiding customer churn)
  • 6+ years experience disrupting a market for a product, feature, or experience (e.g., competitive disruption, taking the place of an established competing product)
  • Proven experience running large-scale, cross-organizational programs as a general contractor or program lead, including setting up ROBs, KPIs, scorecards, and executive reporting for initiatives spanning multiple divisions
  • Familiarity with post-quantum cryptography concepts, NIST PQC standards (ML-KEM, ML-DSA), CNSA 2.0 timelines, or cryptographic migration programs
  • Experience working within or alongside governance bodies (such as a crypto board, security standards council, or compliance program like SFI or SDL) to drive enterprise-wide adoption
Job Responsibility
  • Serve as the General Contractor and PQ Pillar owner for Microsoft’s post-quantum cryptography transition, driving end-to-end program execution across all product families and divisions
  • Partner with the PQ Principal PM Architect and the Principal Group PM Manager to translate technical strategy into program roadmaps, work item definitions, dependency maps, and sequenced execution plans across three priority scenarios: encryption in transit (TLS), PKI (code signing, secure boot, authentication certificates), and encryption at rest
  • Establish and run the PQ rhythm of business (ROBs), including milestone tracking, executive status reporting, scorecards, and regular business reviews with SLT members, expanding the review cadence as additional scenarios come online
  • Define KPIs and accountability frameworks that make PQ adoption measurable across dozens of engineering teams, and hold divisions accountable to committed timelines
  • Work closely with Azure Security Ops to drive PQ prerequisite adoption through existing compliance and security programs, and coordinate with service teams to sequence deployments so dependencies are resolved before teams are asked to move
  • Represent the PQ program in business forums, leadership reviews, and cross-company governance meetings, serving as the single point of contact for program status and escalations
  • Communicate milestone wins to the field and to customers, supporting RFPs, governance requirements, and compliance readiness
  • Fulltime

Principal, Data Science & Analytics

Microsoft AI (MAI) builds an integrated consumer AI ecosystem across search, bro...
Location: United States, Redmond
Salary: 142800.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 5+ years data-science experience
  • OR Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 7+ years data-science experience
  • OR Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 10+ years data science experience
  • OR equivalent experience
Job Responsibility
  • Leadership: Mentor data scientists and align work with MAI ecosystem goals, driving technical excellence, innovation, and cross-team collaboration
  • Data Strategy & Execution: Develop ecosystem data strategies for marketplace and system performance, including standardized data collection, analysis, reporting, and interpretation
  • Advanced Analytics & Measurement: Apply machine learning, statistical modeling, data mining, and experimentation to large datasets; define and deliver metrics that accurately measure user and business value across products and marketplace components
  • Experimental Design & Implementation: Design and execute experiments across user and demand dimensions; translate strategy into clear, actionable, and measurable plans, sharing progress and results with stakeholders
  • Collaboration: Partner closely with product, program management, engineering, and business teams to integrate data science solutions into shared platforms and marketplace operations
  • Performance Optimization: Identify cross-team opportunities for product and process improvement; implement data-driven solutions to improve efficiency, reliability, and user experience
  • Influence & Decision-Making: Engage stakeholders with clear, compelling, and actionable insights
What we offer
  • Certain roles may be eligible for benefits and other compensation
  • Fulltime

Principal, Data Scientist

The Reliability Engineering group at Walmart Global Tech builds intelligent, dat...
Location: United States of America, Sunnyvale
Salary: 143000.00 - 286000.00 USD / Year
Walmart
Expiration Date: Until further notice
Requirements
  • 10+ years of experience in software engineering with applied machine learning
  • Strong track record of building and operating ML systems in production
  • Experience owning systems end-to-end in distributed, high-availability environments
  • Experience leading technical initiatives or driving architectural decisions
  • Strong proficiency in one or more programming languages commonly used in ML engineering, such as Python, Go, or Java
  • Strong experience with ML frameworks such as Scikit-learn, PyTorch, TensorFlow, or similar
  • Strong SQL skills and experience working with large-scale datasets
  • Hands-on experience training, validating, and deploying machine learning models in production across domains such as recommendation systems, forecasting, anomaly detection, classification, or similar applied ML use cases
  • Experience building and maintaining end-to-end ML pipelines (data ingestion, feature engineering, training, evaluation, deployment, monitoring)
  • Experience with model serving architectures (REST/gRPC inference services, batch inference, streaming inference)
Job Responsibility
  • Architect and implement end-to-end ML systems (data pipelines, feature engineering, model training, deployment, and monitoring)
  • Design scalable, low-latency model serving infrastructure integrated with Kubernetes and cloud-native systems
  • Build intelligent automation solutions including predictive autoscaling, anomaly detection, seasonality-aware forecasting, and capacity optimization
  • Engineer safe and reliable ML-driven automation that operates in high-availability environments
  • Own model lifecycle management, including validation, experiment tracking, model registry, monitoring, drift detection, and rollback strategies
  • Collaborate closely with platform, SRE, and infrastructure teams to embed ML capabilities into production systems
  • Drive engineering best practices around ML system reliability, observability, testing, and performance
  • Contribute to architectural decisions and mentor engineers on ML systems design
What we offer
  • medical, vision and dental coverage
  • 401(k)
  • stock purchase
  • company-paid life insurance
  • PTO (including sick leave)
  • parental leave
  • family care leave
  • bereavement
  • jury duty
  • voting
  • Fulltime

Sr. Distinguished AI Engineer (Agentic AI Platform)

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location: United States, San Jose, California; San Francisco, California
Salary: 343400.00 - 392000.00 USD / Year
Capital One
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Engineering, or AI plus at least 10 years of experience developing AI and ML algorithms or technologies, or Master's degree plus at least 8 years of experience developing AI and ML algorithms or technologies
  • At least 10 years of experience programming with Python, Go, Scala, or Java
  • 9 years of experience deploying scalable and responsible AI solutions on cloud platforms
  • 2+ years of experience supporting Agentic Frameworks
  • 2+ years of experience with LLMOps
  • 8+ years of experience designing mission-critical machine learning platforms
  • 2+ years of experience architecting, designing, developing, integrating, delivering, and supporting complex AI systems
  • Demonstrated ability to lead and mentor multiple engineering teams and influence cross-functional stakeholders up to the VP level
  • Experience developing AI and ML algorithms or technologies using Python, C++, C#, Java, or Golang
  • Master's degree in Computer Science, Computer Engineering, or relevant technical field
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products
  • Contribute to the north star platform architecture, continuously publishing and refining living diagrams and canonical APIs
  • Standardize and automate agentic workflows
  • Contribute to crafting an end-to-end GenAI SDK, CLI and starter kits
  • Help bring together a vision of central guardrail services
  • Collaborate with cross organization architects to drive end to end performance
  • Accelerate innovation by incubating proof of concepts and driving RFCs
  • Own central Helm charts, operators and CRDs that auto scale agents to hit tenant SLAs
  • Coach and evangelize - hosting architecture office hours, mentoring Staff, Principal and Senior engineers, authoring technical design documents and blogs and representing Capital One at Tier1 AI conferences
What we offer
  • Performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
  • comprehensive, competitive, and inclusive set of health, financial and other benefits
  • Fulltime

Principal Data And Analytics Engineer

The Principal Data and Analytics Engineer holds comprehensive responsibility for...
Location: United States
Salary: 108086.00 - 180144.00 USD / Year
O'Reilly Auto Parts
Expiration Date: Until further notice
Requirements
  • Proven experience architecting enterprise-scale data platforms and ecosystems, including hybrid and cloud-native environments (e.g., GCP BigQuery, Snowflake, Iceberg, Advanced SQL, Erwin, dbt, Kafka, Alation, Collibra)
  • Deep expertise in designing and scaling highly available, secure, and fault-tolerant batch and streaming pipelines with strong emphasis on cost optimization, observability, and latency control
  • Advanced proficiency in semantic modeling, reusable data asset design, and cross-functional data product delivery aligned to medallion architecture
  • Leadership in implementing CI/CD-enabled pipelines, RBAC frameworks, schema evolution strategies, and interoperable data exchange using Iceberg or equivalent table formats
  • Ownership of organization-wide metrics store and semantic layers, ensuring consistency, governance, and performance across reporting, AI, and ML use cases
  • Advanced expertise in programming languages such as Python and Scala, with the ability to architect complex data solutions
  • Demonstrated leadership in designing and overseeing the implementation of scalable, idempotent workflows using orchestration frameworks such as Airflow and Prefect
  • Demonstrated ability to translate business transformation goals into scalable data solutions and reusable patterns
  • Deep understanding of business processes, KPIs, and capability maps across functions such as supply chain, customer, store ops, and finance
  • Proven experience in driving cross-functional data product prioritization, influencing senior stakeholders, and quantifying impact of data initiatives
Job Responsibility
  • Help define and evolve enterprise data engineering blueprints, including data mesh, medallion architecture, and hybrid cloud data platforms
  • Set strategic direction for data platforms, tools, and services (e.g., Snowflake, GCP BigQuery, dbt, Kafka, Airflow/Prefect) in alignment with future-state architecture and business priorities
  • Architect and design highly scalable, resilient, cost optimal and secure data platforms
  • Lead the design and implementation of next-generation data platforms, ensuring fault tolerance, high availability, and optimal performance for petabyte-scale data
  • Establish and enforce organization-wide best practices for data pipeline development, CI/CD for data workflows, automated deployment playbooks, and robust rollback strategies
  • Lead technology evaluation and adoption, proactively researching, evaluating, and championing the integration of cutting-edge data technologies, frameworks, and methodologies
  • Define and scale enterprise knowledge management frameworks that ensure consistent documentation, discoverability, and reusability of data assets across domains
  • Establish and govern standards for metadata management, data lineage, architectural diagrams, and runbooks
  • Lead the design of federated governance models that empower domain-aligned teams to operate autonomously while conforming to centralized policies, frameworks and playbooks
  • Collaborate with data governance, compliance, and security teams to operationalize policy-as-code frameworks for data retention, access control, and PII handling
What we offer
  • Competitive Wages & Paid Time Off
  • Stock Purchase Plan & 401k with Employer Contributions Starting Day One
  • Medical, Dental, & Vision Insurance with Optional Flexible Spending Account (FSA)
  • Team Member Health/Wellbeing Programs
  • Tuition Educational Assistance Programs
  • Opportunities for Career Growth
  • Fulltime