Principal Software Engineer, Experimentation Platform - CoreAI

Microsoft Corporation

Location:
United States, Redmond

Contract Type:
Not provided

Salary:

139900.00 - 274800.00 USD / Year

Job Description:

CoreAI sits at the center of Microsoft’s mission to redefine how software is built and experienced, providing the foundational platforms, services, and developer experiences that power the next generation of AI-driven applications. As part of CoreAI, the Experimentation Platform (ExP) enables trustworthy, high-scale online experimentation that accelerates product learning and drives progress across Microsoft’s AI ecosystem. You will play a pivotal role in shaping the technical direction of systems that help teams ship better AI experiences faster by providing the experimentation capabilities needed to evaluate, refine, and safely deploy new innovations. In this role, you will lead the architecture and development of one of the highest-scale experimentation platforms - critical infrastructure that enables rapid iteration in AI systems and product features across Microsoft. You will drive the technical vision for services that empower engineers and scientists across the company to measure impact, validate hypotheses, and advance state-of-the-art AI capabilities through rigorous experimentation. This is an opportunity to lead complex, cross-team technical initiatives while shaping the future of distributed systems architecture, service reliability, and experimentation methodologies at Microsoft scale. You will thrive in this role if you are a technical leader who enjoys driving architecture decisions across teams, mentoring senior engineers, and building the reliable infrastructure foundations that accelerate Microsoft’s progress in AI.

Job Responsibility:

  • Champion and improve AI tools and practices across the software development lifecycle (SDLC), incorporating appropriate controls over AI-generated assets
  • Lead by example across teams to produce extensible, maintainable, well-tested, secure, and performant code
  • Identify and establish coding best practices, create and apply metrics to drive code quality and stability, and mentor engineers to continuously raise the engineering bar
  • Own and lead the architecture of complex product solutions, driving design discussions, evaluating new technologies to solve problems, and ensuring system architecture meets performance, scalability, resiliency and disaster recovery requirements
  • Lead cross-team collaboration to identify dependencies, negotiate delivery schedules, drive alignment across partner teams, and ensure proper end-to-end testing, live-site coverage, scalability and performance before going live
  • Drive engineering excellence across products
  • Lead efforts targeting zero-touch deployment, production reliability, and security hardening for both protections and detections
  • Hold accountability as a designated responsible individual (DRI) across products and solutions, mentor engineers on live-site operations, lead incident retrospectives that drive systemic

Requirements:

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Nice to have:

  • Master’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Extensive experience architecting and operating large-scale distributed systems on cloud platforms (Azure, AWS, GCP), with demonstrated ownership of critical production infrastructure serving millions of users
  • Track record of designing highly scalable, resilient service architectures with strong emphasis on fault tolerance, disaster recovery, and cost optimization at scale
  • Deep experience using observability tools (logging, metrics, distributed tracing) to diagnose complex cross-service issues and drive systemic reliability improvements across multiple products
  • Proven experience mentoring senior engineers, driving technical direction, conducting design reviews, and raising the engineering bar across teams
  • Experience with experimentation platforms, A/B testing at scale, and statistical methodologies for measuring product impact and driving data-informed ship decisions
  • Experience leading security hardening efforts, threat modeling, and incident response processes for production systems
  • Experience championing AI-assisted development workflows and establishing responsible AI coding practices across engineering teams

Additional Information:

Job Posted:
March 20, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Principal Software Engineer, Experimentation Platform - CoreAI

Principal Software Engineer - Growth (CoreAI)

We’re building AI‑first growth and experimentation systems that scale across Mic...
Location:
United States, Mountain View
Salary:
163000.00 - 296400.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Own growth through engineering excellence and experimentation — at a systems level
  • Architect and build paved paths for online experimentation: standardized metrics, guardrails, analysis workflows, and rollout automation that improve reliability and decision quality across teams
  • Lead multi‑workstream initiatives that span teams/products (e.g., unified growth measurement, cross‑surface funnels, experimentation quality improvements)
  • Build and evolve core capabilities: telemetry foundations, experiment assignment/targeting, feature flighting, and risk controls (kill‑switches, guardrails, progressive delivery)
  • Partner with Product, Data Science, Design, and Research to turn ambiguous goals into shippable, measurable systems
  • Stay close to the work: write production code, review designs/PRs, and coach others through architecture and implementation tradeoffs
Employment Type:
Fulltime

Principal Software Engineer, CoreAI Workload Engines

The CoreAI Workloads team builds the foundational inference engines and APIs tha...
Location:
United States, Redmond
Salary:
139900.00 - 331200.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field and 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python, or equivalent experience
  • Proven ability to design and operate large-scale, production inference services with high reliability and performance requirements, and to ship performance improvements safely via disciplined experimentation
  • Strong skills in performance analysis: benchmarking, profiling, diagnosing regressions, and turning results into concrete engine/runtime changes
  • Strong problem-solving skills and the ability to debug complex, cross-layer systems issues
  • Demonstrated technical leadership, including mentoring engineers, driving cross-team architectural alignment, and leveraging AI tools and AI-assisted workflows to accelerate engineering velocity and quality
  • Hands-on experience with Kubernetes (building and operating services on k8s), including debugging production issues and designing platform abstractions (e.g., custom resources/controllers) and scheduling-aware deployments (e.g., node affinity, taints/tolerations, resource requests/limits)
  • Strong collaboration and communication skills, with the ability to work across organizational boundaries
Job Responsibility:
  • Optimize inference engines for OpenAI and open-source models by implementing and shipping performance/efficiency improvements across runtime, scheduling, and serving paths (latency, throughput, utilization, availability, and cost)
  • Run experiments end-to-end: formulate hypotheses, implement engine changes (including Python/PyTorch integration points where relevant), analyze results, and ship improvements behind guardrails
  • Build and use experimentation capabilities for large-scale AI inference (experiment lifecycle, tracking, metric modeling, comparability standards, automated analysis) so the team can iterate quickly and safely
  • Own serving availability and efficiency for Azure OpenAI Service workloads through tiered experimentation, lean segmentation, and multi-modal utilization across heterogeneous fleets—turning findings into shipped engine improvements
  • Design and evolve inference serving architectures to improve utilization and latency using techniques such as disaggregated serving, multi-token prediction, KV offload/retrieval, and quantization—validated via staged rollouts and production guardrails
  • Extend AI infrastructure abstractions to support elastic, heterogeneous inference engines reliably at scale (e.g., dynamic scaling across model families, modalities, and workload classes while maintaining isolation and SLOs)
  • Tune and scale inference engines across NVIDIA GPU generations (A100, H100, H200) for state-of-the-art OpenAI models, focusing on serving efficiency, utilization, and reliability (not hardware bring-up)
  • Partner with networking and storage teams to leverage high-performance interconnects (e.g., RDMA/InfiniBand-class fabrics such as RoCE over IB) for distributed inference, without owning low-level kernel/driver enablement
  • Drive end-to-end features from design through production: observability, diagnostics, performance regression detection, and operational excellence for inference serving
  • Influence platform architecture and technical direction across teams through design reviews, clear metrics, and technical leadership focused on experimentation velocity and production reliability
Employment Type:
Fulltime

Principal Product Manager

At Microsoft, we are building the world’s most trusted and developer‑centric AI ...
Location:
India, Bangalore
Salary:
Not provided
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in engineering, computer science, or a related technical field
  • Significant experience (typically 8–12+ years) in product management or software engineering with substantial product ownership, including experience working on platform or infrastructure products
  • Demonstrated ability to operate effectively in large, ambiguous, multi‑team environments with shared ownership and complex dependencies
  • Strong technical depth in cloud platforms, distributed systems, or AI/ML infrastructure, with the ability to engage credibly with senior engineers and architects
  • Proven track record of influencing strategy, driving alignment, and delivering outcomes through collaboration rather than direct authority
  • Strong analytical and systems‑thinking skills, with experience making high‑quality decisions in fast‑moving, evolving problem spaces
Job Responsibility:
  • Act as a senior contributor to platform strategy for Azure AI Foundry and Azure ML, helping shape multi-year investments across model training, customization, deployment, and lifecycle management
  • Drive alignment and progress across federated, cross-organizational initiatives, working with peer Principal PMs and multiple engineering teams on shared platform outcomes
  • Contribute to the definition and evolution of high-leverage platform abstractions (APIs, SDKs, workflows) that enable scalable adoption of GenAI and custom code training workloads
  • Partner closely with senior engineering leaders to influence architectural direction, surface trade-offs, and ensure platform capabilities meet scale, reliability, and security expectations
  • Engage with strategic customers and internal stakeholders to gather insights, validate requirements, and translate learnings into durable, reusable platform capabilities
  • Use data, metrics, and experimentation to evaluate impact and inform product decisions across shared ownership areas
  • Serve as a thought leader and mentor within CoreAI, elevating product craft, platform thinking, and responsible AI practices across the organization
Employment Type:
Fulltime

Principal Product Manager - DevOps AI - CoreAI

Microsoft’s mission is to empower every person and every organization to achieve...
Location:
United States, Redmond
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree AND 8+ years experience in product/service/program management or software development OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • 10+ years of product management experience building and shipping platform, infrastructure, or developer tooling products
  • Proven experience working across multiple teams and systems to deliver outcomes in complex, highly matrixed organizations
  • Ability to define and drive ambiguous problem spaces, turning strategy and research into concrete, actionable product investments
  • Experience with AI-assisted developer workflows, automation, or intelligent systems applied to software engineering
  • Understanding of developer workflows and DevOps systems, including build, test, release, and operational feedback loops
  • Excellent communication and stakeholder management skills, with the ability to align diverse partners around shared goals, tradeoffs, and success metrics
  • Familiarity with enterprise requirements around security, compliance, privacy, and reliability, especially in large-scale engineering environments
Job Responsibility:
  • Drive cross-cutting platform investments across the GitHub / Azure DevOps / 1ES ecosystem, with a focus on AI-assisted developer productivity across the full engineering lifecycle (inner loop, CI/CD, operations, governance)
  • Identify high-leverage opportunities where AI can meaningfully reduce friction and toil for developers while improving quality, reliability, and consistency of outcomes
  • Define clear problem statements, product bets, and success metrics that balance speed of iteration with trust, safety, and operational requirements
  • Operate in a startup mode, moving quickly from hypothesis to MVP to scaled rollout through rapid experimentation and iteration
  • Partner closely with engineering, security, compliance, privacy, and AI platform teams to ensure solutions are production-ready and scalable, not experimental or one-off
  • Drive execution across multiple systems and teams in a highly matrixed environment, influencing roadmaps and priorities without direct authority
  • Ensure AI-assisted workflows are designed end-to-end, with clear ownership, feedback loops, and failure modes, not as isolated point solutions
  • Use data, developer feedback, and operational signals to evaluate impact and continuously improve platform investments
  • Act as a thought partner to engineering and leadership teams on how AI should be applied responsibly within Microsoft’s engineering system
Employment Type:
Fulltime

Principal Product Manager - Microsoft Foundry (CoreAI Efficiency)

We are building the industry’s leading AI-first application stack with enhanced ...
Location:
United States, Redmond
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree AND 8+ years experience in product/service/program management or software development OR equivalent experience
  • 10+ years of experience in product development
  • Experience delivering one or more AI, cloud service, enterprise-grade, or mobile app solutions
  • Proven success in launching and scaling multi-million dollar revenue products, ideally in AI, ML, or platform domains
  • Technical fluency in model performance evaluation, usability, and integration strategies
  • Product-led growth focus with data-driven decision-making skills
  • Executive presence and stakeholder management capabilities
  • Experience leading cross-functional teams and managing complex dependencies
  • Ability to thrive in fast-paced environments with a bias for action
Job Responsibility:
  • Lead strategy and execution for increasing experimentation velocity, reducing time-to-insight, and accelerating time-to-realize efficiency
  • Reduce friction in the end-to-end experimentation process from hypothesis generation and experiment design to deployment and analysis
  • Define and track efficiency metrics, manage dependencies, and lead experimentation to drive Microsoft Foundry product growth and demonstrate efficiency gains
  • Collaborate with engineering, data science, and partner teams to develop and enhance our experimentation platform and tools
  • Drive alignment across teams to support experimentation initiatives, and ensure that insights from experiments translate into tangible improvements in our generative AI platform
  • Foster an organization-wide culture of experimentation and evidence-based innovation
  • Promote agility and learning, empowering teams to make data-driven decisions and continuously improve our AI systems at a faster pace
  • Develop compelling product narratives and performance dashboards for executive-level communication and decision-making
  • Provide clarity in ambiguous environments, influence stakeholders, and foster a culture of innovation and accountability
Employment Type:
Fulltime

Principal Product Manager - Foundry Inferencing & Training (CoreAI - multiple roles)

Microsoft Foundry sits at the center of Microsoft’s AI strategy, powering how mo...
Location:
United States, Redmond
Salary:
139900.00 - 331200.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s Degree and 8+ years of experience in product management, technical program management, software engineering, or related technical fields (or equivalent experience)
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Job Responsibility:
  • Product Strategy & Ownership: Own product strategy and roadmap across AI model training, inference, experimentation, and platform enablement, balancing near-term delivery with long-term scale
  • Maintain end-to-end accountability from concept through launch, iteration, and measurable impact
  • Model Lifecycle & Platform Enablement: Drive initiatives across the AI model lifecycle, partnering with engineering and research to bring new capabilities from research into production
  • Enable internal teams and customers to access, integrate, and adopt models through high-quality platform experiences
  • Execution, Velocity & Operating Rigor: Lead complex, multi-quarter initiatives with high visibility, managing dependencies, risks, and tradeoffs across teams
  • Improve execution velocity by reducing friction in planning, experimentation, launches, and iteration cycles
  • Experimentation, Metrics & Continuous Improvement: Define and track metrics for efficiency, performance, reliability, and adoption, using experimentation and data to drive decisions
  • Identify opportunities for automation, simplification, and continuous improvement as systems scale
  • Cross-Functional Leadership & Communication: Act as a connective leader across engineering, data science, research, infrastructure, and go-to-market teams
  • Influence senior stakeholders through clear decision framing, executive-ready narratives, and data-backed recommendations
Employment Type:
Fulltime

Principal Product Manager - Observability & Evaluations

Microsoft Foundry is the platform that empowers developers to build, deploy, and...
Location:
United States, New York
Salary:
163000.00 - 296400.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s Degree in Computer Science, Engineering, or related technical discipline AND 8+ years of product management, program management, or software development experience, OR equivalent experience
  • Proven experience leading technical product areas from concept through launch
  • Experience building developer platforms, cloud services, AI orchestration frameworks, or agent-centric architectures
  • Demonstrated ability to partner across engineering, design, research, and business teams
  • Ability to analyze complex technical and business requirements and drive alignment across stakeholders
  • Ability to pass Microsoft Cloud Background Check requirements
  • Background working with enterprise customers on technical adoption or migration efforts
  • Experience driving product strategy in global or complex markets
  • Strong analytical and communication skills, including the ability to synthesize complex technical concepts for executive audiences
Job Responsibility:
  • Define and evolve the long-term vision for the Foundry platform, including agent frameworks, orchestration capabilities, and developer tooling used across CoreAI products
  • Translate complex enterprise and developer requirements into clear product roadmaps and prioritized investment plans
  • Identify market opportunities and validate product direction through research, experimentation, and customer insights
  • Engage directly with strategic AI customers to deeply understand their application modernizations, agent use cases, and infrastructure requirements
  • Synthesize customer feedback into actionable insights that inform backlog prioritization and ongoing product improvements
  • Collaborate with engineering teams to deliver scalable, reliable, and secure platform features
  • Lead execution of multi-quarter initiatives, ensuring clarity on acceptance criteria, technical dependencies, and timelines
  • Partner with design, research, marketing, and PM peers across Azure AI, GitHub, and broader CoreAI to craft world-class developer experiences
  • Drive capabilities for agent-to-agent interactions, skill orchestration, prompt management, AI workflow tooling, and hybrid cloud/local inference patterns
  • Influence the platform infrastructure underpinning Foundry, ensuring that developer scenarios, from app migration to AI-first application creation, are first-class and intuitive
Employment Type:
Fulltime

Lead Risk Advisor

About the Role Join our high-growth tech enabled specialty insurance startup tha...
Location:
United States
Salary:
180000.00 - 200000.00 USD / Year
IDEX Consulting Ltd
Expiration Date:
August 08, 2026
Requirements:
  • Several years of commercial insurance experience (brokerage or underwriting)
  • Proven talent for winning clients through relationship-building and consultative selling
  • Advanced commercial risk assessment capabilities
  • Outstanding communication and persuasion skills
  • Ability to quickly understand client needs across different verticals
  • Specialty knowledge in consumer brands, restaurants/hospitality, technology, or specific lines (property, cyber, management liability)
Job Responsibility:
  • Convert qualified leads through consultative selling and risk expertise
  • Own and develop the risk management strategy for your commercial vertical
  • Engage in high-impact client touchpoints including proposal presentations
  • Partner with Lead Risk Advisors who handle ongoing account management
  • Develop tailored commercial solutions for mid-market clients ($50K-$5M GWP)
What we offer:
  • base & performance bonuses
Employment Type:
Fulltime