Principal Software Engineer - CoreAI

Microsoft Corporation

Location:
United States, Redmond

Contract Type:
Not provided

Salary:

139900.00 - 274800.00 USD / Year

Job Description:

Within AI Platform, the Azure AI Search team powers rich knowledge base experiences for apps of all kinds. We integrate the best of Microsoft AI for content understanding, search relevance, and knowledge sources. As a Principal Software Engineer, you will shape end-to-end customer and developer experiences for Azure AI Search, ensuring customers can discover, adopt, and succeed with AI capabilities at global scale. You will operate where product experience meets cloud services, setting direction that improves reliability, security, and time-to-value.
You will be a technical leader who fosters a high-performing team to discover, define, and deliver new product capabilities, often turning ambiguous customer needs into clear, shippable features and reliable components. You will be a trusted leader on a team where engineers collaborate, move quickly, and continuously raise the bar in a demanding and exciting space. You will also apply evolving AI tools and AI-assisted engineering practices to design, implement, test, and ship AI features faster and more safely across UX flows, portal experiences, samples, SDKs, and APIs. You will thrive in a highly dynamic environment by staying close to users: listening, learning, and translating feedback into product and experience improvements.
You will own key entry points and integration experiences (including partner integrations) so customers can onboard smoothly, extend the product confidently, and realize value quickly. You will set technical direction and drive operational excellence from design through production, building for security, reliability, and observability, and ensuring features can be operated with confidence at scale. You’ll establish strong engineering systems (quality, incident readiness, and release practices) and use data from real-world usage to continuously refine the product and developer experience.
Success in this role is measured by outcomes customers feel: simpler onboarding, clearer and more consistent experiences across every surface area, and secure, reliable features that run smoothly in production—enabling faster adoption of new AI capabilities.

Job Responsibility:

  • Lead the disciplined adoption and continuous improvement of AI tools and Responsible AI practices across the SDLC, ensuring accountability for AI-generated assets and using engineering health metrics to drive measurable process improvements and share learnings
  • Lead engineering excellence for production services by driving diagnosability and incident prevention (debugging, telemetry, retrospectives), strengthening secure and privacy-preserving operations (least privilege), and raising code quality through timely, high-impact results, automated analysis, and best practices (including GenAI) to deliver secure, maintainable, high-performing code while proactively managing blockers and risks
  • Deliver success through empowerment and accountability by consistently demonstrating, sharing, and championing technical values
  • Uphold team culture, embody organizational values, and practice core engineering principles in daily work
  • Clarify technical objectives and outcomes, foster collaboration across boundaries, and help the team adapt and learn through example and leadership
  • Lead cross-group planning and execution (project/release/work management) by breaking long-term vision and ambiguous problems into milestones, driving estimation and capacity planning, and ensuring secure, compliant delivery with operational readiness (flighting, rollback, and disaster recovery)
  • Partner with internal and external stakeholders to validate user requirements and feasibility, incorporate customer insights and success metrics (including accessibility/globalization), and advocate for customer security and privacy needs across the solution

Requirements:

  • Bachelor’s Degree in Computer Science, Engineering, or related field AND 6+ years of technical engineering experience building and operating cloud services (or equivalent experience)
  • 4+ years of technical leadership and architecture experience

Nice to have:

  • Experience building AI-powered product capabilities and/or search, retrieval, or data/knowledge services at scale
  • Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages
  • Experience owning end-to-end customer and developer experiences across one or more product surfaces including defining requirements and driving delivery
  • Experience with distributed systems and production operations (reliability, incident response, observability/telemetry, and safe release practices)
  • Experience designing and delivering secure services, including identity/access patterns and privacy/compliance considerations
  • Demonstrated use of AI-assisted engineering tools to improve SDLC quality and velocity, including responsible use of AI-generated assets
  • Strong customer empathy with a track record of using qualitative and quantitative feedback to iterate product experiences
  • Highly effective at creating significant impact within intricate codebases and large organizations

Additional Information:

Job Posted:
April 22, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Principal Software Engineer - CoreAI

Principal Software Engineer, CoreAI Workload Engines

The CoreAI Workloads team builds the foundational inference engines and APIs tha...
Location:
United States, Redmond
Salary:
139900.00 - 331200.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field and 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python, or equivalent experience
  • Proven ability to design and operate large-scale, production inference services with high reliability and performance requirements, and to ship performance improvements safely via disciplined experimentation
  • Strong skills in performance analysis: benchmarking, profiling, diagnosing regressions, and turning results into concrete engine/runtime changes
  • Strong problem-solving skills and the ability to debug complex, cross-layer systems issues
  • Demonstrated technical leadership, including mentoring engineers, driving cross-team architectural alignment, and leveraging AI tools and AI-assisted workflows to accelerate engineering velocity and quality
  • Hands-on experience with Kubernetes (building and operating services on k8s), including debugging production issues and designing platform abstractions (e.g., custom resources/controllers) and scheduling-aware deployments (e.g., node affinity, taints/tolerations, resource requests/limits)
  • Strong collaboration and communication skills, with the ability to work across organizational boundaries
Job Responsibility:
  • Optimize inference engines for OpenAI and open-source models by implementing and shipping performance/efficiency improvements across runtime, scheduling, and serving paths (latency, throughput, utilization, availability, and cost)
  • Run experiments end-to-end: formulate hypotheses, implement engine changes (including Python/PyTorch integration points where relevant), analyze results, and ship improvements behind guardrails
  • Build and use experimentation capabilities for large-scale AI inference (experiment lifecycle, tracking, metric modeling, comparability standards, automated analysis) so the team can iterate quickly and safely
  • Own serving availability and efficiency for Azure OpenAI Service workloads through tiered experimentation, lean segmentation, and multi-modal utilization across heterogeneous fleets—turning findings into shipped engine improvements
  • Design and evolve inference serving architectures to improve utilization and latency using techniques such as disaggregated serving, multi-token prediction, KV offload/retrieval, and quantization—validated via staged rollouts and production guardrails
  • Extend AI infrastructure abstractions to support elastic, heterogeneous inference engines reliably at scale (e.g., dynamic scaling across model families, modalities, and workload classes while maintaining isolation and SLOs)
  • Tune and scale inference engines across NVIDIA GPU generations (A100, H100, H200) for state-of-the-art OpenAI models, focusing on serving efficiency, utilization, and reliability (not hardware bring-up)
  • Partner with networking and storage teams to leverage high-performance interconnects (e.g., RDMA/InfiniBand-class fabrics such as RoCE over IB) for distributed inference, without owning low-level kernel/driver enablement
  • Drive end-to-end features from design through production: observability, diagnostics, performance regression detection, and operational excellence for inference serving
  • Influence platform architecture and technical direction across teams through design reviews, clear metrics, and technical leadership focused on experimentation velocity and production reliability
  • Fulltime

Principal Software Engineering Manager, Simulation Platform - CoreAI

The AI Frameworks team at Microsoft develops AI software that enables running AI...
Location:
United States, Multiple Locations
Salary:
163000.00 - 296400.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science, or related technical discipline
  • 5+ years of experience building/managing a team of software engineers
  • 5+ years of experience managing a software project
  • 10+ years of experience in computer architecture and/or embedded systems/software
  • 10+ years of software development experience
  • 10+ years of experience with C++ based object-oriented programming and design
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • 1+ years’ experience with Python
  • A deep technical background and solid foundation in computer architecture, system/hardware simulation, embedded software development and/or firmware
  • Experience designing and/or managing large C++ OOP, scalable, multi-threaded and multi-process software
Job Responsibility:
  • Managing a team of software engineers
  • Managing development of an AI chip simulator, which involves writing requirements, scoping and planning solutions, estimating and assigning work, scheduling and tracking deliverables, integration and releases to the partner team, and documentation
  • Contributing technically to design and code quality reviews, and stepping in with hands-on code development when necessary (C++ and Python)
  • Collaborating broadly across multiple disciplines and with various partner teams, from hardware designers to AI model developers
  • Fulltime

Principal Software Engineer - CoreAI

Software quality is being redefined by AI. As part of the Microsoft Playwright t...
Location:
United States, Redmond
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check
  • Bachelor's Degree in Computer Science or related technical field AND 10+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • 3+ years of experience with AI LLM models, such as OpenAI, Azure AI, ML
  • 3+ years of experience with browser engineering
  • 3+ years of experience with network security
Job Responsibility:
  • Leads identification of dependencies and the development of design documents for a product, application, service, or platform
  • Leads by example and mentors others to produce extensible and maintainable code used across products
  • Leverages subject-matter expertise of cross-product features with appropriate stakeholders (e.g., project managers) to drive multiple groups' project plans, release plans, and work items
  • Holds accountability as a Designated Responsible Individual (DRI), mentoring engineers across products/solutions, working on-call to monitor system/product/service for degradation, downtime, or interruptions
  • Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale and shares knowledge with other engineers
  • Fulltime

Principal Software Engineer, Experimentation Platform - CoreAI

CoreAI sits at the center of Microsoft’s mission to redefine how software is bui...
Location:
United States, Redmond
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Champion and improve AI tools and practices across the software development lifecycle (SDLC), incorporating appropriate controls over AI-generated assets
  • Lead by example across teams to produce extensible, maintainable, well-tested, secure, and performant code
  • Identify and establish coding best practices, create and apply metrics to drive code quality and stability, and mentor engineers to continuously raise the engineering bar
  • Own and lead the architecture of complex product solutions, driving design discussions, evaluating new technologies to solve problems, and ensuring system architecture meets performance, scalability, resiliency and disaster recovery requirements
  • Lead cross-team collaboration to identify dependencies, negotiate delivery schedules, drive alignment across partner teams, and ensure proper end-to-end testing, live-site coverage, scalability and performance before going live
  • Drive engineering excellence across products
  • Lead efforts targeting zero-touch deployment, production reliability, and security hardening for both protections and detections
  • Hold accountability as a designated responsible individual (DRI) across products and solutions, mentor engineers on live-site operations, lead incident retrospectives that drive systemic
  • Fulltime

Principal Software Engineer, CoreAI

The GenAI Infrastructure and Solutions team is building large-scale GenAI traini...
Location:
United States, Redmond
Salary:
163000.00 - 296400.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field and 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python or equivalent experience.
  • 6+ years designing, developing, and shipping high quality software.
  • 3+ years of experience with distributed systems and cloud-based infrastructure.
  • 2+ years of experience with containerization tools (e.g., Docker, Kubernetes).
  • 2+ years of experience with DevOps practices (CI/CD, automated testing, deployment, etc.).
  • Passionate and self-motivated. Strong ability to self-learn, enter new domains, and manage through uncertainty in an innovative team environment.
  • Familiarity with virtualization technology.
  • Familiarity with production ML systems and concepts like model serving, caching, batching, and monitoring.
Job Responsibility:
  • Lead the collaboration with engineers and researchers to build and optimize training infrastructure and tools for LLMs, SLMs, multimodal, and code-specific models.
  • Design, build and improve services with high scalability and reliability.
  • Design and implement services that serve production traffic and fulfill security and privacy requirements.
  • Lead the efforts to deliver and improve engineering systems and practices to ensure service quality in complex cloud environments.
  • Contribute to the deployment and monitoring of services in production environments.
  • Fulltime

Principal Software Engineer, CoreAI

Joining the CoreAI organization at Microsoft means becoming part of the team tha...
Location:
United States, Multiple Locations
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field and 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python, or equivalent experience.
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years
Job Responsibility:
  • Collaborate with engineers and researchers to build and optimize training infrastructure and tools for LLMs, SLMs, multimodal, and code-specific models.
  • Design, build and improve services with high scalability and reliability.
  • Design and implement services that serve production traffic and fulfill security and privacy requirements.
  • Participate in efforts to deliver and improve engineering systems and practices to ensure service quality in complex cloud environments.
  • Contribute to the deployment and monitoring of services in production environments.
  • Fulltime

Principal Software Engineer, CoreAI

Core AI is at the forefront of Microsoft’s mission to redefine how software is b...
Location:
United States, Redmond
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years
Job Responsibility:
  • Shape the Product Vision: Define and influence the product roadmap by aligning technical strategy with business goals and customer needs
  • Drive Strategic Clarity: Leverage data-driven insights and competitive intelligence to inform product direction, identify opportunities, and guide decision-making
  • Architect for Scale and Sustainability: Design and evolve durable, scalable system architectures that balance long-term maintainability with short-term delivery needs, making thoughtful engineering trade-offs
  • Foster Engineering Alignment: Work with the engineering teams and partner organizations by driving clarity, alignment, and shared ownership of technical direction
  • Deliver Cohesive End-to-End Experiences: Collaborate closely with partner teams—including experience, SDK, and platform groups—to ensure seamless integration and delivery of features across the stack
  • Build Foundational Capabilities: Contribute to and lead the development of core platform components and reusable building blocks that accelerate team velocity and product innovation
  • Champion Customer-Centric Development: Engage directly with customers and product teams to capture feedback, understand demand signals, and refine product messaging—ensuring the voice of the customer shapes product evolution
  • Lead Live Site Excellence: Drive operational excellence in managing and operating large-scale distributed systems with a high bar for service-level agreements (SLAs). Lead root cause analyses (RCAs) for key live site incidents and outages, identify systemic improvements, and set high standards for reliability and performance
  • Fulltime

Principal Software Engineer, CoreAI

The CoreAI GPU Infrastructure team builds the foundational accelerated compute p...
Location:
United States, Redmond
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field and 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python or equivalent experience
  • Proven ability to design and operate large-scale, production infrastructure with high reliability and performance requirements
  • Strong problem-solving skills and the ability to debug complex, cross-layer systems issues
  • Demonstrated technical leadership, including mentoring engineers and driving cross-team architectural alignment
  • Hands-on experience with virtualization and/or container platforms (e.g., VMs, Kubernetes, container runtimes)
  • Strong collaboration and communication skills, with the ability to work across organizational boundaries
Job Responsibility:
  • Design and build GPU accelerated infrastructure for training and inference workloads, spanning bare metal, virtual machines, and containerized environments
  • Develop systems for GPU device management, scheduling, isolation, and sharing (e.g., partial GPU allocation, multi-tenant usage)
  • Build and operate advanced orchestration and resource governance scenarios using platforms such as AKS, Dynamic Resource Allocation (DRA), and related Kubernetes ecosystem capabilities to enable fair sharing, isolation, and efficient utilization of accelerated resources
  • Build and evolve virtualization and container stacks to support modern AI workloads, including secure and confidential compute scenarios
  • Optimize performance, reliability, and utilization across large GPU fleets, including scale-up and scale-out configurations
  • Partner with networking and storage teams to enable high-performance interconnects (e.g., RDMA/InfiniBand class networking) for distributed workloads
  • Drive end-to-end platform features from design through production, including observability, diagnostics, and operational excellence
  • Influence platform architecture and technical direction across teams through design reviews and technical leadership
  • Fulltime