Principal Software Engineering Manager - AI Engineering

Canada, Vancouver 142400.00 - 257500.00 CAD / Year · Job Posted March 13, 2026

Job Description

The Fabric Data Engineering Experience & Infrastructure team is hiring a Principal Software Engineering Manager to lead a team building LLM-powered data engineering experiences and supporting infrastructure for Fabric Data Engineering, based on Apache Spark. This role spans people leadership and technical leadership: you will grow and coach engineers while guiding design and delivery of agentic workflows and scalable LLM-backed data features (e.g., AI-assisted notebook experiences, evaluation/telemetry, production-grade orchestration patterns) that help Data Engineers achieve more through Microsoft Fabric.

Job Responsibility

  • Lead and grow a team: Hire, onboard, coach, and develop engineers; set clear expectations; create an inclusive culture of accountability, learning, and collaboration.
  • Drive execution and delivery: Guide team planning and prioritization across multiple workstreams; manage dependencies, risks, and release readiness; ensure predictable delivery from requirements → architecture → implementation → rollout → live-site operations.
  • Shape requirements with partners: Partner with Product Management, Design, Research, and dependent engineering teams to translate ambiguous customer needs into crisp scenario plans and measurable outcomes.
  • Guide architecture and technical strategy: Lead identification of dependencies and development of design documents; guide architectural decisions for distributed, cloud-scale systems (Spark/PySpark + Python services) with explicit tradeoffs across performance, reliability, cost, security, privacy, and operability.
  • Raise the engineering quality bar: Establish and reinforce engineering standards (design reviews, coding patterns, test strategy, performance practices, operational readiness); ensure code and designs meet quality and scale expectations.
  • Operational excellence and accountability: Own service health for your area—live-site readiness, on-call excellence, incident response, postmortems, and sustained improvements. Hold accountability for outcomes when services do not meet performance or reliability expectations.
  • AI Engineering at production scale: Guide the team to build and operationalize LLM-powered experiences using robust orchestration, grounding, evaluation/quality gates, telemetry, and iterative improvements aligned to customer value and Responsible AI principles.
  • Cross-team influence: Build partner relationships across organizations and geographies; align on shared goals, interfaces, and SLAs; unblock execution and drive decisions when tradeoffs arise.

Requirements

  • Bachelor's Degree in Computer Science or related technical discipline AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Nice to have

  • Modern LLM / AI Engineering: Solid understanding of LLM systems and applied AI Engineering (prompting, grounding/RAG, tool/function calling, agent orchestration, evaluation). Ability to define quality bars and drive adoption of repeatable patterns across teams.
  • Operationalizing AI/ML at scale: Experience establishing monitoring/telemetry, experimentation (A/B), rollout strategies, and cost/latency optimization—driving predictable operations and continuous improvement across services.
  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Master's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • People leadership: Experience leading and developing engineering teams (hiring, coaching, performance management, career growth), building inclusive culture, and improving team effectiveness.
  • Technical depth in distributed systems: Proven ability to guide design and delivery of scalable distributed systems and production services, including reliability, diagnosability, and operational excellence.
  • Spark + data platform expertise: Hands-on understanding of Apache Spark/PySpark and data engineering patterns for large-scale structured/semi-structured/unstructured workloads; ability to guide platform-level improvements (performance, cost, operability).
  • Cloud + security/compliance rigor: Cloud-native engineering experience (Azure compute/storage/networking) and ability to ensure solutions meet security, privacy, and compliance expectations.
  • Cross-team partner leadership: Demonstrated ability to align with multiple partner teams, manage dependencies, and deliver high-impact customer outcomes through influence and collaboration.

Similar Jobs for Principal Software Engineering Manager - AI Engineering

8 matching positions

Principal Software Engineering Manager - AI Frameworks

As a Principal Software Engineering Manager - AI Frameworks on the team, you wil...
Location: United States, Redmond
Salary: 139900.00 - 304200.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Master’s Degree in Computer Science or related technical field AND 10+ years of software engineering experience, including 6+ years in engineering management, OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years of software engineering experience, including 6+ years in engineering management, or equivalent experience
  • Strong technical foundation in software engineering principles, computer architecture, GPU architecture, and hardware acceleration for neural networks, with the ability to guide teams working in these areas
  • Experience leading teams responsible for end-to-end performance analysis and optimization of LLMs, AI systems, or HPC workloads, including use of GPU profiling and performance analysis tools
  • Demonstrated ability to lead cross-team initiatives, align stakeholders, and translate research or platform capabilities into scalable, production-ready solutions
  • Proven people leadership skills, including hiring, coaching, performance management, and career development, with a track record of building high-performing, inclusive teams
  • Exposure to AI / ML infrastructure, including DNN or LLM training and/or inference systems, and experience with at least one modern deep learning framework (e.g., PyTorch, TensorFlow, ONNX Runtime)
  • Familiarity with GPU software stacks and acceleration technologies such as CUDA, ROCm, Triton, or equivalent, sufficient to guide technical direction and evaluate tradeoffs
Job Responsibility
  • Lead and develop a team of engineers working across multiple layers of the AI software stack to enable large-scale training and inference
  • Set technical vision and execution strategy for model performance benchmarking, optimization, and deployment across GPUs and Microsoft hardware
  • Drive performance outcomes by prioritizing and overseeing efforts to benchmark, profile, debug, and optimize training and inference workloads
  • Own performance health by establishing mechanisms to monitor regressions, measure impact, and continuously improve time-to-deploy and hardware efficiency
  • Partner cross-functionally with research, product, infrastructure, and hardware teams to deliver scalable, production-ready AI performance improvements
  • Balance short-term delivery and long-term investments, ensuring the team’s work aligns with organizational goals, platform roadmaps, and Azure capex objectives
  • Build a strong engineering culture through coaching, feedback, hiring, and career development, enabling the team to operate with increasing autonomy and impact
  • Fulltime

Principal Software Engineering Manager - Data Science & Engineering

The MSRC Data Science team is responsible for building data pipelines, data minin...
Location: United States, Redmond
Salary: 139900.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility
  • Leads team on the disciplined use and improvement of artificial intelligence (AI) tools and practices across the software development lifecycle (SDLC)
  • Guides team on proactively taking responsibility for the content of their AI-generated requirements, design documents, code, and other assets, and assisting other members of the team to do the same
  • Leads team on incorporating Responsible AI practices into the SDLC to ensure appropriate controls over AI-generated assets
  • Coaches team on applying SDLC and engineering health measures (e.g., Accelerate, SPACE framework, Engineering System Success Playbook [ESSP]) to guide improvements to processes and practices, especially those involving AI
  • Leads team on experimenting with AI tools and practices to improve their own capabilities, and providing recommendations on how to adopt them to others
  • Reviews debugging tools, tests, logs, telemetry, and other methods, and acts as an expert for others to proactively verify assumptions while developing code before issues occur across products in production
  • Guides team to perform machine learning/data extraction, transformation, and loading (ETL) pipelines (e.g., data collection, cleaning) based on data prepared
  • Guides the architecture of scalable pipelines and datasets
  • Influences the direction of the team
  • Begins to anticipate potential data pipeline issues and provides solutions
  • Fulltime

Principal Software Engineering - AI Frameworks

Are you looking for opportunities to deliver innovations to hundreds of millions...
Location: United States, Redmond
Salary: 139900.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience developing an inference software stack
  • Experience working on systems performance optimization
  • Working with Open-Source code
Job Responsibility
  • Partnering with appropriate stakeholders to determine user requirements for one or more complex scenarios
  • Providing technical leadership for the identification of dependencies and the development of design documents for a product, application, service, or platform
  • Leading by example and mentoring others to produce extensible and maintainable code used across the company
  • Leveraging deep subject-matter expertise of cross-product features with appropriate stakeholders (e.g., project managers) to lead multiple products' project plans, release plans, and work items
  • Holding accountability as a Designated Responsible Individual (DRI), mentoring engineers across products/solutions, working on-call to monitor system/product/service for degradation, downtime, or interruptions
  • Proactively seeking new knowledge and adapting to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products, while also driving consistency in monitoring and operations at scale and sharing knowledge with other engineers
  • Embodying our Culture and Values
  • Fulltime

Principal Software Engineering Manager - Search

Windows Search is undergoing a fundamental transformation — evolving from a trad...
Location: India, Hyderabad
Salary: Not provided
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
  • 4+ years of people management experience leading engineering teams.
  • Solid systems programming background with experience in platform/infrastructure-level software development
  • Experience with search/indexing systems, database internals, file systems, or information retrieval at scale.
Job Responsibility
  • Owning the technical direction and architecture for the Windows Search Platform from design through retail delivery.
  • Driving the evolution of Search Platform into an AI-native infrastructure layer, enabling Copilot, MCP/LLM integrations, and future agentic discovery patterns while maintaining enterprise-grade reliability and performance.
  • Leading cross-functional partnerships with Products, File Explorer, Windows Search Box and other external partners to drive architectural consensus, scope clarity, and release governance.
  • Establishing and enforcing release discipline and observability as first-class requirements.
  • Defining and driving data-backed engineering decisions across the platform.
  • Leading, mentoring, and growing a team of 7-10 engineers — running effective 1:1s, providing direct feedback, building clear growth paths, and cultivating a culture of engineering rigor, ownership, and speed.
  • Recruiting and retaining top systems engineering talent, with a bias toward people who are curious about and energized by AI-native development and Windows platform internals.
  • Representing your team's work to senior leadership, communicating trade-offs, risks, delivery timelines, and strategic context with clarity and confidence in forums such as Shiproom, Mission Controls, and leadership reviews.
  • Driving program execution across multiple concurrent tracks (8-10 workstreams), including sprint cadence, ADO hygiene, capacity planning, and cross-org alignment.
  • Championing AI-assisted engineering practices — leveraging agentic workflows, automation, and AI tooling to reduce KTLO burden, accelerate delivery, and multiply team velocity.
  • Fulltime

Principal Software Engineering Manager

M365 Copilot Inference is a high-impact engineering team advancing applied AI an...
Location: United States, Redmond
Salary: 142800.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Job Responsibility
  • Lead and grow a team of software engineers building control plane services and automations across the capacity buildout area
  • Drive technical design and execution for capacity automation — intake, planning, deployment, fleet health, and control plane components — prioritizing the highest-impact work for Copilot capacity
  • Replace manual, ticket-driven capacity workflows with automated, data-driven systems; reduce time from capacity request to production traffic for priority workloads
  • Own live-site, reliability, and operational excellence for the services your team builds; establish SLAs, metrics, and on-call practices
  • Partner with peer engineering managers on adjacent capacity areas, and with partner teams across M365 Core, AI Core, Azure, and Microsoft Research to align on dependencies and unblock execution
  • Coach and grow senior and mid-level engineers; raise the engineering bar; recruit strong platform talent into the team
  • Fulltime

Principal Software Engineering Manager, Windows Platform & Developer Team

Windows is evolving beyond a platform for applications towards a foundation on w...
Location: United States, Redmond
Salary: 142800.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science, or related technical discipline AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Job Responsibility
  • Own the technical direction for new agentic capabilities, from design to delivery
  • Lead cross-functional partnerships with product, design, data, and leaders of internal Microsoft product groups, as well as external partnerships with each frontier artificial intelligence (AI) company
  • Lead, mentor, and grow a team of engineers, including running effective 1:1s, providing direct feedback, and building clear growth paths for engineers at each stage of their career
  • Recruit and retain top systems engineering talent, with a bias toward people who are curious about and energized by AI-native development
  • Represent your team's work to leadership, communicating judgement on trade-offs, decisions, risks, and strategic context with clarity and confidence
  • Provide thought leadership for the broader organization
  • Fulltime

Principal Software Engineering Manager - Substrate Efficiency

M365 Copilot inference is a high-impact engineering team advancing applied AI an...
Location: United States, Redmond
Salary: 142800.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Job Responsibility
  • Build and lead a high-performing engineering team focused on inference runtime efficiency and model execution performance
  • Define and drive strategy to improve throughput per GPU through runtime optimizations
  • Increase engineering agility, enabling faster experimentation, iteration, and rollout of performance improvements
  • Partner across M365 Core, AI Core, Azure, and Microsoft Research to co-design and productionize advanced inference optimizations
  • Establish metrics, telemetry, and experimentation frameworks to measure efficiency gains and guide investment decisions
  • Own live-site performance, reliability, and operational excellence for inference engines at scale
  • Drive alignment across partner teams on engine interfaces, performance goals, and optimization priorities.
  • Fulltime

Principal Software Engineering Manager

Microsoft is a company where passionate innovators come to collaborate, envision...
Location: India, Hyderabad
Salary: Not provided
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • 12+ years of engineering experience in distributed systems, databases, platform engineering and cloud services
  • 4+ years leading engineering teams delivering highly available cloud services and infrastructure
  • Experience with large scale services architectures and technologies
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check
Job Responsibility
  • Leadership & Strategy: Define and evolve the long-term Fundamentals charter across engineering systems, reliability, security, observability, lifecycle, and AI-driven automation
  • People Management: Lead, coach, and grow a high-performing engineering team
  • Technical Execution: Own and deliver end‑to‑end features across the full engineering lifecycle
  • Mentor and coach engineers through design reviews, code reviews, and operational learnings
  • Incorporate customer requirements, usage patterns, and live‑site signals into engineering decisions
  • Drive an automation‑first engineering approach by leveraging AI across the engineering lifecycle
  • Fulltime