Principal Software Engineer - Performance Tooling

Microsoft Corporation

Location:
United States, Redmond

Contract Type:
Not provided

Salary:

139900.00 - 274800.00 USD / Year

Job Description:

The Artificial Intelligence (AI) Frameworks team at Microsoft develops AI software that enables running AI models everywhere, from the world’s fastest AI supercomputers to servers, desktops, mobile phones, internet of things (IoT) devices, and internet browsers. We collaborate with our hardware teams and partners, both internal and external, and operate at the intersection of AI algorithmic innovation, purpose-built AI hardware, systems, and software. We are a team of highly capable and motivated people who pride themselves on a collaborative and inclusive culture.

We own inference performance of OpenAI and other state-of-the-art large language models (LLMs) and work directly with OpenAI on the models hosted on the Azure OpenAI service, serving some of the largest workloads on the planet with trillions of inferences per day in major Microsoft products, including Office, Windows, Bing, SQL Server, and Dynamics.

As a Principal Software Engineer - Performance Tooling on the team, you will have the opportunity to work on multiple levels of the AI software stack, including the fundamental abstractions, programming models, compilers, runtimes, libraries, and application programming interfaces (APIs), to enable large-scale training and inferencing of models. You will benchmark OpenAI and other LLMs for performance on graphics processing units (GPUs) and Microsoft hardware, debug and optimize performance, monitor performance, and enable these models to be deployed in the shortest amount of time and on the least amount of hardware possible, helping achieve Microsoft Azure's capital expenditure (capex) goals.

Job Responsibility:

  • Work across multiple layers of the AI software stack (abstractions, programming models, compilers, runtimes, libraries, and APIs) to enable large-scale model training and inference
  • Benchmark OpenAI and other LLMs for performance on graphics processing units (GPUs) and Microsoft hardware
  • Debug, profile, and optimize performance for training/inference workloads on central processing units (CPUs) and GPUs
  • Monitor performance regressions and drive continuous improvements to reduce time-to-deploy and hardware footprint
  • Collaborate across teams of researchers and engineers to deliver scalable, production-ready AI performance improvements

Requirements:

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C++ or Python, OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. This includes passing the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • Master's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C++ or Python, OR Bachelor's Degree in Computer Science or related technical field AND 15+ years technical engineering experience with coding in languages including, but not limited to, C++ or Python, OR equivalent experience
  • 4+ years’ practical experience working on high performance applications and performance debugging and optimization on CPUs/GPUs
  • Experience in DNN/LLM inference, experience in one or more DL frameworks such as PyTorch, TensorFlow, or ONNX Runtime, and familiarity with CUDA, ROCm, and Triton
  • Technical background and solid foundation in software engineering principles, computer architecture, GPU architecture, and hardware neural net acceleration
  • Experience in end-to-end performance analysis and optimization of state-of-the-art LLMs and HPC applications, including proficiency using GPU profiling tools
  • Cross-team collaboration skills and the desire to collaborate in a team of researchers and developers
  • Ability to independently lead projects

Nice to have:

  • Master's Degree in Computer Science or related technical field AND 12+ years technical engineering experience
  • 4+ years’ practical experience working on high performance applications and performance debugging and optimization on CPUs/GPUs
  • Experience in DNN/LLM inference, experience in one or more DL frameworks such as PyTorch, TensorFlow, or ONNX Runtime, and familiarity with CUDA, ROCm, and Triton
  • Technical background and solid foundation in software engineering principles, computer architecture, GPU architecture, and hardware neural net acceleration
  • Experience in end-to-end performance analysis and optimization of state-of-the-art LLMs and HPC applications, including proficiency using GPU profiling tools
  • Cross-team collaboration skills and the desire to collaborate in a team of researchers and developers
  • Ability to independently lead projects

Additional Information:

Job Posted:
April 11, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Principal Software Engineer - Performance Tooling

Senior Principal Software Quality Engineer

As a Senior Principal Software Quality Engineer at Baxter, you will play a criti...
Baxter
Location:
United States, Raleigh
Salary:
120000.00 - 165000.00 USD / Year
Expiration Date:
Until further notice
Requirements:
  • B.S. in Engineering or other technical degree required, preferably Computer Science/Engineering or Electronics/Electrical Engineering
  • Software Quality experience is highly sought, especially in the medical device industry
  • Knowledge of AAMI 62304 standard is valuable
  • Minimum 8 years of experience in medical device or other regulated technical industry (e.g., aerospace, automotive, defense) in a design/development/quality role or in a role closely connected to design/development/quality is required
  • Proven ability to perform and influence in cross-functional team environments and utilize effective interpersonal skills
  • 2+ years of Quality experience desired
  • Product Design experience may be considered in lieu of Quality Assurance experience
  • Software development experience in a regulated industry is desirable
  • Knowledge of software development lifecycle processes and standards required
  • Understanding of software development tools and methods for medical devices and/or other regulated industries desirable
Job Responsibility:
  • Responsible for all Design Assurance functions as a core team member on new product development (NPD) teams, ensuring the team complies with all portions of Design Control and related Quality System elements
  • Prepare and manage all Design Assurance required deliverables as well as support the remainder of the team in developing a quality product that meets regulatory requirements
  • Responsible for ensuring product development activities related to verification and validation are fully compliant to the quality system procedures
  • Assist in identification and mitigation of product or process-related risks
What we offer:
  • Support for Parents
  • Continuing Education/ Professional Development
  • Employee Health & Well-Being Benefits
  • Paid Time Off
  • 2 Days a Year to Volunteer
  • Medical and dental coverage that starts on day one
  • Insurance coverage for basic life, accident, short-term and long-term disability, and business travel accident insurance
  • Employee Stock Purchase Plan (ESPP)
  • 401(k) Retirement Savings Plan (RSP)
  • Flexible Spending Accounts
  • Fulltime

Principal Software Engineer

We are seeking a highly skilled Principal Software Engineer with over 12 years o...
6sense
Location:
United States, San Francisco
Salary:
237188.00 - 347875.00 USD / Year
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • 12+ years of experience in software development, with a strong emphasis on software design and architecture
  • Proficiency in multiple programming languages, such as Java, Python, C++, or similar
  • Deep understanding of software design principles, patterns, and best practices
  • Experience with cloud technologies (e.g., AWS, Azure, GCP) and microservices architecture
  • Strong communication and leadership skills, with the ability to effectively collaborate with cross-functional teams
  • Proven track record of delivering complex software projects on time and within budget
  • Experience with Agile development methodologies and tools (e.g., Scrum, Kanban, JIRA)
  • Excellent problem-solving skills and a proactive attitude towards addressing technical challenges
  • Strong commitment to quality, with a focus on writing clean, maintainable, and efficient code
Job Responsibility:
  • Lead the architecture and design of large-scale software systems, ensuring scalability, reliability, and performance
  • Provide technical leadership and guidance to development teams, mentoring engineers and promoting best practices
  • Collaborate with product managers, designers, and other stakeholders to translate business requirements into technical solutions
  • Drive innovation and continuous improvement in software development processes and methodologies
  • Conduct code reviews, identify areas for improvement, and enforce coding standards and best practices
  • Stay updated on industry trends and emerging technologies, evaluating their potential impact on our products and development practices
  • Troubleshoot and resolve complex technical issues, working closely with cross-functional teams to ensure timely resolution
  • Participate in hiring and onboarding activities, helping to build a strong and diverse engineering team
What we offer:
  • Health insurance coverage
  • Life and disability insurance
  • 401K employer matching program
  • Paid holidays
  • Self-care days
  • Paid time off (PTO)
  • Paid parental leave
  • Quarterly wellness education sessions
  • Stock options
  • Fulltime

Principal Software Engineer

We are seeking a highly skilled Principal Software Engineer with over 12 years o...
6sense
Location:
India, Pune; Bangalore
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • 12+ years of experience in software development, with a strong emphasis on software design and architecture
  • Proficiency in multiple programming languages, such as Java, Python, C++, or similar
  • Deep understanding of software design principles, patterns, and best practices
  • Experience with cloud technologies (e.g., AWS, Azure, GCP) and microservices architecture
  • Strong communication and leadership skills, with the ability to effectively collaborate with cross-functional teams
  • Proven track record of delivering complex software projects on time and within budget
  • Experience with Agile development methodologies and tools (e.g., Scrum, Kanban, JIRA)
  • Excellent problem-solving skills and a proactive attitude towards addressing technical challenges
  • Strong commitment to quality, with a focus on writing clean, maintainable, and efficient code
Job Responsibility:
  • Lead the architecture and design of large-scale software systems, ensuring scalability, reliability, and performance
  • Provide technical leadership and guidance to development teams, mentoring engineers and promoting best practices
  • Collaborate with product managers, designers, and other stakeholders to translate business requirements into technical solutions
  • Drive innovation and continuous improvement in software development processes and methodologies
  • Conduct code reviews, identify areas for improvement, and enforce coding standards and best practices
  • Stay updated on industry trends and emerging technologies, evaluating their potential impact on our products and development practices
  • Troubleshoot and resolve complex technical issues, working closely with cross-functional teams to ensure timely resolution
  • Participate in hiring and onboarding activities, helping to build a strong and diverse engineering team
What we offer:
  • Health coverage
  • Paid parental leave
  • Generous paid time off and holidays
  • Quarterly self-care days off
  • Stock options
  • Equipment and support for work and connectivity
  • Growth mindset culture
  • Learning and development initiatives including access to LinkedIn Learning
  • Quarterly wellness education sessions to encourage self-care and personal growth
  • Wellness days
  • Fulltime

Principal Software Engineer, Trusted Data Platform

As a Principal Software Engineer, you will be a technical leader and hands-on co...
Atlassian
Location:
India, Bangalore
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related technical field
  • 10+ years of experience in backend software development, focusing on distributed systems and storage solutions
  • 5+ years of experience working with AWS storage services (S3, DynamoDB, EBS, EFS, FSx, Glacier)
  • Strong expertise in system design, architecture, and scalability for large-scale storage solutions
  • Proficiency in at least one major backend programming language (Kotlin, Java, Go, Rust, or Python)
  • Experience designing and implementing highly available, fault-tolerant, and cost-efficient storage architectures
  • Deep understanding of distributed systems, replication strategies, sharding, and caching
  • Knowledge of data security, encryption best practices, and compliance requirements (SOC2, GDPR, HIPAA)
  • Experience leading engineering teams, mentoring senior engineers, and driving technical roadmaps
  • Proficiency with observability tools, performance monitoring, and troubleshooting at scale
Job Responsibility:
  • Designing and optimizing high-scale, distributed storage systems built on AWS storage technologies
  • Shaping the architecture, performance, and reliability of backend storage solutions that power critical applications at scale
  • Designing, implementing, and optimizing backend storage services that support high throughput, low latency, and fault tolerance
  • Working closely with senior engineers, architects, and cross-functional teams to drive scalability, availability, and efficiency improvements in large-scale storage solutions
  • Leading technical deep dives, architecture reviews, and root cause analyses to resolve complex production issues related to storage performance, consistency, and durability
  • Driving best practices in distributed system design, security, and cloud cost optimization
  • Mentoring senior engineers, contributing to technical roadmaps, and helping shape the long-term storage strategy
  • Collaborating with Site Reliability Engineers (SREs) to implement observability, monitoring, and disaster recovery strategies, ensuring high availability and compliance with industry standards
  • Advocating for automation, Infrastructure-as-Code (IaC), and DevOps best practices, leveraging tools like Terraform, AWS CloudFormation, Kubernetes (EKS), and CI/CD pipelines to enable scalable deployments and operational excellence
What we offer:
  • Atlassians can choose where they work – whether in an office, from home, or a combination of the two
  • Atlassians have more control over supporting their family, personal goals, and other priorities
  • We can hire people in any country where we have a legal entity
  • Interviews and onboarding are conducted virtually
  • Whatever your preference - working from home, an office, or in between - you can choose the place that's best for your work and your lifestyle

Principal Software QA Engineer

Principal Software QA Engineer to lead test architecture and automation strategy...
Hewlett Packard Enterprise
Location:
Puerto Rico, Aguadilla
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • 10+ years of hands-on QA experience
  • Designing and building test automation frameworks from scratch
  • Non-functional testing (scale, reliability, performance, security)
  • Strong coding skills in Python, Java, or Go
  • Experience with Pytest, TestNG, JUnit, Playwright or similar tools
  • Deep understanding of Cloud platforms (AWS, Azure, GCP)
  • Microservices, Containers (Docker, Kubernetes)
  • Infrastructure & Data Center management
  • Linux/VM environments, Storage, Compute, Networking
  • REST APIs, JSON, SQL/NoSQL
Job Responsibility:
  • Design, automate, and execute system-level test cases focused on scale, reliability, security, and performance
  • Lead the test automation strategy
  • Evaluate and integrate new tools to improve efficiency and coverage
  • Collaborate closely with product, development, support, and platform engineering teams to ensure full lifecycle quality coverage
  • Provide technical leadership and mentorship to QA engineers and partners across teams
  • Contribute to design reviews with a QA lens to ensure testability and risk mitigation
  • Maintain and manage multiple product test configurations aligned with diverse deployment environments
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
  • Fulltime

Principal Software Engineer

At PointClickCare our mission is simple: to help providers deliver exceptional c...
PointClickCare
Location:
Canada, Mississauga
Salary:
156000.00 - 174000.00 CAD / Year
Expiration Date:
Until further notice
Requirements:
  • Experience writing clean code that performs well at scale using Java
  • Experience with UI development and React frameworks
  • Experience with Spring Boot
  • In-depth knowledge of relational databases (e.g. Microsoft SQL Server, MySQL)
  • Solid experience writing RESTful API endpoints
  • Absolutely love TDD and have working knowledge of it
  • Proficient in Git
  • Experience using system and performance monitoring tools (e.g. New Relic, DataDog)
  • Experience with automated testing frameworks (e.g. Selenium, Cypress, RestAssured)
  • Excellent organization, critical-thinking and personal leadership skills
Job Responsibility:
  • Identify, prioritize and execute tasks in the software development life cycle
  • Work with business to iterate over software requirements
  • Develop tools and applications by producing clean, efficient code
  • Automate tasks through appropriate tools and scripting
  • Analyze and debug systems
  • Perform validation and verification testing in a test-driven manner
  • Review the work of others, and invite others to review your work
  • Collaborate with internal teams and vendors to fix and improve products
  • Ensure software is up-to-date with latest technologies
What we offer:
  • Benefits starting from Day 1!
  • Retirement Plan Matching
  • Flexible Paid Time Off
  • Wellness Support Programs and Resources
  • Parental & Caregiver Leaves
  • Fertility & Adoption Support
  • Continuous Development Support Program
  • Employee Assistance Program
  • Allyship and Inclusion Communities
  • Employee Recognition … and more!
  • Fulltime

Lead / Principal Software Engineer

We’re hiring Lead and Principal Software Engineers to build the next generation ...
Blume Global
Location:
Australia, Sydney
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • 10+ years building scalable, fault-tolerant systems and enterprise software
  • Strong experience with backend architecture, platform modernization, and CI/CD
  • Proficiency in C#, Java, Python, SQL, and JavaScript
  • Experience with cloud infrastructure (AWS, Kinesis, Lambda) and DevOps tools (Docker, Kubernetes, Jenkins)
  • Proven ability to lead technical decisions, mentor engineers, and improve team productivity
  • Strong experience integrating and evaluating AI tools like GitHub Copilot and AIOps in real-world engineering workflows
  • Strong communication across product, compliance, and engineering teams
  • Track record of aligning technical work with business outcomes and customer value
Job Responsibility:
  • Build the next generation of our platforms
  • Work on high-scale systems that process billions of transactions
  • Modernize core infrastructure
  • Drive AI initiatives to improve performance and reliability
  • Set technical direction
  • Mentor senior engineers
  • Shape architecture across multiple domains
What we offer:
  • Competitive Package + Equity
  • Find the team/project that fits you best
  • Hybrid and Flexible Work
  • Continuous Learning and Growth
  • Access learning platforms (Coursera, Pluralsight, LinkedIn Learning, WiseTech Academy), mentorship, and development opportunities
  • Top-Tier Hardware
  • Onsite Meals and Snacks

Principal Software Engineer

As a Principal Software Engineer at Global-e, you will design and deliver the co...
Global-e
Location:
United States, Hoboken, NJ
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • 7+ Years of Experience: A proven track record building large-scale, customer-facing applications in a fast-paced environment (e-commerce, fintech, tech startups a plus)
  • Distributed Systems Expertise: Familiarity with designing, deploying, and operating resilient, fault-tolerant systems that handle high traffic
  • Engineering Practices Proficiency: Hands-on experience with Agile methodologies, CI/CD pipelines, and rapid release cycles
  • Strong Database Skills: Ability to optimize and scale applications involving complex data interactions
Job Responsibility:
  • Deliver High-Impact Features: Lead the design, development, and deployment of new capabilities across logistics (fulfillment, labels, tracking) and order management workflows
  • Shape Technical Architecture: Define, communicate, and guide architectural decisions to ensure scalability and reliability
  • Elevate Standards: Champion clean code, best practices, and robust testing frameworks, pushing the team to achieve technical excellence
  • Scale the Product: Propose and implement features, tooling, and infrastructure that support exponential growth and operational efficiency
  • Ensure Quality & Reliability: Employ a rigorous approach to verification, focusing on stable, high-performing systems that meet critical metrics and SLAs
  • Move Fast with Confidence: Embrace a rapid, iterative release cycle, balancing speed and safety through CI/CD pipelines, effective monitoring, and efficient processes
  • Collaborate & Share Knowledge: Work closely with other engineering teams, product managers, and stakeholders to ensure alignment and share expertise
  • Write Code in Scala: Contribute high-quality Scala code (no prior Scala experience required, just a passion for learning and an interest in functional programming)
What we offer:
  • Impact at Global Scale: Build features used by millions, simplifying global commerce and transforming the e-commerce landscape
  • Modern Technology Stack: Work on an advanced microservices platform, leveraging cloud-native tools and best-in-class engineering practices
  • Growth & Development: Expand your expertise through challenging projects, mentorship opportunities, and professional development programs