Principal Software Engineer

United States, Multiple Locations · 139900.00 - 274800.00 USD / Year · Job Posted February 13, 2026

Job Description

The Microsoft Azure High Performance Computing & AI Engineering (HPC & AI Eng) team manages the core platform and fleet of AI High Performance Computing products that customers use to run their most performant and demanding workloads. The AI Customer Experience (AICE) engineering team within HPC & AI Eng is on the front lines, managing the flagship supercomputers used by top-tier AI customers; these systems enable breakthroughs such as ChatGPT and are highlighted in the Top500, MLPerf and Graph500 rankings.

As a Principal Supercomputing Software Engineer, you will design and develop high-volume, low-latency telemetry pipelines, connect to existing telemetry pipelines, and stitch together data to deliver first-to-know insights on customer-facing issues across the infrastructure stack, from datacenter events to hardware and networking subsystem events that affect job reliability and cause job interrupts. In this role, you will bring exceptional design and development expertise, with a solid background in large-scale High-Performance Computing and GPU systems, cloud computing platforms, and high-performance data processing infrastructure.

This opportunity will give you hands-on experience managing the largest supercomputers delivered to our customers. As a key technical leader, you will engage deeply with strategic customers, directly influencing their business outcomes, and drive engineering improvements in the Azure ecosystem that benefit the broader fleet. Your work will enable the next wave of growth and innovation in AI and high-performance computing (HPC) in the cloud.

Job Responsibility

  • Architect, design and develop high-volume, low-latency end-to-end event pipelines that provide first-to-know insights on events that cause job interrupts and affect job reliability
  • Conduct analysis of existing event pipelines to evaluate fidelity, granularity and latency of critical events
  • Contribute to improving key metrics such as Job Mean Time to Interrupt, Nodes in Service, and Mean Time to Resolve on flagship supercomputers by enabling data scientists and domain experts to use the telemetry to identify events and issues at the intersection of datacenter and hardware, develop hypotheses, conduct A/B tests and synthesize results
  • Partner with cross-organizational teams to evaluate available telemetry and its latency, and drive the architecture, design, development and deployment of end-to-end solutions to manage core infrastructure, including current and next-generation datacenter, IT hardware, power and cooling technologies
  • Drive engineering and operational excellence based on issues and learnings from strategic customers on their usage scenarios to improve product features and capabilities
  • Partner with teams on continuous learning and continuous improvement programs by leading the resolution of complex incidents, driving root cause analyses and championing initiatives to minimize future customer impact

Requirements

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python - OR equivalent experience
  • 5+ years of hands-on experience designing and developing high-volume, low-latency pipelines using products such as AzPubSub, Event Hubs, Azure Stream Analytics, Kafka, Grafana, Prometheus or equivalent products
  • 3+ years of experience with one of AI/HPC system management OR High-Speed Networks OR HPC Storage OR managing Cloud Infrastructure
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter

Nice to have

  • Bachelor's Degree in Computer Science or related technical field AND 10+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python - OR Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python - OR equivalent experience
  • 5+ years of experience in operating AI/HPC systems, developing and running AI/HPC applications on clusters, or operating Cloud Infrastructure
  • 3+ years of experience in multiple DataCenter technologies: power, cooling, IT hardware, telemetry

Similar Jobs for Principal Software Engineer

8 matching positions

Principal Software Engineer

We are developing Manufacturing and Engineering AI tools that help employees gai...
Location: India, Hyderabad
Salary: Not provided
Company: Amgen
Expiration Date: Until further notice
Requirements
  • 13-17 years of engineering experience building or platforming cloud services or developer platforms, with 3+ years leading engineering teams or technical programs
  • Proven experience designing and operating cloud-native platforms using Kubernetes, containers, microservices, and related distributed system patterns
  • Hands-on experience with LLM serving or adjacent model-serving patterns, including inference endpoints, routing, scaling, batching, and latency/cost optimization
  • Practical knowledge of API gateway patterns, authentication and authorization, and secure integrations
  • Familiarity with cost attribution and FinOps concepts for cloud and AI workloads
  • Strong track record partnering with product managers and senior technical stakeholders to deliver platform capabilities and roadmaps
  • Excellent communication skills with the ability to explain technical tradeoffs clearly to both technical and non-technical audiences
  • Experience with observability and SRE practices, including metrics, tracing, logging, incident management, and production support
  • Master's / Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
Job Responsibility
  • Define the technical vision and reference architecture for AI platforms supporting chatbots, agents, orchestration, and related enterprise services
  • Translate product and business requirements into scalable platform capabilities, including agent hosting, model access, AI gateways, observability, and operational tooling
  • Drive platform decisions around LLM serving, model endpoints, caching, batching, latency-versus-cost tradeoffs, and multi-model support
  • Lead architecture for manufacturing integrations and industrial data connectivity, including patterns for SCADA, Data Historian, MES, ERP, LIMS, APIs, event streams, and document-based knowledge sources
  • Own platform reliability, scalability, and cost by defining SLIs/SLOs, capacity planning, cost attribution, and FinOps practices
  • Collaborate with Product Owners, Principal Engineers, and stakeholders to define roadmap, acceptance criteria, and delivery milestones
  • Lead and mentor engineers delivering platform services, integrations, CI/CD for agents and models, and marketplace/catalog capabilities
  • Establish standards for security, compliance, and model governance, including data handling, access controls, logging, auditability, and traceability
  • Be hands-on when needed to prototype architectures, review designs, troubleshoot production incidents, and participate in code and design reviews
What we offer
  • In addition to the base salary, Amgen offers competitive and comprehensive Total Rewards Plans that are aligned with local industry standards
  • Fulltime

Principal Software Engineer

We are looking for a Principal Engineer to join the Identity Solutions Data Plat...
Location: Hungary, Budapest
Salary: Not provided
Company: Mastercard
Expiration Date: Until further notice
Requirements
  • A track record of delivering complex systems through personal depth and attention to detail, with the ability to design an architecture, write and review code, and hold high standards for quality at every level of the stack
  • Deep experience designing and operating large-scale data processing pipelines, with the ability to reason about tradeoffs in pipeline architecture, cost, performance, and reliability
  • Experienced with distributed systems, cloud infrastructure (AWS), security, and reliability engineering
  • Extensive hands-on experience with Spark and Databricks; proficiency in Scala or Java, with Python experience a plus
  • Able to assess technical proposals with data: prototyping to validate feasibility and defining measurable success criteria to support clear recommendations
  • Familiar with machine learning workflows, including feature engineering, model training, and the data infrastructure requirements that support them
  • Able to review requirements critically and collaboratively, asking whether something is truly needed, proposing alternatives, and surfacing concerns early, while keeping the team moving forward
  • Clear communicator who can explain technical decisions to engineers, product owners, and business stakeholders at the right level of detail
  • Committed to mentoring the engineers you work with, patient in explaining complex topics and generous with your time and knowledge
Job Responsibility
  • Assess our data processing pipelines against cost, efficiency, and reliability targets, prototyping solutions to validate feasibility and measuring results against clear KPIs
  • Work with engineers, product managers, and business stakeholders to build the case for change
  • Provide hands-on technical leadership: setting the standard for engineering quality through your own work, developing lead and senior engineers, and identifying gaps in our development practices with the same methodical, evidence-based approach
  • Maintain a clear view of how our platform fits into the broader Mastercard technology ecosystem, anticipating the needs of partner teams and identifying opportunities to consolidate on shared platforms
  • Fulltime

Principal Software Engineer

Microsoft has an exciting opportunity for a Principal Software Engineer in the M...
Location: United States, Redmond
Salary: 142800.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: the successful candidate must have an active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph. Failure to maintain or obtain the appropriate U.S. Government clearance and/or customer screening requirements may result in employment action up to and including termination
  • This position requires successful verification of the stated security clearance to meet federal government customer requirements
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • This position requires verification of U.S. citizenship due to citizenship-based legal restrictions
Job Responsibility
  • Lead engineering excellence through mentorship, coding best practices, quality standards, and proactive risk management
  • Lead architecture and design strategy for complex systems, driving scalable, resilient, secure, and cost-efficient solutions through technical leadership, innovation, and data-driven decision making
  • Drive end-to-end delivery planning and execution for complex services, ensuring security, compliance, resiliency, scalability, and operational readiness throughout the product lifecycle
  • Investigate pre-production and production issues, and implement and deploy fixes
  • Participate in an on-call rotation (typically 24/7 for one week every 6-8 weeks) within a secure facility
  • Fulltime

Principal Software Engineer

Are you interested in leading a team to build a world-class deployment system to...
Location: United States, Multiple Locations
Salary: 142800.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Preferred: Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Proficiency in AI-native development
  • Fundamentals in data structures, algorithms, object-oriented design, and scalable systems
  • Experience building, testing, debugging, and maintaining production-quality software
  • Problem-solving and technical judgment skills
  • Experience with cloud platforms and distributed/service-oriented architecture
  • Experience with reliability, monitoring, and performance optimization practices
  • Experience in driving AI (LLM/ML) based engineering solutions
Job Responsibility
  • Work with engineers, product managers, and partner teams to deliver experiences with the right overall design and architecture, leveraging AI where it can meaningfully improve deployment efficiency, reliability, and customer outcomes
  • Provide mentorship and coaching to engineers both in, and beyond, your team, including the adoption of modern AI-powered development practices and tools
  • Own and deliver complete features across the development lifecycle, including design, architecture, implementation, testability, debugging, shipping, and servicing
  • Drive innovation through automation and AI-powered solutions to improve deployment intelligence, operational efficiency, and service reliability at hyperscale
  • Ensure your team delivers clean, well-thought-out code with an emphasis on quality, performance, simplicity, durability, scalability, maintainability, and effective use of AI-assisted engineering practices.
  • Fulltime

Principal Software Engineer

Join the "Microsoft AI Web Data Platform Team" as a Principal Software Engineer,...
Location: United States, Redmond
Salary: 142800.00 - 274800.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Job Responsibility
  • Align team efforts with business and user requirements by collaborating with stakeholders to define priorities, resolve dependencies, and ensure delivery of well-documented design and implementation plans for products, applications, services, or platforms
  • Guide system design and architectural decisions across multiple components, encouraging the use of data and telemetry to make informed decisions
  • Drive engineering excellence by fostering a culture of building modular, secure, reliable, testable, maintainable, and reusable solutions, while promoting active monitoring practices
  • Establish quality assurance strategies by setting standards for improving test coverage, streamlining integration testing, and addressing critical problem areas proactively
  • Oversee and improve operational reliability, guiding efforts to troubleshoot and optimize automation, monitoring, and Live Site health
  • Fulltime

Principal Software Engineer

Location: United States, Columbus
Salary: 130500.00 - 170000.00 USD / Year
Company: Aflac
Expiration Date: Until further notice
Requirements
  • Must have a Bachelor’s degree in Computer Science, Information Systems or related technical field.
  • Must have at least 6 years of progressive experience in IT positions performing the following duties: Applying experience with the Wynsure/eWAM full development cycle, Wynsure production support, and specific expertise in Wynsure Enrollment domain including Wynsure Service Layer Integration (SLI) APIs, Wynsure Event-Driven-Architecture (EDA), Wynsure Data Migration, and Wynsure Optimizations.
  • Experience in supporting an enterprise-scale production environment for Wynsure, including troubleshooting through logs and crash dumps.
  • Utilizing experience with: Wynsure technology: GOLD (enhanced variant of C++), eWAM, wMigrate, Wyseman, OQL
  • Web Services, SQL Server, ETL, Transact SQL (writing complex stored procedures, triggers), Powershell, IIS.
  • API and Integration including WebServices, REST, SOAP, XML, JSON.
Job Responsibility
  • Define architectural guidelines and best practices
  • Lead software development initiatives from conception to deployment
  • Collaborate with stakeholders to align software solutions with business objectives
  • Introduce and train teams in advanced programming languages and tools
  • Ensure cloud readiness and optimal performance of all applications
  • Lead and mentor technical and project team members at the business function level
  • Lead the project team in analyzing the requirements and providing accurate and detailed estimates for the designing, building, testing and deployment phases of the project
  • Provide technical leadership and mentoring to various technical teams
  • Collaborate with Architects, Developers, Senior Infrastructure Technical staff to evaluate and recommend technology advancements and business solutions for assigned projects and/or applications
  • Support the implementation and testing of cross-functional systems, ensuring system meets the needs of client and business
  • Fulltime

Principal Software Engineer

We’re looking for a Principal Software Engineer to help shape the next generatio...
Location: India, Bengaluru
Salary: Not provided
Company: OneTrust
Expiration Date: Until further notice
Requirements
  • Bachelor's or Master’s in Computer Science, Engineering, or a related field
  • 8+ years of full-cycle software development experience in Agile teams
  • Proven success designing scalable, distributed systems and microservice architectures
  • Strong hands-on expertise with Java, Spring ecosystem, RESTful APIs, and CI/CD pipelines
  • Deep understanding of SQL and NoSQL databases—schema design, optimization, and performance tuning
  • Experience with Kafka or similar streaming platforms
  • Cloud experience (Azure preferred; AWS/GCP welcome) and containerization (Docker, Kubernetes)
  • Demonstrated technical leadership—mentoring peers, setting best practices, and influencing architecture
  • Excellent analytical and communication skills, able to translate technical insights into clear solutions
Job Responsibility
  • Design, build, and optimize backend frameworks and microservices using Java, Spring Boot/Spring Cloud, and RESTful APIs
  • Architect and deliver multi-tenant, cloud-native, and high-availability systems in Azure (or other major cloud platforms)
  • Model and manage data across SQL and NoSQL databases, ensuring performance and scalability for large workloads
  • Enhance real-time systems using Kafka or similar streaming architectures
  • Lead by example—write clean, testable code, review pull requests, mentor engineers, and guide design discussions
  • Continuously improve reliability, performance, and developer experience through automation, CI/CD, and best practices
  • Champion innovation—explore emerging technologies and AI-assisted development tools to boost productivity and quality
What we offer
  • Comprehensive healthcare coverage
  • Flexible PTO
  • Equity RSUs
  • Annual performance bonus opportunities
  • Retirement account support
  • 14+ weeks of paid parental leave
  • Career development opportunities
  • Company-paid privacy certification exam fees
  • Fulltime

Principal Software Engineer

Location: United States, Columbus
Salary: 130500.00 - 170000.00 USD / Year
Company: Aflac
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Information Technology or related technical discipline
  • 7 years of progressive experience in business analyst or development positions performing: completion of full software development cycle from requirements gathering to implementation
  • Working in an Agile development environment
  • Technical writing
  • Troubleshooting/problem-solving skills in a software environment
  • Applying strong analytical and product management skills, including interpreting customer business needs and translating them into application and operational requirements
  • Strong SQL knowledge and working with complex database schemas and table structures
  • Utilizing experience with: MySQL, Jira, VersionOne, Confluence, Jenkins, SharePoint, SDLC methodologies, JSON, XML, Perl, Java, JavaScript, Angular, Visio, and Microsoft Project
  • In the alternative, employer will accept Master’s degree in Computer Science, Information Technology or related technical discipline plus 5 years of experience in business analyst or development positions performing the aforementioned
  • Must also have 3 years of experience with: Insurance products, plans, pricing and terminology
Job Responsibility
  • Define architectural guidelines and best practices by leading software development initiatives from conception to deployment
  • Collaborate with stakeholders to align software solutions with business objectives
  • Introduce and train teams in advanced programming languages and tools
  • Ensure cloud readiness and optimal performance of all applications
  • Lead and mentor technical and project team members at the business function level
  • Lead the project team in analyzing the requirements and providing accurate and detailed estimates for the designing, building, testing and deployment phases of the project
  • Provide technical leadership and mentoring to various technical teams
  • Collaborate with Architects, Developers, Senior Infrastructure Technical staff to evaluate and recommend technology advancements and business solutions for assigned projects and applications
  • Support the implementation and testing of cross-functional systems, including ensuring the system meets the needs of client and business; providing continuous support to internal and external clients who are experiencing problems with server hardware, operating systems, core infrastructure applications and related utilities; and maintaining mainframe operating systems or major subsystems, and associated software and hardware products
  • Fulltime