Principal Software Engineer - System Optimization

Microsoft Corporation

Location:
United States, Mountain View

Contract Type:
Not provided

Salary:

139900.00 - 274800.00 USD / Year

Job Description:

The Artificial Intelligence Frameworks team at Microsoft builds the software foundation that enables AI to run everywhere. We operate at the intersection of AI algorithmic innovation, purpose-built hardware, large-scale systems, and production software, working closely with internal hardware teams and external partners. Our team is composed of highly capable, deeply motivated engineers who value technical excellence, collaboration, and an inclusive culture.

We own inference performance for OpenAI and other state-of-the-art large language models, working directly with OpenAI on models deployed through Azure OpenAI Service. These systems power some of the largest AI workloads on the planet, serving trillions of inferences per day across Microsoft products.

As a Principal Software Engineer - System Optimization, you will provide technical leadership across the AI inference software stack, with a strong focus on high-performance, large-scale serving systems. You will lead the benchmarking and optimization of cutting-edge LLMs across GPUs and Microsoft’s custom AI hardware, architect improvements to distributed serving pipelines, and drive deep performance investigations across complex, multi-layered systems. You will own critical performance and efficiency metrics, design and implement durable optimizations, and influence technical direction across teams. In close partnership with research, hardware, and production engineering groups, you will help deliver next-generation AI capabilities into Microsoft’s most widely used products, directly shaping Azure’s efficiency and the future of Microsoft’s AI infrastructure.

Job Responsibility:

  • Own and drive inference performance for OpenAI LLMs across NVIDIA, AMD, and Microsoft silicon, benchmarking, optimizing, and monitoring large-scale production workloads
  • Lead deep performance investigations across software, frameworks, and hardware; identify bottlenecks, design durable optimizations, and preserve system integrity
  • Build and evolve AI tooling that accelerates insight, simplifies pipelines, enables fast model and hardware bring-up, and reduces operational complexity
  • Improve efficiency and reduce fleet footprint, influencing Azure AI CapEx goals and next-generation infrastructure through software-hardware codesign
  • Provide technical leadership and influence, partnering with research, hardware, and production teams and exercising exceptional judgment, autonomy, and execution while embodying Microsoft’s Culture and Values

Requirements:

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, or Python OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Nice to have:

  • Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, or Python OR equivalent experience
  • PhD Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, or Python
  • 2+ years of experience with Large Language Models (LLMs) and large-scale execution of AI workloads
  • 4+ years of technical design, problem solving, and debugging skills

Additional Information:

Job Posted:
March 22, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Principal Software Engineer - System Optimization

Principal Engineer, Software - Android

At T-Mobile, we invest in YOU! Our Total Rewards Package ensures that employees ...
Location:
United States, Bellevue; Denver; Overland Park; Frisco
Salary:
133500.00 - 240700.00 USD / Year
T-Mobile
Expiration Date:
Until further notice
Requirements:
  • 7+ years technical engineering experience
  • Experience in mobile software development using Kotlin, Jetpack Compose or Android SDK
  • Developing sophisticated Android mobile applications
  • Experience building a scalable customer facing application used by millions of customers
  • Provide on-call and in-person support for troubleshooting, isolation, maintenance, operations, patching, incident management, problem management, build and deployments for owned software and systems
  • Hands-on experience developing mobile networking and REST web services, and understanding large, complex code bases that involve mobile, backend, and external SDK integration
  • Designing mobile applications using VIPER, Factory, DAO, MVVM, MVC, Delegate, Builder, Adapter, Singleton and Facade design patterns and architecture
  • Experience in API design, SDK architecture, and mobile software lifecycle development practices
  • BS degree in Computer Science, Information Technology, or equivalent experience
  • Communication
Job Responsibility:
  • Drives projects with the Product, UX/UI and Backend teams to design, build and extend consumer facing new products, platforms, and features
  • Improve product quality through code reviews, writing effective unit tests
  • Ability to digest feature requirements and high-level end to end design to guide in coding approach and work breakdown
  • Ability to produce a low-level design document to detail feature implementation
  • Presents project improvement scenarios to management for consideration
  • Lead development team in building native functionality with optimization and expansion to support T-Mobile’s Digital First mission
  • Present highly technical concepts to both technical and non-technical decision-makers
  • Continuously learns, builds content, and guides others in specific subject areas
  • Informally coaches and contributes to the development of others through mentoring or in-house workshops and learning sessions
  • Develops engineers across functional teams on technology decisions
What we offer:
  • Competitive base salary and compensation package
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Medical, dental and vision insurance
  • Flexible spending account
  • Paid time off
  • Up to 12 paid holidays
  • Paid parental and family leave
  • Fulltime

Principal Software Engineer

As a Principal Software Engineer at Global-e, you will design and deliver the co...
Location:
United States, Hoboken, NJ
Salary:
Not provided
Global-e
Expiration Date:
Until further notice
Requirements:
  • 7+ Years of Experience: A proven track record building large-scale, customer-facing applications in a fast-paced environment (e-commerce, fintech, tech startups a plus)
  • Distributed Systems Expertise: Familiarity with designing, deploying, and operating resilient, fault-tolerant systems that handle high traffic
  • Engineering Practices Proficiency: Hands-on experience with Agile methodologies, CI/CD pipelines, and rapid release cycles
  • Strong Database Skills: Ability to optimize and scale applications involving complex data interactions
Job Responsibility:
  • Deliver High-Impact Features: Lead the design, development, and deployment of new capabilities across logistics (fulfillment, labels, tracking) and order management workflows
  • Shape Technical Architecture: Define, communicate, and guide architectural decisions to ensure scalability and reliability
  • Elevate Standards: Champion clean code, best practices, and robust testing frameworks, pushing the team to achieve technical excellence
  • Scale the Product: Propose and implement features, tooling, and infrastructure that support exponential growth and operational efficiency
  • Ensure Quality & Reliability: Employ a rigorous approach to verification, focusing on stable, high-performing systems that meet critical metrics and SLAs
  • Move Fast with Confidence: Embrace a rapid, iterative release cycle, balancing speed and safety through CI/CD pipelines, effective monitoring, and efficient processes
  • Collaborate & Share Knowledge: Work closely with other engineering teams, product managers, and stakeholders to ensure alignment and share expertise
  • Write Code in Scala: Contribute high-quality Scala code (no prior Scala experience required, just a passion for learning and an interest in functional programming)
What we offer:
  • Impact at Global Scale: Build features used by millions, simplifying global commerce and transforming the e-commerce landscape
  • Modern Technology Stack: Work on an advanced microservices platform, leveraging cloud-native tools and best-in-class engineering practices
  • Growth & Development: Expand your expertise through challenging projects, mentorship opportunities, and professional development programs

Principal Software Engineer

Atlassian is a global leader in cloud collaboration, and one of the world’s larg...
Location:
India, Bengaluru
Salary:
Not provided
Atlassian
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field
  • Deep expertise in PostgreSQL, including internals, extension development, performance tuning, and scaling
  • 10+ years of experience in backend software development, with a focus on distributed systems and storage solutions
  • 5+ years of hands-on experience with AWS RDS/Aurora or equivalent cloud database platforms (GCP, Azure)
  • Demonstrated leadership in technical design, mentoring, and open-source contribution
  • Ability to drive technical roadmaps, influence architectural decisions, and champion best practices across teams
  • Experience mentoring engineers and building high-performing, collaborative teams
Job Responsibility:
  • Contribute to open-source projects and represent Atlassian in the broader PostgreSQL community
  • Lead initiatives to improve scalability, performance, reliability, and security of self-managed Postgres
  • Collaborate with cross-functional teams to define technical strategy and deliver robust solutions for complex storage challenges
  • Establish and promote best practices in distributed systems, cloud infrastructure, and cost optimization
  • Mentor and develop engineers, fostering a culture of technical excellence and continuous learning
What we offer:
  • Health coverage
  • Paid volunteer days
  • Wellness resources

Principal Software Engineer, Trusted Data Platform

As a Principal Software Engineer, you will be a technical leader and hands-on co...
Location:
India, Bangalore
Salary:
Not provided
Atlassian
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related technical field
  • 10+ years of experience in backend software development, focusing on distributed systems and storage solutions
  • 5+ years of experience working with AWS storage services (S3, DynamoDB, EBS, EFS, FSx, Glacier)
  • Strong expertise in system design, architecture, and scalability for large-scale storage solutions
  • Proficiency in at least one major backend programming language (Kotlin, Java, Go, Rust, or Python)
  • Experience designing and implementing highly available, fault-tolerant, and cost-efficient storage architectures
  • Deep understanding of distributed systems, replication strategies, sharding, and caching
  • Knowledge of data security, encryption best practices, and compliance requirements (SOC2, GDPR, HIPAA)
  • Experience leading engineering teams, mentoring senior engineers, and driving technical roadmaps
  • Proficiency with observability tools, performance monitoring, and troubleshooting at scale
Job Responsibility:
  • Designing and optimizing high-scale, distributed storage systems built on AWS storage technologies
  • Shaping the architecture, performance, and reliability of backend storage solutions that power critical applications at scale
  • Designing, implementing, and optimizing backend storage services that support high throughput, low latency, and fault tolerance
  • Working closely with senior engineers, architects, and cross-functional teams to drive scalability, availability, and efficiency improvements in large-scale storage solutions
  • Leading technical deep dives, architecture reviews, and root cause analyses to resolve complex production issues related to storage performance, consistency, and durability
  • Driving best practices in distributed system design, security, and cloud cost optimization
  • Mentoring senior engineers, contributing to technical roadmaps, and helping shape the long-term storage strategy
  • Collaborating with Site Reliability Engineers (SREs) to implement observability, monitoring, and disaster recovery strategies, ensuring high availability and compliance with industry standards
  • Advocating for automation, Infrastructure-as-Code (IaC), and DevOps best practices, leveraging tools like Terraform, AWS CloudFormation, Kubernetes (EKS), and CI/CD pipelines to enable scalable deployments and operational excellence
What we offer:
  • Atlassians can choose where they work – whether in an office, from home, or a combination of the two
  • Atlassians have more control over supporting their family, personal goals, and other priorities
  • We can hire people in any country where we have a legal entity
  • Interviews and onboarding are conducted virtually
  • Whatever your preference - working from home, an office, or in between - you can choose the place that's best for your work and your lifestyle

Principal Manufacturing Systems Engineer - Amgen Dun Laoghaire Project Delivery Lead

Join Amgen’s Mission of Serving Patients. At Amgen, if you feel like you’re part...
Location:
Ireland, Dun Laoghaire
Salary:
Not provided
Amgen
Expiration Date:
Until further notice
Requirements:
  • Doctorate degree in Engineering and 2 years of combined Information Systems and Manufacturing Process Automation experience
  • Master’s degree in Engineering and 4 years of combined Information Systems and Manufacturing Process Automation experience
  • Bachelor’s degree in Engineering and 6 years of combined Information Systems and Manufacturing Process Automation experience
  • Engineering, Information Systems, Computer and/or Software GMP-regulated industry background with experience leading projects and resources
  • 8+ years of experience in manufacturing, including 5+ in Pharma/Biotech industry
  • 5+ years of combined experience with Automation/Process Control Systems (PCS), Manufacturing Execution System (MES), and/or laboratory systems or IS platforms
  • 5+ years of experience with packaging and/or filling line systems
  • Strong communication, leadership, and teamwork skills
  • Innovative, technically minded, with strong problem-solving abilities
  • Effective verbal and written communication and facilitation skills in the English language
Job Responsibility:
  • Lead the execution and successful delivery of varied portfolio of IS/Automation projects with appropriate portfolio planning, resource and risk management and financial management for the portfolio
  • Accountable for end-to-end Technology projects delivery from the business case creation up to qualification and go live into manufacturing production
  • Ability to anticipate, evaluate and resolve multiple, simultaneous project issues, delays, and problems by utilizing technical, project management, and business expertise
  • Performs cross-system and feasibility analysis, scopes projects, prioritizes deliverables, and recommends optimal solutions
  • Manage multiple initiatives and priorities
  • Ability to translate strategic opportunities and emerging technology solutions into tangible pragmatic executable plans allied to the ability to apply corporate blueprint and standards using business drivers to local business needs and project requirements
  • Effectively manage relationships with Peers, IS service owners, business partners, enterprise IS service partners, and vendors
  • Communicates with multiple levels within the organization, highlighting issues and proposing solutions
  • Accountable to elicit and analyze needs identified by business stakeholders and convert them into functional design
What we offer:
  • A comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
  • A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
  • Stock-based long-term incentives
  • Award-winning time-off plans
  • Flexible work models, including remote and hybrid work arrangements, where possible
  • Fulltime

Sr. Principal Software Engineer – Search & Recommendation

We are seeking a Sr. Principal Search & Recommendation Engineer to lead the desi...
Location:
United States, Seattle
Salary:
277391.00 - 342391.00 USD / Year
Highspot
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience building and scaling search or recommendation systems in production environments
  • Deep expertise in information retrieval, ranking algorithms, collaborative filtering, and/or neural search techniques
  • Strong programming skills in Python, Java, or Scala; experience with ML and IR frameworks such as Elasticsearch, FAISS, TensorFlow, or PyTorch
  • Familiarity with LLMs, embeddings, and modern vector search infrastructure
  • Proven leadership in cross-functional environments with a track record of mentoring and guiding technical teams
  • Strong grasp of MLOps practices and experience with cloud-native ML infrastructure (e.g., AWS, GCP)
Job Responsibility:
  • Lead the end-to-end development of modern search and recommendation systems, from architecture to production deployment
  • Drive technical strategy and innovation in search relevance, personalized ranking, semantic search, and ML-powered retrieval/grounding
  • Collaborate with product, design, and data teams to define and deliver intelligent user experiences
  • Influence platform-level decisions on data pipelines, experimentation frameworks, and performance optimization
  • Mentor engineers, foster technical excellence, and promote a culture of learning and innovation
What we offer:
  • Comprehensive medical, dental, vision, disability, and life benefits
  • Health Savings Account (HSA) with employer contribution
  • 401(k) Matching with immediate vesting on employer match
  • Flexible PTO
  • 8 paid holidays and 5 paid days for Annual Holiday Week
  • Quarterly Recharge Fridays (paid days off for mental health recharge)
  • 18 weeks paid parental leave
  • Access to Coaches and Therapists through Modern Health
  • 2 volunteer days per year
  • Commuting benefits
  • Fulltime

Principal Machine Learning System Engineer

As a Principal Machine Learning Systems Engineer, you will lead the design, deve...
Location:
United States, Seattle; San Francisco
Salary:
190300.00 - 305600.00 USD / Year
Atlassian
Expiration Date:
Until further notice
Requirements:
  • Lead the design, development, and deployment of scalable machine learning (ML) systems and infrastructure
  • Collaborate closely with data scientists, software engineers, and product teams
  • Optimize model performance
  • Ensure system reliability
  • Implement efficient data pipelines
  • Drive architectural decisions for high-performance computing and cloud-based ML platforms
  • Mentor junior engineers
  • Promote best practices in ML operations (MLOps)
  • Stay updated on emerging technologies
Job Responsibility:
  • Translate complex ML models into production-ready solutions
  • Ensure scalability and security
  • Deliver robust, scalable, and efficient machine learning solutions that support business growth and innovation
What we offer:
  • Health coverage
  • Paid volunteer days
  • Wellness resources
  • Fulltime

Senior Principal Data Platform Software Engineer

We’re looking for a Sr Principal Data Platform Software Engineer (P70) to be a k...
Location:
Salary:
239400.00 - 312550.00 USD / Year
Atlassian
Expiration Date:
Until further notice
Requirements:
  • 15+ years in Data Engineering, Software Engineering, or related roles, with substantial exposure to big data ecosystems
  • Demonstrated experience building and operating data platforms or large‑scale data services in production
  • Proven track record of building services from the ground up (requirements → design → implementation → deployment → ongoing ownership)
  • Hands‑on experience with AWS, GCP (e.g., compute, storage, data, and streaming services) and cloud‑native architectures
  • Practical experience with big data technologies, such as Databricks, Apache Spark, AWS EMR, Apache Flink, or StarRocks
  • Strong programming skills in one or more of: Kotlin, Scala, Java, Python
  • Experience leading cross‑team technical initiatives and influencing senior stakeholders
  • Experience mentoring Staff/Principal engineers and lifting the technical bar for a team or org
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience
Job Responsibility:
  • Design, develop and own delivery of high quality big data and analytical platform solutions aiming to solve Atlassian’s needs to support millions of users with optimal cost, minimal latency and maximum reliability
  • Improve and operate large‑scale distributed data systems in the cloud (primarily AWS, with increasing integration with GCP and Kubernetes‑based microservices)
  • Drive the evolution of our high-performance analytical databases and their integrations with products, cloud infrastructures (AWS and GCP), and isolated cloud environments
  • Help define and uplift engineering and operational standards for petabyte scale data platforms, with sub‑second analytic queries and multi‑region availability (coding guidelines, code review practices, observability, incident response, SLIs/SLOs)
  • Partner across multiple product and platform teams (including Analytics, Marketplace/Ecosystem, Core Data Platform, ML Platform, Search, and Oasis/FedRAMP) to deliver company‑wide initiatives that depend on reliable, high‑quality data
  • Act as a technical mentor and multiplier, raising the bar on design quality, code quality, and operational excellence across the broader team
  • Design and implement self‑healing, resilient data platforms with strong observability, fault tolerance, and recovery characteristics
  • Own the long‑term architecture and technical direction of Atlassian’s product data platform with projects that are directly tied to Atlassian’s company-level OKRs
  • Be accountable for the reliability, cost efficiency, and strategic direction of Atlassian’s product analytical data platform
  • Partner with executives and influence senior leaders to align engineering efforts with Atlassian’s long-term business objectives
What we offer:
  • Health and wellbeing resources
  • Paid volunteer days
  • Fulltime