Datacenter Program Manager New Product Integration Job at Microsoft Corporation (West Des Moines)

Infrastructure Hardware Technical Program Manager (Server And Network Systems)

As an Infrastructure Hardware Technical Program Manager (Server and Network Syst...

Location

United States; Canada , Sunnyvale; Toronto

Salary:

Not provided

Cerebras Systems

Expiration Date

Until further notice

Requirements

B.S. or M.S. in Computer Science, Electrical/Computer Engineering, or equivalent experience
8+ years in Technical Program Management (or similar delivery leadership) for server, network, or infrastructure platforms from concept through production
Experience coordinating complex server and/or datacenter network programs across OEM/ODMs, switch vendors, and internal engineering teams
Working knowledge of server architecture (CPU/NUMA, memory bandwidth, PCIe, NIC and storage IO) and enough networking fundamentals (leaf-spine fabrics, switch platforms, high-performance interconnects) to run effective technical reviews
Familiarity with Linux server fleet management (provisioning, firmware/BIOS, drivers, field triage)
Strong multi-team program execution skills: integrated plans, risk management, dependency tracking, and executive-level communication
Ability to operate in ambiguity and keep parallel server and network workstreams aligned

Job Responsibility

Own end-to-end program execution for server systems and network equipment in Cerebras clusters, including new platforms, refreshes, and major component/config changes
Drive requirements gathering and convert inputs into executable plans with clear milestones, readiness gates, and cross-functional deliverables
Represent Cluster Architecture in executive reviews, OKR cycles, and leadership/customer forums as needed
Build and manage integrated schedules across vendors and internal teams, tracking dependencies, critical path, and risks
Manage OEM/ODM and switch/vendor engagements (RFI/RFP, samples, escalations, roadmap alignment)
Partner with Compute / Server Platform / Network Architects to turn architectural decisions into qualification plans, acceptance criteria, and rollout strategies
Lead qualification and release readiness (lab/staging validation, regression tracking, go/no-go decisions)
Own risk and change management into production, including versioning, rollout sequencing, and stakeholder communication
Ensure operational readiness with deployment and fleet teams and maintain alignment with rack/physical DC owners on power, cooling, space, and cabling constraints

What we offer

Build a breakthrough AI platform beyond the constraints of the GPU
Publish and open source your cutting-edge AI research
Work on one of the fastest AI supercomputers in the world
Enjoy job stability with startup vitality
Our simple, non-corporate work culture that respects individual beliefs

Fulltime

Senior Technical Program Manager

Microsoft’s Cloud Operations & Innovation (CO+I) organization powers the infrast...

Location

United States , Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree AND 4+ years experience in engineering, product/technical program management, data analysis, or product development OR equivalent experience
2+ years of experience managing cross-functional and/or cross-team projects

Job Responsibility

Lead delivery of RADAR’s mission by implementing and scaling sensor‑health detection, alerting, and triage capabilities across Microsoft datacenters, ensuring high‑quality signal visibility and reliable operational outcomes
Design and operationalize core workflows for sensor‑health detection, alert routing, validation, and triage, partnering closely with upstream telemetry systems and downstream incident‑response teams
Drive cross‑team orchestration by creating and strengthening relationships across engineering, hardware, operations, and service teams to integrate and execute multi‑feature scenarios and platform capabilities
Build and manage onboarding processes for new telemetry types and detection scenarios, including requirements templates, validation criteria, handoff procedures, and governance frameworks
Champion Process Excellence by maturing workflows, training partners, and driving adoption of consistent operating models for new signals, anomaly detection patterns, and incident‑response processes
Lead partner alignment and influence to shape and deliver shared roadmaps across divisional boundaries, ensuring detection, alerting, and observability capabilities evolve cohesively
Identify gaps and opportunities through structured feedback loops; synthesize insights into clear problem statements, repeatable patterns, and actionable guidance for leadership and engineering stakeholders
Manage schedules and execution across epics, sprints, semester plans, and releases, tracking dependencies, anticipating risks, and driving cohesive delivery across partner teams
Produce clear technical documentation including specifications, decision records, runbooks, and operational procedures to support partner readiness and consistent implementation

Fulltime

Sr IT Network Engineer

The System Engineer job family has responsibility for infrastructure/technical p...

Location

United States , Englewood

Salary:

48.71 - 72.45 USD / Hour

American Nursing Care

Expiration Date

Until further notice

Requirements

Bachelor of Arts in Computer Science, Technology, or Business discipline or equivalent experience
Minimum of 7 years of professional experience in an IT technical or infrastructure field
Experience with datacenter and MDF/IDF closet construction design (cabling, racks, power, cooling)
5-7 years of experience with vendor management

Job Responsibility

Provides advice, guidance and expertise to promote adoption of methods and tools and adherence to policies and standards
Evaluates and selects appropriate methods and tools in line with agreed policies and standards
Implements methods and tools at project and team level including selection and tailoring in line with agreed standards
Manages reviews of the benefits and value of methods and tools
Identifies and recommends improvements
Contributes to organizational policies, standards, and guidelines for methods and tools
Coordinates and manages planning of the system and/or acceptance tests
Takes responsibility for integrity of testing and acceptance activities and coordinates the execution of these activities
Provides advice and guidance on any aspect of test planning and execution
Identifies process improvements, and contributes to corporate testing standards and definition of best practice

What we offer

medical
prescription drug
dental
vision plans
life insurance
paid time off (full-time benefit eligible team members may receive a minimum of 14 paid time off days, including holidays annually)
tuition reimbursement
retirement plan benefit(s) including, but not limited to, 401(k), 403(b), and other defined benefits offerings

Fulltime

IT Technical Trainer

In alignment with our Microsoft values, we are committed to cultivating an inclu...

Location

United States , Boydton

Salary:

76800.00 - 151900.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science, Information Technologies, System Engineering, Information Systems, Education, Adult Learning, Business Management, or a related field AND 2+ years experience in training, education, information technology (IT), cloud systems, datacenter environments, artificial intelligence, information security, server environments, networking, cloud, or computer technologies OR equivalent experience
Background Check Requirements: Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Collaboration and Engagement: Provides course content feedback to broader Learning team using appropriate internal team channels as necessary and independently
Provides up-to-date knowledge about the content area to content development peers and other cross-functional teams
Collaborates with curriculum managers to understand the content and ensure the full curriculum is being delivered by trainers
Connects operations and program team needs to learning programs and interventions and ensures compliance with necessary trainings
Executes on new training opportunities with other teams as necessary
External: Begins to understand and define customer requirements by interpreting audience information and aligns training strategies to address these needs
Acts as a liaison to customers, demonstrating flexibility in adjusting training methods to meet evolving needs
Collaborates with sales teams to support diverse, multi-modality training sessions tailored for external customers
Uses social media platforms to support interactive learning experiences, improve training outcomes, and evangelize skilling and Microsoft products through a variety of modalities and platforms
Performs essential pre-delivery tasks, working alongside field teams and customers to enhance training sessions

Fulltime

Staff Thermal Attainment Engineer

Technical, hands-on engineer responsible for post-silicon thermal activities rel...

Location

Malaysia , Penang

Salary:

Not provided

AMD

Expiration Date

Until further notice

Requirements

Bachelor’s degree or higher in Electrical/Computer Engineering, Electronics, Mechanical Engineering, or a related field, with 2-5 years of experience in SoC thermal validation and debug
Strong background in thermodynamics and heat transfer
Solid understanding of thermal management methodologies in datacenter products
Experience with power and thermal controllers and management
Experience developing validation methodologies and infrastructure
Test plan and test development experience
Participation in silicon bring-up and debug, and support of internal engineering teams
Debug skills at both GPU and system level
Familiarity with a programming/scripting language (C/C++, Python, Perl, ...)
Working knowledge of Server OSes (Linux, Windows Server)

Job Responsibility

Learn and execute thermal attainment test plans in post-silicon time periods in support of Data Center GPU product roadmap
Investigating thermal management techniques through both hardware and firmware-based solutions
Actively participate in analysis of post-silicon thermal and power data, ensure integrity of results, and provide a summary and conclusions
Work hands-on, locally or remotely, with computers, systems, or data center hardware to build practical knowledge of hardware applicable to servers, data centers, or thermal equipment in support of thermal attainment work
Calibration of thermal sensors and working with other groups to correlate sensor accuracy across platforms
Support prototyping experiments for new GPU features that impact thermal and power characteristics
Work with cross-functional teams internally and externally to improve post-silicon validation test strategy, methodology, and process
Leading collaborative technical discussions to drive resolution on technical issues and roll out technical initiatives
Be able to work in a high demand, fast paced environment with lots of real-time problem solving and critical thinking

Fulltime

Power and Thermal Engineer

Technical, hands-on engineer responsible for post-silicon thermal activities rel...

Location

Malaysia , Penang

Salary:

Not provided

AMD

Expiration Date

Until further notice

Requirements

Bachelor’s degree or higher in Electrical/Computer Engineering, Electronics, Mechanical Engineering, or a related field, with a minimum of 2-5 years of experience in SoC thermal validation and debug
Strong background in thermodynamics and heat transfer
Solid understanding of thermal management methodologies in datacenter products
Experience with power and thermal controllers and management
Experience developing validation methodologies and infrastructure
Test plan and test development experience
Participation in silicon bring-up and debug, and support of internal engineering teams
Debug skills at both GPU and system level
Familiarity with a programming/scripting language (C/C++, Python, Perl, ...)
Working knowledge of Server OSes (Linux, Windows Server)

Job Responsibility

Learn and execute thermal attainment test plans in post-silicon time periods in support of Data Center GPU product roadmap
Investigating thermal management techniques through both hardware and firmware-based solutions
Actively participate in analysis of post-silicon thermal and power data, ensure integrity of results, and provide a summary and conclusions
Work hands-on, locally or remotely, with computers, systems, or data center hardware to build practical knowledge of hardware applicable to servers, data centers, or thermal equipment in support of thermal attainment work
Calibration of thermal sensors and working with other groups to correlate sensor accuracy across platforms
Support prototyping experiments for new GPU features that impact thermal and power characteristics
Work with cross-functional teams internally and externally to improve post-silicon validation test strategy, methodology, and process
Leading collaborative technical discussions to drive resolution on technical issues and roll out technical initiatives
Be able to work in a high demand, fast paced environment with lots of real-time problem solving and critical thinking

Fulltime

Principal Mechanical/Thermal Architect

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the...

Location

Taiwan , Taipei

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

9+ years related technical engineering experience OR Bachelor's degree in Mechanical Engineering, or related field AND 6+ years related technical engineering experience OR Master's degree in Mechanical Engineering, or related field AND 4+ years related technical engineering experience OR Doctorate degree in Mechanical Engineering, or related field AND 3+ years related technical engineering experience OR equivalent experience
Proficiency in CAD software and understanding of mechanical design principles, GD&T analysis, mechanical vibration and shock, material compatibility, tooling, manufacturing processes, validation, and packaging
Experience with prototyping, tooling, and high-volume manufacturing of electronic enclosures
Experience leading complex design programs, cross-functional design reviews, and architectural decisions
Experience with cabling and networking interconnect technology
Experience with development of specifications, interface control documents, and managing design requirements
Hands-on experience with the end-to-end design of liquid cooling subsystems, such as QD, Cold Plate, Manifold, RPU and CDU
Hands-on experience with FloTHERM, FLOEFD, Ansys, Icepak, Macroflow or equivalent CFD analysis software
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter

Job Responsibility

Lead Mechanical and Thermal Development: Deliver innovative mechanical and thermal architecture and designs for compute & AI server and cooling systems, integrating thermal, structural, and electrical constraints with a deep understanding of server architecture and adjacencies
End-to-End System Design Ownership: Lead system-level design from concept through production, including architectural trade-offs, technical decision-making, and alignment across chip, tray, rack, and datacenter interfaces
Liquid Cooling Architecture Leadership: Architect and drive the design and development of complex liquid cooling solutions at the server, rack, and datacenter levels, including liquid-to-air heat exchangers, CDUs, manifolds, and piping
Fluid Distribution & Thermal Implementation: Own the mechanical and thermal implementation of fluid distribution systems within server racks, including fluid flow and pressure management; cold plate design and reliability; pump selection, design, and reliability; material compatibility and corrosion resistance; and fluid and air volume management
Design for Manufacturing & Reliability: Ensure manufacturability, cost-efficiency, and long-term reliability through material selection, tolerance control, supplier engagement, and understanding of tooling and manufacturing processes
Drive for Results: Set goals with measurable outcomes, manage risk, and ensure timely delivery of high-quality hardware solutions across product lines

Fulltime

Senior Staff Engineer, Software Engineering

Our Senior Staff Engineer works with our Staff and Sr. Engineers to innovate and...

Location

United States , Chevy Chase; Austin; Richardson; Seattle; Palo Alto

Salary:

110000.00 - 260000.00 USD / Year

Geico

Expiration Date

Until further notice

Requirements

Exemplary ability to design, perform experiments, and influence engineering direction and product roadmap
Experience partnering with engineering teams and transferring research to production
Track record of publications in credible conferences and journals
Experience with continuous delivery and infrastructure as code
In-depth knowledge of CS data structures and algorithms
Experience solving analytical problems with quantitative approaches
Ability to excel in a fast-paced, startup-like environment
Knowledge of developer tooling across the software development life cycle (task management, source code, building, deployment, operations, real-time communication)
Fluency and specialization in at least two modern languages such as Go, Java, C++, Python, or C#, including object-oriented design
Experience with Microservices oriented architecture and extensible REST APIs

Job Responsibility

Focus on multiple areas and provide technical and thought leadership to the enterprise
Collaborate with product managers, team members, customers, and other engineering teams to solve our toughest problems
Develop and execute technical software development strategy for a variety of domains
Accountable for the quality, usability, and performance of the solutions
Utilize programming languages like Python, C# or other object-oriented languages, SQL, and NoSQL databases, Container Orchestration services including Docker and Kubernetes, and a variety of Azure tools and services
Be a role model and mentor, helping to coach and strengthen the technical expertise and know-how of our engineering and product community. Influence and educate executives
Consistently share best practices and improve processes within and across teams
Analyze cost and forecast, incorporating them into business plans
Determine and support resource requirements, evaluate operational processes, measure outcomes to ensure desired results, demonstrate adaptability, and sponsor continuous learning

What we offer

Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
Access to additional benefits like mental healthcare as well as fertility and adoption assistance
Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year

Fulltime
