Principal Silicon Performance Architect Job at Microsoft Corporation (Redmond)

Principal Performance Architect

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the...

Location

United States, Mountain View

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Doctorate in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 3+ years technical engineering experience
OR Master's Degree in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 6+ years technical engineering experience
OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 8+ years technical engineering experience
OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Ability to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Must provide proof of citizenship, US residency, or other protected status for export control assessment
Experience with SIMD (Single Instruction, Multiple Data) and MIMD (Multiple Instruction, Multiple Data) architectures
Background in deep neural network training and inference workload optimization
Experience in architecture and workload analysis on GPUs

Job Responsibility

Develop and test SoC and IP models, including model integration
Analyze performance and bottlenecks for critical deep learning workloads
Actively collaborate with architects and provide critical feedback for future SoCs
Prototype opportunities for performance and power optimizations, and trade-offs on AI accelerators

Fulltime

Principal Hardware Architect

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the...

Location

United States, Hillsboro

Salary:

142800.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Doctorate in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 3+ years technical engineering experience OR Master's Degree in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 6+ years technical engineering experience OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 8+ years technical engineering experience OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
This role will require access to information that is controlled for export under export control regulations
As a condition of employment, the successful candidate will be required to provide either proof of their country of citizenship or proof of their US residency or other protected status
The successful candidate's citizenship will be verified with a valid passport
Lawful permanent residents, refugees, and asylees may verify status using other documents, where applicable

Job Responsibility

Architecting and developing a PCIe Gen 7 subsystem
Working with vendors to evaluate IP and make recommendations
Working with the Performance Modeling team to analyze SOC/platform Azure IO workloads
Working closely with Strategic Planning and Architecture as well as internal customers to understand workload and use case requirements with specific focus on identifying full stack optimization opportunities within the context of the overall memory hierarchy
Collaborating across teams to come up with the best solution possible with a One Microsoft mindset
Challenging the status quo with a growth mindset to push the envelope and enable world-class SOC products across Microsoft
Principal PCIe Architect responsible for defining next-generation PCIe/CXL architecture for Microsoft Azure silicon platforms, delivering high-performance, scalable, and reliable I/O subsystems

Fulltime

Principal Modeling Architect - DC GPU

AMD is seeking a highly accomplished Principal Modeling Architect to join the Pr...

Location

United States, San Jose

Salary:

229600.00 - 344400.00 USD / Year

AMD

Expiration Date

Until further notice

Requirements

12+ years of experience in workload modeling, performance engineering, system architecture, or related technical domains
Demonstrated expertise in modeling and analyzing AI/ML, HPC, or large-scale data analytics workloads on GPU or accelerator platforms
Deep understanding of performance modeling methodologies, benchmarking tools, simulation environments, and workload characterization techniques
Experience collaborating across hardware, software, and system engineering teams to drive workload-informed architectural decisions
Strong analytical, communication, and technical writing skills
Ability to synthesize complex data into actionable insights
Advanced degree in Computer Science, Electrical Engineering, or related field preferred

Job Responsibility

Develop and refine workload modeling frameworks to characterize and project performance, scalability, and resource utilization for AI/ML, HPC, and data analytics workloads
Analyze emerging model architectures, datatypes, and scaling methodologies to anticipate future platform requirements
Collaborate with architecture, silicon design, software, and performance engineering teams to translate workload insights into platform-level technical requirements
Lead benchmarking, profiling, and simulation efforts to validate architectural assumptions and guide design trade-offs
Produce detailed workload characterization reports, performance projections, and sensitivity analyses to inform platform strategy and technical decision-making

Fulltime

Senior Principal AI Infrastructure Architect

The Senior Principal AI Infrastructure Architect is a highly skilled and advance...

Location

Italy, Milano

Salary:

Not provided

NTT DATA

Expiration Date

Until further notice

Requirements

Significant experience in a consulting, presales or architecture role within a large-scale (preferably multi-national) technology services environment, with a track record of leading AI infrastructure pursuits
Demonstrable experience designing and delivering production AI platforms — from single multi-GPU servers through to multi-rack training clusters and inference factories
Strong working knowledge of the AI hardware vendor landscape (NVIDIA, AMD, Intel, Dell, HPE, Lenovo, Supermicro, Cisco, Pure, VAST, WEKA, DDN, NetApp) and how to position partner ecosystems competitively
Proven ability to translate AI workload requirements (model size, parameter count, sequence length, throughput SLOs, latency targets) into accurate hardware bills of materials and sizing justifications
Significant client engagement and consulting experience, including client needs assessment, change management and the ability to identify whitespace for follow-on AI infrastructure and managed-services work
Significant business development and presales experience on infrastructure-led deals, ideally including sovereign AI, AI Factory or regulated-industry GenAI programmes
Strong understanding of how AI infrastructure integrates with business processes, applications, data platforms and existing enterprise architecture
Bachelor's degree or equivalent in Information Technology, Engineering, Computer Science or a related field
Deep, hands-on knowledge of AI hardware: GPU and accelerator portfolios (NVIDIA Hopper / Blackwell, AMD MI300/MI325, Intel Gaudi 3, emerging custom silicon), host CPU platforms (Intel Xeon, AMD EPYC, NVIDIA Grace), system topologies (HGX, DGX, MGX, OAM) and how each choice maps to specific AI workloads
Strong understanding of AI-class storage: parallel filesystems, all-flash NVMe platforms, S3-class object stores, checkpoint and dataset pipelines and the I/O patterns of large-scale training and inference (VAST, WEKA, DDN EXAScaler, Pure FlashBlade, NetApp ONTAP AI, Dell PowerScale)

Job Responsibility

Lead the end-to-end design of large, complex AI infrastructure solutions — covering accelerated compute (NVIDIA H100/H200/B200 and GB200 NVL72, AMD Instinct MI300X/MI325X, Intel Gaudi 3), CPU host platforms (Intel Xeon, AMD EPYC, NVIDIA Grace), high-throughput storage tiers and lossless AI fabric — for enterprise, sovereign AI and AI Factory clients
Architect reference designs built on NVIDIA DGX/HGX SuperPOD, Dell AI Factory with NVIDIA, Cisco Nexus HyperFabric AI, HPE / Lenovo / Supermicro accelerated compute and equivalent platforms, balancing single-node performance with cluster-scale efficiency
Size and validate GPU clusters against real workloads — foundation-model pre-training, distributed fine-tuning, RAG, real-time and batch inference — using the right combination of NVLink/NVSwitch domains, InfiniBand NDR/XDR or Ultra Ethernet / NVIDIA Spectrum-X fabrics and tiered NVMe and parallel storage (VAST, WEKA, DDN, Pure FlashBlade, NetApp ONTAP AI, Dell PowerScale)
Define the supporting datacenter design: high-density power (50–140 kW/rack), direct-to-chip and rear-door liquid cooling, structured cabling for AI fabrics and modular deployment models across on-prem, colo and sovereign-cloud footprints
Work closely with the sales team to drive the presales process for AI infrastructure pursuits — client discovery, technical workshops, proposal writing, executive presentations and bid defence
Translate clients' AI ambitions and business outcomes into a hardware and platform roadmap, positioning NTT DATA's end-to-end portfolio — silicon, systems, storage, fabric, MLOps stack and managed services — to land service-led AI solutions
Lead integration of compute, storage, networking, the AI software stack (CUDA, ROCm, Triton, NIM, NVIDIA AI Enterprise, Run:ai, Slurm, Kubernetes / Kubeflow) and managed-service operating models across multiple domains, delivery units and geographies
Build business cases, TCO and unit-economics models (cost per token, cost per training run, GPU-hour economics) and end-to-end transition roadmaps for cloud-to-private AI migrations and sovereign AI deployments
Define architectural principles for AI infrastructure — accelerator utilisation, data gravity, multi-tenancy, model lifecycle, energy efficiency — and apply them to influence architectural outcomes and governance
Develop As-Is, Vision, FMO and To-Be AI platform architectures, identify gaps and develop transition roadmaps

Fulltime

Senior Principal System Solution Architect

As Microsoft's cloud business continues to grow the ability to deploy new offeri...

Location

United States, Redmond

Salary:

163000.00 - 296400.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 9+ years technical engineering experience OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 11+ years technical engineering experience OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Technology Leadership – Drive concepts and definition for industry-leading platforms focused on the Azure data center, covering compute, storage, GPU, and AI accelerator-based solutions, with a strong focus on high-performance, low-latency networks at the forefront of density and speed
Cross-Functional Collaboration – Partner with silicon, firmware, and datacenter engineering teams to co-design infrastructure that meets performance, reliability, and deployment goals. Influence platform decisions across rack, chassis, and pod-level implementations
Technology Partnerships – Build strong relationships with our technology and development partners to drive leading-edge innovation into our next-generation products
Customer Focus – Partner across Microsoft teams and collaborate to deliver industry-leading products
Design Strategy – Champion innovative technical principles, design strategy, and forward-looking technologies related to industry trends
Architecture Clarity – Distill and articulate architectural tradeoffs for solution development, encompassing electrical, optical, signal integrity, mechanical, power, and thermal inputs, in terms of key metrics such as TCO, performance, schedule, and risk
Industry Influence – Drive and influence technology providers and design partners towards optimal components and solutions to meet the future requirements for Azure’s infrastructure

Fulltime

Principal AI Network Architect

Do you want to be at the forefront of innovating the latest hardware designs to ...

Location

United States, Redmond

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Job Responsibility

Leadership: Spearhead architecture definition and evaluation of AI accelerator platforms, with a focus on high-bandwidth, low-latency networks. Drive end-to-end optimization of the stack, from hardware to the software kernels
Cross-functional collaboration: Partner with silicon and platform design teams to co-design infrastructure that meets performance, reliability, and deployment goals. Frame decisions in terms of TCO, performance, flexibility, and scalability
Prototyping: Work with a state-of-the-art networking lab to prototype new network architectures
Industry influence: Participate in industry consortiums to shape standards and influence vendor roadmaps

Fulltime

Principal AI Network Architect

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the...

Location

United States, Redmond

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 7+ years technical engineering experience
OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 8+ years technical engineering experience
OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check
5+ years of experience in designing AI backend networks and integrating them into large-scale GPU systems
Proven expertise in system architecture across compute, networking, and accelerator domains
Deep understanding of RDMA protocols (RoCE, InfiniBand), congestion control (DCQCN), and Layer 2/3 routing
Experience with optical interconnects (e.g., PSM, WDM), link budget analysis, and transceiver integration
Familiarity with signal integrity modeling, link training, and physical layer optimization

Job Responsibility

Spearhead architectural definition and innovation for next-generation GPU and AI accelerator platforms, with a focus on ultra-high bandwidth, low-latency backend networks
Drive system-level integration across compute, storage, and interconnect domains to support scalable AI training workloads
Partner with silicon, firmware, and datacenter engineering teams to co-design infrastructure that meets performance, reliability, and deployment goals
Influence platform decisions across rack, chassis, and pod-level implementations
Cultivate deep technical relationships with silicon vendors, optics suppliers, and switch fabric providers to co-develop differentiated solutions
Represent Microsoft in joint architecture forums and technical workshops
Evaluate and articulate tradeoffs across electrical, mechanical, thermal, and signal integrity domains
Frame decisions in terms of TCO, performance, scalability, and deployment risk
Lead design reviews and contribute to PRDs and system specifications
Shape the direction of hyperscale AI infrastructure by engaging with standards bodies (e.g., IEEE 802.3), influencing component roadmaps, and driving adoption of novel interconnect protocols and topologies

Fulltime

Principal Memory Controller Architect

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the...

Location

United States, Raleigh

Salary:

163000.00 - 296400.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Doctorate in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 7+ years technical engineering experience OR Master's Degree in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 10+ years technical engineering experience OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Computer Science, or related field AND 12+ years technical engineering experience OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Job Responsibility

Architecting and developing memory controllers
Reviewing Memory technology roadmaps including, but not limited to: DDR5, DDR6, LPDDR, HBM, Type 3 CXL-based Memory, RDIMM, MRDIMM, LP-MRDIMM, and emerging memory technologies
Working closely with memory controller micro-architects, verification, and validation teams to drive features into production
Working with vendors to evaluate IP and make recommendations
Working with the Performance Modeling team to develop a cycle-approximate model of the controller and to analyze SOC/platform Azure workload results
Working closely with Strategic Planning and Architecture as well as internal customers to understand workload and use case requirements with specific focus on identifying full stack optimization opportunities within the context of the overall memory hierarchy
Collaborating across teams to come up with the best solution possible with a One Microsoft mindset
Challenging the status quo with a growth mindset to push the envelope and enable world-class SOC products across Microsoft

Fulltime
