Principal AI Network Architect Job at Microsoft Corporation (Redmond)

Principal AI Network Architect

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the...

Location

United States, Redmond

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 7+ years technical engineering experience
OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 8+ years technical engineering experience
OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check
5+ years of experience in designing AI backend networks and integrating them into large-scale GPU systems
Proven expertise in system architecture across compute, networking, and accelerator domains
Deep understanding of RDMA protocols (RoCE, InfiniBand), congestion control (DCQCN), and Layer 2/3 routing
Experience with optical interconnects (e.g., PSM, WDM), link budget analysis, and transceiver integration
Familiarity with signal integrity modeling, link training, and physical layer optimization

Job Responsibility

Spearhead architectural definition and innovation for next-generation GPU and AI accelerator platforms, with a focus on ultra-high bandwidth, low-latency backend networks
Drive system-level integration across compute, storage, and interconnect domains to support scalable AI training workloads
Partner with silicon, firmware, and datacenter engineering teams to co-design infrastructure that meets performance, reliability, and deployment goals
Influence platform decisions across rack, chassis, and pod-level implementations
Cultivate deep technical relationships with silicon vendors, optics suppliers, and switch fabric providers to co-develop differentiated solutions
Represent Microsoft in joint architecture forums and technical workshops
Evaluate and articulate tradeoffs across electrical, mechanical, thermal, and signal integrity domains
Frame decisions in terms of TCO, performance, scalability, and deployment risk
Lead design reviews and contribute to PRDs and system specifications
Shape the direction of hyperscale AI infrastructure by engaging with standards bodies (e.g., IEEE 802.3), influencing component roadmaps, and driving adoption of novel interconnect protocols and topologies

Fulltime

Principal AI Network Architect

Do you want to be at the forefront of innovating the latest hardware designs to ...

Location

United States, Redmond

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
Master’s or Doctoral degree in Electrical Engineering, Computer Engineering, or related fields and 10+ years of technical experience in the domain
Deep expertise with Ethernet networking, RDMA (RoCE, InfiniBand), congestion control, and Layer 2/3 switching
Experience architecting scale-out/backend networks for AI GPU clusters
Familiarity with scale-up networks such as NVLink and UALink
Experience with high-radix Ethernet switches
Familiarity with AI model execution pipelines, with the ability to analyze communication flows and their impact on model performance
Prior contributions to standards committees and experience with hyperscale network deployments are an added benefit
Skilled in partnering and influencing architects, hardware engineers, and software leads

Job Responsibility

Leadership: Spearhead architecture definition and evaluation of AI accelerator platforms, with a focus on high-bandwidth, low-latency networks. Drive end-to-end optimization of the stack, from hardware to the software kernels
Cross-functional collaboration: Partner with silicon and platform design teams to co-design infrastructure that meets performance, reliability, and deployment goals. Frame decisions in terms of TCO, performance, flexibility, and scalability
Prototyping: Work with a state-of-the-art networking lab to prototype new network architectures
Industry influence: Participate in industry consortiums to shape standards and influence vendor roadmaps

Fulltime

Principal AI Architect

Wells Fargo is seeking a visionary Principal Systems Architect to shape the futu...

Location

United States, Iselin

Salary:

159000.00 - 305000.00 USD / Year

Wells Fargo

Expiration Date

June 25, 2026

Requirements

7+ years of architecture experience
7+ years of experience creating strategy
2+ years of experience with AI, GenAI, and Agentic AI solutions subject to Model Risk Management (MRM) and Artificial Intelligence Risk Review (AIRR) governance requirements

Job Responsibility

Act as an advisor to leadership to develop or influence applications, network, information security, database, operating systems, or web technologies for highly complex business and technical needs across multiple groups
Lead the strategy and resolution of highly complex and unique challenges requiring in-depth evaluation across multiple areas or the enterprise, delivering solutions that are long-term, large-scale and require vision, creativity, innovation, advanced analytical and inductive thinking
Translate advanced technology experience, an in-depth knowledge of the organization's tactical and strategic business objectives, the enterprise technological environment, the organizational structure, and strategic technological opportunities and requirements into technical engineering solutions
Provide vision, direction and expertise to leadership on implementing innovative and significant business solutions
Artificial Intelligence (AI) and Innovation - Promote a data-driven culture and drive architecture-led innovation
Lead architecture alignment for AI, GenAI, and Agentic AI solutions with Model Risk Management (MRM) and Artificial Intelligence Risk Review (AIRR) governance requirements, ensuring designs support required risk assessments, approvals, and enterprise control expectations
Partner with Model Risk Management, BCM, Legal, Compliance, Cyber, Data Use Assessment, and Risk Assessable Unit (RAU)-aligned stakeholders to ensure AI-enabled solutions are designed for appropriate model risk ranking, validation, explainability, control uplift, and readiness for AIRR and related tollgates where applicable
Define architecture patterns and engineering guardrails that support responsible AI, including traceability, monitoring, auditability, human-in-the-loop controls, secure data usage, resiliency, and change management across the AI service lifecycle
Ensure target-state architectures and implementation roadmaps account for post-deployment monitoring, control sustainability, and re-assessment triggers associated with model changes, scope expansion, data/input changes, platform changes, and evolving regulatory requirements
Advise business, product, and engineering leaders on how to accelerate AI adoption while meeting enterprise expectations for risk governance, model oversight, policy adherence, and safe deployment at scale

What we offer

Health benefits
401(k) Plan
Paid time off
Disability benefits
Life insurance, critical illness insurance, and accident insurance
Parental leave
Critical caregiving leave
Discounts and savings
Commuter benefits
Tuition reimbursement

Fulltime

Principal AI Architect

The Principal AI Architect designs, develops and implements advanced AI solution...

Location

United States, Waukesha

Salary:

Not provided

Energy Systems

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science or a related program
8 or more years of experience in AI, machine learning, or data science, with at least 4 years in a senior or lead architect role
Proven track record of designing and deploying large-scale AI systems in production environments
Experience leading cross-functional teams in the delivery of complex AI projects
Hands-on experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and AI frameworks (e.g., TensorFlow, PyTorch, scikit-learn)
Must be deeply curious, with a desire to experiment
Expertise in machine learning algorithms, neural networks, genetic algorithms, decision trees, business dynamic models, agent-based models, advanced statistical techniques, and operations research
Strong proficiency in programming languages such as Python, R, or Java
Ability to design scalable, secure, and efficient AI architectures
Exceptional problem-solving and analytical skills

Job Responsibility

Leads the design and development of AI architectures, including machine learning models, deep learning frameworks, and generative AI systems
Defines technical strategies and roadmaps for AI-driven projects ensuring alignment with business objectives
Collaborates with data scientists, data engineers, business functional and product teams to integrate AI solutions into production environments
Advises and oversees the evaluation and adoption of AI technologies, tools, and platforms
Serves as the technical leader and mentor to AI and engineering teams
Delivers scalable, secure, and optimized AI solutions
Leads the organization's analytic literacy and translates deep technical concepts into simple business vernacular
Leads industry trends and advancements in AI to help maintain a competitive edge
Communicates complex technical concepts to non-technical stakeholders effectively

Fulltime

Senior Principal AI Interconnect Architect

An AI Interconnect Architect defines and engineers high-speed networking and com...

Location

United States, Milpitas

Salary:

194425.00 - 322092.00 USD / Year

Sandisk

Expiration Date

Until further notice

Requirements

Master's or Ph.D. in Electrical Engineering, Computer Engineering, or Computer Science
10-15 years of experience developing interconnect technologies, including transport and link-level protocols, switching fabrics, QoS and reliable communication methods, and Software-Defined Networking
Familiarity with fabric topologies such as fat tree, leaf-spine (Clos), torus, and mesh, and their applicability to various workloads and system configurations
Familiarity with GPU/accelerator clusters and data center infrastructure
Deep working knowledge of interconnect technologies and protocols such as PCIe, CXL, NVLink, UALink, Ethernet, Ultra Ethernet, and serial links
Ability to develop performance models

Job Responsibility

Develop architectures for chip-to-chip interconnects and switched fabrics tailored for AI/ML scale-out
Analyze trade-offs in bandwidth, latency, power, area, and reliability
Participate in industry standards bodies and help shape the direction of industry specifications
Work with SoC, package design, and software teams to ensure seamless integration

What we offer

paid vacation time
paid sick leave
medical/dental/vision insurance
life, accident and disability insurance
tax-advantaged flexible spending and health savings accounts
employee assistance program
other voluntary benefit programs such as supplemental life and AD&D, legal plan, pet insurance, critical illness, accident and hospital indemnity
tuition reimbursement
transit
the Applause Program

Fulltime

Senior Principal AI Infrastructure Architect

The Senior Principal AI Infrastructure Architect is a highly skilled and advance...

Location

Italy, Milano

Salary:

Not provided

NTT DATA

Expiration Date

Until further notice

Requirements

Significant experience in a consulting, presales or architecture role within a large-scale (preferably multi-national) technology services environment, with a track record of leading AI infrastructure pursuits
Demonstrable experience designing and delivering production AI platforms — from single multi-GPU servers through to multi-rack training clusters and inference factories
Strong working knowledge of the AI hardware vendor landscape (NVIDIA, AMD, Intel, Dell, HPE, Lenovo, Supermicro, Cisco, Pure, VAST, WEKA, DDN, NetApp) and how to position partner ecosystems competitively
Proven ability to translate AI workload requirements (model size, parameter count, sequence length, throughput SLOs, latency targets) into accurate hardware bills of materials and sizing justifications
Significant client engagement and consulting experience, including client needs assessment, change management and the ability to identify whitespace for follow-on AI infrastructure and managed-services work
Significant business development and presales experience on infrastructure-led deals, ideally including sovereign AI, AI Factory or regulated-industry GenAI programmes
Strong understanding of how AI infrastructure integrates with business processes, applications, data platforms and existing enterprise architecture
Bachelor's degree or equivalent in Information Technology, Engineering, Computer Science or a related field
Deep, hands-on knowledge of AI hardware: GPU and accelerator portfolios (NVIDIA Hopper / Blackwell, AMD MI300/MI325, Intel Gaudi 3, emerging custom silicon), host CPU platforms (Intel Xeon, AMD EPYC, NVIDIA Grace), system topologies (HGX, DGX, MGX, OAM) and how each choice maps to specific AI workloads
Strong understanding of AI-class storage: parallel filesystems, all-flash NVMe platforms, S3-class object stores, checkpoint and dataset pipelines and the I/O patterns of large-scale training and inference (VAST, WEKA, DDN EXAScaler, Pure FlashBlade, NetApp ONTAP AI, Dell PowerScale)

Job Responsibility

Lead the end-to-end design of large, complex AI infrastructure solutions — covering accelerated compute (NVIDIA H100/H200/B200 and GB200 NVL72, AMD Instinct MI300X/MI325X, Intel Gaudi 3), CPU host platforms (Intel Xeon, AMD EPYC, NVIDIA Grace), high-throughput storage tiers and lossless AI fabric — for enterprise, sovereign AI and AI Factory clients
Architect reference designs built on NVIDIA DGX/HGX SuperPOD, Dell AI Factory with NVIDIA, Cisco Nexus HyperFabric AI, HPE / Lenovo / Supermicro accelerated compute and equivalent platforms, balancing single-node performance with cluster-scale efficiency
Size and validate GPU clusters against real workloads — foundation-model pre-training, distributed fine-tuning, RAG, real-time and batch inference — using the right combination of NVLink/NVSwitch domains, InfiniBand NDR/XDR or Ultra Ethernet / NVIDIA Spectrum-X fabrics and tiered NVMe and parallel storage (VAST, WEKA, DDN, Pure FlashBlade, NetApp ONTAP AI, Dell PowerScale)
Define the supporting datacenter design: high-density power (50–140 kW/rack), direct-to-chip and rear-door liquid cooling, structured cabling for AI fabrics and modular deployment models across on-prem, colo and sovereign-cloud footprints
Work closely with the sales team to drive the presales process for AI infrastructure pursuits — client discovery, technical workshops, proposal writing, executive presentations and bid defence
Translate clients' AI ambitions and business outcomes into a hardware and platform roadmap, positioning NTT DATA's end-to-end portfolio — silicon, systems, storage, fabric, MLOps stack and managed services — to land service-led AI solutions
Lead integration of compute, storage, networking, the AI software stack (CUDA, ROCm, Triton, NIM, NVIDIA AI Enterprise, Run:ai, Slurm, Kubernetes / Kubeflow) and managed-service operating models across multiple domains, delivery units and geographies
Build business cases, TCO and unit-economics models (cost per token, cost per training run, GPU-hour economics) and end-to-end transition roadmaps for cloud-to-private AI migrations and sovereign AI deployments
Define architectural principles for AI infrastructure — accelerator utilisation, data gravity, multi-tenancy, model lifecycle, energy efficiency — and apply them to influence architectural outcomes and governance
Develop As-Is, Vision, FMO and To-Be AI platform architectures, identify gaps and develop transition roadmaps

Fulltime

Principal Firmware Architect - Hyperscale & AI Rack-Based Compute Systems

The Principal Firmware Architect will be responsible for architecting server and...

Location

United States, Georgetown

Salary:

Not provided

Sanmina

Expiration Date

Until further notice

Requirements

Proficiency in one or more of the following: AMI BMC FW, OpenBMC FW, HP iLO, Dell iDRAC, UEFI FW (BIOS)
Experience with DMTF standards such as MCTP, NC‑SI, PLDM, OVF, Redfish, SPDM
Knowledge of security protocols, Root of Trust, and secure design principles
Experience with operating systems and driver design/usage
Strong background in Intel/AMD/ARM/GPU platform architectures
Strong understanding of Baseboard Management Controller (BMC) functionality, telemetry, and controls
Working knowledge of server operating systems including Windows Server (2016, 2019, 2022) and Linux (CentOS, Ubuntu, Fedora, SUSE)
Knowledge of virtualization technologies (VMware, Citrix, Microsoft)
Understanding of software driver implementation, IP schemas, and network protocols
Demonstrated ability to learn and apply new technologies

Job Responsibility

Develop long‑term hyperscale server firmware and security technology strategies based on customer needs
Develop, test, debug, and optimize firmware for ZT hyperscale compute/storage products and proof of concepts
Drive adoption of firmware development strategies internally and externally
Collaborate directly with customers on new firmware architectures for compute servers, storage servers, and add‑on cards
Solve performance and operational challenges to deliver business value through ZT firmware
Contribute firmware and security content to System Architecture Specifications for ZT server products
Build long‑term technical relationships within the firmware technology ecosystem to influence next‑generation server design
Align with customers and partners on security requirements and guide ZT engineering teams accordingly
Participate in in‑depth security reviews and drive compliance with industry standards
Engage in industry forums, workgroups, and consortiums related to firmware and security initiatives

What we offer

Competitive base salary
Performance-based annual bonus eligibility
401(k) retirement savings plan
Tuition reimbursement for eligible education programs
Comprehensive medical, dental, and vision coverage with access to leading providers
Mental health resources and employee wellness support programs
Company-paid life and disability insurance
Paid time off (PTO) and company-paid holidays
Parental leave and family care support programs
Structured training programs and on-the-job learning opportunities

Principal AI Security Enterprise Architect

The Principal AI Security Enterprise Architect at NTT DATA is a pivotal role res...

Location

Luxembourg, Diekirch

Salary:

Not provided

NTT DATA

Expiration Date

Until further notice

Requirements

Bachelor's degree or equivalent in Information Technology or Engineering or Computer Science or related field
Certification and expert knowledge of Enterprise Architecture methodologies (such as TOGAF, Zachman, SOA)
Certification and expert knowledge of IT Service Management methodologies (ITIL, COBIT, etc.)
Extended experience in a consulting and IT role within a large-scale (preferably multi-national) technology services environment
Extended experience in IT operations
Extended experience in a wide variety of Enterprise Application and Process areas and IT Services in Networking, Data Center, Communications, Security, End-User Computing, and Digital Business Solutions
Extended client engagement and consulting experience
Extended experience in integrating the solution for the project with the business domain, enterprise concerns, industry standards, established patterns, and best practices
Extended business development and pre-sales experience
Extended experience of the IT industry environment and business needs

Job Responsibility

Leads the design of large and complex managed service solutions by driving services teams
Works closely with the sales team to participate in and proactively drive the presales process with clients
Shares responsibility for win strategy and is responsible for translating clients' business strategy and desired business outcomes into an IT strategy/roadmap
Leads large complex solution design with clients
Integrates services, processes, applications, data, and technology through a design process across multiple domains, delivery units, or geographies
Contributes to the knowledge base of the organization's development and services by sharing best practices
Uses understanding of the client's business, industry practices, and breadth of knowledge on the full solution portfolio
Develops business case and end-to-end roadmap to optimize IT Operations and Measured Business Value
Interprets, influences, and develops IT strategies
Defines architectural principles and applies them to influence architectural outcomes

Fulltime
