Senior AI Network Architect

Microsoft Corporation

Location:
United States, Redmond

Contract Type:
Not provided

Salary:

119800.00 - 234700.00 USD / Year

Job Description:

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the team behind Microsoft’s expanding Cloud Infrastructure and is responsible for powering Microsoft’s “Intelligent Cloud” mission. SCHIE delivers the core infrastructure and foundational technologies for Microsoft's more than 200 online businesses, including Bing, MSN, Office 365, Xbox Live, Teams, OneDrive, and the Microsoft Azure platform, globally with our server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions. Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide, and we are looking for passionate engineers to help achieve that mission. As Microsoft's cloud business continues to grow, the ability to deploy new offerings and hardware infrastructure on time, in high volume, with high quality and at the lowest cost is of paramount importance. To achieve this goal, the Cloud Hardware Systems Engineering (CHSE) team is instrumental in defining and delivering operational measures of success for hardware manufacturing and in improving the planning process, quality, delivery, scale, and sustainability related to Microsoft cloud hardware. We are looking for seasoned engineers with a dedicated passion for customer-focused solutions, insight, and industry knowledge to envision and implement future technical solutions that will manage and optimize the Cloud infrastructure. We are looking for a Senior AI Network Architect to join the team.

Job Responsibility:

  • Spearhead architectural definition and innovation for next-generation GPU and AI accelerator platforms, with a focus on ultra-high bandwidth, low-latency backend networks
  • Drive system-level integration across compute, storage, and interconnect domains to support scalable AI training workloads
  • Partner with silicon, firmware, and datacenter engineering teams to co-design infrastructure that meets performance, reliability, and deployment goals
  • Influence platform decisions across rack, chassis, and pod-level implementations
  • Cultivate deep technical relationships with silicon vendors, optics suppliers, and switch fabric providers to co-develop differentiated solutions
  • Represent Microsoft in joint architecture forums and technical workshops
  • Evaluate and articulate tradeoffs across electrical, mechanical, thermal, and signal integrity domains
  • Frame decisions in terms of TCO, performance, scalability, and deployment risk
  • Lead design reviews and contribute to PRDs and system specifications
  • Shape the direction of hyperscale AI infrastructure by engaging with standards bodies (e.g., IEEE 802.3), influencing component roadmaps, and driving adoption of novel interconnect protocols and topologies

Requirements:

  • Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 3+ years technical engineering experience
  • OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or related field AND 5+ years technical engineering experience
  • OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check
  • 3+ years of experience in designing AI backend networks and integrating them into large-scale GPU systems
  • Proven expertise in system architecture across compute, networking, and accelerator domains
  • Deep understanding of RDMA protocols (RoCE, InfiniBand), congestion control (DCQCN), and Layer 2/3 routing
  • Experience with optical interconnects (e.g., PSM, WDM), link budget analysis, and transceiver integration
  • Familiarity with signal integrity modeling, link training, and physical layer optimization
  • Experience architecting backend networks for AI training and inference workloads, including Hamiltonian cycle traffic and collective operations (e.g., all-reduce, all-gather)
  • Hands-on design of high-radix switches (≥400Gbps per port), orthogonal chassis, and cabled backplanes
  • Knowledge of chip-to-chip and chip-to-module interfaces, including error correction and equalization techniques
  • Experience with custom NIC IPs and transport layers for secure, reliable packet delivery
  • Familiarity with AI model execution pipelines and their impact on pod-level network design and latency SLAs
  • Prior contributions to hyperscale deployments or cloud-scale AI infrastructure programs

Additional Information:

Job Posted:
March 19, 2026

Employment Type:
Fulltime
Work Type:
On-site work

Similar Jobs for Senior AI Network Architect

Senior Solution Architect AI & HPC

AI is a high-growth market for HPE, and we believe we are uniquely suited to bri...
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Engineering, Computer Science, or similar quantitative focus preferred
  • Ability to quickly prototype functionality into scripts for demos, integrations, troubleshooting, etc.
  • Expertise in cloud architectures, specifically with public cloud platforms such as AWS, Azure, or Google Cloud
  • Strong understanding of AI technologies, including machine learning, deep learning, and neural networks
  • Experience participating in solution configurations and the creation of PoCs to meet customer requirements
  • Solid knowledge of infrastructure components, including servers, storage, networking, and virtualization
  • Experience with high-performance computing (HPC) and GPU-accelerated systems is advantageous
  • Demonstrates expert technical skills in assigned area of specialization
  • Expert knowledge of the company offerings, strategic initiatives, current trends, competitor products and strategies within area of responsibility
  • Expert level written and verbal communication skills and mastery over English and local language
Job Responsibility:
  • Collaborate with sales teams to understand customer requirements and develop tailored solutions for their AI infrastructure needs
  • Engage in pre-sales activities, including technical presentations, demonstrations, and proof-of-concepts
  • Act as a trusted advisor to customers, addressing their questions, concerns, and technical challenges effectively
  • Stay up-to-date with the latest advancements in AI technologies, cloud architectures, and infrastructure trends
  • Lead Proof-of-Concepts (PoC) for HPE customers expanding into Deep Learning or Machine Learning use cases
  • Architect reusable end-to-end AI solutions for HPE customers and prospects
  • Lead technical discussions with customers and partners to propose HPE and partner integrated solutions
  • Identify solutions, define action plans, and help coordinate and deliver optimal solutions and enhancements
  • Recommend configurations and settings for different types of hardware and interconnect fabrics
  • Assist in any product or technical issue towards an initial sale or renewal of a customer
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits that supports physical, financial and emotional wellbeing
Employment Type:
Fulltime

Senior DevOps & AI Engineer

This role presents a unique opportunity to contribute to the future of impactful...
Location:
India, Hyderabad
Salary:
Not provided
Fission Labs
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Engineering, or related field
  • 6+ years of experience in infrastructure management roles, with a focus on cloud platforms (Azure and AWS preferred)
  • Hands-on experience with operations (DevSecOps) principles and best practices
  • Proficiency in scripting languages such as Python, PowerShell, or Bash
  • Excellent communication and collaboration skills
  • In-depth knowledge of Linux operating systems, including CentOS, Ubuntu, and Red Hat, with expertise in shell scripting, package management, and system administration
  • Hands-on experience with a wide range of AWS and Azure services
  • Develop and maintain Infrastructure as Code (IaC) templates using tools such as Terraform or AWS CloudFormation
  • Experience setting up cloud infrastructure stacks, databases, service endpoints, and GPU as well as CPU resource scaling, optimization, etc.
  • Should have worked with AIOps/MLOps
Job Responsibility:
  • Configure and optimize Linux-based servers for performance, security, and resource utilization, including kernel tuning, file system management, and network configuration
  • Architect cloud solutions leveraging best practices and services offered by AWS and Azure, optimizing for scalability, reliability, and cost-effectiveness
  • Implement and manage hybrid cloud environments, facilitating seamless integration and interoperability between AWS and Azure services
  • Establish version control practices for IaC templates, ensuring traceability, auditability, and reproducibility of infrastructure changes
What we offer:
  • Opportunity to work on impactful technical challenges with global reach
  • Vast opportunities for self-development, including online university access and knowledge sharing opportunities
  • Sponsored Tech Talks & Hackathons to foster innovation and learning
  • Generous benefits packages including health insurance, retirement benefits, flexible work hours, and more
  • Supportive work environment with forums to explore passions beyond work
Employment Type:
Fulltime

Senior Java Architect & Cloud Engineer

The Equity Middle Office technology group is actively transforming its technolog...
Location:
Singapore, Singapore
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • Degree in Computer Science or Electronic/Electrical Engineering
  • ~15 years of Banking Software development experience, including management experiences or equivalent
  • Knowledge of low-latency frameworks such as Chronicle / garbage-free programming in Java
  • Knowledge in IT Infrastructure (i.e. IT Networks, Communications, and Data Centre Management) and Infra Support Operations
  • Working experience in Linux operating system, Windows, Groovy, Python, JavaScript, Java, ELK, Bitbucket, Jenkins, Confluence, SonarQube, Nexus and scripting experience to do integrations through API, CLI for extracting data and to perform automated operations
  • Very strong experience in shell scripting and batch scripting for automation and command-line integration, and in invoking REST APIs using Postman, is mandatory
  • Must have hands-on experience in building microservices using Java and the Spring Boot framework stack
  • Working experience with messaging platforms such as AMPS, TIBCO, Solace and MQ
  • Experience with relational SQL and NoSQL databases
  • Strong knowledge and experience in DevOps automation, containerization and orchestration using tools such as Gradle, Maven, Docker, Kubernetes, Terraform, Artifactory
Job Responsibility:
  • Be recognized as a trusted partner for business application owners and other technology teams who seek to make use of Cloud based infrastructure
  • Define the technology roadmap and prioritize technical resources to achieve maximum success
  • Ensuring the platform conforms to security best practices and is fully consistent with banking audit and compliance requirements and fully consistent with the design ethos and technical requirements of external cloud providers
  • Supporting adoption of containers and container control frameworks for internal Cloud Services, including container platform selection and design and ensuring that self-service design/deployment/control web containers is appropriate for requirements
  • Ensuring lifecycle management consists of documentation such as test cases, source code repositories etc are actively used and maintained
  • Recommend new services to complement and enhance infrastructure elements to streamline and support applications development and deployment
  • Developing highly available infrastructures in a cloud services environment, preferably with cloud providers such as OpenShift or AWS
  • Implement Continuous Integration / Continuous Deployment practices, tooling, and techniques, particularly evidence of leading organizational and cultural change to adopt CI/CD practices (Jira, Confluence, Bitbucket, Git, Jenkins, Artifactory, Terraform, Packer, Rundeck, Ansible, AWS, ELK, AppDynamics)
  • Enable AI based monitoring automation to effectively detect/predict/prevent issues in the environment and code base
Employment Type:
Fulltime

Senior Software Developer

Senior Software Developer role at Hewlett Packard Enterprise focused on AI and m...
Location:
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in computer science, engineering, data science, machine learning, artificial intelligence, or closely related quantitative discipline
  • Typically 4-7 years' experience
  • Deep understanding of machine learning algorithms (linear regression, decision trees, support vector machines, random forests, deep learning models, reinforcement learning)
  • Strong foundation in mathematics and statistics (linear algebra, calculus, probability theory)
  • Proficiency in programming languages such as Python, R, or Java
  • Experience with software engineering best practices and version control systems (Git)
  • Knowledge of libraries and frameworks like TensorFlow, PyTorch, scikit-learn, Keras
  • Advanced knowledge in deep learning and neural network architectures
  • Proficiency in using agentic frameworks like LangGraph
  • Knowledge of evaluation of traditional AI/ML and Gen-AI based applications
Job Responsibility:
  • Conduct advanced research in AI and machine learning
  • Design and architect AI solutions for complex problems
  • Provide technical guidance and mentorship to junior team members
  • Work with stakeholders to translate requirements into technical solutions
  • Drive continuous improvement and innovation in AI/ML practices
  • Evaluate and integrate third-party tools or services
  • Facilitate design review sessions
  • Collaborate with engineering manager and team lead
  • Prepare and deliver presentations to stakeholders
  • Design and develop solutions to complex application problems
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive benefits suite supporting physical, financial and emotional wellbeing
Employment Type:
Fulltime

Senior Information Technology Engineer

The IT Systems Engineer is responsible for architecting, securing, and scaling L...
Location:
India, Pune
Salary:
Not provided
LogicMonitor
Expiration Date:
Until further notice
Requirements:
  • 5+ years of IT experience in a global high-tech environment
  • 5+ years of hands-on networking experience (enterprise/global scale)
  • Strong experience managing Cisco switches, Fortinet firewalls, VPNs, and wireless infrastructure
  • Demonstrated experience with Zero-Trust Network Architecture (ZTNA), Secure Web Gateways, and CASB (preferred: Cloudflare)
  • Proficiency with Terraform for Infrastructure-as-Code and familiarity with GitOps practices
  • Strong understanding of networking in cloud environments (AWS, GCP, Azure)
  • Familiarity with FedRAMP/GovCloud requirements preferred
  • Experience using AI tools to enhance productivity, innovation, or problem-solving
  • Solid Linux systems experience
  • macOS networking and certificate compatibility knowledge required
Job Responsibility:
  • Own Cloudflare ZTNA and Secure Web Gateway end-to-end: design, policy enforcement, monitoring, troubleshooting, and Terraform-based configuration
  • Handle multiple instances of Cloudflare ZTNA, covering commercial and government infrastructure
  • Ensure compatibility and reliability of certificates and macOS networking with SWG/Zero-Trust controls
  • Architect and administer global networking across offices, data centers, and multi-cloud (AWS, GCP, Azure) environments
  • Manage Cisco switches, Fortinet firewalls, VPNs, Wi-Fi, and global remote access infrastructure
  • Implement Infrastructure-as-Code practices with Terraform and support GitOps workflows
  • Deliver and maintain network observability dashboards, SLAs, and uptime reporting using LogicMonitor
  • Partner with Security and Technical Operations to maintain compliance in both commercial and FedRAMP environments
  • Ability to work within an on-call rotation schedule and be available after hours for specialized support
  • Proactively identify opportunities for AI-driven automation within IT operations and quietly deliver solutions that reduce manual workloads

Senior Security Engineer

As an AI engineer on the Security Detections and Operations team you will build and s...
Location:
United States, New York; Washington DC; Seattle; San Francisco; Austin
Salary:
146300.00 - 235000.00 USD / Year
Atlassian
Expiration Date:
Until further notice
Requirements:
  • Fluency in at least one modern object-oriented programming language (preferably Python, Java/Kotlin)
  • B.S./M.S. in Engineering or STEM discipline with emphasis on Data Science/AI
  • At least 1-4 years experience in real world AI cybersecurity applications (along with 6+ years of Data Science Experience)
  • Deep background in statistical modeling and techniques and experience with multimodal data sets
  • Understanding of Machine Learning project lifecycle and tools
  • Experience in architecting and implementing high-performance RESTful microservices
  • Experience building and operating large scale distributed systems using Amazon Web Services (S3, Kinesis, CloudFormation, EKS, AWS Security and Networking)
  • Experience with Continuous Delivery and Continuous Integration
Job Responsibility:
  • Regularly tackle the largest and most complex problems in the team, from technical design to launch
  • Deliver solutions that are used by other teams and functional areas
  • Deliver AI/ML solutions which have a tangible business outcome - i.e. improve detection rate/accuracy, reduce manual toil cost and demonstrate improvement in detection and response metrics
  • Design and Deploy Tools for Proactive Threat Detection: Leverage AI/ML models to identify anomalous behavior and potential threats in real time, reducing the time to detect breaches
  • Design Automated Incident Response Workflows: Drive AI/ML-driven automation to enable rapid containment and mitigation of threats, minimizing downtime and impact
  • Design Threat Intelligence Enrichment: Leverage AI/ML to analyze large volumes of threat intelligence to provide actionable insights to endpoints, email, malware, network and application threats
What we offer:
  • Health coverage
  • Paid volunteer days
  • Wellness resources
Employment Type:
Fulltime

Senior Account Executive

The Senior Account Executive will be responsible for leading compute sales into ...
Location:
United States, Los Angeles
Salary:
210500.00 - 495000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Information Technology, Engineering, or Business or equivalent work experience
  • Minimum 5 years of enterprise sales experience with compute, infrastructure, or high-performance technology platforms
  • Proven success developing and managing large enterprise accounts within high-tech manufacturing, semiconductor, or cloud software sectors
  • Deep understanding of compute and networking architectures across hybrid and hyperscale environments
  • Experience coordinating with cross-functional engineering and product teams on large-scale solutions
  • Effective executive communication and proposal development skills
Job Responsibility:
  • Develop and execute account strategies targeting large enterprise customers in Irvine, San Diego, and the Greater Los Angeles area
  • Lead customer discussions on compute performance optimization, edge integrations, and AI workload enablement
  • Partner with account leader, solution architects and product specialists to design proposals addressing compute
  • Manage multimillion-dollar negotiation cycles, including pricing alignment, contract structuring, and executive engagement
  • Maintain a robust sales funnel through direct enterprise relationships and partner ecosystems
What we offer:
  • Comprehensive suite of benefits that supports physical, financial and emotional wellbeing
  • Programs to help in career development
  • Inclusive work environment
Employment Type:
Fulltime

Senior Principal Technical Program Manager - ML Platform

Salary:
231300.00 - 301975.00 USD / Year
Atlassian
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience on software teams as Development Manager, Technical Product Manager or TPM leading technical platform areas
  • Deep domain experience in AI and/or Search. Example: Model Inference, Model Evaluation, Model Training, LLM Ops, Semantic Search, Search Relevance, etc.
  • Partner with Engineering in defining direction, strategy and execution at Platform level
  • Strategic thinking and ability to understand business objectives to translate them into technical problems and programs.
  • Technical understanding of systems involved. Willingness to develop domain expertise in the area they operate - storage, networking, authentication, capacity management, service deployments, etc.
  • TPMs are not expected to write or read code, but are expected to understand system flows, block architectures, APIs and such.
  • Experience defining and running end-to-end complex technical programs
  • Strong leadership, organizational, and communication skills
Job Responsibility:
  • Understand and stay up-to-date on latest innovations in AI and Search. Partner closely with engineering teams to translate these into practical platform evolution for Atlassian bringing value to our customers.
  • Analyze business objectives, customer needs, product adoption inhibitors and opportunities, industry trends, and based on these, in close collaboration with your stakeholders, define a long-term strategy and roadmap for your platform and product components.
  • Understand business objectives and translate them into technical systems problems that need to be prioritized and solved in the current business environment.
  • Define specific systems programs and create a plan of action for realizing those programs. Such programs could be around capacity planning, migration efforts, high availability, network architecture, performance optimization, reliability improvements and more.
  • Use your technical understanding of Atlassian and related systems to partner with and influence engineers and architects in making progress on these problems.
  • Responsible for taking a systematic approach to engineering problems. This includes: prioritizing tasks, scoping out the project, defining objectives, and making consistent progress against each of these.
  • Be accountable for the success of these technical programs by managing the entire lifecycle from initiation to forecasting, budgeting, scheduling, etc.
  • Manage complex dependencies and projects with a broad scope across the company
What we offer:
  • health and wellbeing resources
  • paid volunteer days