Product Manager - AI Data Center Infrastructure

Hewlett Packard Enterprise

Location:
India, Bangalore

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are seeking a Product Line Manager (PLM) for AI Data Center Infrastructure to define and deliver next-generation data center networking platforms for large-scale GPU clusters. This role is ideal for a visionary, hands-on leader who understands how AI workloads stress networks at scale and can translate that insight into clear product requirements and roadmaps.

Job Responsibility:

  • AI Data Center & Fabric Architecture: Define product requirements for AI data center network architectures supporting thousands of GPUs
  • Develop requirements for low-latency Ethernet fabrics using Juniper QFX platforms and Apstra-based automation
  • Enable high-bandwidth GPU and NIC interconnects optimized for large-scale distributed training and inference workloads
  • GPU, NIC & Interconnect Strategy: Lead requirements definition for next-generation GPUs, NICs, and interconnect technologies, staying ahead of industry roadmaps
  • Drive alignment with NVIDIA and AMD ecosystems
  • Ensure interoperability across DAC, AEC, ACC, and optical transceivers between switches and NIC endpoints
  • Define scale-up paths using PCIe, NVLink, NVSwitch, ensuring GPU-to-GPU symmetry, consistency, and bandwidth determinism
  • Switching, Routing & Telemetry: Specify and optimize L2/L3 architectures, including EVPN-VXLAN, Class-E IPv4, and AI-optimized buffer tuning
  • Leverage hardware telemetry, streaming sensors, and analytics for proactive performance assurance
  • Drive automation using Python, Ansible, Apstra, Terraform, and related tools to enforce configuration consistency and compliance
  • Performance Optimization & Troubleshooting: Analyze GPU job performance to identify network hotspots, congestion, packet loss, and microbursts
  • Tune ECN, RDMA/RoCEv2, PFC, and traffic-engineering policies for AI workloads
  • Optimize server-to-switch interactions, including BIOS and firmware alignment, NIC queue and link-training parameters, and cable selection and management (AEC/ACC/optics)
  • Cross-Functional & Ecosystem Collaboration: Partner closely with AI platform teams, GPU system architects, data center operations, and strategic vendors (NVIDIA, AMD, Juniper)
  • Lead and participate in root-cause analysis for link flaps and training failures, FEC and PCS errors, and thermal or power-related performance degradation
  • Drive lab validation, scale testing, and certification of new optics, NIC firmware, and switch software releases

Requirements:

  • 5–10+ years of experience in data center networking, AI infrastructure, or HPC environments
  • Strong hands-on experience with Juniper QFX platforms and Junos OS
  • Deep understanding of GPU architectures: NVIDIA (H100/H200, GB200/GB300, NVLink/NVSwitch) and AMD (MI300/MI400, Pollara NICs, Infinity Fabric)
  • Proven expertise in scale-up GPU interconnects and scale-out Ethernet fabrics
  • Strong knowledge of RDMA/RoCEv2, ECN, PFC, and buffer management
  • Familiarity with distributed AI workloads and collective operations (NCCL, RCCL)
  • Hands-on troubleshooting experience with high-speed optics, AEC cables, link training, and NIC firmware
  • Proficiency in automation and scripting (Python, Ansible, Bash, Terraform)

Nice to have:

  • Certifications: JNCIE, CCIE, NCP-AII, NCA-AIIO, NCP-AIO, NCP-AIN
  • Experience with Apstra or other intent-based networking platforms
  • Knowledge of 1.6T optics, 200G PAM4 SerDes, and CPO/LPO architectures
  • Experience supporting liquid-cooled GPU clusters and rack-level power/network design
  • Understanding of data center operations, observability, and SLAs for AI training and inference clusters

What we offer:

  • Health & Wellbeing: comprehensive suite of benefits that supports physical, financial and emotional wellbeing
  • Personal & Professional Development: programs tailored to help you reach your career goals
  • Unconditional Inclusion: unconditionally inclusive in the way we work and celebrate individual uniqueness

Additional Information:

Job Posted:
March 04, 2026

Work Type:
Hybrid work

Similar Jobs for Product Manager - AI Data Center Infrastructure

Product Manager, Specialized Compute

Designs, plans, develops and manages a product or portfolio of products througho...
Location:
United States, Spring
Salary:
106000.00 - 243000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree or equivalent in computer science, engineering or related field of study
  • Proven experience as a Product Manager in the server, data center, or related hardware industry
  • Strong analytical skills with a knack for translating complex market data into clear, actionable business insights
  • Deep understanding of the technical and market dynamics of at least one of the following: AI servers, Edge computing, or Telco infrastructure
  • Exceptional communication and presentation skills, with the ability to articulate a vision and build consensus with both technical and non-technical audiences
  • A proactive and adaptable mindset, with the ability to manage multiple investigations and projects simultaneously in a fast-paced environment
Job Responsibility:
  • Conduct in-depth market research and competitive analysis to identify emerging trends, customer pain points, and new product opportunities across AI, Edge, and Telco markets
  • Develop and present data-driven business cases for new product initiatives, including market sizing, financial projections, and strategic alignment with company goals
  • Work closely with engineering, sales, and customers to define and prioritize detailed product requirements and user stories, ensuring they are well-documented and understood by all stakeholders
  • Navigate the unique technical and business challenges of each market, adapting your approach to meet the diverse needs of AI, Edge, and Telco customers
  • Act as the primary liaison between technical teams and business units, championing the product vision and ensuring alignment across the organization
What we offer:
  • Comprehensive suite of benefits that supports physical, financial and emotional wellbeing
  • Specific programs catered to helping you reach career goals
  • Inclusive work environment celebrating individual uniqueness
  • Fulltime

IS Data Center Operations Engineer

Bridging Information Technology (IT) and the Mechanical, Electrical, and Plumbin...
Location:
United States, New Albany
Salary:
91731.00 - 114948.00 USD / Year
Amgen
Expiration Date:
Until further notice
Requirements:
  • Master’s degree
  • Bachelor’s degree and 2 years of data center operations experience
  • Associate’s degree and 6 years of data center operations experience
  • High school diploma / GED and 8 years of data center operations experience
  • Hands-on experience with rack/stack, structured cabling, and IT hardware installation
  • Familiarity with Dell PowerEdge, Nutanix, NetApp, and Cisco platforms
  • Ability to interpret electrical and mechanical drawings (awareness-level competency)
  • Experience using monitoring, alerting, or automation systems (AI-enabled platforms preferred)
  • Solid understanding of IT operations concepts including hardware lifecycle management and disaster recovery
  • Ability to read and update documentation, diagrams, and cable records
Job Responsibility:
  • Serve as the liaison between IT teams and facilities staff, ensuring flawless communication
  • Interpret electrical one-line diagrams, distribution drawings, and cooling schematics to support incident response and planning
  • Install, rack, cable, and support enterprise IT systems including Dell PowerEdge, Nutanix, NetApp, and Cisco technologies
  • Support day-to-day moves, adds, and changes (MACs) in building IDF and VDER environments
  • Perform fiber and copper patch cabling in data centers, IDFs, and VDER closets
  • Trace and troubleshoot cabling issues to restore connectivity
  • Monitor infrastructure, proactively detect issues, and escalate them urgently to the appropriate teams
  • Apply AI-enabled monitoring and automation platforms to enhance data center operations
  • Maintain documentation of infrastructure layouts, procedures, and operational standards
  • Participate in capacity planning, disaster recovery drills, and continuous improvement initiatives
What we offer:
  • A comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
  • A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
  • Stock-based long-term incentives
  • Award-winning time-off plans
  • Flexible work models, including remote and hybrid work arrangements, where possible
  • Fulltime

Senior Product Manager for Data Center AIOps

Senior Product Manager for Data Center AIOps. HPE is seeking an experienced Seni...
Location:
United States, Sunnyvale
Salary:
136500.00 - 276500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science / Engineering, or equivalent practical experience in data center networking or cloud infrastructure
  • 6+ years of product management or closely related experience in data center networking, network operations, observability, or AIOps/AI-driven infrastructure products
  • Understanding of data center network architectures (leaf-spine and EVPN-VXLAN), intent-based networking, and operational workflows across Day 0 through Day 2+
  • Demonstrated ability to translate complex technical capabilities into clear customer value, business cases, and prioritized roadmaps in a fast-paced environment
  • Excellent communication and stakeholder management skills with a track record of driving alignment across engineering, sales, marketing, and support
  • Understanding of network operations, SRE, or managing large-scale data center or private cloud environments, including troubleshooting and incident response
  • Experience in building business cases, pricing, and go-to-market strategies for enterprise software or cloud services
Job Responsibility:
  • Influence the product vision and roadmap for Data Center AIOps across the HPE Networking portfolio, aligning with HPE’s AI-native networking strategy and broader data center portfolio
  • Define requirements and user journeys for capabilities such as cross-domain visibility, application awareness, anomaly detection, root cause analysis, impact analysis, and proactive assurance
  • Collaborate closely with engineering, data science, UX, and architecture teams to deliver cloud-hosted AIOps services integrated with Apstra’s intent-based networking and telemetry
  • Partner with GTM, sales, and marketing to craft positioning, packaging, and licensing for AIOps-driven features, including premium and value-added services for data center customers
  • Engage deeply with customers, partners, and field teams to gather feedback, validate hypotheses, and translate operational pain points into prioritized backlog items for AIOps
  • Define and track success metrics for AI and analytics features (e.g., MTTR reduction, automation adoption, model accuracy, recommendation utilization) and iterate based on real-world outcomes
  • Represent the product in customer briefings, business reviews, industry events, and analyst conversations as the subject matter expert for HPE’s Data Center AIOps strategy
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Fulltime

Sr. Data Center Solutions Product Manager

The Sr. Data Center Solutions Product Manager owns the system-level product defi...
Location:
United States, Santa Clara
Salary:
173840.00 - 260760.00 USD / Year
AMD
Expiration Date:
Until further notice
Requirements:
  • Senior-level, cross-functional, large-enterprise experience in Product Management, Solutions Product Management, Technical Product Management, Systems Engineering, or related roles in data center infrastructure, AI/ML platforms, HPC systems, enterprise solutions, or integrated hardware + software products
  • Strong working knowledge of several of the following: platform/system architecture (power/thermals, topology, I/O, networking, memory, RAS); firmware/driver and software stack dependencies; benchmarking methodology and reproducibility
  • Demonstrated experience defining and launching system-level solutions requiring coordination across hardware, firmware, drivers, software frameworks, and partner ecosystems
  • Bachelor's degree in Engineering, Computer Science, Computer Engineering, Electrical Engineering, or related technical field strongly desired
Job Responsibility:
  • Own system-level product definition (HW + SW)
  • Author and maintain System Requirements Specifications (SRS) for solutions spanning hardware platform, firmware/driver, software stack, compatibility, manageability, and lifecycle expectations
  • Define assumptions, constraints, non-goals, and acceptance criteria
  • Translate customer/segment needs into prioritized requirements
  • Consolidate and normalize customer, Sales, OEM/ODM partner, and internal stakeholder inputs into a structured, prioritized requirements backlog with decision rationale
  • Ensure requirements are workload-grounded and aligned to segment outcomes
  • Lead cross-functional dependency and integration planning
  • Identify dependencies across silicon/platform, firmware/drivers, ROCm/software stack, frameworks, OEM systems, and cloud images
  • Surface integration risks early, drive tradeoff discussions, and maintain decision logs
  • Set performance targets and benchmarking expectations (with TME/Eng)
  • Fulltime

Manager, Data Center Network Deployment & Support

Meta is hiring a Data Center Network Deployment & Support Manager to lead the de...
Location:
United States, Aiken
Salary:
162000.00 - 227000.00 USD / Year
Meta
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience in network engineering in large-scale data center network deployment, including hands-on responsibility for planning, building, or migrating production networks
  • 3+ years of direct people management experience leading network engineers responsible for Data Center network deployment and operations
  • Bachelor's degree in Computer Engineering, Math, Physics, Networking, or a related technical field (or equivalent practical experience)
  • Hands-on experience with Data Center switching, routing, and dedicated network topologies (“AI Zones”) for backend GPU communication
  • Proven experience leading and executing network deployment and migration projects
  • A proven track record in partnering with cross-functional teams and directly managing large-scale projects
  • Technical expertise in data center, enterprise, or service provider network infrastructure
  • Communication, stakeholder management, and problem-solving skills
  • Knowledge of data center design and operational best practices
  • Experience with process improvement and systems development, leveraging automation to streamline repetitive workflows
Job Responsibility:
  • Manage the deployment, configuration, and ongoing support of large-scale network infrastructure across data centers and global regions
  • Act as the escalation point for technical issues, providing hands-on support and guidance to resolve complex problems swiftly
  • Develop and maintain relationships with partner teams, vendors, and managed service providers
  • Collaborate with engineering, data center, backbone, AI, and equipment vendors to integrate new technologies and drive innovation
  • Manage project timelines for network turn-ups and ensure that milestones are met
  • Distribute the deployment workload among team members based on projections, OKRs, and metrics across skill sets and priorities
  • Ensure that standard operating procedures (MOPs, runbooks) are consistently followed
  • Contribute to the standardization and best practices of design, testing, and implementation of scalable network solutions
  • Lead by example and build team chemistry across regions to ensure alignment with organizational priorities
  • Provide mentorship, coaching, and career development to team members
What we offer:
  • bonus
  • equity
  • benefits

Senior Technical Product Marketing Manager

The Senior Technical Product Marketing Manager ( Integrated Solutions ) leads te...
Location:
United States, Austin
Salary:
156480.00 - 234720.00 USD / Year
AMD
Expiration Date:
Until further notice
Requirements:
  • Several years of senior-level technical experience in technical product marketing, solution marketing, or GTM programs in enterprise infrastructure, AI platforms, data center systems, or semiconductors
  • Deep technical knowledge of enterprise servers, accelerators/GPU platforms, HPC workloads, and system‑level integration (networking, storage)
  • Strong understanding of competitive data center ecosystems and positioning
  • Proven success leading complex cross‑functional marketing programs end‑to‑end
  • Ability to simplify and communicate technical depth as compelling customer value
  • Experience linking marketing investment to measurable business outcomes
  • Experience launching AI systems or full‑stack data center platforms
  • Background working with performance engineering or benchmarking teams
  • Vertical industry solution marketing experience
  • Bachelor’s degree in Engineering, Computer Science, Marketing, or related field desired
Job Responsibility:
  • Lead end‑to‑end GTM programs for AMD Integrated System Solutions, including AI systems and enterprise solution stacks
  • Drive program strategy, prioritization, budget planning, execution, and ROI measurement
  • Convert platform and solution strategy into scalable GTM motions across industries and workloads
  • Coordinate with Product Management, Engineering, Sales, and Corporate Marketing for integrated execution
  • Own technical marketing programs for AI systems launches, including platform architecture, performance, TCO, and scalability positioning
  • Develop workload‑specific narratives spanning AI training, inference, HPC‑AI convergence, and analytics
  • Build vertical solution motions across industries such as financial services, healthcare, manufacturing, research, telecom, and cloud service providers
  • Partner with Engineering and Product Management to define benchmark requirements and interpret performance data for GTM use
  • Develop differentiated technical storytelling grounded in AI performance, HPC workloads, virtualization, analytics, and hybrid cloud usage
  • Explain system‑level design considerations, performance drivers, and architectural tradeoffs
  • Fulltime

Solutions Architect – Campus, DCN Switching & Routing

We are looking for a seasoned TME/Networking Solutions Architect with deep exper...
Location:
China, Beijing
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Deep knowledge and hands-on experience in networking protocols: BGP, OSPF, EVPN, VXLAN, MCLAG, DRNI, ISSU, MACSec, DCI
  • Experience in Day 0 to Day 1 deployment of spine-leaf fabrics with any SDN controllers, micro segmentation, and service chaining
  • Working knowledge of automation and orchestration tools used in data center deployments
  • Familiarity with SDN controller architecture and integration with third-party services
  • Proven ability to engage with both technical and business stakeholders to design and defend high-impact networking solutions
  • Strong competitive knowledge of other vendor offerings — including campus solutions, 400G/800G switching platforms, and transceivers such as but not limited to QSFP-DD and OSFP
  • Excellent written and verbal communication skills, with the ability to create compelling documentation and technical collateral
Job Responsibility:
  • Serve as a trusted technical advisor for customers across AI data centers, enterprise campus networks, and service provider environments — identifying technical requirements, resolving pain points, and showcasing HPE’s end-to-end networking capabilities
  • Architect and support AI-ready Ethernet data center deployments using leaf-spine topologies, EVPN-VXLAN overlays, and RoCEv2 fabrics optimized for GPU-based workloads
  • Lead and participate in customer-facing workshops, whiteboard sessions, and technical deep dives across campus switching, data center fabrics, and edge routing solutions
  • Conduct Proof of Concepts (PoCs) and hands-on validations to assess performance, scale, Day-0 automation, telemetry, and orchestration tools in both data center and campus environments
  • Create and maintain design guidelines, infrastructure blueprints, and best practices for performance-optimized and scalable networking deployments across AI DC, enterprise, and router use cases
  • Collaborate with pre-sales and go-to-market teams to drive solution adoption and ensure alignment with customer needs and competitive differentiators
  • Contribute to RFP/RFI responses, creating comprehensive solution documentation including Bill of Materials (BoM), redundancy and topology planning
  • Work closely with product management and engineering, providing real-world field feedback to enhance product roadmaps around automation, telemetry, security, and feature development
  • Represent HPE at industry events, AI summits, and technology forums, highlighting the value of HPE’s networking portfolio in comparison to competitors
  • Stay ahead of the curve by tracking emerging trends, analysing the competitive landscape, and influencing internal strategies for next-gen network innovation
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Fulltime

Principal Technical Program Manager

The CO+I AI Delivery team is focused on delivering various platform services to ...
Location:
United States, Redmond
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree AND 6+ years experience in engineering, product/technical program management, data analysis, or product development OR equivalent experience
  • 3+ years of experience managing cross-functional and/or cross-team projects
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check
  • Proven experience leading complex, cross‑team technical programs with significant infrastructure or platform components
  • Strong technical foundation in one or more of the following: Cloud infrastructure and distributed systems, Large‑scale datacentre delivery projects, Hardware‑software integrations (compute, networking, storage, power, cooling)
  • Demonstrated ability to manage execution in ambiguous, fast‑moving environments
  • Excellent written and verbal communication skills, with experience presenting to senior leadership
  • Experience delivering or scaling AI, HPC, or GPU‑based platforms in production environments
  • Familiarity with data center operations, hardware lifecycle management, or global deployment programs
Job Responsibility:
  • Program Ownership & Execution: Own end‑to‑end technical programs focused on accelerating AI deployment timelines
  • Drive execution across multiple parallel workstreams
  • Establish clear success metrics and mechanisms
  • Document all artifacts appropriately
  • Cross‑Functional Leadership: Partner deeply with hardware engineering, software engineering, infrastructure, networking, data center operations, and supply chain teams
  • Act as the central point of coordination
  • Influence decision‑making with data, technical insight, and strong executive communication
  • Technical Rigor: Develop deep working knowledge of AI deployment architectures
  • Identify technical risks early and drive mitigation strategies
  • Translate complex technical concepts into clear, actionable plans
  • Fulltime