Compute Server Platform Architect

Cerebras Systems

Location:
United States; Canada, Sunnyvale

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As a Compute / Server Platform Architect on the Cluster Architecture Team, you will own the server-side platform architecture that enables Cerebras CS-3-based AI clusters (training and inference) to deliver predictable performance, scalability, and reliability. Our accelerators are network-attached, so the x86 server fleet is a first-class part of the end-to-end system: it runs critical-path runtime functions (for example, orchestration, prompt caching, and IO/control services) and must be co-designed with software for token-level latency, throughput, and cost efficiency. You will translate workload behavior into CPU, memory, IO, PCIe, and host-networking requirements, drive platform evaluations with vendors, and provide technical leadership through qualification and production adoption in close partnership with other function leaders and TPMs.

Job Responsibility:

  • Own the architecture for all server roles in Cerebras clusters, including definitions of server types, configurations, and lifecycle strategy
  • Define and maintain server formulas (counts and ratios per CS-3 count, cluster size, and workload type) including capacity planning and headroom policy
  • Specify platform configurations: CPU SKU and core strategy, vendor roadmap (e.g., AMD, Intel, ARM), memory topology (channels, DIMM type, capacity), PCIe topology and lane budgeting, NIC selection/placement, and local NVMe policy where applicable
  • Translate software and runtime flows into measurable hardware requirements (CPU utilization, memory bandwidth/latency, bursty IO patterns, queueing and concurrency limits) and communicate clear guardrails back to software teams
  • Develop performance and scaling models; validate with microbenchmarks and workload-level experiments; identify bottlenecks and drive cross-stack fixes
  • Define the OS, BIOS, firmware, and driver baseline for each server type; other teams follow these recommendations and apply them across our fleet
  • Stay current on emerging server technologies (CPU generations, new memory technologies, CXL, NVMe evolutions, SmartNIC/DPU capabilities where relevant) and run proof-of-concept evaluations to determine when to adopt
  • Lead technical vendor engagements (OEM/ODM and component vendors): influence roadmap, request platform knobs, and drive joint debugging on performance or reliability issues
  • Define qualification and acceptance criteria (performance, stability, operability) and partner with the Infrastructure Hardware TPM to execute qualification plans and land changes cleanly into production
  • Support bring-up and rare deployment debugging in lab and staging environments; drive root-cause analysis for regressions spanning firmware, drivers, OS, and runtime behavior

Requirements:

  • PhD in Computer Science or Electrical/Computer Engineering + 8 years of industry experience, or Master’s/Bachelor’s in CS or EE + 10 years of industry experience
  • 5+ years of experience in server platform architecture, systems performance engineering, or large-scale infrastructure design for AI/ML, HPC, or performance-sensitive distributed systems
  • Deep understanding of x86 server architecture: CPU microarchitecture basics, cache hierarchies, NUMA, memory controllers/channels, and memory bandwidth vs latency tradeoffs
  • Strong Linux systems knowledge: profiling and performance analysis, scheduling and syscall overheads, memory management behavior, and practical tuning methodology
  • Experience reasoning about high-performance IO paths, including NIC behavior at a systems level, RDMA/RoCE concepts, and NVMe performance characteristics
  • Proven ability to create capacity and performance models and validate them empirically with a rigorous benchmarking plan
  • Experience working directly with vendors/partners to evaluate platforms, drive issue resolution, and influence roadmaps
  • Strong cross-functional communication skills and ability to drive technical decisions through clear tradeoff documents and reviews
  • Familiarity with application and system software (C, C++, Python)

What we offer:

  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open source your cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • Enjoy our simple, non-corporate work culture that respects individual beliefs

Additional Information:

Job Posted:
March 09, 2026

Similar Jobs for Compute Server Platform Architect

As-a-Service Solution Architect Intern

Hewlett Packard Enterprise offers an internship for university students to gain ...
Location:
Chile, Santiago
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Actively enrolled in a Bachelor’s degree program in Computer Engineering, Infrastructure and Technology Platforms Engineering, Administration or Business Administration with strong technology orientation
  • Demonstrated background in Information Technology fundamentals (cloud, AI, servers, storage, and networks)
  • Fluent English (oral and written)
  • Strong commercial mindset
  • Results-oriented approach
  • Ability to collaborate and build relationships
  • Effective communication skills
  • Creative thinking and problem-solving
  • Excellent time management, prioritization, and self-organization
  • Curiosity, creativity, and desire for continuous learning
Job Responsibility:
  • Learn and support solution development based on HPE’s Hybrid Cloud, AI, and As-a-Service portfolio
  • Gain hands-on exposure to technologies such as virtualization, cloud computing, infrastructure, and data protection
  • Assist in creating packaged services, Statements of Work (SoW), and other deliverables
  • Support presales and sales teams in identifying opportunities
  • Prepare technology demonstrations
  • Showcase HPE technologies through presentations, demos, and validations
  • Assist in research and analysis of market trends
  • Help map customer pain points to Hybrid Cloud use cases
  • Participate in the creation of customized advisory reports
  • Collaborate with senior advisors to document findings
What we offer:
  • Comprehensive benefits suite
  • Professional development programs
  • Inclusion and diversity initiatives
Job Type:
Part-time

iLO Test Architect

iLO Test Architect role at Hewlett Packard Enterprise focused on developing test...
Location:
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Information Systems, or equivalent
  • Typically 12+ years' experience in system firmware/BIOS/BMC validation
  • Experience in reviewing and contributing to designing and developing software systems design tools and languages
  • Excellent analytical and problem-solving skills
  • Experience in overall architecture of software systems for products and solutions
  • Designing and integrating software systems running on multiple platform types into overall architecture
  • Evaluating and selecting forms and processes for software systems testing and methodology
  • History of innovation with multiple patents or deployed solutions in the field of software design
  • Excellent written and verbal communication skills; mastery in English
Job Responsibility:
  • Develops Test Strategy for organization-wide architectures and methodologies for Firmware design and development
  • Identifies and evaluates new technologies, innovations, tools for alignment with technology roadmap and business value
  • Reviews and evaluates designs and project activities for compliance with development guidelines and standards
  • Leverages recognized domain expertise to influence decisions of executive business leadership
  • Provides guidance and mentoring to less-experienced staff members
  • Defining test strategy, validation methodologies, providing automation inputs
  • Ensuring compliance with industry standards
  • Guaranteeing the quality, reliability, and security of enterprise-class server firmware
  • Work with various teams across the product lifecycle
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
Job Type:
Full-time

Firmware Experience Architect

Seeking Firmware Experience Architect to work on HPE iLO (Integrated Lights Out)...
Location:
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or master's degree in Electronics & Communication, Computer Science, Information Systems, or equivalent
  • Typically 15+ years' experience
  • Strong understanding of the Server manageability domain and Server Industry
  • Design, debugging, and development on RTOSes such as Green Hills INTEGRITY and Embedded Linux
  • Experience working with ARM processor or similar system controllers
  • BMC / OpenBMC Experience
  • Experience with Data Centre infrastructure setup and manageability domain
  • Strong understanding of hardware and software interactions and various protocols
  • Prior experience in understanding and elaborating complex product requirements
  • Ability to convert product requirements into small features/tasks with clear acceptance criteria and impact
Job Responsibility:
  • Develops organization-wide architectures, strategies, and methodologies for software systems design and development
  • Identifies and makes informed recommendations regarding new technologies, innovations, and outsourced development partner relationships
  • Work continuously with Product Management and partners to refine, prioritize, and elaborate new requirements
  • Elaborating complex requirements into effective solutions
  • Reviews, evaluates, and influences designs and project activities for compliance with development guidelines and standards
  • Contributes domain expertise, business acumen, and experience to influence decisions of executive business leadership
  • Provides guidance and mentoring to less-experienced team members
  • Acts as an internal authority on software systems design
  • Contributes to the external technical community through whitepapers, patents, or other significant innovations
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
Job Type:
Full-time

Senior/Principal Solutions Architect - Advanced Engagement Systems

We are seeking a Solutions Architect who can fulfill the role of a platform and ...
Location:
United States, Albuquerque
Salary:
117500.00 - 235700.00 USD / Year
Sandia National Laboratories
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in relevant field of Information Technology (IT) or related software or computer engineering degree
  • 5 years of experience in three or more desired qualifications
  • Ability to travel infrequently (once to twice per year, on average)
  • Ability to obtain and maintain a DOE Q-level security clearance
Job Responsibility:
  • Design, develop and maintain the underlying platform infrastructure on a variety of embedded RF hardware solutions which run Digital Signal Processing code on edge Linux systems
  • Load, configure, and utilize operating systems, develop platform automation tools, design and configure networks, run static code analyzers, implement cyber security best practices, scripting and software development
  • Configure and manage RF laboratory equipment to include test beds comprised of full systems (User Interface server, embedded RF processing digital electronics, and RF Front End systems)
  • Ensure standup and maintenance of development computers, servers running system code, and networking with the RF laboratory
  • Continued improvement to the existing development environment (Gitlab and Continuous Integration and Continuous Deployment technologies)
  • Provide automated platform solutions by researching technology fundamentals, testing and evaluating hypotheses in a controlled test environment, and then planning and implementing solutions that satisfy requirements
  • Maintain and grow technical expertise in all areas of computing platforms including servers, storage, network devices, and security devices (SIEMs, ACSs, and firewalls), Operating Systems (Linux), and virtualization to facilitate continuous platform operations and troubleshooting time-sensitive issues
  • Evaluating and reporting on new platform technologies and recommending viable and scalable solutions for incremental enhancements
What we offer:
  • Challenging work with amazing impact
  • Extraordinary co-workers
  • Some of the best tools, equipment, and research facilities in the world
  • Career advancement and enrichment opportunities
  • Flexible work arrangements for many positions include 9/80 (work 80 hours every two weeks, with every other Friday off) and 4/10 (work 4 ten-hour days each week) compressed workweeks, part-time work, and telecommuting (a mix of onsite work and working from home)
  • Generous vacation, strong medical and other benefits, competitive 401k, learning opportunities, relocation assistance and amenities aimed at creating a solid work/life balance
Job Type:
Full-time

Power Platform Architect

Valorem Reply, part of the Reply Network, is a leader in Microsoft-based IT solu...
Location:
United States, Chicago, Illinois; Seattle, Washington; Detroit Area, Michigan
Salary:
140000.00 - 170000.00 USD / Year
Valorem Reply
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in computer science, engineering, or related field
  • 5 years of experience in developing applications using the Microsoft Power Platform and Azure
  • 2 years of experience developing AI solutions
  • Strong knowledge of Power Apps, Power Automate, Power BI, Power Pages, and Copilot Studio, and their capabilities including proficiency in using various data sources and connectors, such as Dataverse, SQL Server, SharePoint, Excel, etc
  • Experience in integrating Power Platform applications with other Microsoft and third-party services, such as custom solutions, Azure, Dynamics 365, Office 365, etc
Job Responsibility:
  • Architect and develop traditional applications using the Microsoft Power Platform, leveraging its capabilities and features to meet business requirements and user needs using various app types, such as canvas, model-driven, Copilot Studio and portal apps
  • Architect and develop AI / agentic solutions using Copilot Studio and Foundry and use AI tools to assist in the end-to-end project lifecycle
  • Provide expertise in data storage and access using Dataverse, SQL Server, Azure Blob Storage, and Azure AI Search. Work closely with data engineers to provide data requirements for AI and traditional solutions
  • Deploy and manage Power Platform applications using DevOps tools and processes and ensure compliance with security and governance policies and apply deployment strategies, such as packaging, importing, exporting, and versioning
  • Integrate Power Platform applications with other Microsoft and third-party services, such as custom applications, Azure services, SharePoint, Dynamics 365, SQL Server, etc. using various connectors, such as standard, custom, and premium connectors, and apply integration patterns, such as orchestration, mediation, and transformation
  • Perform unit testing, debugging, and troubleshooting of Power Platform applications, and provide technical support and maintenance
  • Serve as technical lead on projects, directing other Power Platform developers, providing code reviews and feedback, and ensuring proper coding standards (such as naming conventions, code formatting, and code commenting) while building a strong, healthy team culture
  • Provide technical sales guidance and estimations, and grow opportunities by establishing deep relationships with prospective and existing clients
Job Type:
Full-time

Power Platform Architect

As a Power Platform Architect, you will design and deliver innovative solutions ...
Location:
United States, Atlanta, Georgia; Kansas City, Missouri; Philadelphia Area, Pennsylvania
Salary:
Not provided
Valorem Reply
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in computer science, engineering, or related field
  • 5 years of experience in developing applications using the Microsoft Power Platform and Azure
  • 2 years of experience developing AI solutions
  • Strong knowledge of Power Apps, Power Automate, Power BI, Power Pages, and Copilot Studio, and their capabilities including proficiency in using various data sources and connectors, such as Dataverse, SQL Server, SharePoint, Excel, etc
  • Experience in integrating Power Platform applications with other Microsoft and third-party services, such as custom solutions, Azure, Dynamics 365, Office 365, etc
Job Responsibility:
  • Architect and develop traditional applications using the Microsoft Power Platform, leveraging its capabilities and features to meet business requirements and user needs using various app types, such as canvas, model-driven, Copilot Studio and portal apps
  • Architect and develop AI / agentic solutions using Copilot Studio and Foundry and use AI tools to assist in the end-to-end project lifecycle
  • Provide expertise in data storage and access using Dataverse, SQL Server, Azure Blob Storage, and Azure AI Search. Work closely with data engineers to provide data requirements for AI and traditional solutions
  • Deploy and manage Power Platform applications using DevOps tools and processes and ensure compliance with security and governance policies and apply deployment strategies, such as packaging, importing, exporting, and versioning
  • Integrate Power Platform applications with other Microsoft and third-party services, such as custom applications, Azure services, SharePoint, Dynamics 365, SQL Server, etc. using various connectors, such as standard, custom, and premium connectors, and apply integration patterns, such as orchestration, mediation, and transformation
  • Perform unit testing, debugging, and troubleshooting of Power Platform applications, and provide technical support and maintenance
  • Serve as technical lead on projects, directing other Power Platform developers, providing code reviews and feedback, and ensuring proper coding standards (such as naming conventions, code formatting, and code commenting) while building a strong, healthy team culture
  • Provide technical sales guidance and estimations, and grow opportunities by establishing deep relationships with prospective and existing clients
Job Type:
Full-time

Sr. Kubernetes Engineer

The Sr Kubernetes Engineer is a hands-on technical role responsible for designin...
Location:
United States, Morristown; Boston; St. Petersburg; St. Louis; Atlanta
Salary:
139000.00 - 186000.00 USD / Year
Zelis
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in cloud-native infrastructure, with deep expertise in Kubernetes (e.g., Native, Amazon EKS and Amazon ECS)
  • Proven track record of designing and operating production-grade Kubernetes platforms in multi-account AWS environments
  • Strong proficiency in infrastructure-as-code (CDK with Python), AWS DevOps native CI/CD tooling, and observability stacks (e.g. CloudWatch)
  • Experience implementing security controls, RBAC, and compliance frameworks (e.g., CIS Benchmarks)
  • Demonstrated ability to influence technical direction across multiple teams and domains
Job Responsibility:
  • Architect and operationalize Kubernetes platform(s) on AWS supporting multi-account, multi-region deployments aligned with AWS Well-Architected principles
  • Define platform capabilities including compute autoscaling, pod networking, network policies, load balancing, and storage drivers
  • Define paved path container standards and support consumption of those standards
  • Lead platform roadmap development and cross-functional alignment with architecture, security, FinOps, and product engineering
  • Operating System, Kubelet, CRI & AMI Configuration: Define and own lifecycle management, patching, and performance tuning of worker nodes
  • Worker Node Scaling: Design and manage autoscaling groups, node pools, and lifecycle automation
  • VPC Configuration: Architect secure and scalable VPCs, subnets, route tables, NAT gateways, and security groups
  • EKS Cluster Configuration: Manage cluster-level settings including version upgrades, endpoint access, audit logging, and control plane integrations
  • Add-ons Management: Deploy and maintain cluster add-ons such as CoreDNS, kube-proxy, metrics server, and custom controllers
  • Policies & Governance: Define and enforce RBAC, network policies, pod security standards, and IAM roles for service accounts
What we offer:
  • 401k plan with employer match
  • flexible paid time off
  • holidays
  • parental leaves
  • life and disability insurance
  • health benefits including medical, dental, vision, and prescription drug coverage
Job Type:
Full-time

ServiceNow Platform Architect

Aptiv is advancing its global digital transformation through a strategic partner...
Location:
Ireland, Dublin
Salary:
Not provided
Aptiv plc
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Information Technology, Business Administration, or a related field
  • 8+ years hands-on experience with ServiceNow
  • Certifications: CSA, CAD required; CTA / CIS preferred
  • Strong experience with ITSM, ITOM, CMDB, Discovery, HAM, SAM, and CSM
  • Deep understanding of ServiceNow architecture patterns and performance optimization
  • Expertise in enterprise integrations and data synchronization
  • Proven ability to lead multi-instance strategies and complex migrations (clones, splits, consolidations)
  • Experience in GenAI / automation (Now Assist, Virtual Agent, NLU, AI Search preferred)
  • Excellent stakeholder management and communication skills
Job Responsibility:
  • Define and own the ServiceNow platform architecture and roadmap, aligned with enterprise strategy
  • Establish and enforce platform governance, standards, and best practices
  • Create and maintain technical design documentation, architecture diagrams, and platform standards
  • Lead design and implementation across ServiceNow modules: ITSM, ITOM, CSM, HAM, SAM, CMDB, Discovery, SPM, and GenAI/Now Assist
  • Design scalable solutions using Flow Designer, Workflow Data Fabric, APIs, MID servers, and custom applications
  • Oversee upgrades, patches, cloning, and environment strategy (Dev, Test, Sandbox, Prod)
  • Drive integration strategy with enterprise systems (Entra ID, Azure, AWS, Zabbix, Tanium, Cisco ISE, Salesforce, Microsoft 365)
  • Ensure CMDB health, data quality, and reconciliation strategies
  • Manage and optimize license usage and platform costs
  • Partner with security and infrastructure teams to ensure compliance and risk mitigation
What we offer:
  • Personal holidays
  • Healthcare
  • Pension
  • Tax saver scheme
  • Free Onsite Breakfast & Lunch
  • Discounted Corporate Gym Membership
  • Multicultural environment
  • Learning, professional growth and development in a world-recognized international environment
  • Access to internal & external training, coaching & certifications
  • Recognition for innovation and excellence
Job Type:
Full-time