Supercomputing Engineer (Test)

Etched

Location:
United States, San Jose

Contract Type:
Not provided

Salary:
150,000 - 275,000 USD / Year

Job Description:

We are seeking a highly motivated and detail-oriented Supercomputing Engineer (Test) to join our team. This team plays a critical role in ensuring the reliability and stability of our highest-performance inference server hardware and software. As a Software Engineer on this team, you will design, develop, and execute comprehensive burn-in test suites, analyze test results, and collaborate with hardware and software engineering teams at Etched and our ODM partners to identify and resolve potential issues. You will be at the forefront of ensuring our server products meet the highest quality standards before they reach our customers.

Job Responsibility:

  • Test Development: Design, develop, and implement automated burn-in test suites using common scripting languages (Python, Go, Bash) and test frameworks across all aspects of system operation, including boot sequences, root-of-trust, system management, workload deployment, and performance
  • Test Execution: Execute burn-in tests on server hardware, monitor system performance and health, and analyze test results
  • Failure Analysis: Investigate and debug hardware and software failures identified during testing, providing detailed reports and mitigation plans
  • Collaboration: Collaborate with internal and external hardware and software engineering teams to identify root causes of failures and implement corrective actions
  • Test Infrastructure: Contribute to the development and maintenance of the burn-in testing infrastructure, including portable test environments and automation tools runnable in any environment
  • Documentation: Create and maintain comprehensive documentation for test plans, test cases, and test results
  • Performance Analysis: Analyze system performance metrics to identify potential bottlenecks and areas for optimization
  • Continuous Improvement: Participate in continuous improvement efforts to enhance the efficiency and effectiveness of the burn-in testing process

Requirements:

  • Proficiency in at least one scripting language (e.g., Python, Bash, Go)
  • Experience with software testing methodologies and tools
  • Strong understanding of operating systems (Linux preferred) and server hardware architectures
  • Ability to analyze complex technical problems and provide effective solutions
  • Excellent communication and collaboration skills
  • Ability to work independently and as part of a team
  • Experience with version control systems (e.g., Git)
  • Experience with reading and interpreting hardware logs

Nice to have:

  • Experience with hardware burn-in testing or reliability testing
  • Knowledge of server virtualization and cloud computing concepts
  • Experience with performance testing and benchmarking tools
  • Familiarity with hardware diagnostic tools and techniques
  • Experience with containerization technologies (e.g., Docker, Kubernetes)
  • Experience with CI/CD pipelines
  • Knowledge of low-level hardware communication protocols (I2C, etc.)
  • Experience with data analysis tools and techniques

What we offer:

  • Medical, dental, and vision packages with generous premium coverage
  • $500 per month credit for waiving medical benefits
  • Housing subsidy of $2k per month for those living within walking distance of the office
  • Relocation support for those moving to San Jose (Santana Row)
  • Various wellness benefits covering fitness, mental health, and more
  • Daily lunch + dinner in our office

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Full-time
Work Type:
On-site work