We are seeking highly motivated and detail-oriented Software Engineers to join our Supercomputing Testing team. This team plays a critical role in ensuring the reliability and stability of our highest-performance inference server hardware and software. As a Software Engineer on this team, you will design, develop, and execute comprehensive supercomputing test suites, analyze test results, and collaborate with hardware and software engineering teams at Etched and our ODM partners to identify and resolve potential issues. You will be at the forefront of ensuring our server products meet the highest quality standards before they reach our customers.
Job Responsibilities:
Design, develop, and implement automated supercomputing test suites using common scripting languages (Python, Go, Bash) and test frameworks, covering all aspects of system operation, including boot sequences, root-of-trust, system management, workload deployment, and performance
Execute tests on server hardware, monitor system performance and health, and analyze test results
Investigate and debug hardware and software failures identified during testing, providing detailed reports and mitigation plans
Collaborate with internal and external hardware and software engineering teams to identify root causes of failures and implement corrective actions
Contribute to the development and maintenance of the supercomputing testing infrastructure, including portable test environments and automation tools that can run in any environment
Create and maintain comprehensive documentation for test plans, test cases, and test results
Analyze system performance metrics to identify potential bottlenecks and areas for optimization
Participate in continuous improvement efforts to enhance the efficiency and effectiveness of the testing process
Requirements:
Proficiency in at least one scripting language (e.g., Python, Bash, Go)
Experience with software testing methodologies and tools
Strong understanding of operating systems (Linux preferred) and server hardware architectures
Ability to analyze complex technical problems and provide effective solutions
Excellent communication and collaboration skills
Ability to work independently and as part of a team
Experience with version control systems (e.g., Git)
Experience with reading and interpreting hardware logs
Nice to have:
Experience with hardware burn-in testing or reliability testing
Experience with performance testing and benchmarking tools
Familiarity with hardware diagnostic tools and techniques
Experience with CI/CD pipelines
Knowledge of low-level hardware communication protocols (e.g., I2C)
Experience with data analysis tools and techniques
What we offer:
Competitive compensation packages including generous equity packages
Comprehensive insurance coverage and other top-of-market benefits