Failure Analysis Engineer Job at Etched (Taipei)

Job Description

Etched is hiring a Failure Analysis Engineer to own the end-to-end debug process across our full hardware stack: chip, board, and rack-scale systems. You will be responsible for rapidly diagnosing, triaging, and resolving hardware failures; determining whether issues originate in the chip, board, or rack infrastructure; and driving resolution with the appropriate team. This is a highly cross-functional role, working closely with US-based hardware and silicon teams to build and refine debug playbooks as production scales. The ideal candidate has deep EE fundamentals, systems-level debugging experience, and the ability to solve hard problems under pressure.

Job Responsibilities

Own failure triage across the stack. Receive field and production failures, isolate whether the root cause is chip-level, board-level, or system/rack-level, and route to the appropriate team with a clear problem statement
Drive root cause analysis using electrical test equipment (oscilloscopes, logic analyzers, multimeters) and system-level diagnostics to identify failure mechanisms and determine corrective actions
Build and refine debug processes. Partner with US hardware counterparts to document debug flows for different failure modes, creating repeatable playbooks that scale with production volume
Debug rack-level issues. Troubleshoot communication failures between rack managers, coolant distribution units (CDUs), and system components. Understand how thermal, power, and network infrastructure interact at rack scale
Interface with BMC and system firmware. Use Linux command line and BMC interfaces to pull logs, run diagnostics, and validate system health during failure investigations
Close the loop on quality. Feed failure trends and root cause findings back to design, manufacturing, and operations teams to drive systemic improvements

Requirements

Bachelor’s or Master’s degree in Electrical Engineering or a related field
Working fluency with oscilloscopes, signal integrity basics, power delivery, and board-level debug
Systems-level thinking. Strong understanding of how servers work end-to-end: BMC, BIOS, OS, thermals, and power sequencing. Can debug issues that span multiple subsystems
Linux command line proficiency. Comfortable pulling logs, running scripts, and navigating server environments from the terminal
Strong communication skills across teams. You can translate a complex hardware failure into a clear problem statement for silicon, firmware, or mechanical teams. You've worked across time zones and functions
Composure under pressure. Production failures don't wait. You're energized by urgent, ambiguous problems and take ownership until they're resolved
3+ years of experience in hardware debug, failure analysis, or systems engineering in a server, datacenter, or semiconductor environment

Nice to have

Rack-scale infrastructure (cooling systems, power distribution, rack managers)
High-speed interfaces (PCIe, Ethernet, SerDes) and their common failure modes
ATE or production test environments
Experience with datacenters, GPUs, FPGAs, or custom ASICs

What we offer

Competitive compensation packages, including generous equity packages
Comprehensive insurance coverage and other top-of-market benefits
