As a Production Support Engineer, you will play a critical role in ensuring the stability, reliability, and resilience of Schwab’s Order Management System (OMS) in a high-availability environment. You will operate at the center of complex, enterprise-impacting issues: providing technical leadership, restoring service during major incidents, and strengthening platforms so issues occur less frequently over time. Working closely with engineering, architecture, infrastructure, and business partners, you’ll influence how production systems are supported and improved at scale. This role is ideal for someone who thrives under pressure, brings a systems-thinking mindset, and is motivated by protecting client trust through operational excellence.
Job Responsibilities:
Lead response and resolution for complex, business-critical production incidents, serving as a senior escalation point
Drive incident command, stakeholder communication, and post-incident analysis with clarity and accountability
Shape reliability strategy, operational priorities, and support standards for mission-critical platforms
Identify systemic risks and partner with engineering teams to drive long-term reliability improvements
Design and evolve operational models including monitoring, alerting, observability, and automation
Apply sound judgment to make high-impact decisions in fast-moving, ambiguous situations
Mentor and support team members by sharing expertise and promoting strong operational practices
Support a 24x7 production environment through a rotating on-call schedule
Requirements:
Bachelor’s degree in Computer Science or a related field, or equivalent practical experience
8+ years of experience supporting production systems in roles such as production support, SRE, reliability engineering, or software operations
Advanced ability to troubleshoot complex, distributed systems and diagnose issues across application, database, and infrastructure layers
Demonstrated skill leading high-severity incident response and restoring service under pressure
Strong expertise supporting Java-based platforms and data-intensive systems
Experience operating and tuning Linux-based environments
Ability to assess risk, prioritize effectively, and make decisions using incomplete information
Clear, confident communication skills when engaging both technical and business stakeholders
Deep experience supporting Oracle database platforms in production environments
Hands-on experience improving observability, alerting strategies, and operational automation
Familiarity with formal problem management, root cause analysis, and reliability metrics
Demonstrated success influencing cross-functional partners to improve system resilience
Experience mentoring or informally leading other engineers through complex problem-solving
What we offer:
401(k) with company match and employee stock purchase plan
Paid time off for vacation and volunteering, plus a 28-day sabbatical after every 5 years of service for eligible positions
Paid parental leave and adoption/family building benefits
Tuition reimbursement
Health, dental, and vision insurance