At Schwab, you will build a rewarding career while making a difference in the lives of our millions of clients. Here, innovative thinking meets creative problem solving as we work together to challenge the status quo. We believe in the power of collaboration and value being together in the office, which is why this role is based on-site in our San Francisco office. Joining Schwab means joining a company committed to transforming the financial industry and putting clients at the center of everything we do.
Schwab’s AI Strategy & Transformation team, known as AI.x, is the central hub for Artificial Intelligence at Schwab. We are an integrated product, engineering, strategy and risk team, all based in San Francisco. We help set the enterprise vision for AI, invest in the most promising opportunities, and accelerate delivery across the company. We also build the core platform that powers AI at scale and explore next-generation GenAI efforts that will redefine how we serve our clients.
As a Senior AI Site Reliability Engineer on AI.x, you will play a key role in ensuring our AI solutions are reliable, scalable, and resilient, enabling us to deliver innovative experiences to millions of clients. This role is more than a reliability engineering position. It is an opportunity to join a high-profile team shaping Schwab’s future with AI, to build and maintain solutions that matter to millions of clients, and to grow your career in one of the most exciting areas of technology today.
Job Responsibilities:
Design, implement, and manage the reliability and operational excellence of GenAI applications and platforms
Work closely with architects, engineers, and business leaders to align reliability practices with Schwab’s enterprise strategy
Mentor and coach junior engineers, helping to build strong operational practices and foster a culture of continuous improvement
Lead by example in solving complex reliability challenges, advancing SRE standards, and driving rapid iteration from concept to production
Requirements:
8+ years of software development or reliability engineering experience, with 4+ years as a hands-on senior engineer in startups and/or large organizations
Bachelor’s degree in Computer Science or related field
5+ years of experience building and operating complex products from scratch and running them in production
3+ years of experience supporting applications that use Artificial Intelligence (AI) models to deliver real business impact
3+ years of experience building and maintaining data pipelines and infrastructure for large datasets
3+ years of experience with containers and cloud-native applications, and the ability to operationalize them in the public cloud with infrastructure as code
Experience implementing monitoring, alerting, and incident response for large-scale distributed systems
Proven track record in driving reliability, scalability, and performance improvements for production AI systems
Nice to have:
Strong computer science fundamentals and experience working across different parts of the tech stack
Experience working with proprietary or open-source LLMs (Gemini, Claude, OpenAI or other models) and supporting LLM-powered applications in production
Focus on quality and reliability in everything you do. Continue to raise the bar and drive others to deliver high-quality, resilient products, with experience writing tests and implementing automated reliability checks
Experience writing and running evaluations to ensure quality and monitor consistency in LLM-generated responses and actions
Strong communication skills – you balance written and verbal communication to clearly share your perspective with others on the team
Experience mentoring junior engineers and helping them grow their technical and operational skills through clear feedback and code reviews
Demonstrated mindset of continuous learning and improvement
Ability to solve complex problems with ambiguous or incomplete data in highly distributed systems
Demonstrated business domain knowledge related to all products you have worked on
Curiosity about new technologies and processes – you always seek to improve yourself and everyone around you and proactively seek and share knowledge with others on your team
Experience with Python and front-end development preferred but not required
Master’s or advanced degrees in Computer Science or related fields
What we offer:
401(k) with company match and employee stock purchase plan
Paid time for vacation and volunteering, and a 28-day sabbatical after every 5 years of service for eligible positions