Sr. AI Site Reliability Engineer

Charles Schwab

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

190000.00 - 270000.00 USD / Year

Job Description:

At Schwab, you will build a rewarding career while making a difference in the lives of our millions of clients. Here, innovative thinking meets creative problem solving as we work together to challenge the status quo. We believe in the power of collaboration and value being together in the office, which is why this role is based on-site in our San Francisco office. Joining Schwab means joining a company committed to transforming the financial industry and putting clients at the center of everything we do.

Schwab’s AI Strategy & Transformation team, known as AI.x, is the central hub for Artificial Intelligence at Schwab. We are an integrated product, engineering, strategy and risk team, all based in San Francisco. We help set the enterprise vision for AI, invest in the most promising opportunities, and accelerate delivery across the company. We also build the core platform that powers AI at scale and explore next-generation GenAI efforts that will redefine how we serve our clients.

As a Senior AI Site Reliability Engineer on AI.x, you will play a key role in ensuring our AI solutions are reliable, scalable, and resilient, enabling us to deliver innovative experiences to millions of clients. This role is more than a reliability engineering position. It is an opportunity to join a high-profile team shaping Schwab’s future with AI, to build and maintain solutions that matter to millions of clients, and to grow your career in one of the most exciting areas of technology today.

Job Responsibility:

  • Design, implement, and manage the reliability and operational excellence of GenAI applications and platforms
  • Work closely with architects, engineers, and business leaders to align reliability practices with Schwab’s enterprise strategy
  • Mentor and coach junior engineers
  • Help to build strong operational practices and foster a culture of continuous improvement
  • Lead by example in solving complex reliability challenges
  • Advance SRE standards
  • Drive rapid iteration from concept to production

Requirements:

  • 8+ years of software development or reliability engineering experience
  • 4+ years as a hands-on senior engineer in startups and/or large organizations
  • Bachelor’s degree in Computer Science or related field
  • 5+ years of experience building and operating complex products from scratch and running them in production
  • 3+ years of experience supporting applications that use Artificial Intelligence (AI) models to deliver real business impact
  • 3+ years of experience building and maintaining data pipelines and infrastructure for large datasets
  • 3+ years of experience with containers and cloud-native applications, and the ability to operationalize them in the public cloud with infrastructure as code
  • Experience implementing monitoring, alerting, and incident response for large-scale distributed systems
  • Proven track record in driving reliability, scalability, and performance improvements for production AI systems

Nice to have:

  • Strong computer science fundamentals and experience working across different parts of the tech stack
  • Experience working with proprietary or open-source LLMs (Gemini, Claude, OpenAI or other models) and supporting LLM-powered applications in production
  • Focus on quality and reliability in everything you do
  • Experience writing and running evaluations to ensure quality and monitor consistency in LLM-generated responses and actions
  • Strong communication skills
  • Experience mentoring junior engineers
  • Demonstrated mindset of continuous learning and improvement
  • Ability to solve complex problems with ambiguous or incomplete data in highly distributed systems
  • Demonstrated business domain knowledge related to all products you have worked on
  • Curiosity about new technologies and processes
  • Experience with Python and front-end development preferred but not required
  • Master’s or advanced degrees in Computer Science or related fields

What we offer:
  • 401(k) with company match and employee stock purchase plan
  • Paid time for vacation and volunteering, and a 28-day sabbatical after every 5 years of service for eligible positions
  • Paid parental leave and family building benefits
  • Tuition reimbursement
  • Health, dental, and vision insurance
  • Bonus or incentive opportunities

Additional Information:

Job Posted:
February 17, 2026

Expiration:
February 24, 2026

Employment Type:
Fulltime
Work Type:
On-site work
Similar Jobs for Sr. AI Site Reliability Engineer

Sr. Embedded Software Engineer

Location:
Canada, Toronto or Ottawa
Salary:
Not provided
Advanced Technology Search Group
Expiration Date
Until further notice
Requirements
  • Bachelor’s in Electrical Engineering, Computer Engineering, or Computer Science
  • Experience with C/C++
  • Experience writing Python scripts
  • Ability to read and understand board schematics and device datasheets
  • Ability to debug embedded software using oscilloscopes and logic analysers
  • Experience with SCM tools (Git or SVN)
  • Strong analytical and problem-solving abilities
  • Strong communication skills
  • Ability to work in a multi-site team environment
Job Responsibility
  • Design, develop, and optimize embedded software for silicon-based systems throughout the entire lifecycle, from conceptualization to deployment, ensuring seamless integration and optimal performance
  • Collaborate with cross-functional teams including hardware engineers, software developers, and machine learning experts to integrate ML models into embedded systems
  • Architect and implement software frameworks for efficient data processing, device control, and communication protocols
  • Conduct performance analysis, debugging, and optimization of embedded systems for reliability and efficiency
  • Develop software and firmware applications to interact with hardware and third-party interfaces
  • Contribute to the architecture and design of the overall AI solution
  • Develop debug and performance analysis tools for AI solution development
  • Play a role in all the phases of embedded AI software development, from requirement gathering, analysis, design, development, testing and final release to customers
  • Provide clear and timely communication related to status and other key aspects of the project to the leadership team
  • Develop and maintain software documentation, including specifications, design documents, and test plans
Employment Type:
Fulltime

Sr. Software Engineer

The Sr. Software Engineer (Site Reliability Engineer) ensures the reliability, s...
Location:
United States
Salary:
Not provided
Bamboo Health
Expiration Date
Until further notice
Requirements
  • 4+ years of experience in Site Reliability Engineering, Production Support, or a similar role focused on system reliability and operations
  • Strong experience supporting and troubleshooting production systems, including ownership of support tickets and incident response
  • Proficiency in Ruby and the ability to read, debug, and contribute to application code when needed
  • Experience with monitoring, alerting, and observability tools (metrics, logs, traces, dashboards)
  • Solid understanding of SQL and database fundamentals, including performance and troubleshooting
  • Familiarity with cloud platforms (AWS preferred), including serverless architectures and distributed systems
  • Experience using automation, scripting, or tooling (e.g., Python) to reduce operational effort
  • Comfort using or learning AI-supported tools (e.g., ChatGPT, Copilot, or role-specific tools) to improve daily workflows
  • A forward-thinking, curious mindset with an openness to experimenting with new technologies
  • Strong analytical and problem-solving skills, with sound judgment and creativity in designing solutions
Job Responsibility
  • Own the end-to-end lifecycle of production issues, including triage, investigation, incident response, postmortems, and follow-up actions
  • Troubleshoot complex, cross-system issues, identify root causes, and implement long-term fixes
  • Design, implement, and maintain monitoring, alerting, and dashboards to proactively detect reliability and performance issues
  • Use AI-assisted tools responsibly to accelerate debugging, log analysis, incident response, and knowledge sharing
  • Partner with Product, Engineering, and Customer Success to resolve customer-impacting issues efficiently and transparently
  • Reduce recurring operational issues through automation, improved tooling, and process improvements
  • Contribute code to improve reliability, observability, scalability, and operational safety
  • Document incidents and standard operating procedures to improve response consistency and team effectiveness
What we offer
  • Receive competitive compensation including health, dental, vision and other benefits
Employment Type:
Fulltime

Sr Platformization/Cloud Automation Engineer

Palo Alto Networks CDSS group is looking for a seasoned platformization and clou...
Location:
United States, Santa Clara
Salary:
104600.00 - 169225.00 USD / Year
Palo Alto Networks Italia
Expiration Date
Until further notice
Requirements
  • Bachelor’s/Master’s degree in Computer Science or a related field
  • 5+ years of industry experience in engineering
  • Fluent scripting skills (preferably Python or Bash) with deep experience in Unix/Linux systems from kernel to shell and beyond
  • 4+ years of working with Microservices architectures on Kubernetes
  • Hands-on experience with container-native tools such as Docker and Helm for managing workloads running in Kubernetes
  • Experience managing AWS and GCP at scale, with knowledge of cloud-neutral connectivity between platforms
  • Experience designing and maintaining API specifications using Swagger/OpenAPI, and working with API frameworks such as Apigee to enable secure, scalable integrations
  • Hands-on experience with infrastructure-as-code and automation tools such as Terraform, Ansible, etc.
  • Proficient in CI/CD platforms such as GitLab CI, Jenkins, ArgoCD, CircleCI, etc.
  • In-depth knowledge of operating systems (processes, threads, concurrency, etc.)
Job Responsibility
  • Work with development teams to ensure that applications have scalability and reliability built-in from day one
  • Design, review and enhance software architecture to improve scalability, service reliability, cost, and performance
  • Drive platformization by building standardized, self-service infrastructure platforms that improve developer productivity, scalability, and operational efficiency
  • Deploy automation for provisioning and operating infrastructure at large scale
  • Partner with teams to improve CI/CD processes and technology
  • Mentor members of the staff on large scale cloud deployments
  • Drive the adoption of observability practices and a data-driven mindset
  • Set up processes such as on-call rotations, postmortems, and runbooks to continue supporting the infrastructure owned by the SRE team while finding ways to reduce the time to resolution and improve the reliability of services
  • Support, optimize and deploy mission critical, front-end and back-end production
  • Improve site performance, monitoring, and overall stability of our infrastructure
Employment Type:
Fulltime

Sr. Product Manager - Web Personalization (Agentic Web)

Microsoft’s mission is to empower every person and every organization on the pla...
Location:
United States, Redmond
Salary:
106400.00 - 203600.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • Bachelor's Degree in Business, Marketing, Communications, Finance, Engineering, or related field AND 5+ years of marketing operations, program management, technical product management or related experience OR equivalent experience
  • 3+ years of technology/process improvement experience
  • 8+ years in web personalization, SEO/AEO, or digital experience roles, with demonstrated ownership of roadmaps and measurable outcomes
  • Understanding of information architecture, modular content systems, funnels, attribution constraints, and Core Web Vitals
  • 2+ years of experience taking a product, feature, or experience to market (e.g., design, addressing product-market fit, and launch, internal tool/framework)
  • Experience with content modeling, metadata/taxonomy design, and modular design systems that support dynamic assembly at scale
  • Understanding of privacy, consent, and data minimization principles for personalization, measurement, and AI-assisted experiences
Job Responsibility
  • Own the Agentic Web strategy and operating model: Define the annual vision and roadmap across AI discovery, personalization, and on-site agent journeys, with clear intake, prioritization, and operating rhythms to sustain optimization at scale
  • Run AI discovery (AEO) end to end: Set standards for answer-ready content and machine interpretability, and continuously monitor, troubleshoot, and evolve performance with SEO, content, and engineering as answer engines change
  • Launch and manage web personalization as a product: Own the full lifecycle from requirements and launch through ongoing roadmap delivery, tuning, and continuous improvement based on business priorities and customer signals
  • Ensure personalization platform reliability: Define SLAs, instrumentation, monitoring, and quality gates (signal freshness, latency, delivery health), and drive incident response and post-incident improvements with engineering and analytics
  • Productize and operate interactive agent experiences: Define agent vision and requirements, then own post-launch iteration of intent capture, journey handoffs, escalation paths, and recommendation quality through measurable learning loops
  • Build and maintain the intent and content model: Steward the unified intent layer, taxonomy, metadata, and module mapping that connects discovery signals, in-session behavior, and agent context so the system stays scalable and governable
  • Lead measurement and experimentation for long-term lift: Own dashboards and performance reviews, run an ongoing test-and-learn cadence to prove incrementality, scale winners, retire underperformers, and align cross-functional teams on actions and outcomes
  • Champion trust, safety, and governance: Embed privacy, accessibility, accuracy, provenance, and brand-voice guardrails into both launches and ongoing operations, ensuring the system remains compliant and trustworthy as content and models evolve
Employment Type:
Fulltime

Application Security Engineer

Our Product Security team works on secure-by-design and deep product partnership...
Location:
United Kingdom, London
Salary:
Not provided
Palantir Technologies
Expiration Date
Until further notice
Requirements
  • Development or software engineering experience and a deep passion for information security
  • Experience with a modern high-level programming language (e.g., Java, Golang, JavaScript, Python, etc.)
  • Demonstrated experience evaluating code for vulnerabilities and weaknesses
  • Experience with complex architectures and codebases (e.g. SOA or micro-services)
  • Experience with CodeQL or other static code analysis platforms
  • Experience performing black-box testing of web applications
Job Responsibility
  • Perform deep architecture and security reviews on highly complex products to identify vulnerabilities
  • Lead engineering teams in feature design, threat modeling, and security-critical code and architecture
  • Develop and implement automation to eliminate entire classes of weaknesses across the organization
  • Drive decision-making by determining the tradeoffs between security and product design
  • Lead implementation of strategic security initiatives that improve security across Palantir
Employment Type:
Fulltime

Sales consultant

In this customer-facing role, you will combine instinct, empathy, and commercial...
Location:
Australia, Castle Hill
Salary:
Not provided
Plush Think Sofas
Expiration Date
Until further notice
Requirements
  • A natural hunter with a strong drive to achieve and exceed sales targets
  • High in emotional intelligence and able to build rapport quickly
  • Experienced in consultative selling and solution-based approaches
  • Proven ability to close sales in high-value categories such as furniture, jewellery, automotive, or luxury goods
  • Energetic, self-motivated, and adaptable, even in high-traffic periods
  • Organised, professional, and collaborative with a growth mindset
Job Responsibility
  • Deliver exceptional, emotionally aware customer service that inspires loyalty and repeat business
  • Engage customers through emotive storytelling to help them envision their dream space
  • Identify new sales opportunities and close deals confidently and consistently
  • Maintain accurate sales records and ensure timely processing of all customer orders
  • Collaborate with the Showroom Manager to maintain high presentation and merchandising standards
What we offer
  • Competitive salary with generous, uncapped commissions
  • Ongoing training and professional development opportunities
  • A supportive, growth-focused team environment within an ASX-listed company
Employment Type:
Fulltime

TechOps Architect

Location:
United Kingdom
Salary:
Not provided
Octopus Energy
Expiration Date
Until further notice
Requirements
  • Experience with Okta, Slack and Google Workspace
  • Experience leading a large-scale technical project
  • Knowledge of Identity & Access Management (IAM) systems and SSO technologies such as SCIM, OAuth, OIDC, and SAML
  • Experience with scripting and automation tools
  • Comfortable taking on tasks across all levels of technical support, from 1st line troubleshooting to 2nd and 3rd line issues
Employment Type:
Fulltime

Assistant Engineer (Rail Track)

As an Assistant Engineer for Olsson’s Rail Track Team, you will be responsible f...
Location:
United States, Lincoln; Omaha
Salary:
Not provided
Olsson
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in civil engineering
  • Must possess or obtain Engineer Intern certificate (EI)
  • Strong communication skills
  • Has solid interpersonal, problem-solving, and decision-making skills
  • Has the ability to diligently research applicable requirements and regulations
  • Has advanced mathematical and analytical abilities
  • Has working knowledge (or the ability to become proficient) in applicable technologies
  • Develops an understanding of how the firm operates as a consulting business and how this role contributes to the success of the organization
  • Has a valid driver’s license and a good driving history
Job Responsibility
  • Assists with project design elements for engineering projects utilizing familiarity with standard techniques and established methods
  • Performs entry-level plan production and receives clearly defined instructions for necessary tasks
  • Gains knowledge and experience working within design and modeling software
  • Receives guidance from senior level staff to assist with project schedules and technical engineering calculations
  • Contributes to limited portions of a broader project while under direct supervision
  • Gathers and prepares research to assist with the assembly of technical reports
  • May provide mentoring to student interns by answering questions and offering guidance with routine assignments
  • May travel and work in all types of terrain and weather conditions at project sites in various stages of construction
  • Support marketing and business development efforts
  • Coordinate with technical staff
What we offer
  • Receive a competitive 401(k) match
  • Be empowered to build your career with tailored development paths
  • Have the possibility for flexible work arrangements
  • Engage in work that has a positive impact on communities
  • Participate in a wellness program promoting balanced lifestyles
  • Health care
  • Vision
  • Dental
  • Paid time off
  • Opportunity to participate in a bonus system that rewards performance
Employment Type:
Fulltime