Senior Site Reliability Engineer Job at Braze (San Francisco)

Job Description

We're looking for a Senior Site Reliability Engineer for our Currents team, which is responsible for building, maintaining, and evolving Currents, our data export system, at scale. Currents is a robust Kafka-based event pipeline that handles tens of billions of messages daily and that our customers use to analyze user behavior in near real time. You’ll be a key engineer on a highly collaborative and skilled team, responsible for bringing projects from concept to production and improving our existing high-scale systems. You will draw on your experience, your skills, and a strong sense of teamwork to tackle the significant engineering challenges of running a critical data streaming system. As a Senior Site Reliability Engineer, you will focus specifically on the observability, scalability, and reliability strategy of every project.

Job Responsibilities

Solve live performance and reliability issues and prevent their recurrence
Write and review code, educating engineers and building a culture of reliability
Practice sustainable incident response and blameless postmortems
Define and enable standards for monitoring, reliability, and performance
Bridge the gap between infrastructure and platform engineering teams
Support and improve services by planning for scale and reliability
Guide junior engineers in SRE best practices, software engineering, and agile project leadership

Requirements

Bachelor’s degree in Computer Science, Software Engineering, or a related STEM field
Five (5) years of experience in any role/occupation/position involving software engineering or site reliability engineering
Experience using distributed systems such as Kubernetes or Docker Swarm to deploy and monitor live applications
Experience working with alerting software (Sentry, Datadog, and/or PagerDuty)
Experience utilizing programming languages (Java, Kotlin, and/or Ruby) to understand and contribute to the codebase
Experience storing data in relational and non-relational databases such as Postgres and MongoDB
Experience building data pipelines with data streaming or queuing systems such as Kafka, Sidekiq, or SQS and SNS
Experience leveraging continuous integration tools such as Jenkins or Buildkite
Experience collaborating with engineers through pull requests and code reviews in version control software such as GitHub or GitLab

What we offer

Competitive compensation that may include equity
Retirement and Employee Stock Purchase Plans
Flexible paid time off
Comprehensive benefit plans covering medical, dental, vision, life, and disability
Family services that include fertility benefits and equal paid parental leave
Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
A curated in-office employee experience, designed to foster community, team connections, and innovation
Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching
Employee Resource Groups that provide supportive communities within Braze
