Engineering Manager, SRE

Abridge

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

220000.00 - 260000.00 USD / Year

Job Description:

Abridge’s services and engineering teams are in hyperscale mode, multiplying rapidly alongside our customer base and new product launches. We are looking for a seasoned leader who can harness this growth across the organization through reliability and performance engineering, engineering velocity, software replatforming and rearchitecture, and application security. You’ll lead and build an extremely fast-growing organization, iteratively scope and execute a company-wide application reliability roadmap, and lead the development and improvement of SLOs across the entire company, spanning multiple regions and clouds. The combination of security, scale, uptime, and timeline requirements that Abridge faces has never been executed before in tech. This is a rapidly expanding role that sits at the intersection of AI, reliability engineering, security, and healthcare.

Job Responsibility:

  • Visionary leadership: Scope, resource, evangelize, and execute a company-wide reliability and engineering velocity roadmap across environments and clouds, real-time streaming infrastructure under immense scale, compute as well as AI-at-edge infrastructure, and the most ambitious cloud security roadmap in the entire tech industry. Collaborate with department heads across product engineering, security, product management, commercial, and more to develop, align, and execute an extremely ambitious strategic roadmap
  • Gifted tactician: Work at the level of small tiger teams to unblock, enable, and drive execution and solutioning. Juggle several ambiguous and tricky problems at a time
  • Recruiter extraordinaire: Scale out your team to meet this roadmap - both ICs and managers. Attract top talent and hire quickly while maintaining a consistently high bar. Iterate on the hiring process, improve diversity and equity, retain and maximize the effectiveness of an extremely senior team
  • Mentor to the mentors: Develop their careers, create top-of-ladder development opportunities, and continuously raise the bar for your staff as well as your peers and leaders in their abilities and awareness. Earn their trust, lead by example, be a doctor rather than a judge for organizational and people challenges, and help establish and maintain a hivemind, de-siloed culture across all engineering pods

Requirements:

  • 3 - 6+ years as a manager in rapidly growing organizations including at least 1 year as a manager of managers
  • Seeking an extremely challenging role that will push you beyond your limits, where failures are inevitable and not to be feared
  • Seeking a senior leadership role to develop people, environments, and impact - not ego, accolades, or ladder climbing
  • Able to ask for help, fail fast, and admit defeat; able to get yourself and others out of their comfort zone
  • Track record of leading performance engineering including load test and chaos engineering, large scale distributed telemetry implementation, major architectural and software refactors, engineering velocity, and full stack development
  • Experience running production workloads in more than one cloud provider (at a time, or across your experience)
  • Experience managing workloads across containerized solutions, Kubernetes, and CNCF-approved tooling such as Argo, Istio, OTel, and more
  • Thought leader in platform building, with a strong desire to represent Abridge as a reliability engineering leader in the tech industry
  • Genuine passion for Abridge’s mission to improve healthcare in America and across the world

What we offer:

  • Generous Time Off: 14 paid holidays, flexible PTO for salaried employees, and accrued time off for hourly employees
  • Comprehensive Health Plans: Medical, Dental, and Vision coverage for all full-time employees and their families
  • Generous HSA Contribution: If you choose a High Deductible Health Plan, Abridge makes monthly contributions to your HSA
  • Paid Parental Leave: Generous paid parental leave for all full-time employees
  • Family Forming Benefits: Resources and financial support to help you build your family
  • 401(k) Matching: Contribution matching to help invest in your future
  • Personal Device Allowance: Tax free funds for personal device usage
  • Pre-tax Benefits: Access to Flexible Spending Accounts (FSA) and Commuter Benefits
  • Lifestyle Wallet: Monthly contributions for fitness, professional development, coworking, and more
  • Mental Health Support: Dedicated access to therapy and coaching to help you reach your goals
  • Sabbatical Leave: Paid Sabbatical Leave after 5 years of employment
  • Compensation and Equity: Competitive compensation and equity grants for full-time employees

Additional Information:

Job Posted:
January 30, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Engineering Manager, SRE

Manager, Site Reliability Engineering and Incident Management

Planet DDS is seeking a Manager, Site Reliability Engineering and Incident Manag...
Location:
United States, Atlanta
Salary:
118000.00 - 160000.00 USD / Year
Planet DDS
Expiration Date
Until further notice
Requirements:
  • 7+ years in SRE, DevOps, or Infrastructure roles
  • 3+ years in Incident Management leadership
  • Deep understanding of reliability, scalability, and performance optimization
  • Multi-cloud expertise in AWS, Azure, or GCP
  • Understanding of DNS, load balancing, firewalls, and compliance frameworks
  • Knowledge of fundamental cloud security (e.g., identity and access management, firewalls)
  • Deep understanding of logging and monitoring and security best practices
  • Strong collaboration and communication skills
  • Bachelor’s Degree in a relevant major or equivalent years of experience is a plus
Job Responsibility:
  • Lead and mentor a team of SREs and Incident Managers
  • Foster a culture of reliability, accountability, and continuous improvement
  • Collaborate with engineering teams to design resilient platform architectures
  • Oversee the incident response process for outages and service disruptions
  • Ensure timely detection, escalation, and resolution of incidents
  • Drive post-incident reviews (PIRs) and root cause analysis
  • Implement improvements based on lessons learned to prevent recurrence
  • Mature and enforce best practices for incident response and runbooks
  • Automate operational tasks to reduce toil and improve efficiency
  • Maintain observability tools (monitoring, alerting, logging)
Employment Type:
Full-time

Site Reliability Engineering Manager

The Wikimedia Foundation is looking for an Engineering Manager to join our SRE t...
Location:
United States of America
Salary:
132439.00 - 208378.00 USD / Year
Wikimedia Foundation
Expiration Date
Until further notice
Requirements:
  • Prior experience managing teams
  • Prior hands-on experience with software or reliability engineering (within the last 3 years preferred)
  • Ability to analyze complex systems, troubleshoot issues, and devise effective solutions under pressure
  • Proficiency in project management methodologies to effectively plan, execute, and track new and existing initiatives
  • Strong understanding of cloud computing, networking, Linux systems administration, containerization (e.g., Docker, Kubernetes), and infrastructure as code (e.g., Terraform, Ansible) to be able to provide technical support to the team
  • Aptitude for automation and streamlining of tasks
  • Communicate effectively in both spoken and written English
  • Ability to work independently, as an effective part of a globally distributed team
  • Ability to travel several times a year for occasional in-person meetings
  • B.S. or M.S. in Computer Science or the equivalent in related work experience
Job Responsibility:
  • Managing one to two globally distributed teams within Wikimedia’s Site Reliability Engineering organization
  • Providing guidance, mentorship, and support to ensure the team's effectiveness and growth
  • Working with team members to set individual performance goals, and supporting them in meeting and evolving their goals and career path
  • Recruiting, hiring, and helping onboard new team members
  • Triaging incoming workload, maintaining focus on priorities, and setting realistic expectations for both peers and team members
  • Coordinating and communicating with other members of the Wikimedia product & engineering teams on relevant projects, executing complex projects and contributing to the organizational strategy
  • Continuously developing the roadmap of the team in alignment with other SRE and Product & Technology teams, and helping to draft and execute the team’s annual and quarterly plans
  • Project managing new and existing initiatives
  • Leading the definition, refinement, and execution of the processes through which the team manages and performs work
  • Leading incident response, diagnosis, and follow-up on system alerts and outages across Wikimedia’s production infrastructure
Employment Type:
Full-time

Engineering Manager, Infrastructure

As an Engineering Manager for the Infrastructure team, you’ll lead the engineers...
Location:
Canada; United States
Salary:
195000.00 - 285000.00 USD / Year
Apollo.io
Expiration Date
Until further notice
Requirements:
  • 5+ years of hands-on software or infrastructure engineering experience
  • 2+ years of experience leading teams of senior and staff-level engineers in platform, SRE, or infrastructure domains
  • Proven ability to design and operate large-scale distributed systems in cloud environments (preferably GCP or AWS)
  • Expertise with Kubernetes, Docker, Terraform, Ubuntu, and CI/CD pipelines
  • Familiarity with observability tools (Grafana, Prometheus, ELK, Datadog, NewRelic) and performance tuning
  • Strong grounding in networking, security, and reliability principles
  • Experience managing infrastructure costs, availability SLAs, and high-throughput systems at scale
Job Responsibility:
  • Lead, coach, and grow a distributed team of high-impact Infrastructure Engineers
  • Partner with senior engineering leadership on strategic initiatives such as cloud migration, infrastructure scaling, platform reliability, and cost efficiency
  • Define and implement modern operational excellence practices, including SLOs, error budgets, incident reviews, and performance monitoring
  • Guide technical decision-making across key areas like Kubernetes, GCP, observability, networking, CI/CD, and IaC (Terraform, Ansible)
  • Collaborate with AI, Data, and Product Engineering teams to ensure infrastructure scalability for ML and AI-native workloads
  • Run effective 1:1s, career development conversations, and quarterly performance reviews
  • Support recruiting efforts to attract top engineering talent across time zones
What we offer:
  • Equity
  • Company bonus or sales commissions/bonuses
  • 401(k) plan
  • At least 10 paid holidays per year
  • Flex PTO
  • Parental leave
  • Employee assistance program and wellbeing benefits
  • Global travel coverage
  • Life/AD&D/STD/LTD insurance
  • FSA/HSA and medical, dental, and vision benefits
Employment Type:
Full-time

Engineering Manager, Platform

We are looking for an engineering manager to help us scale, improve organisation...
Location:
Not provided
Salary:
Not provided
Airalo
Expiration Date
Until further notice
Requirements:
  • Minimum 5 years of hands-on technical experience in cloud-native environments, specifically with distributed systems and platform development
  • Minimum 2 years of experience in directly leading and managing platform, DevOps, or SRE teams
  • Expertise in designing, building, refactoring, and operating distributed systems and cloud infrastructure at scale
  • Expertise in event-driven architecture and various messaging systems (e.g., Kafka, SQS, RabbitMQ, Pub/Sub)
  • Strong knowledge of both relational (SQL) and NoSQL database technologies and their operational considerations in cloud environments
  • Extensive hands-on experience and deep understanding of core AWS services (e.g., EC2, EKS, Lambda, SQS, Security Groups, IAM, Aurora, DynamoDB, S3, RDS, CloudWatch, CloudTrail)
  • Proven expertise with Infrastructure as Code (e.g., Terraform, CloudFormation)
  • Strong experience with containerisation technologies (Docker) and orchestration platforms (Kubernetes), including Helm and related ecosystem tools
  • Extensive experience with modern monitoring, logging, and observability platforms (e.g., Datadog, Prometheus, Grafana, ELK Stack, Jaeger/OpenTelemetry)
  • Strong familiarity with DevSecOps practices and the implementation of automated security tooling throughout the CI/CD pipeline (e.g., SAST, DAST, secret management, vulnerability scanning)
Job Responsibility:
  • Lead the strategy, architecture, and execution of our core platform technologies
  • Extend and improve engineering best practices across the organisation
  • Maintain and improve a collaborative environment, acting as a key bridge between application development teams and the platform team
  • Motivate and instil a strong sense of ownership in your team for the end-to-end lifecycle, stability, scalability, and performance of our core platform services
  • Mentor and guide the professional and technical development of your team members
  • Ensure that the team delivers high-quality products and solutions by following best practices
  • Build and scale teams that are collaborative, inclusive, and respectful of each other
  • Provide continuous, actionable feedback, address underperformance proactively, and recognise the individual strengths and contributions of your team members
  • Work closely with engineers and collaborate with key stakeholders to define and maintain a prioritised backlog, and establish clear short-term and long-term goals for the platform roadmap
  • Own your team’s deliverables and ensure the continuous delivery of scalable, highly-available, and cost-efficient platform services and infrastructure
What we offer:
  • Health insurance
  • Work-from-anywhere stipend
  • Annual wellness & learning credits
  • Annual all-expenses-paid company retreat in a gorgeous destination
Employment Type:
Full-time

Engineering Manager

The Engineering Manager is responsible for leading a team of software engineers ...
Location:
France, Paris
Salary:
65000.00 - 80000.00 EUR / Year
Beamy
Expiration Date
Until further notice
Requirements:
  • At least 7 years of coding experience (back-end / full stack)
  • At least one significant prior role managing a full-stack engineering squad (2-3 years)
  • Strong leadership to guide your team
  • Extensive technical expertise for coaching
  • Pragmatic mindset for identifying clear, effective solutions
  • Comfortable building, running and continuously iterating on squad rituals
  • Comfortable collaborating with cross-functional counterparts: PM, Product Designer
  • Comfortable supporting and coaching your IC direct reports, and giving them regular feedback
  • Comfortable managing in both French and English, in a remote context
  • Comfortable being hands-on in a full-stack context, through code and design document review or code contributions
Job Responsibility:
  • Technical Leadership at Squad Level: Collaborate with Product Manager & Designer on roadmap, project management, and strategic planning
  • Provide technical guidance and architectural oversight, with selective hands-on coding (20-30%)
  • Enable engineers through architectural guidance, technical decision-making, and removing technical blockers
  • Oversee code review process and standards
  • Ensure team meets velocity and quality targets through effective prioritization and resource allocation
  • Define and communicate technical direction and architecture decisions for the squad
  • Analyze requirements, assess feasibility, and ensure appropriate technical documentation
  • Guarantee project goal achievement, identify risks early, and orchestrate solutions to blockers
  • Champion best practices (unit testing, TDD, CI/CD, etc.) in collaboration with the QA team
  • Drive implementation of clean code principles, testing standards, release processes, and pair programming culture
What we offer:
  • Four-day week
  • Professional development plan
  • Sick child leave
  • Mental health benefits
  • Employee Resource Groups (ERG)
Employment Type:
Full-time

Cloud Engineer II - SRE

Cloud Engineer II - SRE role at Hewlett Packard Enterprise, part of the 24X7 ope...
Location:
India
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements:
  • Bachelor's degree in computer science, engineering, information systems, or closely related quantitative discipline
  • Master's desirable
  • Typically 3-5 years' experience
  • Strong experience with Ubuntu and K8s platforms
  • Programming experience in Scripting / Python / Golang / Ansible / Terraform
  • Strong experience in DevOps practices like continuous integration/continuous deployment (CI/CD)
  • Knowledge of the GitOps model
  • Working experience in cloud platforms, especially AWS
  • Ability to quickly learn new skills and technologies
  • Strong system debugging skills
Job Responsibility:
  • Work in shifts as part of the 24x7 operations group, managing one or more applications
  • Monitor and remediate alerts and maintain uptime
  • Develop and maintain automated systems to improve operational efficiency and ensure compliance with security policies
  • Execute automation and debug issues as required
  • Leverage CI/CD and GitOps for managing the application platform
  • Patch security vulnerabilities
  • Manage public cloud infrastructure
  • Share and review innovative technical ideas with peers
  • Analyse incidents and problems to develop and implement solutions to complex application problems
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
Employment Type:
Full-time

Site Reliability Engineering (SRE)

Fyld is a Portuguese consulting company specializing in IT services. We bring hi...
Location:
Portugal, Lisboa; Porto
Salary:
Not provided
Fyld
Expiration Date
Until further notice
Requirements:
  • Degree in Computer Science, Information Technology, Engineering, or a related
  • Previous experience working as an SRE or in a similar role within DevOps, system administration, or software engineering
  • Familiarity with industry-specific applications and regulatory requirements (e.g., HIPAA, GDPR)
  • Proficiency in system administration for Linux/Unix and Windows systems
  • Strong understanding of networking concepts, including TCP/IP, DNS, load balancing, and firewalls
  • Proficiency in programming languages such as Python, Go, Java, or C++
  • Strong skills in scripting languages like Bash, Perl, or Ruby
  • Experience with automation tools such as Ansible, Puppet, Chef, or Terraform
  • Knowledge of Infrastructure as Code (IaC) principles and practices
  • Experience with monitoring and logging tools such as Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), or Splunk
Employment Type:
Full-time

Site Reliability Engineering Manager

Hewlett Packard Enterprise (HPE) is looking for a Site Reliability Engineering M...
Location:
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements:
  • 7–10 years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
  • Minimum 2 years of experience managing or leading cloud operations teams
  • Deep understanding of cloud platforms (AWS, GCP, or Azure) and cloud-native architectures
  • Hands-on experience with Kubernetes, containers, infrastructure as code (e.g., Terraform), and configuration management tools
  • Strong foundation in observability (monitoring, logging, tracing), automation using Python, and incident response
  • Familiarity with modern CI/CD automation and tools
  • Excellent communication, stakeholder management, and team-building skills
  • Experience scaling SRE practices in high-growth or large-scale environments
  • Ability to balance long-term reliability initiatives with short-term delivery needs
Job Responsibility:
  • Lead and mentor a team of Site Reliability Engineers, supporting their growth, performance, and well-being
  • Own the reliability strategy for SASE cloud infrastructure systems, including incident management, SLIs/SLOs, and capacity planning
  • Partner with Engineering, Product, and Security teams to design and deliver highly available, scalable, and resilient cloud-native services
  • Guide the team in building automation, improving observability, and improving the operational efficiency of our cloud infrastructure
  • Drive adoption of best practices in monitoring, alerting, on-call operations, and runbook development
  • Build and maintain a strong engineering culture based on ownership, collaboration, and continuous learning
  • Define and track key reliability metrics, and report on team performance and system health to leadership
  • Contribute to hiring, onboarding, and career development for SREs
What we offer:
  • Health & Wellbeing benefits for physical, financial, and emotional wellbeing
  • Personal & Professional Development programs
  • Unconditional inclusion in the workplace
Employment Type:
Full-time