Senior Software Engineer - SRE

United States, Austin · Job Posted May 05, 2026

Job Description

Hybrid: This role is categorized as hybrid and is expected to report to Austin, TX or Warren, MI at least three days per week (Tuesday through Thursday); this is subject to change based on business need.

The rapid adoption of advanced software in vehicles marks a new era for automakers and consumers, bringing both advantages and challenges. As part of the Site Reliability Engineering (SRE) database group at General Motors, you'll join a dedicated team focused on enhancing the reliability, efficiency, and scalability of our distributed database systems. We leverage engineering principles to manage operations effectively and build solutions that enable us to grow without sacrificing performance or quality.

Our SREs work closely with software development teams, acting as specialists in reliability and production engineering, with a focus on automation, observability, and shared responsibility. We are looking for individuals who are passionate about maintaining the health of our infrastructure while optimizing for reliability and cost-efficiency. This role involves a blend of database engineering and systems engineering skills to keep our services resilient, robust, and scalable.

The Role: The database team within the SRE organization is chartered to provide best-in-class Database Management System (DBMS) project solutions to our application partners worldwide. This role involves modernizing our infrastructure and processes to provide database-as-a-service capability in a highly standardized, reliable, and automated environment. The team is responsible for participating in all phases of the application development life cycle while designing, developing, and deploying databases on behalf of the application in a way that ensures GM's data is secure, highly available, current, flexible, and monitored. This individual will work on transforming GM applications and database services into modernized cloud offerings.

Job Responsibility

  • Develop tools and software to automate operational processes, improve system reliability, and reduce manual intervention
  • Lead, implement, and improve monitoring and observability frameworks, enabling proactive detection and resolution of incidents
  • Participate in an on-call rotation to diagnose, troubleshoot, and mitigate production incidents, ensuring minimal downtime and swift resolution
  • Work alongside developers to ensure the quality, scalability, and reliability of our database services
  • Practice shared ownership of services in production, fostering a "You build it, you run it" culture
  • Manage Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) to set reliability expectations effectively
  • Conduct deep-dive analyses of incidents and collaborate on post-incident reviews to derive learnings and prevent recurrence
  • Champion a culture of continuous improvement
  • Evaluate system performance and advocate for optimizations that reduce infrastructure costs while maintaining service reliability

Requirements

  • Bachelor's degree in computer science or a related field, or equivalent work experience
  • 7-10 years of software experience with strong proficiency in PostgreSQL and at least one other database technology (Oracle, SQL Server)
  • Proficiency in at least one programming language (e.g., Python, Go, Java) and familiarity with multiple language ecosystems
  • Solid understanding of operating systems, networking, distributed systems, databases, and storage architectures
  • Deep understanding of how code runs on underlying hardware, including operating systems, algorithms, and data structures
  • Ability to optimize or troubleshoot code by understanding its execution and the impact on system resources
  • Experience handling production incidents, including root cause analysis, mitigation, and working through complex system failures
  • Strong communication skills, with an ability to explain technical concepts to both engineering and business stakeholders
  • Commitment to collaborative problem-solving and shared ownership of services
  • Proven experience in automating manual processes, building deployment pipelines, or managing configuration systems

Nice to have

  • Experience with Git/source code management, CI/CD development, and open-source development
  • Hands-on experience in Infrastructure as Code tools like Terraform, Terragrunt, Azure Resource Manager (ARM) templates, YAML pipelines, or Bicep
  • Experience in Fivetran or GoldenGate configuration and operation
  • Experience in Cosmos or other NoSQL technologies
  • Experience with cloud platforms (AWS, GCP, Azure)
  • Experience with observability using OpenTelemetry, Prometheus, or services such as Datadog
  • Familiarity with container orchestration systems like Kubernetes
  • A track record of managing or developing distributed systems

Similar Jobs for Senior Software Engineer - SRE

8 matching positions

Senior Software Engineer and Principal Software Engineer

We are building a planet-scale multi-modal database and infrastructure for execu...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C++, C#, or Java, OR equivalent experience
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C++, C#, or Java, OR equivalent experience
  • Experience in shipping products and scalable, reliable services
  • Currently programming/coding in your current or most recent role
  • Hands on experience with asynchronous programming and concurrency (threads, tasks, futures, async/await)
  • Experience with Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS), and/or Google Kubernetes Engine (GKE)
  • Experience in building database engines, query engines, indexing solutions (columnar, full-text, vector), at scale
  • Experience with programming CUDA, AI systems at scale
Job Responsibility
  • Independently execute in the face of ambiguity
  • Leads identification of dependencies and the development of design documents for a product, application, service, or platform
  • Writes efficient systems code and is able to debug distributed systems
  • Holds accountability as a Designated Responsible Individual (DRI), mentoring engineers across products/solutions, working on-call to monitor system/product/service for degradation, downtime, or interruptions
Employment Type: Full-time

Senior Software Engineer, SRE

Abridge’s services and engineering team are in hyperscale mode. We are looking f...
Location: United States, SF Office, NYC Office
Salary: 210800.00 - 248000.00 USD / Year
Company: Abridge (abridge.com)
Expiration Date: Until further notice
Requirements
  • 8+ years of software engineering experience focused on distributed systems or tooling, with an interest in engineering enablement and software scaling
  • At least 2 years experience as a back-end engineer focused on system performance and scalability
  • Experience reducing latency in software by multiples through leveraging observability and profiling tools
  • Experience building on Kubernetes and scaling compute services on Kubernetes; experience with related cloud native technologies including ArgoCD, Argo Rollouts, Istio, etc.
  • Comfortable implementing and securing services in Google Cloud Platform with Infrastructure as Code, including GCP Projects, VPC Networks, Google Kubernetes Engine, and IAM Roles, Groups and policies
  • Experience building software with backend languages (e.g. Python, GoLang, Node, and Rust)
  • Experience monitoring distributed systems with Prometheus, OpenTelemetry Collector, and Grafana (or something similar), including metrics collection, visualization, alerting, and using observability data to drive performance optimizations
  • Passion for engineering enablement and solving software and distributed systems scaling challenges under pressure
  • Must be willing to travel up to 10%
Job Responsibility
  • Leverage load testing, chaos engineering, and other test practices to identify performance and latency bottlenecks across all of our systems, and make changes to application code to resolve them
  • Drive software changes that can rehome applications at the code level onto new infrastructure (run times, event driven infrastructure, databases, and more) in order to dramatically improve scalability as well as enable multi-tenant deployments
  • Identify and implement software configuration changes and performance tuning parameters that will dramatically improve performance and scalability
  • Build developer tools and software modules that help engineers build code faster and more effectively with more enablements to the entire engineering organization
  • Work with the Platform team to develop, and application teams to adopt, emerging elements of our internal developer platform, such as service templates and self-serve infrastructure
  • Work with application teams to establish and adopt SLOs and error budgets, and drive better metrics for application health that can drive automated canary releases, improved health monitoring, and better engineering practices
  • Uplevel our ability to respond to incidents by improving observability, runbooks, and incident response muscle across the organization
  • Evangelize, document, and train the engineering team on the solutions being built and uplevel them on cloud native design strategies and tools
  • Be a public evangelist for Abridge in the global platform engineering community, including conferences, open source, and research as we pioneer new AI-first cloud-native-first security-first implementations at scale
What we offer
  • Generous Time Off: 14 paid holidays, flexible PTO for salaried employees, and accrued time off for hourly employees
  • Comprehensive Health Plans: Medical, Dental, and Vision coverage for all full-time employees and their families
  • Generous HSA Contribution: If you choose a High Deductible Health Plan, Abridge makes monthly contributions to your HSA
  • Paid Parental Leave: Generous paid parental leave for all full-time employees
  • Family Forming Benefits: Resources and financial support to help you build your family
  • 401(k) Matching: Contribution matching to help invest in your future
  • Personal Device Allowance: Tax free funds for personal device usage
  • Pre-tax Benefits: Access to Flexible Spending Accounts (FSA) and Commuter Benefits
  • Lifestyle Wallet: Monthly contributions for fitness, professional development, coworking, and more
  • Mental Health Support: Dedicated access to therapy and coaching to help you reach your goals
Employment Type: Full-time

Senior Software Engineer/ SE II (DevOps/ SRE)

We are looking for DevOps/SRE Engineers to join the Optimizely team in Dhaka.
Location: Bangladesh, Dhaka
Salary: Not provided
Company: Optimizely (optimizely.com)
Expiration Date: Until further notice
Requirements
  • AWS & GCP experience (multi-account, multi-region)
  • Kubernetes & container orchestration (EKS, Helm, Docker)
  • Terraform / Infrastructure-as-Code at scale
  • Automation scripting (Python, Bash, Fabric)
  • Experience managing scalable, fault-tolerant distributed infrastructure
  • Others: Datadog, Atlantis, Karpenter, Spark/EMR
  • Should be comfortable contributing code to service repositories if necessary (e.g. Node/Python/Golang)
  • Minimum experience 3+ years
  • Bachelor’s Degree (Computer Science or engineering preferred) or equivalent work experience
Job Responsibility
  • Multi-cloud infrastructure spanning multiple AWS accounts and GCP projects
  • 50+ microservices running on both EKS and GKE with auto-scaling
  • 36+ Terraform modules, 149+ Ansible roles, and more
  • Real-time data pipelines with Kinesis, Redshift, OpenSearch, and MongoDB Atlas
  • Self-managed OpenSearch, RabbitMQ, and other services
  • GitOps workflows powered by Atlantis with automated plan/apply cycles
  • CI/CD across 250+ Jenkins pipelines and GitHub Actions
Employment Type: Full-time

Senior Software Engineer

In Microsoft’s CoreAI division, the Azure SRE Agent Platform team builds and run...
Location: India, Hyderabad
Salary: Not provided
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's or Master's degree in Computer Science, or equivalent practical experience
  • 7+ years of experience building production software using one or more modern programming languages such as C#, C++, Go, Java or Python
  • Strong understanding of Generative AI & software engineering fundamentals, data structures, and problem-solving
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Ability to pass the Microsoft Cloud background check upon hire/transfer and every two years
Job Responsibility
  • Take ownership of important areas of the Azure SRE Agent Platform, including agent capabilities, orchestration, evaluation, user experiences on different form factors and supporting platform services
  • Build and iterate on agentic systems, including tools, planning and execution loops, evaluations, and safety mechanisms
  • Design and ship reliable capabilities that improve incident detection, diagnosis, mitigation, and operational learning
  • Use telemetry, experiments, evaluations, and user feedback to guide iteration and investment
  • Contribute to resilient, observable systems that operate safely and effectively in production
  • Partner closely with engineers, SREs, and product counterparts to turn ambiguous problems into high-quality shipped solutions
  • Participate in debugging, live-site learning, and post-incident hardening to continuously improve system quality
  • Contribute to architecture, engineering standards, and development practices across the team
Employment Type: Full-time

Senior Software Engineer

The Firefox Monitor Engineering Team builds tools that help people understand an...
Location: United States
Salary: Not provided
Company: Mozilla (mozilla.org)
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in software development with a strong focus on backend technologies
  • Deep expertise in Node.js and TypeScript, with experience building and leading backend engineering projects
  • Proficiency with PostgreSQL and SQL query optimization; experience with query builders such as Knex is a plus
  • Experience deploying and operating applications on Kubernetes
  • Experience with GCP (Pub/Sub, Cloud Logging) with a solid understanding of DevOps and SRE collaboration
  • Experience with Infrastructure as Code tools such as Terraform
  • Experience with AWS (S3) or similar cloud storage services
  • Hands-on experience with observability tooling including OpenTelemetry, Sentry, Prometheus, and Grafana
  • Familiarity with Redis for caching and session management
Job Responsibility
  • Lead backend development in Node.js and TypeScript, building and maintaining server-side logic within a Next.js full-stack architecture
  • Design, implement, and maintain integrations with external data sources such as Have I Been Pwned (HIBP) and other breach intelligence providers, with a focus on data privacy and security
  • Build and maintain event-driven systems using Google Cloud Pub/Sub, and own cloud infrastructure on GCP (GKE) and AWS (S3, SES)
  • Own and evolve the data layer, including PostgreSQL schema design and query optimization using Knex, and Redis caching strategies
  • Work closely with our SRE team to maintain and improve production environments, including monitoring and alerting with OpenTelemetry, Sentry, Prometheus, and Grafana
  • Triage and resolve production issues, partnering with SRE and support teams to investigate incidents, address bug reports, and keep the application running reliably
  • Periodically rotate into a Base Load Engineer (BLE) role, handling releases, dependency updates, and incoming work requests from customer support and other stakeholders
  • Partner with and support the frontend team in their work with React, TypeScript, Next.js, and SCSS, ensuring backend systems, APIs, and data contracts meet their needs
  • Partner with cross-functional teams to align on project goals, ensure seamless frontend-backend integration, and contribute to API design and evaluations
  • Participate in code reviews to maintain high standards of code quality and system reliability
What we offer
  • Generous performance-based bonus plans
  • Rich medical, dental, and vision coverage
  • Generous retirement contributions with 100% immediate vesting
  • Quarterly all-company wellness days
  • Country specific holidays plus a day off for your birthday
  • One-time home office stipend
  • Annual professional development budget
  • Quarterly well-being stipend
  • Considerable paid parental leave
  • Employee referral bonus program
Employment Type: Full-time

Senior Software Engineer - Cloud Infrastructure & Observability

Location: India, Bengaluru
Salary: Not provided
Company: Roku (roku.com)
Expiration Date: Until further notice
Requirements
  • 15+ years in software engineering with a track record of architecting distributed systems or platforms at scale
  • Strong hands‑on experience in Golang and one scripting language (e.g., Python or Shell)
  • Experience operating observability at PB-scale ingestion and hundreds of millions of series
  • Expertise in observability platforms and tooling (Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, ClickHouse) and standards (OpenTelemetry, OpenMetrics)
  • Deep experience building systems of scale and operating cloud infrastructure with Kubernetes; strong proficiency with service mesh technologies (Istio/Envoy), infrastructure‑as‑code (Terraform), and experience in multi‑cloud (AWS, GCP)
  • Demonstrated ability to evolve storage and query architectures for cost, scale, and latency (e.g., TSDB, Parquet, distributed processing)
  • Proven experience integrating security as part of infrastructure and platform development
  • Exceptional cross‑functional communication; effective collaboration with both technical and non‑technical stakeholders
Job Responsibility
  • Architect and lead Roku’s observability platform across metrics, logs, and traces; evolve data pipelines and storage layers optimized for high throughput, performance, and cost at Roku scale (TSDBs, Parquet, distributed processing)
  • Extend and harden open‑source observability systems; overhaul core components (e.g., storage layers, query paths) to improve performance, reliability, and usability at scale
  • Implement features such as pre‑aggregation, down-sampling, and sampling to reduce load and accelerate queries across the platform
  • Collaborate across platform, SRE, and product teams to migrate hundreds of workloads to our common platform; augment and automate CI/CD flows and onboarding
  • Integrate security into infrastructure and platform services; ensure robust multi‑tenant, multi‑cluster, and multi‑cloud designs
  • Contribute improvements back to open source and CNCF‑aligned projects
What we offer
  • Global access to mental health and financial wellness support and resources
  • Healthcare (medical, dental, and vision)
  • Life, accident, disability, commuter, and retirement options (401(k)/pension)
  • Time off in accordance with local leave policies
Employment Type: Full-time

Senior Software Engineer

We are seeking a Senior Software Engineer to design, build, and operate high-thr...
Location: United States, Dallas
Salary: Not provided
Company: Aquent (aquent.com)
Expiration Date: Until further notice
Requirements
  • 8+ years of strong Java expertise with experience building production-grade services
  • 8+ years of hands-on API development experience, including RESTful services and API security
  • Cloud deployment experience on GCP and/or equivalent cloud platforms
  • 2+ years of solid experience with Access Management protocols: OAuth 2.0, OpenID Connect (OIDC), SAML
  • Proven experience building and operating high-criticality systems with strict SLAs
  • Strong understanding of distributed systems, concurrency, and performance optimization and resiliency
  • Experience with observability (metrics, logs, traces) and production troubleshooting
  • Proficiency in building server-side applications (API SME) using C# and .NET Technologies
  • Solution design and implementation experience for high-availability, high-throughput, high-scalability applications
  • Good understanding of the latest System Architecture and Development Standards and Guidelines
Job Responsibility
  • Lead the development and delivery of the organization’s software products to QA, UAT, and Production
  • Manage day-to-day activities and promote Agile software development practices within the team
  • Collaborate with product owners and key stakeholders in Project Management, Business, QA, and Technology Operations to ensure timely and budget-friendly software project delivery
  • Work with Scrum Master and product owner to provide development sizing and cost analysis estimates
  • Collaborate with the product owner and team members in story decomposition, feature design, and task prioritization
  • Utilize automated software testing tools and frameworks, including test-driven development, to meet software quality standards
  • Support Single Sign-On (SSO) integration efforts to connect systems both internally and externally to Schwab
  • Assist the release manager in assembling releases and improving the release process
  • Help resolve needs and roadblocks identified by team members with the Scrum Master
  • Ensure the coordination of individual team deliverables to achieve product releases
Employment Type: Full-time

Senior Software Engineer

The Firefox Monitor Engineering Team builds tools that help people understand an...
Location: United States; Canada
Salary: Not provided
Company: Mozilla (mozilla.org)
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in software development with a strong focus on backend technologies
  • Deep expertise in Node.js and TypeScript, with experience building and leading backend engineering projects
  • Proficiency with PostgreSQL and SQL query optimization; experience with query builders such as Knex is a plus
  • Experience deploying and operating applications on Kubernetes
  • Experience with GCP (Pub/Sub, Cloud Logging) with a solid understanding of DevOps and SRE collaboration
  • Experience with Infrastructure as Code tools such as Terraform
  • Experience with AWS (S3) or similar cloud storage services
  • Hands-on experience with observability tooling including OpenTelemetry, Sentry, Prometheus, and Grafana
  • Familiarity with Redis for caching and session management
Job Responsibility
  • Lead backend development in Node.js and TypeScript, building and maintaining server-side logic within a Next.js full-stack architecture
  • Design, implement, and maintain integrations with external data sources such as Have I Been Pwned (HIBP) and other breach intelligence providers, with a focus on data privacy and security
  • Build and maintain event-driven systems using Google Cloud Pub/Sub, and own cloud infrastructure on GCP (GKE) and AWS (S3, SES)
  • Own and evolve the data layer, including PostgreSQL schema design and query optimization using Knex, and Redis caching strategies
  • Work closely with our SRE team to maintain and improve production environments, including monitoring and alerting with OpenTelemetry, Sentry, Prometheus, and Grafana
  • Triage and resolve production issues, partnering with SRE and support teams to investigate incidents, address bug reports, and keep the application running reliably
  • Periodically rotate into a Base Load Engineer (BLE) role, handling releases, dependency updates, and incoming work requests from customer support and other stakeholders
  • Partner with and support the frontend team in their work with React, TypeScript, Next.js, and SCSS, ensuring backend systems, APIs, and data contracts meet their needs
  • Partner with cross-functional teams to align on project goals, ensure seamless frontend-backend integration, and contribute to API design and evaluations
  • Participate in code reviews to maintain high standards of code quality and system reliability
What we offer
  • Generous performance-based bonus plans
  • Rich medical, dental, and vision coverage
  • Generous retirement contributions with 100% immediate vesting
  • Quarterly all-company wellness days
  • Country specific holidays plus a day off for your birthday
  • One-time home office stipend
  • Annual professional development budget
  • Quarterly well-being stipend
  • Considerable paid parental leave
  • Employee referral bonus program
Employment Type: Full-time