Software Architect, Reliability Engineering

Stytch

Location:
United States

Contract Type:
Not provided

Salary:
227840.00 - 335000.00 USD / Year

Job Description:

As an Architect in SRE, you will drive the technical strategy, vision and outcomes for Twilio’s Reliability Engineering organization. You will define and lead solutions and initiatives that ensure Twilio products are reliable worldwide, and you will define standards and guide engineering teams on best practices for designing, building, and operating resilient systems. This role is pivotal to Twilio’s commitment to operational excellence, scalability, and pragmatic, large-scale systems design in the cloud.

Job Responsibility:

  • Partner with senior technical leaders across Twilio to set and communicate the reliability strategy, translating business goals into measurable outcomes
  • Influence company-wide architectural decisions while balancing long-term vision with near-term and compliance needs
  • Lead the design, implementation, and operation of scalable solutions and paved roads that enable reliable, high-traffic services
  • Influence company-wide architectural decisions to focus on availability, performance, resilience, and cost efficiency using Kubernetes, AWS, Terraform, and modern observability
  • Ensure integrity and quality across the service lifecycle: design fault-tolerant architectures, incident response, disaster recovery, and capacity/cost management
  • Collaborate with product and cross-functional teams to identify reliability risks and convert them into actionable designs, programs, and tooling
  • Establish and champion reliability practices and drive systemic improvements
  • Mentor and grow engineers and technical leaders
  • Track and apply emerging SRE, cloud, and large-scale systems best practices, and introduce pragmatic innovations that improve reliability at scale

Requirements:

  • 15+ years of experience in Reliability Engineering, Software Engineering, or DevOps roles with a focus on infrastructure, backend systems, and reliability, including experience as a principal engineer or architect
  • Strong experience in driving strategic technical decisions and defining long-term technical vision
  • In-depth understanding of the role of Reliability Engineering in a large and diverse SaaS organization
  • Experience driving cross-org technical architecture outcomes
  • Knowledge of cloud architecture, DevOps practices, and large-scale systems design with microservices
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience)
  • Strong production experience, including operational management, scaling, partitioning strategies, and tuning for performance and reliability in high-scale environments
  • Hands-on experience with Kubernetes (e.g., EKS), deploying and managing stateful services, and cloud services like AWS
  • Proficiency in infrastructure-as-code tools such as Terraform or CloudFormation for automating infrastructure
  • Expertise in observability tools (e.g., Prometheus, Grafana, Datadog) for monitoring distributed systems and setting up alerting
  • Proficient in at least one programming language (e.g., Go, Python, Java) for building automation and tooling
  • Experience designing incident response processes, SLOs/SLIs, runbooks, and participating in on-call rotations
  • Experience running cross-functional post-incident reviews and driving improvements
  • Strong understanding of distributed systems principles, including consensus, durability, throughput, and availability tradeoffs
  • Proven track record of leading reliability improvements in data-intensive or mission-critical systems and collaborating with engineering teams
  • Excellent problem-solving, analytical, verbal, and written communication skills, with the ability to work in cross-functional and distributed environments
  • Demonstrated leadership in mentoring teams, influencing decisions, and balancing long-term objectives with short-term needs
  • Ability to influence and build effective working relationships with all levels of the organization

Nice to have:

  • Specific experience owning and operating large AWS footprints
  • Knowledge of Kubernetes architecture and concepts
  • Experience with data technologies like Apache Kafka, AWS MSK, or similar for reliable streaming
  • Passion for building reliable products, with prior projects in high-availability systems

What we offer:

  • competitive pay
  • generous time off
  • ample parental and wellness leave
  • healthcare
  • a retirement savings program
  • equity plan
  • corporate bonus plan
  • health care insurance
  • 401(k) retirement account
  • paid sick time
  • paid personal time off
  • paid parental leave

Additional Information:

Job Posted:
March 19, 2026

Employment Type:
Full-time
Work Type:
Remote work

Similar Jobs for Software Architect, Reliability Engineering

Customer Reliability Engineer

As a Customer Reliability Engineer at Endor Labs on our Customer Success team, y...
Location
United States
Salary:
Not provided
Endor Labs
Expiration Date
Until further notice
Requirements
  • Strong background in software engineering, with 4-10 years of deep understanding of programming languages, application security, and DevOps practices
  • Demonstrated experience in developing custom technical solutions and actively engaging in customer-facing roles, with a proven ability to handle project-based work effectively
  • A passionate advocate for customer success, with a focus on building secure, scalable solutions from the ground up
  • Exceptional communication skills, capable of breaking down complex technical topics into clear, understandable terms for a variety of audiences
  • Proactive and anticipatory approach to problem-solving, with the ability to foresee customer needs and craft strategic solutions that align with their overarching goals
Job Responsibility
  • Own technical escalations from Customer Success Engineers, Solution Architects and Implementation Engineers ensuring swift reproduction and resolution of critical issues
  • Collaborate with Engineering and Product teams to triage and resolve bugs or architectural issues
  • Provide insight and build closely with our engineering teams, translating customer feedback and troubleshooting insights into tangible product improvements
  • Act promptly when technical issues emerge, applying your advanced troubleshooting skills and understanding of programming and DevOps practices to ensure our customers are successful
  • Conduct deep diagnostics, including logs, APIs, and infrastructure troubleshooting
  • Serve as a bridge between the customer and R&D for complex or systemic issues
  • Document and share solutions for long-term knowledge management and root cause prevention
What we offer
  • Competitive salary and comprehensive benefits package including Health, Dental, Vision and Mental Health plans
  • 401(k) plan to support your long-term financial goals
  • Flexible PTO to maintain a healthy work-life balance
  • Opportunities for co-working and team meetups to foster collaboration
  • A dog-friendly office environment for those who love to bring their fur babies along

Staff Software Engineer, Compute

Play a key role in building our platform from zero to one. Partner across teams ...
Location
United States
Salary:
200000.00 - 275000.00 USD / Year
dbt Labs
Expiration Date
Until further notice
Requirements
  • 10+ years of experience in software engineering, with expertise in database systems, query engines, or storage systems
  • Strong coding skills at the systems level (C++, Rust, Go, Python, or Java)
  • Experience designing and scaling distributed systems or SaaS platforms
  • Expertise with cloud infrastructure (AWS, GCP, Azure, Kubernetes, Terraform)
  • Proven ability to lead complex projects and collaborate across functions
  • Excellent problem-solving skills, clear communication, and a strong sense of ownership
Job Responsibility
  • Design, build, and maintain the Compute layer that powers dbt’s ability to optimize queries across ingestion, transformation, and consumption
  • Lead technical architecture discussions with a focus on query engines, storage systems, and distributed database design
  • Collaborate with Product, Design, Operations, and Security to deliver well-architected, scalable compute solutions
  • Build services, APIs, and experiences that support user delight, quality, high availability, and performance
  • Tackle ambiguous, open-ended technical challenges with strategic thinking, balancing technical constraints with user needs and product goals
  • Define and drive best practices in testing, observability, and system reliability
  • Mentor engineers across the company, fostering technical growth and collaboration
  • Champion a culture of technical excellence and innovation, influencing engineering direction across multiple teams or domains
What we offer
  • Unlimited vacation
  • 401k
  • Pension Plan
  • 16 weeks Paid Parental Leave
  • Wellness stipend
  • Home office stipend
  • Equity Stake
  • Fulltime

Senior Software Engineer - Search

Truveta is the world’s first health provider led data platform with a vision of ...
Location
United States, Seattle
Salary:
155000.00 - 190000.00 USD / Year
Truveta
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Information Systems, or a related field (advanced degree a plus)
  • 5+ years of professional software engineering experience
  • Designing, building, and operating distributed systems at scale
  • Writing production-quality, efficient, multi-threaded code that runs reliably in cloud environments
  • Architecting and implementing search system features (indexing, querying, optimization), including building robust test frameworks
  • Reviewing data specifications and handling large-scale data storage and distribution using specialized protocols
  • Debugging and resolving complex production issues in distributed systems
  • Proven experience with cloud-native architectures and DevOps practices (preferably Azure, though AWS/GCP experience is relevant)
Job Responsibility
  • Design, build, and maintain index, query, and search system features utilized to aggregate and analyze health data
  • Architecting, implementing, and testing new index and query features
  • Optimizing end-to-end index performance
  • Planning, architecting, and deploying highly scalable and highly reliable search systems
  • Implement relevant compliance controls and conduct thorough security reviews
  • Drive observability, reliability, and automation across the infrastructure and platform
  • Monitor emerging technology in the search and infrastructure domains, evaluate applicability, and champion adoption where appropriate
  • Contribute to knowledge sharing and best practices within the team
What we offer
  • Comprehensive benefits with strong medical, dental and vision insurance plans
  • 401K plan
  • Professional development & training opportunities for continuous learning
  • Work/life autonomy via flexible work hours and flexible paid time off
  • Generous parental leave
  • Regular team activities (virtual and in-person)
  • Additional compensation such as incentive pay and stock options
  • Fulltime

Software Engineering Manager

We are seeking a dynamic Software Engineering Manager to lead our PreForm deskto...
Location
Hungary, Budapest
Salary:
Not provided
Formlabs GmbH
Expiration Date
Until further notice
Requirements
  • Working knowledge of C++ (or experience managing a team of C++ developers), sufficient to hold technical conversations
  • Substantial experience as a people manager and technical leader, specifically leading teams focused on desktop application development or complex software products
  • Expertise in designing and architecting complex software systems, with a deep understanding of design patterns, reliability, and scalability, particularly relevant to desktop applications
  • Deep knowledge of engineering practices and patterns across the entire software development lifecycle, including release processes for desktop products
  • Demonstrated success in partnering effectively with product or program management teams to translate business needs into technical solutions for user-facing applications
  • Comfortable managing technical risk, making strategic decisions for your organization, and thriving in a fast-paced, innovative environment
  • A strong commitment to growing the skills and fully leveraging the talent of your team members
  • A proven track record of managing teams that successfully develop and ship new features for software products, ideally desktop applications
Job Responsibility
  • Directly manage and empower a team of talented software engineers focused on the PreForm desktop application, driving its continuous improvement and expansion
  • Be a key contributor to software architecture decisions for PreForm, ensuring the application is built on scalable, maintainable, and robust foundations for new features and future products
  • Leverage your proven track record as both a skilled software engineer and a successful manager/leader to guide your team through complex technical challenges unique to desktop application development and 3D print preparation
  • Work in lockstep with product managers, 3D printing R&D teams, and hardware teams to guarantee the seamless delivery of new algorithms, features, and capabilities within PreForm that unlock the full potential of our 3D printers
  • Coordinate effectively with peer software teams on release and project planning, specifically as it relates to PreForm's integration and functionality
  • Take full ownership of hiring, growing, and retaining top-tier software engineering talent for your PreForm team, fostering an environment where engineers can thrive and achieve their full potential
What we offer
  • Shares in the company
  • Catered lunch at the office 3 days per week
  • Private health insurance with Medicover (Blue package + hospital coverage)
  • A monthly or quarterly public transportation pass for Budapest
  • Free beverages and snacks at the office
  • All You Can Move sports pass with 7000 HUF monthly allowance
  • Free 3D prints
  • An inclusive, dog-friendly office with diverse and inspiring colleagues
  • Development opportunities both in-house and off-site

Staff Software Engineer

We're looking for a Staff Software Engineer to drive significant technical impac...
Location
United States, San Jose
Salary:
164000.00 - 246000.00 USD / Year
FloQast
Expiration Date
Until further notice
Requirements
  • 10+ years of software engineering experience with a track record of designing and driving complex technical projects
  • Deep expertise in backend and frontend technologies like Go, Node.js, TypeScript, React, MongoDB, and AWS
  • Experience designing and implementing scalable distributed systems serving enterprise customers
  • Strong product sense and ability to balance technical excellence with business impact
  • Track record of mentoring engineers and elevating team capabilities
  • Experience working closely with product, UI/UX design, and cross-functional teams
  • Proven ability to navigate ambiguity and drive projects from conception to production
  • Experience with API design and building developer-friendly interfaces
  • Understanding of security and compliance requirements for enterprise SaaS
Job Responsibility
  • Architect and Build: Design and implement core platform features that power FloQast's applications & workflows
  • Technical Leadership: Lead technical design discussions and establish engineering best practices across the team
  • Product Partnership: Collaborate with product and design to shape the roadmap and deliver exceptional user experiences for accounting teams
  • Mentorship: Guide and develop other engineers through code reviews, pairing sessions, and technical workshops
  • System Excellence: Drive improvements in system reliability, performance, and developer experience
  • Strategic Decisions: Own critical technical decisions that impact the entire platform architecture
  • Customer Focus: Engage with customers and internal stakeholders to understand workflows and deliver solutions that transform FloQast applications
  • Innovation: Explore and implement new technologies to strengthen core platform services
What we offer
  • Medical
  • Dental
  • Vision
  • Family Forming benefits
  • Life & Disability Insurance
  • Unlimited Vacation
  • Fulltime

Senior Software Engineer, Observability

The Observability team at Airtable ensures that engineers have the tools they ne...
Location
United States, San Francisco; New York; Seattle
Salary:
196000.00 - 270000.00 USD / Year
Airtable
Expiration Date
Until further notice
Requirements
  • 6+ years of software engineering experience
  • 3+ years focused on observability or infrastructure at scale
  • Demonstrated success implementing and running production-grade logging, metrics, or tracing systems
  • Proficiency in distributed systems concepts, data streaming pipelines, and container orchestration (Kubernetes)
  • Deep hands-on knowledge of tools such as Prometheus, Grafana, Datadog, OpenTelemetry, ELK Stack, Loki, or ClickHouse
  • Comfort with at least one programming language (e.g., Go, Python, Java) to build and maintain observability tooling
  • Experience mentoring engineers and collaborating across multiple teams
  • Strong communication skills
  • Eagerness to own high-impact initiatives
  • Proven ability to balance short-term fixes with long-term strategic vision
Job Responsibility
  • Architect and scale core observability systems
  • Lead the design and evolution of logging, metrics, and tracing pipelines
  • Evaluate and integrate new technologies (e.g., OpenTelemetry, ClickHouse, ELK stack)
  • Guide and mentor a growing team of infrastructure engineers
  • Define and uphold coding standards and operational excellence
  • Partner with Deploy Infrastructure, Service Orchestration, and Product teams
  • Align infrastructure decisions with business goals
  • Own end-to-end reliability for observability tools and establish SLAs, SLOs, and error budgets
  • Optimize performance and cost of large-scale data pipelines
  • Shape the observability roadmap
What we offer
  • Opportunity to receive benefits
  • Restricted stock units
  • May include incentive compensation
  • Comprehensive benefit offerings
  • Fulltime

Senior Site Reliability Engineer

Architect, develop, and troubleshoot large-scale infrastructure, maintain and im...
Location
United States, San Francisco
Salary:
180960.00 - 230900.00 USD / Year
Atlassian
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Software Engineering, Information Technology or a closely related field
  • Four years of experience as a Site Reliability Engineer architecting, developing, and troubleshooting large-scale infrastructure utilizing programming languages such as PowerShell, Python, or Bash and networking technologies such as TCP/IP or security
  • Four years of experience in automation development and infrastructure-as-code implementation using tools such as Terraform, AWS CloudFormation, Ansible, or Salt
  • Knowledge of Linux and Windows systems
  • Cloud technologies within AWS, GCP, Azure
  • Continuous integration and continuous delivery/deployment (CI/CD) practices, and monitoring and observability practices
  • Must pass technical interview
Job Responsibility
  • Architect, develop, and troubleshoot large-scale infrastructure utilizing programming languages such as PowerShell, Python, or Bash and networking technologies such as TCP/IP or security
  • Provide real-time feedback on production systems
  • Work with product family and platform developers to maintain and improve services and performance with a strong customer focus
  • Utilize a variety of data collection, enrichment, analytics, and visualizations to support our complex systems
  • Own automation development and infrastructure-as-code implementation using tools such as Terraform, AWS CloudFormation, Ansible, and/or Salt
  • Build solutions to enhance availability, performance, and stability for hundreds of Atlassian enterprise customers in the cloud, and automate repetitive work
  • Help secure the cloud architecture with penetration testing, vulnerability resolution, and compliance audit responses
  • Own continuous integration and continuous delivery/deployment (CI/CD) practices, and monitoring and observability practices
What we offer
  • Health and wellbeing resources
  • Paid volunteer days
  • Fulltime

Staff Software Engineer

We are looking for an experienced Staff Software Engineer to lead the developmen...
Location
United States
Salary:
201000.00 - 271000.00 USD / Year
dbt Labs
Expiration Date
Until further notice
Requirements
  • 10+ years of experience as a software engineer developing SaaS platforms and applications at scale
  • Proven experience designing and scaling full stack applications
  • Proficiency with backend languages and frameworks such as Python, Go, Rust, Django, Node.js, Java, Spring
  • Strong understanding of API design, system architecture, and database management
  • Experience leading complex projects and driving cross-functional collaboration
  • A systematic problem-solving approach, strong communication skills, and a sense of ownership
  • Familiarity with cloud infrastructure such as AWS, GCP, Azure, Kubernetes, Terraform
  • Ability to mentor engineers and influence technical direction across teams
  • Minimum requirement of a Bachelor's degree in a related field (computer science, computer engineering, etc.) or completion of an engineering-related bootcamp
Job Responsibility
  • Design, build, and maintain full stack applications that scale with our growing customer base
  • Lead technical architecture discussions, ensuring the platform is performant, maintainable, and secure
  • Tackle ambiguous, open-ended problems with strategic thinking, balancing technical constraints with user needs and product goals
  • Build services, APIs, and experiences that support user delight, quality, high availability and performance
  • Work closely with Product, Design, Operations, and Security teams to deliver well-architected solutions
  • Define and drive best practices in testing, observability, and system reliability
  • Mentor engineers across the company, fostering technical growth and collaboration
  • Champion a culture of technical excellence and innovation, influencing engineering direction across multiple teams or domains
What we offer
  • Equity Stake
  • Unlimited PTO
  • 401k with a 3% guaranteed contribution
  • Excellent healthcare coverage
  • Paid parental leave
  • Wellness and home office stipends
  • Fulltime