Manager, Reliability Engineering

Optimizely

Location:
Netherlands, Amsterdam

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As a Team Manager in Reliability Engineering at Optimizely, you will oversee the day-to-day operations of an international team of four engineers. You will manage team performance, support career development, and contribute on a technical level to ensure the reliability and scalability of our platforms. Your role will also involve applying agile methodologies to enhance team productivity and collaboration. Please note, this role includes participation in an on-call rotation.

Job Responsibility:

  • Team Leadership and Management: Lead and manage an international team of reliability engineers. Oversee the team's daily activities, ensuring alignment with organizational goals and objectives.
  • Technical Contribution: Actively contribute to technical projects and initiatives. Support the team with your expertise in system design, implementation, and troubleshooting.
  • Agile Methodologies: Apply agile methods such as Scrum and Kanban to manage workflows and improve team productivity. Facilitate agile ceremonies and encourage continuous improvement.
  • Performance Management: Monitor and evaluate team performance, providing regular feedback and guidance. Conduct performance reviews and set clear objectives for team members.
  • Career Development: Support the professional growth and development of team members. Identify training and development opportunities to enhance team skills and capabilities.
  • Collaboration and Communication: Foster a collaborative team environment. Communicate effectively with stakeholders and cross-functional teams to ensure alignment and transparency.

Requirements:

  • Proven experience in a leadership role within a reliability engineering or similar technical team
  • Strong technical background with experience in reliability engineering, system design, and troubleshooting
  • Strong understanding of cloud computing, networking, and system architecture (preferably GCP)
  • Proficiency in scripting and automation tools (e.g., Python, Bash, Terraform)
  • Experience with observability tools (e.g., Datadog, Prometheus, Grafana, ELK Stack)
  • Demonstrated experience in designing, deploying, and managing applications in Kubernetes environments. Proficiency in configuring and optimizing Kubernetes clusters for scalability, reliability, and performance.
  • Proficiency in version control software, particularly Git/GitHub, is required
  • Experience with agile methodologies such as Scrum and Kanban
  • Excellent leadership, communication, and interpersonal skills
  • Ability to manage team performance and support career development
  • Proficiency in English is required

Nice to have:

  • Familiarity with Istio service mesh architecture and its components is a plus
  • Experience working with international teams is a plus

Additional Information:

Job Posted:
April 23, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Manager, Reliability Engineering

Site Reliability Engineering Manager

Hewlett Packard Enterprise (HPE) is looking for a Site Reliability Engineering M...
Location:
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • 7–10 years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
  • Minimum 2 years of experience managing or leading cloud operations teams
  • Deep understanding of cloud platforms (AWS, GCP, or Azure) and cloud-native architectures
  • Hands-on experience with Kubernetes, containers, infrastructure as code (e.g., Terraform), and configuration management tools
  • Strong foundation in observability (monitoring, logging, tracing), automation using Python, and incident response
  • Familiarity with modern CI/CD automation and tools
  • Excellent communication, stakeholder management, and team-building skills
  • Experience scaling SRE practices in high-growth or large-scale environments
  • Ability to balance long-term reliability initiatives with short-term delivery needs.
Job Responsibility:
  • Lead and mentor a team of Site Reliability Engineers, supporting their growth, performance, and well-being
  • Own the reliability strategy for SASE cloud infrastructure systems, including incident management, SLIs/SLOs, and capacity planning
  • Partner with Engineering, Product, and Security teams to design and deliver highly available, scalable, and resilient cloud-native services
  • Guide the team in building automation, improving observability, and improving the operational efficiency of our cloud infrastructure
  • Drive adoption of best practices in monitoring, alerting, on-call operations, and runbook development
  • Build and maintain a strong engineering culture based on ownership, collaboration, and continuous learning
  • Define and track key reliability metrics, and report on team performance and system health to leadership
  • Contribute to hiring, onboarding, and career development for SREs.
What we offer:
  • Health & Wellbeing benefits for physical, financial, and emotional wellbeing
  • Personal & Professional Development programs
  • Unconditional inclusion in the workplace.
Employment Type:
Full-time

Manager, Site Reliability Engineering and Incident Management

Planet DDS is seeking a Manager, Site Reliability Engineering and Incident Manag...
Location:
United States, Atlanta
Salary:
118000.00 - 160000.00 USD / Year
Planet DDS
Expiration Date:
Until further notice
Requirements:
  • 7+ years in SRE, DevOps, or Infrastructure roles
  • 3+ years in Incident Management leadership
  • Deep understanding of reliability, scalability, and performance optimization
  • Multi-cloud expertise in AWS, Azure, or GCP
  • Understanding of DNS, load balancing, firewalls, and compliance frameworks
  • Knowledge of fundamental cloud security (e.g., identity and access management, firewalls)
  • Deep understanding of logging, monitoring, and security best practices
  • Strong collaboration and communication skills
  • Bachelor’s Degree in a relevant major or equivalent years of experience is a plus
Job Responsibility:
  • Lead and mentor a team of SREs and Incident Managers
  • Foster a culture of reliability, accountability, and continuous improvement
  • Collaborate with engineering teams to design resilient platform architectures
  • Oversee the incident response process for outages and service disruptions
  • Ensure timely detection, escalation, and resolution of incidents
  • Drive post-incident reviews (PIRs) and root cause analysis
  • Implement improvements based on lessons learned to prevent recurrence
  • Mature and enforce best practices for incident response and runbooks
  • Automate operational tasks to reduce toil and improve efficiency
  • Maintain observability tools (monitoring, alerting, logging)
Employment Type:
Full-time

Site Reliability Engineering Manager

The Wikimedia Foundation is looking for an Engineering Manager to join our SRE t...
Location:
United States of America
Salary:
132439.00 - 208378.00 USD / Year
Wikimedia Foundation
Expiration Date:
Until further notice
Requirements:
  • Prior experience managing teams
  • Prior hands-on experience with software or reliability engineering (within the last 3 years preferred)
  • Ability to analyze complex systems, troubleshoot issues, and devise effective solutions under pressure
  • Proficiency in project management methodologies to effectively plan, execute, and track new and existing initiatives
  • Strong understanding of cloud computing, networking, Linux systems administration, containerization (e.g., Docker, Kubernetes), and infrastructure as code (e.g., Terraform, Ansible) to be able to provide technical support to the team
  • Aptitude for automation and streamlining of tasks
  • Ability to communicate effectively in both spoken and written English
  • Ability to work independently, as an effective part of a globally distributed team
  • Ability to travel several times a year for occasional in-person meetings
  • B.S. or M.S. in Computer Science or the equivalent in related work experience
Job Responsibility:
  • Managing one to two globally distributed teams within Wikimedia’s Site Reliability Engineering organization
  • Providing guidance, mentorship, and support to ensure the team's effectiveness and growth
  • Working with team members to set individual performance goals, and supporting them in meeting and evolving their goals and career path
  • Recruiting, hiring, and helping onboard new team members
  • Triaging incoming workload, maintaining focus on priorities, and setting realistic expectations for both peers and team members
  • Coordinating and communicating with other members of the Wikimedia product & engineering teams on relevant projects, executing complex projects and contributing to the organizational strategy
  • Continuously developing the roadmap of the team in alignment with other SRE and Product & Technology teams, and helping to draft and execute the team’s annual and quarterly plans
  • Project managing new and existing initiatives
  • Leading the definition, refinement, and execution of the processes through which the team manages and performs work
  • Leading incident response, diagnosis, and follow-up on system alerts and outages across Wikimedia’s production infrastructure
Employment Type:
Full-time

Engineering Manager, Product Engineering

Engineering is the backbone of Everlaw. We build features that delight our custo...
Location:
United States, Oakland
Salary:
198000.00 - 250000.00 USD / Year
Everlaw
Expiration Date:
Until further notice
Requirements:
  • BS/MS or PhD in Computer Science (or equivalent)
  • Sound foundational understanding of a wide range of computer science topics and concerns relating to system and software design
  • At least 5 years of experience as a senior engineer building product features and full-stack web applications
  • Good dynamic range that you apply to different situations - you can step back and empower, while also diving deep into the code to understand the details
  • Ability to communicate at the right altitude with both technical and non-technical stakeholders
  • Experience working with stakeholder teams (internal and/or external) in setting and collaborating on technical roadmaps
  • Experience explaining to customers how the platform handles reliability, security, and compliance matters
  • At least 1 year of experience leading software engineers, either as a manager managing engineers or as a technical lead managing the technical workstreams of software engineers
  • Experience managing the technical workstreams of software engineers and supporting them in execution
  • Demonstrated ability to lead an inspired, high performing and highly motivated and accountable team
Job Responsibility:
  • Build features and functionality for the Everlaw core product
  • Work closely with Product, Design, DevOps, Security Engineering and application engineering leads to synthesize requirements and prioritize efforts
  • Lead roadmapping, resourcing and execution for critical features and capabilities
  • Support and coach engineers in their career development and growth
  • Work closely with Engineering Operations team to improve processes to help with goal setting, empowerment and execution across Everlaw Engineering efforts
  • Critically observe and understand Everlaw’s platform, tooling and processes
  • Understand current and upcoming challenges and requirements from the viewpoint of multiple stakeholders
  • Understand company goals and Product roadmaps
  • Strategize, prioritize, resource and execute against features
  • Actively coach your reports to deliver on projects and ensure they get the right types of feedback and coaching they need to succeed in their careers
What we offer:
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
  • Paid parental leave and approximately 10 days (80 hours) per year of sick leave
  • Seventeen paid vacation days plus 11 federal holidays
  • Membership to Modern Health to help employees prioritize mental health and wellness
  • Annual allocation for Learning & Development opportunities and applicable professional membership dues
  • Company-sponsored life and disability insurance
  • Work in Uptown Oakland, just steps from the BART line and dozens of restaurants and walking distance to Lake Merritt
Employment Type:
Full-time

Engineering Manager, Mobile Quality Engineering

The Quality Engineering team helps Airbnb build high quality software efficientl...
Location:
Brazil, São Paulo
Salary:
Not provided
Airbnb
Expiration Date:
Until further notice
Requirements:
  • 3+ years of engineering management
  • 7+ years of industry experience with a strong focus on iOS and/or Android developer tooling, testing, or quality engineering
  • Strong familiarity with software engineering principles, including object-oriented and functional programming paradigms, design patterns, and code quality practices
  • Hands-on technical leadership leading multiple teams and setting technical direction
  • Led projects with notable risk and complexity; develops the strategy for project execution
  • Proven experience leading distributed or regional engineering teams and driving technical outcomes
  • Skilled in mentoring engineers and creating strong team dynamics across cultural and geographic boundaries
  • Be an agent of change inside the organization and be comfortable leading through ambiguity
  • Excellent communication and collaboration skills, with the ability to align local execution to a global strategy
Job Responsibility:
  • Define and promote a quality mindset and strategy across the organization, by creating a vision that drives QE policies, programs and initiatives
  • Hire and retain a team of high-performing engineers; empower the team to achieve a high level of productivity, reliability and simplicity
  • Drive a sense of trust and belonging, and build inclusive teams with world class talent
  • Build and maintain our testing environments, testing data, and testing frameworks, working through ambiguity, concept validation and implementation of a best-in-class solution
  • Collaborate with cross-functional stakeholders to analyze internal/external failures, and suggest corrective and preventive action
  • Use a data-based approach to help resolve internal quality issues to prevent defects in code shipment
  • Partner with global Infra leadership to execute the Brazil site’s technical roadmap
  • Drive excellence in technical design, reliability, scalability, and efficiency across projects
  • Collaborate cross-functionally with global teams to ensure alignment on architecture, tooling, and standards

Platform Engineering Manager

Metronome is reimagining the core systems that will define the future of our pla...
Location:
United States, New York City; San Francisco Bay Area
Salary:
208000.00 - 260000.00 USD / Year
Metronome
Expiration Date:
Until further notice
Requirements:
  • 6+ years of engineering management experience: Including managing managers or senior/staff ICs and scaling teams through periods of growth and complexity
  • 8+ years of technical experience: You’ve built and operated distributed systems or platforms at scale and understand platform architecture and reliability deeply
  • Proven leadership skills: You’ve led teams through ambiguity, made long-term technical bets, and know how to align execution with business goals
  • Strong organizational and communication skills: You navigate complex stakeholder needs, drive clarity across teams and functions, and communicate tradeoffs to executives
  • Commitment to inclusivity: You build diverse, inclusive teams and lead with empathy, candor, and authenticity
Job Responsibility:
  • Set organizational and technical direction: Shape the roadmap and vision for your team in collaboration with infrastructure, data, and product leadership
  • Advocate for the right long-term system investments: Lead cross-cutting initiatives that enable scale, improve margins, and raise the bar for reliability and testing standards
  • Drive complex cross-team execution: Bring clarity to sequencing, align priorities across stakeholders, and ensure timely, high-quality delivery—making deliberate tradeoffs to balance near-term customer value with long-term platform foundations
  • Manage and grow leaders: Lead senior engineers and emerging managers. Provide career development, performance feedback, and coaching to build the next generation of technical and organizational leaders
  • Recruit and retain top talent: Scale the team by hiring exceptional engineers and managers. Foster a high-trust, high-performance culture that values innovation, collaboration, and operational excellence
What we offer:
  • Excellent medical, dental, vision, and life insurance coverage, including a One Medical membership
  • Paid parental leave
  • FSA (Flexible spending account)
  • Retirement planning - Traditional and ROTH 401(k)
  • Flexible time off
  • Employee assistance program (mental health benefits)
  • Culture where personal growth is highly valued
  • Market-benched equity
  • Incentive pay
  • Comprehensive health benefits
Employment Type:
Full-time

Engineering Manager

As the Engineering Manager for Checkout & Payments (m/f/d), you'll play a vital ...
Location:
Germany, Berlin
Salary:
Not provided
Cherry Ventures
Expiration Date:
Until further notice
Requirements:
  • People leadership experience: Demonstrated experience building psychological safety, coaching engineers, and providing direct, compassionate feedback
  • You have a track record of hiring, developing, and retaining high-performing engineering teams
  • Payments domain expertise: Direct experience building or leading teams that operate payment systems at scale
  • You understand payment provider integrations, transaction reliability, idempotency patterns, and the complexities of processing payments across different methods and markets
  • Strong technical foundation: Solid knowledge of backend systems, microservices architecture, and building for scale
  • You can engage meaningfully in architectural discussions and guide your team toward quality trade-offs
  • Reliability mindset: Experience running high-reliability services with SLIs/SLOs, observability, and incident management practices
  • Communication skills: Ability to translate complex technical challenges into clear business impact for diverse stakeholders
Job Responsibility:
  • Lead & Grow Engineers: Build a high-trust environment where engineers thrive and take ownership
  • You own end-to-end hiring, onboarding, and performance management, accountable for building and continuously improving how we attract talent
  • Ensure every team member has a clear career path and receives regular, actionable feedback
  • Help your team get 1% better every day
  • Drive Product & Business Impact: Partner with Product, Design, and Analytics to shape initiatives that directly impact Flink's revenue and customer experience
  • You'll work on challenges like increasing our Payment Success Rate, enabling customers to shop seamlessly across multiple devices, building internal tools that enable self-service for the products we build, and protecting our customers through fraud prevention initiatives
  • Own Critical Systems: Your team owns the systems that power the checkout experience, process payments, set prices, apply promotions, and present delivery options to customers (partnering with our dispatching teams to surface what's possible)
  • Guide Technical Direction: Shape the technology strategy for Checkout & Payments
  • Ensure your team makes the right technical decisions to deliver high-quality solutions reliably and repeatedly
  • You'll be included in architectural discussions, RFCs, and trade-off decisions, championing reliability, observability, and pragmatic engineering
What we offer:
  • A €1000 annual L&D budget as well as individual coaching options to ensure you have plenty of opportunities to learn, grow and achieve your goals
  • 26 days of vacation, +1 day every year up to a maximum of 30 days
  • A mobility budget of 35 EUR per month for Deutschland Ticket subsidy
  • A cool discount on your Urban Sports Club membership
  • Attractive company pension options
  • Unlimited access to an e-learning and development platform, MyAcademy, including online German courses
  • Online discounts with Corporate Benefits and Future Bens
  • A cool discount off your personal Flink orders; be the first to test out new products!
  • A modern and dog-friendly office in the heart of Berlin - lots of delicious lunch spots available within short walking distance
Employment Type:
Full-time

Engineering Manager, Evidence Local

At Axon, our mission is to protect life and preserve truth. As an Engineering Ma...
Location:
Vietnam, Ho Chi Minh City
Salary:
Not provided
Axon
Expiration Date:
Until further notice
Requirements:
  • 6+ years of professional software engineering experience (ideally in product-focused environments)
  • 3+ years as an engineering manager/lead, including direct people management and development
  • Demonstrated experience in architectural design and distributed systems, paired with a deep focus on resiliency engineering, operational excellence, and architectural vision
  • Proven success leading teams to deliver complex, ambiguous projects on time and to spec
  • Demonstrated ability to improve engineering processes across security, quality, releases, and incident management
  • Excellent communication and collaboration skills; comfortable working with global teams
Job Responsibility:
  • Build and lead a high-performance engineering team in Ho Chi Minh City, developing talent, mentoring future leaders, and creating an environment where AI-enabled engineering excellence thrives
  • Drive end-to-end technical execution for major product initiatives, roadmap-defining features, and contract-critical capabilities - from large-scale deployments and advanced security to device interoperability and cloud/on-prem hybrid operational workflows
  • Drive the scalability, resilience, and reliability of a mission-critical platform, ensuring it performs flawlessly from small single-server agencies to complex, multi-node, large-scale environments supporting thousands of devices and real time capabilities
  • Collaborate deeply across Axon’s global engineering organization, advocating for the needs of AEL, influencing cross-team architecture, and driving alignment on shared outcomes
  • Partner with product and global cross-functional stakeholders to define the roadmap, priorities, and key customer requirements, balancing feature development with operational excellence
  • Oversee operational and on-call processes, own production health, escalations, incident response, and post-incident reviews
  • Champion a deep understanding of customer workflows, pain points, and operational environments, ensuring engineering decisions are grounded in real-world use cases
What we offer:
  • Medical and dental insurance, with coverage for up to 3 family members
  • Vision Insurance
  • Robust Paid Time Off policy
  • Bonuses
  • Lunch allowance
  • Cell phone stipend
  • Free LinkedIn Learning/Udemy account
  • Access to 24/7 online emotional and mental support
  • Gym membership
  • Free parking
Employment Type:
Full-time