
Software Reliability Engineer

United Kingdom, North West 45000.00 - 55000.00 GBP / Year · Job Posted May 05, 2026

Job Description

This role is built for engineers who see software through a systems lens rather than treating individual services in isolation. You'll write production-grade code daily: not occasional scripts or one-off tooling, but real application code that ships and runs at scale. You'll be embedded directly within teams as an engineer who brings deep knowledge of using software to solve problems.

The work centers on designing, building, and maintaining cloud-native systems where low latency, security, and scalability are non-negotiable. Reliability isn't something bolted on after features are complete; it's woven into the architecture and code from the start. You'll think in terms of SLAs and SLOs as natural constraints that shape how systems are built, and you'll engineer solutions that prevent incidents rather than just responding to them. Monitoring, observability, and metrics aren't afterthoughts; they're critical. The tooling in play is OpenTelemetry, Grafana, Splunk, and PagerDuty.

The hybrid setup, based in the North West, is designed to balance collaborative work with uninterrupted focus time for deep engineering. You'll spend time working closely with teams to think through system design, then break away to deliver high-impact code. You will elevate the teams you join, bringing a reliability mindset into everyday engineering practices while contributing directly to application codebases.

They're looking for engineers with strong software development backgrounds: people comfortable with modern programming languages like Python, JavaScript, or Go, and open to picking up new tools as needed. Experience with distributed or cloud-native systems matters, as does a proactive approach to performance and system health. If you're someone who builds, ships, and genuinely cares about reliability at a systems level, this role is an opportunity to work on meaningful, high-impact infrastructure within a quality-focused engineering culture.

Job Responsibility

  • Designing, building, and maintaining cloud-native systems where low latency, security, and scalability are non-negotiable
  • Engineering solutions that prevent incidents rather than just responding to them
  • Writing production-grade code daily
  • Thinking in terms of SLAs and SLOs
  • Monitoring, observability, and metrics using OpenTelemetry, Grafana, Splunk, and PagerDuty

Requirements

  • Strong software development background
  • Comfortable with modern programming languages like Python, JavaScript, or Go
  • Open to picking up new tools as needed
  • Experience with distributed or cloud-native systems
  • Proactive approach to performance and system health


Similar Jobs for Software Reliability Engineer

8 matching positions

Software Reliability Engineer

This role improves and protects software and systems supporting IT services by m...
Location: United States, Atlanta
Salary: 83900.00 - 151200.00 USD / Year
Company: T-Mobile (https://www.t-mobile.com)
Expiration Date: Until further notice
Requirements
  • Legally authorized to work in the United States
  • At least 18 years of age
  • Bachelor's Degree plus 2 years of related work experience OR combination of education and experience deemed equivalent (Required)
  • 2-4 years of relevant experience (Preferred)
  • Experience working in an Agile and DevOps environment (Preferred)
  • Experience in one or more of: C, C#, Java, Perl, Python, Go, or scripting experience in Shell and Perl (Preferred)
  • Experience with Continuous Integration/Continuous Delivery tools such as Jenkins, CloudBees, etc., and other automation tools (Preferred)
  • Experience with DevOps tools such as Ansible, Chef, Puppet, etc.; experience with Docker, Kubernetes, etc. is preferable (Preferred)
  • Experience with an APM tool such as AppDynamics and a logging tool such as Splunk (Preferred)
  • Experience working in a cloud environment (public/private) (Preferred)
Job Responsibility
  • Apply DevOps automation tools to manage CI/CD pipelines and configuration for production and non-production environments
  • Perform environment management and automated server provisioning to support scalable infrastructure
  • Deliver software improvements that improve availability, scalability, latency, and efficiency of IT services
  • Create and manage dashboards, alerts, logging standards, and health checks to improve service quality, supportability, and visibility across services
  • Contribute to software delivery process improvements including cloud enablement, containerization, and deployment automation
  • Support cloud-native applications, APIs, microservices, and platform operations across production and non-production environments
  • Troubleshoot production incidents, participate in root cause analysis, and support implementation of long-term reliability improvements with assistance from leadership and senior technical team members
  • Partner with Software Engineering, DevOps, and platform teams to improve application resiliency, scalability, and deployment automation under established technical direction
  • Contribute to operational readiness activities, including release validation, capacity planning, disaster recovery support, and environment support, under the guidance of senior leadership
  • Participate in Agile ceremonies, production support activities, and continuous improvement initiatives
What we offer
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Medical insurance
  • Dental insurance
  • Vision insurance
  • Flexible spending account
  • Paid time off
  • Up to 12 paid holidays
  • Fulltime

Senior Software Engineer / Principal Software Engineer - Copilot CLI

Within GitHub and Microsoft CoreAI, the Copilot CLI team builds GitHub's coding ...
Location: United States, Redmond
Salary: 119800.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years
Job Responsibility
  • Take ownership of critical product and platform areas of the Copilot CLI and shared agent runtime
  • Set a high technical and quality bar for agentic systems and developer-facing tooling
  • Design and ship performant, reliable terminal experiences that developers depend on for daily work
  • Use data, benchmarks, and direct user feedback to guide iteration and investment
  • Collaborate across org boundaries to enable other teams to build agentic products on top of a shared foundation
  • Influence architecture, technical direction, and engineering standards beyond your immediate team
What we offer
  • Certain roles may be eligible for benefits and other compensation
  • Fulltime

Backend Software Engineer / Senior Software Engineer - Kusto

Are you excited by the challenge of redefining how people explore and analyze ma...
Location: Israel, Tel Aviv, Herzliya
Salary: Not provided
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • 4+ years of technical engineering experience with coding in languages including, but not limited to, C#, Python or Java
  • 2+ years building and running services in a cloud environment (Azure, AWS, or GCP)
  • Experience in designing and operating large-scale distributed systems with high availability and reliability
Job Responsibility
  • Design, develop, and improve cloud-native services that are scalable, secure, and easy to operate
  • Drive architectural decisions and lead the development of major components in a distributed, high-SLA system
  • Collaborate with cross-functional teams in ILDC and abroad to deliver end-to-end solutions
  • Conduct code and design reviews and mentor junior engineers to grow technical excellence across the team
  • Help shape the future of real-time analytics in Microsoft Fabric RTI, with customer impact as your north star
  • Fulltime

Senior Software Engineer and Software Engineer II

OneDrive and SharePoint are rapidly growing services at the center of Microsoft'...
Location: United States, Redmond
Salary: 100600.00 - 199000.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience related to cloud-scale distributed design and patterns
  • The ability to deliver informed designs and plans ahead of production and execution
  • Knowledge of others' expertise and the ability to involve multiple players (within and outside the organization) in the creation or development of novel products, processes, or research streams
Job Responsibility
  • Design and deliver systems that enable partners and ISVs to migrate from other cloud providers, improve core systems performance and efficiencies, and ensure zero customer impact throughout the change management cycle
  • Deliver systems to meet our business continuity planning goals, provide telemetry for optimizing the service, and drive down our response time for detecting and resolving service issues
  • Create, implement, optimize, debug, refactor, and reuse code to establish and improve performance and maintainability, effectiveness, and return on investment (ROI)
  • Contribute to the identification of dependencies and the development of design documents for a product area with little oversight
  • Help identify other teams and technologies that will be leveraged, how they will interact, and when one's system may provide support to others
  • Contribute to determining back-end dependencies associated with product, application, service, or platform functionality for product features
  • Understand downstream effects of solutions and work provided
  • Help identify areas of dependency and overlap with other teams or team members and drive coordination
  • Remain current in skills by investing time and effort into staying abreast of current developments that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale
  • Review work items to deepen knowledge of product features in partnership with appropriate stakeholders (e.g., project managers) and execute project plans, release plans, and work items
  • Fulltime

Software Engineer II/Sr. Software Engineer

Join Microsoft’s Core AI team and help shape the future of intelligent software ...
Location: United States, Redmond
Salary: 100600.00 - 199000.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
Job Responsibility
  • Design and ship AI-assisted features in Visual Studio that help developers generate, explain, and refactor code—measured by adoption, reliability, and user satisfaction
  • Bring intelligence into the IDE by integrating GitHub Copilot/MCP tools into core IDE workflows with strong attention to performance, privacy, and safety-by-default
  • Collaborate with partner teams across Microsoft and GitHub to deliver secure, performant solutions and iterate quickly based on real developer feedback
  • Contribute to designs (APIs, data flows, extensibility points) and participate in code/design reviews to maintain quality and scalability for a large codebase
  • Instrument and learn using telemetry, experimentation, and diagnostics to improve latency, reliability, and relevance over time
  • Fulltime

Software Engineer 2 / Senior Software Engineer - Azure Data

Microsoft's Azure Data engineering team is leading the transformation of analyti...
Location: India, Bangalore
Salary: Not provided
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 3+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience with the Azure stack including Storage, Compute, Networking, Fabric, Purview, Synapse, AKS, DevOps, Data Factory, or Power BI
  • Experience with big data technologies such as Spark, Kafka, Hadoop, or HBase
  • Experience building data lake or data engineering products, tools, or pipelines
  • Familiarity with container-based architectures (Docker, Kubernetes)
  • Ability to debug complex distributed systems on Linux and/or Windows platforms
Job Responsibility
  • Write extensible, maintainable code in C#, Java, Scala, or Python for Fabric Materialized Lake View services and HDInsight components
  • Use AI tools and coding best practices across the development lifecycle
  • Design data refresh, scheduling, and query optimisation features with minimal supervision
  • Review code from teammates for correctness, test coverage, security risks, and adherence to team standards
  • Coach junior engineers through code reviews
  • Debug complex issues in distributed systems running on Azure, Linux, and Windows
  • Run live site operations on a rotational, on-call basis
  • Integrate logging and instrumentation to gather telemetry on system health, performance, reliability, and security
  • Work with product managers, technical leads, and partners across geographies to define customer requirements for Materialized Lake View features
  • Fulltime

Software Engineer II & Senior Software Engineer

Security represents the most critical priorities for our customers in a world aw...
Location: United States, Redmond
Salary: 100600.00 - 199000.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including C, C++, C#, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check
  • Master's Degree in Computer Science or related technical field AND 3+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 5+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience troubleshooting and optimizing automation, reliability, and monitoring for Live Site as part of an on-call rotation owned by the engineering team
  • Experience with distributed systems, messaging systems such as Kafka, and large-scale system design
Job Responsibility
  • Lead the architecture, design and implementation of services for extremely high scale, throughput, durability, and low latency
  • Innovate and make service deployment and maintenance an efficient well-oiled machine that provides excellent reliability with minimal manual engineer intervention
  • Conduct in-depth triage, troubleshooting, and forensics across all facets of the cloud stack while executing corrective action and continual service improvement processes
  • Drive Infrastructure security improvements for mission critical high scale workloads
  • Lead the definition of requirements, KPIs, priorities and planning of engineering deliverables
  • Mentor and grow an energetic, diverse, and driven team with a good mix of senior and mid-level engineers
  • Fulltime

Senior Software Engineer and Principal Software Engineer - PowerPoint AI Team

The PowerPoint team is embarking on an exciting new chapter - evolving a product...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • 8+ years of experience in backend service engineering, including work on high-scale infrastructures
  • Proficiency in one or more systems programming languages such as C#, C++
  • 1+ years of experience in software engineering, designing and developing systems (and APIs) that deploy and integrate with AI models
  • 2+ years of experience working with rich telemetry, making data driven decisions, and carrying out rapid experimentation
  • 2+ years of experience building software for scale, performance, and reliability
  • Academic or industry experience with building, fine-tuning, or deploying models (any category), or with building eval-driven systems that use them
Job Responsibility
  • Lead design and delivery of complex, scalable AI features ensuring resilience and exceptional user experience
  • Drive technical strategy and architecture decisions across multiple services, influencing partner teams and aligning with compliance and security requirements
  • Champion modern engineering practices, including AI-driven approaches, automation, and cloud-native patterns, across the full development lifecycle
  • Mentor and guide engineers, fostering technical excellence and continuous improvement in security, reliability, and performance
  • Collaborate cross-org to solve challenging technical problems, streamline processes, and reduce operational costs while improving live-site health
  • Design and implement scalable backend services optimized for machine learning workflows and large language model integration
  • Develop and maintain evaluation-driven systems that leverage text and multimodal inputs (e.g., images) to power visual-creation experiences
  • Build and optimize APIs and infrastructure to support high-performance model inference and experimentation at scale
  • Collaborate with product, ML, and design teams to integrate models into user-facing features, ensuring seamless functionality and performance
  • Conduct model evaluations and experiments, analyze results, and iterate on improvements to enhance accuracy and user experience
  • Fulltime