Software Engineer - Reliability

United Kingdom, North West · 45000.00 - 55000.00 GBP / Year · Job Posted April 24, 2026

Job Description

You're a software engineer who enjoys solving complex engineering problems. Your default is to engineer reliable software that contributes to the overall performance of systems. This opportunity offers exactly that challenge. You will sit at the intersection of software engineering and platform reliability, giving you the chance to code solutions that ensure critical systems run smoothly, efficiently and continuously.

You will design and build internal tools that help development and operations teams run systems at scale. Using languages such as Python, Golang or JavaScript, you will develop automation, monitoring and performance solutions that reduce manual effort and increase operational efficiency. This is an opportunity to work with modern technologies across the full software development lifecycle, from development through automated pipelines into cloud-native container environments. You will apply engineering principles to reliability challenges, creating meaningful improvements to systems that must operate at high speed and maintain near-perfect uptime.

Collaboration is central to how the teams work. You will partner with platform engineers, developers and operations specialists to solve problems and implement solutions that improve the stability and scalability of the organisation’s most critical systems. This environment encourages engineers to think creatively, contribute ideas and continuously improve the way technology supports the business.

In return, you will join a team that values engineering excellence and invests in its people. The role offers a strong benefits package including a generous pension and holiday, a bonus and free gym membership. With hybrid working available, you will also have the flexibility to balance focus time with valuable collaboration across teams.

For engineers who want to move beyond writing features and instead build the systems that keep an entire platform running reliably at scale, this role provides the perfect next step.

Job Responsibility

  • Design and build internal tools for development and operations teams
  • Develop automation, monitoring and performance solutions
  • Apply engineering principles to reliability challenges
  • Partner with platform engineers, developers and operations specialists to improve system stability and scalability

Requirements

  • Experience in software engineering
  • Proficiency in Python, Golang or JavaScript
  • Experience with automation, monitoring and performance solutions
  • Knowledge of cloud-native container environments
  • Understanding of full software development lifecycle
  • Collaboration with platform engineers, developers and operations specialists

What we offer

  • Generous pension
  • Holiday
  • Bonus
  • Free gym membership
  • Hybrid working

Similar Jobs for Software Engineer - Reliability

8 matching positions

Software Engineer - Reliability

We are looking for a hands-on, first-principles engineer who is fluent in Linux,...
Location: United States, Palo Alto
Salary: 170000.00 - 360000.00 USD / Year
Company: Luma AI (lumalabs.ai)
Expiration Date: Until further notice
Requirements
  • 8+ years of experience as an SRE, production engineer, or infrastructure engineer in a fast-paced, large-scale environment
  • Deep, hands-on expertise in Linux, containerized systems, and debugging low-level system performance
  • Strong experience with providers like AWS or OCI
  • Thrive on solving complex, low-level problems where hardware and software intersect
  • Energetic and thrive in a less structured, fast-paced environment
  • Working knowledge of security best practices and familiarity with compliance frameworks, such as SOC 2 and ISO
  • Practical experience with InfiniBand, RDMA, or RoCE and an understanding of how to optimize throughput for massive distributed training jobs
Job Responsibility
  • Architect for Reliability & Scale: Participate in critical re-architecture sessions to redesign our systems for higher efficiency and scale
  • Own Multi-Cloud GPU Clusters: Take end-to-end ownership of our production clusters for training and inference across AWS and OCI, ensuring high availability and peak performance
  • Drive Security & Compliance: Assist in achieving and maintaining security certifications (SOC 2 Type 1 & 2, ISO standards) by implementing robust infrastructure security practices
  • Deep Linux Performance Tuning: Use your mastery of Linux systems to troubleshoot and optimize performance at the OS and kernel level
  • Build Robust Automation: Write high-quality tools and automation in Python, Go, or Bash to manage, monitor, and heal our infrastructure
  • Debug Complex Hardware/Software Failures: Serve as the final escalation point for the most challenging GPU, networking (InfiniBand/RDMA), and system-level issues
Employment type: Full-time

Senior Software Engineer / Principal Software Engineer - Copilot CLI

Within GitHub and Microsoft CoreAI, the Copilot CLI team builds GitHub's coding ...
Location: United States, Redmond
Salary: 119800.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years
Job Responsibility
  • Take ownership of critical product and platform areas of the Copilot CLI and shared agent runtime
  • Set a high technical and quality bar for agentic systems and developer-facing tooling
  • Design and ship performant, reliable terminal experiences that developers depend on for daily work
  • Use data, benchmarks, and direct user feedback to guide iteration and investment
  • Collaborate across org boundaries to enable other teams to build agentic products on top of a shared foundation
  • Influence architecture, technical direction, and engineering standards beyond your immediate team
What we offer
  • Certain roles may be eligible for benefits and other compensation
Employment type: Full-time

Backend Software Engineer / Senior Software Engineer- Kusto

Are you excited by the challenge of redefining how people explore and analyze ma...
Location: Israel, Tel Aviv, Herzliya
Salary: Not provided
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • 4+ years of technical engineering experience with coding in languages including, but not limited to, C#, Python or Java
  • 2+ years building and running services in a cloud environment (Azure, AWS, or GCP)
  • Experience in designing and operating large-scale distributed systems with high availability and reliability
Job Responsibility
  • Design, develop, and improve cloud-native services that are scalable, secure, and easy to operate
  • Drive architectural decisions and lead the development of major components in a distributed, high-SLA system
  • Collaborate with cross-functional teams in ILDC and abroad to deliver end-to-end solutions
  • Conduct code and design reviews and mentor junior engineers to grow technical excellence across the team
  • Help shape the future of real-time analytics in Microsoft Fabric RTI, with customer impact as your north star
Employment type: Full-time

Senior Software Engineer and Software Engineer II

OneDrive and SharePoint are rapidly growing services at the center of Microsoft'...
Location: United States, Redmond
Salary: 100600.00 - 199000.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience related to cloud-scale distributed design and patterns
  • The ability to deliver informed designs and plans ahead of production and execution
  • Knowledge of others' expertise and the ability to involve multiple players (within and outside the organization) in the creation or development of novel products, processes, or research streams
Job Responsibility
  • Design and deliver systems that enable partners and ISVs to migrate from other cloud providers, improve core systems performance and efficiencies, and ensure zero customer impact throughout the change management cycle
  • Deliver systems to meet our business continuity planning goals, provide telemetry for optimizing the service, and drive down our response time for detecting and resolving service issues
  • Create, implement, optimize, debug, refactor, and reuse code to establish and improve performance, maintainability, effectiveness, and return on investment (ROI)
  • Contribute to the identification of dependencies and the development of design documents for a product area with little oversight
  • Help to identify other teams and technologies that will be leveraged, how they will interact, and when one's system may provide support to others
  • Contribute to determining back-end dependencies associated with product, application, service, or platform functionality for product features
  • Understand downstream effects of solutions and work provided
  • Help to identify areas of dependency and overlap with other teams or team members and drive coordination
  • Remain current in skills by investing time and effort into staying abreast of current developments that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale
  • Review work items to deepen knowledge of product features in partnership with appropriate stakeholders (e.g., project managers) and execute project plans, release plans, and work items
Employment type: Full-time

Software Engineer II/Sr. Software Engineer

Join Microsoft’s Core AI team and help shape the future of intelligent software ...
Location: United States, Redmond
Salary: 100600.00 - 199000.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
Job Responsibility
  • Design and ship AI-assisted features in Visual Studio that help developers generate, explain, and refactor code—measured by adoption, reliability, and user satisfaction
  • Bring intelligence into the IDE by integrating GitHub Copilot/MCP tools into core IDE workflows with strong attention to performance, privacy, and safety-by-default
  • Collaborate with partner teams across Microsoft and GitHub to deliver secure, performant solutions and iterate quickly based on real developer feedback
  • Contribute to designs (APIs, data flows, extensibility points) and participate in code/design reviews to maintain quality and scalability for a large codebase
  • Instrument and learn using telemetry, experimentation, and diagnostics to improve latency, reliability, and relevance over time
Employment type: Full-time

Software engineer 2 / Senior Software engineer - Azure Data

Microsoft's Azure Data engineering team is leading the transformation of analyti...
Location: India, Bangalore
Salary: Not provided
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 3+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience with the Azure stack including Storage, Compute, Networking, Fabric, Purview, Synapse, AKS, DevOps, Data Factory, or Power BI
  • Experience with big data technologies such as Spark, Kafka, Hadoop, or HBase
  • Experience building data lake or data engineering products, tools, or pipelines
  • Familiarity with container-based architectures (Docker, Kubernetes)
  • Ability to debug complex distributed systems on Linux and/or Windows platforms
Job Responsibility
  • Write extensible, maintainable code in C#, Java, Scala, or Python for Fabric Materialized Lake View services and HDInsight components
  • Use AI tools and coding best practices across the development lifecycle
  • Design data refresh, scheduling, and query optimisation features with minimal supervision
  • Review code from teammates for correctness, test coverage, security risks, and adherence to team standards
  • Coach junior engineers through code reviews
  • Debug complex issues in distributed systems running on Azure, Linux, and Windows
  • Run live site operations on a rotational, on-call basis
  • Integrate logging and instrumentation to gather telemetry on system health, performance, reliability, and security
  • Work with product managers, technical leads, and partners across geographies to define customer requirements for Materialized Lake View features
Employment type: Full-time

Software Engineer II & Senior Software Engineer

Security represents the most critical priorities for our customers in a world aw...
Location: United States, Redmond
Salary: 100600.00 - 199000.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including C, C++, C#, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check
  • Master's Degree in Computer Science or related technical field AND 3+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 5+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience troubleshooting and optimizing automation, reliability, and monitoring for Live Site as part of an on-call rotation owned by the engineering team
  • Experience with distributed systems, messaging systems such as Kafka, and large-scale system design
Job Responsibility
  • Lead the architecture, design and implementation of services for extremely high scale, throughput, durability, and low latency
  • Innovate and make service deployment and maintenance an efficient well-oiled machine that provides excellent reliability with minimal manual engineer intervention
  • Conduct in-depth triage, troubleshooting, and forensics across all facets of the cloud stack while executing corrective action and continual service improvement processes
  • Drive Infrastructure security improvements for mission critical high scale workloads
  • Lead the definition of requirements, KPIs, priorities and planning of engineering deliverables
  • Mentor and grow an energetic, diverse, and driven team with a good mix of senior and mid-level engineers
Employment type: Full-time

Senior Software Engineer and Principal Software Engineer - Power Point AI Team

The PowerPoint team is embarking on an exciting new chapter - evolving a product...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • 8+ years of experience in backend service engineering, including work on high-scale infrastructures
  • Proficiency in one or more systems programming languages such as C#, C++
  • 1+ years of experience in software engineering, designing and developing systems (and APIs) that deploy and integrate with AI models
  • 2+ years of experience working with rich telemetry, making data driven decisions, and carrying out rapid experimentation
  • 2+ years of experience building software for scale, performance, and reliability
  • Academic or industry experience building, fine-tuning, or deploying models (any category), or building eval-driven systems that use them
Job Responsibility
  • Lead design and delivery of complex, scalable AI features ensuring resilience and exceptional user experience
  • Drive technical strategy and architecture decisions across multiple services, influencing partner teams and aligning with compliance and security requirements
  • Champion modern engineering practices, including AI-driven approaches, automation, and cloud-native patterns, across the full development lifecycle
  • Mentor and guide engineers, fostering technical excellence and continuous improvement in security, reliability, and performance
  • Collaborate cross-org to solve challenging technical problems, streamline processes, and reduce operational costs while improving live-site health
  • Design and implement scalable backend services optimized for machine learning workflows and large language model integration
  • Develop and maintain evaluation-driven systems that leverage text and multimodal inputs (e.g., images) to power visual-creation experiences
  • Build and optimize APIs and infrastructure to support high-performance model inference and experimentation at scale
  • Collaborate with product, ML, and design teams to integrate models into user-facing features, ensuring seamless functionality and performance
  • Conduct model evaluations and experiments, analyze results, and iterate on improvements to enhance accuracy and user experience
Employment type: Full-time