
Senior AWS DevOps Engineer (Test & Infrastructure Resilience)

United Kingdom, City of London · Job Posted April 27, 2026

Job Description

A top-tier consultancy firm is looking for an experienced AWS DevOps Engineer with a strong background in testing to help maintain and evolve Critical National Infrastructure (CNI) for a top UK client. This is a pivotal role in which scalability and resilience are essential.

Location: Remote Working | 6-Month Contract | £400 to £600 a day, inside IR35

Job Responsibility

  • Develop and structure a comprehensive test framework
  • Create approaches and plans, and gain approval for them
  • Deliver a complex set of functional and performance tests for software components running within an AWS multi-account model
  • Work alongside client teams and support in a 3rd line capacity as required

Requirements

  • Strong background in testing (including APIs)
  • Self-motivated and results-driven
  • Capable of thriving in an Agile team
  • Comfortable with Confluence for documentation and Jira for project tracking and story management
  • Must be eligible for SC Clearance
  • Experience with AWS services (Compute, Identity)
  • Experience with Vault, Consul, Kubernetes, Prometheus, ELK, Jenkins, Python, Ansible, and/or Bash
  • Networking experience with proxies and firewalls, including Fortinet and Palo Alto, is an advantage

Nice to have

  • Networking experience with proxies and firewalls, including Fortinet and Palo Alto

Similar Jobs for Senior AWS DevOps Engineer (Test & Infrastructure Resilience)

8 matching positions

Senior DevOps Engineer

Riverstone Enterprise Solutions, an Envision Innovative Solutions Company, deliv...
Location
United States, Annapolis Junction
Salary
170000.00 - 190000.00 USD / Year
Riverstone Enterprise Solutions
Expiration Date
Until further notice
Requirements
  • Bachelor's degree or higher in Engineering (e.g., Computer, Electrical, Mechanical, Aerospace) or Computer Science with a minimum of five (5) years of related experience
  • Five (5) years of additional DevOps experience may be substituted for a bachelor's degree
  • Must be fluent with Git
  • Strong knowledge of Linux and Linux environments (RHEL 6/7/8, RHCSA/RHCE, CentOS)
  • Experience with Windows system administration, system monitoring, instrumentation, resiliency and performance
  • Experience integrating Jenkins/Bamboo, Docker, and Kubernetes for automated deployment preferred
  • Experience with caching technologies (Memcache, Active MQ, Redis, APC, etc.)
  • Experience with MySQL (Clusters, Replication, and Tuning) and Elasticsearch (Kibana a plus)
  • Knowledge of security practices, networking protocols, firewalls, PCI compliance etc.
  • Experience managing/monitoring AWS cloud and virtualized servers for optimal performance while working in a Platform as a Service (PaaS) environment
Job Responsibility
  • Support the development life cycle of platform architectural design, deployment and debugging
  • Develop and maintain sound, best-practice-based version control and CM systems (Git), including branching and merging strategies
  • Serve as a technical lead for an Agile team and actively participate in all Agile ceremonies
  • Automate release deployments across development, test, staging, QA and production stacks using scripting languages and automation toolkits
  • Set up new sites and applications via configuration management such as Puppet and Ansible
  • Maintain/upgrade/patch tracking and documentation software (Confluence/Jira)
  • Create, assist with, and implement the design and maintenance of web service infrastructure and deployment
  • Leverage programming languages such as Python, Ruby, Perl, and Java
  • Proficient with DevOps or Site Reliability Engineering methodologies
  • Proficient automating network infrastructure configuration using Software Defined Networking
  • Fulltime

Senior Devops Engineer- Assistant Vice President

Join a world-class technology team at the heart of global finance. The Global Cu...
Location
India, Pune
Salary
Not provided
Citi
Expiration Date
Until further notice
Requirements
  • Deep, practical experience with Docker and Kubernetes for deploying and managing enterprise-scale applications
  • Hands-on proficiency with tools like Terraform or Ansible
  • Proven experience designing and maintaining sophisticated CI/CD pipelines using tools like Jenkins or TeamCity
  • Strong experience with monitoring and logging stacks such as Prometheus, Grafana, or ELK to ensure system health and performance
  • Solid understanding of cloud-native architecture and experience deploying applications on platforms like OpenShift, AWS, Azure, or GCP
  • Proficiency in Java (especially with frameworks like Spring Boot) and/or Python
  • Hands-on experience with the configuration, administration, and troubleshooting of messaging technologies such as IBM MQ, RabbitMQ, or Apache Kafka
  • Strong background in administering IBM WebSphere Application Server (WAS), including clustering and admin scripting
  • Experience with relational and/or NoSQL databases (e.g., Oracle, PostgreSQL, MongoDB)
  • Strong background in Linux/Unix administration and shell scripting
Job Responsibility
  • Design, implement, and manage robust, scalable, and secure application systems in coordination with the global technology team
  • Develop and maintain resilient CI/CD pipelines to automate builds, testing, and deployments, ensuring rapid and reliable delivery
  • Automate infrastructure provisioning and configuration management using Infrastructure as Code (IaC) principles and tools
  • Architect and manage containerized applications using Docker and Kubernetes on private and public cloud platforms (OpenShift, AWS, Azure, GCP)
  • Implement and refine observability strategies using industry-standard monitoring, logging, and tracing tools (e.g., Prometheus, Grafana, ELK)
  • Analyze and tune application performance, troubleshoot complex issues in distributed systems, and ensure high availability in an always-on service environment
  • Collaborate with cross-functional teams to integrate security best practices throughout the development lifecycle (DevSecOps)
  • Fulltime

Senior DevOps Engineer, AI

LogicMonitor® is the AI-first hybrid observability platform powering the next ge...
Location
India, Pune
Salary
Not provided
LogicMonitor
Expiration Date
Until further notice
Requirements
  • 4+ years of experience in DevOps or similar roles
  • Proven experience with AWS (preferred) and GCP in production environments
  • Strong expertise in Infrastructure as Code practices
  • Solid knowledge of Kubernetes (EKS), container orchestration, and cluster security
  • Hands-on experience with Grafana, Prometheus, and alerting/monitoring systems
  • Understanding of network connectivity over PrivateLink endpoints, VPCs, and cross-account VPC connectivity, including how to make services accessible internally and externally
  • Experience deploying automated canary and integration testing pipelines, CI/CD pipelines, etc.
  • Experience exposing internal self-hosted services such as LangFuse via a web UI for internal users, using Traefik, an Ingress controller, or a similar tool
  • Experience deploying LLM-related solutions that require MCP, LangFuse, Airflow, GraphDB, VectorDB, Redis, etc.
  • Experience working with developers on on-demand JIT access to Prod clusters to troubleshoot and debug issues, using tools like Teleport or similar
Job Responsibility
  • Multi-Cloud Enablement: Expand and manage application hosting across AWS and Google Cloud, ensuring performance, flexibility, and resilience
  • Infrastructure as Code (IaC): Develop and maintain Terraform or similar installers for Azure and GCP to fully automate infrastructure deployments
  • Cost Optimization: Design and implement AWS cost optimization strategies, including reserved instances, right-sizing, and resource efficiency initiatives
  • Cloud Security: Strengthen infrastructure security with robust access controls, encryption, monitoring, and alerting frameworks
  • Observability: Build and enhance monitoring platforms with Grafana dashboards and Prometheus alerts for real-time performance insights and proactive issue resolution
  • Kubernetes Management: Implement Role-Based Access Control (RBAC) and optimize Ingress controllers (Traefik or similar) for enhanced security and delivery resilience
  • Automation & Scripting: Create Python and Bash scripts to automate repetitive tasks, streamline workflows, and improve operational efficiency

Senior Software Engineer – AWS Developer

We’re looking for a Senior Software Engineer (AWS Developer) to lead the design ...
Location
United States, San Diego
Salary
Not provided
ResMed
Expiration Date
Until further notice
Requirements
  • 5+ years of professional software development experience
  • Significant hands-on work in AWS-based production systems
  • Strong proficiency in Python with deep understanding of object-oriented design, clean code principles, and design patterns
  • Expertise with AWS services, especially serverless and cloud-native architectures, including several of: Lambda, API Gateway, DynamoDB, S3, SQS/SNS, EventBridge, CloudWatch, CloudFront, RDS/Aurora, and IAM
  • Solid experience with infrastructure-as-code (e.g., Terraform, CloudFormation, CDK) and multi-environment deployments
  • Strong grasp of RESTful API design, authentication/authorization mechanisms (OAuth2, JWT), and microservices / event-driven architectures
  • Practical experience designing and optimizing data models for both NoSQL (e.g., DynamoDB, MongoDB) and relational databases (e.g., PostgreSQL, MySQL)
  • Experience with DevOps practices: CI/CD (e.g., GitHub Actions, CodePipeline), Git workflows, Docker, and monitoring/observability tools (e.g., CloudWatch, Datadog)
  • Deep understanding of software testing strategies (unit, integration, contract, and end-to-end testing) and how to embed them into pipelines (e.g., Cypress or similar)
  • Strong communication skills, a collaborative mindset, and a track record of influencing technical direction, aligning stakeholders, and mentoring other engineers
Job Responsibility
  • Lead the design, development, testing, and operation of cloud-native software systems that are reliable, scalable, secure, and cost-effective
  • Own end-to-end architecture for services and features on AWS, making informed tradeoffs between serverless, containers, data stores, and integration patterns
  • Collaborate closely with engineers, product managers, designers, and architects to translate complex requirements into clear technical designs and implementation plans
  • Set the bar for code quality, testing, and engineering practices; write clean, maintainable, well-tested code and help others do the same
  • Conduct and drive code and design reviews, provide constructive feedback, and foster a culture of technical excellence and continuous improvement
  • Investigate and resolve complex production issues, performance bottlenecks, and reliability problems across multiple services and components
  • Shape and evolve our CI/CD pipelines, deployment strategies, and observability (logging, metrics, tracing, alerting) to improve developer productivity and system resilience
  • Mentor and coach associate and mid-level engineers, supporting their growth through pairing, feedback, and knowledge sharing
  • Contribute to and influence technical roadmaps, standards, and best practices for our AWS usage and overall system architecture
  • Fulltime

Senior Platform Engineer

This is an exciting opportunity to join a fast-growing, venture-backed technolog...
Location
United Kingdom, Cambridge
Salary
70000.00 - 100000.00 GBP / Year
Signify Technology
Expiration Date
Until further notice
Requirements
  • Proven experience designing, automating, and maintaining AWS infrastructure (e.g. EKS, RDS, EC2, CloudFront, VPC, IAM, Security Hub)
  • Hands-on experience building IaC pipelines using Terraform, integrated with CI/CD tools (e.g. GitHub Actions, GitLab CI, Jenkins, or AWS CodePipeline)
  • Strong knowledge of Kubernetes operations on AWS, including scaling, deployment automation, and monitoring
  • Solid foundation in Linux systems administration, networking, and cloud security principles
  • Experience with observability tooling (e.g. Prometheus, Grafana, Loki) and structured alerting practices
  • Experience managing databases, including migrations, high availability setups, backups, and disaster recovery strategies
  • Strong scripting and automation skills (e.g. Terraform, Python, Bash)
  • Excellent communication and collaboration skills, with a focus on improving engineering efficiency through automation and standardisation
Job Responsibility
  • Design, build, and maintain scalable cloud infrastructure using AWS services such as EC2, EKS, RDS/Aurora, ElastiCache, OpenSearch, and CloudFront
  • Drive the development and adoption of Kubernetes (EKS) for managing both production and internal workloads
  • Architect and implement Infrastructure-as-Code (IaC) pipelines, integrating tools such as Terraform into CI/CD workflows for provisioning, validation, and testing
  • Implement and improve zero-downtime deployment strategies (e.g. blue/green, rolling, canary), including automated rollback and recovery
  • Continuously enhance platform resilience by removing single points of failure and improving autoscaling and high availability
  • Collaborate with SRE, Security, and Engineering teams to strengthen observability, monitoring, and alerting using tools like Prometheus, Grafana, and CloudWatch
  • Partner with Security to embed best practices across IAM, secrets management, web application firewalls, and posture management
  • Optimise infrastructure performance and cloud spend through automation and cost visibility tooling
  • Participate in on-call rotations, incident reviews, and ongoing reliability improvements
  • Fulltime

Senior Platform Engineer

We are seeking a foundational Site Reliability Engineer to join our Device Insur...
Location
United States, Bellevue; Atlanta; Overland Park; Frisco
Salary
107300.00 - 193500.00 USD / Year
T-Mobile
Expiration Date
Until further notice
Requirements
  • 4+ years of experience in DevOps and SRE roles
  • Experience in developing and maintaining CI/CD pipelines for software deployment
  • Experience with GitLab pipelines and Helm
  • 4+ years implementing and managing cloud-native platforms and solutions
  • Hands-on experience with containerization (Docker, Kubernetes)
  • 4+ years of hands-on experience with monitoring/logging tools such as Splunk, Grafana, and OpenTelemetry, and with incident management
  • 4+ years guiding and mentoring teams in reliability engineering practices
  • Understanding of web protocols, how full-stack applications operate, and how data flows
  • Basic knowledge of at least one major cloud platform (AWS preferred)
  • Strong communication skills and ability to work under pressure
Job Responsibility
  • Develop, configure, and support CI/CD pipelines
  • Automate build, test, and deployment workflows to enable safe and repeatable releases
  • Integrate automated quality checks, code scanning, and deployment validations into pipelines
  • Support containerized deployments using Docker and Kubernetes
  • Use Infrastructure-as-Code (IaC) tools like Helm to manage cloud infrastructure
  • Participate in automated provisioning of environments and system configurations
  • Embed monitoring and alerting into delivery pipelines
  • Support debugging of build, deployment, and environment issues across Dev/Test/Prod systems
  • Automate processes to enhance system reliability and resilience
  • Minimize operational incidents through proactive monitoring and maintenance
What we offer
  • Competitive base salary and compensation package
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Annual bonus or periodic sales incentive or bonus
  • Medical, dental and vision insurance
  • Flexible spending account
  • Paid time off
  • Up to 12 paid holidays
  • Fulltime

Senior Cloud Engineer

Carex is partnering with an insurance company to identify a Senior Cloud Enginee...
Location
United States, Madison, WI
Salary
Not provided
Carex Consulting Group
Expiration Date
Until further notice
Requirements
  • Comprehensive knowledge of cloud computing and the ability to handle complex technical issues and problems
  • Demonstrated leadership experience, including coaching and mentoring others
  • Experience with one or more cloud computing platforms
  • Strong system administration experience
  • Good understanding of security management solutions
  • Strong collaboration skills, adaptability, resourcefulness, and follow-through
  • High proficiency with cloud-compatible monitoring tools and logging solutions
  • Technical experience with: AWS, Azure, VMware (VMC, VCDR)
Job Responsibility
  • Design and implement cloud infrastructure and cloud-based solutions
  • Analyze business requirements and define technical specifications and standards
  • Deploy and oversee implementation and integration of web-based applications while ensuring information security standards are met
  • Manage cloud security processes and maintain reports, logs, and records related to security audits
  • Monitor system uptime and performance, and troubleshoot and resolve cloud-based issues
  • Stay current on emerging cloud technologies and evaluate their value to business operations
  • Partner with development teams to provide guidance on secure coding, architecture, and technical oversight
  • Lead disaster recovery efforts and regularly test recovery procedures to support business continuity across cloud-based systems
  • Work with governance teams to implement automated processes and standard methodologies for cloud policies, roles, and identity management
  • Provide coaching and mentoring across the team
  • Fulltime

Senior Platform Engineer

As a Platform Engineer at PEXA, you will be at the heart of our global technolog...
Location
Australia
Salary
Not provided
PEXA UK
Expiration Date
Until further notice
Requirements
  • 3+ years’ experience in platform, site reliability, DevOps or cloud infrastructure engineering roles within complex or large-scale environments
  • Strong knowledge of AWS including networking, compute, storage and identity services
  • Proficiency with Infrastructure-as-Code tools such as Terraform or CloudFormation
  • Strong automation and scripting skills in Python, NodeJS or Bash
  • Experience designing and maintaining CI/CD pipelines using tools such as GitHub Actions or ArgoCD
  • Hands-on experience with Kubernetes, Helm and service meshes such as Istio
  • Experience working with event streaming platforms such as Kafka
  • Solid understanding of system and application security best practices including IAM, secrets management and compliance frameworks
  • Experience operating Linux-based systems in production at scale
  • Knowledge and hands-on experience with generative and agentic AI tooling
Job Responsibility
  • Design and evolve the foundational platform capabilities that power secure, scalable and efficient product delivery
  • Build and automate robust cloud infrastructure across our AWS environments using Infrastructure-as-Code and modern automation frameworks
  • Design and enhance CI and CD pipelines to improve delivery velocity, reliability and observability
  • Partner closely with software delivery squads, security, and resiliency and observability teams to strengthen our platform’s performance, security and developer experience
  • Mentor Associate Platform Engineers and Graduates, contribute to engineering forums and architecture reviews, and help shape the future direction of our platform roadmap
  • Design, deliver and continuously improve scalable, resilient and secure platform infrastructure across PEXA’s global cloud environments
  • Champion self-service capabilities that empower delivery squads and reduce operational bottlenecks
  • Embed monitoring, alerting and incident response best practices
  • Support strategic initiatives such as cloud cost optimisation, architecture standardisation and technology modernisation
  • Drive continuous improvement across testing, observability and platform performance
What we offer
  • Quarterly wellness days to recharge
  • Four weeks Workcation per year – work from an approved country
  • Take the opportunity to purchase up to four weeks additional annual leave per year
  • Learn from the best and upskill with PEXA Academy certifications and grow your career
  • Fulltime