Senior AWS DevOps Engineer (Test & Infrastructure Resilience) Job at Randstad (City of London)

Senior DevOps Engineer

Riverstone Enterprise Solutions, an Envision Innovative Solutions Company, deliv...

Location

United States, Annapolis Junction

Salary:

170000.00 - 190000.00 USD / Year

Riverstone Enterprise Solutions

Expiration Date

Until further notice

Requirements

Bachelor's degree or higher in Engineering (e.g., Computer, Electrical, Mechanical, Aerospace) or Computer Science with a minimum of five (5) years of related experience
Five (5) years of additional DevOps experience may be substituted for a bachelor's degree
Must be fluent with Git
Strong knowledge of Linux and Linux environments (RHEL 6/7/8, RHCSA/RHCE, CentOS)
Experience with Windows system administration, system monitoring, instrumentation, resiliency and performance
Experience integrating Jenkins/Bamboo, Docker, and Kubernetes for automated deployment preferred
Experience with caching technologies (Memcache, ActiveMQ, Redis, APC, etc.)
Experience with MySQL (Clusters, Replication, and Tuning) and Elasticsearch (Kibana a plus)
Knowledge of security practices, networking protocols, firewalls, PCI compliance, etc.
Experience managing/monitoring AWS cloud and virtualized servers for optimal performance while working in a Platform as a Service (PaaS) environment

Job Responsibility

Support the development life cycle of platform architectural design, deployment and debugging
Develop and maintain configuration management (CM) systems based on sound version control best practices (Git), including branching and merging strategies
Serve as a technical lead for an Agile team and actively participate in all Agile ceremonies
Automate release deployments across development, test, staging, QA and production stacks using scripting languages and automation toolkits
Set up new sites and applications via configuration management such as Puppet and Ansible
Maintain/upgrade/patch tracking and documentation software (Confluence/Jira)
Assist in designing, implementing, and maintaining web service infrastructure and deployments
Leverage programming languages such as Python, Ruby, Perl, and Java
Proficient with DevOps or Site Reliability Engineering methodologies
Proficient automating network infrastructure configuration using Software Defined Networking

Fulltime

Senior DevOps Engineer - Assistant Vice President

Join a world-class technology team at the heart of global finance. The Global Cu...

Location

India, Pune

Salary:

Not provided

Citi

Expiration Date

Until further notice

Requirements

Deep, practical experience with Docker and Kubernetes for deploying and managing enterprise-scale applications
Hands-on proficiency with tools like Terraform or Ansible
Proven experience designing and maintaining sophisticated CI/CD pipelines using tools like Jenkins or TeamCity
Strong experience with monitoring and logging stacks such as Prometheus, Grafana, or ELK to ensure system health and performance
Solid understanding of cloud-native architecture and experience deploying applications on platforms like OpenShift, AWS, Azure, or GCP
Proficiency in Java (especially with frameworks like Spring Boot) and/or Python
Hands-on experience with the configuration, administration, and troubleshooting of messaging technologies such as IBM MQ, RabbitMQ, or Apache Kafka
Strong background in administering IBM WebSphere Application Server (WAS), including clustering and admin scripting
Experience with relational and/or NoSQL databases (e.g., Oracle, PostgreSQL, MongoDB)
Strong background in Linux/Unix administration and shell scripting

Job Responsibility

Design, implement, and manage robust, scalable, and secure application systems in coordination with the global technology team
Develop and maintain resilient CI/CD pipelines to automate builds, testing, and deployments, ensuring rapid and reliable delivery
Automate infrastructure provisioning and configuration management using Infrastructure as Code (IaC) principles and tools
Architect and manage containerized applications using Docker and Kubernetes on private and public cloud platforms (OpenShift, AWS, Azure, GCP)
Implement and refine observability strategies using industry-standard monitoring, logging, and tracing tools (e.g., Prometheus, Grafana, ELK)
Analyze and tune application performance, troubleshoot complex issues in distributed systems, and ensure high availability in an always-on service environment
Collaborate with cross-functional teams to integrate security best practices throughout the development lifecycle (DevSecOps)

Fulltime

Senior DevOps Engineer, AI

LogicMonitor® is the AI-first hybrid observability platform powering the next ge...

Location

India, Pune

Salary:

Not provided

LogicMonitor

Expiration Date

Until further notice

Requirements

4+ years of experience in DevOps or similar roles
Proven experience with AWS (preferred) and GCP in production environments
Strong expertise in Infrastructure as Code practices
Solid knowledge of Kubernetes (EKS), container orchestration, and cluster security
Hands-on experience with Grafana, Prometheus, and alerting/monitoring systems
Understanding of network connectivity over PrivateLink endpoints, VPCs, and cross-account VPC connectivity, including how to make services accessible internally and externally
Experience deploying automated canary and integration testing pipelines, CI/CD pipelines, etc.
Experience exposing internal self-hosted services such as LangFuse through a web UI for internal users, using Traefik, an Ingress controller, or a similar tool
Experience in deployment of LLM related solutions that require MCP, LangFuse, Airflow, GraphDB, VectorDB, Redis etc.
Experience working with developers on on-demand, just-in-time (JIT) access to production clusters to troubleshoot and debug issues, using Teleport or a similar tool

Job Responsibility

Multi-Cloud Enablement: Expand and manage application hosting across AWS and Google Cloud, ensuring performance, flexibility, and resilience
Infrastructure as Code (IaC): Develop and maintain Terraform or similar installers for Azure and GCP to fully automate infrastructure deployments
Cost Optimization: Design and implement AWS cost optimization strategies, including reserved instances, right-sizing, and resource efficiency initiatives
Cloud Security: Strengthen infrastructure security with robust access controls, encryption, monitoring, and alerting frameworks
Observability: Build and enhance monitoring platforms with Grafana dashboards and Prometheus alerts for real-time performance insights and proactive issue resolution
Kubernetes Management: Implement Role-Based Access Control (RBAC) and optimize Ingress controllers (Traefik or similar) for enhanced security and delivery resilience
Automation & Scripting: Create Python and Bash scripts to automate repetitive tasks, streamline workflows, and improve operational efficiency

Senior Software Engineer – AWS Developer

We’re looking for a Senior Software Engineer (AWS Developer) to lead the design ...

Location

United States, San Diego

Salary:

Not provided

ResMed

Expiration Date

Until further notice

Requirements

5+ years of professional software development experience
Significant hands-on work in AWS-based production systems
Strong proficiency in Python with deep understanding of object-oriented design, clean code principles, and design patterns
Expertise with AWS services, especially serverless and cloud-native architectures, including several of: Lambda, API Gateway, DynamoDB, S3, SQS/SNS, EventBridge, CloudWatch, CloudFront, RDS/Aurora, and IAM
Solid experience with infrastructure-as-code (e.g., Terraform, CloudFormation, CDK) and multi-environment deployments
Strong grasp of RESTful API design, authentication/authorization mechanisms (OAuth2, JWT), and microservices / event-driven architectures
Practical experience designing and optimizing data models for both NoSQL (e.g., DynamoDB, MongoDB) and relational databases (e.g., PostgreSQL, MySQL)
Experience with DevOps practices: CI/CD (e.g., GitHub Actions, CodePipeline), Git workflows, Docker, and monitoring/observability tools (e.g., CloudWatch, Datadog)
Deep understanding of software testing strategies (unit, integration, contract, and end-to-end testing) and how to embed them into pipelines (e.g., Cypress or similar)
Strong communication skills, a collaborative mindset, and a track record of influencing technical direction, aligning stakeholders, and mentoring other engineers

Job Responsibility

Lead the design, development, testing, and operation of cloud-native software systems that are reliable, scalable, secure, and cost-effective
Own end-to-end architecture for services and features on AWS, making informed tradeoffs between serverless, containers, data stores, and integration patterns
Collaborate closely with engineers, product managers, designers, and architects to translate complex requirements into clear technical designs and implementation plans
Set the bar for code quality, testing, and engineering practices; write clean, maintainable, well-tested code and help others do the same
Conduct and drive code and design reviews, provide constructive feedback, and foster a culture of technical excellence and continuous improvement
Investigate and resolve complex production issues, performance bottlenecks, and reliability problems across multiple services and components
Shape and evolve our CI/CD pipelines, deployment strategies, and observability (logging, metrics, tracing, alerting) to improve developer productivity and system resilience
Mentor and coach associate and mid-level engineers, supporting their growth through pairing, feedback, and knowledge sharing
Contribute to and influence technical roadmaps, standards, and best practices for our AWS usage and overall system architecture

Fulltime

Senior Platform Engineer

This is an exciting opportunity to join a fast-growing, venture-backed technolog...

Location

United Kingdom, Cambridge

Salary:

70000.00 - 100000.00 GBP / Year

Signify Technology

Expiration Date

Until further notice

Requirements

Proven experience designing, automating, and maintaining AWS infrastructure (e.g. EKS, RDS, EC2, CloudFront, VPC, IAM, Security Hub)
Hands-on experience building IaC pipelines using Terraform, integrated with CI/CD tools (e.g. GitHub Actions, GitLab CI, Jenkins, or AWS CodePipeline)
Strong knowledge of Kubernetes operations on AWS, including scaling, deployment automation, and monitoring
Solid foundation in Linux systems administration, networking, and cloud security principles
Experience with observability tooling (e.g. Prometheus, Grafana, Loki) and structured alerting practices
Experience managing databases, including migrations, high availability setups, backups, and disaster recovery strategies
Strong scripting and automation skills (e.g. Terraform, Python, Bash)
Excellent communication and collaboration skills, with a focus on improving engineering efficiency through automation and standardisation

Job Responsibility

Design, build, and maintain scalable cloud infrastructure using AWS services such as EC2, EKS, RDS/Aurora, ElastiCache, OpenSearch, and CloudFront
Drive the development and adoption of Kubernetes (EKS) for managing both production and internal workloads
Architect and implement Infrastructure-as-Code (IaC) pipelines, integrating tools such as Terraform into CI/CD workflows for provisioning, validation, and testing
Implement and improve zero-downtime deployment strategies (e.g. blue/green, rolling, canary), including automated rollback and recovery
Continuously enhance platform resilience by removing single points of failure and improving autoscaling and high availability
Collaborate with SRE, Security, and Engineering teams to strengthen observability, monitoring, and alerting using tools like Prometheus, Grafana, and CloudWatch
Partner with Security to embed best practices across IAM, secrets management, web application firewalls, and posture management
Optimise infrastructure performance and cloud spend through automation and cost visibility tooling
Participate in on-call rotations, incident reviews, and ongoing reliability improvements

Fulltime

Senior Platform Engineer

We are seeking a foundational Site Reliability Engineer to join our Device Insur...

Location

United States, Bellevue; Atlanta; Overland Park; Frisco

Salary:

107300.00 - 193500.00 USD / Year

T-Mobile

Expiration Date

Until further notice

Requirements

4+ years of experience in DevOps and SRE roles
Experience in developing and maintaining CI/CD pipelines for software deployment
Experience with GitLab pipelines and Helm
4+ years implementing and managing cloud-native platforms and solutions
Hands-on experience with containerization (Docker, Kubernetes)
4+ years of hands-on experience with monitoring/logging tools such as Splunk, Grafana, and OpenTelemetry, and with incident management
4+ years guiding and mentoring teams in reliability engineering practices
Understanding of web protocols, how full stack applications operate and data flows
Basic knowledge of at least one major cloud platform (AWS preferred)
Strong communication skills and ability to work under pressure

Job Responsibility

Develop, configure, and support CI/CD pipelines
Automate build, test, and deployment workflows to enable safe and repeatable releases
Integrate automated quality checks, code scanning, and deployment validations into pipelines
Support containerized deployments using Docker and Kubernetes
Use Infrastructure-as-Code (IaC) tools like Helm to manage cloud infrastructure
Participate in automated provisioning of environments and system configurations
Embed monitoring and alerting into delivery pipelines
Support debugging of build, deployment, and environment issues across Dev/Test/Prod systems
Automate processes to enhance system reliability and resilience
Minimize operational incidents through proactive monitoring and maintenance

What we offer

Competitive base salary and compensation package
Annual stock grant
Employee stock purchase plan
401(k)
Access to free, year-round money coaches
Annual bonus or periodic sales incentive or bonus
Medical, dental and vision insurance
Flexible spending account
Paid time off
Up to 12 paid holidays

Fulltime

Senior Cloud Engineer

Carex is partnering with an insurance company to identify a Senior Cloud Enginee...

Location

United States, Madison, WI

Salary:

Not provided

Carex Consulting Group

Expiration Date

Until further notice

Requirements

Comprehensive knowledge of cloud computing and the ability to handle complex technical issues and problems
Demonstrated leadership experience, including coaching and mentoring others
Experience with one or more cloud computing platforms
Strong system administration experience
Good understanding of security management solutions
Strong collaboration skills, adaptability, resourcefulness, and follow-through
High proficiency with cloud-compatible monitoring tools and logging solutions
Technical experience with: AWS, Azure, VMware (VMC, VCDR)

Job Responsibility

Design and implement cloud infrastructure and cloud-based solutions
Analyze business requirements and define technical specifications and standards
Deploy and oversee implementation and integration of web-based applications while ensuring information security standards are met
Manage cloud security processes and maintain reports, logs, and records related to security audits
Monitor system uptime and performance, and troubleshoot and resolve cloud-based issues
Stay current on emerging cloud technologies and evaluate their value to business operations
Partner with development teams to provide guidance on secure coding, architecture, and technical oversight
Lead disaster recovery efforts and regularly test recovery procedures to support business continuity across cloud-based systems
Work with governance teams to implement automated processes and standard methodologies for cloud policies, roles, and identity management
Provide coaching and mentoring across the team

Fulltime

Senior Platform Engineer

As a Platform Engineer at PEXA, you will be at the heart of our global technolog...

Location

Australia

Salary:

Not provided

PEXA UK

Expiration Date

Until further notice

Requirements

3+ years’ experience in platform, site reliability, DevOps or cloud infrastructure engineering roles within complex or large-scale environments
Strong knowledge of AWS including networking, compute, storage and identity services
Proficiency with Infrastructure-as-Code tools such as Terraform or CloudFormation
Strong automation and scripting skills in Python, NodeJS or Bash
Experience designing and maintaining CI/CD pipelines using tools such as GitHub Actions or ArgoCD
Hands-on experience with Kubernetes, Helm and service meshes such as Istio
Experience working with event streaming platforms such as Kafka
Solid understanding of system and application security best practices including IAM, secrets management and compliance frameworks
Experience operating Linux-based systems in production at scale
Knowledge and hands-on experience with generative and agentic AI tooling

Job Responsibility

Design and evolve the foundational platform capabilities that power secure, scalable and efficient product delivery
Build and automate robust cloud infrastructure across our AWS environments using Infrastructure-as-Code and modern automation frameworks
Design and enhance CI and CD pipelines to improve delivery velocity, reliability and observability
Partner closely with software delivery squads, security, and resiliency and observability teams to strengthen our platform’s performance, security and developer experience
Mentor Associate Platform Engineers and Graduates, contribute to engineering forums and architecture reviews, and help shape the future direction of our platform roadmap
Design, deliver and continuously improve scalable, resilient and secure platform infrastructure across PEXA’s global cloud environments
Champion self-service capabilities that empower delivery squads and reduce operational bottlenecks
Embed monitoring, alerting and incident response best practices
Support strategic initiatives such as cloud cost optimisation, architecture standardisation and technology modernisation
Drive continuous improvement across testing, observability and platform performance

What we offer

Quarterly wellness days to recharge
Four weeks Workcation per year – work from an approved country
Take the opportunity to purchase up to four weeks additional annual leave per year
Learn from the best and upskill with PEXA Academy certifications and grow your career

Fulltime
