Cloud SRE Intern Job at Alliance Automotive UK LV Ltd (Birmingham)

Senior Site Reliability Engineer (SRE) – Cloud & Distributed Systems

We are seeking an experienced Senior Site Reliability Engineer (SRE) to design, ...

Location

United States, Austin

Salary:

Not provided

Dutech Systems

Expiration Date

Until further notice

Requirements

8+ years of experience in SRE, DevOps, or Systems Engineering
Strong expertise in Linux/Unix systems and system internals
Proficiency in at least one programming/scripting language (Python, Go, Java, Bash)
Experience designing and operating distributed systems
Hands-on experience with cloud platforms (AWS or GCP)
Experience with Docker and Kubernetes
Strong understanding of monitoring, alerting, and logging concepts
Experience managing SLIs, SLOs, and error budgets
Experience with incident management and RCA processes

Job Responsibility

Design, implement, and manage highly available, distributed systems
Maintain and optimize cloud infrastructure (AWS/GCP)
Develop automation scripts using Python, Go, Java, or Bash
Manage containerized environments using Docker and Kubernetes
Define and monitor SLIs, SLOs, and error budgets
Implement monitoring, logging, and alerting solutions
Lead incident management, root cause analysis (RCA), and postmortems
Ensure system security and compliance within operational workflows
Improve system reliability through performance tuning and optimization
Collaborate with engineering teams to enhance deployment and release processes

Engineering Intern, SRE

This is a 12-week internship program beginning on May 26th 2026 or June 22nd 202...

Location

United States, Sunnyvale

Salary:

Not provided

Illumio

Expiration Date

Until further notice

Requirements

Currently enrolled in a full-time Bachelor's degree program in Computer Science, Software Engineering, or a related field, with an expected graduation date in Winter 2026/Spring 2027
Familiarity with at least one programming or scripting language (Python, Java, Go, or similar)
Knowledge of cloud technologies such as AWS, Azure, or GCP
Basic understanding of Linux/Unix systems and networking concepts
Curiosity about automation, Infrastructure as Code (e.g. Terraform), and containerization (e.g. Docker, Kubernetes)
Eagerness to learn, ask questions, and contribute in a collaborative environment

Job Responsibility

Learn how to design and support scalable and reliable cloud infrastructure
Assist engineers in monitoring, troubleshooting, and improving platform and application reliability
Gain exposure to Kubernetes, Docker, and cloud-native technologies
Help automate tasks and processes using scripting languages (Python, Go, or others)
Work with engineers to strengthen security, compliance, and observability of systems
Shadow engineers in incident management and root cause analysis
Participate in team meetings and collaborate on projects to support ongoing initiatives

What we offer

Medical, Dental, Vision Coverage
Health and Dependent Savings Accounts
Life and Disability Programs
Paid Parental Leave
Voluntary Benefit Programs
Company Sponsored Wellness Program
Wellness Reimbursement Program
Retirement Savings
Equity Opportunities
Paid time off and Paid Holidays

Fulltime

Cloud Solution Architecture - SRE

Do you have a passion for partnering with fast‑growing Software Development Comp...

Location

India, Multiple Locations

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, Information Technology, Engineering, Business, Liberal Arts, or related field AND 7+ years of experience in cloud/infrastructure technologies, information technology (IT) consulting/support, systems administration, network operations, software development/support, technology solutions, practice development, architecture, and/or consulting OR equivalent experience
Deep proficiency in cloud, software, ISV, or consulting ecosystems
Strong technical depth, including level‑500 expertise in at least one Azure domain, with broad familiarity across the Azure platform
Experience with AI services and the Microsoft ecosystem, including Security, M365, Data, and AI platforms
Proven ability to design, operate, and troubleshoot complex, highly available, mission‑critical systems and to lead customer escalations effectively
Demonstrated experience with monitoring, observability, and reliability engineering practices in large‑scale distributed systems
Software development experience, including AI‑enabled solutions, and strong understanding of DevOps and CI/CD practices
Exceptional communication, stakeholder management, and relationship‑building skills

Job Responsibility

Customer Advocacy & Technical Leadership: Actively listen and empathize with customers to anticipate their technical and business needs, advocate for them within Microsoft, and measure success through customer satisfaction, system reliability, and operational excellence
Serve as a senior technical leader, driving vision for customers and internal teams; pilot new operating models, AI‑enabled capabilities, and data‑driven practices; scale proven architectures and patterns; and mentor others to elevate technical depth across the organization
Resiliency, Reliability & Operational Excellence: Apply a reliability‑first mindset, designing and validating highly available, fault‑tolerant systems through proactive testing, failure simulations, chaos engineering, and resilience reviews
Guide customers in defining and achieving SLOs, SLIs, and error budgets, with clear accountability and measurable outcomes
Drive continuous improvement by going beyond traditional root‑cause analysis to understand systemic, architectural, and organizational contributors to incidents
Monitoring, Observability & Intelligent Operations: Lead adoption of modern monitoring and observability practices, including distributed tracing, metrics, logs, and end‑to‑end service health visibility across complex, distributed systems
Correlate telemetry, customer signals, and platform events to produce actionable insights, risk identification, and proactive recommendations

Fulltime

New

Software Engineer

Wells Fargo is seeking a Software Engineer to join the Wealth & Investment Manag...

Location

United States, Minneapolis; Charlotte

Salary:

37.02 - 63.94 USD / Hour

Wells Fargo

Expiration Date

June 17, 2026

Requirements

2+ years of software engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education

Job Responsibility

Participate in low to moderately complex initiatives and projects associated with the technology domain, including installation, upgrades, and deployment efforts
Identify opportunities for service quality and availability improvements within the technology domain environment
Design, code, test, debug, and document for low to moderately complex projects and programs associated with the technology domain, including upgrades and deployments
Review and analyze technical assignments or challenges that are related to low to medium risk deliverables and that require research, evaluation, and selection of alternative technology domains
Present recommendations for resolving issues, or escalate issues as needed to meet established service level agreements
Exercise some independent judgment while also developing an understanding of the given technology domain in reference to security and compliance requirements
Provide information to technology colleagues, internal partners, and stakeholders

What we offer

Health benefits
401(k) Plan
Paid time off
Disability benefits
Life insurance, critical illness insurance, and accident insurance
Parental leave
Critical caregiving leave
Discounts and savings
Commuter benefits
Tuition reimbursement

Fulltime

New

Public Cloud Network Lead

Join us at Barclays as a Public Cloud Network Lead, to architect, implement and ...

Location

United Kingdom, London; Glasgow

Salary:

Not provided

Barclays

Expiration Date

Until further notice

Requirements

Multi-Cloud Network Architecture & Hybrid Connectivity – Lead enterprise-scale network design across AWS, Azure, and GCP, delivering hybrid connectivity, encrypted interconnects (MACsec/IPsec), circuit provider management, and legacy infrastructure remediation through Infrastructure as Code
Network Security & Compliance – Implement Zero Trust segmentation, deploy cloud-native firewall controls, and ensure compliance with PCI-DSS, DORA, and internal governance frameworks
Strategic Planning, Consultancy & Stakeholder Engagement – Define cloud network strategy, evaluate emerging technologies, produce ADRs and HLD/LLD designs, lead Landing Zone design, and influence senior stakeholders on risk, strategy, and cost optimisation
Operational Excellence & Incident Response – Own incident escalation, SLA/SLO monitoring, flow analysis, and SRE enablement to drive network operational excellence
Automation, IaC & DevOps Practices – Build reusable Terraform, CloudFormation, and Bicep IaC with CI/CD pipelines and Python/Bash automation for standardised network provisioning

Job Responsibility

Architect, implement and operate enterprise-grade multi-cloud network infrastructure at scale for Barclays
Design secure, high-performance hybrid and multi-cloud architectures connecting thousands of cloud accounts across global regions to Barclays' on-premises infrastructure
Work horizontally across GTIS Networks, SRE, DevOps, Product, and senior leadership to deliver strategic initiatives and resolve complex technical debt
Mentor engineers and serve as the escalation point for critical network incidents
Build Engineering: Development, delivery, and maintenance of high-quality infrastructure solutions to fulfil business requirements
Incident Management: Monitoring of IT infrastructure and system performance to measure, identify, address, and resolve any potential issues, vulnerabilities, or outages
Automation: Development and implementation of automated tasks and processes to improve efficiency and reduce manual intervention
Security: Implementation of a secure configuration and measures to protect infrastructure against cyber-attacks, vulnerabilities, and other security threats
Teamwork: Cross-functional collaboration with product managers, architects, and other engineers to define IT Infrastructure requirements, devise solutions, and ensure seamless integration and alignment with business objectives
Learning: Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities to foster a culture of technical excellence and growth

What we offer

Competitive holiday allowance
Life assurance
Private medical care
Pension contribution

Fulltime

New

Staff Engineer, Site Reliability Engineer

OnStar is a cornerstone of General Motors' connected services—bringing safety, s...

Location

Ireland, Dublin

Salary:

Not provided

General Motors

Expiration Date

Until further notice

Requirements

8+ years in SRE, DevOps, or systems engineering, including experience managing or mentoring high-impact teams
Track record of building and maintaining high-scale, cloud-native systems (preferably AWS, GCP, or Azure)
Expertise in container orchestration and deployment strategies using Kubernetes and CI/CD pipelines
Proficiency in Python, Go, or Java, with strong code review and readability standards
Experience leading cross-functional infrastructure projects, configuration strategy, or organizational tooling initiatives
Ability to think and act under pressure
Strong communication skills

Job Responsibility

Lead the design and implementation of scalable, fault-tolerant, and observable infrastructure supporting OnStar mobile and web experiences, in-vehicle services, and the backend platforms and integrations that power them
Champion configuration management, infrastructure refactoring, and testing frameworks to strengthen system resilience
Partner across SRE, development, and product teams to improve service reliability, deployment safety, and incident response practices
Drive internal consultation and strategic planning on reliability standards for new OnStar capabilities, customer-facing releases, and platform initiatives
Define and evolve observability strategy using tools such as Prometheus, Grafana, and Datadog, with automated alerting and actionable SLO dashboards
Own and improve on-call practices, manage blameless postmortems, and guide root cause analysis to eliminate recurring failures
Mentor engineers and help shape a high-performance culture rooted in extreme ownership and operational excellence
Support compliance and privacy-driven engineering initiatives across connected services, with potential crossover into areas like data retention and safety certification tooling

Fulltime

New

SVP of Infrastructure & Cloud Operations

Our client is a global game monetisation and payments platform, headquartered in...

Location

United States, US Remote

Salary:

300000.00 - 325000.00 USD / Year

Signify Technology

Expiration Date

Until further notice

Requirements

Minimum 10 years of experience leading infrastructure and operations across both private and public cloud platforms — public-only experience will not be considered
Strong GCP experience; familiarity with multi-cloud environments essential
Deep expertise in SLA and SLO management — the team is measured on availability, stability, and performance
Proven leadership of DevOps, SRE, Networking, and Database Management functions with direct cross-disciplinary team responsibility
Demonstrated experience reporting directly to a CTO or CIO
Strong background in AI-powered automation, with hands-on experience implementing intelligent systems for monitoring, alerting, and incident resolution
Experience managing and developing global, internationally distributed teams across different time zones and cultures
Flexibility to work across time zones — their teams span up to 15 hours of difference (Malaysia, China, North America)
Willingness to travel to international office locations, including Kuala Lumpur and/or Baku

Job Responsibility

Define and execute the vision, mission, and strategic roadmap for global infrastructure and operations, aligned with business priorities and technology goals
Build and scale high-performing teams across DevOps, SRE, Networking, and Database disciplines
Oversee global infrastructure operations across multiple time zones and cultural environments
Manage hybrid and multi-cloud environments (GCP preferred), including compute, storage, network, and security
Develop and implement robust automation strategies using AI/ML to reduce toil, accelerate issue resolution, and improve system reliability
Lead initiatives in observability, CI/CD security, and proactive incident prevention
Ensure infrastructure is secure, compliant, and resilient, with robust business continuity and disaster recovery practices
Partner with internal stakeholders across Product, Engineering, and Security to enable product velocity and stability
Own the hiring, mentoring, and development of global infrastructure teams with a focus on continuous improvement
Develop and manage the infrastructure budget, focusing on cost optimisation and resource forecasting

What we offer

100% company-paid medical, dental, and vision plans
Unlimited flexible time off
Personalised career roadmap and professional development investment
High-impact, company-wide scope with genuine leadership visibility
A collaborative, globally diverse team culture with a strong focus on job satisfaction and growth

Fulltime

New

Sr Principal Site Reliability Engineer (Sovereign Cloud)

Palo Alto Networks runs a large infrastructure and is one of the largest GCP cus...

Location

Bulgaria, Sofia

Salary:

Not provided

Palo Alto Networks

Expiration Date

Until further notice

Requirements

10+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering
7+ years building high availability, scalable cloud-native applications on AWS and GCP
BS or MS in Computer Science, a related field, or equivalent professional experience required
Expertise in configuration management with a framework such as Ansible, Terraform, Helm
Passion for infrastructure and monitoring as code
Solid experience in container workloads and Kubernetes
Familiarity with PKI and networking concepts
In-depth knowledge of different security controls (app-id, user-id, security profile, URL category, content, SSL decryption, firewall MFA, etc.)
Linux administration, internals, and network troubleshooting
Proficiency with programming languages like Golang or Python along with shell scripting to automate tasks

Job Responsibility

Contribute to the success of SRE and DevOps
Develop expertise in new technologies
Work with developers, researchers, data scientists, and security experts
Design, build and operate reliable, secure Cloud infrastructure
Ensure that applications are production-ready, scalable, and reliable
Develop tools and automation frameworks
Automate the robust deployment of services
Orchestrate end-to-end monitoring and alerting
Participate in on-call rotations to support critical business and production systems
Lead root cause analysis of critical business and production issues

Fulltime
