Lead Reliability Engineer

Avara Foods

Location:
United Kingdom, Hereford

Category:

Contract Type:
Not provided

Salary:

41000.00 - 44000.00 GBP / Year

Job Description:

This is a key leadership role, responsible for driving reliability, asset performance, and long-term engineering excellence across the site. You’ll lead a small team of Reliability Engineers and work closely with the Engineering Reliability Manager and the maintenance operations team to deliver strategic reliability improvements and optimise asset lifecycle management.

Job Responsibility:

  • Lead and manage the Reliability Team, ensuring effective delivery of asset performance, maintenance planning, and reliability projects
  • Act as lead for reliability and asset care, championing continuous improvement across site
  • Develop and sustain proactive maintenance strategies, including predictive and condition-based maintenance, to improve equipment availability and reduce unplanned downtime
  • Analyse performance and downtime data to identify and eliminate root causes of equipment failure
  • Collaborate with the wider Engineering Team to coordinate planned maintenance, improvement activities, and engineering support during production
  • Support the Engineering Reliability Manager in the development and execution of the site’s maintenance and reliability roadmap
  • Maintain the office, reliability, and outside areas to a high standard, ensuring regular checks are conducted and satisfactory feedback is received from GMP/WPW audits
  • Lead cross-functional reliability reviews, ensuring effective communication between Engineering, Operations, Planning, and technical teams
  • Manage contractor and OEM support, ensuring all work complies with site safety, technical, and legislative standards
  • Ensure all rectification actions identified on service reports are followed up and completed in a timely manner
  • Identify the team's training needs, then coach, train, and develop team members so they have the opportunity to reach their potential
  • Oversee asset care documentation, including PM reviews, calibration schedules, and PUWER assessments
  • Lead or support capital and improvement projects, ensuring reliability principles are built into design and implementation

Requirements:

  • HND or above in Engineering (Mechanical, Electrical, or related discipline)
  • Proven experience in reliability, maintenance, or engineering leadership in an FMCG or manufacturing environment
  • Strong understanding of maintenance systems (CMMS), asset management, and performance metrics (OEE, MTBF, MTTR)
  • Demonstrable leadership, coaching, and influencing skills
  • Excellent analytical, problem-solving, and communication abilities
  • Ability to manage multiple priorities and work effectively across teams

Nice to have:

  • Experience leading engineering teams or projects in a large-scale manufacturing site
  • Reliability tools knowledge (RCM, FMEA, TPM)
  • NEBOSH or IOSH qualification
  • CAD experience and contractor management exposure

What we offer:

  • 6% Pension
  • 31 Days Holiday
  • Life Assurance
  • Private Medical Health Cover
  • Subsidised Canteen
  • Free Staff Parking
  • Wellbeing and lifestyle benefits, including discounts with major retailers and access to health resources

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Lead Reliability Engineer

Lead Site Reliability Engineer

Groupon is a marketplace where customers discover new experiences and services e...
Location
India, Bangalore
Salary:
Not provided
Groupon
Expiration Date
Until further notice
Requirements
  • 10+ years in systems engineering
  • at least 5 years in SRE or DevOps roles
  • expertise in cloud platforms (GCP, AWS) and container orchestration (Kubernetes, Docker)
  • proficiency in programming and scripting languages like Python, Go, and Bash
  • advanced knowledge of Infrastructure as Code (IaC) tools such as Terraform and Ansible
  • deep understanding of networking, DNS, load balancing, and security principles
  • proven track record of managing high-availability systems in demanding environments
  • exceptional analytical and problem-solving skills
Job Responsibility
  • Architect and maintain fault-tolerant systems, ensuring uptime SLAs of 99.9% or higher
  • drive automation in infrastructure management and deployment using Terraform, Ansible, Kubernetes, and similar tools
  • create and optimize CI/CD pipelines to ensure reliable, secure, and efficient software delivery
  • build and enhance comprehensive observability solutions, including monitoring, logging, and alerting systems using Prometheus, Grafana, and the ELK stack
  • collaborate with stakeholders to define and achieve SLIs, SLOs, and error budgets aligned with business needs
  • lead incident response during on-call rotations, ensuring rapid resolution and root cause analysis for critical issues
  • design and execute performance testing, capacity planning, and scalability strategies for evolving workloads
  • proactively identify and resolve bottlenecks, increasing system performance and developer efficiency
  • mentor junior engineers, fostering a collaborative and growth-oriented team environment
  • guide architectural decisions that drive innovation and enhance system reliability
What we offer
  • The opportunity to work with cutting-edge technologies in a transformative environment
  • a collaborative and innovative work environment that values your expertise and contributions
  • professional growth and leadership development pathways tailored to your aspirations
  • a chance to leave a lasting impact by shaping the future of reliable and scalable systems

Site Reliability Engineering Support Lead

Site Reliability Engineering Support Lead role focused on application support, d...
Location
Ireland, Dublin
Salary:
Not provided
Citi
Expiration Date
Until further notice
Requirements
  • Solid SRE process experience
  • 5+ years of leading a high-performance, 24x7 DevOps or SysOps team
  • Proficiency in Windows administration, Office 365, Exchange, SharePoint, Active Directory, Backup, Networking and Infrastructure
  • Experience with Microsoft OS Windows & Server
  • Experience in ticket tracking and resolving on time
  • Hands-on experience on ticketing tools (ServiceNow)
  • Excellent verbal, written, presentation and interpersonal communication skills
  • Ability to make complex technical matters easy-to-comprehend for non-technical persons.
Job Responsibility
  • Taking end-to-end Ownership of Application Support for Production Systems Issues resolution
  • Implementing, monitoring, and maintaining CI/CD frameworks
  • Developing new capabilities, coordinating implementation across a large number of teams including infrastructure, developer tools and information security
  • Influencing a culture of Site Reliability Engineering. Engaging in training and mentoring to help develop other engineers with an SRE mindset
  • Providing the first line of after-deployment technical support at L1 and L2 level for applications and/or associated production systems diagnostics, and network health monitoring
  • Coordinating and/or deploying hands-on fixes, patches, and software updates at the application level, and as appropriate at the network level
  • Managing a team of technical support engineers who provide technical support to users
  • Escalating complex problems to the L3 level of expertise within the organization, along with observations from investigative and diagnostic assessments
  • Coordinating the investigation of recurring technical issues affecting user systems and seeing them through to resolution
  • Escalating, resolving, guiding team, and tracking production incidents to closure
What we offer
  • Competitive base salary (which is annually reviewed)
  • Hybrid working model (up to 2 days working at home per week)
  • Additional benefits to support you and your family to be well, live well and save well.
  • Full-time

Lead Site Reliability Engineer

As a Lead Site Reliability Engineer (SRE), you will ensure the stability, perfor...
Location
United States
Salary:
184000.00 - 229000.00 USD / Year
Corelight
Expiration Date
Until further notice
Requirements
  • 8+ years of experience building and operating FedRAMP environments or similarly regulated systems
  • Expertise in AWS services (e.g., EC2, S3, RDS, Lambda, ECS/EKS, Glue, EMR, Redshift, OpenSearch, VPC)
  • Deep understanding of the FedRAMP framework, controls, and compliance requirements
  • Proficiency in programming languages such as Python, Go, or Java
  • Experience with big data technologies (Hadoop, Spark, Kafka)
  • Strong skills in Infrastructure as Code (IaC) tools like Terraform, CloudFormation, or Ansible
  • Knowledge of containerization and orchestration tools like Docker and Kubernetes
  • Experience with CI/CD tools such as Jenkins, GitLab CI, or CircleCI
  • Proven track record in building and scaling platforms with high availability, resilience, and strict SLOs
  • Strong experience with Unix/Linux systems and cloud providers, ideally AWS
Job Responsibility
  • Collaborate with software engineering teams to ensure the reliability, performance, and security of the Federal region’s infrastructure
  • Design, implement, and manage FedRAMP-compliant infrastructure and systems
  • Establish continuous monitoring, logging, and auditing processes to ensure compliance with FedRAMP controls
  • Partner with security teams to conduct security assessments and implement necessary controls
  • Design and implement scalable infrastructure solutions that support multi-region growth
  • Drive automation efforts, enabling infrastructure and platforms to scale efficiently with a focus on compliance
  • Stay up-to-date on best practices, evolving security threats, and FedRAMP guidelines to maintain a strong security posture
  • Deploy and maintain cloud-native services in AWS that are resilient and elastic
  • Participate in 24x7 incident response and on-call rotations
  • Plan for capacity and work with teams to prepare for platform growth
What we offer
  • Equity and additional benefits will also be awarded
  • Full-time

Site Reliability Engineer Application Development Technical Lead Analyst

The Applications Development Technology Lead Analyst is a senior level position ...
Location
Canada, Mississauga
Salary:
120800.00 - 170800.00 USD / Year
Citi
Expiration Date
Until further notice
Requirements
  • 6+ years of relevant experience in Apps Development or systems analysis role
  • 5+ years of extensive experience in system analysis and in programming software applications with Python and RHEL
  • 5+ years with site reliability and CI/CD pipelines
  • Previous experience with container orchestration
  • Experience in managing and implementing successful projects
  • Subject Matter Expert (SME) in at least one area of Applications Development
  • Ability to adjust priorities quickly as circumstances dictate
  • Demonstrated leadership and project management skills
  • Consistently demonstrates clear and concise written and verbal communication
  • Bachelor's degree/University degree or equivalent experience
Job Responsibility
  • Partner with multiple management teams to ensure appropriate integration of functions to meet goals
  • Identify and define necessary system enhancements to deploy new products and process improvements
  • Resolve a variety of high-impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards
  • Provide expertise in area and advanced knowledge of applications programming and ensure application design adheres to the overall architecture blueprint
  • Utilize advanced knowledge of system flow and develop standards for coding, testing, debugging, and implementation
  • Develop comprehensive knowledge of how areas of business integrate to accomplish business goals
  • Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
  • Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary
  • Appropriately assess risk when business decisions are made
  • Full-time

Site Reliability Engineer

Join our client, a leading financial institution at the forefront of innovation,...
Location
United States, Austin
Salary:
57.00 - 63.33 USD / Hour
Aquent
Expiration Date
Until further notice
Requirements
  • Proven experience leading engineering teams and delivering projects using Scrum and efficient release practices
  • Strong background in converting high-level designs into low-level designs and providing technical oversight
  • Demonstrated experience in designing, architecting, and deploying cloud-native applications, specifically on GCP
  • Proficiency with various database technologies, including MongoDB, Aerospike, SQL Server, and PostgreSQL
  • Expertise in containerization technologies such as Docker and Kubernetes, and building/managing CI/CD pipelines
  • Experience leveraging AI-Driven software development tools to enhance productivity, code comprehension, and documentation
  • Proven track record of integrating and applying AI/Machine Learning models for data analytics, visualization, automation, and problem-solving
  • Ability to maintain high quality standards while delivering within tight schedules
  • Exceptional collaborative mindset with a bias for action, engaging effectively with product management, architects, and other domains
  • Strong ability to work with internal, external, and offshore stakeholders
Job Responsibility
  • Drive Technical Leadership & Project Delivery: Lead engineering teams through the entire project lifecycle, leveraging agile methodologies like Scrum to ensure efficient delivery and robust release practices
  • Architect & Design Cloud-Native Solutions: Translate high-level architectural visions into detailed low-level designs, providing expert technical oversight for the development and deployment of cutting-edge cloud-native applications
  • Champion Reliability & Scalability: Design, architect, and deploy highly available and scalable cloud-native applications on platforms such as GCP, ensuring optimal performance and resilience
  • Optimize Data Management: Leverage your expertise with diverse database technologies, including MongoDB, Aerospike, SQL Server, and PostgreSQL, to build and maintain robust data solutions
  • Advance DevOps & Automation: Implement and optimize containerization strategies using technologies like Docker and Kubernetes, and establish sophisticated CI/CD pipelines to streamline development and deployment
  • Innovate with AI/ML: Integrate and apply AI/Machine Learning models to enhance data analytics, visualization, automation, and creatively solve complex business and technical challenges
  • Foster Collaboration & Mentorship: Work closely with diverse stakeholders across product management, architecture, and other engineering domains, while actively mentoring and coaching multiple teams to elevate technical capabilities
  • Influence & Present Solutions: Effectively engage subject matter experts, present complex architectural solutions to governance boards and stakeholders, and advocate for data-driven proposals
What we offer
  • subsidized health, vision, and dental plans
  • paid sick leave
  • retirement plans with a match

Reliability Engineer

The Reliability Engineer is responsible for developing and leading asset reliabi...
Location
United States, Bennettsville
Salary:
Not provided
Domtar
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Mechanical Engineering or related technical field
  • Minimum five (5) years of experience in maintenance, reliability, or engineering within manufacturing or heavy industrial environments (pulp and paper experience preferred)
  • Strong knowledge of RCM, FMEA, RCFA, CMMS systems, and predictive maintenance technologies
  • Demonstrated commitment to safety and continuous improvement
Job Responsibility
  • Lead the development and execution of precision, preventive, and predictive maintenance strategies that improve equipment reliability
  • Champion Root Cause Problem Elimination (RCPE) and Failure Mode & Effects Analysis (FMEA) to proactively address equipment failures
  • Manage and optimize condition-based monitoring programs, including vibration, infrared, oil analysis, and ultrasound technologies
  • Establish and maintain robust systems and tools that enable maintenance and operations teams to monitor and interpret equipment and process health data effectively
  • Optimize maintenance strategies using asset criticality and reliability data to focus efforts on high-impact equipment
  • Analyze failure data and trends to identify systemic issues and drive continuous improvement initiatives
  • Collaborate with planning and scheduling teams to ensure timely and efficient execution of maintenance activities aligned with reliability goals
  • Serve as a subject matter expert on reliability tools, CMMS platforms, and emerging technologies
  • Develop and deliver training and communications to enhance reliability awareness and engagement among maintenance and operations personnel
  • Monitor and report on key reliability and maintenance KPIs, such as MTBF, MTTR, and OEE
What we offer
  • competitive compensation
  • a supportive working environment
  • rewarding career paths
  • plenty of opportunities for learning and growth
  • Full-time

Site Reliability Engineering Manager

Hewlett Packard Enterprise (HPE) is looking for a Site Reliability Engineering M...
Location
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • 7–10 years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
  • Minimum 2 years of experience managing or leading cloud operations teams
  • Deep understanding of cloud platforms (AWS, GCP, or Azure) and cloud-native architectures
  • Hands-on experience with Kubernetes, containers, infrastructure as code (e.g., Terraform), and configuration management tools
  • Strong foundation in observability (monitoring, logging, tracing), automation using Python, and incident response
  • Familiarity with modern CI/CD automation and tools
  • Excellent communication, stakeholder management, and team-building skills
  • Experience scaling SRE practices in high-growth or large-scale environments
  • Ability to balance long-term reliability initiatives with short-term delivery needs.
Job Responsibility
  • Lead and mentor a team of Site Reliability Engineers, supporting their growth, performance, and well-being
  • Own the reliability strategy for SASE cloud infrastructure systems, including incident management, SLIs/SLOs, and capacity planning
  • Partner with Engineering, Product, and Security teams to design and deliver highly available, scalable, and resilient cloud-native services
  • Guide the team in building automation, improving observability, and improving the operational efficiency of our cloud infrastructure
  • Drive adoption of best practices in monitoring, alerting, on-call operations, and runbook development
  • Build and maintain a strong engineering culture based on ownership, collaboration, and continuous learning
  • Define and track key reliability metrics, and report on team performance and system health to leadership
  • Contribute to hiring, onboarding, and career development for SREs.
What we offer
  • Health & Wellbeing benefits for physical, financial, and emotional wellbeing
  • Personal & Professional Development programs
  • Unconditional inclusion in the workplace.
  • Full-time

Senior Reliability Engineer - PCBA, Harness & Connectors

We are looking for a Senior Reliability Engineer in charge of developing and exe...
Location
United States, San Jose
Salary:
150000.00 - 225000.00 USD / Year
Figure
Expiration Date
Until further notice
Requirements
  • 5+ years of experience in relevant reliability engineering areas
  • Bachelor's degree or higher in relevant science and engineering fields
  • Strong knowledge of environmental reliability test principles, models, and methodologies, such as high temperature high humidity, thermal cycle/shock, mechanical vibration/shock
  • Strong knowledge of industry test standards such as AEC-Q, JEDEC, and IPC standards
  • Strong knowledge of electrical circuits, PCBA design and relevant SW tools (e.g. Altium)
  • Strong knowledge of PCBA, harness and connector failure modes, mechanisms, and FA techniques
  • Hands-on experience with field reliability risk analysis and failure prediction methods
  • Hands-on experience with Weibull++, JMP, or other reliability statistical analysis software
  • Hands-on experience with electronic circuit debugging and relevant tools, e.g. source meter, oscilloscope
  • Hands-on experience with a 3D CAD tool (e.g. CATIA)
Job Responsibility
  • Work with cross-functional teams, own hardware reliability requirements and validation strategy
  • Develop and execute accelerated life tests for PCBAs, electronic components, electrical harness and connectors
  • Lead DFMEA efforts with design engineers to assess design risks, impacts, controls, and corrective actions
  • Design reliability test flows and procedures, communicate with internal and external/CM teams to execute tests and report results
  • Work with test engineers to design setup and fixtures used in reliability testing
  • Guide and support PCBA, harness, connector failure analysis, design of experiments (DOEs) and corrective action processes with cross-functional teams
  • Analyze field data, assess field risks, and design tests that correlate to field usage conditions
  • Full-time