Rotating Reliability Engineer

United States, El Dorado · Employment contract · Job posted May 28, 2026

Job Description

HF Sinclair is seeking a Rotating Reliability Engineer in El Dorado, KS, who will conduct engineering studies and make recommendations to improve the reliability of refinery equipment. The engineer investigates equipment problems and failures (RCFA), makes recommendations to prevent future problems, expedites planned and unplanned equipment repairs, and trains employees in machine use and lubrication to increase reliability. The role provides overall refinery support of Reliability Programs for Condition Monitoring, Preventive Maintenance, Critical Equipment Monitoring, Lubrication, application of API standards, and development of appropriate repair standards. This individual will directly interface and coordinate machinery activities with safety, maintenance, project and process personnel as well as contractors.

Job Responsibility

  • Evaluate equipment performance and provide equipment reliability improvement recommendations using RCM analysis
  • Assist Maintenance with troubleshooting rotating equipment problems in the field
  • Champion the rotating machinery bad actor program
  • Develop and/or assist in the strategies for the resolution of bad actors
  • Interface with equipment suppliers on retrofits or upgrades
  • Set up and audit Equipment Health and Performance Monitoring and Protection programs, both automated and manual
  • Ensure programs meet Industry Standards and Best Practices
  • Modify programs and update IOWs, CODs, SOLs, Alarm and Trip Set Points, with defined action steps, based on non-conformances and gaps found during operation and maintenance activities
  • Participate in Site Risk Register & PHA reviews and help identify and risk-rank Rotating Equipment threats
  • Investigate equipment problems and failures for root causes and provide engineering recommendations for resolution, including the economic impacts of various alternatives
  • Assist capital projects group as needed with Rotating Equipment design and specifications review (asset selection, materials, installation plans, testing requirements, critical spare parts, etc.)
  • Assist maintenance & turnaround planners to develop repair work scopes for major machinery components for equipment repaired in house, equipment sent out for repair, and equipment repaired during turnarounds
  • Assist maintenance personnel with the development and updating of asset specific rotating repair procedures
  • Use SAP to update equipment repair histories, provide technical data updates, and input necessary changes to equipment bills of materials
  • Assist area trainers with operating procedures for critical & special purpose rotating equipment and training documentation for operators
  • Must be able to work effectively with others at all levels and functional areas of the refinery

Requirements

  • A minimum of two years of progressive work experience in rotating engineering
  • A minimum of a Bachelor's Degree in an Engineering discipline
  • Technical expert in area of specialty
  • Advanced ability to stay abreast of new technology developments and processes and apply knowledge analytically
  • Strong knowledge of Microsoft products and commonly used engineering concepts and experience with engineering software
  • Familiarity with standards and practices of rotating engineering, such as API and ASME
  • Demonstrate effective organizational ability
  • Effective written and verbal communication skills
  • Ability to learn and apply engineering principles and methods, spatial and form perception and facility with mathematics
  • Ability to prioritize and balance multiple priorities

Nice to have

  • Reliability engineering experience
  • Root cause failure analysis
  • RCM analysis
  • Cause mapping experience
  • Experience in the refinery or petrochemical industry
  • Experience with some or all of FCC, HF Alky, Crude & Vacuum, Sulfur unit, Delayed Coking, and hydrotreating

What we offer

  • Medical Insurance
  • Vision Insurance
  • Dental Insurance
  • Paid Time-Off
  • 401(k) Retirement Plan with match
  • Educational Reimbursement
  • Parental Bonding Time
  • Employee Discounts

Similar Jobs for Rotating Reliability Engineer

8 matching positions

Sr Rotating Reliability Engineer

All activities will be performed in support of the strategy, and vision of the o...
Location: United States, Big Spring
Salary: Not provided
Delek US (delekus.com)
Expiration Date: Until further notice
Requirements
  • 4 year / Bachelor's Degree
  • Eight (8) or more years Oil & Gas, or related experience
  • Reliability Management
  • Asset Management
  • Rotating Equipment
  • Process Safety
  • Turnarounds
  • Continuous Improvement
  • Analysis
  • Budgeting
Job Responsibility
  • Responsible for sustaining and continuously improving various mechanical components for equipment and tools
  • Ensures the safe, effective operations of the organization's production and supports continuous improvement
  • Manages reliability engineering projects
  • Performs analytical verification
  • Evaluates, tests and tracks results of reliability interventions
  • Initiates reporting for internal or third-party reported incidents
  • Creates, documents and follows up on corrective actions
  • Prepares routine reports and memos and coordinates communications across all necessary functional groups of the organization
What we offer
  • Up to a 10% 401(k) match starting at hire, with a vesting timeline of only one year
  • Medical benefits that start on day one with a 30% premium rebate annually
  • Access to the Calm app for FREE
  • Performance management program to earn additional annual incentives
Employment type: Full-time

Senior Site Reliability Engineer - Fleet Reliability

Lambda, The Superintelligence Cloud, is a leader in AI cloud infrastructure serv...
Location: United States, San Francisco
Salary: 230000.00 - 345000.00 USD / Year
Lambda (lambda.ai)
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in Site Reliability Engineering, DevOps, or a similar role
  • Strong understanding of modern AI infrastructure, from GPU architectures to hardware performance optimization
  • Strong understanding of Linux-based systems in a distributed environment
  • Solid understanding of Python and Go, with experience working with SWE teams to improve internal tooling
  • Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, SumoLogic)
  • Proficiency in automation and configuration management tools (e.g., Ansible, Terraform)
  • Familiarity with cloud platforms (e.g., OCI, AWS, GCP, Azure)
  • Excellent problem-solving and troubleshooting skills
  • Strong communication and collaboration skills
  • Passion for continuous improvement and innovation
Job Responsibility
  • Define Fleet Health metrics and indicators to objectively measure and improve system availability
  • Collaborate with the observability team on comprehensive monitoring and alerting systems to proactively predict, detect and respond to issues or anomalies
  • Create runbooks and automated remediations for common failure scenarios
  • Build in automation and auditing to ensure compliance and improve efficiency and productivity
  • Participate in on-call rotations and provide support for incident response and resolution
  • Implement and integrate logging and metrics across platforms such as Datadog, Prometheus, OpenTelemetry, Grafana, SumoLogic, etc.
What we offer
  • Generous cash & equity compensation
  • Health, dental, and vision coverage for you and your dependents
  • Wellness and commuter stipends for select roles
  • 401k Plan with 2% company match (USA employees)
  • Flexible paid time off plan
Employment type: Full-time

Intermediate Site Reliability Engineer SRE – AI Reliability & Automation

At PointClickCare our mission is simple: to help providers deliver exceptional c...
Location: Canada, Mississauga
Salary: 115000.00 - 128000.00 CAD / Year
PointClickCare (pointclickcare.com)
Expiration Date: Until further notice
Requirements
  • 5+ years' experience in software engineering
  • Experience with SRE principles
  • Experience with AI/ML in production environments
  • A passion for automation, intelligent systems, and operational excellence
  • Strong debugging, problem-solving, and system design skills
  • Languages: Python, Java, Bash, Terraform
  • Platforms: Azure, Kubernetes, Docker
  • Tools: Datadog, Prometheus, AppDynamics, ELK, GitHub Actions
  • ML/AI: MCP framework, AI agents, Vector store, Agent orchestration (LangChain), RAG
  • CI/CD: Jenkins, ArgoCD, Spinnaker
Job Responsibility
  • Build ML-based anomaly detection and pattern recognition systems
  • Enhance telemetry with smart tagging and metadata for better AI insights
  • Develop event-driven workflows and self-healing systems using AI triggers
  • Automate incident response with generative AI and custom AI agent orchestration
  • Use time-series forecasting and predictive modelling to anticipate failures
  • Optimise infrastructure with AI-powered autoscaling and cost-aware resource allocation
  • Build scalable, fault-tolerant systems in a cloud-native environment
  • Participate in on-call rotations and lead incident response for critical systems
  • Skilled in API integration for streamlined data exchange and system connectivity
  • Run internal AIOps workshops and help teams adopt AI maturity models
What we offer
  • Benefits starting from Day 1!
  • Retirement Plan Matching
  • Flexible Paid Time Off
  • Wellness Support Programs and Resources
  • Parental & Caregiver Leaves
  • Fertility & Adoption Support
  • Continuous Development Support Program
  • Employee Assistance Program
  • Allyship and Inclusion Communities
  • Employee Recognition … and more!
Employment type: Full-time

Site Reliability Engineer

We are currently seeking a Site Reliability Engineer to join our team in Westlak...
Location: United States, Westlake
Salary: Not provided
NTT DATA (nttdata.com)
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in Site Reliability Engineering, DevOps Engineering, Platform Engineering, or related disciplines (understanding reliability engineering principles, SLIs, SLOs, error budgets, and operational excellence)
  • 5+ years’ hands-on Terraform experience
  • 5+ years’ experience supporting mission-critical enterprise applications in production environments
  • 5+ years’ experience with cloud networking, security, and infrastructure architecture
  • 5+ years of hands-on experience managing hybrid cloud environments
  • 5+ years of automation skills using Python, Ansible, Shell scripting, or similar technologies
  • 5+ years’ experience building reusable infrastructure modules and automated deployment frameworks
Job Responsibility
  • Design, implement, and support highly available load balancing solutions using F5 BIG-IP, Broadcom AVI, and cloud-native load balancing services
  • Build and maintain Infrastructure-as-Code (IaC) solutions using Terraform
  • Develop automation solutions for infrastructure provisioning, configuration management, and operational workflows
  • Support and enhance CI/CD pipelines using tools such as Jenkins, Azure DevOps, GitHub Actions, or similar platforms
  • Collaborate with application, cloud, network, and platform teams to improve reliability, performance, and scalability
  • Monitor production systems and proactively identify reliability, performance, and availability risks
  • Implement Site Reliability Engineering best practices including observability, incident management, capacity planning, and resiliency engineering
  • Troubleshoot complex issues across networking, cloud infrastructure, load balancing, and application environments
  • Support hybrid infrastructure environments spanning on-premises datacenters and public cloud platforms
  • Participate in on-call rotation and provide production support for critical business applications
Employment type: Full-time

Site Reliability Engineer II

Location: United States, Exton
Salary: Not provided
Bentley Systems (bentley.com)
Expiration Date: Until further notice
Requirements
  • U.S. Master of Science degree, or foreign equivalent, in Information Quality, Computer and Information Science, or a closely related field, and 3 years of DevOps Engineering experience
  • 3 years’ experience with Site Reliability Engineering and DevOps automation including designing, implementing and maintaining CI/CD pipelines for cloud-based production systems
Job Responsibility
  • Responsible for designing, implementing, and maintaining automated cloud infrastructure and CI/CD pipelines to support enterprise software applications
  • Perform DevOps automation, Infrastructure as Code, and containerized deployments to improve system reliability, scalability, and operational efficiency while reducing manual intervention
  • Cloud platforms Azure and Amazon Web Services (AWS), including infrastructure provisioning, networking architecture, identity management and security configuration
  • Developing and maintaining IaC using Terraform, along with automation and scripting using Python or PowerShell, and configuration management using Ansible to support scalable and reliable cloud environments
  • Containerization and orchestration technologies, including Docker, Kubernetes and Helm for deploying, scaling, and managing distributed cloud-native applications
  • Build and maintain monitoring, logging, and alerting systems (e.g., Prometheus, Grafana) and participate in a rotating on-call schedule for production support

Principal Site Reliability Engineer

We are looking for a Principal Engineer to join our SDWAN engineering team. You ...
Location: Bulgaria, Sofia
Salary: Not provided
Palo Alto Networks (paloaltonetworks.com)
Expiration Date: Until further notice
Requirements
  • 6+ years as DevOps engineer with a passion for technology, strong motivation and responsibility
  • Proficiency in DevOps and Platform Engineering with expertise in AWS, GCP, Terraform, ArgoCD, Kubernetes, and related tools
  • Experience in developing and maintaining CI/CD pipelines for continuous delivery in agile environments
  • Skilled in managing cloud infrastructure, particularly with AWS and GCP, and adept in infrastructure as code practices using Terraform/Terragrunt
  • Demonstrated capability in supporting high-scale SaaS applications, focusing on scalability, reliability, and performance
  • Excellent written and verbal communication, able to collaborate and rally support
  • Self-disciplined, self-managed, self-motivated, strong sense of ownership, urgency, and drive
  • Passion for infrastructure and monitoring as code
  • Ready to understand and dissect new technology stacks quickly
Job Responsibility
  • Implement and optimize CI/CD pipelines and cloud infrastructure using our technology stack, ensuring efficient and reliable deployment to production
  • Participate in the deployment of monitoring and alerting systems to maintain high system performance and reliability
  • Collaborate with software development and other cross-functional teams to streamline and enhance processes, aiming for efficiency and alignment with business goals
  • Contribute to the management of the cloud infrastructure, utilizing Infrastructure as Code principles
  • Participate in on-call rotations to support critical business and production systems
Employment type: Full-time

Site Reliability Engineer

We are looking for a Site Reliability Engineer (SRE) to support reliable, high-p...
Location: United States, Novi
Salary: Not provided
Robert Half (https://www.roberthalf.com)
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Information Technology, Computer Science, Computer Engineering, or comparable practical experience
  • At least 5 years of experience supporting production environments in a corporate, startup, or similarly fast-paced technical setting
  • Hands-on expertise with infrastructure as code, including Terraform, along with experience in cloud platforms and related services
  • Working knowledge of container technologies such as Docker and orchestration platforms like Kubernetes
  • Experience supporting live systems, participating in on-call rotations, and contributing to incident reviews and corrective actions
  • Proficiency with automation and scripting using Bash and Python to reduce manual operational effort
  • Strong communication skills with the ability to explain technical decisions and tradeoffs to cross-functional or non-technical stakeholders
  • Willingness and ability to travel to customer or plant locations as business needs require
Job Responsibility
  • Maintain dependable and secure production environments across plant-edge and cloud-based systems, with a focus on uptime, responsiveness, and operational stability
  • Design, refine, and support monitoring dashboards, alerting frameworks, and operational runbooks using tools such as Prometheus, Grafana, and modern telemetry solutions
  • Build and manage infrastructure through code using Terraform, applying version control standards, peer reviews, and controlled deployment processes
  • Create automation scripts and lightweight tools in Bash and Python to streamline routine operations, recovery procedures, backup workflows, and environment setup
  • Take part in incident response and on-call coverage, troubleshoot service disruptions, coordinate initial communication, and document follow-up actions through blameless reviews
  • Establish and measure service reliability indicators and objectives, helping stakeholders balance system dependability with release speed and operational risk
  • Support secure connectivity between factory networks and cloud resources by configuring and maintaining VPNs, routing, private networking, and access controls
  • Administer and optimize relational or time-series databases, including backup planning, replication, performance tuning, and long-term storage health
  • Contribute to CI/CD delivery practices by improving deployment pipelines, supporting controlled release strategies, and preparing rollback procedures when needed
  • Partner with controls, software, and data teams to enable reliable data flow from industrial systems and ensure safe deployment to edge infrastructure
What we offer
  • Medical, vision, dental, and life and disability insurance
  • 401(k) plan

Site Reliability Engineer II

Microsoft is a company where passionate innovators come to collaborate, envision...
Location: India, Hyderabad
Salary: Not provided
Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Master's Degree in Computer Science, Information Technology, or related field AND 2+ year(s) technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Job Responsibility
  • Work with all aspects of a high throughput and multi-tenant service
  • Collaborate effectively within the team and with partner teams across Microsoft
  • Be part of the on-call rotation for maintaining service health
  • Design, implement, and refine chosen solutions in close partnership with Product Management and partner teams
  • Champion operational excellence via established metrics, process governance, and policy controls for regular assessment and improvement
  • Document and define existing data engineering processes, data and technology, while evaluating them for optimization
  • System Reliability & Uptime – Ensuring high availability of services
  • Incident Management – Detecting, responding to, and mitigating system failures
  • Performance Monitoring – Tracking system health and resolving bottlenecks
  • Automation & Tooling – Reducing manual work through scripts and automation
Employment type: Full-time