Site Reliability Engineer II

Genuine Parts Company

Location:
United States, Birmingham

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Under general supervision, the Site Reliability Systems Administrator II is responsible for improving system reliability and resilience. This role focuses on building automation to reduce manual effort and prevent service-impacting incidents. The SRE combines software and systems engineering to build and support large-scale, distributed, fault-tolerant systems. This role ensures that critical platforms are available, reliable, and able to support a fast rate of improvement. It relies on monitoring platforms and continually takes a holistic view of system health and performance. The SRE enhances and supports cloud-based transformations, with a focus on pushing capabilities forward, staying ahead of customer needs, and innovating for continuous improvement. The SRE also provides operational support and engineering for multiple large-scale distributed software applications.

Job Responsibility:

  • Defines, designs, and administers network systems used for data communications and recommends improvements to problems of moderate scope
  • Ensures that the company network remains operational
  • Manages the load configuration of a central data communication processor under limited guidance and makes some recommendations for the purchase or upgrade of data networks
  • Exercises some discretion in proposing and implementing network system enhancements (software and hardware updates)
  • Serves as a point of contact for performance analysis, scalability, and service architecture/database administration issues
  • Coordinates equipment orders including terminals and cable installation, as well as upgrading, monitoring, testing, and servicing the database/systems
  • Helps to negotiate and place orders with common carriers
  • Performs other duties as assigned

Requirements:

  • Bachelor's degree
  • Three (3) to five (5) years of related experience or an equivalent combination
  • Intermediate knowledge of appropriate networks, products, and protocols
  • Knowledge of Unix, Windows NT/2000/98, Internet Security, Oracle ERP, Distributed computing systems
  • Knowledge of job-related databases, software, documentation, programming languages, and monitoring and version control tools
  • Troubleshooting skills
  • Problem solving skills
  • Demonstrated knowledge and adherence to Change Management processes
  • Ability to interface well with customers, end users, partners, and associates

What we offer:

  • Healthcare coverage
  • 401(k)
  • Tuition reimbursement
  • Vacation
  • Sick pay
  • Holiday pay

Additional Information:

Job Posted:
December 25, 2025

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Site Reliability Engineer II

Senior Security Operations Engineer II

As a Senior Security Operations Engineer, you’ll play a key role in ensuring the...
Location:
United States, Scottsdale
Salary:
Not provided
Axon
Expiration Date
Until further notice
Requirements
  • 7+ years of experience in operations, site reliability, or infrastructure engineering roles
  • Strong experience securing and managing cloud environments (e.g., AWS, Azure) and containerized workloads
  • Deep understanding of Linux systems, networking, distributed systems, and their associated security controls
  • Proficiency in automation, scripting, and security tooling integration to streamline operations and enforcement
  • Experience with security monitoring, alerting, SIEM platforms, and observability tools
  • Solid grasp of CI/CD practices with integrated security testing and compliance checks
  • Experience managing Kubernetes clusters and running containerized workloads in production
  • Experience deploying and administering any of the following: scalable cloud-native secrets solutions such as AWS KMS or Azure Key Vault; PKI solutions such as EJBCA, Smallstep, or Venafi; or vaulting solutions such as HashiCorp Vault
Job Responsibility
  • Implementing and improving automated security checks in CI/CD pipelines to prevent vulnerabilities from reaching production
  • Writing, reviewing, and maintaining security-focused infrastructure-as-code for scalable and compliant deployments
  • Investigating security incidents, performing root cause analysis, and implementing long-term mitigation strategies
  • Collaborating with developers to develop new features, services, and infrastructure requirements
  • Enhancing security observability through improved log collection, metrics, and alerting configurations
  • Maintaining and improving security runbooks, incident response playbooks, and internal security tooling for operational efficiency
  • Resolve security and infrastructure incidents by taking part in high-impact, high-visibility incidents as a participant and, ideally, as an incident commander
  • Maintain and secure critical infrastructure components such as PKI (Public Key Infrastructure) and IAM (Identity & Access Management) systems, ensuring reliability, scalability, and compliance with organizational and industry security standards
  • Build and maintain secure, reliable, and scalable infrastructure that protects core services and sensitive data
  • Troubleshoot and resolve complex operational and system-level issues across environments
What we offer
  • Competitive salary and 401k with employer match
  • Discretionary paid time off
  • Paid parental leave for all
  • Medical, Dental, Vision plans
  • Fitness Programs
  • Emotional & Mental Wellness support
  • Learning & Development programs
  • Snacks in our offices
Employment Type:
Full-time

Site Reliability Engineer II

Site Reliability Engineer II - (Microsoft 365 Enterprise + Cloud). We are lookin...
Location:
Ireland, Dublin
Salary:
Not provided
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Mid-level experience in software development; automation-related experience is most valued
  • Scripting languages such as bash, python, and PowerShell, or compiled languages such as C, C# are most relevant, but others are acceptable
  • Awareness of, and ability to reason about, modern software & systems architectures, including load-balancing, queueing, caching, distributed systems failure modes, microservices, and so on
  • Associated troubleshooting skills, including the ability to follow RPC (Remote Procedure Call) call-chains across arbitrary network steps
  • Consequent understanding of monitoring in distributed systems
  • Deep understanding of operating-system-level concepts such as processes, memory allocation, and the network stack; understanding of how applications are affected by these, and the ability to debug them
  • Experience with working in a team, including coordinating large projects, communicating well, and exercising initiative when presented with problems
  • Practical experience running large scale online systems is always an advantage
Job Responsibility
  • Researches and maintains deep knowledge of industry trends as well as advances in large-scale distributed systems and cloud technologies
  • Identifies opportunities to create, implement, and/or optimally utilize new tools, technologies, and/or processes to solve ambiguous problems and improve product availability, reliability, efficiency, observability, and/or performance
  • Drives the adoption of innovative solutions across engineering teams working with related products within an organization
  • Apply advanced statistical and machine learning techniques to analyze large datasets and extract meaningful insights
  • Experience working with all service aspects of high throughput and multi-tenant services, ability to understand and design workflows carefully, properly handle errors, write clean and well-factored code with good tests and good maintainability
  • Engages with product engineering teams by partaking in code/design reviews, participating in on-call rotations and incident responses throughout product development and operations cycles
  • Leverages end-to-end technical expertise on underlying systems/platforms and insights from engagements with product engineering teams and telemetry analyses to propose scalable improvements in code and designs with attention to customer/business objectives and incident prevention
  • Develops code, scripts, systems, or platforms that automate moderately complex but repetitive operations processes (e.g., monitoring, alerting, deploying products and updates, debugging) at scale
  • Reviews existing automation code and scripts to evaluate reusability, extensibility, and scalability within an organization
  • Analyzes data from telemetry pipelines and monitoring tools that detail operations metrics (e.g., availability, reliability, performance, efficiency) of systems, platforms, or products operating at scale
Employment Type:
Full-time

Site Reliability Engineer II

We are the Data Center Network Services team within Cisco IT, supporting network...
Location:
United States of America, Research Triangle Park, North Carolina
Salary:
109900.00 - 200100.00 USD / Year
Duo Security
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Engineering or Technology, with 0-3 years of experience in building, testing, or deploying scalable network applications
  • Strong programming skills, with expertise in Python and Ansible scripting
  • Hands-on experience with tools such as JIRA, Git, and Jenkins
  • Proficiency with Continuous Integration/Continuous Deployment (CI/CD) and pipeline setup
  • Solid understanding of software engineering concepts: data structures, algorithms, object-oriented programming, distributed systems, and cloud computing
Job Responsibility
  • Design, develop, test, and deploy new software capabilities for Data Center Networks
  • Collaborate with engineers across multiple disciplines and engage with internal clients
  • Deliver innovative, high-quality solutions that enhance the client experience
What we offer
  • Medical, dental and vision insurance
  • 401(k) plan with a Cisco matching contribution
  • Paid parental leave
  • Short and long-term disability coverage
  • Basic life insurance
  • 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
  • 1 paid day off for employee’s birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness
  • Non-exempt employees receive 16 days of paid vacation time per full calendar year
  • Exempt employees participate in Cisco’s flexible vacation time off program
  • 80 hours of sick time off provided on hire date and each January 1st thereafter
Employment Type:
Full-time

Software Engineer II - CoreAI

As an AI Engineer on the CoreAI Platform team, you will apply artificial intelli...
Location:
United States, Redmond
Salary:
100600.00 - 199000.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility
  • Design, build, and scale AI models to detect anomalies and identify regressions across large-scale AI systems
  • Analyze patterns in telemetry, logs, and real‑time signals to uncover root causes, predict failures, and drive proactive mitigations
  • Apply AI to identify emerging usage trends, performance hotspots, and workload irregularities that impact system health and user experience
  • Build lightweight automation that leverages anomaly detection signals and pattern analysis to improve live‑site reliability and engineering velocity
  • Contribute to hotfixes, performance tuning, and reliability improvements in production AI engines (e.g., GPU savings, SLA reliability, customer satisfaction)
  • Build intuitive, responsive UI components for AI dashboards and telemetry tools using React and modern web technologies
  • Communicate technical concepts with clarity and initiative, proactively seeking feedback and driving continuous improvement
  • Stay current with industry trends in applied AI, observability, and performance engineering
Employment Type:
Full-time

Site Reliability Engineer II

Fivetran is looking for a high-performance engineer to be a part of a team of Si...
Location:
United States, Denver
Salary:
120507.78 - 144615.12 USD / Year
Fivetran
Expiration Date
Until further notice
Requirements
  • Knowledge of Cloud Platforms and related tooling: AWS, GCP, Azure, Terraform, configuration management
  • Experience in a scripting language
  • A strong foundation in Linux operating system internals and administration
  • Knowledge of Kubernetes
  • Familiarity with a relational database
Job Responsibility
  • Responsible for monitoring the availability, capacity, and throughput of Fivetran's production infrastructure to identify and address potential issues
  • Collaborate with engineering teams to integrate reliability best practices into the product roadmap
  • Support the prioritization and resolution of critical bugs identified by support or sales
  • Contribute to maintaining 100% availability of production infrastructure by collaborating with engineering to implement automation for scalable deployments
  • Proactively monitor infrastructure vulnerabilities and collaborate with the security team to address them in a timely manner
What we offer
  • 100% employer-paid medical insurance
  • Generous paid time-off policy (PTO), plus paid sick time, inclusive parental leave policy, holidays, and volunteer days off
  • RSU stock grants
  • Professional development and training opportunities
  • Company virtual happy hours, free food, and fun team-building activities
  • Monthly cell phone stipend
  • Access to an innovative mental health support platform that offers personalized care and resources in areas such as: therapy, coaching, and self-guided mindfulness exercises for all covered employees and their covered dependents
Employment Type:
Full-time

Site Reliability Engineer II

Fivetran is looking for a high-performance engineer to be a part of a team of Si...
Location:
United States, Oakland
Salary:
133897.53 - 160683.46 USD / Year
Fivetran
Expiration Date
Until further notice
Requirements
  • Knowledge of Cloud Platforms and related tooling: AWS, GCP, Azure, Terraform, configuration management
  • Experience in a scripting language
  • A strong foundation in Linux operating system internals and administration
  • Knowledge of Kubernetes
  • Familiarity with a relational database
Job Responsibility
  • Responsible for monitoring the availability, capacity, and throughput of Fivetran's production infrastructure to identify and address potential issues
  • Collaborate with engineering teams to integrate reliability best practices into the product roadmap
  • Support the prioritization and resolution of critical bugs identified by support or sales
  • Contribute to maintaining 100% availability of production infrastructure by collaborating with engineering to implement automation for scalable deployments
  • Proactively monitor infrastructure vulnerabilities and collaborate with the security team to address them in a timely manner
What we offer
  • 100% employer-paid medical insurance
  • Generous paid time-off policy (PTO), plus paid sick time, inclusive parental leave policy, holidays, and volunteer days off
  • RSU stock grants
  • Professional development and training opportunities
  • Company virtual happy hours, free food, and fun team-building activities
  • Monthly cell phone stipend
  • Access to an innovative mental health support platform that offers personalized care and resources in areas such as: therapy, coaching, and self-guided mindfulness exercises for all covered employees and their covered dependents
Employment Type:
Full-time

Site Reliability Engineer II

Doctolib’s Engineering environment is rich and we are building innovative produc...
Location:
France, Nantes; Paris
Salary:
Not provided
Doctolib
Expiration Date
Until further notice
Requirements
  • Solid hands-on experience (3y+) on a large-scale production platform
  • Proven experience with cloud platforms such as AWS, Azure or Google Cloud
  • Solid understanding of containerization and orchestration technologies (Docker and Kubernetes)
  • Strong understanding of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows
  • Proficiency in at least one programming language (Ruby, Python, Go, Java, etc.) and a deep understanding of infrastructure as code principles
  • Experience with monitoring and observability tools
  • Enjoy troubleshooting performance issues in complex environments
  • Speak English
Job Responsibility
  • Platform Reliability: Design, build, and maintain the core platform infrastructure to enable scalability and resilience
  • Automation and Efficiency: Develop tools and processes to automate the deployment, scaling, and lifecycle management of services
  • Monitoring and Incident Management: Implement robust monitoring, alerting, and incident response mechanisms
  • Disaster Recovery: Design and execute disaster recovery strategies
  • Collaborate with Feature Teams: Partner with product and engineering teams to embed reliability best practices
  • Continuous Improvement: Research and evaluate emerging technologies and tools
  • On-Call Ownership: Participate in an on-call rotation
What we offer
  • Free Health Insurance for you & your family
  • Up to 14 days of RTT
  • Parental care program (1 month off in addition to the legal parental leave and 0.5 days off per child when school starts)
  • Wellbeing program (free mental health and coaching offer with our partner moka.care)
  • A flexible workplace policy offering both hybrid and office-based mode
  • Flexibility days allowing you to work from EU countries and the UK 10 days per year
  • Lunch voucher with Swile card
  • Works Council subsidy to refund part of a sports club membership or creative class
  • Bicycle subsidy
Employment Type:
Full-time