Site Reliability Engineer Job at Corporate Tools

Job Description

Corporate Tools is looking for a Site Reliability Engineer. You will be a traditional company employee. This is a remote position, but if you’re near one of our local offices, you’re welcome to come hang out with us in the office as well. Our main offices are in Post Falls, ID, and Spokane, WA; we also have satellite offices in Austin, TX, and Salt Lake City, UT. You’ll work 40 hours a week and, of course, enjoy great company benefits. Our Site Reliability Engineer should help keep our systems steady, secure, and running like a well-oiled machine (except without actual oil). You’ll work closely with our DevOps engineers to build out tools and automation that make things faster, easier, and less painful for everyone.

Job Responsibilities

Stop problems before they start
Fix issues quickly and learn from them
Help keep systems steady, secure, and running
Work closely with DevOps engineers to build out tools and automation
Take ownership

Requirements

Bachelor's degree in Computer Science, Software Engineering, or equivalent practical experience
5+ years of experience in software engineering
2+ years of experience in site reliability engineering, DevOps, or infrastructure engineering roles
Deep experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code tools such as Terraform, CloudFormation, or Pulumi
Strong proficiency with Kubernetes, Docker, and container orchestration in production environments
Hands-on experience with observability and monitoring tools like Prometheus, Grafana, OpenTelemetry, Sentry, or New Relic
Proven ability to design and implement highly available, fault-tolerant systems and lead proactive incident response efforts
Experience with performance tuning, database optimization, and caching strategies (e.g., PostgreSQL, Redis, Memcached)
Demonstrated ability to drive reliability improvements, reduce operational toil, and foster a culture of resilience and continuous improvement
Experience leading reliability-focused initiatives such as post-incident reviews, capacity planning, and root cause analysis
Experience in site reliability engineering within Ruby on Rails environments
Familiarity with the Grafana observability stack and related tools (e.g., Alloy, Loki, Tempo, Prometheus)
In-depth experience with AWS services, including ECS, EKS, Route 53, and other related tools
Proven ability to collaborate across teams to improve service reliability, reduce incident frequency, and drive operational excellence
Ability to troubleshoot and resolve complex production issues, applying SRE best practices to minimize impact and prevent recurrence

What we offer

100% employer-paid medical, dental and vision for employees
Annual review with raise option
22 days of paid time off accrued annually, plus 4 holidays
After 3 years, PTO increases to 29 days
Employees transition to flexible time off after 5 years with the company—not accrued, not capped, take time off when you want
Paid Parental Leave
Up to 6% company matching 401(k) with no vesting period
Quarterly allowance
Open concept office with friendly coworkers
Creative environment where you can make a difference
Trail Mix Bar
