Senior Software Engineer, Infrastructure Observability

Temporal

Location:
United States

Contract Type:
Not provided

Salary:
180000.00 - 225000.00 USD / Year

Job Description:

We have an opening for a Senior Software Engineer on our Infrastructure Team, with a specific focus on Observability, both internal and customer-facing. This is an opportunity to join a mission-critical engineering team that drives the productivity of Temporal’s developers and the reliability of its core platforms. We are a passionate team of talented developers who care about our mission and enjoy working deeply across the entire Temporal ecosystem to be a force multiplier across the organization.

Job Responsibility:

  • Lead the end-to-end Software Development Lifecycle: goals & requirements solicitation, design & review, implementation, operationalization & deployment, support & maintenance
  • Formulate feature designs, review with stakeholders, iterate to incorporate feedback and drive consensus
  • Clearly document design choices and operational knowledge to successfully deploy and manage the software you develop
  • Provide appropriate unit, integration, and performance test coverage, along with production readiness, for your feature ownership area
  • Set a high bar for technical excellence and take pride in the software you develop
  • Design and build multi-component, distributed systems that operate at scale
  • Investigate issues with a methodical approach to identify a root cause
  • Understand the performance and reliability implications of design options at scale, and make the related tradeoffs
  • Participate in the team’s on-call rotation
  • Maintain expert-level knowledge of the architecture and services of your assigned domain, and a strong command of all aspects of the Temporal ecosystem
  • Investigate and understand ways to best leverage Temporal’s own software to power our mission
  • Deeply understand the needs of Temporal internal developers and external customers, and leverage that knowledge for product development and feature design
  • Participate in design reviews and contribute to design of other features
  • Share design principles for building reliable systems at scale

Requirements:

  • Demonstrated ability to develop horizontally scalable, resilient, and high performance distributed systems in a production environment
  • Experience designing, implementing, deploying, and supporting large scale, geographically distributed observability and/or high throughput data streaming/processing pipelines, or similar
  • Expert in one or more high-level programming languages, preferably Go
  • Expert-level Kubernetes skills
  • Expert-level query development skills, preferably SQL
  • Hands-on experience with one or more cloud providers, preferably AWS or GCP
  • Thorough understanding of computer architecture, operating systems, and networking
  • Familiarity with best practices regarding monitoring, instrumenting, and configuring infrastructure
  • User-first mindset
  • Motivated by impact
  • Strong opinions about tools and technology that are equally balanced by a pragmatic drive for impact
  • Ability to work in a self-directed manner in a fast-paced environment
  • Excellent collaboration and communication skills

What we offer:

  • Unlimited PTO, 12 Holidays + 2 Floating Holidays
  • 100% Premiums Coverage for Medical, Dental, and Vision
  • AD&D, LT & ST Disability, and Life Insurance (Standard & Supplemental Available)
  • Empower 401K Plan
  • Additional Perks for Learning & Development, Lifestyle Spending, In-Home Office Setup, Professional Memberships, WFH Meals, Internet Stipend and more
  • $3,600 / Year Work from Home Meals
  • $1,500 / Year Career Development & Learning
  • $1,200 / Year Lifestyle Spending Account
  • $1,000 / Year In-Home Office Setup (In addition to Temporal issued equipment)
  • $500 / Year Professional Memberships
  • $74 / Month Reimbursement for Internet
  • Calm App Subscription for Mental Health & Wellness
  • This role is eligible to participate in Temporal's equity plan

Additional Information:

Job Posted:
December 12, 2025

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for Senior Software Engineer, Infrastructure Observability

Senior Software Engineer, Observability

The Observability team at Airtable ensures that engineers have the tools they ne...
Location:
United States, San Francisco; New York; Seattle
Salary:
196000.00 - 270000.00 USD / Year
Airtable
Expiration Date:
Until further notice
Requirements:
  • 6+ years of software engineering experience
  • 3+ years focused on observability or infrastructure at scale
  • Demonstrated success implementing and running production-grade logging, metrics, or tracing systems
  • Proficiency in distributed systems concepts, data streaming pipelines, and container orchestration (Kubernetes)
  • Deep hands-on knowledge of tools such as Prometheus, Grafana, Datadog, OpenTelemetry, ELK Stack, Loki, or ClickHouse
  • Comfort with at least one programming language (e.g., Go, Python, Java) to build and maintain observability tooling
  • Experience mentoring engineers and collaborating across multiple teams
  • Strong communication skills
  • Eagerness to own high-impact initiatives
  • Proven ability to balance short-term fixes with long-term strategic vision
Job Responsibility:
  • Architect and scale core observability systems
  • Lead the design and evolution of logging, metrics, and tracing pipelines
  • Evaluate and integrate new technologies (e.g., OpenTelemetry, ClickHouse, ELK stack)
  • Guide and mentor a growing team of infrastructure engineers
  • Define and uphold coding standards and operational excellence
  • Partner with Deploy Infrastructure, Service Orchestration, and Product teams
  • Align infrastructure decisions with business goals
  • Own end-to-end reliability for observability tools and establish SLAs, SLOs, and error budgets
  • Optimize performance and cost of large-scale data pipelines
  • Shape the observability roadmap
What we offer:
  • Opportunity to receive benefits
  • Restricted stock units
  • May include incentive compensation
  • Comprehensive benefit offerings
  • Fulltime

Senior Software Engineer (Infrastructure) - HyperDX

Join us in revolutionizing Observability for Developers! We’re on a mission to r...
Location:
Netherlands
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 5+ years of backend engineering experience
  • Strong TypeScript and Node.js skills (bonus for additional languages)
  • Deep understanding of APIs, event-driven systems, and high-throughput data pipelines
  • Proficiency in SQL and experience working with analytical databases (ClickHouse experience a plus)
  • Experience with Docker and Kubernetes, plus Helm for managing production deployments
  • Experience with infrastructure-as-code (Terraform, Pulumi, or similar)
  • Familiarity with CI/CD pipelines, monitoring systems, and production-grade alerting practices
  • A passion for building reliable, maintainable, cloud-native systems
Job Responsibility:
  • Build the core platform: Design and implement backend systems and APIs that power HyperDX, enabling engineers to ingest, query, and analyze observability data at massive scale
  • Scale deployments and infrastructure: Architect, deploy, and maintain cloud-native systems that ensure reliability, scalability, and performance. You’ll use Kubernetes, Helm, and infrastructure-as-code to make deployments simple and resilient
  • Ensure maintainability and operational excellence: Define best practices for CI/CD, monitoring, logging, and alerting. Drive automation across testing, scaling, and incident response to keep our platform healthy and developer-friendly
  • Engineer for scale: Design and operate ingestion and data processing pipelines that remain performant, resilient, and observable—even as we grow to petabyte-level workloads
  • Engage with the community: Collaborate with open-source contributors and customers, solve their challenges, and incorporate their feedback into our roadmap
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
  • Fulltime

Senior Software Engineer (Infrastructure) - HyperDX

Join us in revolutionizing Observability for Developers! We’re on a mission to r...
Location:
Germany
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 5+ years of backend engineering experience
  • Strong TypeScript and Node.js skills (bonus for additional languages)
  • Deep understanding of APIs, event-driven systems, and high-throughput data pipelines
  • Proficiency in SQL and experience working with analytical databases (ClickHouse experience a plus)
  • Experience with Docker and Kubernetes, plus Helm for managing production deployments
  • Experience with infrastructure-as-code (Terraform, Pulumi, or similar)
  • Familiarity with CI/CD pipelines, monitoring systems, and production-grade alerting practices
  • A passion for building reliable, maintainable, cloud-native systems
Job Responsibility:
  • Build the core platform: Design and implement backend systems and APIs that power HyperDX, enabling engineers to ingest, query, and analyze observability data at massive scale
  • Scale deployments and infrastructure: Architect, deploy, and maintain cloud-native systems that ensure reliability, scalability, and performance. You’ll use Kubernetes, Helm, and infrastructure-as-code to make deployments simple and resilient
  • Ensure maintainability and operational excellence: Define best practices for CI/CD, monitoring, logging, and alerting. Drive automation across testing, scaling, and incident response to keep our platform healthy and developer-friendly
  • Engineer for scale: Design and operate ingestion and data processing pipelines that remain performant, resilient, and observable—even as we grow to petabyte-level workloads
  • Engage with the community: Collaborate with open-source contributors and customers, solve their challenges, and incorporate their feedback into our roadmap
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
  • Fulltime

Senior Software Engineer (Infrastructure) - HyperDX

Join us in revolutionizing Observability for Developers! We’re on a mission to r...
Location:
United Kingdom
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 5+ years of backend engineering experience
  • Strong TypeScript and Node.js skills (bonus for additional languages)
  • Deep understanding of APIs, event-driven systems, and high-throughput data pipelines
  • Proficiency in SQL and experience working with analytical databases (ClickHouse experience a plus)
  • Experience with Docker and Kubernetes, plus Helm for managing production deployments
  • Experience with infrastructure-as-code (Terraform, Pulumi, or similar)
  • Familiarity with CI/CD pipelines, monitoring systems, and production-grade alerting practices
  • A passion for building reliable, maintainable, cloud-native systems
Job Responsibility:
  • Build the core platform: Design and implement backend systems and APIs that power HyperDX, enabling engineers to ingest, query, and analyze observability data at massive scale
  • Scale deployments and infrastructure: Architect, deploy, and maintain cloud-native systems that ensure reliability, scalability, and performance. You’ll use Kubernetes, Helm, and infrastructure-as-code to make deployments simple and resilient
  • Ensure maintainability and operational excellence: Define best practices for CI/CD, monitoring, logging, and alerting. Drive automation across testing, scaling, and incident response to keep our platform healthy and developer-friendly
  • Engineer for scale: Design and operate ingestion and data processing pipelines that remain performant, resilient, and observable—even as we grow to petabyte-level workloads
  • Engage with the community: Collaborate with open-source contributors and customers, solve their challenges, and incorporate their feedback into our roadmap
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Senior Software Engineer, Release Engineering

We’re looking for a Senior Software Engineer to join our Release Engineering tea...
Location:
United States
Salary:
143000.00 - 203000.00 USD / Year
dbt Labs
Expiration Date:
Until further notice
Requirements:
  • Experience designing, operating, or improving CI/CD systems for large-scale distributed applications
  • Proficiency with one or more of the following: Helm, ArgoCD, Terraform, GitHub Actions, or Kubernetes
  • Familiarity with infrastructure-as-code practices and the principles of reliable, observable systems
  • Background in Python (or other modern language) development for automation or platform tooling
  • A collaborative mindset and interest in enabling other developers through tooling and platform improvements
  • Experience working asynchronously as part of a fully remote, distributed team
Job Responsibility:
  • Design, build, and maintain components of our CI/CD platform to make deployments safer, faster, and more reliable
  • Lead initiatives that improve automation, observability, and self-service capabilities for engineers
  • Collaborate across teams to identify friction points in our delivery process and build tools to eliminate them
  • Evolve our release architecture to support dbt Cloud’s multi-cloud, cell-based infrastructure at scale
  • Continuously improve developer experience by refining build pipelines, release workflows, and infrastructure-as-code practices
What we offer:
  • Unlimited vacation
  • 401k w/3% guaranteed contribution
  • Excellent healthcare
  • Paid Parental Leave
  • Wellness stipend
  • Home office stipend
  • Fulltime

Senior Software Engineer, Platform Observability

Everlaw is looking for a Senior Software Engineer that brings experience in buil...
Location:
United States, Oakland
Salary:
164000.00 - 208000.00 USD / Year
Everlaw
Expiration Date:
Until further notice
Requirements:
  • BS or MS in Computer Science, or equivalent coursework
  • At least 3 years of experience building logging, metrics, and tracing infrastructure
  • Proficiency in coding in a language such as C, C++, C#, Java, Python, JavaScript, Go, or Rust
  • Experience with Infrastructure as Code and container solutions to manage cloud environments (e.g., Terraform, Ansible, Docker)
  • At least 1 year of experience leading multi-developer efforts, including planning, technical breakdown, and coordination
  • Excellent communication and collaboration skills
  • Please note that at this time, Everlaw is not sponsoring U.S. employment visas for this role. Due to federal contract requirements, Everlaw may only hire US citizens for this position.
Job Responsibility:
  • Build observability strategies to support application and infrastructure metrics, logs, traces, dashboards, and alerts
  • Develop and maintain infrastructure as code (IaC) using tools such as Terraform and Ansible
  • Monitor usage trends to identify opportunities to optimize efficiency and performance of our metrics database and logging tools
  • Improve our on-call and incident management processes by encouraging deeper understanding, communication, and trust
  • Support developer projects by influencing design and implementation of infrastructure features as well as providing technical guidance
  • Support compliance efforts by promoting continuous documentation of our processes and involvement in audits
  • Provide technical mentorship to other engineers by sharing your technical knowledge and becoming an expert in an area of our code base
What we offer:
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
  • Paid parental leave and approximately 10 days (80 hours) per year of sick leave
  • Seventeen paid vacation days plus 11 federal holidays
  • Membership to Modern Health to help employees prioritize mental health and wellness
  • Annual allocation for Learning & Development opportunities and applicable professional membership dues
  • Company-sponsored life and disability insurance
  • Work in Uptown Oakland, just steps from the BART line and dozens of restaurants and walking distance to Lake Merritt
  • Fulltime

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility:
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer:
  • Medical, dental, vision, and 401(k)
  • Fulltime

Senior Software Engineer - Observability and Reliability

We are growing the engineering team and looking for engineers who have the chops...
Location:
United States, New York City
Salary:
150000.00 - 220000.00 USD / Year
Sigma Computing
Expiration Date:
Until further notice
Requirements:
  • Strong Computer Science fundamentals
  • 5+ years industry experience building and maintaining high-quality software, especially software other engineers use
  • You apply a product mindset to infrastructure systems and feel accomplished enabling others
  • Desire to be a great teammate and have fun at work
  • Strong sense of craftsmanship, and a healthy academic curiosity
Job Responsibility:
  • Build observability tools and platforms, including: metrics, logging, distributed tracing, dashboarding, alerting, application performance management
  • Build with modern tools and languages like Go, OpenTelemetry, and Kubernetes
  • Participate in on-call rotation and ensure uptime of services
  • Create runtime tools/processes that optimize cloud triaging and limit downtime
  • Define best practices around making our systems and services measurable
  • Collaborate with peers and stakeholders through design and code reviews to ensure best practices amongst available technologies. We expect successful candidates to be coding a majority of their time
What we offer
What we offer:
  • Generous health benefits
  • Flexible time off policy. Take the time off you need!
  • Paid bonding time for all new parents
  • Traditional and Roth 401k
  • Commuter and FSA benefits
  • Lunch Program
  • Dog friendly office
  • Fulltime