CrawlJobs Logo

Senior Software Engineer - Cloud Infrastructure & Observability

roku.com Logo

Roku

Location Icon

Location:
India , Bengaluru

Category Icon

Job Type Icon

Contract Type:
Not provided

Salary Icon

Salary:

Not provided

Job Responsibility:

  • Architect and lead Roku’s observability platform across metrics, logs, and traces
  • evolve data pipelines and storage layers optimized for high throughput, performance, and cost at Roku scale (TSDBs, Parquet, distributed processing)
  • Extend and harden open‑source observability systems
  • overhaul core components (e.g., storage layers, query paths) to improve performance, reliability, and usability at scale
  • Implement features such as pre‑aggregation, down-sampling, and sampling to reduce load and accelerate queries across the platform
  • Collaborate across platform, SRE, and product teams to migrate hundreds of workloads to our common platform
  • augment and automate CI/CD flows and onboarding
  • Integrate security into infrastructure and platform services
  • ensure robust multi‑tenant, multi‑cluster, and multi‑cloud designs
  • Contribute improvements back to open source and CNCF‑aligned projects
  • shape standards adoption (OpenTelemetry, OpenMetrics) across the company
  • Mentor engineers
  • establish best practices for reliability, efficiency, and cost management across service mesh and observability domains

Requirements:

  • 15+ years in software engineering with a track record of architecting distributed systems or platforms at scale
  • Strong hands‑on experience in Golang and one scripting language (e.g., Python or Shell)
  • Experience operating observability at pb-scale ingestion and hundreds of millions of series
  • Expertise in observability platforms and tooling (Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, ClickHouse) and standards (OpenTelemetry, OpenMetrics)
  • Deep experience building systems of scale and operating cloud infrastructure with Kubernetes
  • strong proficiency with service mesh technologies (Istio/Envoy), infrastructure‑as‑code (Terraform) and experience in multi‑cloud (AWS, GCP)
  • Demonstrated ability to evolve storage and query architectures for cost, scale, and latency (e.g., TSDB, Parquet, distributed processing)
  • Proven experience integrating security as part of infrastructure and platform development
  • Exceptional cross‑functional communication
  • effective collaboration with both technical and non‑technical stakeholders
  • Culture fit: independent thinker, pragmatic problem‑solver, low‑ego collaborator who moves fast and focuses on company success
  • Experience integrating AI tools to improve processes and reduce toil
  • Open‑source contributions in CNCF projects is preferred but not mandatory

Nice to have:

Open‑source contributions in CNCF projects

What we offer:
  • Global access to mental health and financial wellness support and resources
  • healthcare (medical, dental, and vision)
  • life, accident, disability, commuter, and retirement options (401(k)/pension)
  • time off in accordance with local leave policies

Additional Information:

Job Posted:
May 05, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work
Job Link Share:

Looking for more opportunities? Search for other job offers that match your skills and interests.

Briefcase Icon

Similar Jobs for Senior Software Engineer - Cloud Infrastructure & Observability

Senior Director of Engineering, Infrastructure

Senior Director of Engineering role leading the Infrastructure group at PagerDut...
Location
Location
United States , San Francisco
Salary
Salary:
233000.00 - 392000.00 USD / Year
https://www.pagerduty.com Logo
PagerDuty
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Proven experience in senior engineering leadership roles, managing multiple layers of managers
  • Significant experience as a hands-on technical contributor earlier in your career
  • Deep knowledge of modern infrastructure and software delivery: high availability, distributed systems, public cloud (AWS), microservices, containers, CI/CD pipelines, observability, and automation
  • Track record of building and scaling high-performing, inclusive engineering organizations
Job Responsibility
Job Responsibility
  • Define and drive the multi-year strategy for PagerDuty's infrastructure and platform foundations
  • Strong ownership of PagerDuty's reliability patterns and practices
  • Bar raiser for all engineering functions
  • Lead, mentor, and scale a diverse team of Engineering Managers, Senior Managers, and technical leaders across multiple geographies
  • Ensure the reliability, scalability, and security of PagerDuty's global SaaS platform
  • Partner with peers in Engineering, Product, and Security to deliver large cross-functional initiatives
  • Champion engineering excellence: CI/CD maturity, observability best practices, operational rigor, and incident readiness
  • Manage budgets, headcount, and vendor relationships to optimize infrastructure investments
  • Represent Infrastructure externally with customers and partners, and internally with executives, as a trusted voice on technical and business tradeoffs
  • Foster a culture of inclusion, accountability, collaboration, and growth
What we offer
What we offer
  • Competitive salary
  • Comprehensive benefits package
  • Flexible work arrangements
  • Company equity
  • ESPP (Employee Stock Purchase Program)
  • Retirement or pension plan
  • Generous paid vacation time
  • Paid holidays and sick leave
  • Dutonian Wellness Days & HibernationDuty - companywide paid days off in addition to PTO
  • Paid parental leave: 22 weeks for pregnant parent, 12 weeks for non-pregnant parent
  • Fulltime
Read More
Arrow Right

Senior Director of Engineering, Infrastructure

Senior Director of Engineering to lead the Infrastructure group at PagerDuty, se...
Location
Location
United States , Atlanta
Salary
Salary:
233000.00 - 392000.00 USD / Year
https://www.pagerduty.com Logo
PagerDuty
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Proven experience in senior engineering leadership roles, managing multiple layers of managers
  • Significant experience as a hands-on technical contributor earlier in your career
  • Deep knowledge of modern infrastructure and software delivery: high availability, distributed systems, public cloud (AWS), microservices, containers, CI/CD pipelines, observability, and automation
  • Track record of building and scaling high-performing, inclusive engineering organizations
Job Responsibility
Job Responsibility
  • Define and drive the multi-year strategy for PagerDuty's infrastructure and platform foundations
  • Strong ownership of PagerDuty's reliability patterns and practices
  • Lead, mentor, and scale a diverse team of Engineering Managers, Senior Managers, and technical leaders across multiple geographies
  • Ensure the reliability, scalability, and security of PagerDuty's global SaaS platform
  • Partner with peers in Engineering, Product, and Security to deliver large cross-functional initiatives
  • Champion engineering excellence: CI/CD maturity, observability best practices, operational rigor, and incident readiness
  • Manage budgets, headcount, and vendor relationships to optimize infrastructure investments
  • Represent Infrastructure externally with customers and partners, and internally with executives
  • Foster a culture of inclusion, accountability, collaboration, and growth
What we offer
What we offer
  • Comprehensive benefits package
  • Flexible work arrangements
  • Company equity
  • ESPP (Employee Stock Purchase Program)
  • Retirement or pension plan
  • Generous paid vacation time
  • Paid holidays and sick leave
  • Dutonian Wellness Days & HibernationDuty - companywide paid days off in addition to PTO
  • Paid parental leave: 22 weeks for pregnant parent, 12 weeks for non-pregnant parent
  • Paid volunteer time off: 20 hours per year
  • Fulltime
Read More
Arrow Right

Senior Software Engineer (Infrastructure) - HyperDX

Join us in revolutionizing Observability for Developers! We’re on a mission to r...
Location
Location
Netherlands
Salary
Salary:
Not provided
clickhouse.com Logo
ClickHouse
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 5+ years of backend engineering experience
  • Strong TypeScript and Node.js skills (bonus for additional languages)
  • Deep understanding of APIs, event-driven systems, and high-throughput data pipelines
  • Proficiency in SQL and experience working with analytical databases (ClickHouse experience a plus)
  • Experience with Docker and Kubernetes, plus Helm for managing production deployments
  • Experience with infrastructure-as-code (Terraform, Pulumi, or similar)
  • Familiarity with CI/CD pipelines, monitoring systems, and production-grade alerting practices
  • A passion for building reliable, maintainable, cloud-native systems
Job Responsibility
Job Responsibility
  • Build the core platform: Design and implement backend systems and APIs that power HyperDX, enabling engineers to ingest, query, and analyze observability data at massive scale
  • Scale deployments and infrastructure: Architect, deploy, and maintain cloud-native systems that ensure reliability, scalability, and performance. You’ll use Kubernetes, Helm, and infrastructure-as-code to make deployments simple and resilient
  • Ensure maintainability and operational excellence: Define best practices for CI/CD, monitoring, logging, and alerting. Drive automation across testing, scaling, and incident response to keep our platform healthy and developer-friendly
  • Engineer for scale: Design and operate ingestion and data processing pipelines that remain performant, resilient, and observable—even as we grow to petabyte-level workloads
  • Engage with the community: Collaborate with open-source contributors and customers, solve their challenges, and incorporate their feedback into our roadmap
What we offer
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
  • Fulltime
Read More
Arrow Right

Senior Software Engineer (Infrastructure) - HyperDX

Join us in revolutionizing Observability for Developers! We’re on a mission to r...
Location
Location
Germany
Salary
Salary:
Not provided
clickhouse.com Logo
ClickHouse
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 5+ years of backend engineering experience
  • Strong TypeScript and Node.js skills (bonus for additional languages)
  • Deep understanding of APIs, event-driven systems, and high-throughput data pipelines
  • Proficiency in SQL and experience working with analytical databases (ClickHouse experience a plus)
  • Experience with Docker and Kubernetes, plus Helm for managing production deployments
  • Experience with infrastructure-as-code (Terraform, Pulumi, or similar)
  • Familiarity with CI/CD pipelines, monitoring systems, and production-grade alerting practices
  • A passion for building reliable, maintainable, cloud-native systems
Job Responsibility
Job Responsibility
  • Build the core platform: Design and implement backend systems and APIs that power HyperDX, enabling engineers to ingest, query, and analyze observability data at massive scale
  • Scale deployments and infrastructure: Architect, deploy, and maintain cloud-native systems that ensure reliability, scalability, and performance. You’ll use Kubernetes, Helm, and infrastructure-as-code to make deployments simple and resilient
  • Ensure maintainability and operational excellence: Define best practices for CI/CD, monitoring, logging, and alerting. Drive automation across testing, scaling, and incident response to keep our platform healthy and developer-friendly
  • Engineer for scale: Design and operate ingestion and data processing pipelines that remain performant, resilient, and observable—even as we grow to petabyte-level workloads
  • Engage with the community: Collaborate with open-source contributors and customers, solve their challenges, and incorporate their feedback into our roadmap
What we offer
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
  • Fulltime
Read More
Arrow Right

Senior Software Engineer - Postgres

ClickHouse is launching a strategic Postgres initiative to extend our developer-...
Location
Location
United States
Salary
Salary:
140000.00 - 208000.00 USD / Year
clickhouse.com Logo
ClickHouse
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 6+ years in software engineering, ideally with experience building or operating database or cloud platform systems
  • Deep understanding of Postgres — configuration, extensions, operations, and performance tuning
  • Strong programming experience in Ruby, Go, or Python (or willingness to work across languages)
  • Familiarity with cloud infrastructure, APIs, and automation tools (Terraform, Kubernetes, CI/CD)
  • Understanding of distributed systems, data replication, and service orchestration patterns
  • Pragmatic, detail-oriented, and comfortable with both greenfield development and operational ownership
  • Happy to contribute where needed — from backend APIs and platform automation to Postgres internals and debugging
  • Strong communicator who works effectively across teams in a fast-paced, cross-functional environment
  • Operate with a founder’s mindset — take initiative, move quickly, and care deeply about outcomes
Job Responsibility
Job Responsibility
  • Design and build backend services that orchestrate and manage database clusters in ClickHouse Cloud
  • Extend our platform control plane — written in Ruby, Go, and TypeScript — to support new Postgres capabilities
  • Contribute to automation and tooling that simplify cluster provisioning, scaling, and lifecycle management
  • Collaborate with infrastructure, SRE, and product teams to ensure operational excellence, performance, and reliability
  • Develop APIs and integrations that expose new Postgres functionality to customers and internal systems
  • Improve observability, deployment safety, and debugging workflows for database services
  • Participate in design discussions, code reviews, and on-call rotations, contributing to the overall reliability and velocity of the team
  • Operate with autonomy — identifying opportunities, driving execution, and delivering meaningful impact
What we offer
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
  • Fulltime
Read More
Arrow Right

Senior Software Engineer (Infrastructure) - HyperDX

Join us in revolutionizing Observability for Developers! We’re on a mission to r...
Location
Location
United Kingdom
Salary
Salary:
Not provided
clickhouse.com Logo
ClickHouse
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 5+ years of backend engineering experience
  • Strong TypeScript and Node.js skills (bonus for additional languages)
  • Deep understanding of APIs, event-driven systems, and high-throughput data pipelines
  • Proficiency in SQL and experience working with analytical databases (ClickHouse experience a plus)
  • Experience with Docker and Kubernetes, plus Helm for managing production deployments
  • Experience with infrastructure-as-code (Terraform, Pulumi, or similar)
  • Familiarity with CI/CD pipelines, monitoring systems, and production-grade alerting practices
  • A passion for building reliable, maintainable, cloud-native systems
Job Responsibility
Job Responsibility
  • Build the core platform: Design and implement backend systems and APIs that power HyperDX, enabling engineers to ingest, query, and analyze observability data at massive scale
  • Scale deployments and infrastructure: Architect, deploy, and maintain cloud-native systems that ensure reliability, scalability, and performance. You’ll use Kubernetes, Helm, and infrastructure-as-code to make deployments simple and resilient
  • Ensure maintainability and operational excellence: Define best practices for CI/CD, monitoring, logging, and alerting. Drive automation across testing, scaling, and incident response to keep our platform healthy and developer-friendly
  • Engineer for scale: Design and operate ingestion and data processing pipelines that remain performant, resilient, and observable—even as we grow to petabyte-level workloads
  • Engage with the community: Collaborate with open-source contributors and customers, solve their challenges, and incorporate their feedback into our roadmap
What we offer
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Read More
Arrow Right

Senior Software Engineer - Postgres

ClickHouse is launching a strategic Postgres initiative to extend our developer-...
Location
Location
Canada
Salary
Salary:
Not provided
clickhouse.com Logo
ClickHouse
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 6+ years in software engineering, ideally with experience building or operating database or cloud platform systems
  • Deep understanding of Postgres — configuration, extensions, operations, and performance tuning
  • Strong programming experience in Ruby, Go, or Python (or willingness to work across languages)
  • Familiarity with cloud infrastructure, APIs, and automation tools (Terraform, Kubernetes, CI/CD)
  • Understanding of distributed systems, data replication, and service orchestration patterns
  • Pragmatic, detail-oriented, and comfortable with both greenfield development and operational ownership
  • Happy to contribute where needed — from backend APIs and platform automation to Postgres internals and debugging
  • Strong communicator who works effectively across teams in a fast-paced, cross-functional environment
  • Operate with a founder’s mindset — take initiative, move quickly, and care deeply about outcomes
Job Responsibility
Job Responsibility
  • Design and build backend services that orchestrate and manage database clusters in ClickHouse Cloud
  • Extend our platform control plane — written in Ruby, Go, and TypeScript — to support new Postgres capabilities
  • Contribute to automation and tooling that simplify cluster provisioning, scaling, and lifecycle management
  • Collaborate with infrastructure, SRE, and product teams to ensure operational excellence, performance, and reliability
  • Develop APIs and integrations that expose new Postgres functionality to customers and internal systems
  • Improve observability, deployment safety, and debugging workflows for database services
  • Participate in design discussions, code reviews, and on-call rotations, contributing to the overall reliability and velocity of the team
  • Operate with autonomy — identifying opportunities, driving execution, and delivering meaningful impact
What we offer
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
  • Fulltime
Read More
Arrow Right

Senior Software Engineer - Postgres

ClickHouse is launching a strategic Postgres initiative to extend our developer-...
Location
Location
India
Salary
Salary:
Not provided
clickhouse.com Logo
ClickHouse
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 6+ years in software engineering, ideally with experience building or operating database or cloud platform systems
  • Deep understanding of Postgres — configuration, extensions, operations, and performance tuning
  • Strong programming experience in Ruby, Go, or Python (or willingness to work across languages)
  • Familiarity with cloud infrastructure, APIs, and automation tools (Terraform, Kubernetes, CI/CD)
  • Understanding of distributed systems, data replication, and service orchestration patterns
  • Pragmatic, detail-oriented, and comfortable with both greenfield development and operational ownership
  • Happy to contribute where needed — from backend APIs and platform automation to Postgres internals and debugging
  • Strong communicator who works effectively across teams in a fast-paced, cross-functional environment
  • Operate with a founder’s mindset — take initiative, move quickly, and care deeply about outcomes
Job Responsibility
Job Responsibility
  • Design and build backend services that orchestrate and manage database clusters in ClickHouse Cloud
  • Extend our platform control plane — written in Ruby, Go, and TypeScript — to support new Postgres capabilities
  • Contribute to automation and tooling that simplify cluster provisioning, scaling, and lifecycle management
  • Collaborate with infrastructure, SRE, and product teams to ensure operational excellence, performance, and reliability
  • Develop APIs and integrations that expose new Postgres functionality to customers and internal systems
  • Improve observability, deployment safety, and debugging workflows for database services
  • Participate in design discussions, code reviews, and on-call rotations, contributing to the overall reliability and velocity of the team
  • Operate with autonomy — identifying opportunities, driving execution, and delivering meaningful impact
What we offer
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
  • Fulltime
Read More
Arrow Right