SRE - ClickHouse Team

PostHog

Location:
United States

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We run one of the largest self-managed ClickHouse installations on AWS, already at petabyte scale, and we’re actively preparing it for the next 10–50× of growth. This role sits at the centre of that effort. You won’t be in a typical “keep the lights on” SRE role. The work is about turning a fast-growing, stateful system into a predictable, well-automated platform. You’ll work on the kind of problems that only show up at large scale (petabytes of data, thousands of cores, constant ingestion). You’ll have room to design and automate, not just respond to alerts.

Job Responsibility:

  • Turning a fast-growing, stateful system into a predictable, well-automated platform (provisioning, scaling, rebalancing, recovery)
  • Reducing operational stress, designing safe automation for data-heavy workloads, and building tooling and patterns for scale
  • Managing large fleets of EC2-based VMs, disks, and networking for data-intensive workloads
  • Improving operational tooling around deploys, schema changes, backups, restores, and incident response
  • Working closely with ClickHouse engineers to turn database-level needs into infra-level solutions
  • Reducing operational load by identifying repeat pain points and eliminating them through code and self-healing automation
  • Participating in on-call and incident response, with a focus on making incidents rarer over time

Requirements:

  • Strong experience operating production infrastructure on AWS
  • Hands-on experience with VM-based systems (EC2), not just managed PaaS
  • Experience automating infrastructure using tools like Terraform, Ansible, or similar
  • Solid understanding of Linux systems (disk, memory, networking, failure modes)
  • Experience supporting stateful systems (databases, queues, storage systems, etc.)
  • Ability to debug and reason about performance and reliability issues in production
  • Comfortable owning systems end-to-end, including on-call responsibilities

Nice to have:

  • Prior experience with ClickHouse or other analytical databases
  • Experience operating systems at very large data scale
  • Familiarity with Kubernetes (helpful, but not the core of this role)

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for SRE - ClickHouse Team

Database Reliability Engineer - Core Team

We are committed to providing our customers with reliable and secure services at...
Location:
United Kingdom
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science or a related field
  • At least 5 years of experience in Reliability Engineering, QA, or customer-facing engineering
  • Previous experience operating ClickHouse or other SQL databases in production
  • Excellent understanding of distributed database internals and SQL; ClickHouse expertise in particular is a major plus
  • Scripting experience with Shell or Python, and ability to read and understand C++ code
  • Knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform
  • You are a strong problem-solver and have solid production debugging skills
  • You thrive in a fast-paced environment as part of a global team, and you see yourself as a partner with the business with the shared goal of moving the business forward
  • You have a high level of responsibility, ownership, and accountability
  • Excellent communication skills
Job Responsibility:
  • Continuously improve the reliability and performance of ClickHouse core
  • Improve and create metrics and alerts for ClickHouse to be able to identify and prevent problems in production before they affect customers
  • Dig deeper into the most common problems customers encounter in ClickHouse core to identify root causes, submit bug fixes and issue reports, and suggest improvements
  • Enhance and refine incident response processes and post-mortem analysis for outages related to ClickHouse core, including working with support and Cloud teams to communicate with impacted customers
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize customer impact
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Database Reliability Engineer

We are committed to providing our customers with reliable and secure services at...
Location:
Netherlands
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science or a related field
  • At least 5 years of experience in Reliability Engineering, QA, or customer-facing engineering
  • Previous experience operating ClickHouse or other SQL databases in production
  • Excellent understanding of distributed database internals and SQL; ClickHouse expertise in particular is a major plus
  • Scripting experience with Shell or Python, and ability to read and understand C++ code
  • Knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform
  • You are a strong problem-solver and have solid production debugging skills
  • You thrive in a fast-paced environment as part of a global team, and you see yourself as a partner with the business with the shared goal of moving the business forward
  • You have a high level of responsibility, ownership, and accountability
  • Excellent communication skills
Job Responsibility:
  • Continuously improve the reliability and performance of ClickHouse core
  • Improve and create metrics and alerts for ClickHouse to be able to identify and prevent problems in production before they affect customers
  • Dig deeper into the most common problems customers encounter in ClickHouse core to identify root causes, submit bug fixes and issue reports, and suggest improvements
  • Enhance and refine incident response processes and post-mortem analysis for outages related to ClickHouse core, including working with support and Cloud teams to communicate with impacted customers
  • Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize customer impact
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Fulltime

Senior Software Engineer, AI

LogicMonitor is advancing observability through AI‑driven data intelligence, con...
Location:
Pune, India
Salary:
Not provided
LogicMonitor
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Data Engineering, or a related field
  • 4-5 years of experience in backend or data systems engineering
  • Experience building streaming data pipelines (Kafka / Spark or any similar technology)
  • Strong programming background in Java and Python, including microservice design
  • Experience with ETL, data modeling, and distributed storage systems
  • Familiarity with LLM pipelines, embeddings, and vector retrieval
  • Understanding of Kubernetes, containerization, and CI/CD workflows
  • Awareness of data governance, validation, and lineage best practices
  • Strong communication and collaboration across AI, Data, and Platform teams
Job Responsibility:
  • Design and build streaming and batch data pipelines that process metrics, logs, and events for AI workflows
  • Develop ETL and feature‑extraction pipelines using Python and Java microservices
  • Integrate data ingestion and enrichment from multiple observability sources into AI‑ready formats
  • Build resilient data orchestration using Kafka, Airflow, and Redis Streams
  • Develop data indexing and semantic search for large‑scale observability and operational data
  • Work with structured and unstructured data lakes and warehouses (Delta Lake, Iceberg, ClickHouse)
  • Collaborate with the AI Platform team to manage embeddings, metadata, and model context storage
  • Optimize latency and throughput for retrieval, query expansion, and AI response generation
  • Build and maintain Java microservices (Spring Boot) that serve AI and analytics data to Edwin and AIOps applications
  • Develop Python APIs (FastAPI / LangGraph) for LLM orchestration, summarization, and correlation reasoning

Senior Software Engineer - Postgres

ClickHouse is launching a strategic Postgres initiative to extend our developer-...
Location:
United States
Salary:
140,000 - 208,000 USD / Year
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 6+ years in software engineering, ideally with experience building or operating database or cloud platform systems
  • Deep understanding of Postgres — configuration, extensions, operations, and performance tuning
  • Strong programming experience in Ruby, Go, or Python (or willingness to work across languages)
  • Familiarity with cloud infrastructure, APIs, and automation tools (Terraform, Kubernetes, CI/CD)
  • Understanding of distributed systems, data replication, and service orchestration patterns
  • Pragmatic, detail-oriented, and comfortable with both greenfield development and operational ownership
  • Happy to contribute where needed — from backend APIs and platform automation to Postgres internals and debugging
  • Strong communicator who works effectively across teams in a fast-paced, cross-functional environment
  • Operate with a founder’s mindset — take initiative, move quickly, and care deeply about outcomes
Job Responsibility:
  • Design and build backend services that orchestrate and manage database clusters in ClickHouse Cloud
  • Extend our platform control plane — written in Ruby, Go, and TypeScript — to support new Postgres capabilities
  • Contribute to automation and tooling that simplify cluster provisioning, scaling, and lifecycle management
  • Collaborate with infrastructure, SRE, and product teams to ensure operational excellence, performance, and reliability
  • Develop APIs and integrations that expose new Postgres functionality to customers and internal systems
  • Improve observability, deployment safety, and debugging workflows for database services
  • Participate in design discussions, code reviews, and on-call rotations, contributing to the overall reliability and velocity of the team
  • Operate with autonomy — identifying opportunities, driving execution, and delivering meaningful impact
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Fulltime

Senior Software Engineer - Postgres

ClickHouse is launching a strategic Postgres initiative to extend our developer-...
Location:
Canada
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 6+ years in software engineering, ideally with experience building or operating database or cloud platform systems
  • Deep understanding of Postgres — configuration, extensions, operations, and performance tuning
  • Strong programming experience in Ruby, Go, or Python (or willingness to work across languages)
  • Familiarity with cloud infrastructure, APIs, and automation tools (Terraform, Kubernetes, CI/CD)
  • Understanding of distributed systems, data replication, and service orchestration patterns
  • Pragmatic, detail-oriented, and comfortable with both greenfield development and operational ownership
  • Happy to contribute where needed — from backend APIs and platform automation to Postgres internals and debugging
  • Strong communicator who works effectively across teams in a fast-paced, cross-functional environment
  • Operate with a founder’s mindset — take initiative, move quickly, and care deeply about outcomes
Job Responsibility:
  • Design and build backend services that orchestrate and manage database clusters in ClickHouse Cloud
  • Extend our platform control plane — written in Ruby, Go, and TypeScript — to support new Postgres capabilities
  • Contribute to automation and tooling that simplify cluster provisioning, scaling, and lifecycle management
  • Collaborate with infrastructure, SRE, and product teams to ensure operational excellence, performance, and reliability
  • Develop APIs and integrations that expose new Postgres functionality to customers and internal systems
  • Improve observability, deployment safety, and debugging workflows for database services
  • Participate in design discussions, code reviews, and on-call rotations, contributing to the overall reliability and velocity of the team
  • Operate with autonomy — identifying opportunities, driving execution, and delivering meaningful impact
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Fulltime

Senior Software Engineer - Postgres

ClickHouse is launching a strategic Postgres initiative to extend our developer-...
Location:
India
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 6+ years in software engineering, ideally with experience building or operating database or cloud platform systems
  • Deep understanding of Postgres — configuration, extensions, operations, and performance tuning
  • Strong programming experience in Ruby, Go, or Python (or willingness to work across languages)
  • Familiarity with cloud infrastructure, APIs, and automation tools (Terraform, Kubernetes, CI/CD)
  • Understanding of distributed systems, data replication, and service orchestration patterns
  • Pragmatic, detail-oriented, and comfortable with both greenfield development and operational ownership
  • Happy to contribute where needed — from backend APIs and platform automation to Postgres internals and debugging
  • Strong communicator who works effectively across teams in a fast-paced, cross-functional environment
  • Operate with a founder’s mindset — take initiative, move quickly, and care deeply about outcomes
Job Responsibility:
  • Design and build backend services that orchestrate and manage database clusters in ClickHouse Cloud
  • Extend our platform control plane — written in Ruby, Go, and TypeScript — to support new Postgres capabilities
  • Contribute to automation and tooling that simplify cluster provisioning, scaling, and lifecycle management
  • Collaborate with infrastructure, SRE, and product teams to ensure operational excellence, performance, and reliability
  • Develop APIs and integrations that expose new Postgres functionality to customers and internal systems
  • Improve observability, deployment safety, and debugging workflows for database services
  • Participate in design discussions, code reviews, and on-call rotations, contributing to the overall reliability and velocity of the team
  • Operate with autonomy — identifying opportunities, driving execution, and delivering meaningful impact
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Fulltime

Senior Infrastructure Engineer - Postgres

ClickHouse is expanding its cloud data platform across AWS, GCP, and Azure—addin...
Location:
United States
Salary:
140,000 - 208,000 USD / Year
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 7+ years in SRE, DevOps, or infrastructure engineering, with a track record of running distributed, production-grade systems
  • Solid understanding of Postgres operations, scaling, and performance tuning
  • Deep hands-on experience across AWS, with exposure to GCP and Azure; comfortable navigating multi-cloud topologies
  • Proficient with Terraform, Kubernetes, and container-based infrastructure
  • Strong Go development skills (or willingness to write and own production Go code)
  • Familiar with tools like Prometheus, Grafana, Loki, OpenTelemetry, or equivalents
  • Deep understanding of SLOs, incident response, and continuous improvement in service reliability
  • You operate with a founder’s mentality — hands-on, resourceful, and willing to dive deep to get things done. You take pride in hard work, autonomy, and shipping impactful systems.
Job Responsibility:
  • Lead reliability and operations for ClickHouse’s Postgres integration — upgrades, patching, maintenance, and scaling
  • Design and implement automation for provisioning, deployments, and service lifecycle management across AWS, GCP, and Azure
  • Develop infrastructure-as-code using Terraform and modern CI/CD tooling to ensure consistent, repeatable deployments
  • Contribute Go-based tooling and services that improve automation, observability, and developer experience
  • Own observability and monitoring, ensuring robust alerting, metrics, and tracing across environments
  • Drive incident management and postmortem practices that strengthen reliability and learning loops
  • Collaborate cross-functionally with platform, networking, and product teams to improve service operability
  • Mentor and enable engineers, helping the team scale effectively as customer adoption grows.
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites.
Employment Type:
Fulltime

Senior Infrastructure Engineer - Postgres

ClickHouse is expanding its cloud data platform across AWS, GCP, and Azure—addin...
Location:
India
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 7+ years in SRE, DevOps, or infrastructure engineering, with a track record of running distributed, production-grade systems
  • Solid understanding of Postgres operations, scaling, and performance tuning
  • Deep hands-on experience across AWS, with exposure to GCP and Azure; comfortable navigating multi-cloud topologies
  • Proficient with Terraform, Kubernetes, and container-based infrastructure
  • Strong Go development skills (or willingness to write and own production Go code)
  • Familiar with tools like Prometheus, Grafana, Loki, OpenTelemetry, or equivalents
  • Deep understanding of SLOs, incident response, and continuous improvement in service reliability
  • You operate with a founder’s mentality — hands-on, resourceful, and willing to dive deep to get things done. You take pride in hard work, autonomy, and shipping impactful systems
Job Responsibility:
  • Lead reliability and operations for ClickHouse’s Postgres integration — upgrades, patching, maintenance, and scaling
  • Design and implement automation for provisioning, deployments, and service lifecycle management across AWS, GCP, and Azure
  • Develop infrastructure-as-code using Terraform and modern CI/CD tooling to ensure consistent, repeatable deployments
  • Contribute Go-based tooling and services that improve automation, observability, and developer experience
  • Own observability and monitoring, ensuring robust alerting, metrics, and tracing across environments
  • Drive incident management and postmortem practices that strengthen reliability and learning loops
  • Collaborate cross-functionally with platform, networking, and product teams to improve service operability
  • Mentor and enable engineers, helping the team scale effectively as customer adoption grows
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites