Cloud and Observability Engineer

Coralogix

Location:
India, Gurugram

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As a Cloud and Observability Engineer, you will play a critical role in ensuring a smooth transition of customers’ monitoring and observability infrastructure. Your expertise in other observability tools, coupled with a strong understanding of DevOps, will be essential to successfully migrating alerts and dashboards by creating extension packages and to enhancing customers’ monitoring capabilities. You will collaborate with cross-functional teams, understand their requirements, design migration and extension strategies, execute the migration process, and provide training and support throughout the engagement.

Job Responsibility:

  • Extension Delivery: Build and enhance high-quality extension packages for alerts, dashboards, and parsing rules in the Coralogix platform to improve the monitoring experience for key services using our platform
  • Migration Delivery: Help migrate customer alerts, dashboards, and parsing rules from leading competing observability and security platforms to Coralogix
  • Knowledge Management: Build, maintain, and evolve documentation covering all aspects of extensions and migration
  • Conduct training sessions for internal stakeholders and customers on all aspects of platform functionality (alerts, dashboards, parsing, querying, etc.), migration processes and techniques, and extension content
  • Collaborate closely with internal stakeholders and customers to understand their specific monitoring needs, gather requirements, and ensure alignment during the extension building process

Requirements:

  • 2+ years of experience as a Systems Engineer, DevOps Engineer, or in a similar role, with a focus on monitoring, alerting, and observability solutions
  • 2+ years of hands-on experience with, and understanding of, cloud and container technologies (GCP/Azure/AWS + K8s/EKS/GKE/AKS)
  • Good knowledge of, and hands-on experience with, two or more observability platforms, including alert creation, dashboard creation, and infrastructure monitoring
  • Good understanding of CI/CD with at least one deployment and version control tool
  • Basic understanding and practical experience with PromQL, Prometheus's query language, for querying metrics and creating custom dashboards
  • Excellent problem-solving and debugging skills
  • Strong English verbal and written communication skills
  • Ability to analyze complex systems, identify inefficiencies or gaps, and propose optimized monitoring solutions
  • Ability to work across US and European time zones

Nice to have:

  • Cloud Service Provider DevOps certifications would be a plus

Additional Information:

Job Posted:
April 16, 2026

Employment Type:
Fulltime
Work Type:
On-site work

Similar Jobs for Cloud and Observability Engineer

Cloud Software Engineer - Observability Platform

ClickHouse is looking for an experienced engineer to join our Observability team...
Location:
United States
Salary:
141000.00 - 208000.00 USD / Year
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 5+ years building and running production systems at scale
  • Proficiency in Golang
  • Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
  • Comfortable working with at least one major cloud provider (AWS, GCP, Azure)
  • Experience with OpenTelemetry, Prometheus, Grafana, or similar tools
  • Experience with ClickHouse preferred
Job Responsibility:
  • Design, build, and operate distributed systems that power observability across ClickHouse Cloud
  • Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
  • Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
  • Build tooling and automation to eliminate repetitive operational work
  • Help shape the roadmap for observability by identifying bottlenecks and scaling challenges
  • Collaborate with other engineering teams to improve their observability posture
  • Contribute to design discussions, architecture reviews, and mentor teammates
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Fulltime

Cloud Software Engineer - Observability Platform

ClickHouse is looking for an experienced engineer to join our Observability team...
Location:
Canada
Salary:
Not provided
ClickHouse
Expiration Date:
Until further notice
Requirements:
  • 5+ years building and running production systems at scale
  • Proficiency in Golang
  • Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
  • Comfortable working with at least one major cloud provider (AWS, GCP, Azure)
  • Experience with OpenTelemetry, Prometheus, Grafana, or similar tools
  • Experience with ClickHouse preferred
Job Responsibility:
  • Design, build, and operate distributed systems that power observability across ClickHouse Cloud
  • Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
  • Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
  • Build tooling and automation to eliminate repetitive operational work
  • Help shape the roadmap for observability by identifying bottlenecks and scaling challenges
  • Collaborate with other engineering teams to improve their observability posture
  • Contribute to design discussions, architecture reviews, and mentor teammates
What we offer:
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Monitoring & Observability Engineer

The Monitoring & Observability Engineer is a senior level position responsible f...
Location:
India, Chennai; Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 3-7 years of relevant experience in an Engineering & IT role
  • 2+ years of hands-on experience, including a strong understanding of UI/UX principles and best practices
  • Proficient in JavaScript, TypeScript, HTML, CSS, React, and Node.js
  • Experience with backend technologies and databases (e.g., MongoDB)
  • Experience with Python Programming
  • Experience with version control systems (e.g., Git)
  • Strong problem-solving and analytical skills
  • Excellent communication and collaboration skills
  • Create modular and reusable React components to streamline development and maintain consistency across the application
  • Continuously improve existing applications, addressing bugs, and implementing new features
Job Responsibility:
  • Drive best-in-class monitoring using a range of tools across all regions of the Global Consumer Bank
  • Drive POCs and incubate new features and capabilities
  • Be forward-looking and ensure long-term strategic success
  • Work closely with the monitoring operations teams, production support, performance test teams, operations, and application owners to deliver best-in-class monitoring
  • Explain complicated performance bottlenecks to stakeholders
  • Understand complicated application architecture, including Java app servers, Web Servers, Cloud (PCF, AWS, Google), Kubernetes, TIBCO, mainframe
  • Build advanced dashboards and queries
  • Be a subject matter expert for the Global Consumer Bank, including conducting brown bags and office hours
  • Recommend product customization for system integration
  • Identify problem causality, business impact and root causes
Employment Type:
Fulltime

Federal Observability Engineer

You will be part of a larger technical team, working as an Observability Enginee...
Location:
United States, Hill AFB
Salary:
105500.00 - 243000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • US Citizenship Required
  • Secret Clearance Required
  • DD8750 - Security Plus or higher Security Certification (CISSP, CASP, etc)
  • Bachelor's degree preferred or Associate degree holder (technical field) with 6-8 years working experience in related fields
  • Strong understanding of cloud computing platforms (AWS, Azure, GCP)
  • Experience with containerization technologies (Docker, Kubernetes)
  • Proficiency in scripting languages (Python, Go, Bash)
  • Experience with SQL and NoSQL databases
  • Knowledge of networking protocols (TCP/IP, HTTP)
  • Proven experience with the OpsRamp platform is a strong plus
Job Responsibility:
  • Designing, implementing, and maintaining observability infrastructure in an OpsRamp environment
  • Working as part of a larger technical team supporting HPE's PCE environment and Cloud infrastructure for a Federal Customer
  • Configuring and managing data sources, defining and monitoring key performance indicators (KPIs), and analyzing performance trends
  • Configuring log collection, aggregation, and analysis within the OpsRamp platform
  • Creating and managing alerts, defining escalation paths, and integrating with incident management systems
  • Developing and implementing automated workflows and remediation actions within the OpsRamp platform
  • Designing and building custom dashboards and reports to provide key insights into system health and performance
  • Integrating OpsRamp with other monitoring and observability tools as needed
  • Ensuring data quality and integrity within the OpsRamp platform
  • Troubleshooting and resolving performance issues, application errors, and other operational problems
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
Employment Type:
Fulltime

Staff Observability Operations Engineer

We are currently seeking several experienced and highly skilled Staff Observabil...
Location:
United States, Hartford
Salary:
130295.00 - 260590.00 USD / Year
CVS Health
Expiration Date:
Until further notice
Requirements:
  • 7+ years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications
  • 5+ years in a Site Reliability Engineering (SRE) role deploying and managing modern observability solutions
  • 5+ years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana)
  • Experience developing and administering ServiceNow ITOM event management solutions
  • Experience deploying and managing service reliability platforms (e.g., xMatters, OpsGenie, PagerDuty)
  • Experience with and deep knowledge of cloud environments, cloud monitoring platforms, and container orchestration tools (e.g., AWS/CloudTrail, Azure/Monitor, GCP/GCM, Kubernetes, OpenShift)
  • Proficiency in Python and other scripting languages such as Ansible, PowerShell, Bash for automation and configuration
  • Hands-on experience deploying, managing, and administering observability platforms
  • Hands-on experience leading, coordinating, and performing migration of application, platform, and infrastructure observability solutions
  • Proven ability to troubleshoot and resolve complex technical issues
Job Responsibility:
  • Deploy and implement modern observability solutions
  • Manage and administer observability and event management platforms
  • Coordinate and manage release cycles for observability platforms
  • Troubleshoot and resolve incidents related to observability platforms
  • Continuously monitor and enhance platform performance
  • Collaborate with cross-functional stakeholders
  • Provide training and mentoring to junior engineers
  • Ensure compliance and security of observability platforms
  • Maintain documentation of observability platform configurations
  • Generate and analyze reports on platform performance and capacity
What we offer:
  • Affordable medical plan options
  • A 401(k) plan (including matching company contributions)
  • An employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs
  • Confidential counseling and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
  • Colleague assistance programs
Employment Type:
Fulltime

Cloud Security Site Reliability Engineer

This role sits within the Cloud Security team responsible for Private and Public...
Location:
Singapore, Singapore
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree or equivalent work experience
  • 6+ years of relevant work experience
  • Highly motivated self-starter with excellent interpersonal and communication skills
  • Certification or formal training in site reliability engineering concepts and practices
  • Prior experience working towards SLIs, SLOs and observability capabilities at a large scale
  • 4+ years of experience in Python (preferred) or Java on large-scale systems, alongside Linux-based scripting languages
  • Experience working on observability, logging and metrics toolsets
  • Experience with k8s and container technologies such as Docker, OpenShift and EKS
  • Experience with public cloud technologies such as AWS, GCP or Azure
  • Experience with Secrets products such as HashiCorp Vault or CyberArk
Job Responsibility:
  • Working across Container products and Secrets products, across Public and Private Cloud, as well as Cloud native specific products
  • Architecting and building tools and platforms that provide capabilities for SRE
  • Collaboration with multiple stakeholders and partners across Engineering and Operations as well as partner teams within the wider Citi organisation
  • Actively owning production-level incidents until resolution
What we offer:
  • Equal opportunity employer
  • Accessibility support for persons with disabilities.
Employment Type:
Fulltime

Lead Software Engineer - Cloud Infrastructure

As the Lead Software Engineer - Cloud Infrastructure, you will collaborate with ...
Location:
United States
Salary:
180000.00 - 225000.00 USD / Year
Corelight
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science or related fields, or equivalent experience
  • 10+ years of professional experience in cloud infrastructure engineering or related roles
  • Strong programming skills in languages such as Bash, Python, Go
  • Experience with infrastructure-as-code (IaC) tools such as Terraform, CloudFormation
  • Proficiency in scripting/programming languages such as Python, Bash, or PowerShell
  • Experience with automation tools like Jenkins, GitLab, and Ansible/Chef
  • Understanding of networking concepts, security best practices, and cloud-native architectures
  • Experience with cloud platforms like AWS, Azure, or Google Cloud
  • Strong communication and collaboration skills
  • Experience with Observability tools such as Prometheus, Grafana, ELK stack, or similar
Job Responsibility:
  • Design, deploy, and maintain cloud infrastructure solutions on platforms such as AWS, Azure, or Google Cloud Platform (GCP)
  • Develop automation scripts and tools to streamline provisioning, configuration, and management of cloud resources
  • Collaborate with software development teams to integrate cloud services into applications and workflows
  • Implement monitoring and alerting systems to ensure the performance, availability, and security of cloud environments
  • Optimize resource utilization and cost efficiency through continuous monitoring, analysis, and optimization of cloud infrastructure
  • Stay current with emerging technologies and best practices in cloud computing, DevOps, and infrastructure automation
  • Participate in the resolution of production incidents and contribute to post-mortem analysis and improvement efforts.
What we offer:
  • Equity
  • Additional benefits
Employment Type:
Fulltime

Senior Observability Engineer

Coralogix is a modern, full-stack observability platform transforming how busine...
Location:
Germany, Berlin
Salary:
Not provided
Coralogix
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in Site Reliability, DevOps, or Platform Engineering with a focus on observability
  • Proven expertise with at least one major observability platform (e.g., Prometheus, VictoriaMetrics, OpenSearch)
  • Hands-on experience with Kubernetes, including deep knowledge of controllers, operators, and Helm
  • Experience writing Kubernetes controllers (controller-runtime, KubeBuilder)
  • Strong programming skills in Go or Python (Rust is a plus)
  • Experience designing, scaling, and operating observability systems at enterprise scale
  • Familiarity with at least one major cloud provider (AWS, Azure, or GCP)
  • Strong understanding of distributed systems, telemetry pipelines, and instrumentation standards (e.g., OpenTelemetry)
  • Excellent communication skills with the ability to explain complex topics to diverse stakeholders
Job Responsibility:
  • Design, implement, and maintain observability features such as Alerting, SLOs, Reporting, and Synthetic Tests
  • Manage and scale OpenTelemetry Collectors and other observability agents across Kubernetes environments
  • Write and maintain Kubernetes Controllers using frameworks like controller-runtime and KubeBuilder
  • Operate and optimize the internal Coralogix account, ensuring proper usage, cost efficiency, and best practices adoption
  • Define and enforce observability guidelines and standards across the organization
  • Partner with engineering teams to embed observability by default into products and services
  • Control observability-related costs while maximizing performance, visibility, and value
  • Contribute to upstream projects such as OpenTelemetry, helping shape industry standards
  • Explore and implement cutting-edge observability technologies, including eBPF-based approaches
Employment Type:
Fulltime