As a Cloud Observability Engineer, you will be a critical part of our Cloud Technology team, responsible for designing, building, and maintaining the foundational observability platform and underlying infrastructure across our multi-cloud environment. You will empower development, operations, and SRE teams by providing the robust capabilities they need to generate and consume key metrics, logs, and traces.
Job Responsibilities:
Design, build, and maintain the end-to-end observability platform and infrastructure covering monitoring, logging, tracing, and alerting capabilities for cloud-native applications and infrastructure
Select, configure, and optimize core observability tools and technologies (e.g., Prometheus, OpenTelemetry, and cloud-native monitoring services such as Amazon CloudWatch and Google Cloud Monitoring)
Develop and maintain the frameworks, tooling, and automation that enable engineering teams to create, manage, and consume their own dashboards, alerts, and reports
Architect and implement highly scalable, reliable, and cost-effective data ingestion pipelines and storage solutions for metrics, logs, and traces
Ensure the observability platform itself is highly available, performant, and resilient
Develop and maintain internal applications and tools to provide operational visibility into the observability platform's health and performance
Automate the deployment, configuration, and ongoing lifecycle management of observability tools and infrastructure components using Infrastructure as Code (IaC) principles
Implement and manage the underlying infrastructure and services for synthetic monitoring and real user monitoring (RUM)
Mentor junior engineers and contribute to the overall technical growth of the team
Stay up to date with emerging observability trends, tools, and technologies
Requirements:
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
6+ years of experience in a dedicated Observability, Monitoring, SRE, or DevOps role with a strong focus on building and managing cloud environments
Proven expertise with at least one major cloud provider (AWS or GCP preferred)
Deep understanding of monitoring concepts, metrics collection, log aggregation, and distributed tracing
Extensive experience architecting and implementing observability platforms and tools (e.g., Prometheus, OpenTelemetry, Fluent Bit, OpAMP)
Proficiency in scripting and automation (e.g., Python, Go)
Experience with Infrastructure as Code (IaC) tools like Terraform or CloudFormation
Strong understanding of containerization technologies (Docker, Kubernetes) and their observability challenges
Excellent problem-solving skills and the ability to diagnose complex technical issues across distributed systems
Strong communication and collaboration skills
Nice to have:
Experience with Kafka, Pub/Sub, Kinesis, or other message queuing systems
Familiarity with serverless architectures (AWS Lambda, Google Cloud Functions)
Knowledge of security best practices in cloud environments
What we offer:
Medical, dental & vision coverage
401(k)
Life, accident, and disability insurance
Wellness programs
Paid time off packages including vacation, sick leave, and paid holidays
Discretionary and formulaic incentive and retention awards