Observability Engineer, Grafana & Azure

Romania, Bucharest · Job Posted February 05, 2026
Job Description

The Mid-Level Grafana & Observability Engineer will be responsible for implementing and maintaining monitoring solutions in Azure environments. Candidates should have 3-5 years of experience with Grafana and related technologies. This remote role requires collaboration with teams to enhance observability coverage and troubleshoot incidents effectively.

Job Responsibility

  • Create and maintain Grafana dashboards for applications and infrastructure
  • Configure and manage Grafana data sources
  • Write and maintain queries in PromQL, LogQL, and KQL
  • Support OpenTelemetry instrumentation and data collection
  • Integrate monitoring with Azure services (AKS, App Services, VMs)
  • Configure alerts and support incident troubleshooting
  • Maintain documentation for dashboards, metrics, and telemetry pipelines
  • Collaborate with application and platform teams to improve observability coverage

Requirements

  • 3–5 years of experience in monitoring, DevOps, or platform engineering
  • Solid hands-on experience with Grafana
  • Working knowledge of Grafana data sources and query languages: PromQL, LogQL and KQL
  • Experience using or supporting OpenTelemetry
  • Experience with Azure monitoring tools (Azure Monitor, Log Analytics)
  • Basic understanding of cloud-native architectures and containers
  • English (Fluent): mandatory

Nice to have

  • Exposure to AKS monitoring
  • Experience with Terraform or ARM/Bicep
  • Basic scripting skills (Python, Bash)
  • Interest in growing toward a senior observability role

Similar Jobs for Observability Engineer, Grafana & Azure

8 matching positions

Senior Software Engineer, Observability

You will work on core observability systems (metrics, logs, traces) while also d...
Location: India, Bengaluru
Salary: Not provided
Company: Roku (roku.com)
Expiration Date: Until further notice
Requirements
  • 8+ years in software engineering, building distributed, high-throughput systems or observability platforms
  • 4+ years of Go/Golang experience (our observability ecosystem is built on Go, making it the most effective language for this role)
  • Experience with, or strong interest in, observability tools (Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, Clickhouse) and standards (OpenTelemetry, OpenTracing, OpenMetrics)
  • Deep understanding of distributed systems and data models
  • Hands-on experience with Kubernetes and cloud platforms (AWS, GCP, Azure)
Job Responsibility
  • Extend and integrate open-source observability systems, and when necessary, structurally overhaul core components, such as storage layers and query paths, to enhance the performance, reliability, and usability of these tools at scale
  • Build services to improve performance, usability, reliability, and cost efficiency
  • Implement features like pre-aggregation, downsampling, and sampling to reduce load and accelerate queries
  • Create developer-facing capabilities for metrics, logs, and traces usage, data quality, and cost management
  • Automate onboarding, dashboards, alerting, and tracing
  • Collaborate across platform and infrastructure teams to integrate observability into Roku’s cloud-native stack
What we offer
  • global access to mental health and financial wellness support and resources
  • healthcare (medical, dental, and vision)
  • life, accident, disability, commuter, and retirement options (401(k)/pension)
Employment type: Full-time

DevOps Engineer (Observability)

Working on a team of professionals, you will manage, configure, and support obse...
Location: Mexico
Salary: Not provided
Company: Teradata (teradata.com)
Expiration Date: Until further notice
Requirements
  • Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three
  • 2+ years of administrative-level experience with Grafana or an equivalent observability tool, including but not limited to onboarding users; defining group policies; authoring monitors, alerts, and dashboards; and integration with other enterprise applications such as ServiceNow
  • Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform
  • Strong scripting skills with a modern programming language such as Python
  • Experience with a configuration management tool such as Ansible or Puppet
  • Experience with a build/deployment automation tool such as Jenkins or Bamboo
  • Experience with at least one modern source control tool, preferably Git
Job Responsibility
  • Manage, configure, and support observability tooling for Teradata's product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud)
  • Define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed
  • Build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences
  • Monitor all the layers of Teradata's application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations
  • Constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata's observability tooling
Employment type: Full-time

Cloud Engineer (Azure)

Location: Singapore, Singapore
Salary: 160000.00 SGD / Year
Company: Eames Consulting (eamesconsulting.com)
Expiration Date: Until further notice
Requirements
  • Minimum 6 years in Platform Engineering, SRE, or Production Operations within a complex, distributed environment
  • Deep hands-on experience with Kubernetes administration (AKS, OpenShift, or both)
  • Strong understanding of Kubernetes internals, including scheduling, RBAC, admission controllers, and CNI plugins
  • Proven experience with Terraform, Crossplane, or similar IaC tools within GitOps/GitLab CI/CD workflows.
Job Responsibility
  • Manage the full lifecycle of AKS clusters, including upgrades, multi-tenant configurations, and scaling
  • Build and maintain the self-service stack that allows engineering teams to provision environments and services autonomously
  • Define and track SLIs, SLOs, and error budgets
  • Participate in incident response and drive post-mortem root cause analysis to improve system resilience
  • Automate platform provisioning and service configuration using GitOps workflows and CI/CD pipelines
  • Extend the Grafana observability stack (metrics, logs, and traces) to provide deep visibility into platform health and application performance
  • Contribute to the development of internal tools, Kubernetes operators, and backend services.
Employment type: Full-time

DevOps Engineer (Observability)

Working on a team of professionals, you will manage, configure, and support obse...
Location: India
Salary: Not provided
Company: Teradata (teradata.com)
Expiration Date: Until further notice
Requirements
  • Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three
  • 2+ years of administrative-level experience with Grafana or an equivalent observability tool, including but not limited to onboarding users; defining group policies; authoring monitors, alerts, and dashboards; and integration with other enterprise applications such as ServiceNow
  • Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform
  • Strong scripting skills with a modern programming language such as Python
  • Experience with a configuration management tool such as Ansible or Puppet
  • Experience with a build/deployment automation tool such as Jenkins or Bamboo
  • Experience with at least one modern source control tool, preferably Git
Job Responsibility
  • Manage, configure, and support observability tooling for Teradata’s product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud)
  • Define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed
  • Build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences
  • Monitor all the layers of Teradata’s application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations
  • Constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata’s observability tooling
  • Work closely with product engineering and cloud operations personnel to help administer all aspects of Teradata’s observability tooling in pre-production and production environments
  • Work with security and compliance teams to help provide evidence necessary to meet Teradata’s compliance obligations
What we offer
  • People-first culture
  • Flexible work model
  • Focus on well-being
  • Inclusive environment
Employment type: Full-time

DevOps Engineer - Observability

Location: Poland
Salary: Not provided
Company: Lingaro (lingarogroup.com)
Expiration Date: Until further notice
Requirements
  • Azure - cloud management experience
  • Knowledge of the Azure cloud and core services
  • Ability to design the architecture of cloud environments (nice to have)
  • Grafana, Prometheus - ability to work with metrics monitoring and visualization tools
  • Docker - experience managing applications running in containers
  • Kubernetes - experience working with a container orchestration platform
  • Terraform - knowledge and experience in Infrastructure As Code
  • GitHub
  • Azure Repos
  • Knowledge of CI/CD processes
Job Responsibility
  • Monitoring infrastructure (Creating Grafana Dashboards, Alerts, Collecting Prometheus Metrics)
  • Implement automated management features, such as performance monitoring, diagnostics and failover
  • Configuring implemented solutions in accordance with system security processes
  • Managing CI (continuous integration) systems and pipelines
  • Designing and implementing infrastructure
What we offer
  • Stable employment
  • “Office as an option” model
  • Workation
  • Great Place to Work® certified employer
  • Flexibility regarding your preferred form of contract
  • Comprehensive online onboarding program with a “Buddy” from day 1
  • Cooperation with top-tier engineers and experts
  • Unlimited access to the Udemy learning platform from day 1
  • Certificate training programs
  • Upskilling support
Employment type: Full-time

Federal Observability Engineer

You will be part of a larger technical team, working as an Observability Enginee...
Location: United States, Hill AFB
Salary: 105500.00 - 243000.00 USD / Year
Company: Hewlett Packard Enterprise (https://www.hpe.com/)
Expiration Date: Until further notice
Requirements
  • US Citizenship Required
  • Secret Clearance Required
  • DD8750 - Security Plus or higher Security Certification (CISSP, CASP, etc)
  • Bachelor's degree preferred or Associate degree holder (technical field) with 6-8 years working experience in related fields
  • Strong understanding of cloud computing platforms (AWS, Azure, GCP)
  • Experience with containerization technologies (Docker, Kubernetes)
  • Proficiency in scripting languages (Python, Go, Bash)
  • Experience with SQL and NoSQL databases
  • Knowledge of networking protocols (TCP/IP, HTTP)
  • Proven experience with the OpsRamp platform is a strong plus
Job Responsibility
  • Designing, implementing, and maintaining observability infrastructure in an OpsRamp environment
  • Working as part of a larger technical team supporting HPE's PCE environment and Cloud infrastructure for a Federal Customer
  • Configuring and managing data sources, defining and monitoring key performance indicators (KPIs), and analyzing performance trends
  • Configuring log collection, aggregation, and analysis within the OpsRamp platform
  • Creating and managing alerts, defining escalation paths, and integrating with incident management systems
  • Developing and implementing automated workflows and remediation actions within the OpsRamp platform
  • Designing and building custom dashboards and reports to provide key insights into system health and performance
  • Integrating OpsRamp with other monitoring and observability tools as needed
  • Ensuring data quality and integrity within the OpsRamp platform
  • Troubleshooting and resolving performance issues, application errors, and other operational problems
What we offer
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
Employment type: Full-time

Cloud Software Engineer - Observability Platform

ClickHouse is looking for an experienced engineer to join our Observability team...
Location: United States
Salary: 141000.00 - 208000.00 USD / Year
Company: ClickHouse (clickhouse.com)
Expiration Date: Until further notice
Requirements
  • 5+ years building and running production systems at scale
  • Proficiency in Golang
  • Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
  • Comfortable working with at least one major cloud provider (AWS, GCP, Azure)
  • Experience with OpenTelemetry, Prometheus, Grafana, or similar tools
  • Experience with ClickHouse preferred
Job Responsibility
  • Design, build, and operate distributed systems that power observability across ClickHouse Cloud
  • Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
  • Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
  • Build tooling and automation to eliminate repetitive operational work
  • Help shape the roadmap for observability by identifying bottlenecks and scaling challenges
  • Collaborate with other engineering teams to improve their observability posture
  • Contribute to design discussions, architecture reviews, and mentor teammates
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment type: Full-time

Cloud Software Engineer - Observability Platform

ClickHouse is looking for an experienced engineer to join our Observability team...
Location: Canada
Salary: Not provided
Company: ClickHouse (clickhouse.com)
Expiration Date: Until further notice
Requirements
  • 5+ years building and running production systems at scale
  • Proficiency in Golang
  • Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
  • Comfortable working with at least one major cloud provider (AWS, GCP, Azure)
  • Experience with OpenTelemetry, Prometheus, Grafana, or similar tools
  • Experience with ClickHouse preferred
Job Responsibility
  • Design, build, and operate distributed systems that power observability across ClickHouse Cloud
  • Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
  • Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
  • Build tooling and automation to eliminate repetitive operational work
  • Help shape the roadmap for observability by identifying bottlenecks and scaling challenges
  • Collaborate with other engineering teams to improve their observability posture
  • Contribute to design discussions, architecture reviews, and mentor teammates
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites