Observability Engineer, Grafana & Azure Job at NTT DATA (Bucharest)

Senior Software Engineer, Observability

You will work on core observability systems (metrics, logs, traces) while also d...

Location

India, Bengaluru

Salary:

Not provided

Roku

Expiration Date

Until further notice

Requirements

8+ years in software engineering, building distributed, high-throughput systems or observability platforms
4+ years of Go/Golang experience (our observability ecosystem is built on Go, making it the most effective language for this role)
Experience with, or strong interest in, observability tools (Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, ClickHouse) and standards (OpenTelemetry, OpenTracing, OpenMetrics)
Deep understanding of distributed systems and data models
Hands-on experience with Kubernetes and cloud platforms (AWS, GCP, Azure)

Job Responsibility

Extend and integrate open-source observability systems, and when necessary, structurally overhaul core components, such as storage layers and query paths, to enhance the performance, reliability, and usability of these tools at scale
Build services to improve performance, usability, reliability, and cost efficiency
Implement features like pre-aggregation, downsampling, and sampling to reduce load and accelerate queries
Create developer-facing capabilities covering usage of metrics, logs, and traces, as well as data quality and cost management
Automate onboarding, dashboards, alerting, and tracing
Collaborate across platform and infrastructure teams to integrate observability into Roku’s cloud-native stack

What we offer

Global access to mental health and financial wellness support and resources
Healthcare (medical, dental, and vision)
Life, accident, disability, commuter, and retirement options (401(k)/pension)

Fulltime

DevOps Engineer (Observability)

Working on a team of professionals, you will manage, configure, and support obse...

Location

Mexico

Salary:

Not provided

Teradata

Expiration Date

Until further notice

Requirements

Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three
2+ years of administrative-level experience with Grafana or an equivalent observability tool, including but not limited to onboarding users; defining group policies; authoring monitors, alerts, and dashboards; and integrating with other enterprise applications such as ServiceNow
Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform
Strong scripting skills with a modern programming language such as Python
Experience with a configuration management tool such as Ansible or Puppet
Experience with a build/deployment automation tool such as Jenkins or Bamboo
Experience with at least one modern source control tool, preferably Git

Job Responsibility

Manage, configure, and support observability tooling for Teradata's product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud)
Define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed
Build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences
Monitor all the layers of Teradata's application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations
Constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata's observability tooling

Fulltime

Cloud Engineer (Azure)

Location

Singapore, Singapore

Salary:

160000.00 SGD / Year

Eames Consulting

Expiration Date

Until further notice

Requirements

Minimum 6 years in Platform Engineering, SRE, or Production Operations within a complex, distributed environment
Deep hands-on experience with Kubernetes administration (AKS, OpenShift, or both)
Strong understanding of Kubernetes internals, including scheduling, RBAC, admission controllers, and CNI plugins
Proven experience with Terraform, Crossplane, or similar IaC tools within GitOps/GitLab CI/CD workflows

Job Responsibility

Manage the full lifecycle of AKS clusters, including upgrades, multi-tenant configurations, and scaling
Build and maintain the self-service stack that allows engineering teams to provision environments and services autonomously
Define and track SLIs, SLOs, and error budgets
Participate in incident response and drive post-mortem root cause analysis to improve system resilience
Automate platform provisioning and service configuration using GitOps workflows and CI/CD pipelines
Extend the Grafana observability stack (metrics, logs, and traces) to provide deep visibility into platform health and application performance
Contribute to the development of internal tools, Kubernetes operators, and backend services

Fulltime

DevOps Engineer (Observability)

Working on a team of professionals, you will manage, configure, and support obse...

Location

India

Salary:

Not provided

Teradata

Expiration Date

Until further notice

Requirements

Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three
2+ years of administrative-level experience with Grafana or an equivalent observability tool, including but not limited to onboarding users; defining group policies; authoring monitors, alerts, and dashboards; and integrating with other enterprise applications such as ServiceNow
Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform
Strong scripting skills with a modern programming language such as Python
Experience with a configuration management tool such as Ansible or Puppet
Experience with a build/deployment automation tool such as Jenkins or Bamboo
Experience with at least one modern source control tool, preferably Git

Job Responsibility

Manage, configure, and support observability tooling for Teradata’s product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud)
Define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed
Build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences
Monitor all the layers of Teradata’s application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations
Constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata’s observability tooling
Work closely with product engineering and cloud operations personnel to help administer all aspects of Teradata’s observability tooling in pre-production and production environments
Work with security and compliance teams to help provide evidence necessary to meet Teradata’s compliance obligations

What we offer

People-first culture
Flexible work model
Focus on well-being
Inclusive environment

Fulltime

DevOps Engineer - Observability

Location

Poland

Salary:

Not provided

Lingaro

Expiration Date

Until further notice

Requirements

Azure - cloud management experience
Knowledge of the Azure cloud and core services
Ability to design the architecture of cloud environments (nice to have)
Grafana, Prometheus - ability to work with metrics monitoring and visualization tools
Docker - experience managing applications running in containers
Kubernetes - experience working with a container orchestration platform
Terraform - knowledge and experience in Infrastructure as Code
GitHub
Azure Repos
Knowledge of CI/CD processes

Job Responsibility

Monitoring infrastructure (creating Grafana dashboards and alerts, collecting Prometheus metrics)
Implementing automated management features, such as performance monitoring, diagnostics, and failover
Configuring implemented solutions in accordance with system security processes
Managing CI (continuous integration) systems and pipelines
Designing and implementing infrastructure

What we offer

Stable employment
“Office as an option” model
Workation
Great Place to Work® certified employer
Flexibility regarding your preferred form of contract
Comprehensive online onboarding program with a “Buddy” from day 1
Cooperation with top-tier engineers and experts
Unlimited access to the Udemy learning platform from day 1
Certificate training programs
Upskilling support

Fulltime

Federal Observability Engineer

You will be part of a larger technical team, working as an Observability Enginee...

Location

United States, Hill AFB

Salary:

105500.00 - 243000.00 USD / Year

Hewlett Packard Enterprise

Expiration Date

Until further notice

Requirements

US Citizenship Required
Secret Clearance Required
DD8750 - Security Plus or higher security certification (CISSP, CASP, etc.)
Bachelor's degree preferred, or Associate degree (technical field) with 6-8 years of working experience in related fields
Strong understanding of cloud computing platforms (AWS, Azure, GCP)
Experience with containerization technologies (Docker, Kubernetes)
Proficiency in scripting languages (Python, Go, Bash)
Experience with SQL and NoSQL databases
Knowledge of networking protocols (TCP/IP, HTTP)
Proven experience with the OpsRamp platform is a strong plus

Job Responsibility

Designing, implementing, and maintaining observability infrastructure in an OpsRamp environment
Working as part of a larger technical team supporting HPE's PCE environment and Cloud infrastructure for a Federal Customer
Configuring and managing data sources, defining and monitoring key performance indicators (KPIs), and analyzing performance trends
Configuring log collection, aggregation, and analysis within the OpsRamp platform
Creating and managing alerts, defining escalation paths, and integrating with incident management systems
Developing and implementing automated workflows and remediation actions within the OpsRamp platform
Designing and building custom dashboards and reports to provide key insights into system health and performance
Integrating OpsRamp with other monitoring and observability tools as needed
Ensuring data quality and integrity within the OpsRamp platform
Troubleshooting and resolving performance issues, application errors, and other operational problems

What we offer

Health & Wellbeing benefits
Personal & Professional Development programs
Unconditional Inclusion environment
Comprehensive suite of benefits supporting physical, financial and emotional wellbeing

Fulltime

Cloud Software Engineer - Observability Platform

ClickHouse is looking for an experienced engineer to join our Observability team...

Location

United States

Salary:

141000.00 - 208000.00 USD / Year

ClickHouse

Expiration Date

Until further notice

Requirements

5+ years building and running production systems at scale
Proficiency in Golang
Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
Comfortable working with at least one major cloud provider (AWS, GCP, Azure)
Experience with OpenTelemetry, Prometheus, Grafana, or similar tools
Experience with ClickHouse preferred

Job Responsibility

Design, build, and operate distributed systems that power observability across ClickHouse Cloud
Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
Build tooling and automation to eliminate repetitive operational work
Help shape the roadmap for observability by identifying bottlenecks and scaling challenges
Collaborate with other engineering teams to improve their observability posture
Contribute to design discussions and architecture reviews, and mentor teammates

What we offer

Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
Healthcare - Employer contributions towards your healthcare
Equity in the company - Every new team member who joins our company receives stock options
Time off - Flexible time off in the US, generous entitlement in other countries
A $500 home office setup if you’re a remote employee
Global Gatherings - We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Fulltime

Cloud Software Engineer - Observability Platform

ClickHouse is looking for an experienced engineer to join our Observability team...

Location

Canada

Salary:

Not provided

ClickHouse

Expiration Date

Until further notice

Requirements

5+ years building and running production systems at scale
Proficiency in Golang
Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
Comfortable working with at least one major cloud provider (AWS, GCP, Azure)
Experience with OpenTelemetry, Prometheus, Grafana, or similar tools
Experience with ClickHouse preferred

Job Responsibility

Design, build, and operate distributed systems that power observability across ClickHouse Cloud
Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
Build tooling and automation to eliminate repetitive operational work
Help shape the roadmap for observability by identifying bottlenecks and scaling challenges
Collaborate with other engineering teams to improve their observability posture
Contribute to design discussions and architecture reviews, and mentor teammates

What we offer

Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
Healthcare - Employer contributions towards your healthcare
Equity in the company - Every new team member who joins our company receives stock options
Time off - Flexible time off in the US, generous entitlement in other countries
A $500 home office setup if you’re a remote employee
Global Gatherings - We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
