Observability engineer Job at European Bank for Reconstruction and Development (Sofia)

Observability Engineer

My client are seeking an Engineer with strong Linux experience and expertise wit...

Location

Poland, Warsaw

Salary:

350000.00 - 600000.00 PLN / Year

Hunter Bond

Expiration Date

Until further notice

Requirements

Strong Linux experience
Experience with at least 2 of VictoriaMetrics, Prometheus, Grafana, Vector, ELK, AlertManager
Python and Git skills
Understanding of Kubernetes is advantageous

Job Responsibility

Working with monitoring and observability tools to support the estate
Using at least 2 of the following technologies – VictoriaMetrics, Prometheus, Grafana, Vector, ELK, AlertManager
Supporting Linux environments and contributing with Python and Git skills

What we offer

Bonus

Fulltime

Observability Engineer

As an Observability Engineer you will be part of a team that is responsible for ...

Location

Philippines, Makati City

Salary:

Not provided

Avaloq

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Engineering, or a related field; equivalent experience may be considered
5+ years of experience in observability, for on-prem and cloud infrastructure, or related fields, with at least 2 years in a leadership or tech lead role
Proven experience in infrastructure troubleshooting and with infrastructure as code (IaC) tools such as Terraform, Python, and similar
Cloud Platforms: expertise in Oracle OCI is a plus, with knowledge of AWS, Azure, and GCP observability features
Observability Tools: skilled in industry-standard tools such as Prometheus, Alertmanager, Grafana, Loki, and PagerDuty
Best Practices: deep understanding of monitoring, logging, and tracing within cloud-native environments
Containerized Platforms: proficient in OpenShift, Kubernetes, and related container platforms
CI/CD & Automation: experienced with CI/CD pipelines and automation tools like Jenkins and GitLab/GitHub
Good problem-solving skills with a focus on delivering high-quality, scalable solutions
Effective communication skills, both written and verbal, with the ability to convey complex technical concepts to technical and non-technical stakeholders

Job Responsibility

Observability Stack Design & Deployment:
Design and Implement: build a robust observability stack encompassing logging, metrics collection, monitoring, alerting, and tracing systems tailored for cloud environments
Integration: seamlessly connect observability tools with cloud services and infrastructure to achieve comprehensive monitoring and visibility
IaC Development: use Terraform to automate the provisioning and deployment of observability tools and infrastructure, ensuring consistency and efficiency
Monitoring and Optimization:
Monitoring Standards: define and enforce organization-wide monitoring and alerting standards for real-time incident detection
Optimization: continuously refine the observability stack to enhance system performance, minimize downtime, and optimize resource utilization
End-to-End Monitoring Practices:
Comprehensive Tracking: implement end-to-end monitoring solutions that provide insights into the performance, availability, and reliability of IT workloads
Standardization: establish best practices for metrics, logs, and traces, ensuring holistic visibility across the technology stack
Automated Alerts: develop automated alerting systems for proactive issue identification and resolution
Technical Collaboration:
Cross-Team Integration: work closely with DevOps, SRE, and application development teams to align observability strategies with operational objectives
Stakeholder Engagement: communicate complex technical insights clearly to stakeholders, enabling informed decision-making

What we offer

Annual bonus
Flexible working
Instant recognition scheme
Access to Udemy for professional and personal learning

Fulltime

Observability Engineer

As a Platform Engineer focusing on Observability, you are responsible for the ar...

Location

United States

Salary:

73534.50 - 172346.48 USD / Year

Comcast Advertising

Expiration Date

Until further notice

Requirements

Bachelor’s Degree: Engineering, Computer Science, or a related field (relevant work experience also considered)
3–4 years of experience in SRE, DevOps, or Platform Engineering roles
Deep expertise in OpenTelemetry, Elastic (ELK/ECK), Prometheus, and Grafana
Proficiency in Ansible and Terraform for managing cloud resources and configuration
Strong facility in Python or Go for building internal tools and automation
Experience deploying and operating applications in public cloud environments (AWS, Azure, or GCP)
Experience with Concourse, GitHub, and Artifactory
Strong UNIX/Linux background with a firm grasp of CLI utilities, network protocols (HTTP/TLS), and asynchronous messaging
A commitment to supporting developers as internal customers
Firm understanding of Agile, Scrum, and Kanban methodologies

Job Responsibility

Observability as a Service: Build and manage centralized observability platforms (Elastic, Prometheus, Grafana) that serve as a shared resource for all internal development teams
Telemetry Pipeline Management: Design and optimize telemetry pipelines to ensure high-fidelity data collection, transformation, and routing using OpenTelemetry (OTel)
DevX Advocacy: Partner with software engineering teams to understand their pain points, providing consultation and tooling that makes "monitoring by default" easy and intuitive
Automation & Tooling: Write and edit code (Python, Go) to automate manual processes, reducing the operational burden on feature teams
Dashboarding & Alerting: Build sophisticated Grafana dashboards and alerting logic that provide actionable insights rather than noise
Documentation & Enablement: Create clear, concise technical documentation and "golden paths" to help developers self-serve their observability needs
CI/CD Integration: Integrate monitoring and reporting into deployment pipelines to ensure system health is validated during every release
Triage & Mitigation: Assist teams in triaging complex production issues by providing deep-link visibility and data-driven insights to prevent future regressions

What we offer

Paid Time off
Physical Wellbeing benefits
Financial Wellbeing benefits
Emotional Wellbeing benefits
Life Events + Family Support benefits

Fulltime

Site Reliability Engineer / Observability Engineer

Rackspace is building up its Professional Services Center of Excellence on Appli...

Location

Egypt, Giza

Salary:

Not provided

Rackspace

Expiration Date

Until further notice

Requirements

Bachelor’s degree in engineering/computer science or equivalent
Senior-level experience with Site Reliability Engineering, DevOps, code-level application support and troubleshooting, AWS infrastructure design, implementation and optimization, and automation for deployment, scaling and reliability
Experience with observability tools such as Splunk, Datadog, SignalFx, etc.
Experience deploying, maintaining and supporting software applications/services in the AWS ecosystem
Proactive approach to identifying problems and solutions
Experience writing code with one or more interpreted languages such as Python, PHP, Perl, Ruby, Linux Shell
Experience with Terraform or CloudFormation scripting
Experience with configuration management tools like Ansible, Chef or Puppet
Experience with standard software development best practices and tools such as code repositories (Git preferred)
Experience executing in an agile software development environment

Job Responsibility

Work with customers and implement Observability solutions
Build and maintain scalable systems and robust automation that supports engineering goals
Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance
Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, capacity planning and fault isolation
Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability, security and performance standards
Collaborate with team members to document and share solutions
Maintain a deep understanding of the customer’s business as well as their technical environment
Identify performance bottlenecks and anomalous system behavior, and resolve the root cause of service issues

Fulltime

Senior Software Engineer, Observability

You will work on core observability systems (metrics, logs, traces) while also d...

Location

India, Bengaluru

Salary:

Not provided

Roku

Expiration Date

Until further notice

Requirements

8+ years in software engineering, building distributed, high-throughput systems or observability platforms
4+ years of Go/Golang experience; our observability ecosystem is built on Go, making it the most effective language for this role
Experience with, or strong interest in, observability tools (Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, ClickHouse) and standards (OpenTelemetry, OpenTracing, OpenMetrics)
Deep understanding of distributed systems and data models
Hands-on experience with Kubernetes and cloud platforms (AWS, GCP, Azure)

Job Responsibility

Extend and integrate open-source observability systems, and when necessary, structurally overhaul core components, such as storage layers and query paths, to enhance the performance, reliability, and usability of these tools at scale
Build services to improve performance, usability, reliability, and cost efficiency
Implement features like pre-aggregation, downsampling, and sampling to reduce load and accelerate queries
Create developer-facing capabilities for metrics, logs, and traces usage, data quality, and cost management
Automate onboarding, dashboards, alerting, and tracing
Collaborate across platform and infrastructure teams to integrate observability into Roku’s cloud-native stack

What we offer

Global access to mental health and financial wellness support and resources
Healthcare (medical, dental, and vision)
Life, accident, disability, commuter, and retirement options (401(k)/pension)

Fulltime

Software Engineer, Observability

As a Software Engineer in Observability, you’ll be responsible for our metrics a...

Location

India, Bengaluru

Salary:

Not provided

Dialpad

Expiration Date

Until further notice

Requirements

Background in Systems and/or Software Engineering
Experience in designing, automating, maintaining, and optimizing observability platforms (logging, metrics, and tracing)
Experience with configuration management tools such as Ansible, Terraform, etc.
Experience with Public Cloud environments such as GCP, AWS, etc.
Familiarity with languages such as Python, Go, Rust, etc.
Previous direct experience with Grafana, Loki, Prometheus
Experience with Linux
Experience with Kubernetes (including GKE/EKS) and building containerized applications
Undergraduate degree in Computer Science or Engineering

Job Responsibility

Develop and improve instrumentation for monitoring and logging the health and availability of services
Develop and maintain the observability stack within Dialpad engineering
Define best practices and standards around making systems and services measurable, and work with various teams to get those best practices applied
Create tools and libraries for other engineering teams to enable them to build self-monitoring capabilities
Create and own internal documentation used by the other engineering teams
Stay up-to-date with the latest trends in observability, logging, monitoring, and cloud technologies
Collaborate with different engineering teams to integrate observability practices into their workflows
Participate in a rotating on-call within the larger Infrastructure Engineering division

What we offer

Competitive salary
Comprehensive benefits
Real opportunities for growth
Cutting-edge AI tools
Robust training program

Fulltime

Sr Data Quality & Observability Engineer (Snowflake)

Lamb Weston is continuing to modernize its enterprise data ecosystem to support ...

Location

United States, Eagle

Salary:

117060.00 - 175600.00 USD / Year

Lamb Weston

Expiration Date

July 27, 2026

Requirements

Bachelor’s degree in Computer Science, Information Systems, Data Analytics, or a related field, or equivalent experience
5+ years of experience in data analysis, data quality, or analytics engineering roles
Strong SQL skills and experience working with large, complex datasets
Hands-on data quality experience, including implementing data quality logic using SQL and data functions (e.g., window functions, conditional logic, string/date functions, aggregations, table functions/CTEs)
Demonstrated experience with data profiling, data validation, and data quality frameworks
Experience with Git-based version control, code review practices, and deploying changes through SDLC/CI-CD processes
Experience working in SAP data environments (ECC, S/4HANA, BW, or HANA)
Business Analyst skills, including requirements gathering, documentation, and stakeholder facilitation
Familiarity with cloud data platforms such as Snowflake and AWS preferred
Understanding of data governance, metadata, and lineage concepts

Job Responsibility

Design, implement, and maintain data quality rules, checks, and controls across enterprise data assets
Perform data profiling, root cause analysis, and anomaly detection across SAP and non-SAP data sources
Partner with business stakeholders to understand data quality issues, business impacts, and remediation priorities
Translate business requirements into measurable data quality rules and thresholds
Develop and maintain data quality frameworks, including reusable SQL patterns, UDFs, stored procedures
Implement automated scheduling and orchestration of data quality checks using Snowflake-native capabilities (e.g., tasks, streams) and/or pipeline orchestration tools (e.g., Informatica)
Implement data quality monitoring, observability scorecards, and reporting for key metadata domains
Own and evolve enterprise data quality KPIs/scorecards, including standardized definitions, thresholds, and executive-ready reporting across domains
Analyze data discrepancies and ensure reconciliation back to systems of record
Lead issue management workflows, including defect triage, prioritization, root cause documentation, corrective action validation, and prevention recommendations

What we offer

Health Insurance Benefits - Medical, Dental, Vision
Flexible Spending Accounts for Health and Dependent Care, and Health Reimbursement Accounts
Well-being programs including companywide events and a wellness incentive program
Paid Time Off
Financial Wellness – Industry leading 401(k) plan with generous company contributions, Financial Planning Services, Employee Stock purchase program, and Health Savings Accounts, Life and Accident insurance
Family-Friendly Employee events
Employee Assistance Program services – mental health and other concierge type services

Fulltime

Senior Software Engineer, Observability

We are looking for an experienced Senior Engineer to join our newly formed Obser...

Location

Germany, Berlin

Salary:

Not provided

Aiven Deutschland GmbH

Expiration Date

Until further notice

Requirements

Extensive experience with observability concepts at large scale
A good grasp of monitoring and observability tools like Prometheus, Grafana, and OpenTelemetry
Understanding of SLAs, SLOs, and SLIs
Strong knowledge of database fundamentals, including OLAP vs. OLTP, persistence, replication, and clustering
Experience with ClickHouse specifically regarding logs, metrics, and OpenTelemetry is highly desirable
Experience in building and designing distributed systems in a cloud environment
Ability to work with SQL to interact with our platform's master database
Deep understanding of release management and testing best practices to own the delivery pipeline
A genuine interest in solving complex technical challenges with customer-focused solutions

Job Responsibility

Ensure our existing observability offering is up and running all the time
Ideate and develop innovative new features that attract our target customer segment, drive product engagement, and ultimately fuel growth
Support our existing external customer base by resolving escalated support issues and collaborating with them to understand and solve their needs
Guide the team in the hands-on implementation of key platform features, ensuring maintainability and performance
Empower your team to act as 'product custodians' by consistently addressing foundational and production issues
Practise effective communication and collaboration both within the team and across the wider organization and act as a role model in transparency for your peers

What we offer

Participate in Aiven’s equity plan
Balance work and life with our hybrid work policy
Choose the equipment you need to set yourself up for success
Use your Professional Development Plan budget for learning opportunities
Receive holistic wellbeing support through our global Employee Assistance Program
Inquire about our Global Time Off Commitment (Parental and Sick Leave, as well as Personal Time)
Enjoy country-specific benefits for our global cast

Fulltime
