Observability engineer

European Bank for Reconstruction and Development

Location:
Bulgaria, Sofia

Category:
IT - Administration

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Be the eyes and ears of our platforms in a role that puts you at the heart of performance, reliability, and resilience. We’re on the lookout for an Observability Engineer to design and optimize cutting-edge monitoring solutions across hybrid and cloud-native environments. From SolarWinds and Prometheus to Azure Monitor and Grafana, you’ll build scalable, intelligent observability platforms that deliver deep insights through metrics, logs, and distributed traces. Bring your scripting skills, automation mindset, and deep telemetry knowledge to life, whether you’re deploying via Terraform or integrating with ITSM workflows. This is your chance to shape the digital nervous system of the enterprise, driving operational excellence through proactive monitoring and rock-solid insights. If you’re fluent in observability and passionate about system intelligence, let’s push the boundaries of performance together.

Job Responsibility:

  • Design, automate, and optimize observability platforms for logging, metrics, and tracing
  • Expertise in consolidating and analysing application and system logs at enterprise scale, including familiarity with distributed tracing technologies and integration with ITSM platforms
  • Proficient in scripting languages (Python, Bash, PowerShell) for task automation
  • Experience with Terraform or Ansible for deploying and configuring monitoring / logging infrastructure
  • Strong understanding of protocols (WMI, SSH, SNMP) and methods (API, Traps) for data gathering

Requirements:

  • Designing, implementing, and supporting COTS and open-source monitoring solutions
  • Understanding of software development principles and troubleshooting application issues
  • Understanding of infrastructure management principles and troubleshooting practices
  • Understanding of performance monitoring approaches
  • Knowledge of Azure monitoring services, container monitoring
  • Understanding of telemetry standards for interoperability
  • Intermediate to advanced technology certification in the given specialism
  • Entry-level service management certification such as ITIL Foundation

What we offer:

  • Varied, stimulating and engaging work
  • A working culture that embraces inclusion and celebrates diversity
  • An environment that places sustainability, equality and digital transformation at the heart of what we do
  • Flexible working

Additional Information:

Job Posted:
December 25, 2025

Employment Type:
Full-time
Work Type:
Hybrid work
Similar Jobs for Observability engineer

Senior Software Engineer, Observability

The Observability team at Airtable ensures that engineers have the tools they ne...
Location:
United States, San Francisco; New York; Seattle
Salary:
196000.00 - 270000.00 USD / Year
Airtable
Expiration Date
Until further notice
Requirements
  • 6+ years of software engineering experience
  • 3+ years focused on observability or infrastructure at scale
  • Demonstrated success implementing and running production-grade logging, metrics, or tracing systems
  • Proficiency in distributed systems concepts, data streaming pipelines, and container orchestration (Kubernetes)
  • Deep hands-on knowledge of tools such as Prometheus, Grafana, Datadog, OpenTelemetry, ELK Stack, Loki, or ClickHouse
  • Comfort with at least one programming language (e.g., Go, Python, Java) to build and maintain observability tooling
  • Experience mentoring engineers and collaborating across multiple teams
  • Strong communication skills
  • Eagerness to own high-impact initiatives
  • Proven ability to balance short-term fixes with long-term strategic vision
Job Responsibility
  • Architect and scale core observability systems
  • Lead the design and evolution of logging, metrics, and tracing pipelines
  • Evaluate and integrate new technologies (e.g., OpenTelemetry, ClickHouse, ELK stack)
  • Guide and mentor a growing team of infrastructure engineers
  • Define and uphold coding standards and operational excellence
  • Partner with Deploy Infrastructure, Service Orchestration, and Product teams
  • Align infrastructure decisions with business goals
  • Own end-to-end reliability for observability tools and establish SLAs, SLOs, and error budgets
  • Optimize performance and cost of large-scale data pipelines
  • Shape the observability roadmap
What we offer
  • Opportunity to receive benefits
  • Restricted stock units
  • May include incentive compensation
  • Comprehensive benefit offerings
Employment Type:
Full-time

Lead Director – Observability Engineering

At CVS Health, we’re building a world of health around every consumer and surrou...
Location:
United States
Salary:
144200.00 - 288400.00 USD / Year
CVS Health
Expiration Date
December 31, 2025
Requirements
  • 10+ years of experience leading software development teams developing and managing applications for IT operations, SRE, logging and/or observability, with at least 5 years in a leadership role within a large enterprise (Fortune 100)
  • 10+ years of experience designing, developing, and implementing observability systems for large-scale, distributed systems, encompassing legacy and modern technologies
  • Experience leading a major logging and observability platform migration. Demonstrable experience building custom monitoring solutions
  • Proven experience building and implementing operational data models. Experience designing and deploying data lakes and data pipelines at massive scale in an enterprise environment. Experience with enterprise demand analysis, capacity planning, and performance engineering
  • Deep knowledge of, and experience with on-premises infrastructure, cloud infrastructure, and application architectures
  • Strong background in cloud-native technologies and architectures (e.g., Kubernetes, Docker, microservices) and an understanding of the unique challenges they pose to observability
  • Proven experience developing automation solutions and workflows for deployment, event correlation, and incident remediation
  • Experience and/or expertise with the following: Core Platforms & Languages: Python, Java, JavaScript, XML, JSON - 5 years
  • Application Programming Interfaces (API): REST, SOAP - 5 years
  • Source Control: GitHub/GitOps - 5 years
Job Responsibility
  • Program Development and Modernization: Develop a plan to rationalize and modernize observability platforms, delivering an efficient observability ecosystem that meets the unique needs of CVS Health
  • Spearhead technology enablement for the transition of services from numerous legacy platforms, improving operational visibility and predictability
  • This involves designing and implementing complex solutions to collect, process, and manage structured and unstructured data at massive scale, optimizing built and purchased platforms to ensure efficient and performant operations, and ensuring solutions align with the organization’s goals
  • Team Leadership and Mentoring: Provide guidance and leadership to the Observability Engineering team
  • This involves hiring and developing talent, mentoring and supporting team members, assigning tasks, and ensuring projects are on track
  • Foster collaboration and knowledge sharing within the team
  • Architecture and Design: Help define the overall architecture of the Observability environment, including observability standards, data models, integrations, and security controls
  • Ensure our platforms are scalable, reliable, and aligned with best practices
  • Leverage open source and commercial software to deliver and maintain resilient, reliable, cost-effective platforms tailored to the needs of CVS Health
  • Project Management: Engage executives, department heads, and IT teams to plan, execute, and oversee Observability projects
What we offer
  • Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
  • Benefit solutions that address the different needs and preferences of our colleagues including paid time off, flexible work schedules, family leave, dependent care resources, colleague assistance programs, tuition assistance, retiree medical access and many other benefits depending on eligibility
Employment Type:
Full-time

Senior Software Engineer - Observability and Reliability

We are growing the engineering team and looking for engineers who have the chops...
Location:
United States, New York City
Salary:
150000.00 - 220000.00 USD / Year
Sigma Computing
Expiration Date
Until further notice
Requirements
  • Strong Computer Science fundamentals
  • 5+ years industry experience building and maintaining high-quality software, especially software other engineers use
  • You apply a product mindset to infrastructure systems and feel accomplished enabling others
  • Desire to be a great teammate and have fun at work
  • Strong sense of craftsmanship, and a healthy academic curiosity
Job Responsibility
  • Build observability tools and platforms, including: metrics, logging, distributed tracing, dashboarding, alerting, application performance management
  • Build with modern tools and languages like Go, OpenTelemetry, and Kubernetes
  • Participate in on-call rotation and ensure uptime of services
  • Create runtime tools/processes that optimize cloud triaging and limit downtime
  • Define best practices around making our systems and services measurable
  • Collaborate with peers and stakeholders through design and code reviews to ensure best practices amongst available technologies. We expect successful candidates to be coding a majority of their time
What we offer
  • Equity
  • Generous health benefits
  • Flexible time off policy. Take the time off you need!
  • Paid bonding time for all new parents
  • Traditional and Roth 401k
  • Commuter and FSA benefits
  • Lunch Program
  • Dog friendly office
Employment Type:
Full-time

Senior Observability Engineer

Coralogix is a modern, full-stack observability platform transforming how busine...
Location:
Germany, Berlin
Salary:
Not provided
Coralogix
Expiration Date
Until further notice
Requirements
  • 5+ years of experience in Site Reliability, DevOps, or Platform Engineering with a focus on observability
  • Proven expertise with at least one major observability platform (e.g., Prometheus, VictoriaMetrics, OpenSearch)
  • Hands-on experience with Kubernetes, including deep knowledge of controllers, operators, and Helm
  • Experience writing Kubernetes controllers (controller-runtime, Kubebuilder)
  • Strong programming skills in Go or Python (Rust is a plus)
  • Experience designing, scaling, and operating observability systems at enterprise scale
  • Familiarity with at least one major cloud provider (AWS, Azure, or GCP)
  • Strong understanding of distributed systems, telemetry pipelines, and instrumentation standards (e.g., OpenTelemetry)
  • Excellent communication skills with the ability to explain complex topics to diverse stakeholders
Job Responsibility
  • Design, implement, and maintain observability features such as Alerting, SLOs, Reporting, and Synthetic Tests
  • Manage and scale OpenTelemetry Collectors and other observability agents across Kubernetes environments
  • Write and maintain Kubernetes controllers using frameworks like controller-runtime and Kubebuilder
  • Operate and optimize the internal Coralogix account, ensuring proper usage, cost efficiency, and best practices adoption
  • Define and enforce observability guidelines and standards across the organization
  • Partner with engineering teams to embed observability by default into products and services
  • Control observability-related costs while maximizing performance, visibility, and value
  • Contribute to upstream projects such as OpenTelemetry, helping shape industry standards
  • Explore and implement cutting-edge observability technologies, including eBPF-based approaches
Employment Type:
Full-time

Monitoring & Observability Engineer

The Monitoring & Observability Engineer is a senior level position responsible f...
Location:
India, Chennai; Pune
Salary:
Not provided
Citi
Expiration Date
Until further notice
Requirements
  • 3-7 years of relevant experience in an Engineering & IT role
  • At least 2 years of hands-on working experience, including a strong understanding of UI/UX principles and best practices
  • Proficient in JavaScript, TypeScript, HTML, CSS, React, and Node.js
  • Experience with backend technologies and databases (e.g., MongoDB)
  • Experience with Python Programming
  • Experience with version control systems (e.g., Git)
  • Strong problem-solving and analytical skills
  • Excellent communication and collaboration skills
  • Create modular and reusable React components to streamline development and maintain consistency across the application
  • Continuously improve existing applications, addressing bugs, and implementing new features
Job Responsibility
  • Drive best-in-class monitoring using a range of tools across all regions of the Global Consumer Bank
  • Drive POCs and incubate new features and capabilities
  • Be forward looking and ensure long term strategic success
  • Work closely with the monitoring operations teams, production support, performance test teams, operations, and application owners to deliver best-in-class monitoring
  • Explain complicated performance bottlenecks to stakeholders
  • Understand complicated application architecture, including Java app servers, Web Servers, Cloud (PCF, AWS, Google), Kubernetes, TIBCO, mainframe
  • Build advanced dashboards and queries
  • Be a subject matter expert for the Global Consumer Bank, including conducting brown bags and office hours
  • Recommend product customization for system integration
  • Identify problem causality, business impact and root causes
Employment Type:
Full-time

Federal Observability Engineer

You will be part of a larger technical team, working as an Observability Enginee...
Location:
United States, HILL AFB
Salary:
105500.00 - 243000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • US Citizenship Required
  • Secret Clearance Required
  • DD8750 - Security Plus or higher Security Certification (CISSP, CASP, etc)
  • Bachelor's degree preferred or Associate degree holder (technical field) with 6-8 years working experience in related fields
  • Strong understanding of cloud computing platforms (AWS, Azure, GCP)
  • Experience with containerization technologies (Docker, Kubernetes)
  • Proficiency in scripting languages (Python, Go, Bash)
  • Experience with SQL and NoSQL databases
  • Knowledge of networking protocols (TCP/IP, HTTP)
  • Proven experience with the OpsRamp platform is a strong plus
Job Responsibility
  • Designing, implementing, and maintaining observability infrastructure in an OpsRamp environment
  • Working as part of a larger technical team supporting HPE's PCE environment and Cloud infrastructure for a Federal Customer
  • Configuring and managing data sources, defining and monitoring key performance indicators (KPIs), and analyzing performance trends
  • Configuring log collection, aggregation, and analysis within the OpsRamp platform
  • Creating and managing alerts, defining escalation paths, and integrating with incident management systems
  • Developing and implementing automated workflows and remediation actions within the OpsRamp platform
  • Designing and building custom dashboards and reports to provide key insights into system health and performance
  • Integrating OpsRamp with other monitoring and observability tools as needed
  • Ensuring data quality and integrity within the OpsRamp platform
  • Troubleshooting and resolving performance issues, application errors, and other operational problems
What we offer
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
Employment Type:
Full-time

Cloud Software Engineer - Observability Platform

ClickHouse is looking for an experienced engineer to join our Observability team...
Location:
United States
Salary:
141000.00 - 208000.00 USD / Year
ClickHouse
Expiration Date
Until further notice
Requirements
  • 5+ years building and running production systems at scale
  • Proficiency in Golang
  • Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
  • Comfortable working with at least one major cloud provider (AWS, GCP, Azure)
  • Experience with OpenTelemetry, Prometheus, Grafana, or similar tools
  • Experience with ClickHouse preferred
Job Responsibility
  • Design, build, and operate distributed systems that power observability across ClickHouse Cloud
  • Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
  • Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
  • Build tooling and automation to eliminate repetitive operational work
  • Help shape the roadmap for observability by identifying bottlenecks and scaling challenges
  • Collaborate with other engineering teams to improve their observability posture
  • Contribute to design discussions, architecture reviews, and mentor teammates
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 home office setup if you’re a remote employee
  • Global Gatherings - We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Full-time

Staff Observability Operations Engineer

We are currently seeking several experienced and highly skilled Staff Observabil...
Location:
United States, Hartford
Salary:
130295.00 - 260590.00 USD / Year
CVS Health
Expiration Date
Until further notice
Requirements
  • 7+ years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications
  • 5+ years in a Site Reliability Engineering (SRE) role deploying and managing modern observability solutions
  • 5+ years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana)
  • Experience developing and administering ServiceNow ITOM event management solutions
  • Experience deploying and managing service reliability platforms (e.g., xMatters, OpsGenie, PagerDuty)
  • Experience with and deep knowledge of cloud environments, cloud monitoring platforms, and container orchestration tools (e.g., AWS/CloudTrail, Azure/Monitor, GCP/GCM, Kubernetes, OpenShift)
  • Proficiency in Python and other scripting languages such as Ansible, PowerShell, Bash for automation and configuration
  • Hands-on experience deploying, managing, and administering observability platforms
  • Hands-on experience leading, coordinating, and performing migration of application, platform, and infrastructure observability solutions
  • Proven ability to troubleshoot and resolve complex technical issues
Job Responsibility
  • Deploy and implement modern observability solutions
  • Manage and administer observability and event management platforms
  • Coordinate and manage release cycles for observability platforms
  • Troubleshoot and resolve incidents related to observability platforms
  • Continuously monitor and enhance platform performance
  • Collaborate with cross-functional stakeholders
  • Provide training and mentoring to junior engineers
  • Ensure compliance and security of observability platforms
  • Maintain documentation of observability platform configurations
  • Generate and analyze reports on platform performance and capacity
What we offer
  • Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
  • Paid time off, flexible work schedules, family leave, dependent care resources, and colleague assistance programs
Employment Type:
Full-time