Software Engineer - Observability Platform

United Kingdom, London · Job Posted February 13, 2026

Job Description

Enable the team to address a severe backlog of high-priority work: keeping up to date with patching schedules and vulnerability management.

Job Responsibility

  • Development and maintenance of Observability Platform & tooling at FactSet, covering all Observability pillars (metrics, logs and traces)
  • Participate in an on-call schedule, as part of a global team, to maintain a highly available platform for our end-users
  • Work with and support the stakeholders who use the Observability platform

Requirements

Minimum experience: mid-level

Similar Jobs for

Software Engineer - Observability Platform

8 matching positions

Senior+ Software Engineer - Cloud Availability Platform Engineering (Observability)

We are looking for a highly skilled engineer with deep expertise in building and...
Location
United States, San Francisco
Salary:
166000.00 - 201000.00 USD / Year
Crusoe
Expiration Date
Until further notice
Requirements
  • 7+ years of experience in infrastructure or platform engineering, with a focus on observability and monitoring systems
  • Deep expertise with metrics systems (Prometheus, Thanos, Mimir, Cortex), logging pipelines (Fluent Bit, Vector, Loki, ELK/OpenSearch), and tracing platforms (Jaeger, Tempo, OpenTelemetry)
  • Strong programming skills in Go or Python for automation, operators, and custom integrations
  • Experience running observability platforms on Kubernetes and operating them at scale across multi-datacenter environments
  • Proven ability to design, optimize, and scale telemetry pipelines handling high cardinality and high throughput data
  • Solid understanding of distributed systems, performance engineering, and debugging complex workloads
  • Strong collaboration skills and the ability to influence engineering teams to adopt observability best practices
Job Responsibility
  • Designing and operating scalable observability systems (metrics, logging, tracing) across multi-datacenter Kubernetes environments
  • Architecting end-to-end telemetry pipelines, including ingestion, storage, querying, and visualization
  • Extending monitoring and alerting with Prometheus, Alertmanager, Thanos/Cortex, Grafana, and OpenTelemetry
  • Building scalable log collection and processing pipelines with Fluent Bit, Vector, Loki, or ELK/OpenSearch stacks
  • Implementing distributed tracing platforms (Tempo, Jaeger, OpenTelemetry) and integrating with service meshes, load balancers, and APIs
  • Defining and driving adoption of SLOs, SLIs, and error budgets across services and teams
  • Automating provisioning and scaling of observability infrastructure with Kubernetes, Terraform, and custom tooling (Go, Python)
  • Ensuring reliability and cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure)
  • Embedding security best practices into observability platforms, including RBAC, TLS, secret management, and multi-tenant access controls
  • Partnering with engineering teams to embed observability into applications, services, and infrastructure
What we offer
  • Restricted Stock Units in a fast-growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Fulltime

Senior Software Engineer and Software Engineer II

OneDrive and SharePoint are rapidly growing services at the center of Microsoft'...
Location
United States, Redmond
Salary:
100600.00 - 199000.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience related to cloud-scale distributed design and patterns
  • The ability to deliver informed designs and plans ahead of production and execution
  • Knowledge of others' expertise and the ability to involve multiple players (within and outside the organization) in the creation or development of novel products, processes, or research streams
Job Responsibility
  • Design and deliver systems that enable partners and ISVs to migrate from other cloud providers, improve core systems performance and efficiencies, and ensure zero customer impact throughout the change management cycle
  • Deliver systems to meet our business continuity planning goals, provide telemetry for optimizing the service, and drive down our response time for detecting and resolving service issues
  • Create, implement, optimize, debug, refactor, and reuse code to establish and improve performance and maintainability, effectiveness, and return on investment (ROI)
  • Contribute to the identification of dependencies and the development of design documents for a product area with little oversight
  • Help to identify other teams and technologies that will be leveraged, how they will interact, and when one's system may provide support to others
  • Contribute to determining back-end dependencies associated with product, application, service, or platform functionality for product features
  • Understand downstream effects of solutions and work provided
  • Help to identify areas of dependency and overlap with other teams or team members and drive coordination
  • Remain current in skills by investing time and effort into staying abreast of current developments that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale
  • Review work items to deepen knowledge of product features in partnership with appropriate stakeholders (e.g., project managers) and execute project plans, release plans, and work items
  • Fulltime

Software Engineer - Observability

Microsoft is a company where passionate innovators come to collaborate, envision...
Location
Ireland, Dublin
Salary:
63400.00 - 105700.00 EUR / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in computer science or related discipline, or equivalent experience
  • Software development with demonstrated experience shipping products or services
  • Solid understanding of data structures, algorithms, and system design fundamentals
  • Strong problem-solving and analytical skills, with a structured approach to software design
  • Ability to collaborate effectively in a cross-functional team environment
  • Strong communication skills
  • The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Job Responsibility
  • Design, develop, and operate large-scale, multi-tenant telemetry ingestion pipelines and services (real-time and batch) to handle massive data volumes
  • Build and enhance APIs, tools, and subsystems for telemetry collection, routing, storage, and efficient data access
  • Integrate advanced capabilities (e.g., machine learning–based anomaly detection and data validation) to enhance platform intelligence and insights
  • Own core components of the ingestion and observability platform, driving continuous improvements in reliability, scalability, performance, and data quality
  • Implement robust monitoring, alerting, and diagnostics and ensure production services run reliably, including participation in on-call rotations and incident response
  • Collaborate with partner teams to deliver end-to-end observability solutions and contribute to design reviews and best practices that uphold high engineering standards
  • Embody our culture and values
  • Fulltime

Senior Software Engineer - Platform

Location
India, Pune
Salary:
Not provided
Rapid7
Expiration Date
Until further notice
Requirements
  • A minimum of 6 years of experience in software development using Java, Terraform, and Jenkins. Experience with Python, Go, Ansible, Spinnaker, or other equivalent languages and tools would be advantageous
  • Experience with vulnerability or code quality frameworks such as Trivy, Snyk, or SonarQube
  • A minimum of 1 year of working with observability tooling such as Grafana
  • Experience using cloud infrastructure, ideally AWS
  • Experience with testing frameworks such as Selenium, Cypress, Cucumber, or Playwright would be advantageous
  • Excited by technology, curious and eager to learn, with the ability to mentor more junior members of the team
  • Customer-focussed mindset: understanding customer needs, providing excellent service, and focussing on delivering value
  • The attitude and ability to thrive in a high-growth, evolving environment
  • Collaborative team player who has the ability to partner with others and drive toward solutions
  • Strong creative problem-solving skills
Job Responsibility
  • Collaborate with your team and other key stakeholders to identify potential risks to availability/reliability
  • Suggest patterns and standards around testing and operations to improve reliability, quality, and time-to-market of our suite of software solutions
  • Be involved in the creation, design and planning of upcoming testing strategies, operational improvements and decisions around tooling and frameworks relating to SRE/Operations
  • Regularly monitor our applications/infrastructure and identify opportunities for improving efficiencies or MTTR
  • Understand our products and make decisions to support our customers
  • Fulltime

Senior Software Engineer, Observability

You will work on core observability systems (metrics, logs, traces) while also d...
Location
India, Bengaluru
Salary:
Not provided
Roku
Expiration Date
Until further notice
Requirements
  • 8+ years in software engineering, building distributed, high-throughput systems or observability platforms
  • 4+ years of Go/Golang experience (our observability ecosystem is built on Go, making it the most effective language for this role)
  • Experience with, or strong interest in, observability tools (Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, ClickHouse) and standards (OpenTelemetry, OpenTracing, OpenMetrics)
  • Deep understanding of distributed systems and data models
  • Hands-on experience with Kubernetes and cloud platforms (AWS, GCP, Azure)
Job Responsibility
  • Extend and integrate open-source observability systems, and when necessary, structurally overhaul core components, such as storage layers and query paths, to enhance the performance, reliability, and usability of these tools at scale
  • Build services to improve performance, usability, reliability, and cost efficiency
  • Implement features like pre-aggregation, downsampling, and sampling to reduce load and accelerate queries
  • Create developer-facing capabilities for metrics, logs, and traces usage, data quality, and cost management
  • Automate onboarding, dashboards, alerting, and tracing
  • Collaborate across platform and infrastructure teams to integrate observability into Roku’s cloud-native stack
What we offer
  • Global access to mental health and financial wellness support and resources
  • Healthcare (medical, dental, and vision)
  • Life, accident, disability, commuter, and retirement options (401(k)/pension)
  • Fulltime

Software Engineer, Observability

As a Software Engineer in Observability, you’ll be responsible for our metrics a...
Location
India, Bengaluru
Salary:
Not provided
Dialpad
Expiration Date
Until further notice
Requirements
  • Background in Systems and/or Software Engineering
  • Experience in designing, automating, maintaining, and optimizing observability platforms (logging, metrics, and tracing)
  • Experience with configuration management tools such as Ansible, Terraform, etc.
  • Experience with Public Cloud environments such as GCP, AWS, etc.
  • Familiarity with languages such as Python, Go, Rust, etc.
  • Previous direct experience with Grafana, Loki, Prometheus
  • Experience with Linux
  • Experience with Kubernetes (including GKE/EKS) and building containerized applications
  • Undergraduate degree in Computer Science or Engineering
Job Responsibility
  • Develop and improve instrumentation for monitoring and logging the health and availability of services
  • Develop and maintain the observability stack within Dialpad engineering
  • Define best practices and standards around making systems and services measurable, and work with various teams to get those best practices applied
  • Create tools and libraries for other engineering teams to enable them to build self-monitoring capabilities
  • Create and own internal documentation used by the other engineering teams
  • Stay up-to-date with the latest trends in observability, logging, monitoring, and cloud technologies
  • Collaborate with different engineering teams to integrate observability practices into their workflows
  • Participate in a rotating on-call within the larger Infrastructure Engineering division
What we offer
  • Competitive salary
  • Comprehensive benefits
  • Real opportunities for growth
  • Cutting-edge AI tools
  • Robust training program
  • Fulltime

Senior Software Engineer, Observability

We are looking for an experienced Senior Engineer to join our newly formed Obser...
Location
Germany, Berlin
Salary:
Not provided
Aiven Deutschland GmbH
Expiration Date
Until further notice
Requirements
  • Extensive experience with observability concepts at large scale
  • A good grasp of monitoring and observability tools like Prometheus, Grafana, and OpenTelemetry
  • Understanding of SLAs, SLOs, and SLIs
  • Strong knowledge of database fundamentals, including OLAP vs. OLTP, persistence, replication, and clustering
  • Experience with ClickHouse specifically regarding logs, metrics, and OpenTelemetry is highly desirable
  • Experience in building and designing distributed systems in a cloud environment
  • Ability to work with SQL to interact with our platform's master database
  • Deep understanding of release management and testing best practices to own the delivery pipeline
  • A genuine interest in solving complex technical challenges with customer-focused solutions
Job Responsibility
  • Ensure our existing observability offering is up and running all the time
  • Ideate and develop innovative new features that attract our target customer segment, drive product engagement, and ultimately fuel growth
  • Support our existing external customer base by resolving escalated support issues and collaborating with them to understand and solve their needs
  • Guide the team in the hands-on implementation of key platform features, ensuring maintainability and performance
  • Empower your team to act as 'product custodians' by consistently addressing foundational and production issues
  • Practise effective communication and collaboration both within the team and across the wider organization and act as a role model in transparency for your peers
What we offer
  • Participate in Aiven’s equity plan
  • Balance work and life with our hybrid work policy
  • Choose the equipment you need to set yourself up for success
  • Use your Professional Development Plan budget for learning opportunities
  • Receive holistic wellbeing support through our global Employee Assistance Program
  • Inquire about our Global Time Off Commitment (Parental and Sick Leave, as well as Personal Time)
  • Enjoy country-specific benefits for our global cast
  • Fulltime

Software Engineer, Platform (Developer Experience)

We are seeking a Platform Engineer to enhance and scale Lovable's developer expe...
Location
Sweden, Stockholm
Salary:
Not provided
Lovable
Expiration Date
Until further notice
Requirements
  • Strong programming skills and a track record of improving developer velocity and system reliability
  • 7+ years of experience working in a platform team supporting developer experience (o11y, CI/CD, application frameworks, productivity tooling, etc.)
  • Experience writing code and tools to support growing engineering orgs in scale-ups
  • Experience with Docker, Kubernetes and modern infrastructure practices
  • Problem-solver who thrives on challenges and ships high-leverage systems fast
  • Comfortable navigating ambiguity and solving problems as they arise
  • Care about security, stability, and speed and know when to make trade-offs between them
  • Based in Stockholm or ready to relocate
Job Responsibility
  • Own and scale the developer experience at Lovable, making our engineering teams the most productive in the world
  • Bring order and structure to our code base
  • Integrate or build application frameworks to support a growing engineering organization and code footprint
  • Own and develop our observability stack, from code instrumentation through ingestion to presentation
  • Integrate tools for AI-driven development
  • Identify and drive reliability improvement efforts across all engineering teams
  • Fulltime