Observability Engineer / Architect

United Kingdom, Shropshire · Job Posted January 07, 2026

Job Description

You will play a pivotal role in designing and governing observability architecture across enterprise platforms, ensuring alignment with strategic objectives and technical standards. This role combines discovery (assessing current state, defining observability requirements, and shaping architectural blueprints) and delivery (implementing scalable solutions, integrating with enterprise ecosystems, and optimising technical design) to enhance service reliability, performance, and business insights. You will act as a key influencer in architectural decisions, ensuring observability capabilities are embedded into the organisation’s technology roadmap.

Job Responsibility

  • Conduct current-state assessments of observability architecture across applications, infrastructure, and services
  • Define target-state architecture for observability, ensuring alignment with enterprise principles and technology standards
  • Identify monitoring gaps and prioritise remediation based on technical risk and business-critical outcomes
  • Collaborate with Enterprise Architects, Service Owners, and the Observability Centre of Excellence to shape observability strategy and backlog
  • Design and implement Dynatrace-based observability solutions with architectural considerations for scalability, resilience, and integration
  • Develop reference architectures and patterns for observability, embedding them into CI/CD pipelines and strategic tooling
  • Optimise data ingestion and technical architecture for Dynatrace deployments, ensuring compliance with vendor best practices and enterprise governance
  • Provide architectural oversight during incident analysis and troubleshooting to reduce MTTR and improve system reliability
  • Act as a technical authority for Dynatrace SaaS deployments, guiding engineering teams on architectural decisions
  • Drive adoption of observability standards, frameworks, and architectural principles across teams
  • Contribute to the enterprise observability roadmap, ensuring alignment with broader IT strategy and digital transformation goals
  • Mentor and upskill teams on architectural thinking, Dynatrace capabilities, and observability best practices
  • Participate in architecture review boards and provide input on cross-domain integration and interoperability

Requirements

  • Must not have been outside the UK for more than 6 months in the last 5 years
  • Proven experience with Dynatrace SaaS and observability architecture, including dashboarding, alerting, and DQL
  • Strong understanding of observability principles (metrics, logs, traces) and their role in enterprise architecture
  • Familiarity with cloud platforms (AWS, Azure), container technologies (Kubernetes), and architectural frameworks
  • Experience designing solutions for enterprise systems (WebLogic, Apache, Oracle, SQL) and infrastructure (Windows, Linux, Unix)
  • Ability to produce architectural artefacts (diagrams, standards, patterns) and communicate complex designs effectively
  • Excellent stakeholder engagement and collaboration skills
  • Dynatrace Associate Certification

Nice to have

  • TOGAF or equivalent enterprise architecture certification

Similar Jobs for Observability Engineer / Architect (8 matching positions)

Lead Observability Engineer

We are seeking a Lead Observability Engineer to join the team, and be able to wo...
Salary: Not provided
N-iX
Expiration Date: Until further notice
Requirements
  • 5+ years of engineering experience in cloud observability platforms, infrastructure, and telemetry systems
  • Deep experience in alerting, notifications, and monitoring at scale
  • Advanced expertise with ClickHouse, or similar high-performance analytical databases, for telemetry storage and querying
  • Hands-on experience migrating telemetry/storage solutions (preferably from Cosmos DB to ClickHouse or equivalent)
  • Solid understanding of telemetry pipelines, cloud-native monitoring, and best practices
  • Experience with dashboarding and visualization tools (Grafana, Kibana, or similar)
  • Strong scripting and automation skills (Python, Bash, Terraform or equivalent)
  • Proven collaboration and communication skills across cross-functional teams
Job Responsibility
  • Lead the migration and transformation of telemetry storage from custom Cosmos DB solutions to ClickHouse, building a scalable and reliable end-to-end observability platform
  • Architect, implement, and maintain alerting and notification systems integrated with ClickHouse for critical services and applications
  • Develop, deploy, and operate high-throughput telemetry pipelines, ensuring accurate and actionable monitoring across cloud environments
  • Collaborate with engineering and product teams to define and champion observability best practices
  • Work with DevOps and development teams to automate collection, ingestion, and retention policies for logs, metrics, and traces
  • Drive continuous improvement in system performance, stability, and reliability through effective observability
  • Participate in on-call rotations, incident response, and root cause analysis to enhance monitoring and alerting capabilities
What we offer
  • Flexible working format - remote, office-based or flexible
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team buildings
  • Other location-specific benefits
Employment type: Full-time

Aircraft Subsystems Architect - Engineer

Aircraft Subsystems Architect - Engineer position at SOGECLAIR, a provider of in...
Location: United States, Wichita
Salary: Not provided
Chat3D
Expiration Date: Until further notice
Requirements
  • More than 10 years of relevant experience in Mission Management Systems development, Command & Control Communications and Sensor, integration and certification
  • Strong knowledge of complex systems development process methodologies and associated practices (e.g., ARP4754, ARP4761)
  • Familiarity with NATO Standardization Agreement STANAG 4671 and 4586
  • Open minded, curious, rigorous and methodical
  • Wish to develop systems engineering capabilities
  • Strive for harmonious cohesive teamwork and support knowledge sharing amongst team members
  • Excellent analytical and problem-solving skills
  • Approach problem solving with a holistic view
  • Strong communication skills, with the ability to tailor communication to suit the audience
  • Possess a university degree in Electrical, Aerospace or Mechanical Engineering
Job Responsibility
  • Work with customers, sales, program managers and systems engineers to translate high level customer requirements into specific and detailed platform and product requirements
  • Develop functional requirements and interfaces for the mission management system and integration of the sensing and communication sub-systems
  • Own the mission system architecture from designing roadmaps, performing trade studies, developing Concept of Operations, elaborating requirements and interfaces throughout the life-cycle of the program
  • Lead the architecture through design phase milestones including Critical Design Reviews and Critical Integration Reviews
  • Conduct solution tradeoff studies that lead to the definition of the product
  • Communicate architectures and configurations across engineering disciplines
  • Define the Validation & Verification activities from concept to execution including the associated documentation
  • Develop Specifications for and integrate potential suppliers/vendors solutions
  • Engage in new business pursuits including white papers and proposals
  • Develop data-based observations and recommendations that will be presented to senior leadership
Employment type: Full-time

Sr/Staff Software Engineer, Observability

We are looking for a highly skilled engineer with deep expertise in building and...
Location: United States, San Francisco
Salary: 172000.00 - 253000.00 USD / Year
Crusoe
Expiration Date: Until further notice
Requirements
  • 6+ years of experience with distributed systems, with a focus on observability and monitoring systems
  • Deep expertise with metrics systems (Prometheus, Thanos, Mimir, Cortex), logging pipelines (Fluent Bit, Vector, Loki, ELK/OpenSearch), and tracing platforms (Jaeger, Tempo, OpenTelemetry)
  • Strong programming skills in Go or Python for automation, operators, and custom integrations
  • Experience running observability platforms on Kubernetes and operating them at scale across multi-datacenter environments
  • Proven ability to design, optimize, and scale telemetry pipelines handling high cardinality and high throughput data
  • Solid understanding of distributed systems, performance engineering, and debugging complex workloads
  • Familiarity with service meshes, networking, and workload instrumentation (Envoy, Istio, OpenTelemetry SDKs)
  • Strong collaboration skills and the ability to influence engineering teams to adopt observability best practices
Job Responsibility
  • Designing and operating scalable observability systems (metrics, logging, tracing) across multi-datacenter Kubernetes environments
  • Architecting end-to-end telemetry pipelines, including ingestion, storage, querying, and visualization
  • Extending monitoring and alerting with Prometheus, Alertmanager, Thanos/Cortex, Grafana, and OpenTelemetry
  • Building scalable log collection and processing pipelines with Fluent Bit, Vector, Loki, or ELK/OpenSearch stacks
  • Implementing distributed tracing platforms (Tempo, Jaeger, OpenTelemetry) and integrating with service meshes, load balancers, and APIs
  • Defining and driving adoption of SLOs, SLIs, and error budgets across services and teams
  • Automating provisioning and scaling of observability infrastructure with Kubernetes, Terraform, and custom tooling (Go, Python)
  • Ensuring reliability and cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure)
  • Embedding security best practices into observability platforms, including RBAC, TLS, secret management, and multi-tenant access controls
  • Mentoring engineers and shaping Crusoe’s observability strategy and technical roadmap
What we offer
  • Restricted Stock Units in a fast growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
Employment type: Full-time

Staff Backend Solution Architect Engineer

Heidi builds for the future of healthcare, not just the next quarter, and our go...
Location: Australia, Sydney
Salary: Not provided
Heidi
Expiration Date: Until further notice
Requirements
  • Mastery of software-engineering fundamentals
  • Proven experience designing APIs (REST or GraphQL), and event-driven systems (Kafka, Pub/Sub or similar)
  • Deep coding and architectural mindset: hands-on experience with test-driven development (TDD), AI coding, microservices, eventual consistency, etc.
  • Deep database expertise—schema design, query optimisation and scaling for both relational and NoSQL stores
  • Cloud fluency (GCP or AWS): containers, CI/CD, infrastructure-as-code, security and observability
  • Track record of shipping and operating distributed systems at scale with high reliability
  • Rigorous testing, code-review and documentation habits that elevate team standards
  • Hands-on work with AI/LLM services or real-time audio/streaming pipelines
Job Responsibility
  • Architect and build backend services that power LLM-based agents and clinical automations
  • Design robust APIs & data models that are secure, observable and easy for other teams to extend
  • Optimise performance and cost—profile hot paths, tune databases and right-size cloud resources
  • Automate quality: write unit / integration tests, craft alerts and own on-call runbooks so clinicians can trust every interaction
  • Partner with product, AI and front-end engineers to ship new capabilities from concept to production in weeks, not quarters
What we offer
  • Additional paid day off for your birthday and wellness days
  • A generous personal development budget of $500 per annum
  • Learn from some of the best engineers and creatives, joining a diverse team
  • The rare chance to create a global impact as you immerse yourself in one of Australia’s leading healthtech startups
  • If you have an impact quickly, the opportunity to fast track your startup career
Employment type: Full-time

Lead Observability Engineer

Lead Observability Engineer role focusing on the Elastic Observability Platform,...
Location: India, Hyderabad
Salary: Not provided
Blue Yonder
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, MIS, or equivalent experience
  • 7–10+ years of experience in observability engineering, SRE, monitoring platform ownership, or infrastructure operations
  • Deep, hands-on expertise with Elastic Stack (Elasticsearch, Kibana, Logstash, Beats/Elastic Agent, APM)
  • Strong architectural knowledge of cloud (Azure/AWS) and hybrid observability patterns
  • Experience leading observability for infrastructure, cloud platforms, network systems, Kubernetes, and Microsoft 365
  • Proven experience designing monitoring for SaaS platforms (Workday, Salesforce, ServiceNow)
  • Advanced scripting/automation experience (Python, PowerShell, Bash)
  • Strong knowledge of API integrations, data pipelines, and log-flow engineering
  • Experience leading incident diagnostics and delivering visibility for RCA and operational improvement
  • Strong analytical, architectural, and troubleshooting skills with a platform-owner mindset
Job Responsibility
  • Receives work assignments through the ticketing system or from senior leadership
  • Provides Tier-4 engineering expertise, platform ownership, and technical leadership for all observability capabilities across hybrid cloud, on-premises, and SaaS environments
  • Leads the design, architecture, and maturity of the enterprise observability ecosystem with a primary focus on the Elastic Observability Platform
  • Drives the enterprise strategy for logging, metrics, traces, synthetics, and alerting—including governance, standardization, and performance optimization
  • Partners closely with Cloud, Infrastructure, Security, Enterprise Applications, and SRE leadership to define observability frameworks
  • Ensures observability platforms meet enterprise requirements for security, performance, availability, compliance, and scalability
  • Oversees monitoring implementations for key SaaS applications including Workday, Salesforce, ServiceNow, and Microsoft 365
  • Provides guidance, mentorship, and direction to observability engineers, SREs, and operational teams
  • Acts as a strategic advisor during major incidents by providing real-time diagnostics, correlation insights, and driving RCA improvements
  • Required to provide on-call support during off-hours on weekdays, weekends, and holidays on a rotating basis
Employment type: Full-time

Subsystems Architect - Engineer

At Bombardier, we design, build and maintain the world’s peak-performing aircraf...
Location: United States, Wichita
Salary: Not provided
Bombardier
Expiration Date: Until further notice
Requirements
  • More than 10 years of relevant experience in Mission Management Systems development, Command & Control Communications and Sensor, integration and certification
  • Strong knowledge of complex systems development process methodologies and associated practices (e.g., ARP4754, ARP4761)
  • Familiarity with NATO Standardization Agreement STANAG 4671 and 4586
  • Open minded, curious, rigorous and methodical
  • Strive for harmonious cohesive teamwork and support knowledge sharing amongst team members
  • Excellent analytical and problem-solving skills with a holistic view
  • Strong communication skills, with the ability to tailor communication to suit the audience
  • Possess a university degree in Electrical, Aerospace or Mechanical Engineering
  • Solid understanding of aircraft systems, system development, industry specific certification processes
Job Responsibility
  • Work with customers, sales, program managers and systems engineers to translate high level customer requirements into specific and detailed platform and product requirements
  • Develop functional requirements and interfaces for the mission management system and integration of the sensing and communication sub-systems
  • Own the mission system architecture from designing roadmaps, performing trade studies, developing Concept of Operations, elaborating requirements and interfaces throughout the life-cycle of the program
  • Lead the architecture through design phase milestones including Critical Design Reviews and Critical Integration Reviews
  • Conduct solution tradeoff studies that lead to the definition of the product
  • Communicate architectures and configurations across engineering disciplines
  • Define the Validation & Verification activities from concept to execution including the associated documentation
  • Develop Specifications for and integrate potential suppliers/vendors solutions
  • Engage in new business pursuits including white papers and proposals
  • Develop data-based observations and recommendations that will be presented to senior leadership
What we offer
  • Insurance plans (Dental, medical, life insurance, disability, and more)
  • Competitive base salary
  • Retirement savings plan
  • Employee Assistance Program
  • Tele Health Program
Employment type: Full-time

Senior Data Engineer Lead / Architect - Senior Vice President

At Citi Services - Global Trade Technology Organization, we are on a mission to ...
Location: Pune, Maharashtra, India; Chennai, Tamil Nadu, India
Salary: Not provided
Citi
Expiration Date: Until further notice
Requirements
  • 10+ years of professional experience in data engineering, with a proven track record of designing and building large-scale data systems
  • 3+ years in a technical leadership or architect role, with experience mentoring junior and senior engineers
  • Expert-level proficiency in at least one programming language (Python or Scala preferred) and exceptional SQL skills
  • Proven hands-on experience with Python or Scala for data manipulation, scripting, machine learning, and backend development
  • Deep, hands-on experience with a major cloud platform (AWS, GCP, or Azure) and its data ecosystem (e.g., S3/GCS, Redshift/BigQuery, EMR/Dataproc, Kinesis/Dataflow)
  • Extensive hands-on experience with modern big data and data streaming technologies (such as Hadoop, Hive, Impala, Apache Spark, Kafka, or Flink)
  • Proficiency with workflow orchestration tools such as Airflow, Dagster, or Prefect
  • Proficiency in designing and implementing microservices architectures, RESTful APIs, and event-driven systems with 'Data as a Product' Principle
  • Solid understanding of data modeling concepts and database design for both analytical (OLAP) and transactional (OLTP) workloads
  • Deep understanding and hands-on experience with relational databases (e.g., PostgreSQL, Oracle), NoSQL databases (e.g., MongoDB, Cassandra), data warehousing, and big data technologies (e.g., Spark, Kafka)
Job Responsibility
  • Architect & Design: Design, architect, and oversee the development of robust, scalable, and reliable data infrastructure, including data lakes, data warehouses, and real-time streaming platforms on the cloud
  • Build & Code: Act as a senior individual contributor and hands-on technical leader. Write clean, maintainable, and high-performance code for data ingestion, transformation, and serving layers (e.g., using Python, Scala, SQL, and Spark)
  • Lead & Mentor: Lead a team of data engineers, providing technical guidance, mentorship, and career development support. Foster a collaborative and inclusive team environment
  • Champion Culture: Define, document, and champion data engineering best practices across the organization, including CI/CD, data quality, testing frameworks, observability, and code review standards
  • Drive Strategy: Partner with leadership, product managers, data scientists, and analysts to understand data needs and develop a long-term data strategy and roadmap
  • Innovate & Evaluate: Stay at the forefront of data engineering technologies. Evaluate, prototype, and recommend new tools and frameworks to continuously improve our data platform
  • Ensure Governance: Implement and enforce robust data governance, security, and privacy policies in partnership with our security and compliance teams
Employment type: Full-time

Principal Engineer (Software Architect)

At Flight Centre Travel Group (FCTG) our purpose is to 'open up the world for th...
Location: Australia, South Bank
Salary: Not provided
Flight Centre Brand
Expiration Date: Until further notice
Requirements
  • 3+ years of experience as a Technical Lead or Technical Architect
  • Experience in transactional domains (e.g. bookings, payments, e-commerce) where data integrity and financial accountability are critical
  • Broad experience across diverse technology stacks with the ability to assess trade-offs across languages, paradigms, hosting models and data storage approaches
  • Strong experience designing and delivering cloud-native applications built for global scale, reliability, security and performance
  • Strong knowledge of architecture styles including SOA, microservices and common software design patterns
  • Holistic understanding of the full software lifecycle including CI/CD, observability, production support, reporting and developer tooling
  • Exposure to Kubernetes, Elasticsearch, Redis and AWS services such as EKS, Lambda, API Gateway, DynamoDB, S3 and CloudFront
  • Demonstrated experience applying AI-assisted development practices and a strong point of view on embedding AI and agentic capabilities into engineering workflows
  • Proven ability to partner with and influence senior business stakeholders
  • Excellent written and oral communication skills
Job Responsibility
  • Shape technical strategy through hands-on involvement in product discovery, prototyping and planning
  • Design pragmatic, cloud‑native architectures with a focus on simplicity, reuse, testability, performance and stability
  • Lead data‑oriented architectural design, defining how data is produced, owned and transformed across business processes
  • Validate and evolve architectural decisions through spikes, proofs of concept and close collaboration with engineers and technical leads
  • Champion adoption of AI and agentic capabilities in engineering workflows, leveraging emerging technologies to improve delivery and impact
  • Stay close to delivery and the code, supporting teams with system dependencies, risk identification and production readiness
  • Establish and promote architectural patterns, standards and best practices that scale across teams and domains
  • Mentor engineers and technical leaders, empowering teams to make sound architectural decisions within clear security and stability guardrails
  • Continuously improve engineering quality, developer experience, tooling, pipelines and ways of working through hands‑on contribution
What we offer
  • Inclusive company culture
  • Equal Opportunity Employer
  • Individualised ongoing Learning & Development via communities of practice
  • Innovation Days
  • Dedicated Engineering Days
  • Access to LinkedIn Learning for ongoing skills development
  • Women in PM&E group
  • Exclusive staff discounts
  • Travel discounts including family and friends
  • Career opportunities in a network of brands and businesses across the globe
Employment type: Full-time