Observability Engineer / Architect Job at Whitehall Resources Ltd (Shropshire)

Lead Observability Engineer

We are seeking a Lead Observability Engineer to join the team, and be able to wo...

Location

Salary:

Not provided

N-iX

Expiration Date

Until further notice

Requirements

5+ years of engineering experience in cloud observability platforms, infrastructure, and telemetry systems
Deep experience in alerting, notifications, and monitoring at scale
Advanced expertise with ClickHouse, or similar high-performance analytical databases, for telemetry storage and querying
Hands-on experience migrating telemetry/storage solutions (preferably from Cosmos DB to ClickHouse or equivalent)
Solid understanding of telemetry pipelines, cloud-native monitoring, and best practices
Experience with dashboarding and visualization tools (Grafana, Kibana, or similar)
Strong scripting and automation skills (Python, Bash, Terraform or equivalent)
Proven collaboration and communication skills across cross-functional teams.

Job Responsibility

Lead the migration and transformation of telemetry storage from custom Cosmos DB solutions to ClickHouse, building a scalable and reliable end-to-end observability platform
Architect, implement, and maintain alerting and notification systems integrated with ClickHouse for critical services and applications
Develop, deploy, and operate high-throughput telemetry pipelines, ensuring accurate and actionable monitoring across cloud environments
Collaborate with engineering and product teams to define and champion observability best practices
Work with DevOps and development teams to automate collection, ingestion, and retention policies for logs, metrics, and traces
Drive continuous improvement in system performance, stability, and reliability through effective observability
Participate in on-call rotations, incident response, and root cause analysis to enhance monitoring and alerting capabilities.

What we offer

Flexible working format - remote, office-based or flexible
A competitive salary and good compensation package
Personalized career growth
Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
Active tech communities with regular knowledge sharing
Education reimbursement
Memorable anniversary presents
Corporate events and team buildings
Other location-specific benefits

Fulltime

Aircraft Subsystems Architect - Engineer

Aircraft Subsystems Architect - Engineer position at SOGECLAIR, a provider of in...

Location

United States, Wichita

Salary:

Not provided

Chat3D

Expiration Date

Until further notice

Requirements

More than 10 years of relevant experience in Mission Management Systems development and in Command & Control, Communications and Sensor integration and certification
Strong knowledge of complex systems development process methodologies and associated practices (e.g., ARP4754, ARP4761)
Familiarity with NATO Standardization Agreement STANAG 4671 and 4586
Open-minded, curious, rigorous and methodical
Wish to develop systems engineering capabilities
Strive for harmonious cohesive teamwork and support knowledge sharing amongst team members
Excellent analytical and problem-solving skills
Approach problem solving with a holistic view
Strong communication skills, with the ability to tailor communication to suit the audience
Possess a university degree in Electrical, Aerospace or Mechanical Engineering

Job Responsibility

Work with customers, sales, program managers and systems engineers to translate high level customer requirements into specific and detailed platform and product requirements
Develop functional requirements and interfaces for the mission management system and integration of the sensing and communication sub-systems
Own the mission system architecture, including designing roadmaps, performing trade studies, developing the Concept of Operations, and elaborating requirements and interfaces throughout the life cycle of the program
Lead the architecture through design phase milestones including Critical Design Reviews and Critical Integration Reviews
Conduct solution tradeoff studies that lead to the definition of the product
Communicate architectures and configurations across engineering disciplines
Define the Validation & Verification activities from concept to execution including the associated documentation
Develop specifications for, and integrate, potential supplier/vendor solutions
Engage in new business pursuits including white papers and proposals
Develop data-based observations and recommendations that will be presented to senior leadership

Fulltime

Sr/Staff Software Engineer, Observability

We are looking for a highly skilled engineer with deep expertise in building and...

Location

United States, San Francisco

Salary:

172000.00 - 253000.00 USD / Year

Crusoe

Expiration Date

Until further notice

Requirements

6+ years of experience with distributed systems, with a focus on observability and monitoring systems
Deep expertise with metrics systems (Prometheus, Thanos, Mimir, Cortex), logging pipelines (Fluent Bit, Vector, Loki, ELK/OpenSearch), and tracing platforms (Jaeger, Tempo, OpenTelemetry)
Strong programming skills in Go or Python for automation, operators, and custom integrations
Experience running observability platforms on Kubernetes and operating them at scale across multi-datacenter environments
Proven ability to design, optimize, and scale telemetry pipelines handling high cardinality and high throughput data
Solid understanding of distributed systems, performance engineering, and debugging complex workloads
Familiarity with service meshes, networking, and workload instrumentation (Envoy, Istio, OpenTelemetry SDKs)
Strong collaboration skills and the ability to influence engineering teams to adopt observability best practices

Job Responsibility

Designing and operating scalable observability systems (metrics, logging, tracing) across multi-datacenter Kubernetes environments
Architecting end-to-end telemetry pipelines, including ingestion, storage, querying, and visualization
Extending monitoring and alerting with Prometheus, Alertmanager, Thanos/Cortex, Grafana, and OpenTelemetry
Building scalable log collection and processing pipelines with Fluent Bit, Vector, Loki, or ELK/OpenSearch stacks
Implementing distributed tracing platforms (Tempo, Jaeger, OpenTelemetry) and integrating with service meshes, load balancers, and APIs
Defining and driving adoption of SLOs, SLIs, and error budgets across services and teams
Automating provisioning and scaling of observability infrastructure with Kubernetes, Terraform, and custom tooling (Go, Python)
Ensuring reliability and cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure)
Embedding security best practices into observability platforms, including RBAC, TLS, secret management, and multi-tenant access controls
Mentoring engineers and shaping Crusoe’s observability strategy and technical roadmap

What we offer

Restricted Stock Units in a fast-growing, well-funded technology company
Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
Employer contributions to HSA accounts
Paid Parental Leave
Paid life insurance, short-term and long-term disability
Teladoc
401(k) with a 100% match up to 4% of salary
Generous paid time off and holiday schedule
Cell phone reimbursement
Tuition reimbursement

Fulltime

Staff Backend Solution Architect Engineer

Heidi builds for the future of healthcare, not just the next quarter, and our go...

Location

Australia, Sydney

Salary:

Not provided

Heidi

Expiration Date

Until further notice

Requirements

Mastery of software-engineering fundamentals
Proven experience designing APIs (REST or GraphQL), and event-driven systems (Kafka, Pub/Sub or similar)
Deep coding and architectural mindset: hands-on experience with test-driven development (TDD), AI coding, microservices, eventual consistency, etc.
Deep database expertise—schema design, query optimisation and scaling for both relational and NoSQL stores
Cloud fluency (GCP or AWS): containers, CI/CD, infrastructure-as-code, security and observability
Track record of shipping and operating distributed systems at scale with high reliability
Rigorous testing, code-review and documentation habits that elevate team standards
Hands-on work with AI/LLM services or real-time audio/streaming pipelines

Job Responsibility

Architect and build backend services that power LLM-based agents and clinical automations
Design robust APIs & data models that are secure, observable and easy for other teams to extend
Optimise performance and cost—profile hot paths, tune databases and right-size cloud resources
Automate quality: write unit / integration tests, craft alerts and own on-call runbooks so clinicians can trust every interaction
Partner with product, AI and front-end engineers to ship new capabilities from concept to production in weeks, not quarters

What we offer

Additional paid day off for your birthday and wellness days
A generous personal development budget of $500 per annum
Learn from some of the best engineers and creatives, joining a diverse team
The rare chance to create a global impact as you immerse yourself in one of Australia’s leading healthtech startups
The opportunity to fast-track your startup career if you make an impact quickly

Fulltime

Lead Observability Engineer

Lead Observability Engineer role focusing on the Elastic Observability Platform,...

Location

India, Hyderabad

Salary:

Not provided

Blue Yonder

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Engineering, MIS, or equivalent experience
7–10+ years of experience in observability engineering, SRE, monitoring platform ownership, or infrastructure operations
Deep, hands-on expertise with Elastic Stack (Elasticsearch, Kibana, Logstash, Beats/Elastic Agent, APM)
Strong architectural knowledge of cloud (Azure/AWS) and hybrid observability patterns
Experience leading observability for infrastructure, cloud platforms, network systems, Kubernetes, and Microsoft 365
Proven experience designing monitoring for SaaS platforms (Workday, Salesforce, ServiceNow)
Advanced scripting/automation experience (Python, PowerShell, Bash)
Strong knowledge of API integrations, data pipelines, and log-flow engineering
Experience leading incident diagnostics and delivering visibility for RCA and operational improvement
Strong analytical, architectural, and troubleshooting skills with a platform-owner mindset

Job Responsibility

Receives work assignments through the ticketing system or from senior leadership
Provides Tier-4 engineering expertise, platform ownership, and technical leadership for all observability capabilities across hybrid cloud, on-premises, and SaaS environments
Leads the design, architecture, and maturity of the enterprise observability ecosystem with a primary focus on the Elastic Observability Platform
Drives the enterprise strategy for logging, metrics, traces, synthetics, and alerting—including governance, standardization, and performance optimization
Partners closely with Cloud, Infrastructure, Security, Enterprise Applications, and SRE leadership to define observability frameworks
Ensures observability platforms meet enterprise requirements for security, performance, availability, compliance, and scalability
Oversees monitoring implementations for key SaaS applications including Workday, Salesforce, ServiceNow, and Microsoft 365
Provides guidance, mentorship, and direction to observability engineers, SREs, and operational teams
Acts as a strategic advisor during major incidents by providing real-time diagnostics, correlation insights, and driving RCA improvements
Required to provide on-call support during off-hours on weekdays, weekends, and holidays on a rotating basis

Fulltime

Subsystems Architect - Engineer

At Bombardier, we design, build and maintain the world’s peak-performing aircraf...

Location

United States, Wichita

Salary:

Not provided

Bombardier

Expiration Date

Until further notice

Requirements

More than 10 years of relevant experience in Mission Management Systems development and in Command & Control, Communications and Sensor integration and certification
Strong knowledge of complex systems development process methodologies and associated practices (e.g., ARP4754, ARP4761)
Familiarity with NATO Standardization Agreement STANAG 4671 and 4586
Open-minded, curious, rigorous and methodical
Strive for harmonious cohesive teamwork and support knowledge sharing amongst team members
Excellent analytical and problem-solving skills with a holistic view
Strong communication skills, with the ability to tailor communication to suit the audience
Possess a university degree in Electrical, Aerospace or Mechanical Engineering
Solid understanding of aircraft systems, system development, and industry-specific certification processes

Job Responsibility

Work with customers, sales, program managers and systems engineers to translate high level customer requirements into specific and detailed platform and product requirements
Develop functional requirements and interfaces for the mission management system and integration of the sensing and communication sub-systems
Own the mission system architecture, including designing roadmaps, performing trade studies, developing the Concept of Operations, and elaborating requirements and interfaces throughout the life cycle of the program
Lead the architecture through design phase milestones including Critical Design Reviews and Critical Integration Reviews
Conduct solution tradeoff studies that lead to the definition of the product
Communicate architectures and configurations across engineering disciplines
Define the Validation & Verification activities from concept to execution including the associated documentation
Develop specifications for, and integrate, potential supplier/vendor solutions
Engage in new business pursuits including white papers and proposals
Develop data-based observations and recommendations that will be presented to senior leadership

What we offer

Insurance plans (Dental, medical, life insurance, disability, and more)
Competitive base salary
Retirement savings plan
Employee Assistance Program
Telehealth Program

Fulltime

Senior Data Engineer Lead / Architect - Senior Vice President

At Citi Services - Global Trade Technology Organization, we are on a mission to ...

Location

India, Pune, Maharashtra, India, Chennai, Tamil Nadu, India

Salary:

Not provided

Citi

Expiration Date

Until further notice

Requirements

10+ years of professional experience in data engineering, with a proven track record of designing and building large-scale data systems
3+ years in a technical leadership or architect role, with experience mentoring junior and senior engineers
Expert-level proficiency in at least one programming language (Python or Scala preferred) and exceptional SQL skills
Proven hands-on experience with Python or Scala for data manipulation, scripting, machine learning, and backend development
Deep, hands-on experience with a major cloud platform (AWS, GCP, or Azure) and its data ecosystem (e.g., S3/GCS, Redshift/BigQuery, EMR/Dataproc, Kinesis/Dataflow)
Extensive hands-on experience with modern big data and data streaming technologies (such as Hadoop, Hive, Impala, Apache Spark, Kafka, or Flink)
Proficiency with workflow orchestration tools such as Airflow, Dagster, or Prefect
Proficiency in designing and implementing microservices architectures, RESTful APIs, and event-driven systems following the 'Data as a Product' principle
Solid understanding of data modeling concepts and database design for both analytical (OLAP) and transactional (OLTP) workloads
Deep understanding and hands-on experience with relational databases (e.g., PostgreSQL, Oracle), NoSQL databases (e.g., MongoDB, Cassandra), data warehousing, and big data technologies (e.g., Spark, Kafka)

Job Responsibility

Architect & Design: Design, architect, and oversee the development of robust, scalable, and reliable data infrastructure, including data lakes, data warehouses, and real-time streaming platforms on the cloud
Build & Code: Act as a senior individual contributor and hands-on technical leader. Write clean, maintainable, and high-performance code for data ingestion, transformation, and serving layers (e.g., using Python, Scala, SQL, and Spark)
Lead & Mentor: Lead a team of data engineers, providing technical guidance, mentorship, and career development support. Foster a collaborative and inclusive team environment
Champion Culture: Define, document, and champion data engineering best practices across the organization, including CI/CD, data quality, testing frameworks, observability, and code review standards
Drive Strategy: Partner with leadership, product managers, data scientists, and analysts to understand data needs and develop a long-term data strategy and roadmap
Innovate & Evaluate: Stay at the forefront of data engineering technologies. Evaluate, prototype, and recommend new tools and frameworks to continuously improve our data platform
Ensure Governance: Implement and enforce robust data governance, security, and privacy policies in partnership with our security and compliance teams

Fulltime

Principal Engineer (Software Architect)

At Flight Centre Travel Group (FCTG) our purpose is to 'open up the world for th...

Location

Australia, South Bank

Salary:

Not provided

Flight Centre Brand

Expiration Date

Until further notice

Requirements

3+ years of experience as a Technical Lead or Technical Architect
Experience in transactional domains (e.g. bookings, payments, e-commerce) where data integrity and financial accountability are critical
Broad experience across diverse technology stacks with the ability to assess trade-offs across languages, paradigms, hosting models and data storage approaches
Strong experience designing and delivering cloud-native applications built for global scale, reliability, security and performance
Strong knowledge of architecture styles including SOA, microservices and common software design patterns
Holistic understanding of the full software lifecycle including CI/CD, observability, production support, reporting and developer tooling
Exposure to Kubernetes, Elasticsearch, Redis and AWS services such as EKS, Lambda, API Gateway, DynamoDB, S3 and CloudFront
Demonstrated experience applying AI-assisted development practices and a strong point of view on embedding AI and agentic capabilities into engineering workflows
Proven ability to partner with and influence senior business stakeholders
Excellent written and oral communication skills

Job Responsibility

Shape technical strategy through hands-on involvement in product discovery, prototyping and planning
Design pragmatic, cloud‑native architectures with a focus on simplicity, reuse, testability, performance and stability
Lead data‑oriented architectural design, defining how data is produced, owned and transformed across business processes
Validate and evolve architectural decisions through spikes, proofs of concept and close collaboration with engineers and technical leads
Champion adoption of AI and agentic capabilities in engineering workflows, leveraging emerging technologies to improve delivery and impact
Stay close to delivery and the code, supporting teams with system dependencies, risk identification and production readiness
Establish and promote architectural patterns, standards and best practices that scale across teams and domains
Mentor engineers and technical leaders, empowering teams to make sound architectural decisions within clear security and stability guardrails
Continuously improve engineering quality, developer experience, tooling, pipelines and ways of working through hands‑on contribution

What we offer

Inclusive company culture
Equal Opportunity Employer
Individualised ongoing Learning & Development via communities of practice
Innovation Days
Dedicated Engineering Days
Access to LinkedIn Learning for ongoing skills development
Women in PM&E group
Exclusive staff discounts
Travel discounts including family and friends
Career opportunities in a network of brands and businesses across the globe

Fulltime
