Lead Observability Engineer Job at Blue Yonder (Hyderabad)

Job Description

The Lead Observability Engineer role focuses on the Elastic Observability Platform, ensuring end-to-end visibility across infrastructure, cloud services, networks, and business-critical applications. It combines strategic leadership, platform ownership, and technical expertise across hybrid environments.

Job Responsibility

Receives work assignments through the ticketing system or from senior leadership
Provides Tier-4 engineering expertise, platform ownership, and technical leadership for all observability capabilities across hybrid cloud, on-premises, and SaaS environments
Leads the design, architecture, and maturity of the enterprise observability ecosystem with a primary focus on the Elastic Observability Platform
Drives the enterprise strategy for logging, metrics, traces, synthetics, and alerting—including governance, standardization, and performance optimization
Partners closely with Cloud, Infrastructure, Security, Enterprise Applications, and SRE leadership to define observability frameworks
Ensures observability platforms meet enterprise requirements for security, performance, availability, compliance, and scalability
Oversees monitoring implementations for key SaaS applications including Workday, Salesforce, ServiceNow, and Microsoft 365
Provides guidance, mentorship, and direction to observability engineers, SREs, and operational teams
Acts as a strategic advisor during major incidents by providing real-time diagnostics, correlation insights, and driving RCA improvements
Provides on-call support on a rotating basis during weekday off-hours, weekends, and holidays
Own and lead the architecture and roadmap for the Elastic Observability platform across the enterprise
Define and enforce governance standards for logs, metrics, traces, data retention, and alerting quality
Lead platform scaling initiatives—including cluster sizing, performance tuning, ILM tiering, and cost optimization
Architect, deploy, and maintain advanced Elastic Observability solutions across hybrid environments
Design executive-grade dashboards, correlation views, analytics boards, anomaly detection, and ML-based detections
Optimize ingestion pipelines, index structures, data flow, and search/query performance at scale
Integrate Elastic Observability with Azure, VMware, Kubernetes, network platforms, ServiceNow, and API sources
Define and lead enterprise monitoring standards across logs, metrics, traces, and synthetics
Drive cloud and on-prem monitoring maturity by improving instrumentation, coverage, and telemetry consistency
Establish alert engineering frameworks that reduce noise and improve detection fidelity
Lead design of synthetic transactions, user-experience monitoring, and availability baselines for SaaS apps
Ensure proactive monitoring of Workday, Salesforce, ServiceNow, and Microsoft 365 integrations
Serve as the observability lead during P1/P0 incidents by delivering real-time visibility and correlation insights
Drive MTTR/MTTD improvements through enhanced observability patterns and RCA alignment
Build and maintain operational runbooks, dashboards, and standard operating procedures
Work with engineering, Cloud, Infrastructure, Applications, and Security leadership to improve observability adoption
Act as the senior technical advisor in major IT projects, shaping observability-by-design principles
Mentor and guide observability engineers, analysts, and SRE teams to uplift operational capabilities
Ensure all monitoring pipelines follow enterprise security, compliance, retention, and logging policies
Validate that new systems adhere to observability onboarding requirements and telemetry standards
Empower partner IT teams, such as Infrastructure and Apps, to self-serve by creating their own monitors within the unified guidance and framework established by Observability

Requirements

Bachelor’s degree in Computer Science, Engineering, MIS, or equivalent experience
7–10+ years of experience in observability engineering, SRE, monitoring platform ownership, or infrastructure operations
Deep, hands-on expertise with Elastic Stack (Elasticsearch, Kibana, Logstash, Beats/Elastic Agent, APM)
Strong architectural knowledge of cloud (Azure/AWS) and hybrid observability patterns
Experience leading observability for infrastructure, cloud platforms, network systems, Kubernetes, and Microsoft 365
Proven experience designing monitoring for SaaS platforms (Workday, Salesforce, ServiceNow)
Advanced scripting/automation experience (Python, PowerShell, Bash)
Strong knowledge of API integrations, data pipelines, and log-flow engineering
Experience leading incident diagnostics and delivering visibility for RCA and operational improvement
Strong analytical, architectural, and troubleshooting skills with a platform-owner mindset
Demonstrated ability to influence cross-functional teams and drive enterprise observability adoption
Knowledge of ITIL processes, SRE principles, and operational governance
Excellent communication, leadership, and stakeholder-management skills

Nice to have

Familiarity with Grafana, Prometheus, Splunk, AppDynamics, Dynatrace
Knowledge of Terraform, Ansible, Kubernetes, and infrastructure-as-code tools
