Observability Lead

Chicago Trading Company

Location:
United States, Chicago

Contract Type:
Not provided

Salary:

175,000 - 250,000 USD / Year

Job Description:

We are seeking an Observability Lead to own the strategy, execution, and technical direction of CTC's observability platform. In this role, you will lead a small, high-impact team responsible for the tools and systems that give our engineers, quants, and traders visibility into the health and performance of critical infrastructure and applications.

Job Responsibility:

  • Define and drive the observability roadmap
  • Lead the design, implementation, and continuous improvement of monitoring, alerting, logging, tracing, and metrics infrastructure at scale
  • Own the end-to-end developer experience of observability tooling
  • Manage and grow a small team of engineers
  • Partner with infrastructure, platform, and application teams
  • Establish and enforce best practices for instrumentation, SLOs, alert quality, and operational readiness
  • Evaluate emerging tools, frameworks, and approaches in the observability space

Requirements:

  • 8+ years of technical engineering experience
  • At least 3 years focused on observability, monitoring, or site reliability engineering
  • Demonstrated expertise designing, building, and operating observability platforms at scale
  • Deep experience with Datadog and OpenTelemetry strongly preferred
  • Proven experience leading or managing a small engineering team
  • Strong understanding of distributed systems and microservices architectures
  • Hands-on experience with Kubernetes and bare-metal infrastructure
  • Advanced programming proficiency in at least one of Python, Go, or Java
  • Familiarity with C++ or low-latency systems is a strong plus
  • A product-oriented mindset
  • Exceptional communication skills
  • Financial sector experience (trading, prop trading, hedge funds) and familiarity with low-latency, high-reliability systems are strongly preferred
  • Advanced degree (MS, PhD) in Computer Science, Engineering, or related field is a plus

Nice to have:

  • Familiarity with C++ or low-latency systems
  • Financial sector experience (trading, prop trading, hedge funds)
  • Advanced degree (MS, PhD) in Computer Science, Engineering, or related field

What we offer:

  • Generous medical coverage
  • Paid parental leave
  • Free breakfast and lunch
  • Healthy snacks
  • Wellness reimbursement
  • Quarterly recharge days

Additional Information:

Job Posted:
March 13, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Observability Lead

Solutions Engineering Lead

Coralogix is a modern full-stack observability platform that transforms how busi...
Location:
United States, Dallas
Salary:
Not provided
Coralogix
Expiration Date:
Until further notice
Requirements:
  • 7+ years in customer-facing technical roles (Sales Engineering, Solutions Architecture or similar)
  • 3+ years leading or managing pre-sales technical teams with a record of coaching success
  • Experience supporting or owning team-level quotas within a sales organization
  • Hands-on expertise with the following: Kibana, Grafana, Datadog, New Relic, Splunk, Honeycomb, Jaeger, OpenSearch
  • Proficiency crafting PromQL, Lucene and SQL queries for troubleshooting and dashboards
  • Deep knowledge of cloud services central to observability: AWS (EKS, Fargate, Lambda, CloudFormation, CloudWatch Logs and Metrics), Azure Monitor, and equivalents in Google Operations Suite
  • Working knowledge of OpenTelemetry, modern DevOps and container platforms (Kubernetes, Docker)
  • Strong ability to communicate with engineers and C-level audiences alike
  • Familiarity with structured sales methodologies such as MEDDPIC or Command of the Message (plus)
Job Responsibility:
  • Own regional SE performance in partnership with Account Executives, ensuring quota attainment and deal velocity
  • Hire, onboard and mentor Solutions Engineers, setting clear KPIs and career paths
  • Maintain a strong personal presence with customers, modeling technical excellence and closing strategic opportunities
  • Improve processes for discovery, POC execution, documentation and knowledge sharing
  • Collaborate with Product, Support and Customer Success to shorten feedback loops and accelerate adoption
  • Architect and deploy reference designs for logs, metrics, traces, SIEM and Kubernetes monitoring across AWS, Azure and GCP
  • Lead white-board deep-dive sessions on ingestion pipelines, index-free querying and cost-optimized retention strategies
  • Provide escalation support during POCs: troubleshoot complex issues, analyze logs and traces, craft PromQL, Lucene or DataPrime queries, and isolate root causes
  • Track technical success metrics such as POC win rate, onboarding time-to-value and validation scorecards, converting data insights into process improvements
  • Contribute code or scripts (Python, Go or Java) for custom exporters, automation and synthetic monitoring
What we offer:
  • Comprehensive and inclusive employee benefits for healthcare, dental, and mental health
  • A 401(k) plan and match
  • Paid sick time
  • Paid time off
Employment Type:
Full-time

Solutions Engineering Lead

We are hiring a Solutions Engineering Team Lead for the East region to scale and...
Location:
United States, Boston
Salary:
220,000 - 300,000 USD / Year
Coralogix
Expiration Date:
Until further notice
Requirements:
  • 7+ years in customer-facing technical roles (Sales Engineering, Solutions Architecture or similar)
  • 3+ years leading or managing pre-sales technical teams with a record of coaching success
  • Experience supporting or owning team-level quotas within a sales organization
  • Hands-on expertise with the following: Kibana, Grafana, Datadog, New Relic, Splunk, Honeycomb, Jaeger, OpenSearch
  • Proficiency crafting PromQL, Lucene and SQL queries for troubleshooting and dashboards
  • Deep knowledge of cloud services central to observability: AWS (EKS, Fargate, Lambda, CloudFormation, CloudWatch Logs and Metrics), Azure Monitor, and equivalents in Google Operations Suite
  • Working knowledge of OpenTelemetry, modern DevOps and container platforms (Kubernetes, Docker)
  • Strong ability to communicate with engineers and C-level audiences alike
Job Responsibility:
  • Own regional SE performance in partnership with Account Executives, ensuring quota attainment and deal velocity
  • Hire, onboard and mentor Solutions Engineers, setting clear KPIs and career paths
  • Maintain a strong personal presence with customers, modeling technical excellence and closing strategic opportunities
  • Improve processes for discovery, POC execution, documentation and knowledge sharing
  • Collaborate with Product, Support and Customer Success to shorten feedback loops and accelerate adoption
  • Architect and deploy reference designs for logs, metrics, traces, SIEM and Kubernetes monitoring across AWS, Azure and GCP
  • Lead white-board deep-dive sessions on ingestion pipelines, index-free querying and cost-optimized retention strategies
  • Provide escalation support during POCs: troubleshoot complex issues, analyze logs and traces, craft PromQL, Lucene or DataPrime queries, and isolate root causes
  • Track technical success metrics such as POC win rate, onboarding time-to-value and validation scorecards, converting data insights into process improvements
  • Contribute code or scripts (Python, Go or Java) for custom exporters, automation and synthetic monitoring
What we offer:
  • Comprehensive and inclusive employee benefits for healthcare, dental, and mental health
  • 401(k) plan and match
  • Paid sick time and paid time off
Employment Type:
Full-time

Team Lead, Technical Account Manager

Coralogix is a modern, full-stack observability platform transforming how busine...
Location:
United Kingdom, London
Salary:
Not provided
Coralogix
Expiration Date:
Until further notice
Requirements:
  • Background knowledge and hands-on practice in Cloud DevOps, specifically experience with AWS (EC2, EKS, ECS, Fargate, Lambda, CloudFormation, Load Balancers, CloudWatch) and the equivalent with Azure and GCP
  • Background knowledge and hands-on practice in Observability, specifically experience working with one or more of the following tools: Kibana, OpenSearch, Grafana, Datadog, Sumo Logic, New Relic, AppDynamics, Dynatrace, Prometheus, Logz.io, SignalFx, Instana, Splunk, Honeycomb, Jaeger
  • Proven experience leading technical teams, especially focused on delivering observability solutions, logging infrastructure, and successful customer onboarding
  • Ability to define and track onboarding KPIs, focusing on technical adoption and customer satisfaction
  • Strong analytical skills to interpret customer data and usage trends, ensuring continuous improvements in observability practices
  • Ability to communicate complex technical information to both technical and non-technical stakeholders
  • Excellent communication skills in English
  • Strong presentation skills with the ability to establish credibility with executives
Job Responsibility:
  • Lead, mentor, and manage a team of TAMs to ensure successful customer onboarding and long-term success
  • Develop KPIs for the team and track performance related to the onboarding experience, ensuring customer satisfaction
  • Provide technical guidance and foster team collaboration on observability tools and log analytics
  • Oversee the implementation of observability tools, guiding customers through logs, metrics, and traces monitoring and real-time analysis
  • Ensure that your team delivers expert-level onboarding and ongoing work for our observability and logging solutions
  • Provide deep technical insights on cloud observability and integration of Coralogix into customer infrastructures
  • Be the primary escalation point for customer technical challenges
  • Proactively work with customers to enhance their logging and observability practices, integrating them seamlessly with Coralogix’s platform
  • Engage with Coralogix stakeholders to provide tailored technical solutions that align with customer business goals
  • Leverage customer feedback and usage data to enhance the onboarding process and overall TAM team performance
Employment Type:
Full-time

Cyber Security Engineering Lead

Join Citi's Cloud Technology Services team to lead and execute critical cyber se...
Location:
Hungary, Budapest
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 10+ years of relevant cybersecurity and/or IT experience
  • Leadership roles across technology or cybersecurity leading large programs or transformational activities
  • Proven track record of delivering security observability platforms such as telemetry data for performance and/or user experience.
  • Thorough understanding of industry and corporate technology standards for Cyber Security services
  • Demonstrated ability to take ownership and work with cross functional teams to manage multiple projects simultaneously under pressure
  • Advanced analytical and problem-solving skills
  • Consistently demonstrates clear and concise written and oral communication as well as strong presentation skills to both technical and non-technical audiences.
  • Bachelor’s degree in relevant subject or equivalent work experience
Job Responsibility:
  • Lead a virtual team of Infrastructure Defense professionals.
  • Lead CTB transformational and RTB activities across NDCS and act as focal point managing cyber security platforms
  • Lead, design, own and deliver Security Observability Enablement on a global scale focusing on all related perimeter technologies – such as Firewall Telemetry.
  • Deliver end-to-end dashboards of critical security service based data (such as firewall performance)
  • Work with Transformation Program Directors, Senior Architects, and Steering Committees on the execution of perimeter security and edge security programs
  • Work with global cyber security industry partners on influencing next-generation cyber technology and take part in related R&D efforts.
  • Responsible for inventory, accuracy and engineering excellence activities for assigned services and products.
What we offer:
  • Cafeteria Program
  • Home Office Allowance (for colleagues working in hybrid work models)
  • Paid Parental Leave Program (maternity and paternity leave)
  • Private Medical Care Program and onsite medical rooms at our offices
  • Pension Plan Contribution to voluntary pension fund
  • Group Life Insurance
  • Employee Assistance Program
  • Access to a wide variety of learning and development programs, online course libraries and upskilling platforms, such as Udemy and Degreed
  • Flexible work arrangements to support you in managing work-life balance
  • Career progression opportunities across geographies and business lines
Employment Type:
Full-time

Lead Observability Engineer

Lead Observability Engineer role focusing on the Elastic Observability Platform,...
Location:
India, Hyderabad
Salary:
Not provided
Blue Yonder
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, MIS, or equivalent experience
  • 7–10+ years of experience in observability engineering, SRE, monitoring platform ownership, or infrastructure operations
  • Deep, hands-on expertise with Elastic Stack (Elasticsearch, Kibana, Logstash, Beats/Elastic Agent, APM)
  • Strong architectural knowledge of cloud (Azure/AWS) and hybrid observability patterns
  • Experience leading observability for infrastructure, cloud platforms, network systems, Kubernetes, and Microsoft 365
  • Proven experience designing monitoring for SaaS platforms (Workday, Salesforce, ServiceNow)
  • Advanced scripting/automation experience (Python, PowerShell, Bash)
  • Strong knowledge of API integrations, data pipelines, and log-flow engineering
  • Experience leading incident diagnostics and delivering visibility for RCA and operational improvement
  • Strong analytical, architectural, and troubleshooting skills with a platform-owner mindset
Job Responsibility:
  • Receives work assignments through the ticketing system or from senior leadership
  • Provides Tier-4 engineering expertise, platform ownership, and technical leadership for all observability capabilities across hybrid cloud, on-premises, and SaaS environments
  • Leads the design, architecture, and maturity of the enterprise observability ecosystem with a primary focus on the Elastic Observability Platform
  • Drives the enterprise strategy for logging, metrics, traces, synthetics, and alerting—including governance, standardization, and performance optimization
  • Partners closely with Cloud, Infrastructure, Security, Enterprise Applications, and SRE leadership to define observability frameworks
  • Ensures observability platforms meet enterprise requirements for security, performance, availability, compliance, and scalability
  • Oversees monitoring implementations for key SaaS applications including Workday, Salesforce, ServiceNow, and Microsoft 365
  • Provides guidance, mentorship, and direction to observability engineers, SREs, and operational teams
  • Acts as a strategic advisor during major incidents by providing real-time diagnostics, correlation insights, and driving RCA improvements
  • Required to provide on-call support during off-hours on weekdays, weekends, and holidays on a rotating basis
Employment Type:
Full-time

Engineering Manager

We are looking for a skilled Engineering Manager to lead our Billing and Interna...
Location:
Finland, Helsinki
Salary:
Not provided
Aiven Deutschland GmbH
Expiration Date:
Until further notice
Requirements:
  • Proven track record of leading diverse teams from junior to senior engineers to successfully deliver software products
  • Strong product sense enabling team to design innovative, cloud-native products
  • Excellent communication skills in English
  • Ability to bring order and clarity to a dynamic, ambiguous environment
  • Experience in recruiting engineering talent and building high-performing teams
  • Experience with agile software development methodologies
  • Experience in building and designing distributed systems in a cloud environment
  • Strong knowledge of database fundamentals including OLAP vs OLTP, persistence, replication, and clustering
  • Good grasp of monitoring and observability tools like Prometheus, Grafana, and OpenTelemetry
  • Ability to work with SQL to interact with platform's master database
Job Responsibility:
  • Strategic Planning: Partner with Product Manager and Domain Head to create team's roadmap
  • Project Management: Oversee team's backlog and projects
  • Leadership & Delivery: Champion culture of urgency and ownership to deliver impactful results
  • Team Enablement: Empower team to act as product custodians
  • Performance & Development: Provide clear goals and feedback
  • Team Facilitation: Lead team meetings such as planning sessions and retrospectives
  • Culture Building: Foster psychologically safe, high-trust environment
  • Collaboration: Ensure effective communication and collaboration within team and across organization
What we offer:
  • Participate in Aiven's equity plan
  • Hybrid work policy
  • Equipment provided
  • Employer support for career development including learning platforms and annual learning budget
  • Global Employee Assistance Program
  • Professional massage at office
  • Health and fitness benefits through Urban Sport Club membership
  • Monthly team breakfast
  • Referral bonus programme

Staff Observability Operations Engineer

We are currently seeking several experienced and highly skilled Staff Observabil...
Location:
United States, Hartford
Salary:
130,295 - 260,590 USD / Year
CVS Health
Expiration Date:
Until further notice
Requirements:
  • 7+ Years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications
  • 5+ Years in a Site Reliability Engineering (SRE) role deploying and managing modern observability solutions
  • 5+ Years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana)
  • Experience developing and administering ServiceNow ITOM event management solutions
  • Experience deploying and managing service reliability platforms (e.g., xMatters, OpsGenie, PagerDuty)
  • Experience with and deep knowledge of cloud environments, cloud monitoring platforms, and container orchestration tools (e.g., AWS/CloudTrail, Azure/Monitor, GCP/GCM, Kubernetes, OpenShift)
  • Proficiency in Python and other scripting languages such as Ansible, PowerShell, Bash for automation and configuration
  • Hands-on experience deploying, managing, and administering observability platforms
  • Hands-on experience leading, coordinating, and performing migration of application, platform, and infrastructure observability solutions
  • Proven ability to troubleshoot and resolve complex technical issues
Job Responsibility:
  • Deploy and implement modern observability solutions
  • Manage and administer observability and event management platforms
  • Coordinate and manage release cycles for observability platforms
  • Troubleshoot and resolve incidents related to observability platforms
  • Continuously monitor and enhance platform performance
  • Collaborate with cross-functional stakeholders
  • Provide training and mentoring to junior engineers
  • Ensure compliance and security of observability platforms
  • Maintain documentation of observability platform configurations
  • Generate and analyze reports on platform performance and capacity
What we offer:
  • Affordable medical plan options
  • A 401(k) plan (including matching company contributions)
  • An employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs
  • Confidential counseling and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
  • Colleague assistance programs
Employment Type:
Full-time

Site Reliability Engineering (SRE) / Observability Technical Lead

Join a dynamic team as a Site Reliability Engineer, leading observability and re...
Location:
United Kingdom, London
Salary:
Not provided
NTT DATA
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in SRE, Observability, or DevOps roles, with leadership responsibilities
  • Proven expertise with Application Performance Monitoring (APM) tools such as New Relic, Datadog, AppDynamics, or Dynatrace
  • Hands-on experience with OpenTelemetry (OTel) for distributed tracing and observability instrumentation
  • Strong proficiency in Infrastructure as Code (IaC) using Terraform
  • Solid understanding of cloud platforms including AWS, GCP, or Azure
  • Experience with automation/configuration management tools like Ansible, Chef, or Puppet
  • Deep knowledge of CI/CD pipelines and tools such as GitHub Actions, Jenkins, or Azure DevOps
  • Experience managing Kubernetes and containerized environments (Docker, Helm)
  • Familiarity with log aggregation and analysis platforms like ELK Stack or Splunk
  • Excellent leadership, communication, and collaboration skills
Job Responsibility:
  • Lead the strategic development and management of observability and reliability frameworks across the organization, ensuring alignment with business goals and technical requirements
  • Design and implement monitoring and observability solutions, collaborating with engineering teams to define standards and best practices
  • Manage Infrastructure as Code (IaC) initiatives using Terraform, coordinating with cloud and infrastructure teams to ensure scalable and secure deployments
  • Drive automation strategies for monitoring, alerting, and logging pipelines, focusing on process improvements and operational efficiency
  • Develop and maintain comprehensive observability roadmaps, including distributed tracing, logging, and metrics collection strategies
  • Collaborate with product management, sales, and pre-sales teams to provide technical expertise and support during solution design and customer engagements
  • Lead cross-functional teams to enhance CI/CD pipelines and deployment reliability, ensuring smooth integration of observability tools and practices
  • Engage with vendors and strategic partners to evaluate, select, and integrate observability and monitoring solutions, ensuring alignment with organizational needs and fostering strong collaborative relationships
  • Mentor and develop junior engineers and analysts, fostering a culture of reliability, observability, and operational excellence
What we offer:
  • Tailored benefits that support your physical, emotional, and financial wellbeing
  • Continuous growth and development opportunities
  • Flexible work options
Employment Type:
Full-time