Principal Engineer I - Cloud Observability

India · Job Posted May 04, 2026

Job Description

We’re not just building better tech. We’re rewriting how data moves and what the world can do with it. With Confluent, data doesn’t sit still. Our platform puts information in motion, streaming in near real-time so companies can react faster, build smarter, and deliver experiences as dynamic as the world around them. It takes a certain kind of person to join this team. Those who ask hard questions, give honest feedback, and show up for each other. No egos, no solo acts. Just smart, curious humans pushing toward something bigger, together. One Confluent. One Team. One Data Streaming Platform.

Job Responsibility

  • Work with a team of engineers and architects to evolve Confluent Observability features
  • Work closely with product management, engineering leadership, and other key stakeholders across various teams in Confluent to build and drive the overall roadmap
  • Be a strong technical voice beyond the Confluent Observability team, across Confluent
  • Influence the overall domain health and operational hygiene for Confluent Observability
  • Act as a technical champion for the observability capabilities we provide to our customers
  • Review designs and code, and raise our technical standards
  • Lead the technology charter for our observability features in Confluent Cloud and in hybrid scenarios with Confluent Platform
  • Mentor a team of high-performing engineers and leads, helping them continue to grow their skill sets through hands-on experience
  • Be a strong technical leader and representative for engineering teams in India
  • Provide timely and productive feedback, encourage a growth mindset, and advise team members in setting and working toward personal development goals
  • Nurture a culture of excellence on the team through a focus on hiring, communication, execution, and work quality
  • Create and manage processes that enable the team to do its best work

Requirements

  • 15+ years of hands-on software development experience, with the ability to anticipate future technical needs for the product and craft plans to realize them
  • A track record of taking ideas to production
  • Ready to roll up your sleeves - code, debug, design - do whatever it takes to ship the product to production
  • Experience building and operating large-scale systems. Solid understanding of basic systems operations (disk, network, operating systems, etc.). Experience running production services in the cloud
  • Strong fundamentals in distributed systems design and development. Solid fundamentals in concurrent and multi-threaded programming
  • A self-starter with the ability to work effectively in teams. Able to proactively identify the symptoms of technical issues, reason about their causes, and fix the root causes
  • Timely shipping of deliverables, with the ability to trade off short-term technical decisions against long-term needs. Move fast, build in increments, and iterate. A sense of urgency, a results-oriented mindset, and excellent prioritization skills
  • Ability to influence the team, peers and upper management in technology decisions using effective communication and collaborative techniques
  • Degree in Computer Science, Engineering, or equivalent experience. Understanding of various technologies, programming paradigms, and frameworks, with the pragmatism to trade off their usage in production
  • Ability to take on intense customer production issues while on call, debugging and mitigating them. This requires patient log and metrics analysis with solid reasoning to pinpoint the issue

Nice to have

  • Experience in designing and developing effective solutions for systems observability problems, including effective enablement of metrics, logging, events, or traces capabilities
  • Experience using and operating Apache Kafka, Apache Flink, Apache Druid, and OpenSearch is a big plus
  • Interest in evangelism (giving talks at tech conferences, writing blog posts evangelizing Kafka)
  • Experience working on stream processing technology or query processing systems

What we offer

  • Remote-First Work
  • Robust Insurance Benefits
  • Flexible Time Away
  • The Best Teammates
  • Experience Ambassadors
  • Open and Honest Culture
  • Well-Being and Growth

Similar Jobs for Principal Engineer I - Cloud Observability

8 matching positions

Senior/ Principal/ Sr Principal Engineer (Cortex Cloud)

As a Senior/ Principal/ Senior Principal Software Engineer at Cortex Cloud, you ...
Location: Israel, Tel Aviv
Salary: Not provided
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • 5+/ 8+/10+ years of software engineering experience with a proven track record of delivering robust, high-scale distributed systems
  • Deep expertise in systems-level programming and modern backend languages (e.g., Go, Python) with a focus on building scalable server-side infrastructure
  • Extensive experience designing, deploying, and operating large-scale architectures on GCP, AWS, or Azure, including strong knowledge of Kubernetes, Docker and Helm
  • Proven ability to architect systems that handle high-concurrency data ingestion and wide-scale data distribution/broadcasting
  • Demonstrated experience in profiling, debugging, and optimizing complex distributed systems to eliminate performance bottlenecks
  • Exceptional ability to communicate complex technical concepts to both highly technical peers and non-technical stakeholders
Job Responsibility
  • Define and drive the multi-year technical roadmap for our server-side communication infrastructure, ensuring the platform remains resilient and performant under extreme load
  • Lead the design and implementation of backend systems optimized for receiving high-scale data from client-side apps and distributing data back to a vast ecosystem of endpoints
  • Act as a force multiplier by providing technical guidance to multiple engineering teams, aligning them on shared protocols, architectural standards, and communication patterns
  • Champion a culture of high engineering rigor, focusing on deep observability, low-latency data distribution, and runtime stability for mission-critical production environments
  • Partner with Product Management, Infrastructure, and Client-Side Engineering teams to evaluate technical trade-offs, mitigate risks, and ensure seamless end-to-end data flow
  • Spearhead the evaluation of emerging technologies and lead 'proof of concept' initiatives for next-generation transport layers and messaging paradigms
  • Invest in the growth of Senior and Staff engineers through deep-dive design reviews, code audits, and hands-on pair programming on the most critical paths
  • Support the business by leading technical deep dives with strategic customers, translating complex architectural concepts into actionable confidence for our partners
Employment Type: Full-time

Senior/ Principal/ Sr Principal Engineer (Cortex Cloud)

As a Senior/ Principal/ Senior Principal Software Engineer at Cortex Cloud, you ...
Location: Israel, Tel Aviv
Salary: Not provided
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • 5+/8+/10+ years of software engineering experience with a proven track record of delivering robust, high-scale distributed systems
  • Server-Side Mastery: Deep expertise in systems-level programming and modern backend languages (e.g., Go, Python) with a focus on building scalable server-side infrastructure
  • Cloud Native Foundations: Extensive experience designing, deploying, and operating large-scale architectures on GCP, AWS, or Azure, including strong knowledge of Kubernetes, Docker and Helm
  • Bidirectional Data Flow: Proven ability to architect systems that handle high-concurrency data ingestion and wide-scale data distribution/broadcasting
  • Systemic Problem Solving: Demonstrated experience in profiling, debugging, and optimizing complex distributed systems to eliminate performance bottlenecks
  • Influence & Communication: Exceptional ability to communicate complex technical concepts to both highly technical peers and non-technical stakeholders
Job Responsibility
  • Architectural Strategy & Vision: Define and drive the multi-year technical roadmap for our server-side communication infrastructure
  • High-Scale Communication Infrastructure: Lead the design and implementation of backend systems optimized for receiving high-scale data from client-side apps and distributing data back to a vast ecosystem of endpoints
  • Technical Leadership & Influence: Act as a force multiplier by providing technical guidance to multiple engineering teams
  • Drive Engineering Excellence: Champion a culture of high engineering rigor, focusing on deep observability, low-latency data distribution, and runtime stability
  • Cross-Functional Collaboration: Partner with Product Management, Infrastructure, and Client-Side Engineering teams
  • Innovation & Prototyping: Spearhead the evaluation of emerging technologies and lead "proof of concept" initiatives
  • Technical Mentorship: Invest in the growth of Senior and Staff engineers
  • Strategic Customer Engagement: Support the business by leading technical deep dives with strategic customers

Principal Cloud Infrastructure Engineer

As Highspot continues to scale rapidly, building a robust and efficient platform...
Location: United States, Seattle
Salary: 188696.00 - 282609.00 USD / Year
Company: Highspot
Expiration Date: Until further notice
Requirements
  • 15+ years of experience in software or infrastructure engineering
  • At least 5 years focused on platform engineering or cloud infrastructure at scale
  • Proven success designing and operating internal developer platforms in AWS environments
  • Expert-level experience with Kubernetes, including provisioning, cluster lifecycle management, workload orchestration, and multi-tenant design
  • Strong expertise in Terraform, GitOps tools (e.g., ArgoCD), and CI/CD systems (e.g., GitHub Actions, Spinnaker)
  • Deep understanding of cloud networking, IAM, service meshes, and container orchestration at scale
  • Familiar with the CNCF landscape and how to leverage open-source tools to solve platform problems
  • Passion for developer experience
  • Track record of technical leadership, mentoring, and influencing engineering culture at a large scale
  • Bachelor's or Master’s in Computer Science or related discipline, or equivalent practical experience
Job Responsibility
  • Design and build scalable platform capabilities that empower engineering teams to ship features reliably, securely, and quickly
  • Create and maintain developer-facing tools and paved paths (e.g., CI/CD pipelines, Kubernetes platforms, observability stacks, secrets management)
  • Implement Infrastructure-as-Code and GitOps patterns to promote consistency, automation, and compliance across environments
  • Collaborate with product, security, and compliance stakeholders to build platform services that meet SLAs and governance standards
  • Drive efforts to standardize and simplify infrastructure across cloud environments (AWS, Azure), enabling secure multi-cloud operation
  • Lead incident response, reliability engineering, and observability improvements that ensure platform uptime and performance
  • Act as a technical mentor and thought leader, guiding teams on infrastructure architecture, platform adoption, and best practices
  • Define and execute on a strategic roadmap to evolve the internal platform in line with company growth and technology direction
What we offer
  • Comprehensive medical, dental, vision, disability, and life benefits
  • Health Savings Account (HSA) with employer contribution
  • 401(k) Matching with immediate vesting on employer match
  • Flexible PTO
  • 8 paid holidays and 5 paid days for Annual Holiday Week
  • Quarterly Recharge Fridays (paid days off for mental health recharge)
  • 18 weeks paid parental leave
  • Access to Coaches and Therapists through Modern Health
  • 2 volunteer days per year
  • Commuting benefits
Employment Type: Full-time

Principal Cloud Infrastructure Engineer

As Highspot continues to scale rapidly, building a robust and efficient platform...
Location: Canada, Vancouver
Salary: 170435.00 - 230435.00 CAD / Year
Company: Highspot
Expiration Date: Until further notice
Requirements
  • 15+ years of experience in software or infrastructure engineering
  • At least 5 years focused on platform engineering or cloud infrastructure at scale
  • Proven success designing and operating internal developer platforms in AWS and/or Azure environments
  • Expert-level experience with Kubernetes, including provisioning, cluster lifecycle management, workload orchestration, and multi-tenant design
  • Strong expertise in Terraform, GitOps tools (e.g., ArgoCD), and CI/CD systems (e.g., GitHub Actions, Spinnaker)
  • Deep understanding of cloud networking, IAM, service meshes, and container orchestration at scale
  • Familiar with the CNCF landscape and how to leverage open-source tools to solve platform problems
  • Passion for developer experience
  • Track record of technical leadership, mentoring, and influencing engineering culture at a large scale
  • Bachelor's or Master’s in Computer Science or related discipline, or equivalent practical experience
Job Responsibility
  • Design and build scalable platform capabilities that empower engineering teams to ship features reliably, securely, and quickly
  • Create and maintain developer-facing tools and paved paths (e.g., CI/CD pipelines, Kubernetes platforms, observability stacks, secrets management)
  • Implement Infrastructure-as-Code and GitOps patterns to promote consistency, automation, and compliance across environments
  • Collaborate with product, security, and compliance stakeholders to build platform services that meet SLAs and governance standards
  • Drive efforts to standardize and simplify infrastructure across cloud environments (AWS, Azure), enabling secure multi-cloud operation
  • Lead incident response, reliability engineering, and observability improvements that ensure platform uptime and performance
  • Act as a technical mentor and thought leader, guiding teams on infrastructure architecture, platform adoption, and best practices
  • Define and execute on a strategic roadmap to evolve the internal platform in line with company growth and technology direction
What we offer
  • Comprehensive medical, dental, vision, disability, and life benefits
  • Group Retirement Savings Plan (RRSP) and matching employer contributions (DPSP) with immediate vesting
  • Flexible PTO
  • Generous Holiday Schedule + 5 Days for Annual Holiday Week
  • Quarterly Recharge Fridays (paid days off for mental health recharge)
  • Flexible work schedules
  • Access to Coaches and Therapists through Modern Health
  • 2 Volunteer days per year
  • Monthly transportation allowance for employees that work in our Vancouver Hub location
  • Eligible for bonuses and stock options
Employment Type: Full-time

Principal Architect - Cloud and Observability

We're building a world of health around every individual — shaping a more connec...
Location: United States
Salary: 144200.00 - 288400.00 USD / Year
Company: CVS Health (https://www.cvshealth.com/)
Expiration Date: June 29, 2026
Requirements
  • 10+ years in infrastructure, cloud architecture, platform engineering, or SRE
  • 8+ years of architecture work in observability, cloud infrastructure, or both at a large enterprise
  • Solid experience with at least two of Azure, AWS, or GCP -- including networking, identity, compute, and storage
  • 5+ years with Kubernetes in production (OpenShift, EKS, AKS, or GKE)
  • 5+ years with OpenTelemetry or similar frameworks (collectors, SDKs, semantic conventions, pipeline design)
  • 5+ years with observability platforms: Grafana/Mimir/Loki/Tempo, Prometheus, Datadog, Splunk, Dynatrace, or comparable tools
  • Experience defining SLOs/SLIs and building alerting strategies at an organizational level
  • Proven track record writing architecture standards that other teams adopted and followed
  • Able to communicate clearly with both engineers and senior leadership
Job Responsibility
  • Own the enterprise observability reference architecture covering metrics, logs, traces, and events across all environments (cloud and on-prem)
  • Drive the OpenTelemetry-first instrumentation strategy -- standard libraries, semantic conventions, collector topologies (DaemonSet, gateway, sidecar), and pipeline design
  • Build and operate telemetry pipelines on Grafana Mimir, Loki, and Tempo, including multi-tenant configurations, retention policies, and capacity planning
  • Define how we measure reliability: SLOs, SLIs, error budgets, and alerting frameworks -- consistently across all lines of business
  • Own the integration between observability tooling and incident management (ServiceNow ITOM, xMatters)
  • Drive telemetry schema standards to ensure teams emit data that is useful downstream, not just technically compliant
  • Build and maintain reference architectures for our hybrid footprint: OpenShift on-prem with KVM/libvirt and Dell PowerFlex storage, plus Azure, AWS, and GCP
  • Lead standards work around workload identity and federation using SPIFFE/SPIRE and cloud-native IAM patterns to move away from static secrets
  • Provide guidance on compute runtime selection -- containers vs. VMs vs. bare metal vs. serverless -- with a clear decision framework for teams
  • Help teams connect autoscaling and capacity planning behavior to actual telemetry signals
What we offer
  • medical, dental, and vision coverage
  • paid time off
  • retirement savings options
  • wellness programs
  • other resources, based on eligibility
  • bonus, commission or short-term incentive program
  • equity award program
Employment Type: Full-time

Principal Software Engineer, AI Cloud

At Docker, we make app development easier so developers can focus on what matter...
Location: United States, Seattle
Salary: 232000.00 - 319000.00 USD / Year
Company: Docker
Expiration Date: Until further notice
Requirements
  • 10+ years of software engineering experience, including 3+ years in technical leadership roles (Staff or Principal level)
  • Proven experience designing and building highly scalable distributed systems in production environments
  • Deep understanding of cloud infrastructure (AWS, Azure, GCP, or OCI), including compute, networking, and storage primitives
  • Proficiency in Go, Rust, or Java
  • Expertise in Kubernetes, microservices, and service mesh architectures
  • Strong foundation in observability, CI/CD, and infrastructure-as-code (Terraform, Pulumi, or CloudFormation)
  • Experience operating high-availability (99.99%+) production systems
  • Exceptional communication skills and ability to influence across technical and business domains
  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience
Job Responsibility
  • Define and drive the long-term technical strategy for Docker AI Cloud’s control and data plane services
  • Architect highly available, multi-region systems capable of operating seamlessly across multiple cloud providers
  • Design APIs and service abstractions that integrate Docker Desktop, Hub, and enterprise cloud services
  • Establish standards for reliability, scalability, and observability across the Docker AI Cloud platform
  • Lead cross-functional technical discussions and influence architectural decisions company-wide
  • Design and implement distributed systems for workload orchestration, service discovery, and lifecycle management
  • Build and operate control plane components that manage multi-tenant workloads and cloud networking
  • Develop infrastructure that delivers predictable performance, intelligent scaling, and automated failover
  • Ensure security, data integrity, and compliance across Docker’s global infrastructure footprint
  • Partner with platform and product teams to deliver developer-friendly APIs and cloud experiences
What we offer
  • Freedom & flexibility: fit your work around your life
  • Designated quarterly Whaleness Days plus end of year Whaleness break
  • Home office setup: we want you comfortable while you work
  • 16 weeks of paid Parental leave
  • Technology stipend equivalent to $100 net/month
  • PTO plan that encourages you to take time to do the things you enjoy
  • Training stipend for conferences, courses and classes
  • Equity
Employment Type: Full-time

Sr Principal Engineer Software (Cortex Cloud)

As a Senior or Sr Principal Software Engineer in Cortex Cloud, you will contribu...
Location: Israel, Tel Aviv
Salary: Not provided
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • Backend Engineering: 8+ years of experience building and maintaining production-grade distributed systems
  • Languages: Proficiency in Go (Golang) is a strong advantage. We are open to engineers with deep expertise in other backend languages (Java, Python, Rust, C#, or Node.js) who are willing to transition to a Go-primary stack and have a focus on clean, well-tested code
  • Fundamentals: Strong grasp of system design, data structures, and algorithms in high-scale cloud environments
  • Standards: Experience with CI/CD, comprehensive testing (unit, integration, E2E), and rigorous code reviews
  • Cloud: Proficiency in AWS, GCP, or Azure, including cloud-native services
  • Reliability: Experience with observability (monitoring, logging, tracing) and system profiling
  • Education: B.Sc. or M.Sc. in Computer Science, Software Engineering, or equivalent technical/military experience
Job Responsibility
  • Contribute to the development and scaling of cloud-native security solutions for enterprise organizations
  • Work within an established team to evolve a high-traffic product, with a focus on refining architecture, optimizing the technology stack, and maintaining engineering standards
  • Write reliable code, influence product direction, and design distributed systems
  • Make technical decisions that impact the long-term stability and performance of cloud workload protection services
Employment Type: Full-time

Principal Engineer - Edge Delivery & Observability

The FT is looking for a Principal Engineer (Individual Contributor) to lead our ...
Location: United Kingdom, London
Salary: Not provided
Company: Financial Times
Expiration Date: Until further notice
Requirements
  • Experience technically leading teams and projects
  • Effective communicator, able to break down tasks and to give and receive constructive feedback
  • Customer focused, with a strong focus on building and running reliable, stable and secure systems
  • Enthusiastic about operability & monitoring of systems and Cloud infrastructure
  • Experience with AWS, Splunk, Grafana, Prometheus, Cloudflare, Route53, Python and Go (or equivalent tools) is beneficial
Job Responsibility
  • Provide technical direction and support to teams across the group
  • Lead one or two feature teams to deliver quality tooling and products that reduce developer toil
  • Work closely with the people manager within the teams
  • Engage with other disciplines (e.g. delivery, product management) and teams across FT to make sure we are all working together effectively
  • Model and help set and reinforce our inclusive, respectful, multidisciplinary and open culture
  • Help continuously improve our technology, process and culture, take ownership of problems and see solutions through to completion
  • Manage and maintain strong relationships with vendors
  • Gain a deep understanding of the FT as a business and use that knowledge to communicate clearly with your peers, reports, and senior management
  • Actively collaborate across teams both within and outside of I&O
  • Contribute to company-wide processes, frameworks, and guidelines
What we offer
  • A competitive bonus incentive scheme
  • Extensive learning and development opportunities including 10% time, tech talks, internal conferences and opportunities to attend external conferences and training
  • 25 days annual leave, increasing to 30 days after 2 years’ service
  • Generous parental leave
  • Very competitive pension plan, with the company doubling your contribution