
Principal Network Engineer, Operations & Observability

United States, Englewood · 60.24 - 89.60 USD / Hour · Job Posted May 10, 2026

Job Description

The System Engineer job family has responsibility for infrastructure/technical planning, implementation and support activities for systems owned by the CommonSpirit Health Technology Infrastructure team. Specific responsibilities include installing and supporting system hardware and software, performing system upgrades, and evaluating and installing patches and software updates. Responsibilities also include operational support activities such as resolving software and hardware related problems, managing backup and recovery activity, administering technology layers, managing monitoring and alerting functions, performing capacity planning and conducting version management. System Engineers work closely with architects, infrastructure support, database administrators and application support teams to ensure seamless and quality IT support for CommonSpirit Health customers and alignment with CommonSpirit Health's IT standards, controls and governance.

To be successful, individuals must possess a combination of technical, business and leadership skills. This requires an understanding of customers' business needs, processes, and functions. They also require a solid knowledge of IT infrastructure, architecture, applications development and support, networks, and computer operations. In addition, individuals working in this job family must possess excellent communication skills and the ability to influence others.

The Principal System Engineer is considered a subject matter expert in the enterprise and multiple technology areas, platforms and functions. They are responsible for maintaining a deep awareness and understanding of emerging trends and technologies in IT and Healthcare. Assignments span the enterprise as the principal system engineer will be responsible for standards and technical roadmap development.

This senior role focuses on the strategic architecture, administration, and continuous improvement of the organization's network operations tooling ecosystem. The scope includes platforms such as Cisco Catalyst Center, AKIPS, NNMi, OMi, ThousandEyes, and other related network management, monitoring, and observability technologies.

Job Responsibility

  • Platform Lifecycle Management
  • Enterprise Architecture and Strategy
  • Future-State Vision
  • Strategy and Roadmap
  • Architectural Standards
  • Collaboration and Operational Model
  • Develop organizational policies, standards, and guidelines for methods and tools
  • Determine testing policy
  • Set the release policy for the organization
  • Maintain primary responsibility for strategic planning, technical roadmap development, standards and architecture
  • Oversee efforts with key vendors to understand future application product plans
  • Perform project resourcing, oversight and management
  • Coach the team on personal development and develop training strategies and schedules
  • Initiate methods and approaches to meet defined business objectives
  • Work on, and possibly lead, multiple projects that may span the enterprise
  • Identify automation opportunities and implement scripted solutions
  • Serve as an escalation point for complex requests and issues
  • Provide overall ownership of the Change Management process for the System Engineer team
  • Provide exceptional customer service to CommonSpirit end users
  • Act as primary conduit to oversee support efforts between System Engineering and other CommonSpirit Health teams
  • Schedule and manage all performance tuning and troubleshooting efforts
  • Review, recommend and monitor the source code/versioning management function
  • Provide overall technical ownership for all support and project issues and responsibilities within the System Engineering teams
  • Design, implement and maintain a comprehensive monitoring and alerting process across all Technology Infrastructure platforms
  • Utilize standard tools and methodology to develop system and support performance metrics
  • Demonstrate comprehensive knowledge & expertise with CommonSpirit business processes and routines
  • Perform crisis management during high-severity operational incidents
  • Perform meeting facilitation for staff, operational support and project meetings
  • Complete assignments as required by the Director / Manager
  • May require on-call coverage responsibilities

Requirements

  • Bachelor of Arts degree or equivalent experience
  • 10 years of professional IT experience in an IT technical or infrastructure field
  • 5+ years Unix operational experience (Solaris, AIX, Linux)
  • 5+ years Windows Server operational experience

Nice to have

  • Healthcare industry experience

What we offer

  • medical
  • prescription drug
  • dental
  • vision plans
  • life insurance
  • paid time off
  • tuition reimbursement
  • retirement plan benefit(s) including 401(k), 403(b), and other defined benefits offerings


Similar Jobs for Principal Network Engineer, Operations & Observability

8 matching positions

Principal Engineer I - Cloud Observability

We’re not just building better tech. We’re rewriting how data moves and what the...
Location: India
Salary: Not provided
Confluent
Expiration Date: Until further notice
Requirements
  • 15+ years of hands-on software development experience, with the ability to anticipate future technical needs for the product and craft plans to realize them
  • Taking ideas to production is something we look for
  • Ready to roll up your sleeves - code, debug, design - do whatever it takes to ship the product to production
  • Experience building and operating large-scale systems. Solid understanding of basic systems operations (disk, network, operating systems, etc). Experience running production services in the cloud
  • Strong fundamentals in distributed systems design and development. Solid fundamentals in concurrent and multi-threading programming
  • A self-starter with the ability to work effectively in teams. Proactively identifies the symptoms of technical issues, reasons about their causes, and then fixes the root causes
  • Timely shipping of deliverables
  • Ability to trade off short-term technical decisions against the long term. Move fast, build in increments, and iterate. A sense of urgency, a mindset towards achieving results, and excellent prioritization skills
  • Ability to influence the team, peers and upper management in technology decisions using effective communication and collaborative techniques
  • Degree in Computer Science, Engineering or equivalent experience. Understanding of various technologies, programming paradigms and frameworks is needed. Ability to be pragmatic and trade off their usage in production is essential
Job Responsibility
  • You will work with a team of engineers and architects to help evolve Confluent Observability features
  • Work closely with product management, engineering leadership, and other key stakeholders across various teams in Confluent to build and drive the overall roadmap
  • Be a strong technical voice for Confluent Observability across the rest of Confluent
  • Influence the overall domain health and operational hygiene for Confluent Observability
  • We need a tech champion for the observability capabilities we provide to our customers
  • You are expected to review designs and code and improve our technical standards
  • We are looking at you to lead the technology charter for our observability features in Confluent Cloud and in hybrid scenarios with Confluent Platform
  • Mentor a team of high-performing engineers and leads, helping them continue growing their skill set through hands-on experience and mentorship
  • Be a strong technical leader and representative for engineering teams in India
  • Provide timely and productive feedback, encourage a growth mindset, and advise team members in setting and working toward personal development goals
What we offer
  • Remote-First Work
  • Robust Insurance Benefits
  • Flexible Time Away
  • The Best Teammates
  • Experience Ambassadors
  • Open and Honest Culture
  • Well-Being and Growth
  • Fulltime

Principal Engineer

Wells Fargo is seeking a Principal Engineer to join the Consumer Technology grou...
Location: United States, Irving; Chandler; Charlotte
Salary: Not provided
Wells Fargo
Expiration Date: July 02, 2026
Requirements
  • 7+ years of Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 7+ years of alarm scripting or alerting tool(s) experience
  • 3+ years of experience setting up distributed tracing across an internet topology for full health check and with the ability to pinpoint problem source
Job Responsibility
  • Act as an advisor to leadership to develop or influence applications, network, information security, database, operating systems, or web technologies for highly complex business and technical needs across multiple groups
  • Lead the strategy and resolution of highly complex and unique challenges requiring in-depth evaluation across multiple areas or the enterprise, delivering solutions that are long-term, large-scale and require vision, creativity, innovation, advanced analytical and inductive thinking
  • Translate advanced technology experience, an in-depth knowledge of the organization's tactical and strategic business objectives, the enterprise technological environment, the organization structure, and strategic technological opportunities and requirements into technical engineering solutions
  • Provide vision, direction and expertise to leadership on implementing innovative and significant business solutions
  • Maintain knowledge of industry best practices and new technologies and recommend innovations that enhance operations or provide a competitive advantage to the organization
  • Strategically engage with all levels of professionals and managers across the enterprise and serve as an expert advisor to leadership
  • Fulltime

Principal Engineer I – Senior Azure Databricks Administrator

Software Resources has an immediate, direct hire job opportunity for a Principal...
Location: United States, Phoenix
Salary: Not provided
Software Resources
Expiration Date: Until further notice
Requirements
  • 8+ years of related experience in data analytics administration and development
  • 4+ years of Databricks related experience
  • Bachelor’s degree in related field required
  • Advanced proven experience in Azure Databricks (Workspace management, Clusters, Jobs, Unity Catalog, Delta Lake, User access management, REST APIs and SDKs)
  • Knowledge of MLflow & MLOps
  • Deep understanding of Azure infrastructure and data services, including Azure Data Lake, Azure Data Factory, Azure SQL, Azure Synapse Analytics, Azure Key Vault, Azure Monitor, networking
  • Experience with CI/CD pipelines (Azure DevOps preferred)
  • Strong programming skills in SQL, Python, and/or PySpark
  • Advanced proven experience in leading cross-functional teams and managing multiple projects simultaneously
  • Advanced ability to see the big picture and align projects with organizational goals
Job Responsibility
  • Responsible for delivery and operations of technologies and platforms required to model, transform, analyze, report, visualize data
  • Provide SME expertise in designing, building, optimizing, streamlining and automating the Azure Databricks platform
  • Partner with ML engineers, data scientists, data analysts, and enterprise architects to provide frameworks, set standards, enforce best practices, train and enable users
  • Develop technical skills of one or more junior team-members
  • Take assignments that can be worked on individually without supervision, and manage work effort from concept to completion
  • Design, build, optimize, automate and maintain the Azure Databricks platform, ensuring scalability, security, governance and performance
  • Design, implement and manage Azure Databricks workspaces, clusters, jobs, access management
  • Design, implement and manage policies, monitoring and observability
  • Implement data analytics principles aimed at business enablement, reliability practices and sound recovery procedures
  • Ensure compliance with IT policies, procedures, and industry standards
What we offer
  • Competitive salaries
  • An ownership stake in the company
  • Medical and dental insurance
  • Time off
  • A great 401k matching program
  • Tuition assistance program
  • An employee volunteer program
  • A wellness program
  • Fulltime

Principal Engineer

The Principal AI/ML Operations Engineer leads the architecture, automation, and ...
Location: United States, Pleasanton, California
Salary: 251000.00 - 314500.00 USD / Year
BlackLine
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field
  • 10+ years in ML infrastructure, DevOps, and software system architecture
  • 4+ years in leading MLOps or AI Ops platforms
  • Strong programming skills in languages such as Python, Java, or Scala
  • Expertise in ML frameworks (TensorFlow, PyTorch, scikit-learn) and orchestration tools (Airflow, Kubeflow, Vertex AI, MLflow)
  • Proven experience operating production pipelines for ML and LLM-based systems across cloud ecosystems (GCP, AWS, Azure)
  • Deep familiarity with LangChain, LangGraph, ADK or similar agentic system runtime management
  • Strong competencies in CI/CD, IaC, and DevSecOps pipelines integrating testing, compliance, and deployment automation
  • Hands-on with observability stacks (Prometheus, Grafana, New Relic) for model and agent performance tracking
  • Understanding of governance frameworks for Responsible AI, auditability, and cost metering across training and inference workloads
Job Responsibility
  • Define enterprise-level standards and reference architectures for ML-Ops and AIOps systems
  • Partner with data science, security, and product teams to set evaluation and governance standards (Guardrails, Bias, Drift, Latency SLAs)
  • Mentor senior engineers and drive design reviews for ML pipelines, model registries, and agentic runtime environments
  • Lead incident response and reliability strategies for ML/AI systems
  • Lead the deployment of AI models and systems in various environments
  • Collaborate with development teams to integrate AI solutions into existing workflows and applications
  • Ensure seamless integration with different platforms and technologies
  • Define and manage MCP Registry for agentic component onboarding, lifecycle versioning, and dependency governance
  • Build CI/CD pipelines automating LLM agent deployment, policy validation, and prompt evaluation of workflows
  • Develop and operationalize experimentation frameworks for agent evaluations, scenario regression, and performance analytics
What we offer
  • short-term and long-term incentive programs
  • robust offering of benefit and wellness plans
  • Fulltime

Principal Software Engineer

The Principal Software Engineer is the senior-most hands-on technical leader for...
Location: India, Chennai
Salary: Not provided
RX Global
Expiration Date: Until further notice
Requirements
  • Proven experience as a senior technical leader across multiple teams/services within a bounded domain
  • Strong polyglot background (e.g., C#/.NET, Java, JavaScript/Node) and ability to choose fit-for-purpose technologies
  • Experience modernising systems: migrating from legacy architectures to cloud-native patterns, reducing technical debt, and decommissioning safely
  • Experience in systems analysis, design and a solid understanding of development, quality assurance and integration methodologies
  • Experience developing integrated solutions within a broad technical and business context of significant impact
  • Experience evaluating third-party services and platforms (security, cost, operations, integration complexity)
  • Experience leading cross‑team architectural change, platform adoption, or measurable improvements to reliability/cost/performance (with before/after metrics)
  • Familiarity with responsible AI usage in engineering workflows (policy/guardrails, data privacy, human‑in‑the‑loop review)
  • Bachelor’s/Master’s degree in Computer Science (or related) or equivalent professional experience
  • Expert software design skills: SOLID, DDD, event-driven architecture patterns, modular design, and maintainable codebases
Job Responsibility
  • Engineering Leadership & Culture: Create an environment where teams can do their best work by removing blockers, improving engineering practices, and contributing to a culture of psychological safety and high standards
  • Mentor and coach engineers across teams—especially senior engineers and emerging tech leads—in architecture, systems thinking, and operational excellence
  • Promote strong technical ownership ("you build it, you run it"), including operational readiness and post-incident learning
  • Support scalable knowledge-sharing mechanisms (e.g., tech talks, playbooks, templates, reference implementations)
  • Participate in hiring loops and help onboard new engineers into domain patterns and practices
  • Provide hands-on contributions where needed (prototypes, reference implementations, complex refactors, high-risk changes)
  • Guide teams in decomposition and sequencing to reduce delivery risk; support estimation/sizing and technical discovery
  • Leads through influence; demonstrates integrity, accountability, and constructive challenge
What we offer
  • Comprehensive Health Insurance: Covers you, your immediate family, and parents
  • Enhanced Health Insurance Options: Competitive rates negotiated by the company
  • Group Life Insurance: Ensuring financial security for your loved ones
  • Group Accident Insurance: Extra protection for accidental death and permanent disablement
  • Flexible Working Arrangement: Achieve a harmonious work-life balance
  • Employee Assistance Program: Access support for personal and work-related challenges
  • Medical Screening: Your well-being is a top priority
  • Modern Family Benefits: Maternity, paternity, and adoption support
  • Long-Service Awards: Recognizing dedication and commitment
  • New Baby Gift: Celebrating the joy of parenthood
  • Fulltime

Principal Software Engineer

Microsoft is a company where passionate innovators come to collaborate, envision...
Location: United States, Redmond
Salary: 139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check
  • 6+ years of hands‑on software engineering experience, with significant time spent on distributed systems and cloud infrastructure
  • Deep Kubernetes expertise including Kubernetes internals (control plane, scheduling, networking, storage), containers and cloud‑native architectures, advanced Kubernetes networking, traffic management, and reliability patterns
  • Strong background in distributed systems design, including consistency, fault tolerance, scalability, and performance tradeoffs
  • Proven experience owning live production systems, including on‑call rotations, incident mitigation, and operational excellence
  • Proficiency in one or more systems languages (e.g., Go, C++, C#, or equivalent).
Job Responsibility
  • Lead the design, development, and operation of cloud-native platform components on Kubernetes, with a focus on reliability, networking, security, and observability at scale
  • Drive end-to-end architecture for large-scale, distributed systems supporting globally deployed services
  • Design and deliver highly available, scalable systems, ensuring strong performance, resilience, diagnosability, and cost efficiency
  • Provide technical leadership in Kubernetes-based infrastructure, including service-to-service communication, traffic management, and resiliency patterns
  • Partner across engineering, platform, and infrastructure teams to define and execute on cross-organizational technical strategy and long-term investments
  • Guide engineering excellence through design reviews, code reviews, and implementation of best practices across the team
  • Contribute to the development and operation of systems that incorporate emerging technologies, including AI-enabled capabilities, in a scalable and reliable manner
  • Coach and mentor engineers, fostering technical growth and raising the overall quality bar across the organization
  • Drive continuous improvement in system design, operational practices, and engineering processes
  • Model Microsoft’s culture and values in all aspects of work.
  • Fulltime

Azure Principal Platform Engineer

As an Azure Principal Platform Engineer, you will act as the authoritative Subje...
Location: United Kingdom; Spain, Greater London; England; Spain
Salary: Not provided
Parser Limited
Expiration Date: Until further notice
Requirements
  • Recent, hands-on experience operating multi-cluster AKS in production environments, either multi-region or multi-tenant
  • Proven experience building and architecting complex Kubernetes platforms from scratch
  • Experience with Kubernetes platform engineering, including ingress and service mesh, OPA / Gatekeeper policy, secrets management, and progressive delivery
  • Strong security awareness, displaying comfort with CAF / ALZ patterns, least-privilege IAM, network segmentation, and policy-as-code
  • Experience working with regulated or financial services workloads on Azure
  • FinOps instincts for managing AKS and networking costs effectively
Job Responsibility
  • Architect and Bootstrap: Design and provision a greenfield, highly-scalable, multi-tenant AKS platform from the ground up (focusing on underlying cluster architecture, not just workload deployment)
  • Act as the Kubernetes Reference (SME): Serve as the authoritative internal expert for the vast Kubernetes ecosystem, filling a critical knowledge gap and guiding long-term architectural direction
  • Establish a Platform Operating Model: Help structure and establish an Internal Developer Platform (IDP), defining how the new platform team will interact with and empower developer teams through self-service capabilities
  • Upskill and Mentor: Transition the in-house engineering team into a high-performing internal platform team throughout the platform build process
  • Observability: Design and implement enterprise-grade logging, metrics, and tracing for Kubernetes at scale
  • IaC Leadership: Implement and manage Infrastructure as Code to a senior standard, taking charge of state strategy, module design, and drift management
What we offer
  • The chance to join an organization with triple-digit growth that is changing the paradigm of how software products are built
  • The opportunity to form part of an amazing, multicultural community of tech experts
  • A highly competitive compensation package
  • A flexible working environment
  • Medical insurance
  • Fulltime

Principal Product Engineer

We're looking for a Principal Product Engineer who pairs deep engineering craft ...
Location: Portugal, Lisbon
Salary: Not provided
Tripadvisor
Expiration Date: Until further notice
Requirements
  • 10+ years of software experience, with significant time still spent hands-on in code - track record of shipping product-impacting work end-to-end, not just owning a layer
  • Real depth in the mobile app ecosystem (iOS and/or Android, with strong fluency in Swift and/or Kotlin and the surrounding ecosystem - offline sync, push, auth, persistence, networking, REST/GraphQL) - and credible breadth beyond it
  • Demonstrated breadth: you've worked seriously in at least one of {web frontend, backend services, data/infra, platform tooling} alongside mobile, and can hold your own in code review there
  • Strong product judgment: you've made calls about what not to build, and can defend them with evidence
  • Comfort troubleshooting in production across stacks - crash analysis, latency tracing, release-health debugging
  • Excellent cross-functional collaboration; you make the people around you better.
Job Responsibility
  • Identify, scope, and ship the changes that move business metrics - across mobile, web, services, and data layers
  • Architect long-lasting systems that hold up under real production conditions: performance, reliability, scalability, offline behavior, consistency
  • Lead technical design reviews across teams, weighing trade-offs not just in code but in product impact, time-to-ship, and operational cost
  • Drive operational maturity wherever it's weakest - release management, observability, incident response, performance monitoring - including in the mobile apps
  • Partner with PMs, designers, and engineering leaders to shape what we build, why, and in what order; you're a peer in those conversations, not a downstream implementer
  • Set the technical bar for the org by example: write the prototype, prove the pattern, then teach it
  • Communicate trade-offs clearly to engineers, product partners, and senior stakeholders
What we offer
  • Competitive compensation packages (routinely benchmarked against the latest industry data), including base salary and annual bonuses
  • “Work your way” with flexibility to suit your lifestyle. Tripadvisor Group takes a remote-friendly approach to collaboration across a worldwide team, with the option to join on-site as often as you’d like or as required by your team.
  • Flexible schedule. Work-life balance is ingrained in our culture by design. Trust and accountability make it work.
  • Donation matching. Give back? Give more! We match qualifying charitable donations annually.
  • Tuition assistance. Want to level up your career? We love to hear it! Receive annual support for qualified programs.
  • Lifestyle benefit. An annual benefit to spend on yourself. Use it on travel, wellness, or whatever suits you.
  • Travel perks. We believe that travel is employee development, so we provide discounts and more.
  • Employee assistance program. We’re here for you with resources and programs to help you through life’s challenges.
  • Health benefits. We offer great coverage and competitive premiums.
  • Generous referral scheme. Help us grow and be rewarded with generous awards for referring successful candidates.