Principal Systems Reliability Engineer, Secure Federal Operations

United States, Herndon · 114800.00 - 207200.00 USD / Year · Job Posted February 04, 2026

Job Description

The Principal Systems Reliability Engineer is responsible for designing and implementing secure, scalable, and reliable technology solutions across cybersecurity, system architecture, networking, and platform operations. It combines expertise in security architecture, end-to-end solution design, and DevSecOps/SRE practices to protect digital assets, enable cross-domain integration, and optimize IT services. The position ensures the reliability and performance of software and systems supporting IT services by managing scalability, availability, latency, and security. It involves designing and maintaining continuous integration and continuous delivery (CI/CD) pipelines, supporting cloud-native application development, and driving operational excellence through automation and proactive monitoring. This role differentiates itself by combining strategic system design with hands-on operational improvements and automation expertise. Success is measured by improved security posture, operational efficiency, faster software delivery, and enhanced customer experience—directly impacting organizational service quality and customer satisfaction.

Job Responsibility

  • Develop and implement system designs and architectures to improve software delivery speed and operational efficiency
  • Lead architecture for cross-domain programs, ensuring alignment with enterprise standards
  • Build and operate cloud-native platforms (Kubernetes, service mesh, ingress, policy engines)
  • Implement network segmentation, firewalls, VPNs, and Zero Trust principles
  • Contribute to advancing software delivery processes including cloud enablement and microservices containerization
  • Deliver software solutions that enhance service availability, scalability, latency, and efficiency
  • Manage environment provisioning and pipeline configurations to support automated server deployment
  • Perform other duties and projects as assigned by business management

Requirements

  • 7+ years of progressive experience in systems architecture, platform engineering, or site reliability engineering, with a strong focus on security and operational excellence
  • Experience designing and implementing secure, scalable, and highly available systems across hybrid and cloud environments (Azure, AWS, or GCP)
  • Experience in automation and scripting using Python, Go, PowerShell, or Bash
  • Knowledge of imaging processes and asset lifecycle management, including provisioning, patching, and compliance tracking (preferred)
  • Strong background in network architecture and security, including segmentation, VPNs, firewalls, and Zero Trust principles (preferred)
  • Experience with DevOps tools such as Ansible, Chef, or Puppet; experience with Docker and Kubernetes is preferred
  • Experience with Application Performance Monitoring (APM) tools such as AppDynamics, and logging/observability tools like Splunk for troubleshooting and performance analysis
  • Experience working in a cloud environment (public/private)
  • Ability to influence technology direction, lead architecture reviews, and collaborate across multiple teams (preferred)
  • Experience in incident and problem management, root cause analysis, and disaster recovery planning (preferred)
  • US citizenship (without dual citizenship)
  • At least 18 years of age and legally authorized to work in the United States
  • Active security clearance or ability to obtain one
  • Bachelor's degree in an area of study such as Computer Science, Engineering, or IT, plus 7 years of related work experience, OR an advanced degree with 5 years of related experience

What we offer

  • Medical, dental and vision insurance
  • Flexible spending account
  • 401(k)
  • Employee stock grants
  • Employee stock purchase plan
  • Paid time off and up to 12 paid holidays
  • Paid parental and family leave
  • Family building benefits
  • Back-up care
  • Enhanced family support
  • Childcare subsidy
  • Tuition assistance
  • College coaching
  • Short- and long-term disability
  • Voluntary AD&D coverage
  • Voluntary accident coverage
  • Voluntary life insurance
  • Voluntary disability insurance
  • Voluntary long-term care insurance
  • Mobile service & home internet discounts
  • Pet insurance
  • Access to commuter and transit programs

Similar Jobs for Principal Systems Reliability Engineer, Secure Federal Operations

8 matching positions

Federal Principal Product Support Engineer

Hewlett Packard Enterprise is seeking a master-level Principal Product Support E...
Location
United States, Multiple locations
Salary
152000.00 - 349000.00 USD / Year
Hewlett Packard Enterprise (https://www.hpe.com/)
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in a technical field or equivalent experience demonstrating advanced expertise
  • US Citizenship with active Secret clearance
  • Industry-recognized certifications including CompTIA Security+ (or higher, such as CASP or CISSP)
  • Cloud certifications (e.g., AWS Certified Solution Architect, Microsoft Azure Solution Architect, Google Professional Cloud Architect)
  • 10+ years of hands-on experience in IT support, cloud architecture, virtualization, or related areas
  • Proven record of resolving deeply technical issues and leading support for federal customers
Job Responsibility
  • Serve as the top-tier escalation point for the most challenging technical issues within HPE Private Cloud and related technologies
  • Lead in-depth troubleshooting across multi-cloud, virtualization, and infrastructure platforms (AWS, Azure, Google Cloud, VMware ESX, Kubernetes)
  • Collaborate directly with BU engineering teams and managed services personnel to drive resolution of systemic, high-impact issues
  • Develop critical patches and product enhancements
  • Analyze, identify, and architect solutions for recurring or complex customer issues
  • Mentor and guide technical support engineers
  • Work closely with federal customer stakeholders to understand business needs and deliver technical solutions
  • Define, refine, and continuously improve operational processes for cloud service delivery
  • Monitor, analyze, and optimize system performance, security, and reliability
  • Author detailed technical documentation, troubleshooting guides, and process improvements
What we offer
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive benefits suite supporting physical, financial and emotional wellbeing
  • Career development programs
  • Fulltime

Principal Software Engineer

As a Principal AI Architect, you will define and drive the end-to-end Cloud + AI...
Location
United States, Redmond
Salary
142800.00 - 274800.00 USD / Year
Microsoft Corporation (https://www.microsoft.com/)
Expiration Date
Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Job Responsibility
  • Architecture Strategy & Roadmap (Cloud + AI + Agentic)
  • Own the reference architecture and technical roadmap for AI/Agentic platform capabilities (e.g., orchestration, skills, tools/plugins, memory, retrieval, evaluation, observability, governance)
  • Translate Customer Success business objectives into platform investments and architectural decisions, balancing speed-to-value with security, compliance, cost, and long-term maintainability
  • Establish clear architectural guardrails and decision frameworks (e.g., “build vs. buy,” “Copilot Studio vs. Foundry,” “RAG vs. fine-tune,” “central vs. federated patterns”)
  • Technical Leadership & Architectural Governance
  • Lead architecture/design reviews for major initiatives; drive alignment on system boundaries, contracts, dependency management, and resiliency
  • Define and standardize architecture patterns (multi-tenant SaaS, event-driven architectures, secure data access, model routing, agent safety controls)
  • Create reusable templates, “golden paths,” and reference implementations to accelerate engineering delivery across teams and reduce fragmentation
  • Responsible AI, Security, Privacy, and Compliance by Design
  • Fulltime

Principal Software Engineer

Do you have a passion for large-scale services and working with some of Microsof...
Location
United States, Reston
Salary
139900.00 - 274800.00 USD / Year
Microsoft Corporation (https://www.microsoft.com/)
Expiration Date
Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Security Clearance Requirements: Candidates must be able to meet Microsoft, customer, and/or government security screening requirements for this role
  • The successful candidate must have an active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • This position requires successful verification of the stated security clearance to meet federal government customer requirements
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • This position requires verification of U.S. citizenship due to citizenship-based legal restrictions
  • Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 10+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Job Responsibility
  • Demonstrates expertise in distributed systems design, interactions between cloud technology layers and components, common dependencies at scale, and the code that defines infrastructures
  • Develops an understanding of the code, features, and operations of specific products at scale as required to contribute to incremental improvements in product availability, reliability, efficiency, observability, and/or performance
  • Researches and maintains an awareness of industry trends, advances in distributed systems and cloud technologies, new tools, and/or processes for maintaining and improving product availability, reliability, efficiency, observability, and/or performance
  • Leverages technical expertise in large scale distributed systems and specific products, as well as objective insights drawn from analyses of production telemetry data to suggest changes or add-ons to product features or code to improve the availability, reliability, efficiency, observability, and performance of product components or features supported by their team
  • Develops and tests basic changes to optimize code and improve the observability, reliability and operability of a defined range of platform, system, or product components or features with direction from other engineers
  • Engages with product engineering teams by participating in code/design reviews, regular meetings, on-call rotations, and incident responses throughout product development and operations cycles
  • Independently develops code or scripts that automate the performance of repetitive and easily scalable operations processes (e.g., monitoring, alerting, deploying products and updates) across components and features of products operating at scale
  • Leverages technical expertise and telemetry analysis across a range of components and/or features to identify patterns and opportunities to implement configuration and data changes for one or more platforms, systems, or products in production using code, tooling, and automation
  • Identifies opportunities to leverage existing tools and automation to enable product engineering teams to increase the velocity with which they can reliably and safely implement changes in production
  • Designs, develops, and maintains telemetry pipelines and monitoring tools that detail operations metrics (e.g., availability, reliability, performance, efficiency) of product components and features operating at scale
  • Fulltime

Senior Principal Software Engineer, Infrastructure

At Docker, we make app development easier so developers can focus on what matter...
Location
United States, Seattle
Salary
251000.00 - 352000.00 USD / Year
Docker (docker.com)
Expiration Date
Until further notice
Requirements
  • 12+ years of software engineering experience with demonstrated expertise across multiple platform domains (identity, billing, data, infrastructure)
  • Proven track record architecting and delivering large-scale distributed systems serving millions of users and thousands of enterprise customers
  • Deep expertise in at least two of: identity/access management systems, billing/monetization platforms, data platforms, or cloud infrastructure
  • Broad working knowledge across all platform domains with ability to make sound architectural decisions spanning multiple areas
  • Expert-level understanding of API design, service architecture, and system integration patterns at scale
  • Experience with cloud platforms (AWS, GCP, or Azure) and modern infrastructure patterns (Kubernetes, service mesh, infrastructure-as-code)
  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience
  • Track record of establishing strategic technical plans that directly enabled business outcomes (revenue growth, cost reduction, market expansion)
  • Experience translating business strategy into technical architecture and roadmaps
  • Demonstrated ability to identify and prioritize investments that provide maximum platform leverage
Job Responsibility
  • Define and own the multi-year technical vision for Docker's foundational platform, encompassing accounts, billing, data, enterprise governance, and infrastructure
  • Establish strategic plans and objectives for major platform initiatives, making architectural decisions that ensure effective achievement of Docker's business objectives
  • Contribute to and drive the strategic vision in collaboration with the VP of Engineering, translating organizational strategy into technical roadmaps that span multiple teams and years
  • Identify and prioritize platform investments that provide maximum leverage—capabilities built once that enable rapid iteration across all Docker products
  • Develop architectural principles and standards that guide technical decisions across the Bridge organization and influence product engineering teams
  • Anticipate future business needs and ensure platform architecture provides the flexibility to support Docker's evolving commercial models
  • Lead large cross-company programs that require coordination across Desktop, Hub, AI, Security, Cloud, and Platform teams
  • Architect the unified platform interfaces ("Control Planes") that enable product teams to answer canonical questions like "Can this user access this feature?" or "How much has this organization consumed?" without understanding underlying complexity
  • Drive convergence of fragmented systems across Docker—replacing product-specific implementations with shared platform capabilities for authentication, authorization, billing, and observability
  • Establish technical contracts between platform and product teams that enable independent velocity while ensuring consistency and reliability
What we offer
  • Freedom & flexibility: fit your work around your life
  • Designated quarterly Whaleness Days plus end of year Whaleness break
  • Home office setup: we want you comfortable while you work
  • 16 weeks of paid Parental leave
  • Technology stipend equivalent to $100 net/month
  • PTO plan that encourages you to take time to do the things you enjoy
  • Training stipend for conferences, courses and classes
  • Equity
  • Fulltime

Principal IAM Engineer

The IAM Principal Engineer is responsible for driving the development, maintenan...
Location
United States, Mount Laurel
Salary
142361.11 - 213541.67 USD / Year
Comcast (comcastcorporation.com)
Expiration Date
Until further notice
Requirements
  • Over 10 years of experience implementing SailPoint IdentityIQ
  • More than 5 years of experience designing, architecting, implementing, operating, and maintaining Radiant Logic Virtual Directory Service (VDS), including Federated Identity Management (FIM) and Identity Correlation and Synchronization (ICS)
  • Skilled in integrating data sources and applications into VDS, configuring data access views and permissions, and performing identity correlation and synchronization
  • Strong knowledge of LDAP, Active Directory services, Multi-Factor Authentication (MFA), risk-based authentication, and privileged access management
  • Deep understanding of Identity and Access Management (IAM) across authentication, authorization, endpoint security, network security, and policy engines
  • Technical expertise with Microsoft MFA, SailPoint, CyberArk, ForgeRock, Okta, Ping Identity, Active Directory, Azure Active Directory, AWS, Google Cloud Platform, Microsoft Azure, and cross-domain IDM integrations
  • Solid grasp of cloud identity concepts and hands-on experience with Azure AD and other cloud environments
  • 3–5+ years of experience developing workflows, forms, connector configurations, provisioning policies, and rules within SailPoint IdentityIQ
  • Quick learner with the ability to adopt new technologies and collaborate effectively to capture and implement business system requirements
  • Proficient in source control and development tools such as GitHub and Eclipse
Job Responsibility
  • Apply your expertise in SailPoint IdentityIQ and Radiant One FID / Global Sync to enhance and expand the capabilities of the enterprise IAM platform
  • Collaborate with Agile teams to design, build, test, and support scalable IAM solutions that meet foundational enterprise needs, including identity federation, directory virtualization, and multi-source synchronization
  • Contribute innovative and efficient configuration and coding solutions in SailPoint IdentityIQ and Radiant One FID environments that differentiate the IAM platform
  • Engineer cost-effective technical solutions leveraging Radiant One FID and Global Sync to address business challenges and streamline identity and access processes
  • Develop both tactical and strategic IAM solutions aligned with evolving business requirements, including federated identity management and synchronized directory services
  • Partner with key stakeholders to gather and validate requirements, ensuring delivered solutions meet expectations across SailPoint IdentityIQ and Radiant One FID systems
  • Participate in project teams to design new system capabilities, including proof-of-concept (POC) implementations for both Radiant One FID and SailPoint IdentityIQ, and presentations that highlight their functionality
  • Deploy and manage Radiant One FID in Kubernetes environments using Helm charts, ensuring scalable, reproducible, and reliable containerized deployments
  • Support the end-to-end testing lifecycle for system changes, including integrations with Radiant One FID / Global Sync, from design through execution
  • Create proactive capacity forecasts to prevent outages and ensure system reliability for SailPoint IdentityIQ and Radiant One FID services
What we offer
  • Paid Time off
  • Physical Wellbeing benefits
  • Financial Wellbeing benefits
  • Emotional Wellbeing benefits
  • Life Events + Family Support benefits
  • Fulltime

Principal Product Support Engineer

Hewlett Packard Enterprise is seeking a master-level Principal Product Support E...
Location
United States, Oklahoma City; Dallas; Houston
Salary
152000.00 - 349000.00 USD / Year
Hewlett Packard Enterprise (https://www.hpe.com/)
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in a technical field or equivalent experience demonstrating advanced expertise
  • US Citizenship with active Secret clearance
  • Industry-recognized certifications, including CompTIA Security+ (or higher, such as CASP or CISSP)
  • Cloud certifications (e.g., AWS Certified Solution Architect, Microsoft Azure Solution Architect, Google Professional Cloud Architect)
  • 10+ years of hands-on experience in IT support, cloud architecture, virtualization, or related areas, with a proven record of resolving deeply technical issues and leading support for federal customers
Job Responsibility
  • Serve as the top-tier escalation point for the most challenging technical issues within HPE Private Cloud and related technologies
  • Lead in-depth troubleshooting across multi-cloud, virtualization, and infrastructure platforms (AWS, Azure, Google Cloud, VMware ESX, Kubernetes)
  • Collaborate directly with BU engineering teams and managed services personnel to drive resolution of systemic, high-impact issues and develop critical patches and product enhancements
  • Analyze, identify, and architect solutions for recurring or complex customer issues, ensuring permanent resolution and knowledge transfer
  • Demonstrate mastery across all supported platforms, infrastructure, and technologies, acting as the subject matter expert for internal teams and federal customers
  • Develop and review automated solutions leveraging DevOps principles, CI/CD pipelines, and Infrastructure as Code
  • Lead compliance efforts for DISA STIGs and other federal standards, ensuring audit readiness and system hardening
  • Mentor and guide technical support engineers, sharing expertise and ensuring best practices are followed across teams
  • Work closely with federal customer stakeholders to understand business needs, translate requirements, and deliver innovative technical solutions
  • Engage regularly with product management, engineering, and BU teams to influence and prioritize product fixes, enhancements, and updates
What we offer
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion culture
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
  • Fulltime

Principal Architect - Cloud and Observability

We're building a world of health around every individual — shaping a more connec...
Location
United States
Salary
144200.00 - 288400.00 USD / Year
CVS Health (https://www.cvshealth.com/)
Expiration Date
June 29, 2026
Requirements
  • 10+ years in infrastructure, cloud architecture, platform engineering, or SRE
  • 8+ years of architecture work in observability, cloud infrastructure, or both at a large enterprise
  • Solid experience with at least two of Azure, AWS, or GCP -- including networking, identity, compute, and storage
  • 5+ years with Kubernetes in production (OpenShift, EKS, AKS, or GKE)
  • 5+ years with OpenTelemetry or similar frameworks (collectors, SDKs, semantic conventions, pipeline design)
  • 5+ years with observability platforms: Grafana/Mimir/Loki/Tempo, Prometheus, Datadog, Splunk, Dynatrace, or comparable tools
  • Experience defining SLOs/SLIs and building alerting strategies at an organizational level
  • Proven track record writing architecture standards that other teams adopted and followed
  • Able to communicate clearly with both engineers and senior leadership
Job Responsibility
  • Own the enterprise observability reference architecture covering metrics, logs, traces, and events across all environments (cloud and on-prem)
  • Drive the OpenTelemetry-first instrumentation strategy -- standard libraries, semantic conventions, collector topologies (DaemonSet, gateway, sidecar), and pipeline design
  • Build and operate telemetry pipelines on Grafana Mimir, Loki, and Tempo, including multi-tenant configurations, retention policies, and capacity planning
  • Define how we measure reliability: SLOs, SLIs, error budgets, and alerting frameworks -- consistently across all lines of business
  • Own the integration between observability tooling and incident management (ServiceNow ITOM, xMatters)
  • Drive telemetry schema standards to ensure teams emit data that is useful downstream, not just technically compliant
  • Build and maintain reference architectures for our hybrid footprint: OpenShift on-prem with KVM/libvirt and Dell PowerFlex storage, plus Azure, AWS, and GCP
  • Lead standards work around workload identity and federation using SPIFFE/SPIRE and cloud-native IAM patterns to move away from static secrets
  • Provide guidance on compute runtime selection -- containers vs. VMs vs. bare metal vs. serverless -- with a clear decision framework for teams
  • Help teams connect autoscaling and capacity planning behavior to actual telemetry signals
What we offer
  • medical, dental, and vision coverage
  • paid time off
  • retirement savings options
  • wellness programs
  • other resources, based on eligibility
  • bonus, commission or short-term incentive program
  • equity award program
  • Fulltime

Principal Product Manager

At Microsoft, we are building trusted, developer‑centric AI platforms that enabl...
Location
India, Hyderabad
Salary
Not provided
Microsoft Corporation (https://www.microsoft.com/)
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in engineering, computer science, or a related technical field
  • Significant experience (typically 8–12+ years) in product management or software engineering with substantial product ownership, including experience working on platform or infrastructure products
  • Demonstrated ability to operate effectively in large, ambiguous, multi‑team environments with shared ownership and complex dependencies
  • Strong technical depth in cloud platforms, distributed systems, or AI/ML infrastructure, with the ability to engage credibly with senior engineers and architects
  • Proven track record of influencing strategy, driving alignment, and delivering outcomes through collaboration rather than direct authority
  • Strong analytical and systems‑thinking skills, with experience making high‑quality decisions in fast‑moving, evolving problem spaces
Job Responsibility
  • Act as a senior contributor to platform strategy for Azure AI Foundry and Azure ML, helping shape multi-year investments across model training, customization, deployment, and lifecycle management
  • Drive alignment and progress across federated, cross-organizational initiatives, working with peer Principal PMs and multiple engineering teams on shared platform outcomes
  • Contribute to the definition and evolution of high-leverage platform abstractions (APIs, SDKs, workflows) that enable scalable adoption of GenAI and custom code training workloads
  • Partner closely with senior engineering leaders to influence architectural direction, surface trade-offs, and ensure platform capabilities meet scale, reliability, and security expectations
  • Engage with strategic customers and internal stakeholders to gather insights, validate requirements, and translate learnings into durable, reusable platform capabilities
  • Use data, metrics, and experimentation to evaluate impact and inform product decisions across shared ownership areas
  • Serve as a thought leader and mentor, elevating product craft, platform thinking, and responsible AI practices across the organization
  • Fulltime