Lead Software Engineer, DevOps (Azure)(Cloud Operations Resilience Engineering)

Capital One

Location:
United States, McLean

Contract Type:
Employment contract

Salary:

179400.00 - 225100.00 USD / Year

Job Description:

Do you love building and pioneering in the technology space? Do you enjoy solving complex business problems in a fast-paced, collaborative, inclusive, and iterative delivery environment? At Capital One, you'll be part of a big group of makers, breakers, doers and disruptors who love to solve real problems and meet real customer needs. We are seeking DevOps Engineers who are passionate about marrying data with emerging technologies to join our team. As a DevOps Engineer, you'll have the opportunity to be at the forefront of driving a major transformation within Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability.

Job Responsibility:

  • Lead a portfolio of diverse technology projects and a team of developers with deep experience in machine learning, distributed microservices, and full stack systems to create solutions that help meet regulatory needs for the company
  • Share your passion for staying on top of tech trends, experimenting with and learning new technologies, participating in internal & external technology communities, and mentoring other members of the engineering community
  • Collaborate with digital product managers and deliver robust cloud-based solutions that drive powerful experiences to help millions of Americans achieve financial empowerment
  • Utilize programming languages like Python and Go, container orchestration services including Docker and Kubernetes, configuration management (CM) tools including Terraform, and a variety of AWS and Azure tools and services

Requirements:

  • Bachelor's degree
  • At least 4 years of experience in DevOps Engineering (Internship experience does not apply)
  • At least 3 years of experience in Cloud Native technologies (Amazon Web Services, Microsoft Azure, Google Cloud Platform)
  • At least 4 years of Unix or Linux system administration experience

Nice to have:

  • 7+ years of DevOps Engineering experience
  • 4+ years of experience with coding and scripting (Python, SQL, Java, JavaScript, Golang, Bash, Perl or Ruby)
  • 4+ years of experience with technologies such as Apache Mesos, Marathon, or Apache Spark
  • 4+ years of experience using build and deployment tools (Jenkins, Docker)
  • 2+ years of experience with Azure
  • 2+ years of experience with distributed database systems (Cassandra, ElasticSearch)
  • 2+ years of experience with deploying clustered web services
  • 2+ years of experience working within Agile Development Practices

What we offer:

  • Performance-based incentive compensation
  • Health, financial and other benefits

Additional Information:

Job Posted:
May 05, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Lead Software Engineer, DevOps (Azure)(Cloud Operations Resilience Engineering)

Site Reliability Engineer

Corporate Tools is looking for a Site Reliability Engineer. You will be a tradit...
Location
United States
Salary:
175000.00 USD / Year
Corporate Tools
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, Software Engineering, or equivalent practical experience
  • 5+ years of experience in software engineering
  • 2+ years of experience in site reliability engineering, DevOps, or infrastructure engineering roles
  • Deep experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code tools such as Terraform, CloudFormation, or Pulumi
  • Strong proficiency with Kubernetes, Docker, and container orchestration in production environments
  • Hands-on experience with observability and monitoring tools like Prometheus, Grafana, OpenTelemetry, Sentry, or New Relic
  • Proven ability to design and implement highly available, fault-tolerant systems and lead proactive incident response efforts
  • Experience with performance tuning, database optimization, and caching strategies (e.g., PostgreSQL, Redis, Memcached)
  • Demonstrated ability to drive reliability improvements, reduce operational toil, and foster a culture of resilience and continuous improvement
  • Experience leading reliability-focused initiatives such as post-incident reviews, capacity planning, and root cause analysis
Job Responsibility
  • Stop problems before they start
  • Fix issues quickly and learn from them
  • Help keep systems steady, secure, and running
  • Work closely with DevOps engineers to build out tools and automation
  • Take ownership
What we offer
  • 100% employer-paid medical, dental and vision for employees
  • Annual review with raise option
  • 22 days Paid Time Off accrued annually, and 4 holidays
  • After 3 years, PTO increases to 29 days
  • Employees transition to flexible time off after 5 years with the company—not accrued, not capped, take time off when you want
  • Paid Parental Leave
  • Up to 6% company matching 401(k) with no vesting period
  • Quarterly allowance
  • Open concept office with friendly coworkers
  • Creative environment where you can make a difference
Employment Type: Full-time

Senior Lead Software Engineer, DevOps (Cloud Operations Resilience Engineering)

Location
United States, McLean; Richmond
Salary:
209000.00 - 262400.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree
  • At least 6 years of experience in DevOps Engineering (Internship experience does not apply)
  • At least 4 years of experience with Cloud Native technologies (Amazon Web Services, Microsoft Azure, Google Cloud Platform)
  • At least 6 years of Unix or Linux system administration experience
Job Responsibility
  • Work within and across Agile teams to design, develop, test, implement, and support technical solutions across full-stack development tools and technologies
  • Lead the craftsmanship, availability, resilience, and scalability of your solutions
  • Bring a passion to stay on top of tech trends, experiment with and learn new technologies, participate in internal & external technology communities, and mentor other members of the engineering community
  • Encourage innovation, implementation of cutting-edge technologies, inclusion, outside-of-the-box thinking, teamwork, self-organization, and diversity
  • Work across boundaries to improve the velocity of your and other teams
  • Lead efforts to enable and simplify the use of new and existing AWS services
  • Work with product managers to understand desired application and platform capabilities and testing scenarios
What we offer
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • A comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
Employment Type: Full-time

Lead Software Engineer, DevOps (Cloud Operations Resilience Engineering)

Lead Software Engineer, DevOps (Cloud Operations Resilience Engineering). Do you...
Location
United States, New York, New York; Richmond, Virginia
Salary:
179400.00 - 245600.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree
  • At least 4 years of experience in DevOps Engineering (Internship experience does not apply)
  • At least 3 years of experience in Cloud Native technologies (Amazon Web Services, Microsoft Azure, Google Cloud Platform)
  • At least 4 years of Unix or Linux system administration experience
Job Responsibility
  • Lead a portfolio of diverse technology projects with deep experience in platform engineering, machine learning, distributed microservices, and full stack systems to create solutions that help meet regulatory needs for the company
  • Share your passion for staying on top of tech trends, experimenting with and learning new technologies, participating in internal & external technology communities, and mentoring other members of the engineering community
  • Collaborate with digital product managers and deliver robust cloud-based solutions that drive powerful experiences to help millions of customers achieve financial empowerment
  • Utilize programming languages like Python and Golang, along with container orchestration tools including Docker and Kubernetes, configuration management tools including Ansible and Terraform, and a variety of AWS tools and services
What we offer
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • A comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
Employment Type: Full-time

Lead Software Engineer, DevOps (Cloud Operations Resilience Engineering)

Lead Software Engineer, DevOps (Cloud Operations Resilience Engineering). Do you...
Location
United States, McLean; Plano; Richmond
Salary:
179400.00 - 225100.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree
  • At least 4 years of experience in DevOps Engineering (Internship experience does not apply)
  • At least 3 years of experience in Cloud Native technologies (Amazon Web Services, Microsoft Azure, Google Cloud Platform)
  • At least 4 years of Unix or Linux system administration experience
Job Responsibility
  • Lead a portfolio of diverse technology projects and a team of developers with deep experience in machine learning, distributed microservices, and full stack systems to create solutions that help meet regulatory needs for the company
  • Share your passion for staying on top of tech trends, experimenting with and learning new technologies, participating in internal & external technology communities, and mentoring other members of the engineering community
  • Collaborate with digital product managers and deliver robust cloud-based solutions that drive powerful experiences to help millions of Americans achieve financial empowerment
  • Utilize programming languages like Java, Python, SQL, Ruby, and Go, container orchestration services including Docker and Kubernetes, configuration management (CM) tools including Ansible and Terraform, and a variety of AWS tools and services
What we offer
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • A comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
Employment Type: Full-time

Senior Infrastructure Engineer

Location
India, Putlibowli
Salary:
Not provided
Randstad
Expiration Date
May 16, 2026
Requirements
  • Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, Ansible, and Dynatrace
  • Build and manage CI/CD pipelines
  • Improve infrastructure provisioning and configuration through automation
  • Monitor the health, performance, and reliability of production systems and applications
  • Design, implement, and maintain automated monitoring solutions, using tools such as Datadog
  • Define and monitor service level objectives (SLOs), service level indicators (SLIs), and error budgets
  • Implement effective alerting systems
  • Lead root cause analysis (RCA) and post-mortem investigations
  • Respond to production incidents, diagnose root causes, and implement corrective actions
  • Create and maintain playbooks and documentation for incident response
Job Responsibility
  • Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, Ansible, and Dynatrace to automate deployment and management of infrastructure
  • Build and manage CI/CD pipelines to ensure efficient and reliable application deployments
  • Improve infrastructure provisioning and configuration through automation, minimizing manual interventions and reducing human error
  • Monitor the health, performance, and reliability of production systems and applications
  • Design, implement, and maintain automated monitoring solutions, using tools such as Datadog
  • Define and monitor service level objectives (SLOs), service level indicators (SLIs), and error budgets to ensure system reliability and availability meet customer expectations
  • Implement effective alerting systems to identify and address potential issues before they impact users
  • Lead root cause analysis (RCA) and post-mortem investigations after incidents to identify improvements and avoid recurrence
  • Respond to production incidents, diagnose root causes, and implement corrective actions
  • Create and maintain playbooks and documentation for incident response, troubleshooting, and recovery processes
Employment Type: Full-time

Staff Software Engineer, Vulnerability Management

GEICO is seeking an experienced full-stack engineer with a deep technical expert...
Location
United States, Chevy Chase; Palo Alto; Seattle; Renton
Salary:
115000.00 - 230000.00 USD / Year
Geico
Expiration Date
Until further notice
Requirements
  • Tech-lead with data engineering and software development experience in a hybrid environment (AWS, Azure, on-prem)
  • Proficiency in at least one modern programming language (Python, Java, Scala, Go) and deep experience building scalable production-grade data services, APIs, or ingestion frameworks
  • Expertise in designing, building, and operating large‑scale, resilient, and high‑performance data pipelines across distributed systems, with strong knowledge of ETL/ELT patterns, data orchestration, and data quality frameworks
  • Advanced proficiency in modern data storage and processing technologies, including SQL/NoSQL databases (e.g., PostgreSQL), query optimization, and data modeling for analytical and operational use cases
  • Hands‑on experience with reporting and analytics tools such as Power BI, Tableau, or equivalent, including developing semantic models, optimizing reporting datasets, and enabling business teams with curated data
  • Strong applied skills in distributed compute ecosystems (e.g., Spark or similar), and the ability to optimize workloads for performance, cost efficiency, and reliability
  • Extensive knowledge of and experience building data-intensive, large-scale distributed systems in the cloud
  • Experience building the architecture and design of new and current systems (architecture, design patterns, reliability, and scaling)
  • Fluency in DevOps concepts and best practices in CI/CD pipelines and infrastructure as code
  • Experience with application performance monitoring tools and performance assessments
Job Responsibility
  • Lead software design, development, and delivery of integrated systems to drive Vulnerability Management initiatives
  • Deliver automation initiatives, conduct advanced research, and develop proofs of concept to enhance our capabilities and improve overall efficiency
  • Achieve business outcomes through force multiplication
  • Develop, integrate, and maintain multilevel cybersecurity designs, architectures, policies, and procedures
  • Provide secure design guidance and recommendations to developers, infrastructure, and product engineers
  • Influence and educate partner teams to bring an engineering-first approach to developing sustainable security systems
  • Mentor peers and team members in security technologies, enterprise solution design, deployment, and effective customer interaction
  • Provide motivating demonstrations and communications to show the value of our security measures to the business, highlighting the low impact on systems, improved operability and resiliency
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment Type: Full-time

Senior Staff Software Engineer- GIA Platform

GEICO is seeking an experienced software engineer with a passion for building hi...
Location
United States, Palo Alto
Salary:
130000.00 - 260000.00 USD / Year
Geico
Expiration Date
Until further notice
Requirements
  • Fluency in at least one modern language (Go is preferred, .NET is a plus)
  • Proven track record of designing, implementing, and maintaining highly scalable, available, and reliable systems in production
  • Understanding of security best practices and data encryption technology
  • Understanding of SQL and NoSQL databases, including stateful services management and storage
  • Understanding of networking, caches, key/value stores, load balancing, global load balancing, queues, DNS and CDN
  • Deep knowledge of DevOps practices, methodologies, and principles, along with a solid understanding of on-prem and public cloud-based network, compute, and storage technologies
  • In-depth knowledge of hybrid cloud architecture, IaaS and PaaS technologies, container orchestration platforms (e.g., Kubernetes), cloud efficiency, and observability
  • Strong background in incident management
  • Ability to create incident response playbooks, runbooks, incident triaging strategies, and post-incident analysis to drive continuous improvement in system reliability and availability
  • Experience with open-source management and monitoring tools
Job Responsibility
  • Develop and drive the overall technical roadmap for the GIA Platform organization, aligning it with the organization's business goals and objectives
  • Work closely with executive leadership, tech teams, and other cross-discipline stakeholders to build optimal strategy for delivering platform services
  • Leverage technical and domain expertise to influence partners and leadership to create a force multiplier in achieving milestones in the team’s technical roadmap
  • Provide thought leadership in GIA Platform, staying ahead of industry trends and emerging technologies to create effective strategy that minimizes business disruption while balancing the modernization of legacy platform components
  • Lead the design and architecture of resilient and scalable platform services, considering both on-premises and cloud-based solutions
  • Champion software development best practices and safe deployment processes to enable continuous, incremental delivery of business value
  • Contribute directly to, and lead by example in, day-to-day engineering activities (writing feature code and automated tests, raising PRs and reviewing peers’ PRs, developing and managing CI/CD pipelines, production support, among others)
  • Develop and maintain comprehensive incident response plans to address various disaster scenarios across multiple partner integration points
  • Spearhead collaboration with various stakeholders in production readiness assessment and operational excellence
  • Apply hands-on software engineering and SDLC best practices (Technical Review Documents, Architecture, Software Development, Code Reviews, Testing, Production Readiness Reviews, among others)
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment Type: Full-time

Senior Software Engineering Manager

The GridOS Data Fabric Engineering Manager will lead a globally distributed engi...
Location
Norway, Oslo
Salary:
Not provided
GE Vernova
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Electrical/Computer Engineering, or related technical field (or equivalent experience)
  • Professional software engineering experience in an engineering leadership or people management role
  • Proven experience leading teams distributed across multiple locations and time zones, with demonstrated ability to maintain alignment, quality, and delivery velocity
  • Strong background in designing and operating large-scale distributed systems or data platforms, including time-series or event-driven data storage technologies; real-time data streaming frameworks (e.g., Kafka, Pulsar, Kinesis); and high-availability, mission-critical services in production environments
  • Hands-on experience with at least one modern programming language (e.g., Java, Go, C#, Python) and public cloud platforms (e.g., AWS, Azure, GCP)
  • Strong grasp of software engineering best practices: system design, clean code, testing strategies, CI/CD pipelines, observability, and incident management
  • Excellent written and verbal communication skills, with the ability to drive clarity and decision-making in an asynchronous, global environment
Job Responsibility
  • Lead, mentor, and grow a geographically distributed team of software engineers (and potentially data/DevOps engineers) working across multiple time zones
  • Establish clear, asynchronous ways of working (documentation, decision logs, recorded demos) to ensure alignment despite limited overlapping hours
  • Create an inclusive and collaborative team culture that values diverse perspectives, cultural sensitivity, and psychological safety
  • Set clear expectations, provide regular performance feedback, and drive career development tailored to regional contexts and opportunities
  • Collaborate with regional leaders and HR partners to recruit, onboard, and retain talent across multiple geographies
  • Own the end-to-end lifecycle of Timebase and Anybase capabilities within GridOS, including time-series data storage and retrieval; real-time streaming and event processing; time alignment and synchronization across diverse data sources; and integrations with simulation, forecasting, optimization, and market modules
  • Partner with architects to evolve the Timebase and Anybase technical roadmap and reference architecture, ensuring it meets global scalability, reliability, and compliance needs
What we offer
  • Opportunity to lead a core, globally distributed platform team at the heart of GridOS and GE Vernova’s digital strategy
  • The chance to work on complex, high-impact problems that shape the future of the electric grid and global energy transition
  • A diverse, mission-driven environment with colleagues across regions and disciplines
  • Competitive compensation, benefits, and global career development opportunities
Employment Type: Full-time