Staff Software Engineer, Platform Infrastructure

United States, Pittsburgh · 171000.00 - 273000.00 USD / Year · Job Posted April 05, 2026

Job Description

We are seeking an experienced and highly motivated Staff Software Engineer to lead a new team dedicated to stabilizing and modernizing our Offline Testing Infrastructure (OTI). OTI is a critical, shared middle-layer infrastructure that underpins our PR testing, test creation, and Verification & Validation (V&V) efforts. This is a high-impact, high-urgency role focused on improving the velocity of our engineering teams and the reliability of our release cycles. The successful candidate will build and lead a small, focused team to transition OTI to a stable, performant, and scalable platform.

Job Responsibility

  • Lead the OTI Team: Serve as the technical lead (TL) for the OTI team within PIE-Compute, driving the strategic vision, execution, and long-term stability of the core infrastructure
  • Help Define and Optimize the Testing Ecosystem: Lead the design of the next-generation offline testing architecture to meet diverse team needs, reducing redundancy and siloing across the organization
  • Partner with Test Creation and Test Drive teams to standardize end-to-end test execution and reporting (Creation -> Execution -> Reporting)
  • Refine the full test lifecycle to ensure performance and scalability, and maintain clear attribution of failures to enhance reliability and efficient debugging
  • Own Critical OTI Components and Migrations: Take ownership of the shared OTI components, including maintenance and on-call support
  • Own various offline test modalities, including step code, workflow code, and general health
  • Lead the maintenance and development of common OTI tooling, including launching test evaluations, polling APIs, communicating results, and providing recommended pipeline templates
  • Establish Architecture and Best Practices: Define and enforce data management policies for the testing ecosystem (storage, lifecycling, write strategies, data integrity, and lineage)
  • Define use cases and feature design for new test modalities, including single versus cross-modality testing strategies
  • Manage incidents related to offline tests and maintain Standard Operating Procedures (SOPs) for PRs, local workflows, V&V, and releases
  • Act as a Center of Excellence: Serve as a subject matter expert for optimizing the architecture and performance of Aurora's largest compute use case (offline testing), and provide high-value consulting/architecture support to adjacent teams

Requirements

  • Senior or Staff-level experience (P7 equivalent) as a Software Engineer, ideally in infrastructure, developer tooling, or critical shared services
  • Proven experience leading technical projects and mentoring/directing other engineers
  • Familiarity with distributed compute technologies, cloud services (e.g., AWS), and large-scale workflow management systems
  • Demonstrated ability to triage, debug, and perform on-call and incident management for complex, cross-cutting infrastructure issues
  • Strong communication skills to manage stakeholder alignment and drive cross-team standardization efforts

What we offer

  • annual bonus
  • equity compensation
  • benefits

Similar Jobs for Staff Software Engineer, Platform Infrastructure

8 matching positions

Staff Software Engineer, Platform Infrastructure

We are seeking an experienced and highly motivated Staff Software Engineer to le...
Location: United States, Mountain View
Salary: 189000.00 - 303000.00 USD / Year
Company: Aurora Innovation
Expiration Date: Until further notice
Requirements
  • Senior or Staff-level experience (P7 equivalent) as a Software Engineer, ideally in infrastructure, developer tooling, or critical shared services
  • Proven experience leading technical projects and mentoring/directing other engineers
  • Familiarity with distributed compute technologies, cloud services (e.g., AWS), and large-scale workflow management systems
  • Demonstrated ability to triage, debug, and perform on-call and incident management for complex, cross-cutting infrastructure issues
  • Strong communication skills to manage stakeholder alignment and drive cross-team standardization efforts
Job Responsibility
  • Lead the OTI Team: Serve as the technical lead (TL) for the OTI team within PIE-Compute, driving the strategic vision, execution, and long-term stability of the core infrastructure
  • Help Define and Optimize the Testing Ecosystem: Lead the design of the next-generation offline testing architecture to meet diverse team needs, reducing redundancy and siloing across the organization
  • Partner with Test Creation and Test Drive teams to standardize end-to-end test execution and reporting (Creation -> Execution -> Reporting)
  • Refine the full test lifecycle to ensure performance and scalability, and maintain clear attribution of failures to enhance reliability and efficient debugging
  • Own Critical OTI Components and Migrations: Take ownership of the shared OTI components, including maintenance and on-call support
  • Own various offline test modalities, including step code, workflow code, and general health
  • Lead the maintenance and development of common OTI tooling, including launching test evaluations, polling APIs, communicating results, and providing recommended pipeline templates
  • Establish Architecture and Best Practices: Define and enforce data management policies for the testing ecosystem (storage, lifecycling, write strategies, data integrity, and lineage)
  • Define use cases and feature design for new test modalities, including single versus cross-modality testing strategies
  • Manage incidents related to offline tests and maintain Standard Operating Procedures (SOPs) for PRs, local workflows, V&V, and releases
What we offer
  • annual bonus
  • equity compensation
  • benefits
  • Fulltime

Senior-Staff Software Engineer, Platform Infrastructure

As a Senior Software Engineer on this team, you will help architect, design and ...
Location: United States, San Mateo
Salary: 130000.00 - 280000.00 USD / Year
Company: Verkada
Expiration Date: Until further notice
Requirements
  • Must have a BS, MS, or PhD in Computer Science, or similar technical field of study
  • Experience and enthusiasm for learning about new infrastructure products, features, and strategies
  • Comfortable with working at the frontier of infrastructure and software development
  • Experience in Python and/or Go
  • Experience with one of the major cloud platforms (preferably AWS)
  • Strong written and verbal communications
Job Responsibility
  • Identify and lead critical efforts related to scalability, reliability and efficiency
  • Influence the features and direction of our platform with your own ideas
  • Provide technical support for engineers on the team
  • Align with product and org objectives, and coordinate with cross-functional teams on delivering key results
What we offer
  • Healthcare programs that can be tailored to meet personal health and financial well-being needs; premiums are 100% covered for the employee under at least one plan and 80% covered for family under all plans
  • Nationwide medical, vision and dental coverage
  • Health Saving Account (HSA) with annual employer contributions and Flexible Spending Account (FSA) with tax saving options
  • Expanded mental health support
  • Paid parental leave policy & fertility benefits
  • Time off to relax and recharge through our paid holidays, firmwide extended holidays, flexible PTO and personal sick time
  • Professional development stipend
  • Fertility stipend
  • Wellness/fitness benefits
  • Healthy lunches provided daily
  • Fulltime

Staff Systems Software Engineer, Infrastructure Platform

The Infrastructure Engineering organisation at GM is building a cloud-native pla...
Location: United States, Austin; Mountain View; Warren
Salary: Not provided
Company: General Motors
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science or related field, or equivalent work experience
  • 8+ years of software engineering experience with a strong track record of building and operating production distributed systems
  • Deep platform or infrastructure engineering experience, with hands-on work building APIs, schedulers, orchestrators, or similar systems at scale
  • Strong proficiency in Go, with ability to write clean, maintainable, and performant production code for backend services
  • Solid understanding of distributed systems fundamentals including consistency models, failure handling, idempotency, retry patterns, and circuit breakers
  • Experience with cloud-native technologies such as Kubernetes, Nomad, Consul, or similar orchestration and service discovery platforms
  • Strong API design skills with understanding of RESTful patterns, authentication and authorisation models (OIDC, RBAC), versioning strategies, and error handling
  • Deep experience with relational databases, particularly PostgreSQL, including schema design, indexing strategies, query optimisation, and migration management
  • Architectural thinking with ability to evaluate trade-offs, balance simplicity with flexibility, design for current requirements and future growth, and document decisions effectively
  • Strong communication skills with ability to explain complex technical concepts to both engineering and business stakeholders
Job Responsibility
  • Design and implement core platform services including the API gateway, scheduler, lifecycle orchestrator, and synchronisation services using Go and cloud-native patterns
  • Build RESTful APIs with authentication (OIDC, RBAC), authorisation, versioning, and observability, architecting the inventory database system using PostgreSQL for resource metadata, capabilities, and state management
  • Develop intelligent scheduling and orchestration logic that matches workload requirements to resource capabilities with support for automated pooling, reservation modes, and hybrid allocation strategies
  • Build developer CLI tooling and integrate with the control plane, enabling developers to discover, allocate, and manage infrastructure resources through intuitive commands
  • Implement provisioning workflows that coordinate firmware flashing, health checks, power cycling, and resource validation across diverse automotive hardware configurations
  • Collaborate with stakeholders across Infrastructure Engineering, Quality Engineering, and Hardware Infrastructure to understand workflows and integrate with existing systems
  • Lead architectural discussions, conduct code reviews, document technical decisions, and mentor team members on distributed systems patterns and Go development
  • Work with tools and technologies including Go, PostgreSQL, Kubernetes, Nomad, Consul, RESTful APIs with OIDC authentication and RBAC authorisation, Datadog, S3-compatible object storage (MinIO), CI/CD pipelines, and Git/GitHub
What we offer
  • From day one, we're looking out for your well-being–at work and at home–so you can focus on realizing your ambitions
  • Fulltime

Staff Infrastructure Software Engineer, Enterprise AI

Scale GP is building the next generation of enterprise-grade Generative AI produ...
Location: United States, New York; San Francisco
Salary: 216200.00 - 270250.00 USD / Year
Company: Scale
Expiration Date: Until further notice
Requirements
  • Proven experience in a senior role
  • 5+ years of full-time software engineering experience
  • Deep understanding of modern infrastructure practices, including CI/CD, IaC (e.g., Terraform, Helm Charts), container orchestration (e.g., Kubernetes) and observability platforms (e.g., Datadog, Prometheus, Grafana)
  • Extensive experience with at least one major cloud provider (AWS, Azure, or GCP)
  • Strong knowledge of security and compliance in enterprise environments, with a focus on access management, data isolation, and customer-specific VPC setups
  • Proficiency in Python or JavaScript/TypeScript, and SQL
Job Responsibility
  • Define the architectural patterns for our multi-cloud infrastructure to support secure, reliable, and scalable Agentic workflows for enterprise customers
  • Lead the infrastructure roadmap with a strong focus on compliance, privacy, and security standards, including designing change management and data isolation strategies
  • Own the development and maintenance of our best-in-class Agentic observability platform (logging, metrics, tracing, and analytics) to proactively ensure system health and enable rapid incident response
  • Drive developer efficiency by building automated tooling and championing Infrastructure-as-Code (IaC) paradigms throughout the engineering organization
  • Solve the toughest engineering problems related to multi-tenancy, data isolation, and high-performance inference at a massive scale, taking end-to-end ownership across the full product lifecycle
What we offer
  • Comprehensive health, dental and vision coverage
  • retirement benefits
  • a learning and development stipend
  • generous PTO
  • equity based compensation
  • additional benefits such as a commuter stipend
  • Fulltime

Staff Software Engineer, Infrastructure

We are seeking a Staff Engineer to help lead critical initiatives of our core in...
Location: United States, Los Angeles
Salary: 230000.00 - 260000.00 USD / Year
Company: Genius Sports
Expiration Date: Until further notice
Requirements
  • 8+ years of experience building and operating infrastructure or devops platforms
  • Experience architecting complex infrastructure with strict uptime and latency requirements across multiple regions
  • Ability to navigate significant ambiguity and make sound technical decisions that hold up over time
  • Experience building applications and automations to eliminate toil for engineering teams
  • Strong communication skills that enable you to drive platform adoption forward for new teams
  • Track record of making pragmatic tradeoff decisions across architecture, implementation, technical debt, and customer requests
  • Passion for mentorship and upleveling the team around you to maximize their full potential
Job Responsibility
  • Work with other InfraPlat leads to define and drive technical vision and implementation for a variety of projects
  • Engage with stakeholders from product engineering teams to scope requests, identify shared pain points within the org, and prioritize initiatives
  • Fulltime

Staff Software Engineer, Platform

At Scale, our products include the Generative AI Data Engine, SGP, Donovan, and ...
Location: United States, San Francisco; Seattle; New York
Salary: 248400.00 - 310500.00 USD / Year
Company: Scale
Expiration Date: Until further notice
Requirements
  • 8+ years of full-time engineering experience, post-graduation with specialities in back-end systems
  • Extensive experience in software development and a deep understanding of distributed systems and public cloud platforms (AWS preferred)
  • Demonstrated track record of independent ownership and leadership across successful multi-team engineering projects
  • Excellent communication and collaboration skills, and the ability to translate complex technical concepts to non-technical stakeholders
  • Experience working fluently with standard containerization & deployment technologies like Kubernetes, Terraform, Docker, etc.
  • Experience with orchestration platforms, such as Temporal and AWS Step Functions
  • Experience with NoSQL document databases (MongoDB) and structured databases (Postgres)
  • Strong knowledge of software engineering best practices and CI/CD tooling (CircleCI, ArgoCD)
Job Responsibility
  • Architectural Vision: You will drive the design and implementation of foundational systems, acting as a bridge between high-level business goals and technical goals
  • Cross-Functional Leadership: You will collaborate with cross-functional teams to define and drive adoption of the next generation of features for our AI data infrastructure
  • Technical Ownership: You are responsible for proactively identifying and driving opportunities for organizational growth, driving improvements in programming practices, and upgrading the tools that define our development lifecycle
  • Technical Mentorship: You will serve as a subject matter expert, presenting technical information to stakeholders and providing the guidance to elevate the engineering culture across the company
What we offer
  • Comprehensive health, dental and vision coverage
  • retirement benefits
  • a learning and development stipend
  • generous PTO
  • additional benefits such as a commuter stipend
  • equity based compensation
  • Fulltime

Staff Software Engineer, Platform

You'll own critical platform infrastructure supporting 70M+ users and scaling to...
Location: United States, San Francisco
Salary: 230000.00 - 310000.00 USD / Year
Company: Gamma
Expiration Date: Until further notice
Requirements
  • 7+ years of backend engineering experience with deep expertise in distributed systems and data infrastructure
  • Proven track record architecting systems that scaled through multiple orders of magnitude
  • Expert-level proficiency in scalable APIs, databases (PostgreSQL, Redis), and event-driven architectures
  • Experience with real-time systems including collaborative editing, WebSockets, CRDTs, or similar conflict resolution
  • Strong operational excellence with monitoring, alerting, incident response, and performance debugging at scale
  • Proficiency in backend technologies (Node.js, Python, or similar) with deep understanding of tradeoffs
  • Product-minded approach with ability to make technical decisions that unlock business value and user experience
  • High autonomy and low ego, with an empathetic, reflective, self-aware growth mindset that actively promotes psychological safety
Job Responsibility
  • Lead major infrastructure initiatives including database migrations, architecture refactors, and performance optimization at scale
  • Define engineering standards and best practices across backend teams
  • Mentor engineers on system design, scalability patterns, and production excellence
  • Make build vs buy decisions for platform components and balance technical debt paydown with feature velocity
  • Collaborate with leadership to shape multi-year platform roadmap
  • Dive deep into gnarly technical problems including race conditions, performance bottlenecks, and data consistency issues
What we offer
  • competitive equity
  • Fulltime

Staff Software Engineer – Platform

We’re looking for a Staff Software Engineer (Platform) to help strengthen and ev...
Location: United States of America
Salary: Not provided
Company: ATLAS
Expiration Date: Until further notice
Requirements
  • Strong experience building and operating production systems
  • Solid backend experience with Node.js and/or .NET (C#)
  • Experience working in a cloud environment (Azure preferred)
  • Familiarity with infrastructure as code and modern deployment practices
  • Understanding of reliability, scalability, and operational tradeoffs
  • Experience designing APIs, data models, and service boundaries
  • Clear communication skills and a collaborative approach
  • Ability to lead technical work through influence and sound judgment
Job Responsibility
  • Design, build, and evolve shared platform systems and infrastructure
  • Take ownership of complex or cross-team technical initiatives
  • Identify systemic issues and help address them at the root
  • Contribute to platform standards, patterns, and best practices
  • Work with engineering and product leaders to prioritise platform investments
  • Support teams during complex production issues when needed
  • Mentor and support other engineers through reviews, pairing, and collaboration
  • Stay hands-on in critical code paths, infrastructure, and operational workflows
What we offer
  • Country-specific benefits
  • Flexible PTO
  • Your birthday off and a day for you to volunteer and give back to the organization of your choice
  • Generous Parental Leave Program
  • Growth and development opportunities with access to a top learning content provider
  • Fulltime