Staff Software Engineer (Infra)

United States, New York City · 220000.00 - 260000.00 USD / Year · Job Posted January 20, 2026

Job Description

As a Staff Software Engineer (Infra) at Amigo, you'll own the technical direction of the infrastructure that powers our platform at global scale. You'll architect multi-region systems handling millions of conversations while maintaining the compliance posture required for healthcare. You'll drive architecture decisions for Kubernetes deployments, the Databricks platform, voice/telephony systems, and security infrastructure. You'll mentor engineers, shape technical culture, and ensure we maintain elite engineering standards as the team grows. Reliability and security are non-negotiable: the platform must scale without compromising safety.

Job Responsibility

  • Own technical architecture for infrastructure across cloud platforms, Kubernetes, Databricks, and supporting systems
  • Drive engineering standards for reliability, security, observability, and incident response
  • Architect multi-region deployment strategies with zero-downtime updates for critical systems
  • Design the compliance & security infrastructure for healthcare (HIPAA, SOC 2) and support future regulatory requirements
  • Own disaster recovery architecture and backup systems meeting healthcare compliance requirements
  • Make build vs. buy decisions for infrastructure tooling and evaluate technical tradeoffs
  • Design auto-scaling systems that handle traffic spikes while maintaining cost efficiency
  • Own the infrastructure-as-code definitions for our infrastructure, ensuring clearly documented, identical deployments across regions
  • Mentor engineers and establish patterns that raise the bar for the infrastructure team
  • Collaborate with backend, platform, and security teams to ensure system-wide coherence
  • Define reliability targets (SLOs/SLIs) and drive operational excellence across the platform

Requirements

  • 7+ years of production infrastructure experience, with significant time at elite engineering organizations
  • Expert-level experience with Kubernetes and container orchestration at scale
  • Proven track record designing infrastructure that scales across multiple regions
  • Deep experience with cloud platforms (AWS, GCP, or Azure)
  • Strong understanding of infrastructure-level networking and security configurations
  • History of establishing engineering standards and mentoring engineers
  • Extremely high standards for reliability, security, and operational excellence
  • Both execution-oriented and defensive-minded: you ship infrastructure while anticipating failure modes
  • Deep knowledge of infrastructure as code tools (Terraform, Pulumi, or similar)
  • Experience with compliance requirements and data residency controls in regulated industries
  • Excellent written and verbal communication across engineering and executive stakeholders

Nice to have

  • Experience with healthcare infrastructure or HIPAA compliance at scale
  • Background with voice/telephony systems or real-time communication infrastructure
  • Experience with Databricks platform administration and optimization
  • Track record building and scaling infrastructure teams
  • Knowledge of specific regulatory frameworks (HIPAA, SOC 2, GDPR)
  • Experience with high-availability, mission-critical systems

What we offer

  • Comprehensive health, dental, and vision insurance
  • Mental health support and wellness coaching
  • Flexible wellness stipend for fitness, therapy, or personal growth
  • Daily catered lunch and dinner
  • Annual learning budget for courses, books, or conferences
  • Conference attendance budget for professional development
  • Development setup of your choice
  • Academic collaboration opportunities

Similar Jobs for Staff Software Engineer (Infra)

8 matching positions

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location: United States, Chevy Chase; New York City; Palo Alto
Salary: 115000.00 USD / Year
Company: Geico (geico.com)
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions, with AWS/GCP as a backup or for specialized use cases
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Workplace flexibility, as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location: United States, Palo Alto
Salary: 90000.00 USD / Year
Company: Geico (geico.com)
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions, with AWS/GCP as a backup or for specialized use cases
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Workplace flexibility, as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime

Member of Technical Staff - Software Engineer (AI infra)

Microsoft AI is looking for a Member of Technical Staff - Software Engineer to h...
Location: Switzerland, Zürich
Salary: Not provided
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or a related technical discipline AND 4+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
  • Experience with generative AI
  • Experience with distributed computing
Job Responsibility
  • Develop and tune scalable pretraining software for Nvidia GB200 72NVL CX8 and AMD MIxxx architectures
  • Benchmark GB200 and AMD MIxxx GPU clusters
  • Gather data and insights to develop the pretraining compute roadmap
  • Care deeply about conversational AI and its deployment
  • Actively contribute to the development of AI models that are powering our innovative products
  • Find a path to get things done despite roadblocks to get your work into the hands of users quickly and iteratively
  • Enjoy working in a fast-paced, design-driven, product development cycle
  • Embody our Culture and Values
  • Fulltime

Staff Software Engineer - CAD Infra Engineering

Dandy is hiring a Staff Software Engineer to join our rapidly scaling technology...
Location: United States
Salary: 221000.00 - 268000.00 USD / Year
Company: Dandy (meetdandy.com)
Expiration Date: Until further notice
Requirements
  • 8+ years of software engineering experience, preferably in a high-growth startup environment
  • Expertise in Google Cloud Platform and Google Kubernetes Engine
  • Experience with GPU infrastructure and maintaining cloud to client application test parity is strongly preferred
  • Experience in identifying and remediating security vulnerabilities within a cloud environment
  • Experience with building observability platforms (i.e., metrics, logging, and tracing)
  • Experience with infrastructure as code platforms (Terraform, Pulumi)
  • Experience designing the architecture and automation of infrastructure within a cloud environment
  • A collaborative, pragmatic, and growth-oriented mindset
  • The ability to clearly and concisely communicate about complex technical, architectural, and/or organizational problems and propose thorough iterative solutions
  • Experience with performance and optimization problems and a demonstrated ability to both diagnose and prevent these problems
Job Responsibility
  • Solve technical problems of the highest scope and complexity for your team
  • Collaborate with stakeholders within the tech org to influence the overall objectives and long-term goals of your team
  • Advocate for improvements to product quality, security, and performance that have a particular impact across your team and others
  • Develop and maintain infrastructure, systems, and tooling to support Dandy’s products in a secure, well-tested, and performant way
  • Reinvent an analog experience and disrupt a legacy industry through novel and scalable system design
  • Collaborate with Product Engineers and other stakeholders within Engineering, Product and Data to maintain a high bar for quality in a fast-paced, iterative environment
  • Advocate for improvements to infrastructure quality, security, and performance
  • Craft code that meets our internal standards for style, maintainability, and best practices
  • Recognize impediments to our efficiency as a team ("technical debt"), and propose and implement solutions
What we offer
  • Offers Equity
  • Offers Bonus
  • healthcare
  • dental
  • mental health support
  • parental planning resources
  • retirement savings options
  • generous paid time off
  • Fulltime

Staff Software Engineer (Frontend), Infra

Staff Software Engineer role in Infrastructure team at Harmonic, a startup disco...
Location: United States, New York
Salary: 210000.00 - 280000.00 USD / Year
Company: Harmonic (harmonic.ai)
Expiration Date: Until further notice
Requirements
  • 7+ years building frontend applications at scale
  • Deep expertise in React, TypeScript, and modern build tooling
  • Track record of fixing production performance issues at scale
  • Strong opinions on frontend architecture, backed by experience
  • NYC-based, in office 3 days/week
Job Responsibility
  • Fix core reliability issues: error boundaries, state management, data consistency
  • Optimize performance: virtualization for large datasets, bundle optimization, render performance
  • Build monitoring and observability to catch issues before users do
  • Establish testing strategies that prevent regressions
  • Create abstractions and patterns that help engineers ship faster without breaking things
  • Drive technical decisions and mentor the team through complex migrations
What we offer
  • Top of the line health, dental and vision insurance, with 100% premium covered
  • 401k matching
  • Free lunch in office
  • Monthly team dinner for each office
  • Commuter benefits
  • Fulltime

Staff Software Engineer – Developer Experience

Remote or Hybrid: This role can be remote US or hybrid in our Warren MI or Austi...
Location: United States, Warren
Salary: Not provided
Company: General Motors (gm.com)
Expiration Date: Until further notice
Requirements
  • 8+ years of experience in software development
  • Strong proficiency in at least one modern language (e.g., Java, Go, C#, Python, TypeScript) and in designing, building, and operating distributed systems in production
  • Deep experience designing and operating modern CI/CD pipelines and release processes, including build, test, deploy, release strategies, rollback, and environment promotion
  • Hands-on experience with developer platforms and SDLC tooling (e.g., GitHub/GitLab, CI/CD systems, Jira/ADO, artifact repositories, secrets management, feature flags)
  • Solid understanding of at least one major cloud platform (Azure/AWS/GCP) and infrastructure as code (Terraform, Bicep, CloudFormation, etc.)
  • Strong understanding of security, compliance, and governance concerns for developer platforms (RBAC, auditability, data boundaries)
  • Experience with migration and modernization (on-prem → cloud, legacy → modern pipelines, tool consolidation)
Job Responsibility
  • Lead technical strategy and delivery for platforms, tools, and workflows that improve developer productivity, quality, and satisfaction across GM
  • Design and implement golden paths (opinionated, paved-road workflows) for shipping software—covering repo structure, CI/CD, testing, security, observability, and deployment
  • Build automation and self-service capabilities that reduce manual toil (e.g., environment provisioning, pipeline setup, guardrail enforcement, standards checks)
  • Embed AI into engineering workflows, including agents and copilots that assist with planning, coding, testing, documentation, and operations
  • Partner with product, security, infra, and application teams to understand pain points and translate them into concrete platform and tooling improvements
  • Act as a multiplier and mentor across the organization, coaching engineers and providing guidance on software engineering, AI-assisted development, and recommended practices
  • Define and track outcome-focused metrics (e.g., lead time for changes, PR cycle time, change failure rate, developer satisfaction) and use data to guide investments
  • Contribute to and influence engineering standards, patterns, and reference architectures used across Core IT and beyond
  • Fulltime

Staff Software Engineer, Backend (DevEx)

Phantom is the modern money app used by tens of millions around the world. Our p...
Location: Not provided
Salary: 200000.00 - 250000.00 USD / Year
Company: Phantom (phantom.app)
Expiration Date: Until further notice
Requirements
  • Proven backend engineering experience with distributed systems fundamentals
  • Direct ownership of paved path / service standards or major DevEx initiatives
  • CI/CD pipeline design and optimization for backend deployments
  • Familiarity with Kubernetes, AWS, and Infrastructure-as-Code (Terraform/Pulumi)
  • Strong code quality, testing, and observability practices
  • Hands-on Rust experience
  • Demonstrated interest in open and community-driven platforms
Job Responsibility
  • Play a critical role in shaping the technical direction of our backend systems and infra, with a focus on high-volume, high-availability solutions that meet the complex needs of modern financial platforms
What we offer
  • Competitive salary and equity
  • Comprehensive insurance (medical/dental/vision) — 100% covered
  • Stipend for your ideal remote set-up
  • Flexible hours and a supportive remote environment
  • Unlimited vacation: Take time when you need it (and we really mean it!)
  • 401(k) retirement plan
  • Monthly wellness benefit
  • Weekly meal benefit
  • Global off-sites
  • Fulltime

Staff Software Engineer, Voice AI

Nooks is the AI Sales Assistant Platform (ASAP) that automates the busywork so r...
Location: United States, San Francisco
Salary: 250000.00 - 325000.00 USD / Year
Company: Nooks (nooks.ai)
Expiration Date: Until further notice
Requirements
  • 6–10+ years backend/infra engineering with real-time A/V or telephony systems
  • Hands-on experience with WebRTC, SIP, Twilio, or similar stacks
  • Strong foundation in distributed systems and low-latency infra
  • Experience debugging and optimizing QoS (latency, jitter, packet loss)
Job Responsibility
  • Architect and improve real-time voice infra (WebRTC/SIP/Twilio)
  • Ensure call quality and latency meet SLA targets globally
  • Build services for advanced features (call transfers, monitoring, recordings)
  • Develop observability and debugging tools for call flows
  • Partner with Product/Support to enable new A/V features and fast issue resolution
  • Own backend infrastructure for real-time audio/video calling (Twilio, Salesfloor, A/V quality, recordings, transcriptions) to deliver a dialer that scales reliably to 10× volume
What we offer
  • equity
  • comprehensive health, dental, vision, life and disability insurance coverage
  • hybrid work
  • unlimited paid time off
  • Fulltime