As an Engineering Manager for the Infrastructure team, you’ll lead the engineers responsible for keeping Apollo’s systems fast, reliable, and scalable as we serve millions of daily users and process billions of data points. You’ll work at the intersection of platform engineering, SRE, observability, and developer productivity, ensuring that our foundation can support Apollo’s AI-native evolution and rapid growth.
Job Responsibilities:
Lead, coach, and grow a distributed team of high-impact Infrastructure Engineers
Partner with senior engineering leadership on strategic initiatives such as cloud migration, infrastructure scaling, platform reliability, and cost efficiency
Define and implement modern operational excellence practices, including SLOs, error budgets, incident reviews, and performance monitoring
Guide technical decision-making across key areas like Kubernetes, GCP, observability, networking, CI/CD, and IaC (Terraform, Ansible)
Collaborate with AI, Data, and Product Engineering teams to ensure infrastructure scalability for ML and AI-native workloads
Run effective 1:1s, career development conversations, and quarterly performance reviews
Support recruiting efforts to attract top engineering talent across time zones
Requirements:
5+ years of hands-on software or infrastructure engineering experience
2+ years of experience leading teams of senior and staff-level engineers in platform, SRE, or infrastructure domains
Proven ability to design and operate large-scale distributed systems in cloud environments (preferably GCP or AWS)
Expertise with Kubernetes, Docker, Terraform, Ubuntu, and CI/CD pipelines
Familiarity with observability tools (Grafana, Prometheus, ELK, Datadog, New Relic) and performance tuning
Strong grounding in networking, security, and reliability principles
Experience managing infrastructure costs, availability SLAs, and high-throughput systems at scale
Nice to have:
Experience with AI/ML infrastructure, data pipelines, MongoDB, Ruby on Rails, Ansible, or Elasticsearch
What we offer:
Equity
Company bonus or sales commissions/bonuses
401(k) plan
At least 10 paid holidays per year
Flex PTO
Parental leave
Employee assistance program and wellbeing benefits