As a Sr. System Administrator on the Data Centers team, you'll own both our hardware and cloud infrastructures, with a particular focus on reliability, scalability, and operational excellence. You'll manage the administration and lifecycle of our global server fleet and lead key initiatives in automation, observability, and performance. You'll also take the lead on capacity planning, including physical data center needs, and will be a technical partner to stakeholders across Engineering as we evolve our infrastructure footprint. In addition, you'll evaluate, propose, and implement new project initiatives such as intrusion detection/prevention, system/security hardening, and automating manual processes. This position reports to our Engineering Manager, Data Centers and Networking, and is based in Vancouver, Canada (with flexibility to collaborate across global time zones).
Job Responsibilities:
Scout, evaluate, and compare hardware options and colocation facilities, partnering with Engineering to align decisions with performance and cost objectives
Design and deploy a cloud expansion strategy that balances reliability, performance, and efficiency across providers and regions
Steer capacity planning and our expansion/upgrade strategy, using data to anticipate growth and proactively mitigate bottlenecks
Design and deploy servers at scale into data centers around the globe, ensuring consistent standards and automation from day one
Develop and maintain automation for a large fleet of servers, VMs, and containers, reducing toil and improving consistency across environments
Work with vendors to obtain quotes, make purchases, and schedule services, including coordinating logistics for data center installations and maintenance
Set up and evolve monitoring for server, network, and data center health, including alerting, dashboards, and SLO-oriented metrics
Develop and maintain proper documentation for engineering staff, including runbooks, standards, and architectural diagrams
Participate in a rotating on-call schedule within the larger Infrastructure Engineering division, helping drive rapid incident response and robust post-incident reviews
Lead complex systems and network troubleshooting, fault analysis, and resolution, acting as an escalation point for the broader team
Provide technical mentorship to other System Administrators and engineers, sharing best practices around Linux, automation, and operational excellence
Partner with Security and other stakeholders on initiatives such as system hardening, compliance, and intrusion detection/prevention
Occasionally travel for on-site work when remote hands are not available (an active passport is required)
Requirements:
Background in Systems and/or Software Engineering, with a strong focus on infrastructure and operations
Extensive experience with Linux, both on-premises and in the cloud, including performance tuning, troubleshooting, and automation at scale
Familiarity with networking technologies: TCP/IP, DHCP, DNS, routing, firewalls, and load balancing concepts
Data center setup/deployment experience, including racking/stacking, cabling standards, and remote management
Exposure to cloud platforms such as GCP or AWS, and experience working in hybrid environments
Demonstrated ability to keep abreast of industry standards and trends, and to translate them into practical improvements in a production environment
Proven experience in a senior or lead capacity (typically 5+ years in systems administration or similar roles), including driving cross-team initiatives and mentoring others
Strong communication skills and the ability to collaborate effectively with distributed teams