Ever since we started in 2007, Sunrun has been at the forefront of connecting people to the cleanest energy on Earth. It’s why we’ve become the #1 home solar and battery company in America. Today, we’re on a mission to change the way the world interacts with energy, and we’re building a company and brand that puts power at the center of life. And we’re doing it by designing a dynamic culture where employee development, well-being, and safety come first.

We’re unlike any other solar company. Our vertically integrated model gives us total control over every part of the energy lifecycle – from sale through installation and beyond – so you can find endless opportunities for growth. Come join a career you can grow in and a culture you can run with.

This position is primarily remote, with occasional visits to a local office or our corporate headquarters for team-building, training, and collaborative project work. These on-site sessions are designed to strengthen connections, share insights, and ensure a seamless experience for our team and customers. Equipment pick-up from a local branch will be required. We will provide advance notice whenever on-site attendance is required, making these times purposeful and rewarding.
Job Responsibilities:
Provide strategic leadership in designing, implementing, and managing the overall infrastructure strategy for our organization
Leverage cloud platforms (e.g., AWS, Azure) to design, deploy, and manage scalable infrastructure solutions
Spearhead the definition of advanced monitoring requirements and elevate SLAs
Collaborate with the engineering team and TPM to implement and enhance monitoring practices
Expertly convey intricate technical information to diverse stakeholders with clarity and precision
Provide leadership in integrating advanced SRE principles into applications and services
Lead the implementation of sophisticated system design measures for heightened security, performance, and resiliency
Develop notification strategies for production outages
Leverage SLOs and SLIs to measure and optimize availability, latency, and response time
Lead and strategize emergency response efforts, conduct retrospectives with RCA, and manage on-call workloads effectively
Oversee the holistic health of the production environment, emphasizing availability and proactive monitoring
Drive advanced practices in application performance, capacity testing, and auto-scaling
Spearhead innovative support and release strategies in collaboration with cross-functional teams
Lead initiatives to elevate services through advanced testing and release procedures
Champion exemplary documentation practices for actions, findings, and automation procedures
Identify and lead initiatives for advanced automation solutions
Collaborate closely with engineering and product counterparts to strategically influence improved resiliency and reliability
Identify and lead major projects for substantial enhancements in reliability, cost savings, and revenue
Drive strategic efforts in efficiency and capacity planning
Establish and communicate clear requirements while optimizing system resource usage
Requirements:
Bachelor’s degree in Computer Information Systems, Software Engineering, or a closely related field
5 years of experience as a Software Developer using Microservices hosted in Azure
5 years of experience with Virtualization and cloud computing
5 years of experience with Object-Oriented Design (OOD) and Object-Oriented Programming (OOP)
5 years of experience building software solutions in an engineering environment using Python & Shell scripting
5 years of experience with network analysis, debugging, and troubleshooting with Wireshark and Git