Job Description:
As a Principal Software Engineer, you will be a technical leader and hands-on contributor, designing and optimizing high-scale, distributed storage systems built on AWS storage technologies. You will play a pivotal role in shaping the architecture, performance, and reliability of backend storage solutions that power critical applications at scale.

Your primary responsibilities will include designing, implementing, and optimizing backend storage services that support high throughput, low latency, and fault tolerance. You will work closely with senior engineers, architects, and cross-functional teams to drive scalability, availability, and efficiency improvements in large-scale storage solutions. You will also lead technical deep dives, architecture reviews, and root cause analyses to resolve complex production issues related to storage performance, consistency, and durability.

As a thought leader, you will drive best practices in distributed system design, security, and cloud cost optimization. You will also mentor senior engineers, contribute to technical roadmaps, and help shape the long-term storage strategy. Your expertise in storage consistency models, data partitioning, indexing, and caching strategies will be instrumental in improving system performance and reliability.

Additionally, you will collaborate with Site Reliability Engineers (SREs) to implement observability, monitoring, and disaster recovery strategies, ensuring high availability and compliance with industry standards. You will advocate for automation, Infrastructure-as-Code (IaC), and DevOps best practices, leveraging tools like Terraform, AWS CloudFormation, Kubernetes (EKS), and CI/CD pipelines to enable scalable deployments and operational excellence.