In this role, you’ll be at the heart of an exciting journey, crafting tools and patterns that are state-of-the-art and transformative. We are the catalysts, enabling the creation of and collaboration on cutting-edge ML and AI technologies. Our work is pivotal in shaping the company’s future, empowering teams across the organisation to explore, innovate, and redefine the landscape of media. Our team is building out new tools and capabilities to accelerate data science activities and the development of ML/GenAI applications. We enable teams across the company to build, collaborate on, manage, and maintain their machine learning platforms at scale. You will play a key role in driving our ambition to build an outstanding software engineering team, environment, and culture. We are looking for a Principal Engineer to join our tech community to drive this transformation, build a modern digital ecosystem using exciting technologies, and do the best work of their career.
Job Responsibilities:
Lead the design, development, and evolution of robust tooling and platforms to support scalable Data Science, MLOps, and LLMOps workflows across the organisation
Drive strategy and execution for deploying, serving, and monitoring large language models (LLMs) in real-time and batch environments using Amazon SageMaker, Bedrock, and related services
Guide the use of Infrastructure-as-Code (IaC) practices with AWS CDK and CloudFormation to provision and manage secure and maintainable cloud environments
Design and support CI/CD pipelines using GitHub Actions, AWS CodePipeline, Jenkins, and other tools, with an emphasis on reliability, reusability, and performance
Contribute to the design and integration of monitoring and observability solutions (CloudWatch, Prometheus, Grafana) to ensure infrastructure and model health
Champion software engineering excellence through Test-Driven Development (TDD), rigorous test automation, and continuous quality assurance practices
Support architectural decisions for scalable and maintainable systems, collaborating with engineering and product stakeholders to align with business and technical goals
Partner with architects, product leaders, and stakeholders to shape the long-term technical vision and system architecture
Apply and advocate for security best practices across the software development lifecycle using AWS-native tools and DevSecOps principles
Cultivate a high-performing engineering culture through mentorship, knowledge sharing, and thought leadership via deep dives, brown bags, internal tech talks, and cross-team collaboration
Requirements:
Extensive experience in DevOps/MLOps roles with demonstrated impact in building, scaling, and securing ML/AI infrastructure in cloud-native environments
Strong experience with AWS services such as SageMaker, Bedrock, S3, EC2, Lambda, IAM, VPC, and ECS/EKS, along with a solid command of cloud solution architecture
Advanced proficiency in Infrastructure-as-Code practices using AWS CDK, CloudFormation, or Terraform in production environments
Proven track record of designing and operationalising end-to-end MLOps pipelines with tools such as MLflow, SageMaker Pipelines, or equivalent frameworks
Extensive experience building and operating containerised applications using Docker and Kubernetes, including production-grade orchestration and monitoring
Deep experience with CI/CD best practices, including hands-on expertise in GitHub Actions, Jenkins, and GitOps workflows
Strong knowledge of advanced DevOps concepts, including progressive delivery strategies (blue/green, canary), resilience engineering, and performance optimisation
Deep understanding of cloud security, governance, and compliance, with the ability to define and implement scalable security frameworks
Proven ability to drive cross-functional technical initiatives, influence without authority, and deliver results through collaboration and alignment
In-depth understanding of the ML lifecycle, with practical experience deploying and managing LLMs and generative AI models in production