Design, develop, and execute data pipelines and test cases to ensure data integrity and quality
Develop, implement, and optimize data pipelines that integrate Amazon S3 for scalable data storage, retrieval, and processing within ETL workflows
Leverage Amazon S3 for data storage, retrieval, and management within ETL workflows, including writing scripts to transfer data between S3 and other systems
Utilize Amazon S3's advanced features such as versioning, lifecycle policies, access controls, and server-side encryption to ensure secure and efficient data management
Write, maintain, and troubleshoot scripts or code (using PySpark, Shell, or similar languages) to automate data movement between Amazon S3 and other platforms, ensuring high performance and reliability
Collaborate with cross-functional teams to troubleshoot and resolve data-related issues, utilizing Amazon S3 features such as versioning, lifecycle policies, and access management
Document ETL processes, maintain technical documentation, and ensure best practices are followed for data stored in Amazon S3 environments
Validate HiveQL, HDFS file structures, and data processing within the Hadoop cluster
Knowledge of metadata-dependent ETL processes and batch/job frameworks