The ideal candidate will ensure the smooth operation, performance, and stability of large-scale distributed data processing jobs and applications deployed in an AWS environment.
Job Responsibilities:
Monitor data integration into the data lake; troubleshoot and resolve issues in real time
Investigate and debug data processing failures and performance bottlenecks
Maintain and support ETL/ELT pipelines built with technologies such as Spark, Scala, Hive, and AWS Glue
Ensure data quality, consistency, and availability across pipelines and storage systems such as S3, Redshift, MySQL, or Snowflake
Perform root cause analysis and identify and resolve any data discrepancies
Implement and monitor automated workflows using AWS tools
Analyze and optimize job performance by tuning Spark/Hive configurations and improving query efficiency (see the first sketch after this list)
Identify and address inefficiencies in data storage and access patterns
Set up and manage monitoring tools (e.g., AWS CloudWatch, Datadog, or Prometheus) to track system health and performance (see the second sketch after this list)
Develop alerting mechanisms and dashboards for proactive issue identification
Provide daily/weekly monitoring reports on job status and alert on any long-running or resource-intensive jobs
Collaborate with business users and development teams
Maintain comprehensive documentation (troubleshooting guides, operational workflows, and best practices)
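For illustration, a minimal Spark/Scala sketch of the kind of configuration tuning this role involves. The S3 paths, join key, and config values are hypothetical starting points, not recommendations for any specific workload.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object TuningSketch {
  def main(args: Array[String]): Unit = {
    // Common knobs for shuffle-heavy jobs; values are illustrative only.
    val spark = SparkSession.builder()
      .appName("tuning-sketch")
      .config("spark.sql.adaptive.enabled", "true")    // let AQE coalesce skewed shuffle partitions
      .config("spark.sql.shuffle.partitions", "400")   // size to cluster cores and data volume
      .config("spark.sql.autoBroadcastJoinThreshold", (50L * 1024 * 1024).toString) // broadcast tables under ~50 MB
      .getOrCreate()

    // Hypothetical inputs -- substitute real S3 locations.
    val events = spark.read.parquet("s3://example-bucket/events/")
    val users  = spark.read.parquet("s3://example-bucket/users/")

    // Broadcasting the small dimension table avoids a full shuffle join.
    val joined = events.join(broadcast(users), Seq("user_id"))

    // The physical plan is the first place to look when a job slows down.
    joined.explain()

    joined.write.mode("overwrite").parquet("s3://example-bucket/joined/")
    spark.stop()
  }
}
```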
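Likewise, for the monitoring and alerting items above, a sketch of creating a CloudWatch alarm with the AWS SDK for Java v2 from Scala. The namespace, metric name, and threshold are assumptions standing in for whatever custom metrics the pipelines actually publish.

```scala
import software.amazon.awssdk.services.cloudwatch.CloudWatchClient
import software.amazon.awssdk.services.cloudwatch.model.{ComparisonOperator, PutMetricAlarmRequest, Statistic}

object AlarmSketch {
  def main(args: Array[String]): Unit = {
    val cw = CloudWatchClient.create() // uses the default credential and region chain

    // "DataPlatform/Jobs" and "JobRuntimeMinutes" are hypothetical names for a
    // custom metric the jobs would emit; substitute your own.
    val request = PutMetricAlarmRequest.builder()
      .alarmName("long-running-etl-job")
      .namespace("DataPlatform/Jobs")
      .metricName("JobRuntimeMinutes")
      .statistic(Statistic.MAXIMUM)
      .period(300)                    // evaluate over 5-minute windows
      .evaluationPeriods(1)
      .threshold(120.0)               // flag jobs running past two hours
      .comparisonOperator(ComparisonOperator.GREATER_THAN_THRESHOLD)
      .alarmDescription("Flags ETL jobs running longer than expected")
      .build()

    cw.putMetricAlarm(request)
    cw.close()
  }
}
```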
Requirements:
Hands-on experience with Spark, Scala, Hive
Experience with Kafka, NiFi, and various Amazon Web Services (AWS) tools
Familiarity with data loading tools such as Talend
Familiarity with cloud databases such as Amazon Redshift, Aurora MySQL, and PostgreSQL
Knowledge of workflow schedulers such as Oozie
Strong knowledge of shell scripting, Python, or Java for scripting and automation
Familiarity with SQL and query optimization techniques (see the sketch after this list)
Experience in production support & operations management
Ability to analyze logs, diagnose issues, and implement fixes in high-pressure scenarios
5 to 15 years of total IT experience
Bachelor’s degree in Computer Science, Engineering, or a related field
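As a concrete illustration of the query-optimization familiarity listed above, a short Spark SQL sketch in Scala. The bucket, partition column, and field names are assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession

object QueryOptSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("query-opt-sketch").getOrCreate()

    // Hypothetical Parquet table partitioned by dt (a date string).
    val sales = spark.read.parquet("s3://example-bucket/sales/")

    // Filtering on the partition column lets Spark prune partitions, and
    // projecting only the needed columns lets Parquet skip the rest on disk.
    val recent = sales
      .filter("dt >= '2024-01-01'")
      .select("order_id", "amount")

    // "formatted" explain shows PartitionFilters/PushedFilters in the scan
    // node, confirming the optimizations actually applied.
    recent.explain("formatted")

    spark.stop()
  }
}
```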
Nice to have:
Knowledge of data governance, security, and compliance in cloud environments
Certifications in AWS (e.g., AWS Certified Big Data Specialty or AWS Certified Solutions Architect)