Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional and non-functional requirements.
Design the right schema to support the functional requirements and consumption patterns.
Design and build production data pipelines from ingestion to consumption.
Create the necessary preprocessing and postprocessing for various forms of data for training/retraining and inference ingestion, as required.
Create data visualization and business intelligence tools for stakeholders and data scientists to deliver the necessary business and solution insights.
Identify, design, and implement internal process improvements: automating manual data processes, optimizing data delivery, etc.
Ensure our data is separated and secure across national boundaries through multiple data centers.
Requirements:
You should have a bachelor's or master's degree in Computer Science, Information Technology, or another quantitative field.
You should have at least 8 years of experience as a data engineer supporting large data transformation initiatives related to machine learning, with experience in building and optimizing pipelines and data sets.
Strong analytical skills related to working with unstructured datasets.
Experience with Azure cloud services: ADF, ADLS, HDInsight, Databricks, App Insights, etc.
Experience in handling ETL workloads using Spark.
Experience with object-oriented/functional scripting languages: Python, PySpark, etc.
Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
You should be a good team player, committed to the success of the team and the overall project.