We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. As a Data Engineer, you will collaborate closely with our Data Scientists to develop and deploy machine learning models. Responsibilities include working with PySpark, AWS EMR, and S3 for data processing, designing machine learning pipelines, optimizing pipelines for performance, and managing ETL workflows using StreamSets.
Job Responsibilities:
Work in tandem with Data Scientists to design, develop, and implement machine learning pipelines
Utilize PySpark for data processing, transformation, and preparation for model training
Leverage AWS EMR and S3 for scalable and efficient data storage and processing
Implement and manage ETL workflows using StreamSets for data ingestion and transformation
Design and construct pipelines to deliver high-quality training and inference datasets
Collaborate with cross-functional teams to ensure smooth deployment and real-time/near real-time inferencing capabilities
Optimize and fine-tune pipelines for performance, scalability, and reliability
Ensure IAM policies and permissions are appropriately configured for secure data access and management
Implement Spark architecture and optimize Spark jobs for scalable data processing
Requirements:
Proficiency in advanced SQL (window functions), Spark architecture, PySpark or Scala with Spark, and Hadoop
Proven expertise in designing and deploying data pipelines
Strong problem-solving skills and ability to work effectively in a collaborative team environment
Excellent communication skills and ability to translate technical concepts to non-technical stakeholders
Nice to have:
Hands-on experience with Airflow, S3, and StreamSets or similar ETL tools
Understanding of real-time or near real-time inferencing architectures
Basic knowledge of Kafka, AWS IAM, AWS EMR, and Snowflake
What we offer:
All positions are open to people with disabilities
Commitment to fighting against all forms of discrimination