We are looking for a highly motivated expert Data Engineer to design and develop advanced data pipelines and solutions for our Manufacturing Applications Product Team. The ideal candidate will be responsible for designing, developing, and optimizing data pipelines, data integration frameworks, and metadata-driven architectures that enable seamless data access and analytics for Manufacturing and Operations use cases. This role requires deep expertise in big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management.
Job Responsibilities
Design, develop, and maintain complex ETL/ELT data pipelines in Databricks using PySpark, Scala, and SQL to process large-scale datasets
Build highly efficient data pipelines to migrate and deploy complex data across systems, with an understanding of biotech/pharma/manufacturing or related domains
Design and implement solutions to enable unified data access, governance, and interoperability across hybrid cloud environments
Ingest and transform structured and unstructured data from databases (PostgreSQL, MySQL, SQL Server, MongoDB, etc.), APIs, logs, event streams, images, PDFs, and third-party platforms
Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring
Explore and implement new tools and technologies to make data processing more efficient
Proactively identify and implement opportunities to automate tasks and develop reusable frameworks
Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value
Use Jira, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories
Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle
Collaborate and communicate effectively with product teams and cross-functional teams to understand business requirements and translate them into technical solutions
Requirements
Hands-on experience with data engineering technologies such as Databricks, PySpark, Spark SQL, Apache Spark, AWS, Python, and SQL, along with Scaled Agile methodologies
Proficiency in workflow orchestration and in performance tuning of big data processing workloads
Strong understanding of AWS services
Ability to quickly learn, adapt, and apply new technologies
Strong problem-solving and analytical skills
Excellent communication and teamwork skills
Experience with Scaled Agile Framework (SAFe), Agile delivery practices, and DevOps practices
Experience with streaming technologies such as Apache Kafka, Debezium, or similar platforms for real-time data processing and integration
Master’s or Bachelor’s degree and 5-8 years of experience in Computer Science, IT, or a related field
Nice to have
Experience with AI-assisted code development using tools such as GitHub Copilot, Cursor, or Claude Code
Data engineering experience in biotechnology or pharma industry
Experience in writing APIs to make data available to consumers
Experience with SQL/NoSQL databases and with vector databases for large language models
Experience with data modeling and performance tuning for both OLAP and OLTP databases
Experience with software engineering best practices, including version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps
Experience with manufacturing-related data sources such as SCADA systems and data historians