Senior ML Data Engineer

Poland, Warsaw · Job Posted December 23, 2025

Job Description

As a Senior Data Engineer, you will play a pivotal role in our AI/ML workstream, working closely with business teams and data scientists to design, maintain, and improve machine learning applications. Your main responsibilities will include managing existing ML workloads and building new batch and on-demand pipelines to support advanced AI/ML models. You’ll also contribute to developing Generative AI solutions and applications for the emerging Agentic Era. You’ll collaborate with a global team to create scalable data architectures optimised for AI/ML, source and prepare high-quality data, and implement robust ETL processes. You should be comfortable working independently while driving improvements in engineering standards and best practices. As a senior member of the team, you will mentor and advise junior engineers and take ownership as a project lead on strategic AI/ML initiatives.

Job Responsibility

  • Design and maintain scalable data pipelines and storage systems for both agentic and traditional ML workloads
  • Productionise LLM- and agent-based workflows, ensuring reliability, observability, and performance
  • Build and maintain feature stores, vector/embedding stores, and core data assets for ML
  • Develop and manage end-to-end traditional ML pipelines: data prep, training, validation, deployment, and monitoring
  • Implement data quality checks, drift detection, and automated retraining processes
  • Optimise cost, latency, and performance across all AI/ML infrastructure
  • Collaborate with data scientists and engineers to deliver production-ready ML and AI systems
  • Ensure AI/ML systems meet governance, security, and compliance requirements
  • Mentor teams and drive innovation across both agentic and classical ML engineering practices
  • Participate in team meetings and contribute to project planning and strategy discussions

Requirements

  • Bachelor’s or Master’s degree in data science, data engineering, or computer science with a focus on math and statistics (Master’s degree preferred)
  • At least 5 years of experience as an AI/ML data engineer performing the tasks and accountabilities above
  • Strong foundation in computer science principles and statistical methods
  • Strong experience with cloud technology (AWS or Azure)
  • Strong experience creating data ingestion pipelines and ETL processes
  • Strong knowledge of big data tools such as Spark, Databricks, and Python
  • Strong understanding of common machine learning techniques and frameworks (e.g. MLflow)
  • Strong knowledge of natural language processing (NLP) concepts
  • Strong knowledge of scrum practices and agile mindset
  • Strong analytical and problem-solving skills with attention to data quality and accuracy
  • Clear communication of technical concepts and effective collaboration across teams
  • Adaptability to new technologies and a proactive approach to learning and growth
  • Team-oriented mindset, working closely with data scientists, AI engineers, and cross-functional teams
  • Openness to feedback and collective problem-solving for continuous improvement
  • Team player, willing to improve yourself

What we offer

  • Flexi-Week and Work-Life Balance: We prioritise your mental health and well-being, offering you a flexible four-day Flexi-Week at full pay and with no reduction to your annual holiday allowance. We also offer a variety of different paid special leaves as well as volunteer days
  • Remote Working Allowance: You will receive a monthly allowance to cover part of your running costs. In addition, we will support you in setting up your remote workspace appropriately
  • Pension: Awin offers access to an additional pension insurance to all employees in Germany
  • Flexi-Office: We offer an international culture and flexibility through our Flexi-Office and hybrid/remote work possibilities to work across Awin regions
  • Development: We’ve built our extensive training suite Awin Academy to cover a wide range of skills that nurture you professionally and personally, with trainings conveniently packaged together to support your overall development
  • Appreciation: Thank and reward colleagues by sending them a voucher through our peer-to-peer program

Similar Jobs for Senior ML Data Engineer

8 matching positions

Senior ML Data Engineer

As a Senior Data Engineer, you will play a pivotal role in our AI/ML workstream,...
Salary: Not provided
Company: Awin Global
Expiration Date: Until further notice

Senior Software Engineer, ML Data Platform

DUTIES: Develop fast, robust, and spike-resistant data consumption, data mining...
Location: United States, Detroit
Salary: 216418.50 USD / Year
Company: General Motors
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Electronic Engineering, Management Information Systems, or related field of study and Five (5) years of experience as a Software Engineer, Programmer Analyst, or related occupation
  • Five (5) years of experience with: building petabyte (PB) scale data management systems
  • Optimizing those data processing clusters for cost efficiency and performance
  • Building serving systems capable of delivering data at high-throughput, low-latency and high QPS (Queries Per Second) in a cost-efficient and spike-resilient manner
  • Building scalable infrastructure on the cloud with Python, Java, or Scala
  • Writing SQL queries for analytic purposes.
Job Responsibility
  • Develop fast, robust, and spike-resistant data consumption, data mining, and processing tools for the entire company
  • Develop orchestration for large-scale post-processing, and computational pipelines
  • Participate in the development, optimization and productionization of the next generation data processing platform using Beam and Spark in the cloud
  • Build self-serve capabilities to help customers to adopt the next generation data processing platform
  • Use the latest cloud technologies to own, design, implement, and test scalable distributed data systems in the cloud
  • Champion engineering excellence by continuously improving systems and processes
  • Own technical projects from start to finish, contribute to the team’s product roadmap, and be responsible for major technical decisions and tradeoffs
  • Effectively participate in team’s planning, code reviews and design discussions
  • Consider the effects of projects across multiple teams and proactively manage conflicts
  • Work with partner teams and orgs to achieve cross-organizational goals and satisfy broad requirements
What we offer
  • An incentive pay program offers payouts based on company performance, job level, and individual performance
  • Fulltime

Senior Staff Data Engineer- ML & AI Platform

At Marktplaats, data is at the heart of everything we do, but Intelligence is wh...
Location: Netherlands, Amsterdam
Salary: Not provided
Company: Adevinta
Expiration Date: Until further notice
Requirements
  • 10+ years of experience with a specific focus on the intersection of Data Engineering, MLOps, and AI Infrastructure
  • Deep knowledge of Spark internals, structured streaming, and performance tuning for large-scale data processing
  • Proven experience architecting end-to-end ML platforms for Traditional ML (Classic MLOps) while actively enabling the organization on Generative AI concepts
  • Strong background in building automated pipelines and ensuring system observability
  • Practical experience building infrastructure for Large Language Models, including managing the complexity of chaining models and tools
  • Solid experience serving models at low latency and high concurrency using containerized solutions
  • Ability to speak the language of AI/ML Engineers and effectively bridge the gap between experimental code and production systems
  • Expert level Python
  • Experience with PyTorch, Terraform, Terragrunt, Docker, Kubernetes, GitHub Actions, Datadog
  • Experience with Databricks AI Stack: MLflow, Mosaic AI, Unity Catalog, Feature Store, Databricks Model Serving, Vector Databases
Job Responsibility
  • Lead the evolution of our Machine Learning & AI Platform, designing the architecture for AI Agents and establishing patterns for Vector Databases
  • Act as a first mover: validate new Databricks features and integrate them into the platform
  • Write the guidelines for GenAI development, helping teams transition from notebook experiments to production-grade LLM applications
  • Design the Feature Store, manage the Model Registry, and set up the infrastructure for Vector Search and RAG (Retrieval Augmented Generation) workflows
  • Elevate the technical bar of the team, mentoring Staff and Senior engineers on design patterns, code quality, and architectural decisions
  • Translate complex requirements from ML Engineers and Data Scientists into robust engineering tickets and infrastructure roadmaps
What we offer
  • An attractive Base Salary
  • Participation in our Short Term Incentive plan (annual bonus)
  • Work From Anywhere: Enjoy up to 20 days a year of working from anywhere
  • A 24/7 Employee Assistance Program for you and your family
  • Fulltime

Senior Platform Engineer, ML Data Systems

We’re looking for an ML Data Engineer to evolve our eval dataset tools to meet t...
Location: United States, Mountain View
Salary: 137871.00 - 172339.00 USD / Year
Company: Khan Academy
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
  • 5 years of Software Engineering experience with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face
  • Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
  • Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems
  • Familiarity with machine learning workflows — from training data preparation to evaluation
  • Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations
  • Attention to detail and an obsession with data quality and reproducibility
  • Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
  • Proven cross-cultural competency skills demonstrating self-awareness, awareness of other, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.
Job Responsibility
  • Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets
  • Clean, normalize, and enrich data while preserving semantic meaning and consistency
  • Prepare and format datasets for human labeling, and integrate results into ML datasets
  • Develop and maintain scalable ETL pipelines using Airflow, DBT, Go, and Python running on GCP
  • Implement automated tests and validation to detect data drift or labeling inconsistencies
  • Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring
  • Contribute to shared tools and documentation for dataset management and AI evaluation
  • Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.
What we offer
  • Competitive salaries
  • Ample paid time off as needed
  • 8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
  • Remote-first culture - that caters to your time zone, with open flexibility as needed, at times
  • Generous parental leave
  • An exceptional team that trusts you and gives you the freedom to do your best
  • The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
  • Opportunities to connect through affinity, ally, and social groups
  • 401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life.
  • Fulltime

Senior Staff Data Engineer- ML & AI Platform

At Marktplaats, data is at the heart of everything we do, but Intelligence is wh...
Location: Netherlands, Amsterdam
Salary: Not provided
Company: Adevinta
Expiration Date: Until further notice
Requirements
  • 10+ years of experience with a specific focus on the intersection of Data Engineering, MLOps, and AI Infrastructure
  • Deep knowledge of Spark internals, structured streaming, and performance tuning for large-scale data processing
  • Proven experience architecting end-to-end ML platforms for Traditional ML (Classic MLOps) while actively enabling the organization on Generative AI concepts
  • Strong background in building automated pipelines and ensuring system observability
  • Practical experience building infrastructure for Large Language Models, including managing the complexity of chaining models and tools
  • Solid experience serving models at low latency and high concurrency using containerized solutions
  • Ability to speak the language of AI/ML Engineers and effectively bridge the gap between experimental code and production systems
Job Responsibility
  • Lead the evolution of our Machine Learning & AI Platform, designing the architecture for AI Agents and establishing patterns for Vector Databases
  • Act as a first mover, validate new Databricks features and integrate them into the platform
  • Write the guidelines for GenAI development, helping teams transition from notebook experiments to production-grade LLM applications
  • Design the Feature Store, manage the Model Registry, and set up the infrastructure for Vector Search and RAG (Retrieval Augmented Generation) workflows
  • Elevate the technical bar of the team, mentoring Staff and Senior engineers on design patterns, code quality, and architectural decisions
  • Translate complex requirements from ML Engineers and Data Scientists into robust engineering tickets and infrastructure roadmaps
What we offer
  • An attractive Base Salary
  • Participation in our Short Term Incentive plan (annual bonus)
  • Work From Anywhere: Enjoy up to 20 days a year of working from anywhere
  • A 24/7 Employee Assistance Program for you and your family
  • A collaborative environment with an opportunity to explore your potential and grow
  • Fulltime

Senior Data Scientist / ML Engineer

We have a dream: to change industries through the power of digital technology. W...
Location: Bulgaria; Colombia; Croatia; Poland; Portugal; Spain; Ukraine
Salary: Not provided
Company: Intellias
Expiration Date: Until further notice
Requirements
  • Education: Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, Data Science, or a closely related quantitative field
  • Experience: 5+ years of professional experience in machine learning engineering, AI development, or a closely related role
  • Machine Learning & Statistics: Solid understanding of classical ML algorithms (e.g., tree-based models, SVMs, clustering, ensemble methods), feature engineering, model evaluation metrics, and statistical methods (hypothesis testing, regression analysis, probability distributions)
  • LLM Expertise: Demonstrated project experience with large language models, including prompt engineering and prompt management strategies, LLM application development (end-to-end), fine-tuning of large language models, retrieval-augmented generation (RAG) pipeline design and implementation, practical experience with vector stores such as ChromaDB, pgvector, and PostgreSQL
  • AI Agents: Hands-on experience building AI agents and multi-agent systems using frameworks such as LangChain, LangGraph, CrewAI, or similar orchestration frameworks
  • Programming: Proficiency in Python with a strong emphasis on writing clean, maintainable, production-quality code. Familiarity with software engineering best practices (testing, code review, documentation)
  • Cloud: Practical experience with Google Cloud Platform (GCP) services for ML workloads (e.g., Vertex AI, Cloud Run, GCS, BigQuery, Compute Engine)
  • DevOps & MLOps: Docker: Proficiency in containerization — building, managing, and deploying Docker images and containers. GitLab: Proficient GitLab skills for version control, merge request workflows, and repository management
  • API Development: Experience with FastAPI, including request validation, async handling, and integration with ML model serving
  • Soft Skills: Excellent communication skills, Strong work ethic and high personal accountability, Ownership mentality — takes full responsibility for deliverables and outcomes, Proactive, self-starting approach to identifying problems and driving project success without waiting for direction
Job Responsibility
  • Drive or participate in the ideation, development, and execution of POCs and AI-related projects
  • Develop and implement machine learning models, algorithms, and data-driven solutions to address complex business problems
  • Collaborate cross-functionally with engineering, product management, and other relevant teams to integrate data-driven functionalities into our products
  • Fulltime

Senior Speech & Audio Biomarkers ML Engineer / Data Scientist / LLM Researcher

Adalyon is transforming clinical trials with a behavioural-intelligence platform...
Location: Finland
Salary: Not provided
Company: Life Science Talent
Expiration Date: Until further notice
Requirements
  • Advanced degree PhD, postdoctoral experience, or equivalent research depth in speech technology, audio signal processing, acoustics, machine learning, data science, computational linguistics, or a related field
  • Audio and NLP experience – You have built systems that process raw audio and transcripts to derive actionable insights. Familiarity with prosodic and spectral features, and the ability to engineer features like jitter, shimmer and harmonic-to-noise ratio, which have been shown to correlate with cognitive and emotional conditions
  • Speech processing toolkits: Experience with speech processing toolkits (e.g., librosa, Kaldi, Praat) and ML frameworks (PyTorch, TensorFlow, scikit-learn) is essential
  • LLM expertise – Hands-on experience with large language models, including prompting, fine-tuning and integrating them into downstream ML pipelines. Ability to interpret and control LLM outputs to ensure transparency and reproducibility, avoiding the unpredictable behaviour of generic LLMs
  • Startup mindset – Comfortable working in an agile, evolving environment. You take initiative, think creatively and can operate with limited structure. You thrive when delivering an MVP while planning for scalable solutions
  • Practical programming ability, ideally in Python and relevant scientific/data tooling. You do not need to be a software engineer, but you must be able to build the systems and pipelines needed for your research.
Job Responsibility
  • Conversational design & data pipeline
  • Signal processing & feature extraction
  • Model development & integration
  • Validation & evidence generation
  • Research & innovation
What we offer
  • A competitive salary package that reflects your experience and the value you create
  • The opportunity to work with advanced AI, acoustic analysis, and speech-based biomarker technology at an early stage
  • A central and highly influential role with direct access to research and technology leadership
  • High autonomy, high visibility, and the opportunity to shape the scientific foundation of a growing company
  • A dynamic and flexible startup environment with room for deep technical discussion, scientific exploration, and practical impact
  • Fulltime

Senior Engineer - Data Engineer

Position - Data Engineer
Location: India, Indore; Ahmedabad
Salary: Not provided
Company: Arrow Electronics
Expiration Date: Until further notice
Requirements
  • Experience: 4 to 8 years in software/data engineering
  • Proficiency in SQL, NoSQL databases (e.g., DynamoDB, MongoDB), ETL tools, and data warehousing solutions
  • Proficiency in Python is a must
  • Cloud Platforms: Azure, AWS (e.g., EC2, S3, RDS) or GCP
  • Experience with data visualization tools (e.g., Tableau, Power BI, Looker)
  • Knowledge of data governance and security practices
  • Experience with DevOps practices, including CI/CD pipelines and containerization (Docker, Kubernetes)
  • Excellent verbal and written communication skills in English
  • Experience working in Agile development environments
  • Understanding of AI and ML concepts, frameworks (e.g., TensorFlow, PyTorch), and practical applications
Job Responsibility
  • Design and develop real-time software and cloud, web, and mobile software applications
  • Analyze domain-specific technical requirements, high-level or low-level, and modify them as per end-customer or system requirements
  • Perform software testing, manual and automated, at the unit, functional, and system levels
  • Perform code reviews following coding guidelines and static code analysis, and troubleshoot software problems of limited difficulty
  • Document technical deliverables such as software specifications, design documents, code comments, test cases, test reports, and release notes throughout the project life cycle
  • Develop software solutions using established programming languages or by learning new languages required for specific projects
  • Fulltime