Senior ML Data Engineer Job at Awin Global (Warsaw)

Senior ML Data Engineer

As a Senior Data Engineer, you will play a pivotal role in our AI/ML workstream,...

Location

Salary:

Not provided

Awin Global

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in data science, data engineering, or computer science with a focus on math and statistics (Master’s degree preferred)
At least 5 years of experience as an AI/ML data engineer undertaking the above tasks and accountabilities
Strong foundation in computer science principles and statistical methods
Strong experience with cloud technology (AWS or Azure)
Strong experience creating data ingestion pipelines and ETL processes
Strong knowledge of big data tools such as Spark, Databricks and Python
Strong understanding of common machine learning techniques and frameworks (e.g. MLflow)
Strong knowledge of natural language processing (NLP) concepts
Strong knowledge of Scrum practices and an agile mindset

Job Responsibility

Design and maintain scalable data pipelines and storage systems for both agentic and traditional ML workloads
Productionise LLM- and agent-based workflows, ensuring reliability, observability, and performance
Build and maintain feature stores, vector/embedding stores, and core data assets for ML
Develop and manage end-to-end traditional ML pipelines: data prep, training, validation, deployment, and monitoring
Implement data quality checks, drift detection, and automated retraining processes
Optimise cost, latency, and performance across all AI/ML infrastructure
Collaborate with data scientists and engineers to deliver production-ready ML and AI systems
Ensure AI/ML systems meet governance, security, and compliance requirements
Mentor teams and drive innovation across both agentic and classical ML engineering practices
Participate in team meetings and contribute to project planning and strategy discussions

What we offer

Flexi-Week and Work-Life Balance: We prioritise your mental health and well-being, offering you a flexible four-day Flexi-Week at full pay and with no reduction to your annual holiday allowance. We also offer a variety of paid special leave as well as volunteer days
Remote Working Allowance: You will receive a monthly allowance to cover part of your running costs. In addition, we will support you in setting up your remote workspace appropriately
Pension: Awin offers access to an additional pension insurance to all employees in Germany
Flexi-Office: We offer an international culture and flexibility through our Flexi-Office and hybrid/remote work possibilities to work across Awin regions
Development: We’ve built our extensive training suite Awin Academy to cover a wide range of skills that nurture you professionally and personally, with trainings conveniently packaged together to support your overall development
Appreciation: Thank and reward colleagues by sending them a voucher through our peer-to-peer program

Senior Software Engineer, ML Data Platform

DUTIES: Develop fast, robust, and spike-resistant data consumption, data mining...

Location

United States, Detroit

Salary:

216418.50 USD / Year

General Motors

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, Electronic Engineering, Management Information Systems, or related field of study and five (5) years of experience as a Software Engineer, Programmer Analyst, or related occupation
Five (5) years of experience with: Building petabyte (PB) scale data management systems
Optimizing those data processing clusters for cost efficiency and performance
Building serving systems capable of delivering data at high-throughput, low-latency and high QPS (Queries Per Second) in a cost-efficient and spike-resilient manner
Building scalable infrastructure on the cloud with Python, Java, or Scala
Writing SQL queries for analytic purposes.

Job Responsibility

Develop fast, robust, and spike-resistant data consumption, data mining, and processing tools for the entire company
Develop orchestration for large-scale post-processing, and computational pipelines
Participate in the development, optimization and productionization of the next generation data processing platform using Beam and Spark in the cloud
Build self-serve capabilities to help customers to adopt the next generation data processing platform
Use the latest cloud technologies to own, design, implement, and test scalable distributed data systems in the cloud
Champion engineering excellence by continuously improving systems and processes
Own technical projects from start to finish, contribute to the team’s product roadmap, and be responsible for major technical decisions and tradeoffs
Effectively participate in the team’s planning, code reviews and design discussions
Consider the effects of projects across multiple teams and proactively manage conflicts
Work with partner teams and orgs to achieve cross-organizational goals and satisfy broad requirements

What we offer

An incentive pay program offers payouts based on company performance, job level, and individual performance

Fulltime

Senior Staff Data Engineer- ML & AI Platform

At Marktplaats, data is at the heart of everything we do, but Intelligence is wh...

Location

Netherlands, Amsterdam

Salary:

Not provided

Adevinta

Expiration Date

Until further notice

Requirements

10+ years of experience with a specific focus on the intersection of Data Engineering, MLOps, and AI Infrastructure
Deep knowledge of Spark internals, structured streaming, and performance tuning for large-scale data processing
Proven experience architecting end-to-end ML platforms for Traditional ML (Classic MLOps) while actively enabling the organization on Generative AI concepts
Strong background in building automated pipelines and ensuring system observability
Practical experience building infrastructure for Large Language Models, including managing the complexity of chaining models and tools
Solid experience serving models at low latency and high concurrency using containerized solutions
Ability to speak the language of AI/ML Engineers and effectively bridge the gap between experimental code and production systems
Expert level Python
Experience with PyTorch, Terraform, Terragrunt, Docker, Kubernetes, GitHub Actions, Datadog
Experience with Databricks AI Stack: MLflow, Mosaic AI, Unity Catalog, Feature Store, Databricks Model Serving, Vector Databases

Job Responsibility

Lead the evolution of our Machine Learning & AI Platform, designing the architecture for AI Agents and establishing patterns for Vector Databases
Act as a first mover: validate new Databricks features and integrate them into the platform
Write the guidelines for GenAI development, helping teams transition from notebook experiments to production-grade LLM applications
Design the Feature Store, manage the Model Registry, and set up the infrastructure for Vector Search and RAG (Retrieval Augmented Generation) workflows
Elevate the technical bar of the team, mentoring Staff and Senior engineers on design patterns, code quality, and architectural decisions
Translate complex requirements from ML Engineers and Data Scientists into robust engineering tickets and infrastructure roadmaps

What we offer

An attractive Base Salary
Participation in our Short Term Incentive plan (annual bonus)
Work From Anywhere: Enjoy up to 20 days a year of working from anywhere
A 24/7 Employee Assistance Program for you and your family

Fulltime

Senior Platform Engineer, ML Data Systems

We’re looking for an ML Data Engineer to evolve our eval dataset tools to meet t...

Location

United States, Mountain View

Salary:

137871.00 - 172339.00 USD / Year

Khan Academy

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
5 years of Software Engineering experience with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face
Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems
Familiarity with machine learning workflows — from training data preparation to evaluation
Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations
Attention to detail and an obsession with data quality and reproducibility
Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
Proven cross-cultural competency skills demonstrating self-awareness, awareness of others, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.

Job Responsibility

Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets
Clean, normalize, and enrich data while preserving semantic meaning and consistency
Prepare and format datasets for human labeling, and integrate results into ML datasets
Develop and maintain scalable ETL pipelines using Airflow, dbt, Go, and Python running on GCP
Implement automated tests and validation to detect data drift or labeling inconsistencies
Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring
Contribute to shared tools and documentation for dataset management and AI evaluation
Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.

What we offer

Competitive salaries
Ample paid time off as needed
8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
Remote-first culture - that caters to your time zone, with open flexibility as needed, at times
Generous parental leave
An exceptional team that trusts you and gives you the freedom to do your best
The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
Opportunities to connect through affinity, ally, and social groups
401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life.

Fulltime

Senior Staff Data Engineer- ML & AI Platform

At Marktplaats, data is at the heart of everything we do, but Intelligence is wh...

Location

Netherlands, Amsterdam

Salary:

Not provided

Adevinta

Expiration Date

Until further notice

Requirements

10+ years of experience with a specific focus on the intersection of Data Engineering, MLOps, and AI Infrastructure
Deep knowledge of Spark internals, structured streaming, and performance tuning for large-scale data processing
Proven experience architecting end-to-end ML platforms for Traditional ML (Classic MLOps) while actively enabling the organization on Generative AI concepts
Strong background in building automated pipelines and ensuring system observability
Practical experience building infrastructure for Large Language Models, including managing the complexity of chaining models and tools
Solid experience serving models at low latency and high concurrency using containerized solutions
Ability to speak the language of AI/ML Engineers and effectively bridge the gap between experimental code and production systems

Job Responsibility

Lead the evolution of our Machine Learning & AI Platform, designing the architecture for AI Agents and establishing patterns for Vector Databases
Act as a first mover, validate new Databricks features and integrate them into the platform
Write the guidelines for GenAI development, helping teams transition from notebook experiments to production-grade LLM applications
Design the Feature Store, manage the Model Registry, and set up the infrastructure for Vector Search and RAG (Retrieval Augmented Generation) workflows
Elevate the technical bar of the team, mentoring Staff and Senior engineers on design patterns, code quality, and architectural decisions
Translate complex requirements from ML Engineers and Data Scientists into robust engineering tickets and infrastructure roadmaps

What we offer

An attractive Base Salary
Participation in our Short Term Incentive plan (annual bonus)
Work From Anywhere: Enjoy up to 20 days a year of working from anywhere
A 24/7 Employee Assistance Program for you and your family
A collaborative environment with an opportunity to explore your potential and grow

Fulltime

Senior Data Scientist / ML Engineer

We have a dream: to change industries through the power of digital technology. W...

Location

Bulgaria; Colombia; Croatia; Poland; Portugal; Spain; Ukraine

Salary:

Not provided

Intellias

Expiration Date

Until further notice

Requirements

Education: Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, Data Science, or a closely related quantitative field
Experience: 5+ years of professional experience in machine learning engineering, AI development, or a closely related role
Machine Learning & Statistics: Solid understanding of classical ML algorithms (e.g., tree-based models, SVMs, clustering, ensemble methods), feature engineering, model evaluation metrics, and statistical methods (hypothesis testing, regression analysis, probability distributions)
LLM Expertise: Demonstrated project experience with large language models, including prompt engineering and prompt management strategies, LLM application development (end-to-end), fine-tuning of large language models, retrieval-augmented generation (RAG) pipeline design and implementation, practical experience with vector stores such as ChromaDB, pgvector, and PostgreSQL
AI Agents: Hands-on experience building AI agents and multi-agent systems using frameworks such as LangChain, LangGraph, CrewAI, or similar orchestration frameworks
Programming: Proficiency in Python with a strong emphasis on writing clean, maintainable, production-quality code. Familiarity with software engineering best practices (testing, code review, documentation)
Cloud: Practical experience with Google Cloud Platform (GCP) services for ML workloads (e.g., Vertex AI, Cloud Run, GCS, BigQuery, Compute Engine)
DevOps & MLOps: Docker: Proficiency in containerization — building, managing, and deploying Docker images and containers. GitLab: Proficient GitLab skills for version control, merge request workflows, and repository management
API Development: Experience with FastAPI, including request validation, async handling, and integration with ML model serving
Soft Skills: Excellent communication skills, Strong work ethic and high personal accountability, Ownership mentality — takes full responsibility for deliverables and outcomes, Proactive, self-starting approach to identifying problems and driving project success without waiting for direction

Job Responsibility

Drive or participate in the ideation, development, and execution of POCs and AI-related projects
Develop and implement machine learning models, algorithms, and data-driven solutions to address complex business problems
Collaborate cross-functionally with engineering, product management, and other relevant teams to integrate data-driven functionalities into our products

Fulltime

Senior Speech & Audio Biomarkers ML Engineer / Data Scientist / LLM Researcher

Adalyon is transforming clinical trials with a behavioural-intelligence platform...

Location

Finland

Salary:

Not provided

Life Science Talent

Expiration Date

Until further notice

Requirements

Advanced degree (PhD), postdoctoral experience, or equivalent research depth in speech technology, audio signal processing, acoustics, machine learning, data science, computational linguistics, or a related field
Audio and NLP experience – You have built systems that process raw audio and transcripts to derive actionable insights. Familiarity with prosodic and spectral features, and the ability to engineer features like jitter, shimmer and harmonic-to-noise ratio, which have been shown to correlate with cognitive and emotional conditions
Speech processing toolkits: Experience with speech processing toolkits (e.g., librosa, Kaldi, Praat) and ML frameworks (PyTorch, TensorFlow, scikit-learn) is essential
LLM expertise – Hands-on experience with large language models, including prompting, fine-tuning and integrating them into downstream ML pipelines. Ability to interpret and control LLM outputs to ensure transparency and reproducibility, avoiding the unpredictable behaviour of generic LLMs
Startup mindset – Comfortable working in an agile, evolving environment. You take initiative, think creatively and can operate with limited structure. You thrive when delivering an MVP while planning for scalable solutions
Practical programming ability, ideally in Python and relevant scientific/data tooling. You do not need to be a software engineer, but you must be able to build the systems and pipelines needed for your research.

Job Responsibility

Conversational design & data pipeline
Signal processing & feature extraction
Model development & integration
Validation & evidence generation
Research & innovation

What we offer

A competitive salary package that reflects your experience and the value you create
The opportunity to work with advanced AI, acoustic analysis, and speech-based biomarker technology at an early stage
A central and highly influential role with direct access to research and technology leadership
High autonomy, high visibility, and the opportunity to shape the scientific foundation of a growing company
A dynamic and flexible startup environment with room for deep technical discussion, scientific exploration, and practical impact

Fulltime

Senior Engineer - Data Engineer

Position - Data Engineer

Location

India, Indore; Ahmedabad

Salary:

Not provided

Arrow Electronics

Expiration Date

Until further notice

Requirements

Experience: 4 to 8 years in software/data engineering
Proficiency in SQL, NoSQL databases (e.g., DynamoDB, MongoDB), ETL tools, and data warehousing solutions
Proficiency in Python is a must
Cloud Platforms: Azure, AWS (e.g., EC2, S3, RDS) or GCP
Experience with data visualization tools (e.g., Tableau, Power BI, Looker)
Knowledge of data governance and security practices
Experience with DevOps practices, including CI/CD pipelines and containerization (Docker, Kubernetes)
Excellent verbal and written communication skills in English
Experience working in Agile development environments
Understanding of AI and ML concepts, frameworks (e.g., TensorFlow, PyTorch), and practical applications

Job Responsibility

Design and develop real-time software and cloud-, web-, and mobile-based software applications
Analyze domain-specific technical requirements, both high-level and low-level, and modify them as per end customer or system requirements
Perform software testing, including unit, functional, and system-level testing, both manual and automated
Perform code reviews following coding guidelines and static code analysis, and troubleshoot software problems of limited difficulty
Document technical deliverables such as software specifications, design documents, code comments, test cases, test reports, and release notes throughout the project life cycle
Develop software solutions using established programming languages or by learning new languages required for specific projects

Fulltime
