ML / AI Data Engineer

Tech Holding

Location:
India

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are looking for a highly skilled Senior ML / Data Pipeline Engineer who can translate complex machine learning and multimodal concepts into scalable, production-ready pipelines and workflows. This role focuses on building and optimising large-scale video and multimodal data systems, enabling high-throughput ingestion, processing, and model training across distributed cloud environments.

Job Responsibility:

  • Design, deploy, and scale large-scale ML and data processing pipelines across cloud infrastructure
  • Build systems to ingest, process, and serve 250,000+ hours of multimodal data (video, audio, metadata)
  • Architect and optimize GPU-based compute environments (e.g., NVIDIA Tesla clusters) for distributed training and inference
  • Develop high-throughput backend systems for video ingestion from desktop and mobile platforms
  • Implement distributed processing workflows, including job scheduling, fault tolerance, and resource allocation
  • Design and build human-in-the-loop and automated annotation systems to ensure data quality and scalability
  • Translate ML and multimodal research into scalable, production-grade cloud architectures
  • Optimize pipelines for performance, reliability, and cost efficiency across compute, storage, and networking layers
  • Collaborate with ML, data, and engineering teams to deliver end-to-end data workflows

Requirements:

  • 5+ years of experience in data engineering, ML pipelines, or distributed systems
  • Strong experience building scalable data pipelines for large datasets (video/audio preferred)
  • Hands-on experience with cloud platforms (AWS, Azure, or GCP)
  • Experience working with GPU-based environments and distributed computing
  • Strong programming skills in Python, Scala, or similar languages
  • Experience with data processing frameworks (Spark, Ray, Kafka, Airflow, or similar)
  • Understanding of ML workflows, training pipelines, and inference systems
  • Experience designing fault-tolerant, high-availability systems
  • Strong knowledge of data storage systems (data lakes, object storage, distributed file systems)
  • Ability to handle high-throughput, large-scale data ingestion and processing

Nice to have:

  • Experience with multimodal AI (video, audio, NLP) systems
  • Familiarity with annotation tools and data labeling workflows
  • Experience with containerization and orchestration (Docker, Kubernetes)
  • Knowledge of cost optimization strategies for large-scale cloud workloads

Additional Information:

Job Posted:
May 16, 2026

Work Type:
Remote work

Similar Jobs for ML / AI Data Engineer

Principal Consulting AI / Data Engineer

As a Principal Consulting AI / Data Engineer, you will design, build, and optimi...
Location:
Australia, Sydney
Salary:
Not provided
DyFlex Solutions
Expiration Date:
Until further notice
Requirements:
  • Proven expertise in delivering enterprise-grade data engineering and AI solutions in production environments
  • Strong proficiency in Python and SQL, plus experience with Spark, Airflow, dbt, Kafka, or Flink
  • Experience with cloud platforms (AWS, Azure, or GCP) and Databricks
  • Ability to confidently communicate and present at C-suite level, simplifying technical concepts into business impact
  • Track record of engaging senior executives and influencing strategic decisions
  • Strong consulting and stakeholder management skills with client-facing experience
  • Background in MLOps, ML pipelines, or AI solution delivery highly regarded
  • Degree in Computer Science, Engineering, Data Science, Mathematics, or a related field
Job Responsibility:
  • Design, build, and maintain scalable data and AI solutions using Databricks, cloud platforms, and modern frameworks
  • Lead solution architecture discussions with clients, ensuring alignment of technical delivery with business strategy
  • Present to and influence executive-level stakeholders, including boards, C-suite, and senior directors
  • Translate highly technical solutions into clear business value propositions for non-technical audiences
  • Mentor and guide teams of engineers and consultants to deliver high-quality solutions
  • Champion best practices across data engineering, MLOps, and cloud delivery
  • Build DyFlex’s reputation as a trusted partner in Data & AI through thought leadership and client advocacy
What we offer:
  • Work with SAP’s latest cloud technologies, such as S/4HANA, BTP and Joule, plus Databricks, ML/AI tools and cloud platforms
  • A flexible and supportive work environment including work from home
  • Competitive remuneration and benefits including novated lease, birthday leave, salary packaging, wellbeing programme, additional purchased leave, and company-provided laptop
  • Comprehensive training budget and paid certifications (Databricks, SAP, cloud platforms)
  • Structured career advancement pathways with opportunities to lead large-scale client programs
  • Exposure to diverse industries and client environments, including executive-level engagement
  • Fulltime

Consulting AI / Data Engineer

As a Consulting AI / Data Engineer, you will design, build, and optimise enterpr...
Location:
Australia, Sydney
Salary:
Not provided
DyFlex Solutions
Expiration Date:
Until further notice
Requirements:
  • Hands-on data engineering experience in production environments
  • Strong proficiency in Python and SQL
  • Experience with at least one additional language (e.g. Java, TypeScript/JavaScript)
  • Experience with modern frameworks such as Apache Spark, Airflow, dbt, Kafka, or Flink
  • Background in building ML pipelines, MLOps practices, or feature stores is highly valued
  • Proven expertise in relational databases, data modelling, and query optimisation
  • Demonstrated ability to solve complex technical problems independently
  • Excellent communication skills with ability to engage clients and stakeholders
  • Degree in Computer Science, Engineering, Data Science, Mathematics, or a related field
Job Responsibility:
  • Build and maintain scalable data pipelines for ingesting, transforming, and delivering data
  • Manage and optimise databases, warehouses, and cloud storage solutions
  • Implement data quality frameworks and testing processes to ensure reliable systems
  • Design and deliver cloud-based solutions (AWS, Azure, or GCP)
  • Take technical ownership of project components and lead small development teams
  • Engage directly with clients, translating business requirements into technical solutions
  • Champion best practices including version control, CI/CD, and infrastructure as code
What we offer:
  • Work with SAP’s latest cloud technologies, such as S/4HANA, BTP and Joule, plus Databricks, ML/AI tools and cloud platforms
  • A flexible and supportive work environment including work from home
  • Competitive remuneration and benefits including novated lease, birthday leave, remote working, salary packaging, wellbeing programme, additional purchased leave, and company-provided laptop
  • Comprehensive training budget and paid certifications (Databricks, SAP, cloud platforms)
  • Structured career advancement pathways with mentoring from senior engineers
  • Exposure to diverse industries and client environments
  • Fulltime

Senior Platform Engineer, ML Data Systems

We’re looking for an ML Data Engineer to evolve our eval dataset tools to meet t...
Location:
United States, Mountain View
Salary:
137871.00 - 172339.00 USD / Year
Khan Academy
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
  • 5 years of Software Engineering experience with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face
  • Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
  • Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems
  • Familiarity with machine learning workflows — from training data preparation to evaluation
  • Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations
  • Attention to detail and an obsession with data quality and reproducibility
  • Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
  • Proven cross-cultural competency skills demonstrating self-awareness, awareness of others, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.
Job Responsibility:
  • Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets
  • Clean, normalize, and enrich data while preserving semantic meaning and consistency
  • Prepare and format datasets for human labeling, and integrate results into ML datasets
  • Develop and maintain scalable ETL pipelines using Airflow, dbt, Go, and Python running on GCP
  • Implement automated tests and validation to detect data drift or labeling inconsistencies
  • Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring
  • Contribute to shared tools and documentation for dataset management and AI evaluation
  • Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.
What we offer:
  • Competitive salaries
  • Ample paid time off as needed
  • 8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
  • Remote-first culture that caters to your time zone, with flexibility as needed
  • Generous parental leave
  • An exceptional team that trusts you and gives you the freedom to do your best
  • The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
  • Opportunities to connect through affinity, ally, and social groups
  • 401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life.
  • Fulltime

Middle/Senior AI/ML Engineer

Join us at Provectus to be a part of a team that is dedicated to building cuttin...
Salary:
Not provided
Provectus
Expiration Date:
Until further notice
Requirements:
  • Comfortable with standard ML algorithms and underlying math
  • Strong hands-on experience with LLMs in production, RAG architecture, and agentic systems
  • AWS Bedrock experience strongly preferred
  • Practical experience with solving classification and regression tasks in general, feature engineering
  • Practical experience with ML models in production
  • Practical experience with one or more use cases from the following: NLP, LLMs, and Recommendation engines
  • Solid software engineering skills (i.e., ability to produce well-structured modules, not only notebook scripts)
  • Python expertise, Docker
  • English level - strong Intermediate
  • Excellent communication and problem-solving skills
Job Responsibility:
  • Create ML models from scratch or improve existing models
  • Collaborate with the engineering team, data scientists, and product managers on production models
  • Develop experimentation roadmap
  • Set up a reproducible experimentation environment and maintain experimentation pipelines
  • Monitor and maintain ML models in production to ensure optimal performance
  • Write clear and comprehensive documentation for ML models, processes, and pipelines
  • Stay updated with the latest developments in ML and AI and propose innovative solutions

Senior ML Data Engineer

As a Senior Data Engineer, you will play a pivotal role in our AI/ML workstream,...
Location:
Poland, Warsaw
Salary:
Not provided
Awin Global
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in data science, data engineering, or computer science with a focus on math and statistics (Master’s degree preferred)
  • At least 5 years of experience as an AI/ML data engineer undertaking the above tasks and accountabilities
  • Strong foundation in computer science principles and statistical methods
  • Strong experience with cloud technology (AWS or Azure)
  • Strong experience creating data ingestion pipelines and ETL processes
  • Strong knowledge of big data tools such as Spark, Databricks and Python
  • Strong understanding of common machine learning techniques and frameworks (e.g. MLflow)
  • Strong knowledge of natural language processing (NLP) concepts
  • Strong knowledge of scrum practices and an agile mindset
  • Strong analytical and problem-solving skills with attention to data quality and accuracy
Job Responsibility:
  • Design and maintain scalable data pipelines and storage systems for both agentic and traditional ML workloads
  • Productionise LLM- and agent-based workflows, ensuring reliability, observability, and performance
  • Build and maintain feature stores, vector/embedding stores, and core data assets for ML
  • Develop and manage end-to-end traditional ML pipelines: data prep, training, validation, deployment, and monitoring
  • Implement data quality checks, drift detection, and automated retraining processes
  • Optimise cost, latency, and performance across all AI/ML infrastructure
  • Collaborate with data scientists and engineers to deliver production-ready ML and AI systems
  • Ensure AI/ML systems meet governance, security, and compliance requirements
  • Mentor teams and drive innovation across both agentic and classical ML engineering practices
  • Participate in team meetings and contribute to project planning and strategy discussions
What we offer:
  • Flexi-Week and Work-Life Balance: We prioritise your mental health and well-being, offering you a flexible four-day Flexi-Week at full pay and with no reduction to your annual holiday allowance. We also offer a variety of different paid special leaves as well as volunteer days
  • Remote Working Allowance: You will receive a monthly allowance to cover part of your running costs. In addition, we will support you in setting up your remote workspace appropriately
  • Pension: Awin offers access to an additional pension insurance to all employees in Germany
  • Flexi-Office: We offer an international culture and flexibility through our Flexi-Office and hybrid/remote work possibilities to work across Awin regions
  • Development: We’ve built our extensive training suite Awin Academy to cover a wide range of skills that nurture you professionally and personally, with trainings conveniently packaged together to support your overall development
  • Appreciation: Thank and reward colleagues by sending them a voucher through our peer-to-peer program

AI ML Engineer

We are seeking a talented ML/AI Engineer to join our innovative team and drive the d...
Location:
India, Hyderabad
Salary:
Not provided
Genzeon
Expiration Date:
Until further notice
Requirements:
  • Strong experience in building ML models with proven track record of successful deployments
  • Extensive experience in Generative AI including LLMs, diffusion models, and related technologies
  • Experience in Agentic AI and understanding of autonomous agent architectures
  • Proficiency with Model Context Protocol (MCP) for agent communication and control
  • Advanced Python programming with expertise in ML libraries (scikit-learn, TensorFlow, PyTorch, etc.)
  • Google Cloud Platform (GCP) experience with ML-focused services
  • Vertex AI hands-on experience for model lifecycle management
  • AutoML experience for automated machine learning workflows
  • Experience with Model Armor or similar model security and protection frameworks
  • 3+ years of experience in machine learning engineering or related field
Job Responsibility:
  • Design, develop, and deploy robust machine learning models for various business applications
  • Build and optimize generative AI solutions using latest frameworks and techniques
  • Implement agentic AI systems that can autonomously perform complex tasks
  • Develop and maintain ML pipelines from data ingestion to model deployment
  • Leverage Google Cloud Platform (GCP) services for scalable ML infrastructure
  • Utilize Vertex AI for model training, deployment, and management
  • Implement AutoML solutions for rapid prototyping and model development
  • Ensure model security and compliance using Model Armor and related tools
  • Write clean, efficient Python code for ML applications and data processing
  • Optimize model performance, accuracy, and computational efficiency
