AI Data Engineer

United States · 152000.00 - 190000.00 USD / Year · Job Posted January 11, 2026

Job Description

We’re seeking a Data Engineer to design and manage the data pipelines, platforms, and tools that power intelligent AI applications. You will work closely with data scientists, AI software engineers, and product teams to ensure our ML and LLM workloads are backed by scalable, secure, and high-performance data infrastructure. This is a hands-on, high-impact role where the reliability and flexibility of the data architecture are paramount.

Job Responsibility

  • Design, build, and maintain data pipelines for structured, unstructured, and semi-structured data sources
  • Develop and optimize data models, ETL processes, and batch/streaming data infrastructure
  • Partner with data scientists to support training, evaluation, and deployment of ML and LLM models
  • Implement scalable architectures for embeddings, vector databases, and retrieval pipelines
  • Enable real-time and offline analytics workflows using best-in-class data engineering practices
  • Ensure data quality, lineage, observability, and governance across all data products
  • Deploy secure, cloud-native data infrastructure (AWS, Azure, GCP) for high-volume AI workloads
  • Contribute to the design of feature stores and MLOps platforms for continuous learning and model updates
  • Collaborate on Responsible AI workflows to ensure compliant data usage and access controls
  • Continuously evaluate new tools and technologies for improving performance, reliability, and agility

Requirements

  • 5+ years of experience as a Data Engineer building large-scale, production-grade data pipelines
  • Strong command of SQL, Python, and distributed data processing frameworks (Spark, Flink, Beam)
  • Hands-on experience with ETL/ELT tools and orchestration systems (Airflow, dbt, Prefect, Dagster)
  • Familiarity with cloud-native data platforms (Snowflake, BigQuery, Redshift, Databricks)
  • Experience supporting ML/AI workloads and collaborating with model development teams
  • Knowledge of vector databases (FAISS, Pinecone, Weaviate) and embeddings management
  • Understanding of data privacy, access control, and compliance in regulated environments
  • Proficiency in modern DevOps tooling for data infrastructure (Docker, Terraform, CI/CD)
  • Ability to work autonomously and thrive in a fast-paced, collaborative environment

Nice to have

  • Cloud: AWS (Redshift, S3, Lambda), Azure (Data Lake, Synapse), GCP (BigQuery, Cloud Functions)
  • Streaming: Kafka, Kinesis, Pub/Sub, Spark Streaming, Apache Flink
  • Workflow Tools: dbt, Airflow, Dagster, Prefect
  • Storage & Processing: Snowflake, Databricks, Parquet, Delta Lake
  • Vector Search: FAISS, Pinecone, Weaviate, txtai

What we offer

  • Impact that matters
  • Flexibility and trust
  • Growth and development
  • Competitive rewards
  • Time for life
  • Belonging and balance

Similar Jobs for AI Data Engineer

8 matching positions

AI Data Engineer

We are looking for a technically sharp and detail-oriented Data Engineer to join...
Location: India, Bengaluru
Salary: Not provided
Company: Hewlett Packard Enterprise (https://www.hpe.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, Mathematics, or a related discipline
  • 4 – 5 years of hands-on experience in data engineering, ETL development, or analytics engineering roles
  • Demonstrable experience with Databricks and/or Microsoft Fabric in a production environment
  • Proficiency in Power BI report and semantic model development
  • Exposure to Collibra or equivalent data governance / cataloguing platforms is strongly preferred
  • Strong SQL and Python skills
  • PySpark experience is required
  • Familiarity with Azure cloud services and DevOps practices for data pipeline deployment
Job Responsibility
  • Design, build, and maintain scalable ETL/ELT pipelines using Azure Data Factory, Databricks (PySpark / Delta Live Tables), and Microsoft Fabric Data Factory
  • Transform raw, multi-source data into clean, conformed, and analytics-ready datasets following Medallion Architecture principles (Bronze → Silver → Gold)
  • Develop and optimize SQL and PySpark-based transformation logic for structured, semi-structured, and unstructured data
  • Implement incremental load patterns, merge/upsert logic, and slowly changing dimension (SCD) strategies to support historical data tracking
  • Collaborate with the AI Engineers to prepare high-quality feature datasets for ML and LLM use cases
  • Define, implement, and monitor data quality rules including completeness, accuracy, consistency, timeliness, and uniqueness checks
  • Administer and extend the Collibra data governance platform — including business glossary management, data lineage documentation, and stewardship workflows
  • Build automated data quality validation frameworks using tools such as Great Expectations, dbt tests, or Unity Catalog data quality constraints in Databricks
  • Triage and resolve data quality incidents, root-cause data anomalies, and communicate impact to stakeholders proactively
  • Maintain metadata catalogues and ensure all critical datasets have documented ownership, lineage, and classification
What we offer
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
Employment type: Full-time

AI Data Engineer

Join us as an AI Data Engineer at Barclays, where you'll spearhead enterprise AI...
Location: Czechia, Prague
Salary: Not provided
Company: Barclays (barclays.co.uk)
Expiration Date: Until further notice
Requirements
  • Hands-on data engineering experience with a demonstrable focus on AI and machine learning use cases, including data pipeline design and optimization for AI consumption
  • Practical experience with Model Context Protocol (MCP), including implementation in personal or enterprise projects, and understanding of context construction and integration patterns (at least as part of personal projects)
  • Strong understanding of data entitlements, access controls, and governance frameworks, with ability to implement entitlement-aware systems that enforce desk-, book-, client-, and license-level constraints
  • Deep familiarity with AI/LLM concepts and terminology, including understanding of how large language models integrate with data, agentic workflows, and RAG (Retrieval-Augmented Generation) patterns
Job Responsibility
  • Building and maintenance of data architecture pipelines that enable the transfer and processing of durable, complete and consistent data
  • Design and implementation of data warehouses and data lakes that manage the appropriate data volumes and velocity and adhere to the required security measures
  • Development of processing and analysis algorithms fit for the intended data complexity and volumes
  • Collaboration with data scientists to build and deploy machine learning models
  • To advise and influence decision making, contribute to policy development and take responsibility for operational effectiveness
  • Collaborate closely with other functions and business divisions
  • Lead a team performing complex tasks, using well developed professional knowledge and skills to deliver on work that impacts the whole business function
  • Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomes
  • Consult on complex issues, providing advice to People Leaders to support the resolution of escalated issues
What we offer
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution
Employment type: Full-time

AI Data Engineer

We are looking for an AI Data Engineer to lead the adoption of AI-assisted workf...
Location: Ukraine
Salary: Not provided
Company: Sigma Software Group (sigma.software)
Expiration Date: Until further notice
Requirements
  • SQL / strong
  • Python / strong
  • Spark/PySpark / strong
  • Azure Data Platforms / good
  • AI Coding Tools / good
Job Responsibility
  • Leading the adoption of AI-assisted workflows within our engineering team
  • Building scalable data pipelines
  • Experimenting with AI tools
  • Identifying opportunities to improve productivity
  • Helping the team transition to AI-augmented workflows
What we offer
  • Diversity of Domains & Businesses
  • Variety of technology
  • Health & Legal support
  • Active professional community
  • Continuous education and growth
  • Flexible schedule
  • Remote work
  • Outstanding offices (if you choose it)
  • Sports and community activities
Employment type: Full-time

AI Data Engineer

This role is a fantastic opportunity to start your career in data engineering fo...
Location: Mexico, Mexicali
Salary: Not provided
Company: Trimble Inc. (trimble.com)
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Engineering, Information Systems, or a related quantitative field
  • Foundational knowledge of database concepts and proficiency in SQL for data querying and manipulation
  • Proficiency in at least one programming language, preferably Python, for scripting and data processing tasks
  • A conceptual understanding of data warehousing, ETL processes, and data modeling
  • Strong problem-solving skills and an ability to analyze issues of a limited scope with a focus on finding clear answers
  • Eagerness to learn and a strong interest in data engineering, cloud technologies, and AI/ML
Job Responsibility
  • Assist in the design, construction, and maintenance of foundational data pipelines (ETL/ELT) to support AI/ML model development
  • Contribute to data cleaning, transformation, and aggregation tasks to prepare datasets for machine learning applications
  • Perform routine data quality checks and write basic tests to ensure data integrity and reliability
  • Support the management of data storage solutions, including data warehouses and data lakes, under the supervision of senior engineers
  • Document data sources, pipeline logic, and data models to ensure clarity and maintainability for the team
  • Collaborate with AI/ML engineers to understand their data requirements and assist in providing them with accessible, analysis-ready datasets
Employment type: Full-time

AI Data Engineer

WFH flexibility! Up to 4 days/week! Global Environment! Competitive salary!
Location: Japan, Tokyo
Salary: 7500000.00 - 12000000.00 JPY / Year
Company: Randstad (https://www.randstad.com)
Expiration Date: August 25, 2026
Requirements
  • Azure AI Stack: Hands-on experience with Azure Machine Learning, Azure AI Foundry, and Azure OpenAI Service
  • Data Proficiency: Strong understanding of Azure Data Lake Storage (ADLS) and Synapse Analytics. Experience working with Parquet files and large-scale datasets
  • Programming: Proficiency in Python (for AI/ML) and familiarity with Scala/Spark (to align with our ETL core)
  • Backend & NoSQL: Experience developing with Azure Functions and managing data in CosmosDB
  • DevOps: Knowledge of CI/CD tools (Azure DevOps/GitHub Actions) for automating AI model lifecycles
What we offer
  • Health insurance
  • Employees' pension insurance
  • Employment insurance
  • Saturdays
  • Sundays
  • Public holidays
Employment type: Full-time

AI Data Engineer

This is a remote 4 month engagement with a likelihood of extending to 12 months....
Salary: 50.00 - 55.00 USD / Hour
Company: Aquent (aquent.com)
Expiration Date: Until further notice
Requirements
  • A Bachelor’s degree in Computer Science, Engineering, Information Systems, or a closely related field
  • A minimum of 2 years of professional experience in data engineering with experience in Python, SQL, Kubernetes, Airflow and Scala
  • Demonstrated proficiency in data warehouse management, alongside strong experience in building and maintaining robust data pipelines and ETL processes
  • Excellent verbal and written communication skills, with the ability to clearly convey technical information to diverse audiences
  • Proven ability to thrive and contribute effectively within a collaborative, cross-functional team environment
  • Strong analytical capabilities, including the ability to gather complex business requirements and debug intricate issues across various data systems
Job Responsibility
  • Design, build, and maintain scalable data platforms and pipelines utilizing cutting-edge tools and technologies
  • Collaborate closely with diverse stakeholders to meticulously gather business requirements and translate them into robust technical specifications
  • Develop and implement sophisticated data models that effectively support advanced analytics and comprehensive reporting needs
  • Champion data quality and governance by implementing rigorous validation, consistency checks, and reliability measures
  • Partner with cross-functional teams, including data analysts, data scientists, and business leaders, to deliver high-quality data solutions that meet evolving demands
  • Continuously monitor and optimize data pipelines for peak performance, scalability, and cost-efficiency
  • Establish and implement comprehensive monitoring and observability metrics to proactively ensure data quality and detect anomalies within data pipelines
  • Create clear, comprehensive documentation for data processes and effectively communicate complex technical concepts to both technical and non-technical audiences
What we offer
  • Subsidized health, vision, and dental plans
  • Paid sick leave
  • Retirement plans with a match

AI Data Engineer

The AI Data Engineer role involves developing and implementing machine learning ...
Location: India, Bangalore
Salary: Not provided
Company: NTT DATA (nttdata.com)
Expiration Date: Until further notice
Requirements
  • Develop and implement traditional machine learning algorithms
  • Deploy at least one model in a production environment
  • Write and maintain Python code for data science and machine learning projects
  • Knowledge of Deep Learning (DL) techniques
  • Experience working with Generative AI (GenAI) and Large Language Models (LLM)
  • Exposure to LangChain
Job Responsibility
  • Develop and implement traditional machine learning algorithms
  • Deploy at least one model in a production environment
  • Write and maintain Python code for data science and machine learning projects

AI Data Engineer

The AI Data Engineer role involves designing and implementing cloud platforms fo...
Location: United States, San Juan
Salary: Not provided
Company: Hewlett Packard Enterprise (https://www.hpe.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in computer science, engineering, information systems, or closely related quantitative discipline
  • 4-7 years’ experience
  • Strong programming skills in Python, Java, Golang, or JavaScript
  • Good understanding of distributed systems, event-driven programming paradigms, and designing for scale and performance
  • Experience with cloud-native applications, developer tools, managed services, and next-generation databases
  • Knowledge of DevOps practices like CI/CD, infrastructure as code, containerization, and orchestration using Kubernetes
  • Good written and verbal communication skills
  • Comfortable with AWS services
  • Familiarity with the landscape of big data exploration, visualization, and prototyping platforms
  • Familiarity with statistical and machine learning techniques
Job Responsibility
  • Research, propose, design, implement, operate and maintain cloud platforms for big data exploration and visualization, in support of a team of data scientists
  • Deploy data science solutions into cloud environments
  • Work with data scientists to troubleshoot cloud workflows
  • Collaborate closely with our data lake team on cloud technologies
  • Identify and implement cost-saving strategies to reduce ongoing cloud expenses
  • Build CI/CD pipelines
  • Deploy and maintain orchestration and monitoring systems for big data processing
  • Help build images and containerize applications
What we offer
  • Comprehensive suite of benefits that supports physical, financial, and emotional wellbeing
  • Specific programs catered to professional development
  • Inclusive working environment
Employment type: Full-time