
AWS Data Engineer (Cloud Data Platform & Pipeline Specialist)

United States, Atlanta · Employment contract · Job Posted May 28, 2026

Job Responsibility

  • Design, develop, and maintain scalable cloud-based data pipelines using AWS services such as Glue, EMR, S3, RDS, DataSync, and DMS
  • Build and optimize batch and streaming data orchestration workflows to support enterprise data platforms
  • Lead large-scale data migration efforts, including legacy-to-cloud transformations and replication strategies
  • Perform data modeling, transformation, and reconciliation to ensure high-quality, consistent datasets across systems
  • Implement secure data access patterns following least-privilege principles for pipelines and datasets
  • Collaborate with data architects, analysts, and business stakeholders to understand data requirements and deliver solutions
  • Establish robust data validation, reconciliation, and audit mechanisms to meet regulatory and reporting requirements
  • Troubleshoot and optimize performance of ETL/ELT pipelines and data workflows in AWS environments
  • Support governance, compliance, and audit readiness for data platforms in regulated environments (finance/reporting)

Requirements

  • 5+ years of experience in data engineering, with strong hands-on expertise in AWS data services (Glue, EMR, S3, RDS, DataSync, DMS)
  • 5+ years of proven experience building and managing data pipelines (batch and streaming) in cloud environments
  • 5+ years of experience in data migration, transformation frameworks, and large-scale data replication
  • Deep understanding of data modeling, data transformation, and reconciliation techniques, backed by 5+ years of experience
  • 5+ years of experience designing and implementing secure data access and governance (least-privilege principles)
  • 5+ years of hands-on experience with data validation, auditing, and reconciliation processes
  • Familiarity with regulatory or finance data environments and reporting workloads
  • Strong problem-solving skills and the ability to work in a collaborative, fast-paced environment, demonstrated over 5+ years
  • Key skills: AWS data services, data pipelines, data migration, data modeling, data validation, data governance

Nice to have

  • Experience with real-time/streaming technologies (e.g., Kafka, Kinesis)
  • Exposure to data warehousing platforms (Redshift, Snowflake, etc.)
  • Experience with infrastructure-as-code (Terraform, CloudFormation)
  • Knowledge of DevOps practices and CI/CD for data pipelines
  • Prior experience working in regulated industries (Financial Services, Healthcare, etc.)

Similar Jobs for AWS Data Engineer (Cloud Data Platform & Pipeline Specialist)

8 matching positions

Data Scientist Specialist

We are seeking a highly experienced Data Scientist Specialist with deep expertis...
Location: United States, McLean
Salary: Not provided
Company: Apex Systems
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in AI, Data Science, Computer Science, or related field
  • Extensive experience in AI/ML, including 3+ years in applied GenAI or LLM-based solutions
  • Deep expertise in prompt engineering, fine-tuning, RAG, GraphRAG, vector databases, and multi-modal models
  • Proven experience with AWS cloud-native AI development (SageMaker, Bedrock, MLFlow/Kubeflow on EKS)
  • Strong programming skills in Python and ML/LLM libraries (Transformers, LangChain, etc.)
  • Strong understanding of GenAI system patterns, agentic architectures, evaluation frameworks, and guardrails
  • Demonstrated success working in cross-functional, agile teams
  • GitHub code repository link required for candidate evaluation
Job Responsibility
  • Architect and implement GenAI systems: Build scalable AI agents, agentic workflows, and GenAI applications for diverse business use cases
  • Model development & optimization: Fine-tune and optimize lightweight LLMs; evaluate and adapt models such as Claude (Anthropic), Azure OpenAI, and open-source alternatives
  • RAG & GraphRAG architectures: Design and deploy Retrieval-Augmented Generation (RAG) and GraphRAG systems using vector databases and enterprise knowledge bases
  • Enterprise data curation: Curate and prepare enterprise data using connectors integrated with AWS Bedrock Knowledge Bases and/or Elasticsearch
  • Agent interoperability: Implement solutions leveraging Model Context Protocol (MCP) and Agent-to-Agent (A2A) communication patterns
  • Experimentation & ML platforms: Build and maintain Jupyter-based notebooks using SageMaker, MLFlow, or Kubeflow on Kubernetes (EKS)
  • Cross-functional collaboration: Work with UI engineers, microservices teams, designers, and data engineers to deliver full-stack GenAI experiences
  • Enterprise integration: Integrate GenAI solutions with enterprise platforms via APIs and standardized GenAI architectural patterns
  • Evaluation & safety: Establish evaluation frameworks, bias mitigation strategies, safety protocols, and guardrails for production deployment
What we offer
  • Medical, dental, vision, life, disability, and other insurance plans
  • ESPP (employee stock purchase program)
  • 401K program with company match after 12 months
  • HSA (Health Savings Account on the HDHP plan)
  • SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions

Technical Architect (AI)

The AI Solutions Engineer is a specialist role, responsible for actively partici...
Location: Indonesia, Jakarta Selatan
Salary: Not provided
Company: NTT DATA
Expiration Date: Until further notice
Requirements
  • Demonstrated understanding of artificial intelligence, natural language processing (NLP), and machine learning principles
  • Expertise in selecting, fine-tuning, and deploying large and small language models (LLMs/SLMs), such as OpenAI's GPT series and open-source alternatives
  • Specialist proficiency in Python programming, essential for rapid prototyping, integration, and model implementation
  • Knowledge of additional programming languages (optional, but valuable): JavaScript / TypeScript, Java / C#
  • Familiarity with full-stack software development, including frontend and backend integration, user experience considerations, and system interoperability
  • Robust knowledge of data pipeline development, data engineering concepts, and handling of structured and unstructured data
  • Proficiency in cloud computing platforms (Azure, AWS, GCP), particularly in deploying, scaling, and managing AI workloads
  • Experience with Microsoft Copilot Studio, Azure AI Foundry, and Semantic Kernel is highly desirable
  • Awareness and application of security, compliance, and risk management practices related to AI solutions
  • Understanding of ethical AI considerations, bias mitigation, and responsible AI deployment
Job Responsibility
  • Develop, fine-tune, and deploy AI models, including large language models (LLMs) such as GPT-4 or open-source equivalents
  • Design and implement effective prompt engineering strategies and optimizations to enhance AI accuracy, consistency, and reliability
  • Engage with internal stakeholders and clients to understand business needs, translating them into actionable AI solutions
  • Rapidly prototype, test, and iterate AI applications using advanced Python programming and relevant frameworks
  • Integrate AI solutions securely with existing enterprise systems (CRM, ERP, HRIS, finance platforms, collaboration software) via API development and integration
  • Build, maintain, and optimize end-to-end data pipelines to ensure accurate and timely data delivery for AI models
  • Manage structured and unstructured datasets, leveraging vector databases and semantic search to enhance knowledge management capabilities
  • Deploy, manage, and scale AI solutions within cloud computing environments (Azure, AWS, GCP), ensuring high availability, performance, and cost efficiency
  • Implement DevOps and MLOps practices, including automated deployment, testing, monitoring, and version control, to efficiently manage the AI model lifecycle
  • Ensure AI solutions adhere to industry standards and compliance regulations (GDPR, HIPAA), emphasizing security and privacy best practices
Employment type: Full-time

AI Solutions Engineer

The AI Solutions Engineer at NTT DATA is responsible for developing and deployin...
Location: India, Bangalore
Salary: Not provided
Company: NTT DATA
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field
  • Demonstrated experience (typically 4-6 years) developing, deploying, and maintaining AI and machine learning solutions in enterprise environments
  • Hands-on expertise in AI model development, fine-tuning, and optimization using Python and relevant frameworks
  • Demonstrated experience implementing prompt engineering methodologies and optimizing model performance
  • Demonstrated experience in API development and secure integration of AI-driven solutions with enterprise systems and platforms
  • Experience building robust data pipelines, managing structured/unstructured data, and leveraging vector databases
  • Practical experience deploying and scaling AI applications within cloud platforms (Azure, AWS, or GCP)
  • Demonstrated success applying DevOps and MLOps best practices to manage AI model lifecycle and deployments efficiently
  • Proven track record ensuring security, privacy, compliance, and responsible use of AI solutions within regulated environments
  • Experience engaging directly with clients and stakeholders, translating business requirements into effective technical solutions
Job Responsibility
  • Develop, fine-tune, and deploy AI models, including large language models (LLMs) such as GPT-4 or open-source equivalents
  • Design and implement effective prompt engineering strategies and optimizations to enhance AI accuracy, consistency, and reliability
  • Engage with internal stakeholders and clients to understand business needs, translating them into actionable AI solutions
  • Rapidly prototype, test, and iterate AI applications using advanced Python programming and relevant frameworks
  • Integrate AI solutions securely with existing enterprise systems (CRM, ERP, HRIS, finance platforms, collaboration software) via API development and integration
  • Build, maintain, and optimize end-to-end data pipelines to ensure accurate and timely data delivery for AI models
  • Manage structured and unstructured datasets, leveraging vector databases and semantic search to enhance knowledge management capabilities
  • Deploy, manage, and scale AI solutions within cloud computing environments (Azure, AWS, GCP), ensuring high availability, performance, and cost efficiency
  • Implement DevOps and MLOps practices, including automated deployment, testing, monitoring, and version control, to efficiently manage the AI model lifecycle
  • Ensure AI solutions adhere to industry standards and compliance regulations (GDPR, HIPAA), emphasizing security and privacy best practices
What we offer
  • Workplace embraces diversity and inclusion
  • A place where you can grow, belong and thrive
Employment type: Full-time

Public Cloud - Senior Platform Engineer - SVP

Citi’s Operations & Technology organization (O&T) is driving an innovative Cloud...
Location: United Kingdom, Belfast
Salary: Not provided
Company: Citi
Expiration Date: Until further notice
Requirements
  • Developer experience across multiple languages (Java, Python, etc.)
  • Experience with more than one cloud platform (AWS, GCP, or Azure)
  • Has built CI/CD and SDLC pipelines, for example with Terraform and Harness
  • Has performed as tech lead within a specialist domain, e.g. tech lead for container or data platforms
Job Responsibility
  • Implement cloud architecture (on AWS, GCP and Azure) that enables the infrastructure (compute, database, network, storage, observability) required for application hosting in public cloud using Citi’s engineering processes and best practices, with particular emphasis on automation and security by design
  • Engineer platforms and services across Google and AWS
  • Work in an agile, empowered team to solve complex problems
  • Engineer emergent technologies to create value across Citi
What we offer
  • 27 days annual leave (plus bank holidays)
  • A discretionary annual performance-related bonus
  • Private Medical Care & Life Insurance
  • Employee Assistance Program
  • Pension Plan
  • Paid Parental Leave
  • Special discounts for employees, family, and friends
  • Access to an array of learning and development resources
Employment type: Full-time

Ecom Data Engineer Specialist

PepsiCo operates in an environment undergoing immense and rapid change. Big data...
Location: United States, Purchase, New York
Salary: 64900.00 - 132550.00 USD / Year
Company: PepsiCo
Expiration Date: Until further notice
Requirements
  • 4+ years of overall technology experience, including at least 3 years of hands-on software development, data engineering, and systems architecture
  • 3+ years of experience in SQL optimization and performance tuning
  • Experience with data modeling, data warehousing, and building high-volume ETL/ELT pipelines
  • Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
  • Experience with data profiling and data quality tools like Apache Griffin, Deequ, or Great Expectations
  • Current skills in the following technologies: Python
  • Orchestration platforms: Airflow, Luigi, Databricks, or similar
  • Relational databases: Postgres, MySQL, or equivalents
  • MPP data systems: Snowflake, Redshift, Synapse, or similar
  • Cloud platforms: AWS, Azure, or similar
Job Responsibility
  • Own data pipeline development end-to-end, spanning data modeling, testing, scalability, operability, and ongoing metrics
  • Ensure that we build high-quality software by reviewing peer code check-ins
  • Define best practices for product development, engineering, and coding as part of a world-class engineering team
  • Collaborate in architecture discussions and architectural decision-making that is part of continually improving and expanding these platforms
  • Lead feature development in collaboration with other engineers; validate requirements/stories, assess current system capabilities, and decompose feature requirements into engineering tasks
  • Focus on delivering high-quality data pipelines and tools through careful analysis of system capabilities and feature requests, peer reviews, test automation, and collaboration with other engineers
  • Develop software in short iterations to quickly add business value
  • Introduce new tools/practices to improve data and code quality; this includes researching/sourcing 3rd party tools and libraries, as well as developing tools in-house to improve workflow and quality for all data engineers
What we offer
  • A business development incentive equity may be awarded based on eligibility and performance
  • Paid time off subject to eligibility, including paid parental leave, vacation, sick, and bereavement
  • Medical, Dental, Vision, Disability, Health, and Dependent Care Reimbursement Accounts, Employee Assistance Program (EAP), Insurance (Accident, Group Legal, Life), Defined Contribution Retirement Plan
Employment type: Full-time

Data Modeling & Engineering Specialist

We are seeking a technically proficient and business-aware Data Modelling & Engi...
Location: Egypt, Giza
Salary: Not provided
Company: Vodafone
Expiration Date: Until further notice
Requirements
  • Strong proficiency in SQL, including analytical functions and performance tuning
  • Experienced with ETL tools such as SSIS, Informatica, Talend, or Airflow
  • Solid understanding of data modelling principles and database design
  • Familiar with cloud platforms (Azure, GCP, AWS) and data warehousing concepts
  • Knowledge of data governance, security, and GDPR compliance is advantageous
  • Comfortable working in agile environments and collaborating across teams
Job Responsibility
  • Design and maintain conceptual, logical, and physical data models aligned with business requirements
  • Apply dimensional modelling techniques such as star and snowflake schemas for analytics and reporting
  • Ensure consistency, integrity, and optimisation across data models
  • Develop and maintain ETL pipelines using SQL and modern tools
  • Write efficient SQL scripts for data extraction, transformation, and loading
  • Conduct data validation and quality checks to ensure reliability
  • Build and manage data warehouses and lakes on platforms including Azure, GCP, or AWS
  • Utilise services such as Azure Synapse, BigQuery, or Snowflake for scalable storage and querying
  • Optimise data storage and retrieval for performance and cost-efficiency
  • Partner with data analysts, scientists, and business stakeholders to gather and understand requirements
What we offer
  • Opportunity to work on cutting-edge cloud data platforms and tools
  • Exposure to cross-functional collaboration and agile delivery environments
  • A chance to contribute to strategic data initiatives that impact business decisions
  • Be part of a global organisation driving digital transformation

Cloud Platform Engineer

Citi’s Operations & Technology organization (O&T) is driving an innovative Cloud...
Location: Poland, Warsaw
Salary: Not provided
Company: Citi
Expiration Date: Until further notice
Requirements
  • Developer experience across multiple languages (Java, Python, etc.)
  • Experience with more than one cloud platform (AWS, GCP, or Azure)
  • Has built CI/CD and SDLC pipelines, for example with Terraform and Harness
  • Has performed as tech lead within a specialist domain, e.g. tech lead for container or data platforms
  • 5+ years of application architecture and modernization (cloud native)
  • 5+ years of messaging, data, analytics, API, service and data mesh
  • Experience working in regulator-governed industries, with a good overall understanding of compliance and security controls
Job Responsibility
  • Implement cloud architecture (on AWS, GCP and Azure) that enables the infrastructure (compute, database, network, storage, observability) required for application hosting in public cloud using Citi’s engineering processes and best practices, with particular emphasis on automation and security by design
  • Engineer platforms and services across Google and AWS
  • Work in an agile, empowered team to solve complex problems
  • Engineer emergent technologies to create value across Citi
What we offer
  • Private Medical Care Program
  • Life Insurance Program
  • Pension Plan contribution (PPE Program)
  • Employee Assistance Program
  • Paid Parental Leave Program (maternity and paternity leave)
  • Sport Card
  • Holidays Allowance
  • Sport and team recreation activities
  • Special offers and discounts for employees
  • Access to an array of learning and development resources
Employment type: Full-time

Data Scientist

The Data Scientist plays a pivotal role in planning, executing, and delivering m...
Location: United States, Camden
Salary: Not provided
Company: NTT DATA
Expiration Date: Until further notice
Requirements
  • Master’s or PhD in Computer Science, Data Science, Engineering, Statistics, Applied Mathematics, Operations Research, or a related quantitative field
  • Specialization in ML, AI, cognitive science, or data science is highly preferred
  • 3-5 years of hands-on experience planning and executing end-to-end data science projects with demonstrated impact on clinical or operational outcomes in business environments
  • Advanced programming proficiency in Python or R with strong expertise in machine learning frameworks (scikit-learn, TensorFlow, PyTorch) and statistical analysis tools
  • Expertise in machine learning and statistical techniques including supervised/unsupervised learning, deep learning, NLP, computer vision, regression models, ensemble methods, and experimental design (A/B testing)
  • Strong data engineering capabilities including SQL/NoSQL database programming, distributed computing tools (Hadoop, Spark, Kafka), data pipeline development, and experience with cloud platforms (AWS, Azure, GCP)
  • Production ML and MLOps experience including model deployment, monitoring, containerization (Docker, Kubernetes), version control, and applying DevOps principles to data science workflows
  • Data visualization and communication excellence with ability to create compelling dashboards (Tableau, Power BI), translate complex technical findings into actionable insights, and present to diverse audiences from executives to frontline staff
  • Cross-functional collaboration skills with proven ability to work in agile environments, partner with stakeholders to align technical solutions with business objectives, and mentor junior team members
  • Healthcare domain knowledge preferred, particularly experience with Epic EHR systems, clinical workflows, and healthcare data standards, along with relevant certifications (Clarity/Caboodle, Google Cloud ML Engineer, AWS ML Specialist)
Job Responsibility
  • Collect, clean, and analyze datasets from diverse internal and external sources, applying advanced data wrangling techniques
  • Acquire access to various databases and source systems (SQL, NoSQL, graph databases) and create data pipelines
  • Apply statistical analysis and visualization techniques to explore and prepare data
  • Design, develop, and validate machine learning, statistical, and optimization models
  • Select appropriate algorithms and models for AI/ML and test them for accuracy, robustness, and fairness
  • Perform feature selection and engineering
  • Integrate domain knowledge into ML solutions
  • Conduct controlled experiments (A/B and multivariate testing)
  • Collaborate with MLOps, data engineers, and IT to evaluate deployment options
  • Continuously monitor execution and health of production ML models
Employment type: Full-time