
Associate Data Scientist (Data Engineering)

India, Hyderabad · Job Posted April 12, 2026

Job Description

The Associate – Data Engineering role is an entry‑level engineering position within the Marketing Science Operations (MSO) Data Engineering team. This role is ideal for candidates who are excited to build foundations in data pipelines, data quality, and scalable data products that power analytics and reporting. The team will provide structured learning and on‑the‑job exposure. What matters most is curiosity, critical thinking, and a strong learning mindset.

Job Responsibility

  • Support building and maintaining data ingestion and transformation workflows (ETL/ELT) for marketing and media datasets
  • Assist with implementing data validations and QA checks (schema checks, null checks, duplication checks, count reconciliation) to ensure datasets are reliable and analytics‑ready
  • Help monitor pipeline runs and refresh schedules, investigate failures using logs, and escalate issues with clear context and evidence
  • Support development of reusable, governed datasets / data products that are consistent and easy for BI and analytics teams to consume
  • Contribute to documentation (data flow notes, runbooks, assumptions, source mapping) to improve operational stability and handoffs
  • Collaborate with cross‑functional partners (MSO BI, MSO Ops, Analytics teams, Vendors) to ensure inputs and outputs are aligned and dependable
  • Participate in continuous learning and improvement to strengthen engineering fundamentals, reliability practices, and delivery discipline
  • Analyze and understand the functional and non-functional requirements of the Investment and Media Analytics function, and translate them into prototypes and technical specifications

Requirements

  • Bachelor’s degree (3 to 6 years of experience) or Master’s degree (2 to 4 years of experience) in Computer Science, Engineering, Information Systems, Data Science, or a related quantitative/technical field
  • Experience in data engineering, software engineering, analytics engineering, or a related data/tech role
  • Familiarity with data concepts (tables, schemas, joins, basic transformations)
  • Proficiency in SQL, data analysis, and working with structured data
  • Strong problem‑solving and analytical reasoning skills
  • Effective written and verbal communication skills
  • Strong learning mindset and willingness to work with new tools, datasets, and business contexts

Nice to have

  • Exposure to Python programming for automation
  • Exposure to cloud/data tools such as Databricks, Spark/PySpark, Azure/AWS, ADF/Airflow
  • Experience with ETL/ELT pipelines, batch processing, or orchestration concepts
  • Familiarity with data quality checks, profiling, data governance, logging, monitoring, or incident triage practices
  • Exposure to DevSecOps practices

Similar Jobs for Associate Data Scientist (Data Engineering)

8 matching positions

Data Scientist

At Applied Network Solutions (ANS), we bring together some of the most curious m...
Location
United States, Aurora
Salary
100000.00 - 165000.00 USD / Year
Applied Network Solutions
Expiration Date
Until further notice
Requirements
  • Active TS/SCI clearance with Polygraph
  • A bachelor’s degree and 10 years of relevant experience. An associate’s degree plus 12 years of relevant experience may be considered for individuals with in-depth experience that is clearly related to the position
  • Degree must be in Mathematics, Applied Mathematics, Statistics, Applied Statistics, Machine Learning, Data Science, Operations Research, or Computer Science
  • A broader range of degrees will be considered if accompanied by a Certificate in Data Science from an accredited college/university
  • Relevant experience must be in designing/implementing machine learning, data science, advanced analytical algorithms, programming (skill in at least one high-level language, e.g., Python), statistical analysis (e.g., variability, sampling error, inference, hypothesis testing, EDA, application of linear models), data management (e.g., data cleaning and transformation), data mining, data modeling and assessment, artificial intelligence, and/or software engineering. Experience in more than one area is strongly preferred.
  • Devise strategies for extracting meaning and value from large datasets.
  • Make and communicate principled conclusions from data using elements of mathematics, statistics, computer science, and application specific knowledge.
  • Translate practical mission needs and analytic questions related to large datasets into technical requirements and, conversely, assist others with drawing appropriate conclusions from the analysis of such data.
  • Employ some combination (2 or more) of the following areas: (1) Foundations (Mathematical, Computational, Statistical); (2) Data Processing (Data management and curation, data description and visualization, workflow and reproducibility); (3) Modeling, Inference, and Prediction (Data modeling and assessment, domain-specific considerations)
Job Responsibility
  • Plan, analyze, design, develop, test, secure, integrate, implement, operate, and maintain the custom solutions that ANS delivers
What we offer
  • Family Medical, Dental (w/ adult orthodontia) and Vision coverage
  • Pet Discount Program
  • PTO (Paid Time Off)
  • Maternity/Paternity Leave
  • Supplemental Military Leave Pay
  • 11 Paid Holidays
  • 401(k) plan with 6% Company Contribution
  • Generous Professional Development Program
  • 100% Employer paid Short- and Long-Term Disability
  • 100% Employer paid Life Insurance

Senior Data Scientist - Value & Access

As the Data Scientist at Amgen, you will be responsible for developing and deplo...
Location
India, Hyderabad
Salary
Not provided
Amgen
Expiration Date
Until further notice
Requirements
  • Master’s degree in computer science, statistics or STEM majors with a minimum of 5 years of Information Systems experience
  • Bachelor’s degree in computer science, statistics or STEM majors with a minimum of 7 years of Information Systems experience
  • Foundational understanding of the US pharmaceutical ecosystem, patient support services offerings (Copay), and other standard data sets, including claims and prescription data
  • Experience with one or more analytic software tools or languages like R and Python
  • Strong foundation in machine learning algorithms and techniques
  • Experience in statistical techniques and hypothesis testing, experience with regression analysis, clustering and classification
Job Responsibility
  • Drive the full lifecycle of Data Science project delivery, guide data scientists in shaping and developing solutions, and act as a subject matter expert in solving development and commercial questions
  • Assume the role of business owner and manage the Proprietary AI engine built to optimize Copay
  • Support Amgen Gross to Net and other V&A Transformation initiatives
  • Ensure models are trained with the latest data and meet the SLA expectations
  • Manage the AI tool’s roadmap, working with a global cross-functional team
  • Work in technical teams in development, deployment, and application of applied analytics, predictive analytics, and prescriptive analytics
  • Utilize technical skills such as hypothesis testing, machine learning and retrieval processes to apply statistical and data mining techniques to identify trends, create figures, and analyze other relevant information
  • Perform exploratory and targeted data analyses using descriptive statistics and other methods
  • Build model/analytics experimentation and development pipelines leveraging MLOps
  • Oversee efforts of 1-3 associates, including setting performance standards, managing their staffing, and monitoring performance

Senior Data Scientist - Value & Access

As the Data Scientist at Amgen, you will be responsible for developing and deplo...
Location
India, Hyderabad
Salary
Not provided
Amgen
Expiration Date
Until further notice
Requirements
  • Master’s degree in computer science, statistics or STEM majors with a minimum of 5 years of Information Systems experience
  • Bachelor’s degree in computer science, statistics or STEM majors with a minimum of 7 years of Information Systems experience
  • Foundational understanding of the US pharmaceutical ecosystem, patient support services offerings (Copay), and other standard data sets, including claims and prescription data
  • Experience with one or more analytic software tools or languages like R and Python
  • Strong foundation in machine learning algorithms and techniques
  • Experience in statistical techniques and hypothesis testing, experience with regression analysis, clustering and classification
Job Responsibility
  • Be accountable for driving the full lifecycle of Data Science project delivery, guide data scientists in shaping and developing solutions, and act as a subject matter expert in solving development and commercial questions
  • Assume the role of business owner and manage the Proprietary AI engine built to optimize Copay
  • Support Amgen Gross to Net and other V&A Transformation initiatives
  • Ensure models are trained with the latest data and meet the SLA expectations
  • Manage the AI tool’s roadmap, working with a global cross-functional team
  • Work in technical teams in development, deployment, and application of applied analytics, predictive analytics, and prescriptive analytics
  • Utilize technical skills such as hypothesis testing, machine learning and retrieval processes to apply statistical and data mining techniques to identify trends, create figures, and analyze other relevant information
  • Perform exploratory and targeted data analyses using descriptive statistics and other methods
  • Build model/analytics experimentation and development pipelines leveraging MLOps
  • Oversee efforts of 1-3 associates, including setting performance standards, managing their staffing, and monitoring performance

Senior Manager, Data Engineering

Stanford Health Care is seeking a dynamic and experienced Senior Manager of Data...
Location
United States, Palo Alto
Salary
83.98 - 111.27 USD / Hour
Stanford Health Care
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field. A Master’s degree is preferred
  • 8+ years of hands-on experience in data engineering, data warehousing, or a related field
  • 3+ years of experience in a leadership or management role, directly managing multiple technical teams
  • Experience in a healthcare or academic medical center environment is highly desirable
  • Proven ability to lead, mentor, and grow multiple high-performing engineering teams
  • Expert knowledge of data engineering principles, including ETL/ELT, data modeling, and data architecture
  • Strong strategic planning skills with experience in developing roadmaps for both data engineering and enterprise analytics
  • Deep understanding of cloud data platforms (GCP, AWS, Azure) and modern data warehouse technologies (Databricks, Snowflake, BigQuery, Redshift)
  • Proficiency with big data technologies (e.g., Spark, Kafka) and workflow orchestration tools (e.g., Airflow)
  • Expert-level SQL skills and proficiency in a programming language such as Python or Scala
Job Responsibility
  • Lead, mentor, and manage multiple teams of data engineers, providing technical guidance, career development, and performance management
  • Foster a culture of collaboration, innovation, and operational excellence
  • Manage resource allocation, project prioritization, and team workloads to ensure timely delivery of key initiatives
  • Recruit, coach, and motivate team members, developing leadership competencies within the team
  • Oversee the design, development, and maintenance of scalable and robust data pipelines (ETL/ELT) to support analytics and data science
  • Define and drive the technical roadmap for our cloud-based data platforms (e.g., GCP, AWS, Azure) and data warehousing solutions (e.g., Databricks)
  • Define and implement the strategy for enterprise analytics and operational reporting, in collaboration with key business and clinical stakeholders
  • Partner with enterprise architecture, applications, and data science teams to ensure sustainable and scalable data management and solution delivery
  • Partner with key stakeholders across Stanford Medicine (Stanford Health Care, Stanford School of Medicine, and Stanford Medicine Partners), including data scientists, analysts, clinical leaders, and researchers, to translate data needs into technical requirements
  • Manage the full lifecycle of data engineering projects, from conception and planning to execution and delivery
  • Fulltime

Associate Data Engineer

Location
India, Hyderabad
Salary
Not provided
Amgen
Expiration Date
Until further notice
Requirements
  • 2-6 years of experience with a Bachelor’s degree in Computer Science, Data Engineering, Information Systems, Engineering, Mathematics, or a related field, or equivalent practical experience
  • Hands-on experience with Python for data processing, scripting, and automation
  • Strong working knowledge of PySpark and distributed data processing concepts
  • Proven hands-on experience using Databricks for data engineering, including notebooks, clusters, jobs, workflows, Delta tables, and performance optimization
  • Ability to build, maintain, and troubleshoot scalable ETL/ELT pipelines in Databricks
  • Experience working with Delta Lake and lakehouse architecture concepts
  • Working knowledge of SQL for querying, transforming, and validating data
  • Ability to work with structured and semi-structured data formats such as CSV, JSON, Parquet, and Delta
  • Understanding of data engineering concepts such as ETL/ELT, data pipelines, data lakes, data warehouses, batch processing, and data quality
  • Basic understanding of AI and machine learning concepts, including features, training datasets, model inputs/outputs, and model evaluation basics
Job Responsibility
  • Develop, test, and maintain data pipelines using Databricks, PySpark, and Python
  • Ingest, transform, and process structured and semi-structured data from multiple sources
  • Support the development of scalable ETL/ELT workflows for analytics, reporting, and machine learning use cases
  • Work with data engineers, analysts, and data scientists to understand data requirements and deliver reliable datasets
  • Perform data cleansing, validation, and quality checks to ensure accuracy and consistency
  • Optimize Spark jobs and Databricks notebooks for performance, reliability, and cost efficiency
  • Create and maintain documentation for data pipelines, workflows, data definitions, and processes
  • Assist in troubleshooting pipeline failures, data issues, and performance bottlenecks
  • Follow best practices for version control, code quality, testing, and deployment
  • Support basic AI/ML data preparation activities, including feature engineering, dataset creation, and model input preparation
  • Fulltime

Data Engineer - Data & Technology, SCD

Job ID: R0014920, Date posted: 21/05/2026. Supply Data & Analytics has the assig...
Location
Sweden, Älmhult
Salary
Not provided
IKEA
Expiration Date
June 04, 2026
Requirements
  • At least 5 years' experience as a Data Engineer or equivalent
  • Databricks or Fabric certified
  • Deep operational knowledge of Python, SQL and PySpark
  • Excellent knowledge of SQL, Python and PySpark
  • Excellent working knowledge of Databricks, with Associate Data Engineer or Professional Data Engineer certification
  • Good communication and collaboration skills
  • Team player who takes ownership and builds cross-functional relationships
  • Curious problem solver who wants to keep learning
  • Shares IKEA values
Job Responsibility
  • Design, develop, and maintain data models (including dimensional modelling, star/snowflake schemas, Data Vault, and Medallion architecture) to support business needs of analytics and self-service reporting across Core Business Supply
  • Design, implement, and maintain robust ETL/ELT pipelines that integrate diverse internal and external data sources (databases, APIs, real-time streams) into IKEA's data platforms
  • Collaborate closely with business stakeholders, architects, analysts, data product leaders, and data scientists to translate complex supply chain and business requirements into scalable, performant, and well-governed data models and solutions
  • Establish, promote, and enforce data engineering best practices, including data modelling standards, data quality, version control, testing, monitoring, documentation, and metadata management
  • Optimise data models and pipelines for performance, reliability, scalability, and cost-efficiency across batch and real-time processing
  • Ensure data is stored, processed, and accessed securely, with full adherence to data governance, privacy, and compliance standards
  • Contribute to the development and adoption of reusable frameworks, components, and semantic layers that accelerate data product delivery and increase consistency across the organisation
  • Stay current with emerging data engineering technologies and practices, applying them where they create clear business value
  • Promote a culture of data sharing, engineering excellence, and continuous improvement while guiding and developing less experienced coworkers through mentoring, code reviews, and knowledge sharing
What we offer
  • Opportunity to develop your career globally
  • Vibrant culture where ideas are heard
  • Opportunity to learn new skills
  • Flexible work from home
  • Fulltime

Sr Data Engineer

The Senior Data Engineer plays a critical role in delivering strategic and opera...
Location
United States
Salary
120000.00 - 160000.00 USD / Year
Personify Health
Expiration Date
Until further notice
Requirements
  • At least one AWS certification (e.g., AWS Certified Data Analytics – Specialty, Big Data – Specialty, Developer – Associate)
  • 7+ years in data engineering or analytics engineering, with a strong focus on cloud-native architectures. Proven experience designing and operating scalable data platforms in AWS
  • 5+ years in healthcare, insurance, or claims processing, including 5+ years working with EDI (834, 835, 837, 2222, 2223, 999), X12 file standards or HL7 standards and familiarity with HIPAA and CMS compliance
  • Expert-level proficiency in SQL (including pivots, window functions, and complex date calculations) and Python for data processing, transformation, and application development
  • Hands-on experience with orchestration tools like Airflow, containerization with Docker, and CI/CD pipelines. Strong bias for automation and continuous improvement
  • Proficient in consuming and transforming REST APIs and JSON data into relational models. Skilled in building robust data ingestion and transformation pipelines
  • Experience with JIRA, BitBucket Git, BitBucket Pipelines, and collaboration with cross-functional teams including Data Analysts, Data Scientists, Product, and Account Management
  • Proficient in Excel and BI tools such as Tableau, Power BI, and MicroStrategy for data analysis and reporting
  • Detail-oriented with a strong focus on data quality, accuracy, and performance tuning for large-scale data systems. Background in cost optimization and system reliability
  • Ability to mentor engineers, share technical knowledge, and communicate effectively with both technical and non-technical stakeholders. Strong documentation and systems thinking
Job Responsibility
  • Build data applications and processes using Python, SQL, and Django
  • Manage and query data in PostgreSQL, Oracle, and cloud-native databases
  • Examine, extract, cleanse, and load data while implementing quality assurance rules and tools to ensure consistent and accurate data
  • Work with healthcare-specific data processes such as EDI file transfers, claims adjudication, audits, eligibility verification, and reporting workflows
  • Collaborate with cross-functional teams (Data Analysts, Data Scientists, Product, Reporting, Account Management) to define requirements and deliver data-driven solutions
  • Ensure data quality, integrity, and security through automated validation, auditing, and monitoring, with compliance to HIPAA and CMS regulations
  • Monitor, maintain, and tune pipeline performance
  • Proactively troubleshoot and resolve complex data flow and system issues
  • Provide technical mentorship to Data Engineers, sharing expertise in data modeling, pipeline development, and troubleshooting practices
  • Research and propose improvements to the tech stack and data engineering processes
What we offer
  • Competitive base salary and benefits effective day one
  • Comprehensive medical and dental through our own health solutions (yes, we use what we build)
  • Unlimited PTO—rest and recharge time is non-negotiable
  • Mental health support, retirement planning, and financial protection
  • Professional development with clear career progression and learning budgets
  • Mission-driven culture where diverse perspectives drive real impact on people's health
  • Fulltime

Technical Architect (AI)

The AI Solutions Engineer is a specialist role, responsible for actively partici...
Location
Indonesia, Jakarta Selatan
Salary
Not provided
NTT DATA
Expiration Date
Until further notice
Requirements
  • Demonstrated understanding of artificial intelligence, natural language processing (NLP), and machine learning principles
  • Expertise in selecting, fine-tuning, and deploying large and small language models (LLMs/SLMs), such as OpenAI's GPT series and open-source alternatives
  • Specialist proficiency in Python programming, essential for rapid prototyping, integration, and model implementation
  • Knowledge of additional programming languages (optional, but valuable): JavaScript / TypeScript, Java / C#
  • Familiarity with full-stack software development, including frontend and backend integration, user experience considerations, and system interoperability
  • Robust knowledge of data pipeline development, data engineering concepts, and handling of structured and unstructured data
  • Proficiency in cloud computing platforms (Azure, AWS, GCP), particularly in deploying, scaling, and managing AI workloads
  • Experience with Microsoft Copilot Studio, Azure AI Foundry, and Semantic Kernel is highly desirable
  • Awareness and application of security, compliance, and risk management practices related to AI solutions
  • Understanding of ethical AI considerations, bias mitigation, and responsible AI deployment
Job Responsibility
  • Develop, fine-tune, and deploy AI models, including large language models (LLMs) such as GPT-4 or open-source equivalents
  • Design and implement effective prompt engineering strategies and optimizations to enhance AI accuracy, consistency, and reliability
  • Engage with internal stakeholders and clients to understand business needs, translating them into actionable AI solutions
  • Rapidly prototype, test, and iterate AI applications using advanced Python programming and relevant frameworks
  • Integrate AI solutions securely with existing enterprise systems (CRM, ERP, HRIS, finance platforms, collaboration software) via API development and integration
  • Build, maintain, and optimize end-to-end data pipelines to ensure accurate and timely data delivery for AI models
  • Manage structured and unstructured datasets, leveraging vector databases and semantic search to enhance knowledge management capabilities
  • Deploy, manage, and scale AI solutions within cloud computing environments (Azure, AWS, GCP), ensuring high availability, performance, and cost efficiency
  • Implement DevOps and MLOps practices, including automated deployment, testing, monitoring, and version control, to efficiently manage the AI model lifecycle
  • Ensure AI solutions adhere to industry standards and compliance regulations (GDPR, HIPAA), emphasizing security and privacy best practices
  • Fulltime