Data Scientist – AI, LLMs & Data Pipelines

SRKay Consulting Group

Location:
India, Pune

Contract Type:
Not provided

Salary:

Not provided

Job Description:

This role is for our US-based Media Analytics client account, a distributed SaaS company that uses AI to power media monitoring and analysis. Their platform enables clients to understand what’s being said in the world and why it matters, using LLMs, data pipelines, and custom metrics to turn raw data into actionable insights.

We are looking for a high-impact Data Scientist to help reimagine and transform the way the organization operates and delivers products to its clients. This is a strategic role that blends advanced data science, large language models (LLMs), and product thinking to drive simplicity, usability, quality, and operational efficiency across the company.

This role is intended for someone with 5+ years of experience, a deep understanding of LLMs (especially OpenAI’s ecosystem), and the skills to hit the ground running alongside our existing AI/ML team. You will work at the intersection of AI innovation, operational design, and client experience, bringing technical rigor, business intuition, and creative energy to solve complex challenges.

Job Responsibility:

  • LLM-led Innovation: Build and deploy LLM-powered tools and workflows that simplify analyst work, reduce errors, and accelerate delivery
  • Operational Transformation: Understand operational processes and create intelligent, scalable solutions that eliminate complexity and manual effort
  • Product Evolution: Partner with product and operations teams to infuse intelligence and automation into client-facing platforms
  • Client-Centric Design: Translate client pain points and product gaps into practical, data-driven AI solutions that enhance experience and outcomes
  • Rapid Experimentation: Prototype fast, test early, iterate often. Maintain speed without sacrificing accuracy or quality
  • Cross-Functional Collaboration: Work closely with engineering, product, client success, and operations teams to bring ideas to life
  • Own and operate data pipelines: Run, troubleshoot, and improve the scripts and workflows that move and transform our data—upgrading these to use workflow orchestration tools for robustness and automation
  • Work with LLMs: Fine-tune models, conduct reinforcement learning with human feedback (RLHF), and support preference-review workflows
  • Model evaluation & monitoring: Analyze results and report on the effectiveness of AI models and data labeling efforts using both established and novel benchmarks
  • Thematic extraction at scale: Help identify and surface underlying themes, narratives, and sentiment across vast media datasets
  • Metric innovation: Derive new ways to quantify media content and influence using observed patterns and domain insight
  • Collaborate & iterate: Partner closely with key stakeholders (such as the Lead Data Scientist and members of the Engineering, Product, and Operations teams) to evolve our AI- and ML-powered systems
  • Secure, scalable coding: Contribute high-quality, secure code in a cloud-based environment

Requirements:

  • 5+ years of hands-on experience as a data scientist or ML engineer, with demonstrated ownership of projects in production
  • Minimum of 2 years’ experience with LLMs
  • Proven experience applying LLMs and generative AI to real-world business problems
  • Strong Linux skills – comfortable navigating and scripting in a CLI-first environment
  • Expert Python skills – you write clean, maintainable, tested, and efficient code
  • Deep experience working with OpenAI models and APIs, including prompt engineering, fine-tuning, and evaluation
  • Fluent in SQL with ability to work efficiently with PostgreSQL datasets
  • Experience using Docker for local and production development
  • Proficiency with LangChain for building multi-step LLM workflows
  • Effective remote communication and collaboration, especially in a distributed team with meetings on US Eastern Time
  • Strong business acumen—able to connect technical solutions to operational and client value
  • Startup-style drive, agility, and hands-on mindset—you ship, not just ideate
  • Creativity and experimentation—willing to try, fail, and improve
  • Exceptional communication skills—can explain complex ideas simply and persuasively
  • Deep proficiency in Python, NLP, and AI frameworks (e.g., Hugging Face, LangChain), vector databases, and prompt engineering
  • Minimum 3–5 years of experience in AI/ML roles, preferably in B2B or SaaS environments
  • Bachelor's or Master's in Computer Science, Data Science, AI, or related field
  • Experience with AWS cloud services (EC2, S3, etc.)
  • Familiarity with workflow orchestration tools

Nice to have:

  • Knowledge of media analytics, journalism, or influence measurement
  • Prior work with data labeling, theme detection, or benchmarking model output

Additional Information:

Job Posted:
March 05, 2026

Employment Type:
Full-time

Similar Jobs for Data Scientist – AI, LLMs & Data Pipelines

AI Data Scientist

We are seeking a highly skilled and experienced Senior Data Scientist / AI Engin...
Location:
Poland, Warszawa
Salary:
Not provided
Robert Bosch Sp. z o.o.
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Data Science, or a related field. A PhD is preferred
  • Minimum of 5 years of experience in data science and AI engineering
  • Strong proficiency in programming languages like Python
  • In-depth knowledge of machine learning algorithms and techniques
  • Experience with LLMs and RAG or agent frameworks (Langchain, LLamaIndex, etc.)
  • Proficiency in NLP (Natural Language Processing) techniques and libraries
  • Strong problem-solving and analytical skills
  • Excellent communication and collaboration abilities
  • Ability to work independently and in a team environment
  • Strong attention to detail and ability to prioritize tasks
Job Responsibility:
  • Collaborate with cross-functional teams to identify, understand, and solve business problems
  • Design, develop, and implement advanced machine learning algorithms, models, and LLM-based pipelines
  • Analyze large datasets to extract meaningful insights and patterns
  • Write production-quality code in Python, using standard software engineering best practices (git, testing, etc.)
  • Stay up-to-date with the latest advancements in AI, GenAI, and data science
  • Collaborate with software engineers to integrate AI models into production systems
  • Provide guidance and mentorship to junior data scientists and engineers
What we offer:
  • Competitive salary + annual bonus
  • Hybrid work with flexible working hours
  • Referral Bonus Program
  • Copyright costs for IT employees
  • Complex environment of working, professional support and possibility to share knowledge and best practices
  • Ongoing development opportunities in a multinational environment
  • Broad access to professional trainings (incl. language courses), conferences and webinars
  • Private medical care and life insurance
  • Cafeteria System with multiple benefits (incl. MultiSport, shopping vouchers, cinema tickets, etc.)
  • Prepaid Lunch Card
  • Full-time

AI Data Scientist - Senior

We are seeking a highly skilled and experienced Senior Data Scientist / AI Engin...
Location:
Poland, Warsaw
Salary:
Not provided
Robert Bosch Sp. z o.o.
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Data Science, or a related field. A PhD is preferred
  • Minimum of 5 years of experience in data science and AI engineering
  • Strong proficiency in programming languages like Python
  • In-depth knowledge of machine learning algorithms and techniques
  • Experience with LLMs and RAG or agent frameworks (Langchain, LLamaIndex, etc.)
  • Proficiency in NLP (Natural Language Processing) techniques and libraries
  • Strong problem-solving and analytical skills
  • Excellent communication and collaboration abilities
  • Ability to work independently and in a team environment
  • Strong attention to detail and ability to prioritize tasks
Job Responsibility:
  • Collaborate with cross-functional teams to identify, understand, and solve business problems
  • Design, develop, and implement advanced machine learning algorithms, models, and LLM-based pipelines
  • Analyze large datasets to extract meaningful insights and patterns
  • Write production-quality code in Python, using standard software engineering best practices (git, testing, etc.)
  • Stay up-to-date with the latest advancements in AI, GenAI, and data science
  • Collaborate with software engineers to integrate AI models into production systems
  • Provide guidance and mentorship to junior data scientists and engineers
What we offer:
  • Competitive salary + annual bonus
  • Hybrid work with flexible working hours
  • Referral Bonus Program
  • Copyright costs for IT employees
  • Complex environment of working, professional support and possibility to share knowledge and best practices
  • Ongoing development opportunities in a multinational environment
  • Broad access to professional trainings (incl. language courses), conferences and webinars
  • Private medical care and life insurance
  • Cafeteria System with multiple benefits (incl. MultiSport, shopping vouchers, cinema tickets, etc.)
  • Prepaid Lunch Card
  • Full-time

Senior Data Scientist

We are seeking a Senior Data Scientist with deep expertise in unstructured data ...
Location:
Taiwan
Salary:
Not provided
Beyond Limits
Expiration Date:
Until further notice
Requirements:
  • 5+ years of hands-on experience in AI, Machine Learning, and Data Science, with a strong focus on production-scale AI
  • Expertise in LLMs, including fine-tuning, distributed training, quantization, and pruning techniques
  • Experience working with OCR, ASR, and TTS applications in real-world deployments
  • Proven experience deploying AI models in production, with real-world examples of scaled AI applications
  • Strong understanding of cloud computing, containerization (Docker, Kubernetes), and ML Ops best practices
  • Proficiency in Python, PyTorch, and ML libraries
  • Hands-on experience with vector databases and retrieval-augmented generation (RAG) architectures
  • Strong awareness of AI system performance benchmarks (latency, speed, throughput) and ability to optimize models accordingly
  • Experience working with AI agents, designing real-world intelligent automation solutions beyond just open-source experimentation
  • Proficiency in transformer-based architectures (BERT, GPT, LLaMA, Whisper, etc.), including pre-training, fine-tuning, and task-specific adaptation
Job Responsibility:
  • Develop and deploy AI models for unstructured data (text, speech, audio, images) with a focus on enterprise-scale performance
  • Fine-tune, optimize, and deploy LLMs and multimodal models, integrating distributed training, quantization, and pruning techniques for efficiency
  • Design and implement production-ready AI solutions, ensuring scalability, low-latency inference, and high throughput
  • Work with AI agents and automation frameworks to create intelligent, real-world AI applications for enterprise clients
  • Build and maintain end-to-end LLM Ops pipelines, ensuring efficient training, deployment, monitoring, and model updates
  • Implement vector search and retrieval-augmented generation (RAG) systems for large-scale data solutions
  • Monitor AI performance using key metrics such as speed, latency, and throughput, continuously refining models for real-world efficiency
  • Work with cloud-based AI infrastructure (AWS, GCP) and containerized environments (Docker, Kubernetes) to scale AI solutions
  • Collaborate with engineering, DevOps, and product teams to align AI solutions with business needs and client requirements
  • Implement data curation pipelines, including data collection, cleaning, deduplication, decontamination, etc. for training high-quality AI models

Data Scientist Specialist

We are seeking a highly experienced Data Scientist Specialist with deep expertis...
Location:
United States, McLean
Salary:
Not provided
Apex Systems
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in AI, Data Science, Computer Science, or related field
  • Extensive experience in AI/ML, including 3+ years in applied GenAI or LLM-based solutions
  • Deep expertise in prompt engineering, fine-tuning, RAG, GraphRAG, vector databases, and multi-modal models
  • Proven experience with AWS cloud-native AI development (SageMaker, Bedrock, MLFlow/Kubeflow on EKS)
  • Strong programming skills in Python and ML/LLM libraries (Transformers, LangChain, etc.)
  • Strong understanding of GenAI system patterns, agentic architectures, evaluation frameworks, and guardrails
  • Demonstrated success working in cross-functional, agile teams
  • GitHub code repository link required for candidate evaluation
Job Responsibility:
  • Architect and implement GenAI systems: Build scalable AI agents, agentic workflows, and GenAI applications for diverse business use cases
  • Model development & optimization: Fine-tune and optimize lightweight LLMs; evaluate and adapt models such as Claude (Anthropic), Azure OpenAI, and open-source alternatives
  • RAG & GraphRAG architectures: Design and deploy Retrieval-Augmented Generation (RAG) and GraphRAG systems using vector databases and enterprise knowledge bases
  • Enterprise data curation: Curate and prepare enterprise data using connectors integrated with AWS Bedrock Knowledge Bases and/or Elasticsearch
  • Agent interoperability: Implement solutions leveraging Model Context Protocol (MCP) and Agent-to-Agent (A2A) communication patterns
  • Experimentation & ML platforms: Build and maintain Jupyter-based notebooks using SageMaker, MLFlow, or Kubeflow on Kubernetes (EKS)
  • Cross-functional collaboration: Work with UI engineers, microservices teams, designers, and data engineers to deliver full-stack GenAI experiences
  • Enterprise integration: Integrate GenAI solutions with enterprise platforms via APIs and standardized GenAI architectural patterns
  • Evaluation & safety: Establish evaluation frameworks, bias mitigation strategies, safety protocols, and guardrails for production deployment
What we offer:
  • Medical, dental, vision, life, disability, and other insurance plans
  • ESPP (employee stock purchase program)
  • 401K program with company match after 12 months
  • HSA (Health Savings Account on the HDHP plan)
  • SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions

Data Scientist

We are looking for a skilled Data Scientist with 3–5 years of experience to join...
Salary:
Not provided
Bugendai Tech
Expiration Date:
Until further notice
Requirements:
  • 3–5 years of hands-on experience in Data Science or Machine Learning roles
  • Strong programming skills in Python with libraries like Pandas, NumPy, Scikit-learn, etc.
  • Proficiency in writing complex SQL queries for large datasets
  • Experience working with Generative AI / LLMs
  • Hands-on experience with AWS Cloud Services (e.g., EC2, S3, Lambda, SageMaker)
  • Solid understanding of data structures, algorithms, and model evaluation metrics
  • Ability to explain technical concepts to non-technical stakeholders
Job Responsibility:
  • Design and implement machine learning and GenAI models to solve real-world business problems
  • Write efficient, maintainable code in Python for data processing, model training, and evaluation
  • Perform deep data analysis using SQL to extract meaningful insights and trends
  • Build and optimize data pipelines and workflows in AWS Cloud (e.g., S3, Lambda, SageMaker, Redshift, etc.)
  • Collaborate with cross-functional teams including product, engineering, and business stakeholders
  • Monitor model performance and retrain/iterate as needed
  • Stay updated with the latest in Generative AI, LLMs, and related technologies

Senior Data Scientist

The Senior Medical Data Scientist will play a vital role in supporting Global Me...
Location:
India, Hyderabad
Salary:
Not provided
Amgen
Expiration Date:
Until further notice
Requirements:
  • PhD in Computer Science, Engineering, Statistics, or a related quantitative field and 2+ year(s) of experience in data analytics or data science
  • OR, Master’s degree in Computer Science, Engineering, Statistics, or a related quantitative field and 5+ years of experience in data analytics or data science
  • OR, Bachelor’s degree in Computer Science, Engineering, Statistics, or a related quantitative field and 7+ years of experience in data analytics or data science
  • Experience in data science, statistics, machine learning, or advanced analytics, with hands-on application of statistical techniques such as hypothesis testing, regression analysis, clustering, and classification
  • Demonstrated hands-on expertise in data science, statistical modeling, machine learning, advanced predictive modeling, NLP, or related fields, with successful delivery of end-to-end solutions
  • Advanced programming in SQL and Python (R a plus); comfort working in Databricks or similar environments
  • Strong storytelling and presentation skills; ability to translate data into clear, compelling insights
Job Responsibility:
  • Design and build high-impact projects that address complex medical questions across therapeutic areas, applying advanced data science methods (predictive modeling, NLP, LLMs, machine learning, causal inference)
  • Conduct end-to-end data science projects across structured and unstructured datasets (e.g., HCP sentiment, scientific engagement data, publications, CRM/field medical data) using Python, R, SQL, Tableau, Power BI, or similar platforms
  • Drive the design, application, and evaluation of test-and-learn approaches — including causal inference methods, advanced A/B testing, and pilots, with accountability for methodological rigor and actionable insights
  • Partner with business leaders and stakeholders to define problem statements, set project scope, and design and conduct medical data science projects such as MSL engagement optimization, MSL note sentiment analysis using LLMs, and digital engagement impact measurement, turning complex data into insights that enable more effective field medical activities
  • Design and optimize scalable data pipelines in collaboration with engineering teams, ensuring reproducibility, robustness, and compliance in deployment workflows
  • Translate complex model outputs into clear, actionable insights and recommendations and deliver compelling presentations, reports, and visualizations to medical and cross-functional leadership teams
  • Serve as a technical SME (subject matter expert) in data science, providing thought leadership and influencing design and methodology choices across the MDnA team
  • Passion to continuously learn, evaluate and introduce emerging technologies (e.g., Generative AI, LLMs, advanced A/B testing frameworks) and act as a motivational peer to strengthen Amgen’s data science toolbox
  • Maintain best practices in reproducibility, documentation, version control, and model governance across teams

Data Scientist

Medical Data Scientist will play a vital role in supporting Global Medical Affai...
Location:
India, Hyderabad
Salary:
Not provided
Amgen
Expiration Date:
Until further notice
Requirements:
  • PhD in Computer Science, Engineering, Statistics, or a related quantitative field and 1+ year(s) of experience in data analytics or data science
  • OR, Master’s degree in Computer Science, Engineering, Statistics, or a related quantitative field and 3+ years of experience in data analytics or data science
  • OR, Bachelor’s degree in Computer Science, Engineering, Statistics, or a related quantitative field and 5+ years of experience in data analytics or data science
  • Experience in data science, statistics, machine learning, or advanced analytics, with hands-on application of statistical techniques such as hypothesis testing, regression analysis, clustering, and classification
  • Working knowledge of SQL and Python (R a plus); comfort working in Databricks or similar environments
  • Passion for applying technical skills to solve complex business problems and drive data-driven transformation
Job Responsibility:
  • Support the application of data science methods — including advanced statistical techniques, predictive modeling, and machine learning — to design, develop, and deploy solutions that address complex medical questions across therapeutic areas
  • Support and execute components of data science projects across structured and unstructured datasets (e.g., HCP sentiment, scientific engagement data, publications, CRM/field medical data) using Python, R, SQL, Tableau, Power BI, or similar platforms
  • Assist in designing and evaluating experimentation and pilots (e.g., digital channel optimization or field medical targeting) to test innovative approaches, with a focus on scaling successful solutions
  • Work in technical teams and contribute to medical data science projects such as Medical Science Liaison (MSL) engagement optimization, MSL note sentiment analysis using LLMs, and digital engagement impact measurement, turning complex data into insights that enable more effective field medical activities
  • Support building and optimizing scalable data pipelines for ingestion, cleansing, transformation, model training, automation, and deployment in partnership with data engineering and platform teams
  • Collaborate with cross-functional stakeholders (Medical, Commercial, and Technology groups across Amgen India) to ensure analyses directly address key medical business questions and strategic priorities
  • Partner with data science, IT, and external vendors to support data access, integrity, and governance
  • Help to translate analytic outputs into clear, actionable insights and recommendations and deliver compelling presentations, reports, and visualizations to medical and cross-functional leadership teams
  • Stay curious — continuously learn and apply emerging methods such as LLMs, Generative AI, A/B testing, and causal inference to strengthen analytical approaches
  • Support documentation, version control, and deployment practices using standardized platforms and tools
What we offer:
  • Competitive salary and comprehensive benefits package
  • Opportunities for professional growth and career development
  • A collaborative and inclusive work environment
  • Full-time