Data Scientist – AI, LLMs & Data Pipelines

India, Pune · Job Posted March 05, 2026

Job Description

This role is for our US-based Media Analytics client account, a distributed SaaS company using AI to power media monitoring and analysis. Their platform enables clients to understand what’s being said in the world and why it matters, using LLMs, data pipelines, and custom metrics to turn raw data into actionable insights.

We are looking for a high-impact Data Scientist to help reimagine and transform the way the organization operates and delivers products to its clients. This is a strategic role that blends advanced data science, large language models (LLMs), and product thinking to drive simplicity, usability, quality, and operational efficiency across the company. It is intended for someone with 5+ years of experience, a deep understanding of LLMs (especially OpenAI’s ecosystem), and the skills to hit the ground running alongside our existing AI/ML team. You will work at the intersection of AI innovation, operational design, and client experience, bringing technical rigor, business intuition, and creative energy to solve complex challenges.

Job Responsibility

  • LLM-led Innovation: Build and deploy LLM-powered tools and workflows that simplify analyst work, reduce errors, and accelerate delivery
  • Operational Transformation: Understand operational processes and create intelligent, scalable solutions that eliminate complexity and manual effort
  • Product Evolution: Partner with product and operations teams to infuse intelligence and automation into client-facing platforms
  • Client-Centric Design: Translate client pain points and product gaps into practical, data-driven AI solutions that enhance experience and outcomes
  • Rapid Experimentation: Prototype fast, test early, iterate often. Maintain speed without sacrificing accuracy or quality
  • Cross-Functional Collaboration: Work closely with engineering, product, client success, and operations teams to bring ideas to life
  • Own and operate data pipelines: Run, troubleshoot, and improve the scripts and workflows that move and transform our data—upgrading these to use workflow orchestration tools for robustness and automation
  • Work with LLMs: Fine-tune models, conduct reinforcement learning with human feedback (RLHF), and support preference-review workflows
  • Model evaluation & monitoring: Analyze results and report on the effectiveness of AI models and data labeling efforts using both established and novel benchmarks
  • Thematic extraction at scale: Help identify and surface underlying themes, narratives, and sentiment across vast media datasets
  • Metric innovation: Derive new ways to quantify media content and influence using observed patterns and domain insight
  • Collaborate & iterate: Partner closely with key stakeholders (such as the Lead Data Scientist and members of the Engineering, Product, and Operations teams) to evolve our AI- and ML-powered systems
  • Secure, scalable coding: Contribute high-quality, secure code in a cloud-based environment

Requirements

  • 5+ years of hands-on experience as a data scientist or ML engineer, with demonstrated ownership of projects in production
  • Minimum of 2 years’ experience with LLMs
  • Proven experience applying LLMs and generative AI to real-world business problems
  • Strong Linux skills – comfortable navigating and scripting in a CLI-first environment
  • Expert Python skills – you write clean, maintainable, tested, and efficient code
  • Deep experience working with OpenAI models and APIs, including prompt engineering, fine-tuning, and evaluation
  • Fluent in SQL with ability to work efficiently with PostgreSQL datasets
  • Experience using Docker for local and production development
  • Proficiency with LangChain for building multi-step LLM workflows
  • Effective remote communication and collaboration, especially in a distributed team with meetings on US Eastern Time
  • Strong business acumen—able to connect technical solutions to operational and client value
  • Startup-style drive, agility, and hands-on mindset—you ship, not just ideate
  • Creativity and experimentation—willing to try, fail, and improve
  • Exceptional communication skills—can explain complex ideas simply and persuasively
  • Deep proficiency in Python, NLP, and AI frameworks (e.g., Hugging Face, LangChain), vector databases, and prompt engineering
  • Minimum 3–5 years of experience in AI/ML roles, preferably in B2B or SaaS environments
  • Bachelor's or Master's in Computer Science, Data Science, AI, or related field
  • Experience with AWS cloud services (EC2, S3, etc.)
  • Familiarity with workflow orchestration tools

Nice to have

  • Knowledge of media analytics, journalism, or influence measurement
  • Prior work with data labeling, theme detection, or benchmarking model output

Similar Jobs for Data Scientist – AI, LLMs & Data Pipelines

8 matching positions

Senior Data Scientist / AI Consultant – Security & Risk Analytics

Senior Data Scientist / AI Consultant – Security & Risk Analytics. Hybrid – 3 da...
Location: Ireland, Dublin 18
Salary: 100000.00 EUR / Year
Company: Solas IT Recruitment (solasit.ie)
Expiration Date: Until further notice
Requirements
  • Proven experience delivering end-to-end ML and DL solutions in production environments
  • 8+ years commercial experience
  • Strong programming skills in Python and solid SQL expertise
  • Experience with Generative AI, including LLMs, RAG, vector databases, and agent-based frameworks
  • Strong analytical thinking with the ability to communicate insights to both technical and non-technical stakeholders
  • Hands-on experience in data preparation, feature engineering, and model evaluation
  • A proactive and solution-oriented mindset, comfortable working in fast-paced environments
Job Responsibility
  • Translate complex business challenges into data-driven solutions using ML, DL, and GenAI techniques
  • Manage the full model lifecycle including data exploration, feature engineering, modelling, validation, and deployment
  • Build classification and scoring models, effectively handling incomplete or ambiguous datasets
  • Develop explainable AI outputs to support business decision-making
  • Deploy and monitor models to ensure performance, scalability, and reliability
  • Prototype new concepts and evaluate emerging tools and frameworks
  • Collaborate with engineering and product teams to deliver scalable, production-ready systems
  • Support data pipelines, cloud platforms, and CI/CD processes for analytics workloads
What we offer
  • Excellent salary package
  • Great benefits
  • Very good career progression
Employment type: Full-time

Principal Applied Data Scientist - AI for Good Lab

The AI for Good Lab is hiring a Principal Applied Data Scientist to join our tea...
Location: United States, Redmond
Salary: 139900.00 - 274800.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 6+ years related experience (e.g., statistics, predictive analytics, research)
  • OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 4+ years related experience (e.g., statistics, predictive analytics, research)
  • OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience (e.g., statistics, predictive analytics, research) OR equivalent experience
  • Deep foundation in AI, machine learning, statistics, or related quantitative methods applied to real-world problems
  • Experience working end-to-end with data, from sourcing and exploration through modeling, interpretation, and communication
  • Proficiency in at least one scientific programming language (Python, R or equivalent languages) and experience with SQL or similar query languages
  • Excellent written and verbal communication skills, with demonstrated experience communicating complex ideas clearly and persuasively to non-technical audiences
  • Proven ability to influence outcomes and lead work in cross-functional, matrixed environments
Job Responsibility
  • Lead and develop applied AI solutions (LLMs, Agents, Computer Vision) and data science solutions by identifying and gathering data, shaping problem formulations, applying AI, machine learning, and statistical methods, and generating insight with real-world impact
  • Use AI creatively as a research and solution-building tool, combining quantitative methods, experimentation, and domain knowledge to surface patterns, test ideas, and inform decisions
  • Rapidly prototype and validate approaches using modeling, statistics and experimentation; select methods under real-world constraints (cost/latency, safety, privacy, maintainability)
  • Design and build reliable, maintainable, end-to-end systems spanning data pipelines, model lifecycle, evaluation/telemetry, deployment, and operations
  • Advance the AI for Good Lab research agenda by authoring technical papers and presentations, published both internally and externally
  • Work in close partnership with other researchers and research organizations, as well as policy, industry, and nonprofit stakeholders, to co-create solutions
  • Present findings with clear and compelling narratives, using impactful visualizations and storytelling to articulate insights that drive understanding and action
  • Lead through influence by shaping technical direction and standards (model evaluation, responsible AI, safety/privacy, and monitoring), aligning collaborators, navigating tradeoffs, and sustaining momentum across teams and institutions
Employment type: Full-time

AI Data Scientist

We are seeking a highly skilled and experienced Senior Data Scientist / AI Engin...
Location: Poland, Warszawa
Salary: Not provided
Company: Robert Bosch Sp. z o.o. (https://www.bosch.pl/)
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Data Science, or a related field. A PhD is preferred
  • Minimum of 5 years of experience in data science and AI engineering
  • Strong proficiency in programming languages like Python
  • In-depth knowledge of machine learning algorithms and techniques
  • Experience with LLMs and RAG or agent frameworks (LangChain, LlamaIndex, etc.)
  • Proficiency in NLP (Natural Language Processing) techniques and libraries
  • Strong problem-solving and analytical skills
  • Excellent communication and collaboration abilities
  • Ability to work independently and in a team environment
  • Strong attention to detail and ability to prioritize tasks
Job Responsibility
  • Collaborate with cross-functional teams to identify, understand, and solve business problems
  • Design, develop, and implement advanced machine learning algorithms, models, and LLM-based pipelines
  • Analyze large datasets to extract meaningful insights and patterns
  • Write production-quality code in Python, using standard software engineering best practices (git, testing, etc.)
  • Stay up-to-date with the latest advancements in AI, GenAI, and data science
  • Collaborate with software engineers to integrate AI models into production systems
  • Provide guidance and mentorship to junior data scientists and engineers
What we offer
  • Competitive salary + annual bonus
  • Hybrid work with flexible working hours
  • Referral Bonus Program
  • Copyright costs for IT employees
  • A complex working environment, professional support, and the opportunity to share knowledge and best practices
  • Ongoing development opportunities in a multinational environment
  • Broad access to professional training (incl. language courses), conferences, and webinars
  • Private medical care and life insurance
  • Cafeteria System with multiple benefits (incl. MultiSport, shopping vouchers, cinema tickets, etc.)
  • Prepaid Lunch Card
Employment type: Full-time

AI Data Scientist - Senior

We are seeking a highly skilled and experienced Senior Data Scientist / AI Engin...
Location: Poland, Warsaw
Salary: Not provided
Company: Robert Bosch Sp. z o.o. (https://www.bosch.pl/)
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Data Science, or a related field. A PhD is preferred
  • Minimum of 5 years of experience in data science and AI engineering
  • Strong proficiency in programming languages like Python
  • In-depth knowledge of machine learning algorithms and techniques
  • Experience with LLMs and RAG or agent frameworks (LangChain, LlamaIndex, etc.)
  • Proficiency in NLP (Natural Language Processing) techniques and libraries
  • Strong problem-solving and analytical skills
  • Excellent communication and collaboration abilities
  • Ability to work independently and in a team environment
  • Strong attention to detail and ability to prioritize tasks
Job Responsibility
  • Collaborate with cross-functional teams to identify, understand, and solve business problems
  • Design, develop, and implement advanced machine learning algorithms, models, and LLM-based pipelines
  • Analyze large datasets to extract meaningful insights and patterns
  • Write production-quality code in Python, using standard software engineering best practices (git, testing, etc.)
  • Stay up-to-date with the latest advancements in AI, GenAI, and data science
  • Collaborate with software engineers to integrate AI models into production systems
  • Provide guidance and mentorship to junior data scientists and engineers
What we offer
  • Competitive salary + annual bonus
  • Hybrid work with flexible working hours
  • Referral Bonus Program
  • Copyright costs for IT employees
  • A complex working environment, professional support, and the opportunity to share knowledge and best practices
  • Ongoing development opportunities in a multinational environment
  • Broad access to professional training (incl. language courses), conferences, and webinars
  • Private medical care and life insurance
  • Cafeteria System with multiple benefits (incl. MultiSport, shopping vouchers, cinema tickets, etc.)
  • Prepaid Lunch Card
Employment type: Full-time

Senior Data Scientist – Gen AI Engineer

Working at Citi is far more than just a job. A career with us means joining a te...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • Master's degree in Computer Science, Data Science, or a related field. Ph.D. preferred.
  • 8-12 years of experience in a Data Scientist or equivalent role, with at least 4 years of specialized experience in Generative AI, including experience leading technical development and mentoring teams.
  • Candidates must possess demonstrable experience in the full lifecycle of real-world, production-level GenAI project implementation.
  • Data structures (lists, dictionaries, sets), algorithms, object-oriented programming, file handling, exception handling.
  • Scientific Computing: NumPy, Pandas, SciPy.
  • Machine Learning: Scikit-learn, XGBoost, LightGBM.
Job Responsibility
  • Working with financial data and applying NLP techniques, refining prompt engineering strategies for LLMs, collaborating with stakeholders to understand their needs, developing and testing Python code for GenAI solutions, integrating with vector databases, monitoring MLOps pipelines, researching emerging GenAI technologies, and participating in team meetings.
  • Troubleshooting and debugging GenAI models in production.
  • Staying up-to-date with the rapidly evolving GenAI landscape.
  • Communicating technical concepts clearly to non-technical stakeholders.
Employment type: Full-time

Senior Data Scientist – Gen AI Engineer - Assistant Vice President

Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Data Science, Artificial Intelligence, or a related quantitative field.
  • 8–12 years of experience as a Data Scientist or equivalent role, with at least 2 years of specialized, hands-on experience in Generative AI, including leading technical development and mentoring teams.
  • Demonstrable experience across the full lifecycle of production-level GenAI projects — from ideation and prototyping through deployment, monitoring, and ongoing maintenance in live environments.
  • Expert-level Python proficiency
  • Scikit-learn, XGBoost, LightGBM
  • PyTorch
  • Hugging Face Transformers, LangChain, LlamaIndex, Semantic Kernel
  • LangGraph, CrewAI, AutoGen
  • Transformer architectures
  • RAG patterns
Job Responsibility
  • Working with financial and enterprise data, applying modern NLP and GenAI techniques to solve business problems.
  • Designing, refining, and systematizing prompt engineering strategies for large language models (LLMs), including structured prompting, chain-of-thought, and few-shot/zero-shot approaches.
  • Collaborating with business stakeholders to translate requirements into GenAI-powered solutions.
  • Developing, testing, and maintaining production-grade Python code for GenAI applications.
  • Integrating with vector databases (e.g., Pinecone, Weaviate, Milvus, pgvector, Qdrant) for retrieval-augmented generation (RAG) pipelines.
  • Building, monitoring, and optimizing MLOps/LLMOps pipelines for continuous model deployment and observability.
  • Researching and evaluating emerging GenAI technologies, frameworks, and best practices to maintain competitive advantage.
  • Troubleshooting and debugging GenAI models and agentic systems in production, including rapid identification and resolution of issues in real-world deployments.
  • Communicating complex AI/ML concepts clearly to non-technical stakeholders, translating technical jargon into actionable business terms.
  • Participating in and leading team meetings, design reviews, and architecture discussions.
Employment type: Full-time

Data Scientist – Agentic AI

At Satalia, a WPP company, we push the boundaries of data science, optimisation ...
Location: United Kingdom, London
Salary: Not provided
Company: Satalia (satalia.com)
Expiration Date: Until further notice
Requirements
  • Educational background in Computer Science, Data Science, or similar
  • Professional experience in Data Science, AI/ML Engineering, Software Engineering, or related technical roles
  • High proficiency in Python and working knowledge of SQL
  • Experience building API services using FastAPI, Flask or similar
  • Experience with containerisation, cloud platforms and CI/CD
  • Experience with AgenticOps best practices, including evaluation, observability, safety guardrails, security, performance optimisation
  • Experience designing and building complex multi-agent systems using at least one of the major agent frameworks (e.g. ADK, AutoGen, LangGraph)
  • Understanding of agent memory and state management patterns
  • Familiarity with MCP servers
  • Quality-first, test-driven mindset with focus on automated testing
Job Responsibility
  • Design and deliver production-grade agentic systems, from multi-agent orchestration to the AI services and models that power them
  • Assess fit of agentic solutions per use case, being intentional in system design
  • Design the most suitable orchestration pattern per use case
  • Evaluate and recommend the best suited agent development framework
  • Engineer multi-step agent workflows that are reliable, auditable and modular, supporting both linear and non-linear control flows
  • Implement agent memory and state management systems
  • Optimise context assembly for each agent interaction
  • Evaluate and select the most appropriate foundation models, embeddings and context-specific model variants; apply fine-tuning or adaptation where needed
  • Implement guardrails to ensure operation within ethical, legal, and brand boundaries
What we offer
  • Enhanced pension
  • Life assurance
  • Income protection
  • Private healthcare
  • Remote working
  • Truly flexible working hours
  • 27 days holiday plus bank holidays and enhanced family leave
  • Annual bonus
  • Impactful projects
  • People oriented culture
Employment type: Full-time

Senior Data Scientist (NLP & LLMs)

Gong harnesses the power of AI to transform how revenue teams win. The Gong Reve...
Location: Israel, Tel Aviv
Salary: Not provided
Company: Gong (gong.io)
Expiration Date: Until further notice
Requirements
  • M.Sc. or Ph.D. in an exact science field or equivalent industry experience
  • 5+ years of industry experience in data science/machine learning, including hands-on work with deep learning or large language models in real products
  • Practical LLM/NLP experience in building agentic workflows and NLP features end-to-end, from problem framing and data preparation to prompt/model design and production deployment
  • Deep familiarity with modern techniques, including RAG, fine-tuning, LLMs, as well as the evaluation frameworks required to benchmark them
  • Expert-level Python skills, including working with common data and ML libraries (e.g., pandas, NumPy, scikit-learn, PyTorch, Transformers, LangChain, or similar)
  • Proven track record designing and running large-scale experiments (e.g., offline benchmarks, prompt/model pipelines), and turning results into concrete decisions
  • Deep expertise in statistics and data analysis
  • Demonstrated skill in parsing unstructured, real-world data to uncover patterns and drive research strategy
  • Product mindset: experience working with product managers and other stakeholders, with a strong sense for customer needs and value
  • Ability to break down ambiguous problems, propose alternatives, and help the team focus on what will move the needle for users and the business
Job Responsibility
  • The AI Product Lifecycle: Driving the roadmap from ideation to deployment, including defining success metrics and assessing technical feasibility for complex ML problems
  • Agentic Systems: Architecting the dynamic reasoning chains, tool-use logic, and multi-step workflows that power the Gong AI Assistant, enabling it to navigate complex tasks with intelligence and autonomy
  • Conversational Intelligence Strategy: Deep-diving into millions of calls and emails to extract key insights, ensuring our models capture the true intent and nuance of unstructured human interactions at scale
  • Model & Pipeline Engineering: Building robust Python environments to collect proprietary datasets, training models from scratch or fine-tuning them, and integrating LLMs into sophisticated pipelines that serve our global customer base
  • Temporal Context & Deal Dynamics: Developing methods to track intent and extract key insights across continuous, multi-month sales cycles, connecting the dots between multiple conversations and interactions across an entire deal journey
  • Applied AI Research & Implementation: Applying a versatile toolkit that includes advanced RAG, unsupervised clustering, fine-tuning custom models, and designing agentic workflows to build high-impact features
  • Intelligent Sales Assistance: Architecting the Gong AI Assistant to serve as a deep-research partner; you'll build models that answer complex user queries, perform deal-level research, and surface non-obvious insights that are traditionally hard to extract from raw dialogue
  • Technical Product Translation: Partnering closely with Product Managers to transform high-level product requirements into scalable ML solutions and production-ready model architectures
  • Global Sales Transformation: The features you build will directly empower thousands of companies and their sales teams, fundamentally changing how they interact with customers and close deals