Data Scientist – AI, LLMs & Data Pipelines Job at SRKay Consulting Group (Pune)

Job Description

This role is for our US-based Media Analytics client account, a distributed SaaS company using AI to power media monitoring and analysis. Their platform enables clients to understand what’s being said in the world and why it matters, using LLMs, data pipelines, and custom metrics to turn raw data into actionable insights.

We are looking for a high-impact Data Scientist to help reimagine and transform the way the organization operates and delivers products to its clients. This is a strategic role that blends advanced data science, large language models (LLMs), and product thinking to drive simplicity, usability, quality, and operational efficiency across the company.

This role is intended for someone with 5+ years of experience, a deep understanding of LLMs (especially OpenAI’s ecosystem), and the skills to hit the ground running alongside our existing AI/ML team. You will work at the intersection of AI innovation, operational design, and client experience, bringing technical rigor, business intuition, and creative energy to solve complex challenges.

Job Responsibilities

LLM-led Innovation: Build and deploy LLM-powered tools and workflows that simplify analyst work, reduce errors, and accelerate delivery
Operational Transformation: Understand operational processes and create intelligent, scalable solutions that eliminate complexity and manual effort
Product Evolution: Partner with product and operations teams to infuse intelligence and automation into client-facing platforms
Client-Centric Design: Translate client pain points and product gaps into practical, data-driven AI solutions that enhance experience and outcomes
Rapid Experimentation: Prototype fast, test early, iterate often. Maintain speed without sacrificing accuracy or quality
Cross-Functional Collaboration: Work closely with engineering, product, client success, and operations teams to bring ideas to life
Own and operate data pipelines: Run, troubleshoot, and improve the scripts and workflows that move and transform our data—upgrading these to use workflow orchestration tools for robustness and automation
Work with LLMs: Fine-tune models, conduct reinforcement learning with human feedback (RLHF), and support preference-review workflows
Model evaluation & monitoring: Analyze results and report on the effectiveness of AI models and data labeling efforts using both established and novel benchmarks
Thematic extraction at scale: Help identify and surface underlying themes, narratives, and sentiment across vast media datasets
Metric innovation: Derive new ways to quantify media content and influence using observed patterns and domain insight
Collaborate & iterate: Partner closely with key stakeholders (such as the Lead Data Scientist and members of the Engineering, Product, and Operations teams) to evolve our AI- and ML-powered systems
Secure, scalable coding: Contribute high-quality, secure code in a cloud-based environment

Requirements

5+ years of hands-on experience as a data scientist or ML engineer, with demonstrated ownership of projects in production
Minimum of 2 years’ experience with LLMs
Proven experience applying LLMs and generative AI to real-world business problems
Strong Linux skills – comfortable navigating and scripting in a CLI-first environment
Expert Python skills – you write clean, maintainable, tested, and efficient code
Deep experience working with OpenAI models and APIs, including prompt engineering, fine-tuning, and evaluation
Fluent in SQL with ability to work efficiently with PostgreSQL datasets
Experience using Docker for local and production development
Proficiency with LangChain for building multi-step LLM workflows
Effective remote communication and collaboration, especially in a distributed team with meetings on US Eastern Time
Strong business acumen—able to connect technical solutions to operational and client value
Startup-style drive, agility, and hands-on mindset—you ship, not just ideate
Creativity and experimentation—willing to try, fail, and improve
Exceptional communication skills—can explain complex ideas simply and persuasively
Deep proficiency in Python, NLP, and AI frameworks (e.g., Hugging Face, LangChain), vector databases, and prompt engineering
Minimum 3–5 years of experience in AI/ML roles, preferably in B2B or SaaS environments
Bachelor's or Master's in Computer Science, Data Science, AI, or related field
Experience with AWS cloud services (EC2, S3, etc.)
Familiarity with workflow orchestration tools

Nice to have

Knowledge of media analytics, journalism, or influence measurement
Prior work with data labeling, theme detection, or benchmarking model output
