LLM Platform Engineer

United States, San Francisco · 245000.00 - 345000.00 USD / Year · Job Posted February 18, 2026

Job Description

Join the Future of Commerce with Whatnot. Whatnot is the largest live shopping platform in North America and Europe to buy, sell, and discover the things you love. We’re redefining e-commerce by blending community, shopping, and entertainment into a community just for you.

We’re looking for builders: intellectually curious, highly entrepreneurial engineers eager to shape the future of AI and ML at Whatnot. You’ll design and scale the core infrastructure that powers large language model applications across the company, working side by side with machine learning scientists to bring cutting-edge models into production and unlock entirely new product experiences. This means building systems that make AI dependable and fast at scale, from retrieval systems that ground LLM responses in Whatnot’s business context to scalable LLM evaluation frameworks and human-in-the-loop feedback mechanisms.

Job Responsibility

  • Own the infrastructure powering LLMs across critical business surfaces, supporting growth, recommendations, trust and safety, fraud, seller tooling, and more
  • Create robust and scalable LLM evaluation frameworks to measure model performance, guide iteration, and prevent regression via CI/CD
  • Deploy RAG systems and MCP servers to more effectively ground LLM responses in Whatnot’s business context while enforcing rigorous PII controls
  • Design efficient human-in-the-loop feedback pipelines that can be used to inform scalable LLM evaluation
  • Bridge the gap between research and production, helping to transform experimental ideas into scalable solutions
  • Stretch beyond your comfort zone to take on new technical challenges as we scale AI across Whatnot’s ecosystem

Requirements

  • 4+ years of professional experience developing machine learning systems and algorithms
  • Bachelor’s degree in Computer Science, Statistics, Applied Mathematics or a related technical field, or equivalent work experience
  • 3+ years of software engineering experience building and maintaining production systems for consumer-scale loads
  • 1+ years of professional experience developing software in Python
  • Ability to work autonomously and drive initiatives across multiple product areas and communicate findings with leadership and product teams
  • Experience with operational, search, and key-value databases such as PostgreSQL, DynamoDB, Elasticsearch, Redis
  • Firm grasp of visualization tools for monitoring and logging, e.g., Datadog, Grafana
  • Familiarity with cloud computing platforms and managed services such as AWS SageMaker, Lambda, Kinesis, S3, EC2, EKS/ECS, Apache Kafka, Flink
  • Professionalism around collaborating in a remote working environment and producing well-tested, reproducible work
  • Exceptional documentation and communication skills

What we offer

  • Flexible Time off Policy and Company-wide Holidays (including a spring and winter break)
  • Health Insurance options including Medical, Dental, Vision
  • Work From Home Support
      • Home office setup allowance
      • Monthly allowance for cell phone and internet
  • Care benefits
      • Monthly allowance for wellness
      • Annual allowance towards childcare
      • Lifetime benefit for family planning, such as adoption or fertility expenses
  • Retirement
      • 401(k) offering for Traditional and Roth accounts in the US (employer match up to 4% of base salary) and pension plans internationally
  • Monthly allowance to dogfood the app
  • Parental Leave
      • 16 weeks of paid parental leave + one month gradual return to work

Similar Jobs for LLM Platform Engineer

8 matching positions

Azure Data Science Platform Engineer (AI)

WFH flexibility! Up to 4 days/week! Global Environment! Competitive salary! We a...
Location: Japan, Tokyo 23 wards
Salary: 7000000.00 - 12000000.00 JPY / Year
Company: Randstad (https://www.randstad.com)
Expiration Date: February 29, 2028
Requirements
  • Bilingual proficiency in Japanese and English is preferred (English is a MUST)
  • Day-to-day communication will primarily be in English, with occasional interaction with Japanese-speaking stakeholders
  • 6+ years of experience in data science, machine learning, advanced analytics, or applied AI, with demonstrated business results
  • Strong experience taking solutions from development into production and supporting them in live environments
  • Strong Python programming skills and solid engineering discipline
  • Experience with GenAI / LLM use cases or solution delivery
  • Hands-on experience with Azure for deploying, supporting, or operating production workloads
  • Strong experience with Terraform and Infrastructure as Code in enterprise cloud environments
  • Experience with CI/CD, GitHub Actions, deployment automation, and DevOps practices
  • Experience with SRE, MLOps, or LLMOps, particularly in monitoring, incident handling, reliability, and operational support
Job Responsibility
  • We are looking for a candidate who combines strong data science delivery capability with practical production and operational ownership
  • This is not a pure research role and not a pure platform role. It is intended for someone who can build business-facing AI/ML solutions and also help ensure those solutions are deployable, stable, and maintainable
  • The ideal candidate is proactive, technically hands-on, comfortable working independently, and effective in cross-functional enterprise environments.
What we offer
  • WFH flexibility! Up to 4 days/week!
  • Global Environment!
  • Competitive salary!
  • Health insurance
  • Employees' pension insurance
  • Employment insurance
  • Saturday
  • Sunday
  • Public holidays
Employment type: Full-time

Data Platform Engineer (AI-Enabled)

Global Controllers Technology is looking for a hands-on data engineer who is pas...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 5+ years overall experience in Big Data or Enterprise large-scale application development using scalable tools like Databricks, Scala, Java, and the Python ecosystem.
  • Deep understanding of data modeling, data warehousing concepts, methodologies, and best practices.
  • Strong experience with data processing (e.g., Spark, Starburst, Snowflake, Redshift), storage (e.g., Hadoop, MongoDB, Oracle), and ETL (e.g., Airflow, Ab Initio, Talend) solutions.
  • Experience with data engineering techniques such as building data lakes and data warehouses, data mesh architectures, data pipelines, and ETL vs. ELT patterns.
  • Strong expertise in Starburst/Trino with deep understanding of data federation and data virtualization architectures.
  • Experience with real-time data processing and data virtualization technologies such as Starburst, Snowflake, and Flink (preferred).
  • Experience with designing and developing distributed systems, handling structured and unstructured data to store, analyze, and report.
  • Experience with developing and maintaining large and complex database systems using relational databases like Oracle, and other big data technologies like Hadoop, Spark, etc.
  • Proficient in PL/SQL and Unix Shell scripts. Strong SQL, PL/SQL development, and database tuning experience.
  • Experience in designing Online Transaction Processing (OLTP), Operational Data Store (ODS), and Data Warehouse applications.
Job Responsibility
  • Drive the creation of a scalable and high-quality data intensive platform leveraging some of the latest methodologies such as Data Virtualization and Data Domains
Employment type: Full-time

Platform Engineer with AI

We are looking for a hands-on AI Engineer to help improve and scale an internal ...
Location: Romania, Iasi
Salary: Not provided
Company: NTT DATA (nttdata.com)
Expiration Date: Until further notice
Requirements
  • BSc/MSc in Computer Science or related field
  • 6+ years as a Platform Engineer
  • Strong hands-on experience with Python for scripting, automation, evaluation tooling, or integrations
  • Practical experience building with LLMs, prompts, AI agents, or agentic workflows in real engineering or production contexts
  • Daily use of AI coding assistants or coding agents such as Copilot, Cursor, Kiro, Windsurf, Claude Code, or similar
  • Experience designing, testing, and iterating prompts for reliable task execution
  • Experience building evaluation or testing approaches for non-deterministic AI outputs
  • Strong experience with GitHub Actions or equivalent CI/CD tooling, including reusable workflows and pipeline-as-code patterns
  • Experience working with YAML, Markdown, JSON, and configuration-driven systems
  • Good understanding of authentication, authorisation, secrets handling, and least-privilege access patterns
Job Responsibility
  • Build and improve AI workflows across repositories
  • Design agent workflows for docs, testing, code quality, reviews, and security
  • Create effective prompts and multi-step interactions
  • Improve reliability and usability of agents
  • Use AI coding tools (e.g., Copilot, Cursor) to accelerate delivery
  • Design evaluation pipelines for prompts and agents
  • Build datasets with edge and adversarial cases
  • Add automated regression and behavior checks
  • Use code-based and LLM-based evaluators
  • Define quality gates before production
What we offer
  • Smooth integration and a supportive mentor
  • Pick your working style: choose from Remote, Hybrid or Office work opportunities
  • Early bird or night owl? Our projects have different working hours to suit your needs
  • Sharpen your tech skills with our sponsored certifications, trainings and top e-learning platforms
  • Enjoy our Private Health Insurance – it’s custom-made for you
  • Attend individual coaching sessions or go one step further by joining our accredited Coaching School
  • Make the most of our epic parties or themed events – they’re lovingly designed for our people and their families

VP Platform Engineer

Our client’s technology team is responsible for creating and continuously improv...
Location: United States, New York
Salary: 175000.00 - 215000.00 USD / Year
Company: Renner Brown (rennerbrown.com)
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, or related field
  • 8+ years in infrastructure engineering, cloud platform engineering, or data engineering
  • Demonstrated experience building shared platforms or developer services in an enterprise environment
  • Azure expertise: Azure AI Foundry, Azure Data Factory, Azure Databricks, AKS, Azure API Management, Azure Key Vault, Azure Entra ID
  • Strong Python skills: backend services, REST APIs (FastAPI or Flask), and automation scripting
  • PowerShell for infrastructure tasks
  • Infrastructure-as-Code: Terraform and/or Bicep
  • Container orchestration with Docker and Kubernetes
  • Experience integrating LLM APIs (Anthropic Claude, Azure OpenAI) in production including token cost management and observability
  • RAG pipeline experience: vector search (Azure AI Search or pgvector), document processing, and retrieval patterns
Job Responsibility
  • Design, build, and operate the firm’s AI platform, enabling developers to build and deploy Python-based AI applications
  • Implement and manage Azure AI Foundry environments: model deployments, AI hubs, project workspaces, and access controls
  • Integrate and operationalize third-party AI APIs (Anthropic Claude API, Azure OpenAI) with secure access patterns, API gateway controls, rate limiting, and cost monitoring
  • Build internal developer tooling and SDK scaffolding to accelerate AI application development across the firm
  • Build and maintain data pipelines using Azure Data Factory and Azure Databricks to serve AI application data needs
  • Implement vector search and document retrieval infrastructure (Azure AI Search) to support RAG-based applications
  • Manage structured and unstructured data stores including Azure Data Lake, Azure SQL, and Cosmos DB
  • Provision and maintain secure, scalable infrastructure on Azure (primary) and AWS using Infrastructure-as-Code (Terraform or Bicep)
  • Build and maintain CI/CD pipelines for AI application deployment via Azure DevOps or GitHub Actions
  • Manage containerized workloads using Docker and Kubernetes (AKS) for AI application hosting and API services
Employment type: Full-time

AI Platform Engineer

Location: Ireland, Dublin
Salary: Not provided
Company: Supermetrics (supermetrics.com)
Expiration Date: Until further notice
Requirements
  • 3+ years of relevant experience building and shipping production automation systems
  • REST API and webhook experience
  • Hands-on experience with LLMs and prompt engineering
  • SQL and data warehouse experience, BigQuery preferred
  • Python scripting for automation and data tasks
  • Experience building and maintaining automation pipelines in tools like n8n, Zapier or Make in production environments
  • The ability to think in systems
  • Comfort working in a fast-moving environment where tooling and patterns are still being established
Job Responsibility
  • Build and maintain the shared infrastructure that all AI agents across GTM, CS and Support run on
  • Own and maintain the Agent Registry: the master catalogue of every agent, what it does, who owns it and when it was last reviewed
  • Ensure all agents connect to tools like Salesforce, Gainsight and Zendesk in a consistent and reliable way
  • Set up and maintain observability, dashboards and logs so the team can see what agents are doing and catch issues fast
  • Maintain data quality and write authority on the Golden Record in BigQuery
  • Work with the GTM and CS AI Leads to onboard new agents onto the shared platform correctly
  • Build and maintain the event routing system that lets agents react to triggers
  • Create and maintain shared functions used by multiple agents
  • Manage API changes from vendors and update shared functions accordingly
  • Build and maintain the observability layer so every agent run is logged and queryable

AI Platform Engineer

We are looking for an AI Platform Engineer to own the critical platform layer th...
Location: Ireland, Dublin
Salary: Not provided
Company: Supermetrics (supermetrics.com)
Expiration Date: Until further notice
Requirements
  • 3+ years of relevant experience building and shipping production automation systems
  • REST API and webhook experience
  • Hands-on experience with LLMs and prompt engineering
  • SQL and data warehouse experience, BigQuery preferred
  • Python scripting for automation and data tasks
  • Experience building and maintaining automation pipelines in tools like n8n, Zapier or Make in production environments
  • Ability to think in systems: you design for maintainability, not just functionality
  • Comfort working in a fast-moving environment where tooling and patterns are still being established
Job Responsibility
  • Build and maintain the shared infrastructure that all AI agents across GTM, CS and Support run on
  • Own and maintain the Agent Registry: the master catalogue of every agent, what it does, who owns it and when it was last reviewed
  • Ensure all agents connect to tools like Salesforce, Gainsight and Zendesk in a consistent and reliable way
  • Set up and maintain observability, dashboards and logs so the team can see what agents are doing and catch issues fast
  • Maintain data quality and write authority on the Golden Record in BigQuery
  • Work with the GTM and CS AI Leads to onboard new agents onto the shared platform correctly
  • Build and maintain the event routing system that lets agents react to triggers
  • Create and maintain shared functions used by multiple agents so updates are made once and fixed once
  • Manage API changes from vendors and update shared functions accordingly
  • Build and maintain the observability layer so every agent run is logged and queryable
Employment type: Full-time

Principal Data GenAI Platform Engineer - Senior Vice President

Location: India, Chennai
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 12+ years of relevant experience in enterprise application development, data engineering, or AI platform engineering, with a strong track record of leadership in regulated environments
  • 8+ years of experience leading multi-team Agile organizations (20+ engineers), including managing distributed and hybrid AI-assisted teams
  • Advanced expertise in Python, PySpark, and Databricks ecosystem for large-scale data processing and ELT/ETL pipelines
  • Proven experience architecting and implementing enterprise AI/GenAI platforms, including agentic AI frameworks, LLM integrations, and prompt engineering
  • Hands-on experience with AI-assisted development tools such as Devin.AI and GitHub Copilot and integrating them into engineering workflows
  • Strong experience with microservices architecture, APIs, and cloud-native deployment (Kubernetes/OpenShift)
  • Strong experience with event-driven architectures and streaming platforms (Kafka)
  • Deep understanding of data architecture, data mesh, data federation, and regulatory data requirements
  • Exceptional leadership, communication, stakeholder management, and decision-making capabilities
  • Experience with cloud platforms (AWS, Azure, GCP, Databricks) and modern data ecosystems
Job Responsibility
  • Lead multiple agile scrum teams comprising ~15+ engineers, including hybrid teams of human engineers and AI-assisted development (Devin.AI, Copilot), ensuring delivery excellence and alignment with business priorities
  • Define and execute the enterprise strategy for Python engineering, AI agent platforms, and full-stack data applications, aligned with Retail and Wealth Risk objectives
  • Serve as the senior architect and technical authority for enterprise-scale AI agents, data engineering pipelines, and microservices-based applications, ensuring scalability, resilience, and security
  • Drive the adoption and operationalization of AI Product Development Lifecycle (AI PDLC), including model governance, evaluation, deployment, monitoring, and compliance with Model Risk Management (MRM)
  • Lead development of high-volume data pipelines and data federation layers using PySpark, Databricks, Kafka, and Data Mesh architecture to support regulatory reporting (CCAR, FDIC) and risk analytics
  • Architect and oversee GenAI agent ecosystems using LLMs (Google ADK, Gemini/Flash), implementing Human-in-the-Loop (HITL) frameworks to ensure explainability, auditability, and compliance
  • Drive AI-augmented software development lifecycle, integrating tools such as Devin.AI, GitHub Copilot, and MCP platforms through advanced prompt engineering and governance guardrails
  • Lead microservices and cloud-native architecture using FastAPI/Spring Boot, Kubernetes/OpenShift, and CI/CD pipelines, ensuring high availability and performance
  • Drive engineering efficiency and standardization by reusing and repurposing enterprise-level frameworks, platforms, and tools, reducing duplication and accelerating delivery across teams
  • Ensure all engineering solutions incorporate data governance and non-functional requirements, including Data Quality (DQ), data lineage, data tracing, and auditability, aligned with enterprise governance processes and regulatory expectations
Employment type: Full-time

Platform Engineer (AI/LLM Infrastructure)

We are currently seeking a Platform Engineer (AI/LLM Infrastructure) to join our...
Location: United States, Santa Clara
Salary: 130000.00 - 170000.00 USD / Year
Company: NTT DATA (nttdata.com)
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in Platform Engineering, SRE, or Infrastructure Engineering
  • 3+ years of experience delivering and leading infrastructure for AI/LLM-based production systems
  • 3+ years of experience with Terraform and GitOps (ArgoCD/Flux)
  • 3+ years of experience with Azure (Key Vault, Monitor, DevOps Pipelines)
  • 3+ years of experience with CI/CD and container registry management
Job Responsibility
  • Lead the design, implementation, and operation of scalable infrastructure platforms supporting AI/LLM-based solutions for enterprise clients
  • Act as a hands-on technical lead (player-coach), contributing to development while guiding a team of engineers
  • Own end-to-end infrastructure architecture below the application layer, including compute, container orchestration, CI/CD, observability, and security
  • Partner directly with clients and stakeholders to design, present, and deliver robust AI infrastructure solutions
  • Architect and manage production-grade Kubernetes environments (AKS/EKS), including cluster operations and RBAC
  • Design and operationalize RAG pipelines, including ingestion, chunking, embedding workflows, and vector database management
  • Lead GPU infrastructure provisioning and optimization (NVIDIA A100/H100 or similar)
  • Drive Infrastructure-as-Code adoption using Terraform and GitOps practices (ArgoCD/Flux)
  • Build and maintain CI/CD pipelines using GitHub Actions and Azure DevOps
  • Establish observability standards using Datadog, OpenTelemetry, and ELK/OpenSearch
Employment type: Full-time