AI Model Serving Specialist

Rackspace

Location:
United States

Contract Type:
Not provided

Salary:
82300.00 - 140580.00 USD / Year

Job Description:

Enable enterprise customers to operationalize AI workloads by deploying and optimizing model-serving platforms (e.g., NVIDIA Triton, vLLM, KServe) within Rackspace’s Private Cloud and Hybrid environments. This role bridges AI engineering and platform operations, ensuring secure, scalable, and cost-efficient inference services.

Job Responsibility:

  • Package and deploy ML/LLM models on Triton, vLLM, or KServe within Kubernetes clusters
  • Tune performance (batching, KV-cache, TensorRT optimizations) for latency and throughput SLAs
  • Work with VMware VCF9, NSX-T, and vSAN ESA to ensure GPU resource allocation and multi-tenancy
  • Implement RBAC, encryption, and compliance controls for sovereign/private cloud customers
  • Integrate models with Rackspace’s Unified Inference API and API Gateway for multi-tenant routing
  • Support RAG and agentic workflows by connecting to vector databases and context stores
  • Configure telemetry for GPU utilization, request tracing, and error monitoring
  • Collaborate with FinOps to enable usage metering and chargeback reporting
  • Assist solution architects in onboarding customers, creating reference patterns for BFSI, Healthcare, and other verticals
  • Provide troubleshooting and performance benchmarking guidance
  • Stay current with emerging model-serving frameworks and GPU acceleration techniques
  • Contribute to reusable Helm charts, operators, and automation scripts

Requirements:

  • Hands-on experience with NVIDIA Triton, vLLM, or similar serving stacks
  • Strong knowledge of Kubernetes, GPU scheduling, and CUDA/MIG
  • Familiarity with VMware VCF9, NSX-T networking, and vSAN storage classes
  • Proficiency in Python and containerization (Docker)
  • Understanding of observability stacks (Prometheus, Grafana) and FinOps principles
  • Exposure to RAG architectures, vector DBs, and secure multi-tenant environments
  • Excellent problem-solving and customer-facing communication skills

Nice to have:

  • NVIDIA Certified Professional (AI/ML)
  • Kubernetes Administrator (CKA)
  • VMware VCF Specialist
  • Rackspace AI Foundations (internal)

What we offer:

  • Incentive compensation opportunities in the form of annual bonus or incentives
  • Equity awards
  • Employee Stock Purchase Plan (ESPP)

Additional Information:

Job Posted:
January 04, 2026

Employment Type:
Full-time

Work Type:
Remote work

Similar Jobs for AI Model Serving Specialist

AI Conversation Design Specialist

This role is focused on creating a Conversational AI-driven self-service experie...
Location:
Munich; Berlin; Dublin
Salary:
Not provided
Personio SE & Co. KG
Expiration Date:
Until further notice
Requirements:
  • 4–6 years of experience delivering AI Agents or chatbots (voice, chat, email, and messaging) preferably in SaaS, technology, or customer experience
  • German fluency highly preferred
  • Hands-on experience with chatbots / AI copilots such as Intercom Fin, Decagon, MavenAGI, Ada, Forethought or equivalent
  • Familiarity with APIs, comfortable occasionally writing snippets of code to bring in data from other systems into Intercom Fin
  • Strong understanding of AI/ML concepts, natural language processing, and customer support technologies (e.g., chatbots, virtual assistants)
  • Experience working with data analytics tools and interpreting model performance metrics
  • Demonstrated ability to translate business needs and customer pain points into technical requirements for AI solutions
  • Experience collaborating with Product, Engineering, Data, Systems, Customer Support, and Professional Services teams
  • Knowledge of customer experience metrics (CSAT, NPS, etc.) and best practices in B2B SaaS
  • Strong project management skills, including Agile or similar methodologies
Job Responsibility:
  • Self-Service and Productivity Outcomes: Drive contact volume reduction and self-service by implementing solutions for top contact drivers, as well as driving Support Agent productivity and handling time through improvements to co-pilots
  • AI Conversation Automation: Develop AI automated workflows / agent operating procedures, run batch tests, configure personalized answers, and reverse engineer unresolved conversations
  • Conversation Design: Architect natural, useful interactions between customers and AI chatbots. Design the flow and logic of conversations for the chatbots, partnering with subject-matter experts. You tune and design bot conversations —flows, intents/tagging, prompts/responses, fallback logic— to cut hallucinations, misroutes, and unnecessary handoffs
  • AI Service Journey Design: Define and govern AI↔human and AI↔AI handoffs (confidence thresholds, triggers, routing rules), ensuring brand voice, privacy, and responsible-AI standards, building a deep knowledge of our customer journeys and user stories to anticipate and design for different scenarios
  • Data Integration: Bring data from third-party systems and our own product into Intercom Fin to improve AI conversation effectiveness
  • Performance Monitoring and Improvement: Track key metrics (e.g., self-serve resolutions, monthly active users, deflection rates, resolution time, customer experience score, NPS) to evaluate AI impact, identify gaps, and implement improvements
  • AI Solution Delivery: Lead the implementation of AI-powered tools (e.g., chatbots, copilots, virtual assistants) that address common customer pain points and streamline support and professional services processes, and partner with Engineering and Systems on integrations and fixes
  • Cross-Functional Collaboration: Work closely with Product, Engineering, Customer Experience, Data and Systems teams to ensure AI solutions are aligned with customer needs and business objectives, and that we integrate seamlessly with the Personio Assistant and internal support tooling
  • Continuous Quality Improvement: Oversee the ongoing quality fine-tuning of AI tools using real customer data and feedback to improve routing, accuracy, relevance, and customer satisfaction. You approach with a product mindset, anticipating edge cases and interaction effects, and regression test as new changes are introduced. You monitor bot/service health, triage incidents and misroutes, and coordinate rapid fixes and postmortems as needed
  • Change Management: Champion the adoption of AI tools within customer-facing teams, supporting the development of training materials and enablement sessions dedicated to developing AI proficiency within CX
What we offer:
  • Receive a competitive reward package – reevaluated each year – that includes salary, benefits, and pre-IPO equity
  • Enjoy 28 days of paid vacation, plus an additional day after 2 and 4 years
  • Make an impact on the environment and society with 1 (fully paid) Impact Day
  • Receive generous family leave, child support, mental health support, and sabbatical opportunities
  • We enjoy gathering for meals, cultural initiatives, and events like local Summer Sessions and year-end celebrations. There are also healthy snacks, drinks, and a weekly catered lunch
  • 20 Flex Days per year to work remotely from other locations
Employment Type:
Full-time

Senior AI Presales Consultant

We are seeking a high-impact, strategic AI Presales Consultant to join our elite...
Location:
India, Mumbai
Salary:
Not provided
Eviden
Expiration Date:
Until further notice
Requirements:
  • 7+ years in a customer-facing technical role (e.g., Presales, Solutions Architecture, AI Specialist, or Technical Consulting), with a proven track record of designing large-scale AI, ML, or HPC solutions
  • Deep, hands-on understanding of LLM architectures. Must be able to architect, explain, and build PoCs for RAG pipelines, including vector databases (e.g., Milvus, Pinecone, Chroma), embedding models, and data ingestion strategies
  • Direct experience in sizing AI infrastructure. Must be able to perform "napkin math" and detailed calculations for GPU, CPU, memory, and network requirements
  • Must be able to fluently discuss performance metrics (tokens/second, latency, throughput, TFLOPS) and their relationship to hardware choice (e.g., NVIDIA H100 vs. A100, memory bandwidth, interconnects like NVLink/InfiniBand)
  • Expertise in the AI software stack. Strong understanding of MLOps principles (Kubeflow, MLflow), Kubernetes (K8s) for AI workloads, and model serving platforms (NVIDIA Triton, KServe, or similar)
  • Strong, current knowledge of the AI model landscape (e.g., Llama family, Mistral, GPT-family, foundation models). Ability to discuss fine-tuning techniques, quantization, and pruning
  • Exceptional communication, whiteboarding, and presentation skills. Ability to translate executive-level business needs into detailed technical architecture and build a compelling C-level value proposition
  • Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related engineering field
Job Responsibility:
  • Strategic Client Advisory: Lead executive-level "Art of the Possible" workshops and technical discovery sessions to understand a client's business goals, data readiness, and AI maturity
  • Full-Stack Solution Architecture: Design holistic, end-to-end AI solutions that synergize our supercomputing hardware, AI software platform, and MLOps capabilities to meet specific client needs
  • Generative AI & LLM Expertise: Act as the subject matter expert on Generative AI. Architect and evangelize scalable data ingestion and preparation pipelines, specializing in Retrieval-Augmented Generation (RAG) frameworks
  • Infrastructure Sizing & Performance Modelling: Analyse customer workloads (data volume, model complexity, training frequency, inference throughput) to accurately size the required platform infrastructure, including Kubernetes clusters, data storage, and software licenses. This includes calculating compute, storage, and network requirements based on key performance metrics like model parameters, token performance (tokens/sec), desired latency, and concurrent user load
  • Model & Software Consultation: Advise clients on AI model selection, comparing the trade-offs of open-source vs. proprietary LLMs, fine-tuning vs. foundation models, and model quantization
  • Position and demonstrate our proprietary AI software platform, MLOps tools, and libraries, integrating them into the client's ecosystem
  • Inference Optimization: Design and architect robust, low-latency, and high-throughput inference solutions for complex AI models, including large-scale LLM serving
  • User Experience (UX) Advocacy: Collaborate with client teams to define the end-user experience, ensuring the solution delivers tangible business value and a seamless interface for data scientists, analysts, and application users
  • Sales Cycle Enablement: Own the technical narrative throughout the sales cycle. Build and deliver compelling presentations, custom demonstrations, and Proofs of Concept (PoCs). Lead the technical response to complex RFIs/RFPs
Employment Type:
Full-time

People Systems Engineer, Airtable Specialist

The Airtable People team is seeking a People Systems Engineer (Airtable Speciali...
Location:
United States, San Francisco
Salary:
179000.00 - 232300.00 USD / Year
Airtable
Expiration Date:
Until further notice
Requirements:
  • 5+ years in systems engineering, internal tools development, or technical operations
  • Experience leveraging AI to enhance systems and workflows
  • Proven experience building advanced Airtable solutions using interfaces, automations, and scripting
  • Skilled in scripting (JavaScript or Python) and integrating SaaS tools via APIs
  • Pragmatic problem solver who evaluates the right system or tool for the job
  • Familiarity with HR or People systems (e.g., Workday, Greenhouse, Slack, Google Workspace) preferred
  • Strong data modeling and systems design abilities
  • Excellent communication and documentation skills
  • A proactive, collaborative self-starter who thrives in fast-paced environments
Job Responsibility:
  • Build, enhance, and maintain core People Airtable apps that support all aspects of the employee lifecycle
  • Design and architect new Airtable applications with AI and automation at the core
  • Use Airtable Interfaces, Automations, and Scripting to create clean, efficient, and data-rich user experiences
  • Develop and maintain automations, integrations, and data flows between Airtable and tools like Greenhouse, Workday, Slack, and Google Workspace
  • Evaluate and recommend alternative systems or tools when they better serve People team goals
  • Use scripting (JavaScript or Python) and APIs to connect systems and optimize recurring processes
  • Continuously monitor system performance, usability, and data accuracy
  • Partner with IT and cross-functional teams to align on architecture, data governance, and long-term tooling roadmap
  • Document technical logic and train stakeholders to use and maintain Airtable applications effectively
What we offer:
  • Benefits
  • Restricted stock units
  • Incentive compensation
Employment Type:
Full-time

Systems Integration Specialist Advisor

The Systems Integration Specialist Advisor will lead the design, implementation,...
Location:
India, Noida
Salary:
Not provided
NTT DATA
Expiration Date:
Until further notice
Requirements:
  • 7-12+ years of experience in enterprise architecture, cloud engineering, or platform leadership
  • 5+ years designing and deploying AI/ML or GenAI solutions in production
  • Strong expertise in Python, AI frameworks (TensorFlow, PyTorch, LangChain, etc.)
  • Deep understanding of LLM architecture, RAG systems, and agentic frameworks
  • Hands-on experience with Kubernetes, Docker, CI/CD pipelines
  • Strong cloud architecture experience (AWS/Azure/GCP certifications preferred)
  • Experience implementing DevSecOps practices
  • Strong knowledge of enterprise security frameworks and cloud security controls
  • Experience designing high-availability distributed systems
Job Responsibility:
  • Design and architect enterprise AI/ML and GenAI platforms (LLMs, RAG, agentic AI, AIOps, automation)
  • Define scalable AI reference architectures aligned with business objectives
  • Lead end-to-end AI lifecycle management: model development, validation, deployment, monitoring, and governance
  • Establish standards for AI explainability, observability, and ethical AI practices
  • Architect AI solutions across AWS, Azure, or GCP environments
  • Design scalable data pipelines, model serving infrastructure, and distributed systems
  • Implement containerization (Docker) and orchestration (Kubernetes)
  • Build high-availability, resilient AI platforms with performance optimization
  • Establish CI/CD pipelines for AI/ML workloads
  • Implement Infrastructure as Code (Terraform, ARM, CloudFormation)

File Clerk - AI Trainer

Join our team as a File Clerk - AI trainer and play a pivotal role in shaping th...
Location:
India, Noida
Salary:
Not provided
AquSag Technologies
Expiration Date:
Until further notice
Requirements:
  • Proven experience as a file clerk, records specialist, or in a related role involving document organization and information management
  • Strong understanding of alphabetical, numerical, and custom filing systems
  • Exceptional written and verbal communication skills, with meticulous attention to detail
  • Demonstrated ability to structure, categorize, and streamline large datasets and records
  • Comfortable working in remote, cross-functional environments and collaborating via digital platforms
  • Motivated to help train and improve AI systems with your subject matter expertise
  • Proficient in using office software and file management tools
Job Responsibility:
  • Serve as a subject matter expert in file management and archival best practices for AI model development
  • Annotate, review, and structure data sets to train AI systems in categorizing and organizing documents based on alphabetical, numerical, or custom filing systems
  • Provide detailed feedback on AI model performance and suggest improvements for enhanced accuracy
  • Simulate real-world filing scenarios and guide AI models in handling edge cases and exceptions
  • Collaborate with AI engineers and data scientists to refine data labeling protocols and taxonomy
  • Document processes and maintain clear, comprehensive communication throughout project cycles
  • Stay current on trends in digital file management and contribute insights to advance AI capabilities
Employment Type:
Full-time

Senior AI Engineer

Elsewhen, a London-based consultancy, designs and builds technology solutions fo...
Location:
United Kingdom, London
Salary:
Not provided
Elsewhen
Expiration Date:
Until further notice
Requirements:
  • Professional AI engineering experience
  • Background in Software Engineering with Python
  • Solid understanding of the Python standard library and modern Python coding, testing, debugging and automation techniques
  • Hands-on experience building solutions using LLMs and Agentic architectures with ADK, LlamaIndex, or LangGraph
  • Working with vector databases for embedding and indexing
  • Strong experience with cloud platforms
  • Strong experience with API design and frameworks like FastAPI or Flask
  • Solid experience with relational databases and SQL
  • Interest in expanding your knowledge into GenAI and machine learning
  • Excellent communication skills and the ability to work well in a collaborative team environment
Job Responsibility:
  • Experiment with POCs to find solutions for real-world problems using Large Language Models
  • Collaborate on AI-driven projects, working alongside engineers, product managers and AI specialists while maintaining clear documentation
  • Build and deploy Agentic LLM-based solutions with LangGraph
  • Familiar with different multi-agent system patterns
  • Build and deploy LLM-based solutions using RAG
  • Familiar with different types of databases: relational, graph, etc.
  • Design and optimise APIs using Python and FastAPI to serve AI solutions
  • Familiar with GCP ecosystem and Cloud Run
  • Build and optimise data pipelines for vector search and knowledge retrieval using Vector databases and embedding models
What we offer:
  • Private Health Insurance: Comprehensive coverage for both physical and mental health
  • Flexible and Remote-First Work Environment: Choose how and where you work, with the option for weekly team meet-ups in central London
  • Generous Leave Policy: 27 days of holiday plus bank holidays
  • Family-friendly policies, including enhanced maternity, paternity and shared
  • Learning and Development: Individual annual budget of £2,000 for learning and development, with dedicated learning days
  • Feel Better Fund: £500 to help set up your remote office
  • Social Events: Monthly and quarterly team events, an annual team trip, and half-yearly social events
  • Gym Membership Contribution: Support for maintaining your physical health
  • Pension Contribution: Enhanced employer pension contribution of 6%
  • Bonus Opportunities: Potential to receive a discretionary (non-contractual) bonus based on business and personal achievements
Employment Type:
Full-time

Cloud Solution Architect - Copilot

Leverage your architectural and engineering expertise to help customers unlock t...
Location:
India, Gurgaon
Salary:
Not provided
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • 5-15 years of experience
  • Deep understanding of Copilot architecture, service boundaries, and supported workloads
  • Scenario design across Teams, Outlook, Word, Excel, PowerPoint, OneDrive, SharePoint
  • Knowledge of Copilot readiness requirements (licensing, identity, data, tenant configuration)
  • Microsoft Teams (meetings, chat, channels, apps, extensibility concepts)
  • SharePoint Online & OneDrive (content lifecycle, permissions, information architecture)
  • Exchange Online (mailbox architecture, calendar, compliance considerations)
  • Viva (especially Viva Topics, Insights, Engage – awareness level)
  • Microsoft Entra ID (Azure AD) fundamentals
  • Authentication models, Conditional Access, role‑based access control (RBAC)
Job Responsibility:
  • Lead end‑to‑end solution architecture for Microsoft 365 Copilot scenarios across collaboration, content, meetings, and workflow intelligence
  • Translate business priorities into scalable Copilot architectures aligned with Microsoft 365 services (Teams, Outlook, SharePoint, OneDrive, Viva)
  • Drive technical design reviews, reference architectures, and deployment readiness for Copilot and related AI services
  • Partner with account teams to unblock complex architectural challenges and accelerate time‑to‑value
  • Ensure architectural decisions align with customer environments, identity models, and information architecture
  • Enable customers to reimagine collaboration and productivity using Copilot across everyday work scenarios
  • Guide adoption of Copilot‑powered experiences for knowledge workers, leaders, and frontline roles
  • Align Copilot usage to role‑based productivity patterns and modern work transformation goals
  • Support change management motions by mapping Copilot value to user journeys and work rhythms
  • Act as a productivity transformation advisor—not just a tool expert
Employment Type:
Full-time

Vice President, AI Enablement

The Vice President, AI Enablement leads Parexel’s enterprise‑wide Artificial Int...
Location:
United States, Remote
Salary:
Not provided
Parexel
Expiration Date:
Until further notice
Requirements:
  • 15+ years in data science, AI/ML, or technology leadership roles
  • At least 7 years in executive or senior leadership
  • Deep knowledge of clinical research processes, clinical data lifecycle, and pharmaceutical/life sciences regulatory expectations
  • Experience implementing enterprise AI platforms (GenAI, MLOps, data pipelines, automation)
  • Proven success operationalizing large-scale AI programs in GxP or other highly regulated industries
  • Familiarity with emerging AI regulations (EU AI Act, FDA, EMA expectations)
  • Experience with cloud-based AI ecosystems (Azure preferred)
  • Master’s degree or PhD in Computer Science, Data Science, Engineering, Bioinformatics, or related field preferred
  • Advanced business qualifications (MBA or equivalent) desirable
  • Excellent written and verbal communication skills
Job Responsibility:
  • Define and evolve Parexel’s enterprise AI strategy, roadmap, and investment portfolio
  • Identify priority use cases aligned to business value, feasibility, and regulatory constraints
  • Conduct AI maturity assessments and establish long‑term capability-building plans
  • Co-chair AI governance
  • Define standards for model lifecycle management, validation, auditability, and transparency
  • Ensure compliance with GxP, global regulatory guidance, and responsible AI principles
  • Partner with Quality, Legal, and Compliance to operationalize risk management for AI systems
  • Oversee enterprise AI platforms, tooling, and data pipelines
  • Drive scalable architecture for GenAI, ML operations, retrieval‑augmented generation, and secure model hosting
  • Evaluate and integrate vendor platforms while ensuring data privacy and clinical-grade security
What we offer:
  • Health, Vision & Dental Insurance
  • Tuition Reimbursement
  • Vacation/Holiday/Sick Time
  • Flexible Spending & Health Savings Accounts
  • Work/Life Balance
  • 401(k) with Company match
  • Pet Insurance
  • Opportunity to work on innovative projects at the forefront of the industry
  • Collaborative and inclusive work environment that values your expertise
  • Professional advancement and development opportunities
Employment Type:
Full-time