Enable enterprise customers to operationalize AI workloads by deploying and optimizing model-serving platforms (e.g., NVIDIA Triton, vLLM, KServe) within Rackspace’s Private Cloud and Hybrid environments. This role bridges AI engineering and platform operations, ensuring secure, scalable, and cost-efficient inference services.
Job Responsibilities:
Package and deploy ML/LLM models on Triton, vLLM, or KServe within Kubernetes clusters (a request against a deployed vLLM endpoint is sketched after this list)
Tune performance (batching, KV-cache, TensorRT optimizations) to meet latency and throughput SLAs
Work with VMware VCF 9, NSX-T, and vSAN ESA to manage GPU resource allocation and multi-tenancy
Implement RBAC, encryption, and compliance controls for sovereign/private cloud customers
Integrate models with Rackspace’s Unified Inference API and API Gateway for multi-tenant routing
Support RAG and agentic workflows by connecting to vector databases and context stores
Configure telemetry for GPU utilization, request tracing, and error monitoring
Collaborate with FinOps to enable usage metering and chargeback reporting
Assist solution architects with customer onboarding and create reference patterns for BFSI, Healthcare, and other verticals
Provide troubleshooting and performance benchmarking guidance
Stay current with emerging model-serving frameworks and GPU acceleration techniques
Contribute to reusable Helm charts, operators, and automation scripts
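As a rough illustration of what consuming a model looks like once it has been deployed per the first responsibility above, the sketch below sends a single request to a vLLM instance exposing its OpenAI-compatible REST API. The endpoint URL and model name are placeholders for illustration, not Rackspace-specific values.

```python
# Minimal sketch: calling a vLLM server that exposes the OpenAI-compatible
# chat completions API. Endpoint URL and model id are hypothetical placeholders.
import requests

VLLM_URL = "http://inference.example.internal:8000/v1/chat/completions"  # placeholder endpoint
MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # example model id; replace with the deployed model

payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Summarize our GPU utilization policy."}],
    "max_tokens": 128,
    "temperature": 0.2,
}

resp = requests.post(VLLM_URL, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The same request pattern applies when the serving backend sits behind an API gateway, with the gateway handling tenant routing and authentication before forwarding to the model server.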
Requirements:
Hands-on experience with NVIDIA Triton, vLLM, or similar serving stacks
Strong knowledge of Kubernetes, GPU scheduling, and CUDA/MIG
Familiarity with VMware VCF 9, NSX-T networking, and vSAN storage classes
Proficiency in Python and containerization (Docker)
Understanding of observability stacks (Prometheus, Grafana) and FinOps principles (a small Prometheus query sketch follows this list)
Exposure to RAG architectures, vector DBs, and secure multi-tenant environments
Excellent problem-solving and customer-facing communication skills
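As a small example of the observability work referenced above, the sketch below pulls per-GPU utilization from Prometheus via its HTTP API. It assumes the NVIDIA DCGM exporter is being scraped and exposes the DCGM_FI_DEV_GPU_UTIL metric with a gpu label; the Prometheus address is a placeholder.

```python
# Minimal sketch: querying average GPU utilization from Prometheus,
# assuming DCGM exporter metrics are available. The URL is a placeholder.
import requests

PROM_URL = "http://prometheus.example.internal:9090/api/v1/query"  # placeholder address
query = "avg by (gpu) (DCGM_FI_DEV_GPU_UTIL)"  # per-GPU utilization, assuming DCGM exporter labels

resp = requests.get(PROM_URL, params={"query": query}, timeout=10)
resp.raise_for_status()
data = resp.json()

for series in data.get("data", {}).get("result", []):
    gpu = series["metric"].get("gpu", "unknown")
    value = series["value"][1]  # instant vector sample: [timestamp, value]
    print(f"GPU {gpu}: {value}% utilization")
```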
Nice to have:
NVIDIA Certified Professional (AI/ML)
Kubernetes Administrator (CKA)
VMware VCF Specialist
Rackspace AI Foundations (internal)
What we offer:
Incentive compensation opportunities in the form of an annual bonus or other incentives