The AI/ML Engineer – Agentic is a senior individual contributor responsible for designing, building, and operating a production-grade agentic orchestration platform, including multi-agent workflows and MCP server–based tool infrastructure. The role focuses on enterprise-scale LLM integration, shared retrieval and memory services, and high-performance backend systems that power agent execution. This position owns reliability, observability, and cloud-native operations for non-deterministic agentic systems in production.
Job Responsibilities
Design, build, and own a production-grade agentic orchestration platform, implementing scalable multi-agent workflows using frameworks such as LangGraph or equivalent
Architect, develop, and operate the MCP server infrastructure, including inter-agent communication, tool/server registries, domain isolation, versioning, and lifecycle management
Integrate and operate LLM services at enterprise scale, supporting streaming, structured outputs, tool/function calling, and robust error handling across agent workflows
Build and maintain retrieval and memory services for agentic systems, including RAG pipelines, OpenSearch-backed vector stores, hybrid search, and relevance optimization
Develop and operate high-performance backend services (FastAPI, gRPC, async systems, messaging) that power orchestration, tool execution, and agent runtime behavior
Own observability and reliability for non-deterministic systems, delivering end-to-end tracing, monitoring, and cost/performance visibility for agent executions
Manage cloud-native infrastructure and deployment, including Kubernetes workloads, containerized services, CI/CD pipelines, and resource optimization (CPU/memory, autoscaling)
Requirements
Bachelor's degree in computer science, engineering, information systems, or closely related quantitative discipline
Typically 4-7 years of experience
Production experience with agentic frameworks: LangGraph (preferred), Claude Agent SDK, or equivalent (not just prototypes)
Deep understanding of multi-agent architectures: supervisor/worker patterns, hierarchical agent graphs, ReAct loops, ReWOO
Hands-on with inter-agent communication protocols: MCP (Model Context Protocol), A2A, tool registry / server registry
LLM API integration at scale: structured outputs, streaming, function/tool calling, error handling
RAG pipeline design and optimization: chunking strategies, re-ranking, hybrid search
Vector store experience: OpenSearch or equivalent
Applied ML intuition: fine-tuning concepts, prompt engineering, evaluations, QLoRA, PEFT
Backend development in Python: FastAPI, gRPC, Kafka, Redis, message queues, async system design, and API design (GraphQL and/or REST) at enterprise scale
Observability and monitoring for non-deterministic systems: Langfuse, Prometheus, or equivalent
Kubernetes: deploying, scaling, and managing workloads (Deployments, Services, ConfigMaps, Secrets)
Container image management: building, tagging, versioning, and pushing images via Docker
CI/CD pipelines for automated build and deploy (GitHub Actions, Jenkins, ArgoCD, or similar)
Resource management: CPU/memory limits, autoscaling (HPA/VPA), health probes