We’re looking for a Senior GenAI Developer to design, build, and deploy intelligent LLM-powered systems at scale, from single-agent chatbots and copilots to complex multi-agent applications. We are particularly interested in candidates with hands-on experience taking GenAI applications from concept to production, especially in high-volume B2C environments. This role prioritizes individuals who understand the nuances of deploying, maintaining, and optimizing GenAI solutions for real-world users, beyond the scope of Proof-of-Concept (PoC) development. You will work across the full stack, integrating LLMs, microservices, vector databases, backend APIs, and modern cloud infrastructure.
Requirements:
Develop scalable, asynchronous microservices using Python (FastAPI) for chatbots, copilots, and agentic workflows
Design event-driven architectures to support high concurrency, rate limiting, and real-time responsiveness
Implement secure, versioned REST/gRPC APIs
Use Pydantic, dependency injection, and modular coding practices for maintainability
Work with databases using ORMs such as SQLAlchemy
Ensure observability using logging, metrics, tracing, and health checks
Create responsive React.js frontends integrated via REST APIs or WebSockets
Deploy applications on Cloud Run and GKE using Docker, Artifact Registry, and CI/CD pipelines
Design and build LLM-powered chatbots, voicebots, copilots and other applications using LangChain or custom orchestration frameworks
Implement user session management and context/state tracking for personalized and continuous conversations
Build RAG pipelines with vector databases and knowledge graphs to ground responses in external knowledge and documents
Apply advanced prompt engineering (ReAct, Chain-of-Thought with tool calling) for precise and goal-oriented outputs
Ensure performance in low-latency, streaming environments using WebSockets, gRPC, and SIP media gateways
Fine-tune open-source LLMs (LLaMA variants) using techniques such as SFT and LoRA for cost-effective domain adaptation
Optimize high-speed inference pipelines leveraging multi-GPU clusters (up to 8x H100s) to reduce latency and improve throughput
Create multi-agent systems and implement orchestration patterns such as supervisor-agent, hierarchical, and networked agents using frameworks such as ADK, Pydantic AI, and LangGraph
Use LangGraph for stateful workflows with memory, conditional branching, retries, and async execution
Enable persistent context and long-term memory
Monitor behavior, drift, and performance using observability tools
Develop agents with ADK and the A2A protocol, and configure custom and remote MCP servers
Nice to have:
DevOps: Docker, GitHub Actions, Jenkins, GKE, Cloud Run