The AI Ops Engineer role involves designing, deploying, and optimizing AI-powered applications on Azure. Candidates should have 4+ years of experience in software engineering, cloud, or platform engineering, with strong skills in operationalizing AI/ML applications. Responsibilities include managing deployment pipelines, monitoring performance, and ensuring system reliability. A degree in Computer Science or a related field, or equivalent practical experience, is required.
Job Responsibilities:
Design, deploy, operate, and optimize enterprise-grade AI-powered applications and intelligent agents on Azure that support business workflows and customer interactions at scale
Operationalize AI/ML models and LLM-powered applications by managing deployment pipelines, monitoring performance, ensuring reliability, and maintaining scalability in production environments
Work closely with engineering, product, and CX teams to ensure AI systems run efficiently in production
Use Azure OpenAI, Azure Machine Learning, Cognitive Services, Kubernetes, and DevOps pipelines to operationalize AI workloads: continuously monitor model performance, improve latency and accuracy, and ensure governance, security, and system stability
Deploy AI agents and AI-powered applications to production environments
Maintain CI/CD pipelines for AI models and applications
Monitor AI system performance, reliability, and usage metrics
Troubleshoot operational issues such as latency, hallucinations, and integration failures
Implement logging, observability, and evaluation frameworks for AI systems
Manage Azure infrastructure supporting AI workloads
Ensure security, compliance, and governance for AI deployments
Continuously improve system scalability, stability, and operational efficiency
Collaborate with AI engineers and product teams to operationalize new AI features
Requirements:
4+ years of hands-on software engineering, cloud, or platform engineering experience
Strong experience operationalizing AI/ML or GenAI applications in production environments
Proven expertise with the Microsoft Azure cloud platform, especially its AI/ML services
Experience with CI/CD pipelines, infrastructure automation, and cloud deployments
Strong troubleshooting, monitoring, and production reliability experience
Ability to independently manage AI deployments end-to-end
Degree in Computer Science, Engineering, Data Science, or equivalent practical experience
Experience deploying and managing AI/ML and LLM-based applications in production
Hands-on experience with Azure OpenAI, Azure Machine Learning, Azure AI Studio, and Cognitive Services
Knowledge of containerization and orchestration (Docker, Kubernetes, AKS)
Experience with CI/CD pipelines such as Azure DevOps or GitHub Actions
Familiarity with agentic AI frameworks such as LangChain, LlamaIndex, Semantic Kernel, AutoGen, or CrewAI from an operational perspective
Understanding of RAG architectures, vector databases, and AI observability tools
Strong Python scripting and automation experience
Experience monitoring AI models including logging, evaluation, performance metrics, and alerting
Knowledge of MLOps/LLMOps practices including model versioning, governance, and lifecycle management
Familiarity with Git, infrastructure-as-code, and standard DevOps workflows
Strong debugging, production support, and performance optimization skills