We are seeking an experienced AI Engineer – Intelligent Operations (Infrastructure) to design and implement AI-driven solutions that enhance infrastructure monitoring, automation, and operational efficiency. The ideal candidate will work at the intersection of AI/ML, cloud infrastructure, and DevOps to build intelligent operational systems.
Job Responsibilities:
Develop and deploy AI/ML models for infrastructure monitoring and predictive maintenance
Automate incident detection, root cause analysis, and remediation workflows
Integrate AI solutions with cloud and on-prem infrastructure platforms
Build data pipelines for infrastructure logs and telemetry analysis
Collaborate with DevOps, SRE, and Cloud teams
Optimize system performance, scalability, and reliability
Implement MLOps practices for model deployment and lifecycle management
Provide technical leadership and documentation
Requirements:
Strong experience in Python and AI/ML frameworks (TensorFlow, PyTorch, scikit-learn)
Experience working with infrastructure monitoring data (logs, metrics, traces)
Knowledge of cloud platforms (AWS, Azure, or GCP)
Experience with Docker and Kubernetes
Understanding of DevOps and CI/CD practices
Strong analytical and problem-solving skills
Nice to have:
Experience in AIOps or Intelligent Automation
Knowledge of monitoring tools (Splunk, Datadog, Prometheus, etc.)
Experience with MLOps tools (MLflow, SageMaker, Vertex AI)
Strong communication and stakeholder collaboration skills