AI Application Operations & Maintenance Engineer (Azure)

Albania, Tirana · Job Posted May 14, 2026

Job Description

The organization is seeking a professional specialized in Application Maintenance & Operations for Generative AI–based applications. The candidate will operate on complex, mission-critical, highly integrated enterprise solutions, built on Microsoft Azure infrastructure, with extensive use of AI services, microservices architectures, and containerized platforms. The objective of the role is to ensure operational continuity, application stability, security, and controlled evolution of AI solutions in production. This includes supporting runtime operations, monitoring, troubleshooting, tuning, and continuous improvement of distributed applications across multiple environments (e.g., Dev, Test, Prod). The role sits at the crossroads between the application layer and the infrastructure layer, with a strong focus on observability, system integration, data management, responsible use of AI resources, and strict adherence to security principles and access segregation.

Job Responsibility

  • Manage corrective and adaptive maintenance activities for AI applications in production
  • Analyze and resolve application incidents and anomalies across front-end, back-end, and service layers
  • Support application release activities and configuration management across different environments (Dev/Test/Prod)
  • Collaborate with development teams to analyze application issues and improve overall software quality
  • Provide operational support for solutions based on Azure Kubernetes Service (AKS), including management of containerized workloads
  • Continuously monitor application and infrastructure services using Azure Monitor, Log Analytics, and Application Insights
  • Analyze application logs, metrics, and alerts to ensure appropriate levels of reliability and performance
  • Perform advanced troubleshooting on data ingestion pipelines, AI services, search services, and databases
  • Provide operational support for data persistence services, including Azure SQL Database for structured data, Azure Cosmos DB for unstructured data and conversational history, and Azure Storage Accounts (Blob Storage) for document repositories
  • Verify and support correct content indexing and retrieval through Azure AI Search, including vector search and similarity search
  • Operate and monitor data preprocessing, transformation, and enrichment workflows (Transformation Layer)
  • Operate and manage application Managed Identities, ensuring adherence to the least privilege principle
  • Support secure handling of secrets and sensitive configurations using Azure Key Vault
  • Verify correct usage of AI services (e.g., Azure OpenAI, LLMs, embeddings, cognitive services) according to architectural governance and policies
  • Collaborate with security teams to ensure compliance with network and security requirements (no public exposure, access via internal network/VPN)
  • Contribute to the evolution of AI solutions with a focus on scalability, reliability, and cost optimization
  • Propose improvements to logging, monitoring, and application feedback mechanisms
  • Support the go-live and stabilization of new Generative AI use cases, aligned with reference architectures

Requirements

  • Experience in Application Maintenance and Operations for enterprise applications
  • Solid knowledge of Python in an application context focused on AI functionalities
  • Operational knowledge of Microsoft Azure and its main PaaS services
  • Experience with Azure Kubernetes Service (AKS) and containerized workloads
  • Strong troubleshooting skills based on logs, metrics, and alerts
  • Knowledge of monitoring, logging, and observability principles
  • Familiarity with microservices architectures and multi-layer environments
  • Understanding of IAM concepts, Managed Identities, and secret management
  • Strong analytical and problem-solving mindset
  • Ability to work in cross-functional teams (development, architecture, security)
  • Structured, quality-driven approach to service delivery
  • High level of autonomy and responsibility in managing production environments
  • Strong technical communication and documentation skills

Nice to have

  • Experience operating AI / Generative AI solutions in production
  • Knowledge of Azure OpenAI, embedding services, and vector search
  • Experience with Azure AI Search, Cosmos DB, and Document Intelligence
  • Familiarity with ITIL operational models (Incident, Problem, Change Management)
  • Experience in highly critical and security-sensitive enterprise environments
  • One or more Microsoft AI certifications (e.g., Microsoft Certified: Azure AI Engineer Associate – AI-102)

Similar Jobs for AI Application Operations & Maintenance Engineer (Azure)

8 matching positions

Generative AI Application Operations Engineer

The organization is looking for a professional profile specialized in Applicatio...
Location: Italy, Milano
Salary: Not provided
Company: BIP
Expiration Date: Until further notice
Requirements
  • Experience in Application Maintenance and Operations on enterprise applications
  • Good knowledge of Python in an application context focused on AI functionalities
  • Operational knowledge of Microsoft Azure and its main PaaS services
  • Experience with Azure Kubernetes Service (AKS) and containerized workloads
  • Strong analysis and troubleshooting skills based on logs, metrics, and alerts
  • Knowledge of monitoring, logging, and observability principles
  • Familiarity with microservices architectures and multi-layer environments
  • Knowledge of IAM concepts, Managed Identity, and secrets management
Job Responsibility
  • Management of corrective and evolutionary maintenance activities for AI applications in production
  • Analysis and resolution of application incidents and anomalies across front-end, back-end, and service-layer components
  • Support for application release activities and configuration management across different environments (Dev/Test/Prod)
  • Collaboration with development teams to analyze application issues and improve overall software quality
  • Operational ownership of solutions based on Azure Kubernetes Service (AKS), including management of containerized workloads
  • Continuous monitoring of application and infrastructure services using Azure Monitor, Log Analytics, and Application Insights
  • Analysis of application logs, metrics, and alerts to ensure adequate reliability and performance levels
  • Execution of advanced troubleshooting activities on data ingestion pipelines, AI services, search services, and databases
  • Operational support for the use of data persistence services, including Azure SQL DB, Azure Cosmos DB, Azure Storage Account
  • Verification and support of correct content indexing and querying through Azure AI Search
Employment type: Full-time

DevOps Engineer (Azure | Terraform | Ansible | Agentic AI for Infra/Monitoring/FinOps)

Job Summary: We are looking for a DevOps Engineer with strong hands-on experienc...
Location: India, Bangalore South
Salary: Not provided
Company: Wissen
Expiration Date: Until further notice
Requirements
  • 3–5 years of experience in MLOps / ML Engineering / Cloud Engineering
  • Proficient in designing and deploying end-to-end ML pipelines
  • Terraform for Azure infrastructure automation
  • Python for ML, automation, and GenAI workflows
  • Azure Compute, Storage, Networking, and Identity
  • Running ML & GenAI workloads at scale on Azure
  • Supporting data pipelines for ML and LLM workloads
  • Experience with LangGraph for LLM workflow and agent orchestration
  • Hands-on exposure to Claude models, including skills/plugins integration
  • Understanding of prompt management, agent execution, and orchestration patterns
Job Responsibility
  • Build, deploy, and manage comprehensive MLOps and LLMOps pipelines on Azure
  • Design and oversee CI/CD pipelines for machine learning models and large language model workflows utilizing Harness or Azure DevOps
  • Streamline the promotion of models, prompts, and agent workflows between environments through automation
  • Establish approval gates, implement rollback mechanisms, and facilitate controlled release processes
  • Oversee the lifecycle of ML models and LLM-driven workflows, including their training, assessment, deployment, monitoring, and retraining
  • Administer Azure Machine Learning workspaces, computing resources, environments, model registries, and endpoints
  • Integrate LLM workflows and agent-centric architectures using LangGraph
  • Support the incorporation of Claude-based models, skills, and plugins into enterprise-level applications
  • Operationalize prompt versioning, orchestration strategies, and agent workflows in live production settings
  • Set up and govern Azure ML and Generative AI infrastructure via Terraform as Infrastructure as Code (IaC)
Employment type: Full-time

Software Engineer - AI Search

Join us at Seismic, a cutting-edge technology company leading the way in the Saa...
Location: India, Hyderabad
Salary: Not provided
Company: Seismic
Expiration Date: Until further notice
Requirements
  • 4 to 11 years of experience in software engineering, with experience contributing to frontend or UI-focused web applications
  • Experience with HTML, CSS, and modern JavaScript (ES6+)
  • Experience building user interfaces using React, including functional components, hooks, and state management patterns
  • Full-stack experience (C#, Node.js, Python) a plus
  • Experience with TypeScript, including writing strongly typed components and APIs
  • Familiarity with modern CSS techniques such as CSS Modules, styled-components, Tailwind, or similar approaches
  • Experience integrating frontend applications with REST or GraphQL APIs
  • Working knowledge of automated frontend testing practices (e.g., Jest, React Testing Library, Cypress, Playwright)
  • Experience using Git for source control and collaborating through pull requests
  • Familiarity with CI/CD concepts and modern frontend pipelines, including GitHub Actions
Job Responsibility
  • Contribute to the design, development, and maintenance of backend systems and services supporting search functionality, ensuring performance, scalability, and reliability
  • Assist in implementing search and/or AI-related features, including indexing, retrieval, and ranking logic, to improve search accuracy and efficiency
  • Collaborate with engineers, AI partners, and product teams to integrate search and AI-driven capabilities across the Seismic platform
  • Participate in monitoring and performance tuning efforts, identifying routine bottlenecks and applying guided improvements to ensure acceptable query latency
  • Work closely with cross-functional and geographically distributed teams, including product managers, frontend engineers, and UX designers, to support seamless search experiences
  • Learn and apply new tools, technologies, and best practices related to search, backend development, and AI systems
Employment type: Full-time

Staff Software Engineer, AI Agent Platform

The Geico AI Agent Platform team is seeking an exceptional Staff Software Engine...
Location: United States, Chevy Chase; New York City
Salary: 115000.00 - 260000.00 USD / Year
Company: Geico
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related field; an advanced degree (master’s or Ph.D.) is highly desirable
  • 6+ years of hands-on experience in designing, implementing, and maintaining multi-tenant AI/ML systems and platforms in production environments
  • 6+ years of experience working with cloud platforms such as Azure and AWS
  • Extensive expertise in designing and deploying large-scale data pipelines and real-time inference systems and managing the end-to-end AI Agent and/or AI/ML system development lifecycles, including configuration, evaluation, monitoring, observability and AuthN/AuthR considerations
  • 6+ years of experience working with common backend systems & tools (e.g., Kubernetes, Temporal, OpenSearch, PostgreSQL, Redis, Neo4j)
  • Deep understanding of Docker, container optimization, and multi-stage builds
  • Experience with Prometheus, Grafana, OpenTelemetry and distributed tracing
  • 3+ years of experience building front-end web applications using frameworks such as React and/or Next.js
  • Deep proficiency in programming languages such as Python, Java, Go, etc., with a strong emphasis on coding excellence
Job Responsibility
  • Architect and implement scalable multi-tenant backend systems for building AI agent workflows, including agent configuration, offline evaluation, synthetic data generation, workflow simulation, and an agent marketplace, using technologies such as Azure Kubernetes Service (AKS) and FastAPI, while ensuring economies of scale and controlling maintenance costs
  • Collaborate with Design team to architect and implement frontend experiences and workflows for onboarding both technical and non-technical stakeholders, maximizing user adoption and successful AI agent development
  • Develop observability frameworks to ensure 99.9%+ uptime for AI agent platforms through robust monitoring, alerting, and incident response procedures
  • Evaluate and (if desirable) integrate cutting-edge GenAI frameworks, libraries and vendors to maintain a state-of-the-art technology stack, including hybrid cloud solutions with AWS/GCP as backup or specialized use cases
  • Architect and implement scalable, high-performance machine learning platforms and systems capable of processing large data volumes and supporting real-time decision making and workflows
  • Oversee the end-to-end lifecycle of AI agent applications, ensuring robust testing, deployment, and ongoing monitoring
  • Ensure adherence to company production readiness standards, security protocols, and regulatory compliance throughout the development lifecycle
  • Continuously optimize platform performance, reducing latency and improving throughput for AI agent workloads
  • Design and implement backup, recovery, and business continuity plans for hosted platform applications & services
  • Design and maintain robust CI/CD pipelines for ML model deployment using Azure DevOps, GitHub Actions, and MLOps tools
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment type: Full-time

AI Software Engineer

We are looking for an AI Software Engineer to help design, build, and deploy AI-...
Location: United States, Laconia
Salary: Not provided
Company: Robert Half
Expiration Date: Until further notice
Requirements
  • Experience with Python and modern ML frameworks (TensorFlow, PyTorch, scikit-learn)
  • Strong software engineering fundamentals and experience deploying models to production
  • Interest or background in manufacturing, industrial systems, or operational data
  • Proven experience building and delivering production-grade solutions in a software engineering, machine learning engineering, or AI engineering role
  • Strong knowledge of machine learning methods and applied artificial intelligence techniques
  • Proficiency in Python and experience with frameworks such as TensorFlow, PyTorch, or scikit-learn
  • Hands-on background developing data pipelines, APIs, and application services that support AI solutions
  • Experience working with cloud-based platforms such as AWS, Azure, or similar environments
  • Ability to work with incomplete, noisy, or highly variable operational datasets from real-world environments
  • Strong communication skills with ability to explain technical ideas clearly to business and operational teams
Job Responsibility
  • Develop and deploy AI/ML models for manufacturing use cases (predictive maintenance, quality inspection, process optimization)
  • Work with sensor, production, and ERP data to build actionable insights
  • Collaborate with cross‑functional teams to move ideas from concept to production
What we offer
  • Medical, vision, dental, and life and disability insurance
  • 401(k) plan

Azure DevOps + Agentic AI

Wissen Technology is hiring for Azure DevOps + Agentic AI; About Wissen Technolo...
Location: India, Bangalore South
Salary: Not provided
Company: Wissen
Expiration Date: Until further notice
Requirements
  • 3–5 years of experience in MLOps / ML Engineering / Cloud Engineering
  • Proficient in designing and deploying end-to-end ML pipelines
  • Terraform for Azure infrastructure automation
  • Python for ML, automation, and GenAI workflows
  • Azure Compute, Storage, Networking, and Identity
  • Running ML & GenAI workloads at scale on Azure
  • Supporting data pipelines for ML and LLM workloads
  • Experience with LangGraph for LLM workflow and agent orchestration
  • Hands-on exposure to Claude models, including skills/plugins integration
  • Understanding of prompt management, agent execution, and orchestration patterns
Job Responsibility
  • Build, deploy, and manage comprehensive MLOps and LLMOps pipelines on Azure
  • Design and oversee CI/CD pipelines for machine learning models and large language model workflows utilizing Harness or Azure DevOps
  • Streamline the promotion of models, prompts, and agent workflows between environments through automation
  • Establish approval gates, implement rollback mechanisms, and facilitate controlled release processes
  • Oversee the lifecycle of ML models and LLM-driven workflows, including their training, assessment, deployment, monitoring, and retraining
  • Administer Azure Machine Learning workspaces, computing resources, environments, model registries, and endpoints
  • Integrate LLM workflows and agent-centric architectures using LangGraph
  • Support the incorporation of Claude-based models, skills, and plugins into enterprise-level applications
  • Operationalize prompt versioning, orchestration strategies, and agent workflows in live production settings
  • Set up and govern Azure ML and Generative AI infrastructure via Terraform as Infrastructure as Code (IaC)
Employment type: Full-time

Senior Software Engineer II - Frontend - AI Search

Join us at Seismic, a cutting-edge technology company leading the way in the Saa...
Location: India, Hyderabad
Salary: Not provided
Company: Seismic
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in software engineering, with experience contributing to frontend or UI-focused web applications
  • Experience with HTML, CSS, and modern JavaScript (ES6+)
  • Experience building user interfaces using React, including functional components, hooks, and state management patterns
  • Experience with TypeScript, including writing strongly typed components and APIs
  • Familiarity with modern CSS techniques such as CSS Modules, styled-components, Tailwind, or similar approaches
  • Experience integrating frontend applications with REST or GraphQL APIs
  • Working knowledge of automated frontend testing practices (e.g., Jest, React Testing Library, Cypress, Playwright)
  • Experience using Git for source control and collaborating through pull requests
  • Familiarity with CI/CD concepts and modern frontend pipelines, including GitHub Actions
  • Exposure to frontend performance optimization techniques (code splitting, lazy loading, memoization)
Job Responsibility
  • Contribute to the development and maintenance of backend systems that power our web application, including search, content discovery, and AI capabilities
  • Contribute to the design, development, and maintenance of backend systems and services supporting search functionality, ensuring performance, scalability, and reliability
  • Assist in implementing search and/or AI-related features, including indexing, retrieval, and ranking logic, to improve search accuracy and efficiency
  • Collaborate with engineers, AI partners, and product teams to integrate search and AI-driven capabilities across the Seismic platform
  • Participate in monitoring and performance tuning efforts, identifying routine bottlenecks and applying guided improvements to ensure acceptable query latency
  • Work closely with cross-functional and geographically distributed teams, including product managers, frontend engineers, and UX designers, to support seamless search experiences
  • Learn and apply new tools, technologies, and best practices related to search, backend development, and AI systems
Employment type: Full-time

Senior Software Engineer II - Frontend - AI Search

Join us at Seismic, a cutting-edge technology company leading the way in the Saa...
Location: India, Hyderabad
Salary: Not provided
Company: Seismic
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in software engineering, with experience contributing to frontend or UI-focused web applications
  • Experience with HTML, CSS, and modern JavaScript (ES6+)
  • Experience building user interfaces using React, including functional components, hooks, and state management patterns
  • Full-stack experience (C#, Node.js, Python) a plus
  • Experience with TypeScript, including writing strongly typed components and APIs
  • Familiarity with modern CSS techniques such as CSS Modules, styled-components, Tailwind, or similar approaches
  • Experience integrating frontend applications with REST or GraphQL APIs
  • Working knowledge of automated frontend testing practices (e.g., Jest, React Testing Library, Cypress, Playwright)
  • Experience using Git for source control and collaborating through pull requests
  • Familiarity with CI/CD concepts and modern frontend pipelines, including GitHub Actions
Job Responsibility
  • Contribute to the development and maintenance of backend systems that power our web application, including search, content discovery, and AI capabilities
  • Contribute to the design, development, and maintenance of backend systems and services supporting search functionality, ensuring performance, scalability, and reliability
  • Assist in implementing search and/or AI-related features, including indexing, retrieval, and ranking logic, to improve search accuracy and efficiency
  • Collaborate with engineers, AI partners, and product teams to integrate search and AI-driven capabilities across the Seismic platform
  • Participate in monitoring and performance tuning efforts, identifying routine bottlenecks and applying guided improvements to ensure acceptable query latency
  • Work closely with cross-functional and geographically distributed teams, including product managers, frontend engineers, and UX designers, to support seamless search experiences
  • Learn and apply new tools, technologies, and best practices related to search, backend development, and AI systems
Employment type: Full-time