AI Application Operations & Maintenance Engineer (Azure)

Business Integration Partners

Location:
Albania, Tirana

Contract Type:
Not provided

Salary:

Not provided

Job Description:

The organization is seeking a professional specialized in Application Maintenance & Operations for Generative AI–based applications. The candidate will operate on complex, mission-critical, highly integrated enterprise solutions, built on Microsoft Azure infrastructure, with extensive use of AI services, microservices architectures, and containerized platforms. The objective of the role is to ensure operational continuity, application stability, security, and controlled evolution of AI solutions in production. This includes supporting runtime operations, monitoring, troubleshooting, tuning, and continuous improvement of distributed applications across multiple environments (e.g., Dev, Test, Prod). The role sits at the crossroads between the application layer and the infrastructure layer, with a strong focus on observability, system integration, data management, responsible use of AI resources, and strict adherence to security principles and access segregation.

Job Responsibilities:

  • Manage corrective and adaptive maintenance activities for AI applications in production
  • Analyze and resolve application incidents and anomalies across front-end, back-end, and service layers
  • Support application release activities and configuration management across different environments (Dev/Test/Prod)
  • Collaborate with development teams to analyze application issues and improve overall software quality
  • Provide operational support for solutions based on Azure Kubernetes Service (AKS), including management of containerized workloads
  • Continuously monitor application and infrastructure services using Azure Monitor, Log Analytics, and Application Insights
  • Analyze application logs, metrics, and alerts to ensure appropriate levels of reliability and performance
  • Perform advanced troubleshooting on data ingestion pipelines, AI services, search services, and databases
  • Provide operational support for data persistence services, including Azure SQL Database for structured data, Azure Cosmos DB for unstructured data and conversational history, and Azure Storage Accounts (Blob Storage) for document repositories
  • Verify and support correct content indexing and retrieval through Azure AI Search, including vector search and similarity search
  • Operate and monitor data preprocessing, transformation, and enrichment workflows (Transformation Layer)
  • Operate and manage application Managed Identities, ensuring adherence to the least privilege principle
  • Support secure handling of secrets and sensitive configurations using Azure Key Vault
  • Verify correct usage of AI services (e.g., Azure OpenAI, LLMs, embeddings, cognitive services) according to architectural governance and policies
  • Collaborate with security teams to ensure compliance with network and security requirements (no public exposure, access via internal network/VPN)
  • Contribute to the evolution of AI solutions with a focus on scalability, reliability, and cost optimization
  • Propose improvements to logging, monitoring, and application feedback mechanisms
  • Support the go-live and stabilization of new Generative AI use cases, aligned with reference architectures

Requirements:

  • Experience in Application Maintenance and Operations for enterprise applications
  • Solid knowledge of Python in an application context focused on AI functionalities
  • Operational knowledge of Microsoft Azure and its main PaaS services
  • Experience with Azure Kubernetes Service (AKS) and containerized workloads
  • Strong troubleshooting skills based on logs, metrics, and alerts
  • Knowledge of monitoring, logging, and observability principles
  • Familiarity with microservices architectures and multi-layer environments
  • Understanding of IAM concepts, Managed Identities, and secret management
  • Strong analytical and problem-solving mindset
  • Ability to work in cross-functional teams (development, architecture, security)
  • Structured, quality-driven approach to service delivery
  • High level of autonomy and responsibility in managing production environments
  • Strong technical communication and documentation skills

Nice to have:

  • Experience operating AI / Generative AI solutions in production
  • Knowledge of Azure OpenAI, embedding services, and vector search
  • Experience with Azure AI Search, Cosmos DB, and Document Intelligence
  • Familiarity with ITIL operational models (Incident, Problem, Change Management)
  • Experience in highly critical and security-sensitive enterprise environments
  • One or more Microsoft AI certifications (e.g., Microsoft Certified: Azure AI Engineer Associate – AI-102)

Additional Information:

Job Posted:
May 14, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for AI Application Operations & Maintenance Engineer (Azure)

Generative AI Application Operations Engineer

The organization is looking for a professional profile specialized in Applicatio...
Location:
Italy, Milano
Salary:
Not provided
BIP
Expiration Date:
Until further notice
Requirements:
  • Experience in Application Maintenance and Operations on enterprise applications
  • Good knowledge of Python in an application context focused on AI functionalities
  • Operational knowledge of Microsoft Azure and its main PaaS services
  • Experience with Azure Kubernetes Service (AKS) and containerized workloads
  • Strong analysis and troubleshooting skills based on logs, metrics, and alerts
  • Knowledge of monitoring, logging, and observability principles
  • Familiarity with microservices architectures and multi-layer environments
  • Knowledge of IAM concepts, Managed Identity, and secrets management
Job Responsibilities:
  • Management of corrective and evolutionary maintenance activities for AI applications in production
  • Analysis and resolution of application incidents and anomalies across front-end, back-end, and service-layer components
  • Support for application release activities and configuration management across different environments (Dev/Test/Prod)
  • Collaboration with development teams to analyze application issues and improve overall software quality
  • Operational ownership of solutions based on Azure Kubernetes Service (AKS), including management of containerized workloads
  • Continuous monitoring of application and infrastructure services using Azure Monitor, Log Analytics, and Application Insights
  • Analysis of application logs, metrics, and alerts to ensure adequate reliability and performance levels
  • Execution of advanced troubleshooting activities on data ingestion pipelines, AI services, search services, and databases
  • Operational support for the use of data persistence services, including Azure SQL DB, Azure Cosmos DB, Azure Storage Account
  • Verification and support of correct content indexing and querying through Azure AI Search
Employment Type:
Full-time

Principal Engineer

The Principal AI/ML Operations Engineer leads the architecture, automation, and ...
Location:
United States, Pleasanton, California
Salary:
251000.00 - 314500.00 USD / Year
BlackLine
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field
  • 10+ years in ML infrastructure, DevOps, and software system architecture
  • 4+ years in leading MLOps or AI Ops platforms
  • Strong programming skills in languages such as Python, Java, or Scala
  • Expertise in ML frameworks (TensorFlow, PyTorch, scikit-learn) and orchestration tools (Airflow, Kubeflow, Vertex AI, MLflow)
  • Proven experience operating production pipelines for ML and LLM-based systems across cloud ecosystems (GCP, AWS, Azure)
  • Deep familiarity with LangChain, LangGraph, ADK or similar agentic system runtime management
  • Strong competencies in CI/CD, IaC, and DevSecOps pipelines integrating testing, compliance, and deployment automation
  • Hands-on experience with observability stacks (Prometheus, Grafana, New Relic) for model and agent performance tracking
  • Understanding of governance frameworks for Responsible AI, auditability, and cost metering across training and inference workloads
Job Responsibilities:
  • Define enterprise-level standards and reference architectures for ML-Ops and AIOps systems
  • Partner with data science, security, and product teams to set evaluation and governance standards (Guardrails, Bias, Drift, Latency SLAs)
  • Mentor senior engineers and drive design reviews for ML pipelines, model registries, and agentic runtime environments
  • Lead incident response and reliability strategies for ML/AI systems
  • Lead the deployment of AI models and systems in various environments
  • Collaborate with development teams to integrate AI solutions into existing workflows and applications
  • Ensure seamless integration with different platforms and technologies
  • Define and manage MCP Registry for agentic component onboarding, lifecycle versioning, and dependency governance
  • Build CI/CD pipelines automating LLM agent deployment, policy validation, and prompt evaluation of workflows
  • Develop and operationalize experimentation frameworks for agent evaluations, scenario regression, and performance analytics
What we offer:
  • short-term and long-term incentive programs
  • robust offering of benefit and wellness plans
Employment Type:
Full-time

AI/ML Engineer

We are looking for an AI/ML Engineer with deep technical expertise and proven le...
Location:
Bahrain
Salary:
Not provided
Lobellia
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Chemical Engineering, AI, Machine Learning, or a related field
  • 10+ years of hands-on experience in AI/ML, with at least 4 years in a senior or lead role
  • Proven project delivery experience in industrial or energy sectors, with a preference for oil & gas
  • Demonstrated knowledge of oil & gas processes (upstream, midstream, downstream), instrumentation, and control systems
  • Proven expertise in developing dynamic process simulations using PFDs and P&IDs, and in troubleshooting
  • Proficiency in handling large-scale data, time-series data, and sensor/IoT data within industrial contexts
  • Familiarity with real-time data challenges and solutions specific to high-stakes industrial environments
  • Strong foundation in machine learning algorithms (supervised, unsupervised, reinforcement learning), statistical modelling, and optimization techniques
  • Strong experience with classical machine learning, deep learning and reinforcement learning projects
  • Ability to identify relevant metrics for AI model evaluation and to present technical outcomes to both technical and non-technical audiences, highlighting business value and ROI
Job Responsibilities:
  • Develop a dynamic process simulation model to simulate various plant scenarios
  • Perform exploratory data analysis and data pre-processing to identify trends and patterns and make intelligent recommendations
  • Implement classical machine learning techniques to build soft sensors, and reinforcement learning models for autonomous control of process plant operations
  • Design and develop AI models that troubleshoot plant upsets and support asset performance management across various maintenance strategies
  • Leverage Generative AI (Large Language Models, Deep Reinforcement Learning) to enable multi-agent systems for collaborative decision-making and autonomous goal-seeking behavior
  • Ensure AI models are scalable and deployable within industrial platforms, integrating with PLC, DCS, SCADA, Historians, EAM, MES/MOM, SCM, and ERP systems
  • Ensure compliance with ethical AI principles, particularly in terms of fairness, transparency, and bias mitigation
  • Lead project implementation from data ingestion and feature engineering to model deployment and monitoring
  • Lead and mentor a team of process engineers and machine learning engineers
  • Review the process model and guide the team to develop plant scenarios in the dynamic simulation model accurately
Employment Type:
Full-time

Senior Data Engineer

Microsoft Cloud Operations + Innovation (CO+I) is the engine that powers Microso...
Location:
United States, Redmond
Salary:
119800.00 - 234700.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in computer science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years’ experience in business analytics, data science, data modeling, or data engineering work
  • OR master’s degree in computer science, Math, Software Engineering, Computer Engineering, or related field and 3+ years’ experience in business analytics, data science, data modeling, or data engineering work
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • 8+ years of experience in data engineering with coding and debugging skills in C#, Python, and/or SQL
  • Deploying solutions in Azure Services & Managing Azure Subscriptions
  • Understanding and knowledge about big data and writing queries with Kusto/KQL
  • Understanding and knowledge about extracting data via REST APIs
  • Strong analytical skills with a systematic and structured approach to software design
  • 5+ years of experience in data science, analytics, or machine learning
  • 4+ years of experience in developing solutions with Microsoft Power Platform, including Power BI, Fabric, Power Automate & M365 Dataverse
Job Responsibilities:
  • Apply modification techniques to transform raw data into compatible formats for downstream systems
  • Utilize software and computing tools to ensure data quality and completeness
  • Implement code to extract and validate raw data from upstream sources, ensuring accuracy and reliability
  • Writes efficient, readable, extensible code from scratch that spans multiple features/solutions
  • Develops technical expertise in proper modeling, coding, and/or debugging techniques such as locating, isolating, and resolving errors and/or defects
  • Leverages technical proficiency of big-data software engineering concepts, such as Hadoop Ecosystem, Apache Spark, continuous integration and continuous delivery (CI/CD), Docker, Delta Lake, MLflow, AML, and representational state transfer (REST) application programming interface (API) consumption/development
  • Acquires data necessary for successful completion of the project plan
  • Proactively detects changes and communicates to senior leaders
  • Develops usable data sets for modeling purposes
  • Contributes to ethics and privacy policies related to collecting and preparing data by providing updates and suggestions around internal best practices
Employment Type:
Full-time

Senior Software Engineer II - Frontend - AI Search

Join us at Seismic, a cutting-edge technology company leading the way in the Saa...
Location:
India, Hyderabad
Salary:
Not provided
Seismic
Expiration Date:
Until further notice
Requirements:
  • 7+ years of experience in software engineering, with experience contributing to frontend or UI-focused web applications
  • Experience with HTML, CSS, and modern JavaScript (ES6+)
  • Experience building user interfaces using React, including functional components, hooks, and state management patterns
  • Full-stack experience (C#, Node.js, Python) a plus
  • Experience with TypeScript, including writing strongly typed components and APIs
  • Familiarity with modern CSS techniques such as CSS Modules, styled-components, Tailwind, or similar approaches
  • Experience integrating frontend applications with REST or GraphQL APIs
  • Working knowledge of automated frontend testing practices (e.g., Jest, React Testing Library, Cypress, Playwright)
  • Experience using Git for source control and collaborating through pull requests
  • Familiarity with CI/CD concepts and modern frontend pipelines, including GitHub Actions
Job Responsibilities:
  • Contribute to the development and maintenance of backend systems that power our web application, including search, content discovery, and AI capabilities
  • Contribute to the design, development, and maintenance of backend systems and services supporting search functionality, ensuring performance, scalability, and reliability
  • Assist in implementing search and/or AI-related features, including indexing, retrieval, and ranking logic, to improve search accuracy and efficiency
  • Collaborate with engineers, AI partners, and product teams to integrate search and AI-driven capabilities across the Seismic platform
  • Participate in monitoring and performance tuning efforts, identifying routine bottlenecks and applying guided improvements to ensure acceptable query latency
  • Work closely with cross-functional and geographically distributed teams, including product managers, frontend engineers, and UX designers, to support seamless search experiences
  • Learn and apply new tools, technologies, and best practices related to search, backend development, and AI systems
Employment Type:
Full-time

Senior DevOps Engineer

The Cloud Engineering team at Seismic is responsible for infrastructure deployed...
Location:
India, Hyderabad
Salary:
Not provided
Seismic
Expiration Date:
Until further notice
Requirements:
  • 6+ years of hands‑on experience in DevOps, Platform Engineering, Cloud Infrastructure, or a related engineering role in production environments
  • Comfortable asking questions and respectfully voicing your opinion in group settings
  • Comfortable providing and discussing a recommendation after evaluating multiple solutions
  • Detailed understanding of DevOps Capabilities and their importance in enabling high performing teams
  • Experience working in a cloud environment as a DevOps engineer, infrastructure engineer, or software development engineer with infrastructure exposure
  • High level of competency with at least one cloud platform, ideally multiple, such as Azure, AWS, Google Cloud, IBM Cloud, or Oracle Cloud Infrastructure
  • Production-level use of Terraform, Shell/Bash, and Python (or another non-shell programming language such as Go, Java, or Ruby)
  • Strong knowledge of Linux, Kubernetes, networking, and infrastructure fundamentals in a multi-region microservice architecture environment
  • Comfortable collaborating within a global team
  • Comfortable using AI-assisted development tools such as GitHub Copilot
Job Responsibilities:
  • Drive development and maintenance of our .NET tenant management tooling and automation
  • Eliminate manual work related to tenant creation, deletion, renaming, migration, and refresh processes
  • Debug these processes when they fail
  • Contribute to best practices and build out tools and frameworks to increase productivity of our Engineering group
  • Work closely with application development teams and incorporate their feedback to improve developer experience and reduce toil
  • Present plans and proposals to Engineering Leadership
  • Lead projects as we execute objectives shared across our Production Engineering team
  • Provide guidance to less senior engineers
  • Participate in a 12 hours on, 12 hours off on-call rotation within the Production Engineering team
  • Leverage AI-assisted tools to accelerate infrastructure development, troubleshooting, and documentation while ensuring reliability, security, and compliance standards are met
Employment Type:
Full-time

AI Solutions Architect

We are seeking an experienced AI Solutions Architect to join our data team and d...
Location:
United States, San Diego
Salary:
160000.00 - 180000.00 USD / Year
Clearway Energy
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or related field
  • 5+ years of experience in AI/ML solution architecture and implementation
  • Proven track record of taking AI projects from proof-of-concept to production
  • Strong experience with Snowflake Cortex AI or similar systems for enterprise AI workflows
  • Hands-on experience with AI agentic frameworks (LangChain, LangGraph, Semantic Kernel, CrewAI, or AutoGen) for building autonomous AI workflows
  • Proficiency in integrating LLM APIs and SDKs (OpenAI, Anthropic, Azure OpenAI, Hugging Face) into production applications
  • Experience designing and implementing RAG systems, including vector databases (Pinecone, Weaviate, Chroma, pgvector, or similar), embedding models, and retrieval optimization
  • Experience building interactive applications with Gradio and Streamlit
  • Advanced GitHub/GitLab workflows including CI/CD, branching strategies, and code review processes
  • Expertise in Python, SQL, and modern ML frameworks (PyTorch, TensorFlow, scikit-learn)
Job Responsibilities:
  • Design and architect end-to-end AI solutions using Snowflake Cortex AI, Modal, and cloud-native technologies
  • Develop rapid proofs of concept for retrieval-augmented generation (RAG) applications, operational optimization, process optimization, energy forecasting, predictive maintenance, and computer vision (drone footage) use cases
  • Design and implement AI agentic workflows using frameworks such as LangChain, LangGraph, CrewAI, or similar orchestration tools
  • Create interactive demos and prototypes using Gradio and Streamlit for stakeholder validation
  • Translate business requirements into technical specifications and system architectures
  • Build and deploy scalable AI/ML models for business process optimization, renewable energy applications, including wind/solar forecasting, and asset performance optimization
  • Integrate AI capabilities via APIs and SDKs from providers including OpenAI, Anthropic, Azure OpenAI, and Hugging Face
  • Implement MLOps pipelines using Dagster, Snowflake, Sentry, and Modal for model training, deployment, and monitoring
  • Design and optimize RAG architectures, including vector database implementation, embedding strategies, and retrieval pipelines
  • Ensure robust version control and collaboration practices using GitLab
What we offer:
  • generous PTO
  • medical, dental & vision care
  • HSAs with company contributions
  • health FSAs
  • dependent daycare FSAs
  • commuter benefits
  • relocation
  • 401(k) plan with employer match
  • a variety of life & accident insurances
  • fertility programs
Employment Type:
Full-time

AI Solutions Architect

We are seeking an experienced AI Solutions Architect to join our data team and d...
Location:
United States, Scottsdale
Salary:
160000.00 - 180000.00 USD / Year
Clearway Energy
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or related field
  • 5+ years of experience in AI/ML solution architecture and implementation
  • Proven track record of taking AI projects from proof-of-concept to production
  • Strong experience with Snowflake Cortex AI or similar systems for enterprise AI workflows
  • Hands-on experience with AI agentic frameworks (LangChain, LangGraph, Semantic Kernel, CrewAI, or AutoGen) for building autonomous AI workflows
  • Proficiency in integrating LLM APIs and SDKs (OpenAI, Anthropic, Azure OpenAI, Hugging Face) into production applications
  • Experience designing and implementing RAG systems, including vector databases (Pinecone, Weaviate, Chroma, pgvector, or similar), embedding models, and retrieval optimization
  • Experience building interactive applications with Gradio and Streamlit
  • Advanced GitHub/GitLab workflows including CI/CD, branching strategies, and code review processes
  • Expertise in Python, SQL, and modern ML frameworks (PyTorch, TensorFlow, scikit-learn)
Job Responsibilities:
  • Design and architect end-to-end AI solutions using Snowflake Cortex AI, Modal, and cloud-native technologies
  • Develop rapid proofs of concept for retrieval-augmented generation (RAG) applications, operational optimization, process optimization, energy forecasting, predictive maintenance, and computer vision (drone footage) use cases
  • Design and implement AI agentic workflows using frameworks such as LangChain, LangGraph, CrewAI, or similar orchestration tools
  • Create interactive demos and prototypes using Gradio and Streamlit for stakeholder validation
  • Translate business requirements into technical specifications and system architectures
  • Build and deploy scalable AI/ML models for business process optimization, renewable energy applications, including wind/solar forecasting, and asset performance optimization
  • Integrate AI capabilities via APIs and SDKs from providers including OpenAI, Anthropic, Azure OpenAI, and Hugging Face
  • Implement MLOps pipelines using Dagster, Snowflake, Sentry, and Modal for model training, deployment, and monitoring
  • Design and optimize RAG architectures, including vector database implementation, embedding strategies, and retrieval pipelines
  • Ensure robust version control and collaboration practices using GitLab
What we offer:
  • generous PTO
  • medical, dental & vision care
  • HSAs with company contributions
  • health FSAs
  • dependent daycare FSAs
  • commuter benefits
  • relocation
  • 401(k) plan with employer match
  • a variety of life & accident insurances
  • fertility programs
Employment Type:
Full-time