AI Ops Engineer

India, Noida · Job Posted March 21, 2026

Job Description

The AI Ops Engineer designs, deploys, and optimizes AI-powered applications on Azure. Candidates should have 4+ years of experience in software engineering, cloud, or platform engineering, with strong skills in operationalizing AI/ML applications. Responsibilities include managing deployment pipelines, monitoring performance, and ensuring system reliability. A degree in Computer Science or a related field, or equivalent practical experience, is required.

Job Responsibility

  • Design, deploy, operate, and optimize enterprise-grade AI-powered applications and intelligent agents on Azure that support business workflows and customer interactions at scale
  • Operationalize AI/ML models and LLM-powered applications by managing deployment pipelines, monitoring performance, ensuring reliability, and maintaining scalability in production environments
  • Work closely with engineering, product, and CX teams to ensure AI systems run efficiently in production
  • Leverage Azure services such as Azure OpenAI, Azure Machine Learning, Cognitive Services, Kubernetes, and DevOps pipelines to operationalize AI workloads, continuously monitor model performance, improve latency and accuracy, and ensure governance, security, and system stability
  • Deploy AI agents and AI-powered applications to production environments
  • Maintain CI/CD pipelines for AI models and applications
  • Monitor AI system performance, reliability, and usage metrics
  • Troubleshoot operational issues including latency, hallucinations, or integration failures
  • Implement logging, observability, and evaluation frameworks for AI systems
  • Manage Azure infrastructure supporting AI workloads
  • Ensure security, compliance, and governance for AI deployments
  • Continuously improve system scalability, stability, and operational efficiency
  • Collaborate with AI engineers and product teams to operationalize new AI features

Requirements

  • 4+ years of hands-on software engineering, cloud, or platform engineering experience
  • Strong experience operationalizing AI/ML or GenAI applications in production environments
  • Proven expertise with Microsoft Azure cloud platform, especially AI/ML services
  • Experience with CI/CD pipelines, infrastructure automation, and cloud deployments
  • Strong troubleshooting, monitoring, and production reliability experience
  • Ability to independently manage AI deployments end-to-end
  • Degree in Computer Science, Engineering, Data Science, or equivalent practical experience
  • Experience deploying and managing AI/ML and LLM-based applications in production
  • Hands-on experience with Azure OpenAI, Azure Machine Learning, Azure AI Studio, and Cognitive Services
  • Knowledge of containerization and orchestration (Docker, Kubernetes, AKS)
  • Experience with CI/CD pipelines such as Azure DevOps or GitHub Actions
  • Familiarity with agentic AI frameworks such as LangChain, LlamaIndex, Semantic Kernel, AutoGen, or CrewAI from an operational perspective
  • Understanding of RAG architectures, vector databases, and AI observability tools
  • Strong Python scripting and automation experience
  • Experience monitoring AI models including logging, evaluation, performance metrics, and alerting
  • Knowledge of MLOps/LLMOps practices including model versioning, governance, and lifecycle management
  • Familiarity with Git, infrastructure-as-code, and standard DevOps workflows
  • Strong debugging, production support, and performance optimization skills

Similar Jobs for AI Ops Engineer

8 matching positions

AI SRE / AI Ops engineer

Location: Canada, Montreal
Salary: 140000.00 USD / Year
Company: Realign
Expiration Date: Until further notice
Requirements
  • Production experience in SRE / Infrastructure / ops for large-scale systems
  • Strong programming/scripting skills (Python, Go, Java, or equivalent)
  • Deep experience with containerization (Docker), orchestration (Kubernetes, etc.)
  • Infrastructure-as-code (Terraform, Helm, CloudFormation, Ansible, etc.)
  • Familiarity with GPU / AI compute clusters, high-performance data storage, and distributed architectures
  • Experience with monitoring / observability / logging / alerting tools (Prometheus, Grafana, ELK / EFK, Datadog, etc.)
  • Networking & systems engineering knowledge (TCP/IP, DNS, routing, load balancing, distributed storage)
  • Solid experience in capacity planning, performance tuning, scaling, and incident response
  • Demonstrated ability to lead RCAs, deploy fixes, and drive reliability improvements
  • Excellent communication, documentation, and cross-team collaboration skills
Employment type: Full-time

AI Ops ML Ops Engineer

Whitehall Resources are currently looking for an AI Ops ML Ops Engineer. Key Requ...
Salary: Not provided
Company: Whitehall Resources Ltd
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in ML Ops, DevOps, AI/ML deployment, monitoring, cloud platforms, and production support
  • Build strong cross-functional ways of working across Data & AI, IT, Digital and business teams so delivery is aligned, practical and business-led
  • Keep the internal and external customer experience at the center of data, analytics and AI delivery, with focus on reliable outcomes and decision support
  • Continuously build capability in modern data, analytics and AI practices and actively share knowledge with peers and business users
  • Apply structured problem solving to simplify complex data, process and technology issues and remove barriers to execution
  • Identify practical opportunities to improve business performance using modern data platforms, analytics, automation, GenAI and embedded AI capabilities
  • Adjust priorities and delivery approach in a dynamic business environment while maintaining governance, quality and business continuity
  • Deploy, monitor and manage ML models and AI services across the lifecycle
  • Apply release, versioning, automation and controlled deployment practices
  • Monitor uptime, drift, performance, data quality and operational metrics
Job Responsibility
  • Operationalizes, monitors and supports AI/ML solutions in production
  • Ensures models, pipelines and AI services are deployed, monitored, governed and maintained with reliable operational practices
  • Implement ML Ops and deployment practices
  • Monitor model and solution health
  • Support production AI/ML systems
  • Ensure auditability and governance of AI operations
  • Improve automation and reliability

AI Ops Engineer - Finance

You'll make sure no one at Lovable spends time on anything AI can do. We're defi...
Location: Sweden, Stockholm
Salary: Not provided
Company: Lovable
Expiration Date: Until further notice
Requirements
  • Shipped production automations at a company: You've built workflow systems at scale that ran without you. Actual business-critical automations used by teams
  • Engineering brain, modern toolkit: You understand how systems actually work (APIs, databases, architecture) and can review code confidently. You move fast by choosing the right tool for the job: AI agents (Cursor, Claude Code etc) when they're faster, no-code (n8n, Lovable itself, Clay, etc.) when it fits, raw code when it's needed. Python, TypeScript, SQL etc
  • AI tool obsessed: You're always testing the latest models, apps, and workflows before anyone else. You've deployed internal AI tools (Claude/ChatGPT with custom contexts, AI agents, MCPs, etc.) at a company and you're constantly experimenting with new ones
  • Can't unsee inefficiency: You see the whole company as one machine. Once you spot a broken process, you can't leave it alone
  • High agency: You don't wait for permission. You find the bottleneck, build the fix, and ship it
Job Responsibility
  • Own internal AI systems that make Lovable run faster than it should
  • Define how AI-native companies operate: Help build the playbook. What we learn here scales to every company using Lovable
  • Build internal tools and workflows: Using Lovable, AI agents, no-code, raw code—whatever gets the job done fastest. Lovable on Lovable first, always
  • Deploy AI agent armies: Claude/ChatGPT with MCPs configured so every teammate has AI working for them, not the other way around
  • Squeeze value from our stack: Make the AI features in Slack, Linear, Notion, Ashby, etc. actually deliver
  • Level up the team: Turn everyone at Lovable into AI power users. Document what works. Scale it
  • Own it end-to-end: You build it, you maintain it, you improve it
Employment type: Full-time

AI Ops Platform Engineer

Join us as an AI Ops Engineer, to build and run an enterprise AI Factory within ...
Location: United Kingdom, London
Salary: Not provided
Company: Barclays
Expiration Date: Until further notice
Requirements
  • LLMOps / MLOps at production scale, operating the full Generative AI lifecycle including models, prompts and agents, CI/CD pipelines, structured evaluation, drift and hallucination monitoring, and controlled, auditable release processes suitable for banking environments
  • Cloud‑native AI platform engineering on AWS, with hands‑on delivery using services such as Amazon Bedrock for foundation models, agent orchestration patterns, Lambda and Step Functions, alongside demonstrated Python engineering capability and secure microservices and API design
  • AI governance, observability and cost optimisation, embedding governance by design through policy as code, alignment to model risk framework expectations, lifecycle traceability and audit‑ready evidence, supported by SRE‑grade monitoring and ongoing optimisation of token usage and compute cost across AI workloads
Job Responsibility
  • Build and run an enterprise AI Factory within our Card Merchant Services organisation, enabling AI‑driven change across the merchant payments lifecycle
  • Accountable for the end‑to‑end operationalisation of AI, spanning model, prompt, and agent lifecycles; deployment and monitoring; guardrails; and cost optimisation, ensuring AI solutions are production‑ready, auditable, compliant, and scalable across merchant payment use cases
  • Accountable for the end‑to‑end engineering of GenAI and ML platforms, embedding governance, observability and operational resilience by design, while enabling teams to deploy and run AI solutions with clarity, assurance and accountability at scale
  • Lead and manage engineering teams, providing technical guidance, mentorship, and support to ensure the delivery of high-quality software solutions
  • Oversee timelines, team allocation, risk management and task prioritization
  • Mentor and support team members' professional growth, conduct performance reviews, provide actionable feedback, and identify opportunities for improvement
  • Evaluate and enhance engineering processes, tools, and methodologies
What we offer
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution
Employment type: Full-time

CS - Ops - Applied AI Engineer - Hsinchu

As an Applied AI Engineer, you design and deliver practical AI solutions that im...
Location: Taiwan, Hsinchu
Salary: Not provided
Company: ASML
Expiration Date: Until further notice
Requirements
  • Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, or a related field
  • 5+ years of experience in software development or automation-related roles
  • 3+ years of experience applying AI or data-driven solutions in an enterprise or industrial environment
Job Responsibility
  • Design and build AI-driven solutions that support process improvement, intelligent diagnostics, and decision support
  • Apply Generative AI and agent-based AI concepts to automate and simplify complex workflows
  • Translate operational challenges into practical, usable AI solutions that deliver measurable business value
  • Build AI solutions in a clear and structured way, so they are easy to understand, update, and support over time
  • Ensure AI solutions run on ASML-approved systems and data platforms, not personal tools or one-off scripts
  • Work together with IT and platform teams so solutions are stable, secure, and suitable for daily operations
  • Make sure solutions are reliable, protect sensitive data, and can be maintained or improved by others in the future
  • Collaborate with IT teams, software engineers, and business stakeholders to align AI solutions with business needs and company standards
  • Explain AI concepts and solution choices in a clear, practical, and non-technical way when needed
  • Act as a bridge between business problems and technical solutions
Employment type: Full-time

Senior Software Engineer, AI & ML Ops

Hyundai AutoEver America seeks a seasoned Senior AI/ML Engineer to architect, de...
Location: United States, Irvine
Salary: 103170.00 - 158873.00 USD / Year
Company: Hyundai AutoEver America
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Engineering, AI, or related field; advanced degrees/certifications are a plus
  • 8+ years of software engineering experience, including 3+ years in AI/ML solution development
  • Proven experience designing and deploying LLM-based solutions, traditional ML models, RAG systems, and agent workflows
  • Strong expertise in Python, TensorFlow/PyTorch, Hugging Face, prompt engineering, vector databases, and AI orchestration
  • Hands-on experience with AWS SageMaker/Bedrock, Azure OpenAI, or Azure ML Studio, plus MLOps best practices (CI/CD, testing, model monitoring)
  • Proficiency in frontend frameworks (React), cloud-native deployment (Docker/Kubernetes), microservice APIs, and relational/NoSQL databases
Job Responsibility
  • Architect and develop scalable AI/ML and LLM-based systems, including RAG pipelines, agentic workflows, predictive models, and generative AI solutions
  • Build full‑stack AI applications, including React-based dashboards and front‑end interfaces integrated with backend services and cloud infrastructure
  • Develop data pipelines and ML Ops workflows using Python, SQL, AWS/Azure platforms, and monitoring tools to train, deploy, and optimize models
  • Lead cross-functional AI initiatives, deliver PoCs/MVPs, ensure compliance with AI governance, and integrate AI features into enterprise and user-facing systems
  • Provide technical leadership and mentorship, guiding standards, code reviews, model documentation, and best practices in AI/ML development
  • Continuously improve AI performance and reliability through prompt engineering, architecture enhancements, and data optimization
What we offer
  • comprehensive medical/dental coverage
  • generous PTO
  • education assistance
  • annual merit increase eligibility
Employment type: Full-time

Cloud & AI Solution Engineer - AI Apps

Are you insatiably curious, deeply passionate about the realm of AI & applicatio...
Location: United Kingdom, Multiple Locations
Salary: Not provided
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science, Information Technology, or a related field or equivalent work experience
  • Experience in technical pre‑sales, consulting, and customer‑facing architecture engagements
  • Proven ability to influence both technical and business decision‑makers
  • Strong track record leading complex customer architecture and technical pre‑sales engagements
  • Trusted technical advisor across Modern Database, AI Applications, and Data Platforms
  • AI Application Architecture
  • Full‑Stack & Cloud‑Native Development
  • Databases & Data Architecture
  • Hands‑on delivery of Proof of Concepts, Minimum Viable Products, and rapid prototypes, with direct contribution in architecture design sessions, hackathons, and customer validations—consistently converting concepts into scalable, production‑ready solutions.
Job Responsibility
  • Drive technical sales with decision makers using demos and PoCs to influence solution design and enable production deployments
  • Lead hands-on engagements—hackathons, code-with sessions, and architecture workshops—to accelerate adoption of Microsoft’s developer tools and cloud platforms
  • Build trusted relationships with developers and platform leads, co-designing secure, scalable architectures and solutions
  • Resolve technical blockers and objections, collaborating with engineering to share insights and improve products
  • Maintain deep expertise in AI Foundry & App architecture (Agentic AI framework, Semantic Kernel, Foundry SDK, Responsible AI) and App architecture/cloud native dev (APIs, containerization, microservices, event-driven, Python, Java or .NET)
  • Maintain and grow expertise in AI Management & Security (Gen AI Ops, Sentinel, orchestrator, monitoring)
  • Represent Microsoft through thought leadership in developer communities and customer forums
Employment type: Full-time

Agentic AI Engineer Co-Op

An Agentic AI Engineer designs, builds, and deploys autonomous AI systems that c...
Salary: Not provided
Company: Campbell Soup Company
Expiration Date: Until further notice
Requirements
  • 2+ years in AI/ML system design, deployment, or autonomous agent development
  • Programming: Proficiency in Python (and sometimes Java, C#) for AI/ML solution development
  • Agent & Workflow Expertise: Experience with agent orchestration frameworks and multi-agent communication protocols
  • RAG & LLM Integration: Hands-on with RAG architectures, evaluation methodologies, and LLM integration
  • Cloud & DevOps: Experience with cloud platforms (e.g., Azure, AWS) and CI/CD pipelines
  • Governance & Compliance: Understanding of responsible AI, security, and compliance in regulated domains (e.g., retail)
Job Responsibility
  • Design & Develop Agentic Systems: Build intelligent agents capable of autonomous planning, reasoning, and task execution, often using LLMs (e.g., GPT-class, LLaMA), multi-modal models, and autonomous workflows
  • Orchestration & Frameworks: Implement agent orchestration using frameworks like LangChain, AutoGen, CrewAI, Semantic Kernel, or custom solutions
  • Retrieval-Augmented Generation (RAG): Design and optimize RAG pipelines for enhanced reasoning with external knowledge, including document ingestion, chunking, embeddings, vector stores, and retrieval ranking
  • Tool & Memory Integration: Develop agents that call APIs, databases, and other tools, maintain memory, and adapt based on outcomes
  • Evaluation & Monitoring: Create evaluation frameworks for accuracy, grounding, latency, and cost; build observability for agent behavior and failure modes
  • Model Adaptation: Fine-tune or adapt foundation models (e.g., via LoRA, adapters) for domain-specific use cases
  • Production Deployment: Deploy GenAI/agentic systems in cloud-native environments with CI/CD, versioning, and runtime safeguards
  • Cross-Functional Collaboration: Work with data scientists, ML engineers, product teams, and governance/compliance stakeholders
What we offer
  • medical
  • dental
  • short and long-term disability
  • AD&D
  • life insurance
  • matching 401(k) plan
  • unlimited sick time
  • paid time off
  • holiday pay
  • free access to the fitness center (if in WHQ)