Senior DevOps & AI Engineer

India, Hyderabad · Job Posted December 08, 2025

Job Description

This role presents a unique opportunity to contribute to the future of impactful business solutions while advancing your career in a collaborative and innovative environment.

Job Responsibility

  • Configure and optimize Linux-based servers for performance, security, and resource utilization, including kernel tuning, file system management, and network configuration
  • Architect cloud solutions leveraging best practices and services offered by AWS and Azure, optimizing for scalability, reliability, and cost-effectiveness
  • Implement and manage hybrid cloud environments, facilitating seamless integration and interoperability between AWS and Azure services
  • Establish version control practices for IaC templates, ensuring traceability, auditability, and reproducibility of infrastructure changes

Requirements

  • Bachelor's degree in Computer Science, Engineering, or related field
  • 6+ years of experience in infrastructure management roles, with a focus on cloud platforms (Azure and AWS preferred)
  • Hands-on experience with DevSecOps principles and best practices
  • Proficiency in scripting languages such as Python, PowerShell, or Bash
  • Excellent communication and collaboration skills
  • In-depth knowledge of Linux operating systems, including CentOS, Ubuntu, and Red Hat, with expertise in shell scripting, package management, and system administration
  • Hands-on experience with a wide range of AWS and Azure services
  • Experience developing and maintaining Infrastructure as Code (IaC) templates using tools such as Terraform or AWS CloudFormation
  • Experience setting up cloud infrastructure stacks, databases, and service endpoints, and scaling and optimizing GPU and CPU resources
  • Should have worked on AIOps/MLOps
  • Should have worked on deploying AI/ML apps using Docker and Kubernetes
  • Should have worked on scaling, high availability, and reliability tasks for AI applications
  • Should have worked on deploying and maintaining GPU clusters for AI/ML training and inference

Nice to have

Certifications such as AWS Solutions Architect Associate, AWS Cloud Practitioner, Azure DevOps Engineer Expert, Azure Administrator, Certified Kubernetes Administrator, or other relevant industry certifications are a plus

What we offer

  • Opportunity to work on impactful technical challenges with global reach
  • Vast opportunities for self-development, including online university access and knowledge sharing opportunities
  • Sponsored Tech Talks & Hackathons to foster innovation and learning
  • Generous benefits packages including health insurance, retirement benefits, flexible work hours, and more
  • Supportive work environment with forums to explore passions beyond work

Similar Jobs for Senior DevOps & AI Engineer

8 matching positions

Senior DevOps AI Engineer

We are seeking a highly experienced and technically proficient Senior DevOps Eng...
Location: United States, Columbia
Salary: 150000.00 - 250000.00 USD / Year
Synergy ECP
Expiration Date: Until further notice
Requirements
  • B.S. in a relevant technical field with 12 years of experience, or M.S. in a relevant technical field with 10 years of experience
  • Advanced proficiency in DevOps principles and practices
  • Demonstrated expertise in containerization using Docker and Kubernetes
  • Proven experience in architecting and managing CI/CD pipelines
  • Extensive experience with AI model lifecycle management and maintenance
  • Familiarity with cloud platforms (AWS, Microsoft Azure) for infrastructure deployment and management
  • Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack)
  • Excellent communication and interpersonal skills, with the ability to effectively collaborate with cross-functional teams
  • Ability to translate complex technical concepts into actionable engineering solutions
  • TS/SCI clearance with CI polygraph
Job Responsibility
  • Design, implement, and maintain robust infrastructure for enterprise AI applications in cloud environments (AWS, Microsoft Azure)
  • Develop and optimize engineering workflows and processes to support AI model development, deployment, and maintenance
  • Architect and manage CI/CD pipelines for continuous integration and continuous delivery of AI models and applications
  • Implement and manage containerization solutions using technologies like Docker and Kubernetes
  • Ensure efficient AI model lifecycle management, including versioning, monitoring, and scaling
  • Collaborate with AI/ML engineers and data scientists to streamline deployment processes and optimize resource utilization
  • Oversee system performance, security, and scalability of AI infrastructure
  • Continuously research and implement new DevOps tools and practices to enhance efficiency
What we offer
  • Highly competitive compensation
  • Comprehensive Health Benefits package
  • 401K Retirement plan
  • People Partners to help navigate personal and professional worlds
  • Wellness resources
  • Company-sponsored continuing education program
  • Generous Paid Time Off
  • 11 paid holidays a year
  • Flexible work options
  • Philanthropy program participation
  • Full-time

Senior DevOps Engineer, AI

LogicMonitor® is the AI-first hybrid observability platform powering the next ge...
Location: India, Pune
Salary: Not provided
LogicMonitor
Expiration Date: Until further notice
Requirements
  • 4+ years of experience in DevOps or similar roles
  • Proven experience with AWS (preferred) and GCP in production environments
  • Strong expertise in Infrastructure as Code practices
  • Solid knowledge of Kubernetes (EKS), container orchestration, and cluster security
  • Hands-on experience with Grafana, Prometheus, and alerting/monitoring systems
  • Understanding of network connectivity over private link endpoints, VPCs, and cross-account VPC connectivity, including how to make services accessible internally and externally
  • Experience deploying automated canary and integration testing pipelines, CI/CD pipelines, etc.
  • Experience exposing internal self-hosted services such as LangFuse through a web UI for internal users, using Traefik, an Ingress controller, or a similar tool
  • Experience deploying LLM-related solutions that require MCP, LangFuse, Airflow, GraphDB, VectorDB, Redis, etc.
  • Experience working with developers to provide on-demand, just-in-time (JIT) access to production clusters for troubleshooting and debugging, using tools such as Teleport
Job Responsibility
  • Multi-Cloud Enablement: Expand and manage application hosting across AWS and Google Cloud, ensuring performance, flexibility, and resilience
  • Infrastructure as Code (IaC): Develop and maintain Terraform or similar installers for Azure and GCP to fully automate infrastructure deployments
  • Cost Optimization: Design and implement AWS cost optimization strategies, including reserved instances, right-sizing, and resource efficiency initiatives
  • Cloud Security: Strengthen infrastructure security with robust access controls, encryption, monitoring, and alerting frameworks
  • Observability: Build and enhance monitoring platforms with Grafana dashboards and Prometheus alerts for real-time performance insights and proactive issue resolution
  • Kubernetes Management: Implement Role-Based Access Control (RBAC) and optimize Ingress controllers (Traefik or similar) for enhanced security and delivery resilience
  • Automation & Scripting: Create Python and Bash scripts to automate repetitive tasks, streamline workflows, and improve operational efficiency

Senior DevOps Engineer (AI & Cloud Infrastructure)

We are seeking a Senior DevOps Engineer to design, deploy, and operate the next ...
Location: United States, Palo Alto
Salary: 175000.00 - 250000.00 USD / Year
Inflection AI
Expiration Date: Until further notice
Requirements
  • 5+ years of hands-on experience in DevOps, Site Reliability Engineering, or ML Infrastructure supporting high-scale, production systems
  • Deep expertise in Azure and AWS, including storage, compute, networking, databases, and cloud-native monitoring services
  • Strong Kubernetes administration experience, including GPU scheduling, operator deployment, and management of core infrastructure components
  • Experience with Slurm is highly desirable
  • Proven experience deploying, scaling, and operating Large Language Models (LLMs) and inference engines such as vLLM, TGI, or Triton
  • Strong experience with modern DevOps tooling: Terraform, Helm, Kustomize, ArgoCD, GitHub Actions or GitLab CI, Prometheus, Grafana, and Clickhouse
  • Advanced scripting and automation skills in Python and Bash, with the ability to debug complex distributed systems and optimize performance at scale
  • Demonstrated ability to troubleshoot LLM servers, Kubernetes workloads, GPU utilization, and cloud infrastructure bottlenecks
  • Bachelor’s degree or equivalent in a field related to the position
Job Responsibility
  • Architect, deploy, and operate large-scale LLM inference servers and AI applications with a focus on low latency, high availability, and production reliability
  • Design, provision, and maintain complex cloud architectures across Azure and AWS, including storage, compute, networking, databases, and native LLM services
  • Manage GPU-enabled Kubernetes clusters and Slurm-based HPC environments, optimizing resource allocation for AI training and inference workloads
  • Deploy and operate core Kubernetes infrastructure components and operators (GPU operators, ingress controllers, service meshes, CNIs, CSIs, and storage drivers)
  • Build scalable infrastructure-as-code and deployment workflows using Terraform, Helm, Kustomize, ArgoCD, and GitOps best practices
  • Design and maintain centralized observability systems using Prometheus, Grafana, Clickhouse, and cloud-native monitoring tools
  • Participate in on-call rotations, lead incident response, perform post-mortems, and continuously improve system reliability and SLAs.
What we offer
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Meaningful equity component.
  • Full-time

Senior Java/Kotlin Engineer (AI-Driven DevOps & Automation)

We are looking for a Senior Java/Kotlin Engineer who goes beyond traditional dev...
Location: Colombia
Salary: Not provided
Parser Limited
Expiration Date: Until further notice
Requirements
  • Strong experience in Java and/or Kotlin backend development
  • Solid understanding of software design, APIs, and distributed systems
  • Experience with CI/CD pipelines and DevOps practices
  • Hands-on experience with static code analysis tools, dependency management, and security remediation
  • Familiarity with AI-assisted coding tools (e.g., Claude, GitHub Copilot, etc.)
  • Experience working with Git-based workflows and multi-repo environments
Job Responsibility
  • Backend Development: Design, build, and maintain scalable backend services using Java/Kotlin
  • Deliver production-ready features with high quality and performance standards
  • Collaborate with product and engineering teams to translate requirements into technical solutions
  • AI-Driven DevOps & Automation: Use Claude (or similar agentic AI tools) to identify and fix vulnerabilities
  • Automate code improvements across repositories
  • Generate and maintain unit and integration tests using AI from code context and diffs
  • Continuously improve CI/CD workflows using AI-assisted processes
  • AI Readiness & Engineering Enablement: Improve AI readiness of repositories: clean architecture, modular structure, clear interfaces and contracts, type safety and documentation for LLM consumption
  • Build guardrails for AI usage: prompt design and versioning, output validation and consistency checks, safe code generation practices
What we offer
  • The chance to work on innovative projects with leading brands that use the latest technologies to fuel transformation
  • The opportunity to be part of an amazing, multicultural community of tech experts
  • A competitive compensation package and medical insurance
  • A flexible working environment
  • Full-time

Senior Java Engineer – Agentic AI Driven Development - Senior Vice President

The Applications Development Technology Senior Lead Analyst is a senior-level po...
Location: Canada, Mississauga
Salary: 145100.00 - 217700.00 USD / Year
Citi
Expiration Date: Until further notice
Requirements
  • Core Java - Strong understanding of Java (JDK 8+, preferably Java 11/17), including multithreading, collections, garbage collection, and JVM internals
  • Frameworks - Extensive experience with Spring Framework (Spring Boot, Spring MVC, Spring Data JPA, Spring Security)
  • Middleware - Proven experience in designing and developing RESTful APIs and microservices
  • Relational Databases - Strong proficiency in SQL and experience with Oracle databases, including schema design, query optimization, and stored procedures
  • NoSQL Databases - Experience with MongoDB, including data modeling, querying, and performance tuning
  • CI/CD & DevOps - Hands-on experience with CI/CD tools and practices (e.g., Jenkins, GitLab CI, GitHub Actions, Maven/Gradle, Docker, Kubernetes)
  • Version Control - Proficiency with Git and standard branching strategies (e.g., Gitflow)
  • Testing - Experience with unit testing frameworks (JUnit, Mockito) and integration testing
  • Web Technologies (Beneficial) - Familiarity with web services (SOAP/REST), XML, JSON
  • AI Tools & Methodologies - Demonstrable exposure and practical experience with AI development tools such as Devin, GitHub Copilot, Claude, Anti Gravity, and Codex
Job Responsibility
  • Lead the design, development, and implementation of complex middleware applications using Java and Spring Boot
  • Architect and optimize database interactions with Oracle, SQL, and MongoDB, ensuring high performance and data integrity
  • Drive the adoption and continuous improvement of CI/CD pipelines to facilitate rapid and reliable software delivery
  • Collaborate with cross-functional teams, including product management, QA, and operations, to define requirements, design solutions, and deliver high-quality software
  • Mentor and provide technical guidance to junior and mid-level software engineers, fostering a culture of technical excellence and continuous learning
  • Actively research and experiment with AI technologies to identify opportunities for enhancing developer productivity, automating tasks, and improving software quality
  • Participate in code reviews, ensuring adherence to coding standards, best practices, and architectural guidelines
  • Troubleshoot and resolve complex technical issues, ensuring the stability and performance of production systems
  • Contribute to the strategic planning and technical roadmap for our middleware platforms
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, and model development
What we offer
  • Discover the top benefits offered to our global workforce, designed to support your well-being, growth and work-life balance. Explore a few of the highlights that make working with us rewarding.
  • Full-time

Senior AI Engineer – Microsoft Fabric & Azure AI Foundry

We are looking for an experienced AI Engineer to lead the implementation of Azur...
Location: United States, New York City
Salary: 160000.00 - 220000.00 USD / Year
Valtech
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in cloud engineering, AI engineering, or data platform architecture
  • Strong hands-on experience with: Microsoft Fabric, Azure AI Foundry, Azure OpenAI, Azure Machine Learning, Azure Data Services
  • Experience integrating AI workloads into enterprise analytics platforms
  • Proficiency in Python and/or C#
  • Experience with REST APIs, SDKs, and AI orchestration frameworks
  • Knowledge of: Vector databases, Retrieval-Augmented Generation (RAG), Prompt engineering, Model evaluation and monitoring
  • Familiarity with DevOps practices including GitHub Actions or Azure DevOps
  • Strong understanding of enterprise security and governance
Job Responsibility
  • Design and implement AI solutions using Microsoft Azure AI Foundry within an existing Microsoft Fabric architecture
  • Integrate AI services with Fabric components including: Data Factory, OneLake, Power BI, Lakehouse and Warehouse environments, Real-Time Analytics
  • Build and operationalize generative AI and machine learning workflows
  • Configure and manage: Azure AI Services, Azure OpenAI, Model deployment pipelines, Prompt orchestration and evaluation
  • Establish secure connectivity between Azure AI Foundry and enterprise data sources
  • Implement governance, RBAC, security, compliance, and cost management controls
  • Develop reusable AI pipelines, APIs, and automation frameworks
  • Collaborate with platform teams to ensure scalability, observability, and production readiness
  • Support CI/CD and Infrastructure-as-Code deployment patterns
  • Provide technical leadership and documentation for AI platform adoption
What we offer
  • Flexibility, with remote and hybrid work options (country-dependent)
  • Career advancement, with international mobility and professional development programs
  • Learning and development, with access to cutting-edge tools, training and industry experts
  • Medical, dental, and vision insurance for you and your family, plus employer contributions to Health Savings Accounts
  • Full-time

Senior Platform Engineer - CI/CD & AI Automation (AI-first)

Groupon is undergoing a critical platform transformation, modernizing its core d...
Location: Czechia, Prague
Salary: Not provided
Groupon
Expiration Date: Until further notice
Requirements
  • 5+ years of dedicated experience in Platform Engineering, DevOps, or Infrastructure roles
  • Deep expertise building, scaling, and migrating CI/CD systems, with strong practical experience in Jenkins and/or GitHub Actions
  • Expertise in scripting and automation (Python, Go, or Bash)
  • Solid understanding of container technologies, Kubernetes, and cloud build systems
  • Proven experience leveraging AI tooling (e.g., Claude Code, code analysis) to meaningfully increase developer output and optimize platform work
  • Excellent communication and ability to drive technical decisions across multiple platform and product teams
Job Responsibility
  • Platform Transformation: Lead the design, planning, and execution of the Jenkins-to-GitHub Actions migration across a large portfolio of microservices
  • Pipeline Engineering: Design and optimize high-performance, secure, and observable CI/CD workflows across GitHub Actions, Jenkins, and Kubernetes environments
  • AI-First Automation: Drive an AI-First workflow by leveraging tools (e.g., Copilot, code generation) to eliminate infrastructure toil, accelerate development, and analyze pipeline failures
  • Core Automation: Develop robust platform automation (e.g., Python, Go, Bash) to improve build efficiency, artifact caching, reliability, and repository hygiene
  • Security & Compliance: Harden CI/CD infrastructure with robust controls for secrets management, RBAC, audit logging, and secure runner design
  • Observability: Implement and enhance CI/CD observability using tools like Prometheus, Grafana, and OpenTelemetry to provide deep insights into performance and reliability
  • Technical Leadership: Mentor engineers and partner across Cloud, Security, and Developer Experience teams to define and evolve our end-to-end delivery platform architecture

Senior AI Engineer (Agents)

A Senior AI Engineer (Agents) is a key technical contributor responsible for de...
Location: Ireland, Cork
Salary: Not provided
Marriott Bonvoy
Expiration Date: Until further notice
Requirements
  • 7+ years' commercial experience building and delivering complex, high-throughput web applications (serving 1M+ users)
  • Demonstrable track record building production-grade AI/agentic products
  • Proven record of delivering full-stack software at scale (1M+ users).
  • Experience building AI/ML applications, LLM integrations, or AI agent systems.
  • Experience integrating LLMs and AI capabilities into production applications.
  • Experience bringing at least one application from design to production. (Full SDLC experience)
  • Strong proficiency in Python, TypeScript, or JavaScript.
  • Strong understanding of software architecture, design patterns, and best practices.
  • Experience with REST APIs, GraphQL, or other API technologies for system integration.
  • Familiarity with cloud platforms (AWS, Azure, or GCP) and cloud services.
Job Responsibility
  • Design and implement AI agents and agentic workflows that solve business problems and integrate with existing systems.
  • Write clean, maintainable, well‑tested code following engineering best practices and coding standards.
  • Collaborate with product managers, business stakeholders, and other engineers to understand requirements and deliver solutions.
  • Participate in architecture discussions and contribute to technical design decisions for agent implementations.
  • Own agent features end‑to‑end, from design through implementation, testing, deployment, and monitoring.
  • Integrate AI agents with business systems, databases, APIs, and other enterprise services.
  • Debug and resolve production issues, ensuring agent reliability and performance.
  • Conduct code reviews and provide constructive feedback to peers.
  • Contribute to technical documentation, design docs, and runbooks for agent implementations.
  • Mentor junior engineers and contribute to team knowledge sharing.
  • Full-time