Senior AIOps Engineer (Platform & Infrastructure)

Groupon

Location:

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Groupon is moving beyond "experimenting" with AI to running it at massive scale. As we transition to an AI-First organization, we are building a centralized AIOps team to solve a critical challenge: moving AI features from fragmented prototypes to high-performing, cost-efficient production reality. As a Senior AIOps Engineer, you won't just be managing servers; you will be the architect of the "Golden Paths"—the reusable, automated infrastructure that enables our product teams to ship LLMs, Vector Search, and AI Agents faster than ever before.

Job Responsibility:

  • Architect the AI Stack: Design and operate core infrastructure on Kubernetes, including Vector Databases, LLM Gateways (LiteLLM), and workflow automation tools (n8n)
  • Enable at Scale: Drive AI adoption by creating self-service "Golden Paths" using Terraform and Helm, allowing engineering teams to deploy RAG pipelines with one click
  • Operational Excellence: Implement centralized observability, tracing (Langfuse), and governance to ensure our AI systems are reliable, auditable, and secure
  • Fiscal Discipline: Own the "AI Bill"—monitoring token usage and latency to optimize spend while maintaining high performance

Requirements:

  • 5+ years in Platform Engineering, SRE, or DevOps within a cloud-native environment
  • Deep experience managing stateful and stateless workloads (Helm, Istio, Docker)
  • Hands-on experience deploying and operating AI/ML tools or data-intensive systems in production
  • Strong skills in Python or Go to build custom API wrappers and automate operational tasks
  • Expertise in Prometheus, Grafana, and ELK stack to ensure end-to-end observability of complex AI requests

What we offer:

  • End-to-end Ownership: Real authority to standardize how a global company builds with AI
  • Career Growth: This is a high-visibility role within a new, strategic team with potential for leadership progression

Additional Information:

Job Posted:
January 31, 2026

Work Type:
Hybrid work

Similar Jobs for Senior AIOps Engineer (Platform & Infrastructure)

Technology Outbound Product Manager

Join the innovators of OpsRamp as its technology product management leader, resp...
Location
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in marketing, engineering, computer science, or a related field
  • MBA or advanced technical degree preferred
  • 4+ years of experience in technical marketing, product marketing, product management, or pre-sales in the observability, ITOM, log management, SaaS and enterprise software, or IT infrastructure industries
  • Knowledge/experience with SaaS software preferred
  • Public cloud experience is a plus
  • Knowledge of application modernization (e.g., Kubernetes) and automation (Python, pipelines, PowerShell, etc.) is a plus
  • Proven track record of developing and executing successful GTM strategies and campaigns that drive awareness, demand generation, and market leadership
  • Excellent written and verbal communication skills, with the ability to distill complex technical concepts into clear, concise, and compelling messaging and content
  • Strong analytical skills and experience conducting market and competitive analysis to identify key trends, insights, and opportunities
  • Ability to work effectively in a fast-paced, dynamic environment with cross-functional teams and multiple stakeholders
Job Responsibility
  • Develop and execute technical evangelism strategies to drive awareness, demand generation, and market leadership for OpsRamp solutions
  • Collaborate with product management and engineering teams to deeply understand product features, capabilities, and roadmaps, and translate them into compelling value propositions, messaging, and content
  • Create and maintain a wide range of technical collateral, including whitepapers, solution briefs, presentations, videos, demos, and blog posts
  • Drive the creation and delivery of technical enablement materials to support technical sales, partners, and customers, including training presentations, FAQs, and technical guides
  • Conduct market and competitive analysis to identify key trends, insights, and opportunities to differentiate OpsRamp in the ITOM market
  • Serve as a technical evangelist and spokesperson for OpsRamp at industry events, conferences, webinars, and customer meetings
  • Collaborate with product marketing and corporate marketing teams to develop technical content that drives engagement, leads, and pipeline
  • Gather key customer and target audience insights to inform product positioning and messaging as well as the product roadmap
  • Contribute to GTM strategy and messaging, and help maintain technical accuracy of marketing messages.
What we offer
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Fulltime

Lead / Principal Software Engineer

We’re hiring Lead and Principal Software Engineers to build the next generation ...
Location
Australia, Sydney
Salary:
Not provided
Blume Global
Expiration Date
Until further notice
Requirements
  • 10+ years building scalable, fault-tolerant systems and enterprise software
  • Strong experience with backend architecture, platform modernization, and CI/CD
  • Proficiency in C#, Java, Python, SQL, and JavaScript
  • Experience with cloud infrastructure (AWS, Kinesis, Lambda) and DevOps tools (Docker, Kubernetes, Jenkins)
  • Proven ability to lead technical decisions, mentor engineers, and improve team productivity
  • Strong experience integrating and evaluating AI tools like GitHub Copilot and AIOps in real-world engineering workflows
  • Strong communication across product, compliance, and engineering teams
  • Track record of aligning technical work with business outcomes and customer value
Job Responsibility
  • Build the next generation of our platforms
  • Work on high-scale systems that process billions of transactions
  • Modernize core infrastructure
  • Drive AI initiatives to improve performance and reliability
  • Set technical direction
  • Mentor senior engineers
  • Shape architecture across multiple domains
What we offer
  • Competitive Package + Equity
  • Find the team/project that fits you best
  • Hybrid and Flexible Work
  • Continuous Learning and Growth
  • Access learning platforms (Coursera, Pluralsight, LinkedIn Learning, WiseTech Academy), mentorship, and development opportunities
  • Top-Tier Hardware
  • Onsite Meals and Snacks

Software Engineering Director

We are seeking an experienced Software Engineering Director to lead the company’...
Location
United Kingdom, London
Salary:
Not provided
AWTG
Expiration Date
Until further notice
Requirements
  • Proven experience (10+ years) in software engineering, technical leadership, or similar roles, with at least 3 years in a senior management capacity
  • Strong background in software development, architecture, and systems design
  • Extensive experience in implementing AI-first software
  • Proven experience in AI development and AIOps implementation
  • Experience with various cloud platforms (GCP, AWS, Azure, etc.) and DevOps tools
  • Demonstrated ability to scale technical teams and deliver complex software projects on time and on budget
  • Experience creating solutions that have cloud, web, and mobile app components
  • In-depth knowledge of cybersecurity, data privacy regulations, and compliance standards
  • In-depth knowledge of various AI methodologies and learning algorithms
  • Proven experience with programming languages and frameworks such as Python, Java, React, and C#, with domain-specific languages, and with native and cross-platform development
Job Responsibility
  • Define and oversee the company’s technical vision, strategy, software development, and product roadmap
  • Align technology initiatives with the company’s vision, business objectives and growth strategies
  • Evaluate and implement emerging technologies to maintain a competitive edge
  • Implement an AI-first software vision on products, platforms and solutions
  • Secure internal and external funding for development of new technologies and innovations
  • Manage P&L for the entire Software Division
  • Develop products and platforms that are ready to accelerate and sustain growth
  • Lead revenue generation activities, including ensuring that bids and proposals are of top quality
  • Build, lead, and mentor a high-performing team of developers, engineers, and IT professionals
  • Foster a culture of innovation, collaboration, and continuous improvement within software engineering and product teams
  • Fulltime

Principal Customer Success Manager

The Customer Success Architect position is a technical champion within the Custo...
Location
United States, New York
Salary:
115500.00 - 266000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • 10-15 years of experience, preferably in the IT management (ITOM)/APM fields
  • At least 5 years of experience in senior customer-facing positions as an Implementation Architect, Service Delivery Architect, or Lead Solution Architect
  • In-depth knowledge and hands-on experience in one or more of the following: Observability, Process Automation, Patching, AIOps
  • An in-depth understanding of infrastructure management and intelligent automation is preferred
  • Familiarity with cloud-native design patterns, microservices, and modern web-scale architectures
  • Excellent written and oral communication skills; analytical, self-motivated, and a quick on-the-job learner
  • Ability to multitask effectively between initiatives with minimal oversight while maintaining a positive customer service attitude
Job Responsibility
  • Be the trusted partner for the customer on use cases and product functionality
  • Lead customers in the application of OpsRamp product and service offerings to meet their business outcomes
  • Develop a deep understanding of OpsRamp IT Operations Platform, architecture, and its capabilities through training and hands-on experience
  • Build on the technical design and architecture developed during the implementation phase to maintain a point-in-time architecture for each customer
  • Serve as an important source for information regarding the customer’s technical needs and provide customer feedback
  • Perform and own the health checks during the customer success engagement lifecycle in a client environment
  • Understand and document client use cases and build best practice enablement and content packs for the various use cases
  • Track support and feature requirements and interface with the Product and Engineering team where required
  • Establish technical authority quickly with executive technical customer stakeholders
  • Invest time in documenting best practices, capturing and disseminating knowledge, and other initiatives.
What we offer
  • Flexibility to manage work and personal needs
  • Health and emotional wellbeing support
  • Personal and professional development programs
  • Unconditional inclusion
  • Career growth and skill application programs.
  • Fulltime

Principal Engineer

The Principal AI/ML Operations Engineer leads the architecture, automation, and ...
Location
United States, Pleasanton, California
Salary:
251000.00 - 314500.00 USD / Year
BlackLine
Expiration Date
Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field
  • 10+ years in ML infrastructure, DevOps, and software system architecture
  • 4+ years in leading MLOps or AI Ops platforms
  • Strong programming skills in languages such as Python, Java, or Scala
  • Expertise in ML frameworks (TensorFlow, PyTorch, scikit-learn) and orchestration tools (Airflow, Kubeflow, Vertex AI, MLflow)
  • Proven experience operating production pipelines for ML and LLM-based systems across cloud ecosystems (GCP, AWS, Azure)
  • Deep familiarity with LangChain, LangGraph, ADK or similar agentic system runtime management
  • Strong competencies in CI/CD, IaC, and DevSecOps pipelines integrating testing, compliance, and deployment automation
  • Hands-on experience with observability stacks (Prometheus, Grafana, New Relic) for model and agent performance tracking
  • Understanding of governance frameworks for Responsible AI, auditability, and cost metering across training and inference workloads
Job Responsibility
  • Define enterprise-level standards and reference architectures for ML-Ops and AIOps systems
  • Partner with data science, security, and product teams to set evaluation and governance standards (Guardrails, Bias, Drift, Latency SLAs)
  • Mentor senior engineers and drive design reviews for ML pipelines, model registries, and agentic runtime environments
  • Lead incident response and reliability strategies for ML/AI systems
  • Lead the deployment of AI models and systems in various environments
  • Collaborate with development teams to integrate AI solutions into existing workflows and applications
  • Ensure seamless integration with different platforms and technologies
  • Define and manage MCP Registry for agentic component onboarding, lifecycle versioning, and dependency governance
  • Build CI/CD pipelines automating LLM agent deployment, policy validation, and prompt evaluation of workflows
  • Develop and operationalize experimentation frameworks for agent evaluations, scenario regression, and performance analytics
What we offer
  • Short-term and long-term incentive programs
  • Robust offering of benefit and wellness plans
  • Fulltime

Pipefitters/Pipefitter Helpers

PC seeks Pipefitters/Pipefitter Helpers with at least two years of experience. R...
Location
United States, New Smyrna Beach
Salary:
Not provided
PC Construction
Expiration Date
Until further notice
Requirements
  • At least two years of experience
  • Local applicants interested in water/wastewater construction are preferred
Job Responsibility
  • Assembly and installation of piping and mechanical equipment for air, chemical, and water systems
What we offer
  • Profit sharing bonus
  • 401(k) with a generous company match
  • Employee stock ownership plan (ESOP)
  • Health, dental, vision, company paid disability, life insurance and leaves
  • Paid time off and holidays upon hire
  • Annual reviews and training and development opportunities with career growth

Manager, Quality

This is where you make a difference in our patients’ safety. As a member of the ...
Location
United States, Skaneateles Falls
Salary:
112000.00 - 154000.00 USD / Year
Baxter
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in a relevant scientific discipline
  • Minimum of 5 years of experience in quality assurance, compliance, or quality systems management within the pharmaceutical industry
  • Minimum 2-3 years’ experience managing people
  • In-depth knowledge of relevant regulatory requirements, including cGMP, ICH guidelines, and pharmacopeial standards
  • Proven track record of successfully managing and implementing quality systems initiatives
  • Experience with electronic document management systems and other quality management software tools is desirable
  • Excellent written and verbal communication, presentation, and facilitation skills
  • Strong negotiation skills and significant experience in interacting with regulatory authorities
  • Risk identification and problem-solving skills
  • Demonstrated ability to lead, mentor, and develop others
Job Responsibility
  • Responsible for ensuring that the company’s operations and activities comply with all relevant laws, regulations, and industry standards within the segment
  • Ensure the implementation and maintenance of robust quality systems in accordance with regulatory requirements and industry standards
  • Work closely with various departments, including legal, regulatory affairs, quality assurance, and operations, to develop, implement, and maintain effective compliance programs and processes
  • Develop, implement, and maintain quality systems policies, procedures, and processes to ensure compliance with regulatory requirements, including FDA, and other relevant authorities
  • Provide strategic leadership and oversight for the management of quality systems, including but not limited to document control, change control, deviation management, CAPA, training, and audits
  • Design and deliver comprehensive compliance training programs for employees at all levels
  • Understand and deploy processes to assure conformance to regulations in a mid- to large-size plant of a large program or department
  • Manage regulatory inspections
  • Conduct regular assessments to identify potential compliance vulnerabilities and develop strategies to mitigate risks effectively
  • Monitor changes in laws, regulations, and industry trends to anticipate and address emerging compliance issues proactively
What we offer
  • Support for Parents
  • Continuing Education/ Professional Development
  • Employee Health & Well-Being Benefits
  • Paid Time Off
  • 2 Days a Year to Volunteer
  • Medical and dental coverage that starts on day one
  • Insurance coverage for basic life, accident, short-term and long-term disability, and business travel accident insurance
  • Employee Stock Purchase Plan (ESPP), with the ability to purchase company stock at a discount
  • 401(k) Retirement Savings Plan (RSP), with options for employee contributions and company matching
  • Flexible Spending Accounts
  • Fulltime

Data Center Technician

As a Microsoft Data Center Technician (DCT), you will stage, set up and perform ...
Location
South Korea, Seoul
Salary:
Not provided
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • College degree or equivalent experience and basic knowledge of computer hardware and components
  • 1+ year(s) experience supporting IT equipment or related technology
  • Fluent in Korean reading, writing and speaking
  • Ability to meet Microsoft, customer and/or government security screening requirements
Job Responsibility
  • Performs diagnostics and troubleshooting following standard procedures, quickly identifies the cause(s) of issues, and replaces faulty components with minimal customer and business disruption
  • Performs post-execution quality checks and verifies that grounding, staging, labeling, and cabling are set up properly according to safety protocols, deployment standards, and planned Network Design Tasks (NDTs)
  • Decommissions hardware for simple changes and refreshes (e.g., memory upgrades, rebuilds) following standard procedures with minimal guidance
  • Follows procedures to communicate, report, and escalate incidents to appropriate Microsoft data center operations management units, Technician Leads, and engineering specialists
  • Assists and provides guidance to other technicians to complete challenging or complex tasks
  • Completes required training aligned to the role and workload
  • Observes more experienced technicians to gain hands-on experience and relevant on-the-job training
  • Contributes to a positive and effective team environment by sharing information with others, contributing to regular team meetings, asking questions, and staying apprised of the status of others' work
  • Has pride and a sense of accountability for the service quality, completeness, and resulting user experience
  • Displays accountability and ownership of the data center facilities
What we offer
  • Training and growth opportunities including Career Rotation Programs, Diversity & Inclusion training and events, and professional certifications
  • Fulltime