AIOps Support Engineer

United States, San Francisco · Employment contract · 160000.00 - 190000.00 USD / Year · Job Posted April 20, 2026

Job Description

This is a permanent AIOps Support Engineer position. The engineer will work onsite in San Francisco 4 days a week at the intersection of enterprise AI tooling, cloud operations, and end-user support, ensuring that AI platforms run reliably, securely, and at scale.

Job Responsibility

Work at the intersection of enterprise AI tooling, cloud operations, and end-user support, ensuring that AI platforms run reliably, securely, and at scale.

Requirements

  • Minimum of 3 years of experience in technical support, cloud operations, or IT engineering, preferably within financial services or healthcare environments at a mid-sized organization
  • At least 3 years of hands-on experience with Google Cloud Platform (GCP), including log monitoring, IAM management, and basic infrastructure troubleshooting
  • Minimum of 3 years of experience troubleshooting single sign-on (SSO) issues across identity platforms such as Okta, Azure AD, or similar solutions
  • Familiarity with enterprise AI platforms including ChatGPT (OpenAI), Claude (Anthropic), and Google Gemini Enterprise

Nice to have

  • Previous experience working in an Azure environment
  • Understanding of Netskope or comparable CASB/SSL inspection tools and their interaction with SaaS and AI services
  • Relevant cloud certifications (Google Associate Cloud Engineer, Microsoft Azure Administrator AZ-104, or equivalent)
  • Familiarity with AI governance, data loss prevention (DLP), or acceptable use policies for AI platforms in regulated industries

What we offer

Excellent benefits and compensation packages

Similar Jobs for AIOps Support Engineer

8 matching positions

Premium Support Engineer

We are looking to welcome a Premium Support Engineer to our growing team, with a...
Location: India, Bangalore
Salary: Not provided
Company: OpenText
Expiration Date: Until further notice
Requirements
  • Proficient in any of the following OpenText Operations Support Management solutions, including SMAX (Service Management Automation X), uCMDB (Universal Configuration Management Database), AIOps (AI Operations Management), and ODL (Operations Data Lake) with 8+ years of experience
  • Experienced in performing product integrations and upgrades
  • Strong understanding of high-level architecture and product configuration across both on-premises and cloud-based environments
  • Excellent problem solving and troubleshooting skills
  • Working knowledge of database systems such as Oracle, SQL Server, and/or PGSQL, as well as network architecture, firewalls, extranet security, virtual environments, backup, and high-availability structures
  • General knowledge of web servers, browsers and other internet applications
  • Solid understanding of operating system platforms, storage platforms, different database products, cloud deployment providers (AWS, Azure, GCP), application build platforms, and integration with external vendors like SAP, Microsoft, PeopleSoft, Salesforce, etc.
  • Strong relationship and team building skills, with the ability to negotiate and resolve conflict
  • Great communication, coordination, collaboration skills, and ability to navigate complex, matrixed organizations
  • Bachelor's degree preferred or Associate degree holder (technical field) with previous working experience in a customer support environment
Job Responsibility
  • Be the customer's single point of contact for support incidents opened for a specific product center
  • Develop an in-depth understanding of your customer's environment and implementation & develop a strong working relationship with customers
  • Leverage deep technical expertise and knowledge of your customer's environment to resolve incidents more efficiently
  • Provide timely updates on open incidents & coordinate with other OpenText experts as needed to expedite timely resolution
  • Apply best practices to help our customers minimize operational risks and avoid common pitfalls
  • Provide periodic supportability assessments & offer technical support mentoring to increase the customer's knowledge
  • Understand the operational profile of your customer's environment to improve the support that OpenText delivers
  • Act as a strategic partner in developing plans to proactively improve and maintain the customer's software investment
Employment type: Full-time

Premium Support Engineer

OPENTEXT - THE INFORMATION COMPANY OpenText is a global leader in information m...
Location: India, Mumbai
Salary: Not provided
Company: OpenText
Expiration Date: Until further notice
Requirements
  • Proficient in any of the following OpenText Operations Support Management solutions, including SMAX (Service Management Automation X), uCMDB (Universal Configuration Management Database), AIOps (AI Operations Management), and ODL (Operations Data Lake) with 8+ years of experience
  • Experienced in performing product integrations and upgrades
  • Strong understanding of high-level architecture and product configuration across both on-premises and cloud-based environments
  • Excellent problem solving and troubleshooting skills
  • Working knowledge of database systems such as Oracle, SQL Server, and/or PGSQL, as well as network architecture, firewalls, extranet security, virtual environments, backup, and high-availability structures
  • General knowledge of web servers, browsers and other internet applications
  • Solid understanding of operating system platforms, storage platforms, different database products, cloud deployment providers (AWS, Azure, GCP), application build platforms, and integration with external vendors like SAP, Microsoft, PeopleSoft, Salesforce, etc.
  • Strong relationship and team building skills, with the ability to negotiate and resolve conflict
  • Great communication, coordination, collaboration skills, and ability to navigate complex, matrixed organizations
  • Bachelor’s degree preferred or Associate degree holder (technical field) with previous working experience in a customer support environment
Job Responsibility
  • Be the customer's single point of contact for support incidents opened for a specific product center
  • Develop an in-depth understanding of your customer’s environment and implementation & develop a strong working relationship with customers
  • Leverage deep technical expertise and knowledge of your customer’s environment to resolve incidents more efficiently
  • Provide timely updates on open incidents & coordinate with other OpenText experts as needed to expedite timely resolution
  • Apply best practices to help our customers minimize operational risks and avoid common pitfalls
  • Provide periodic supportability assessments & offer technical support mentoring to increase the customer's knowledge
  • Understand the operational profile of your customer's environment to improve the support that OpenText delivers
  • Act as a strategic partner in developing plans to proactively improve and maintain the customer's software investment
Employment type: Full-time

Lead Support Engineer

We are seeking a Support Engineer Lead with strong expertise in Java application...
Location: United States, Malden
Salary: 105000.00 - 120000.00 USD / Year
Company: Tech Mahindra
Expiration Date: June 27, 2026
Requirements
  • A Bachelor’s or Higher Degree is the minimum entry required for the position
  • Java 11
  • 8 to 12 years of total experience
  • 7 to 10 years of experience
  • Minimum 4–5 years in Application/Infra Support
  • Experience in customer-facing onsite roles preferred
  • Strong Java application support (Spring, APIs, Microservices)
  • Hands-on AWS (EC2, Lambda, S3, RDS, CloudWatch)
  • Experience in Application + Infrastructure troubleshooting
  • Knowledge of Linux, networking basics, databases (SQL/NoSQL)
Job Responsibility
  • Lead L2/L3 support for Java-based applications and AWS infrastructure
  • Coordinate across application, middleware, and infrastructure layers for issue resolution
  • Manage production support operations (24x7/on-call rotation)
  • Drive incident triaging, RCA, and resolution
  • Ensure adherence to SLA / KPI (MTTR, availability, uptime)
  • Perform post-incident reviews and preventive actions
  • Manage and troubleshoot AWS services (EC2, S3, Lambda, RDS)
  • Ensure scalability, reliability, and cost optimisation of cloud systems
  • Work on monitoring, logging, alerting tools (New Relic, CloudWatch, etc.)
  • Act as the primary escalation point for customer incidents
What we offer
  • medical
  • vision
  • dental
  • life
  • disability insurance
  • paid time off (including holidays, parental leave, and sick leave)
Employment type: Full-time

Platform Engineer – AIOps & Infrastructure

The Platform Engineer – AIOps & Infrastructure will be responsible for designing...
Salary: Not provided
Company: Solvedex
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or equivalent experience
  • 5+ years of experience in Platform Engineering, DevOps, Cloud Infrastructure, SRE, MLOps, or related fields
  • Strong experience with AWS, Azure, or GCP
  • Hands-on expertise with Kubernetes, Docker, and Infrastructure-as-Code tools (Terraform, CloudFormation, or similar)
  • Experience building CI/CD pipelines and automation workflows
  • Strong scripting skills using Python, Bash, or similar languages
  • Experience with monitoring and observability platforms such as Grafana, Prometheus, Datadog, or ELK
  • Advanced English proficiency (B2 - C1)
  • Comfortable working remotely with minimal supervision
  • Proactive, detail-oriented, and collaborative
Job Responsibility
  • Design and maintain scalable cloud-native infrastructure for AI/ML workloads
  • Manage Kubernetes environments, container orchestration, and platform services
  • Build and optimize CI/CD pipelines and Infrastructure-as-Code frameworks
  • Support MLOps and LLMOps workflows, including deployment, monitoring, and lifecycle management
  • Implement monitoring, logging, alerting, and observability solutions
  • Drive DevSecOps, automation, security, and reliability best practices
  • Collaborate with AI Engineers, Data Scientists, and Infrastructure teams to support production AI systems
  • Participate in troubleshooting, incident response, and platform optimization initiatives
Employment type: Full-time

Principal AIOps Engineer

We’re building a world of health around every individual — shaping a more connec...
Location: United States
Salary: 144200.00 - 288400.00 USD / Year
Company: CVS Health
Expiration Date: July 01, 2026
Requirements
  • 10+ years of experience in SRE, production operations supporting highly available services along with experience with Product model
  • Proven technical leadership: ability to set direction, lead cross-team initiatives, and advise stakeholders through architecture reviews, tradeoffs, and operational readiness
  • Strong programming/scripting skills (Python preferred) and experience building automation, integrations, and APIs
  • Experience integrating observability platforms and event sources across hybrid environments (cloud/on-prem) and operating production-grade monitoring/event management at scale
  • Strong ServiceNow experience as an ITSM system of record (Incident/Problem/Change; CMDB/asset concepts). Ability to build and operate integrations at scale (REST, webhooks, event management) to support automation and auditability
  • Python (preferred) for automation and data/ML pipelines; experience building integrations, services, and operational tooling
  • Workflow orchestration and integrations (ServiceNow APIs, event pipelines, runbook automation) with strong reliability, security, and auditability practices
  • Observability: Prometheus/Grafana, OpenTelemetry, ELK/Splunk/Datadog (or equivalent)
Job Responsibility
  • Lead the AIOps strategy, roadmap, and operating model (intake, triage, automation lifecycle, KPIs) to measurably improve MTTR, alert quality, and operational efficiency
  • Own the observability-to-AIOps pipeline (metrics, logs, traces, events) and drive standardization of telemetry, service health models, and actionable alerting across teams and platforms
  • Design and implement event intelligence: correlation, deduplication, suppression, anomaly detection, incident clustering, and probable-cause analysis using topology/CMDB context
  • Advise operations, service owners, and leadership stakeholders; lead change enablement, adoption, and value measurement for AIOps and agentic automation across the organization
  • Develop ServiceNow-centric AIOps integrations (ITSM + ITOM/Event Management where applicable): event ingestion, alert-to-incident policies, enrichment, assignment/routing, approvals, change workflows, and closure updates for auditable closed-loop ops
  • Establish governance for operational AI (risk controls, approvals, auditability, data access, prompt/response logging, evaluation, and continuous improvement) in partnership with security, compliance, and operations
  • Build and operationalize agentic AI workflows for incident triage and resolution: signal summarization, similar-incident retrieval, knowledge article drafting, ticket updates, stakeholder communications, and human-in-the-loop remediation
  • Enable closed-loop automation and self-healing by connecting AIOps detections to orchestrated actions (runbooks/workflows), with clear approvals, safety checks, and rollback paths
  • Partner with NOC/SOC, infrastructure, and application owners to onboard services into AIOps, define service models, and improve signal quality, escalation paths, and operational readiness
What we offer
  • Medical, dental, and vision coverage
  • Paid time off
  • Retirement savings options
  • Wellness programs
  • Bonus, commission or short-term incentive program
  • Equity award program
Employment type: Full-time

Principal Site Reliability Engineer (AIOps)

Palo Alto Networks runs a large hybrid infrastructure and is one of the largest ...
Location: United States, Santa Clara
Salary: 151600.00 - 245300.00 USD / Year
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • BS or MS in Computer Science, a related field, or equivalent professional experience
  • Expertise in configuration management with a framework such as Ansible, Terraform, Helm
  • Experience in Production Engineering, DevOps, or Site Reliability
  • Expertise in private or public cloud
  • Strong Linux administration, internals, and network troubleshooting
  • Proficiency with programming languages like Python, Golang, and shell scripting to automate tasks
  • Familiarity with CI/CD pipelines, GitLab and GitHub preferred
  • Ability to diagnose and troubleshoot complex distributed systems handling high volume transactions
  • Excellent written and verbal communication, able to collaborate and rally support
  • Self-disciplined, self-managed, self-motivated and strong sense of ownership, urgency, and drive
Job Responsibility
  • Contribute to the success of SRE and DevOps
  • Develop expertise in new technologies
  • Work with developers, researchers, data scientists, and security experts
  • Design, build and operate reliable, secure Cloud infrastructure
  • Ensure that applications are production-ready, scalable, and reliable
  • Develop tools and automation frameworks
  • Automate the robust deployment of services
  • Orchestrate end-to-end monitoring and alerting
  • Participate with SRE and Dev teams in the on-call rotation
  • Lead root cause analysis of critical business and production issues
What we offer
  • restricted stock units
  • bonus
  • employee benefits
Employment type: Full-time

Principal Engineer Software (AIOps)

At Palo Alto Networks®, we're united by a shared mission—to protect our digital ...
Location: United States, Santa Clara
Salary: 147000.00 - 237500.00 USD / Year
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • Must have 5+ years of hands-on experience in building large enterprise applications
  • Must have extensive hands-on programming skills in Java and distributed systems
  • Deep understanding of design patterns
  • Good communication skills and ability to work in a fast-paced environment
Job Responsibility
  • Tackle new and challenging problems by building a new generation of highly scaled data processing and analytics systems
  • Contribute to the architecture, design, and development of features
  • Solve complex problems in pipeline scaling and data storage to facilitate dashboards
  • Suggest and implement improvements to the development processes
  • Work with DevOps and Technical Support teams to investigate and resolve critical customer defects.
What we offer
  • Restricted stock units
  • Bonus
  • Employee benefits
Employment type: Full-time

Principal Engineer Software (AIOps)

Strata Logging Service (SLS) powers advanced cybersecurity innovations by provid...
Location: United States, Santa Clara
Salary: 147000.00 - 237500.00 USD / Year
Company: Palo Alto Networks Italia
Expiration Date: Until further notice
Requirements
  • Must have 5+ years of hands-on experience in building large enterprise applications
  • Must have extensive hands-on programming skills in Java and distributed systems
  • Deep understanding of design patterns
  • Good communication skills and ability to work in a fast-paced environment
Job Responsibility
  • Tackle new and challenging problems by building a new generation of highly scaled data processing and analytics systems
  • Contribute to the architecture, design, and development of features
  • Solve complex problems in pipeline scaling and data storage to facilitate dashboards
  • Suggest and implement improvements to the development processes
  • Work with DevOps and Technical Support teams to investigate and resolve critical customer defects
Employment type: Full-time