AIOps Support Engineer Job at ClearBridge Technology Group (San Francisco)

Premium Support Engineer

We are looking to welcome a Premium Support Engineer to our growing team, with a...

Location

India, Bangalore

Salary:

Not provided

OpenText

Expiration Date

Until further notice

Requirements

Proficient in any of the following OpenText Operations Support Management solutions, including SMAX (Service Management Automation X), uCMDB (Universal Configuration Management Database), AIOps (AI Operations Management), and ODL (Operations Data Lake) with 8+ years of experience
Experienced in performing product integrations and upgrades
Strong understanding of high-level architecture and product configuration across both on-premises and cloud-based environments
Excellent problem solving and troubleshooting skills
Working knowledge of database systems such as Oracle, SQL Server, and/or PostgreSQL, as well as network architecture, firewalls, extranet security, virtual environments, backup, and high-availability structures
General knowledge of web servers, browsers and other internet applications
Solid understanding of Operating System platforms, Storage platforms, different database products, cloud deployment providers (AWS, Azure, GCP), Application build platforms, and integration with external vendors like SAP, Microsoft, Peoplesoft, Salesforce, etc.
Strong relationship and team building skills, with the ability to negotiate and resolve conflict
Great communication, coordination, collaboration skills, and ability to navigate complex, matrixed organizations
Bachelor's degree preferred or Associate degree holder (technical field) with previous working experience in a customer support environment

Job Responsibility

Be the customer's single point of contact for support incidents opened for a specific product center
Develop an in-depth understanding of your customer's environment and implementation & develop a strong working relationship with customers
Leverage deep technical expertise and knowledge of your customer's environment to resolve incidents more efficiently
Provide timely updates on open incidents & coordinate with other OpenText experts as needed to expedite timely resolution
Apply best practices to help our customers minimize operational risks and avoid common pitfalls
Provide periodic supportability assessments & offer technical support mentoring to increase the customer's knowledge
Understand the operational profile of your customer's environment to improve the support that OpenText delivers
Act as a strategic partner in developing plans to proactively improve and maintain the customer's software investment

Fulltime

Premium Support Engineer

OPENTEXT - THE INFORMATION COMPANY OpenText is a global leader in information m...

Location

India, Mumbai

Salary:

Not provided

OpenText

Expiration Date

Until further notice

Requirements

Proficient in any of the following OpenText Operations Support Management solutions, including SMAX (Service Management Automation X), uCMDB (Universal Configuration Management Database), AIOps (AI Operations Management), and ODL (Operations Data Lake) with 8+ years of experience
Experienced in performing product integrations and upgrades
Strong understanding of high-level architecture and product configuration across both on-premises and cloud-based environments
Excellent problem solving and troubleshooting skills
Working knowledge of database systems such as Oracle, SQL Server, and/or PostgreSQL, as well as network architecture, firewalls, extranet security, virtual environments, backup, and high-availability structures
General knowledge of web servers, browsers and other internet applications
Solid understanding of Operating System platforms, Storage platforms, different database products, cloud deployment providers (AWS, Azure, GCP), Application build platforms, and integration with external vendors like SAP, Microsoft, Peoplesoft, Salesforce, etc.
Strong relationship and team building skills, with the ability to negotiate and resolve conflict
Great communication, coordination, collaboration skills, and ability to navigate complex, matrixed organizations
Bachelor’s degree preferred or Associate degree holder (technical field) with previous working experience in a customer support environment

Job Responsibility

Be the customer's single point of contact for support incidents opened for a specific product center
Develop an in-depth understanding of your customer’s environment and implementation & develop a strong working relationship with customers
Leverage deep technical expertise and knowledge of your customer’s environment to resolve incidents more efficiently
Provide timely updates on open incidents & coordinate with other OpenText experts as needed to expedite timely resolution
Apply best practices to help our customers minimize operational risks and avoid common pitfalls
Provide periodic supportability assessments & offer technical support mentoring to increase the customer's knowledge
Understand the operational profile of your customer's environment to improve the support that OpenText delivers
Act as a strategic partner in developing plans to proactively improve and maintain the customer's software investment

Fulltime

Lead Support Engineer

We are seeking a Support Engineer Lead with strong expertise in Java application...

Location

United States, Malden

Salary:

105000.00 - 120000.00 USD / Year

Tech Mahindra

Expiration Date

June 27, 2026

Requirements

A Bachelor’s or Higher Degree is the minimum entry required for the position
Java 11
8.00 to 12.00 Years Total Experience
7 to 10 Years experience
Minimum 4–5 years in Application/Infra Support
Experience in customer-facing onsite roles preferred
Strong Java application support (Spring, APIs, Microservices)
Hands-on AWS (EC2, Lambda, S3, RDS, CloudWatch)
Experience in Application + Infrastructure troubleshooting
Knowledge of Linux, networking basics, databases (SQL/NoSQL)

Job Responsibility

Lead L2/L3 support for Java-based applications and AWS infrastructure
Coordinate across application, middleware, and infrastructure layers for issue resolution
Manage production support operations (24x7/on-call rotation)
Drive incident triaging, RCA, and resolution
Ensure adherence to SLA / KPI (MTTR, availability, uptime)
Perform post-incident reviews and preventive actions
Manage and troubleshoot AWS services (EC2, S3, Lambda, RDS)
Ensure scalability, reliability, and cost optimisation of cloud systems
Work on monitoring, logging, alerting tools (New Relic, CloudWatch, etc.)
Act as the primary escalation point for customer incidents

What we offer

medical
vision
dental
life
disability insurance
paid time off (including holidays, parental leave, and sick leave)

Fulltime

Platform Engineer – AIOps & Infrastructure

The Platform Engineer – AIOps & Infrastructure will be responsible for designing...

Location

Salary:

Not provided

Solvedex

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Engineering, Information Systems, or equivalent experience
5+ years of experience in Platform Engineering, DevOps, Cloud Infrastructure, SRE, MLOps, or related fields
Strong experience with AWS, Azure, or GCP
Hands-on expertise with Kubernetes, Docker, and Infrastructure-as-Code tools (Terraform, CloudFormation, or similar)
Experience building CI/CD pipelines and automation workflows
Strong scripting skills using Python, Bash, or similar languages
Experience with monitoring and observability platforms such as Grafana, Prometheus, Datadog, or ELK
Advanced English proficiency (B2 - C1)
Comfortable working remotely with minimal supervision
Proactive, detail-oriented, and collaborative

Job Responsibility

Design and maintain scalable cloud-native infrastructure for AI/ML workloads
Manage Kubernetes environments, container orchestration, and platform services
Build and optimize CI/CD pipelines and Infrastructure-as-Code frameworks
Support MLOps and LLMOps workflows, including deployment, monitoring, and lifecycle management
Implement monitoring, logging, alerting, and observability solutions
Drive DevSecOps, automation, security, and reliability best practices
Collaborate with AI Engineers, Data Scientists, and Infrastructure teams to support production AI systems
Participate in troubleshooting, incident response, and platform optimization initiatives

Fulltime

Principal AIOps Engineer

We’re building a world of health around every individual — shaping a more connec...

Location

United States

Salary:

144200.00 - 288400.00 USD / Year

CVS Health

Expiration Date

July 01, 2026

Requirements

10+ years of experience in SRE, production operations supporting highly available services along with experience with Product model
Proven technical leadership: ability to set direction, lead cross-team initiatives, and advise stakeholders through architecture reviews, tradeoffs, and operational readiness
Strong programming/scripting skills (Python preferred) and experience building automation, integrations, and APIs
Experience integrating observability platforms and event sources across hybrid environments (cloud/on-prem) and operating production-grade monitoring/event management at scale
Strong ServiceNow experience as an ITSM system of record (Incident/Problem/Change, CMDB/asset concepts). Ability to build and operate integrations at scale (REST, webhooks, event management) to support automation and auditability
Python (preferred) for automation and data/ML pipelines; experience building integrations, services, and operational tooling
Workflow orchestration and integrations (ServiceNow APIs, event pipelines, runbook automation) with strong reliability, security, and auditability practices
Observability: Prometheus/Grafana, OpenTelemetry, ELK/Splunk/Datadog (or equivalent)

Job Responsibility

Lead the AIOps strategy, roadmap, and operating model (intake, triage, automation lifecycle, KPIs) to measurably improve MTTR, alert quality, and operational efficiency
Own the observability-to-AIOps pipeline (metrics, logs, traces, events) and drive standardization of telemetry, service health models, and actionable alerting across teams and platforms
Design and implement event intelligence: correlation, deduplication, suppression, anomaly detection, incident clustering, and probable-cause analysis using topology/CMDB context
Advise operations, service owners, and leadership stakeholders; lead change enablement, adoption, and value measurement for AIOps and agentic automation across the organization
Develop ServiceNow-centric AIOps integrations (ITSM + ITOM/Event Management where applicable): event ingestion, alert-to-incident policies, enrichment, assignment/routing, approvals, change workflows, and closure updates for auditable closed-loop ops
Establish governance for operational AI (risk controls, approvals, auditability, data access, prompt/response logging, evaluation, and continuous improvement) in partnership with security, compliance, and operations
Build and operationalize agentic AI workflows for incident triage and resolution: signal summarization, similar-incident retrieval, knowledge article drafting, ticket updates, stakeholder communications, and human-in-the-loop remediation
Enable closed-loop automation and self-healing by connecting AIOps detections to orchestrated actions (runbooks/workflows), with clear approvals, safety checks, and rollback paths
Partner with NOC/SOC, infrastructure, and application owners to onboard services into AIOps, define service models, and improve signal quality, escalation paths, and operational readiness

What we offer

Medical, dental, and vision coverage
Paid time off
Retirement savings options
Wellness programs
Bonus, commission or short-term incentive program
Equity award program

Fulltime

Principal Site Reliability Engineer (AIOps)

Palo Alto Networks runs a large hybrid infrastructure and is one of the largest ...

Location

United States, Santa Clara

Salary:

151600.00 - 245300.00 USD / Year

Palo Alto Networks

Expiration Date

Until further notice

Requirements

BS or MS in Computer Science, a related field, or equivalent professional experience
Expertise in configuration management with a framework such as Ansible, Terraform, Helm
Experience in Production Engineering, DevOps, or Site Reliability
Expertise in private or public cloud
Strong Linux administration, internals, and network troubleshooting
Proficiency with programming languages like Python, Golang, and shell scripting to automate tasks
Familiarity with CI/CD pipelines, GitLab and GitHub preferred
Ability to diagnose and troubleshoot complex distributed systems handling high volume transactions
Excellent written and verbal communication, able to collaborate and rally support
Self-disciplined, self-managed, self-motivated and strong sense of ownership, urgency, and drive

Job Responsibility

Contribute to the success of SRE and DevOps
Develop expertise in new technologies
Work with developers, researchers, data scientists, and security experts
Design, build and operate reliable, secure Cloud infrastructure
Ensure that applications are production-ready, scalable, and reliable
Develop tools and automation frameworks
Automate the robust deployment of services
Orchestrate end-to-end monitoring and alerting
Participate with SRE and Dev teams in the on-call rotation
Lead root cause analysis of critical business and production issues

What we offer

restricted stock units
bonus
employee benefits

Fulltime

Principal Engineer Software (AIOps)

At Palo Alto Networks®, we're united by a shared mission—to protect our digital ...

Location

United States, Santa Clara

Salary:

147000.00 - 237500.00 USD / Year

Palo Alto Networks

Expiration Date

Until further notice

Requirements

Must have 5+ years of hands-on experience in building large enterprise applications
Must have extensive hands-on programming skills in Java and distributed systems
Deep understanding of design patterns
Good communication skills and ability to work in a fast-paced environment.

Job Responsibility

Tackle new and challenging problems by building a new generation of highly scaled data processing and analytics systems
Contribute to the architecture, design, and development of features
Solve complex problems in pipeline scaling and data storage to facilitate dashboards
Suggest and implement improvements to the development processes
Work with DevOps and Technical Support teams to investigate and resolve critical customer defects.

What we offer

Restricted stock units
Bonus
Employee benefits

Fulltime

Principal Engineer Software (AIOps)

Strata Logging Service (SLS) powers advanced cybersecurity innovations by provid...

Location

United States, Santa Clara

Salary:

147000.00 - 237500.00 USD / Year

Palo Alto Networks Italia

Expiration Date

Until further notice

Requirements

Must have 5+ years of hands-on experience in building large enterprise applications
Must have extensive hands-on programming skills in Java and distributed systems
Deep understanding of design patterns
Good communication skills and ability to work in a fast-paced environment

Job Responsibility

Tackle new and challenging problems by building a new generation of highly scaled data processing and analytics systems
Contribute to the architecture, design, and development of features; solve complex problems in pipeline scaling and data storage to facilitate dashboards
Suggest and implement improvements to the development processes
Work with DevOps and Technical Support teams to investigate and resolve critical customer defects

Fulltime
