DevOps Engineer (Observability)

India · Job Posted March 21, 2026

Job Description

Working on a team of professionals, you will manage, configure, and support observability tooling for Teradata’s product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud). You will define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed. You will build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences. You will monitor all the layers of Teradata’s application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations. You will constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata’s observability tooling.

Job Responsibility

  • Manage, configure, and support observability tooling for Teradata’s product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud)
  • Define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed
  • Build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences
  • Monitor all the layers of Teradata’s application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations
  • Constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata’s observability tooling
  • Work closely with product engineering and cloud operations personnel to help administer all aspects of Teradata’s observability tooling in pre-production and production environments
  • Work with security and compliance teams to help provide evidence necessary to meet Teradata’s compliance obligations

Requirements

  • Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three
  • 2+ years of administrative-level experience with Grafana or an equivalent observability tool, including but not limited to onboarding users; defining group policies; authoring monitors, alerts, and dashboards; and integrating with other enterprise applications such as ServiceNow
  • Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform
  • Strong scripting skills with a modern programming language such as Python
  • Experience with a configuration management tool such as Ansible or Puppet
  • Experience with a build/deployment automation tool such as Jenkins or Bamboo
  • Experience with at least one modern source control tool, preferably Git
  • Experience with at least one modern defect tracking tool, preferably Jira
  • Familiarity with both SQL and NoSQL databases, and use cases for each
  • Experience administering Linux-based systems
  • 3 to 4 years of experience in the software industry in a DevOps or site reliability engineering role
  • An in-depth understanding of infrastructure-level and application-level monitoring principles and practice, across both production and non-production environments
  • An understanding of enterprise software deployment and security/compliance principles
  • Proficiency with multi-layered technical troubleshooting and root-cause analysis
  • The ability to quickly and comprehensively decompose a problem, identifying dependencies and defining tasks
  • The ability to work both independently and collaboratively in a fast-paced environment, and adjust as priorities change
  • The flexibility to work on a globally distributed team managed from the United States

What we offer

  • People-first culture
  • Flexible work model
  • Focus on well-being
  • Inclusive environment

Similar Jobs for DevOps Engineer (Observability)

8 matching positions

DevOps Engineer (Observability)

Working on a team of professionals, you will manage, configure, and support obse...
Location: Mexico
Salary: Not provided
Company: Teradata
Expiration Date: Until further notice
Requirements
  • Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three
  • 2+ years of administrative-level experience with Grafana or an equivalent observability tool, including but not limited to onboarding users; defining group policies; authoring monitors, alerts, and dashboards; and integrating with other enterprise applications such as ServiceNow
  • Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform
  • Strong scripting skills with a modern programming language such as Python
  • Experience with a configuration management tool such as Ansible or Puppet
  • Experience with a build/deployment automation tool such as Jenkins or Bamboo
  • Experience with at least one modern source control tool, preferably Git
Job Responsibility
  • Manage, configure, and support observability tooling for Teradata's product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud)
  • Define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed
  • Build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences
  • Monitor all the layers of Teradata's application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations
  • Constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata's observability tooling
Employment type: Full-time

AWS Agent Core Engineer / DevOps (Observability Focus)

We are looking for an experienced engineer to build and enhance observability ca...
Location: Poland
Salary: Not provided
Company: Intellias
Expiration Date: Until further notice
Requirements
  • 5+ years of experience working as a DevOps / Platform Engineer
  • Strong experience with AWS (EKS, EC2, VPC, RDS, Route53, API Gateway, Lambda)
  • Hands-on experience with Terraform (AWS, Kubernetes/Helm, HashiCorp Vault)
  • Hands-on experience with observability tools: New Relic, OpenTelemetry
  • Strong knowledge of Kubernetes
  • Strong programming skills in Python (scripting, FastAPI, Swagger) and Bash / PowerShell
  • Solid understanding of monitoring, logging, and distributed tracing concepts
  • Experience with containerization (Docker, Kubernetes)
  • Experience with CI/CD tools (Jenkins, GitLab)
  • Experience with configuration management tools (Ansible, Chef, Puppet)
Job Responsibility
  • Design and implement observability frameworks for agent-based and distributed systems
  • Build and maintain monitoring, logging, and tracing pipelines
  • Develop dashboards and alerts to ensure system health and performance visibility
  • Analyze system behavior and identify performance bottlenecks and anomalies
  • Ensure high availability and reliability of runtime components
  • Integrate observability tools with AWS infrastructure and CI/CD pipelines
  • Support incident response, troubleshooting, and root cause analysis
  • Collaborate with platform and AI teams to improve system transparency and operability
Employment type: Full-time

DevOps Engineer - Observability

Location: Poland
Salary: Not provided
Company: Lingaro
Expiration Date: Until further notice
Requirements
  • Azure - cloud management experience
  • Knowledge of the Azure cloud and core services
  • Ability to design the architecture of cloud environments (nice to have)
  • Grafana, Prometheus - ability to work with metrics monitoring and visualization tools
  • Docker - experience managing applications running in containers
  • Kubernetes - experience working with a container orchestration platform
  • Terraform - knowledge and experience in Infrastructure as Code
  • GitHub
  • Azure Repos
  • Knowledge of CI/CD processes
Job Responsibility
  • Monitoring infrastructure (Creating Grafana Dashboards, Alerts, Collecting Prometheus Metrics)
  • Implement automated management features, such as performance monitoring, diagnostics and failover
  • Configuring implemented solutions in accordance with system security processes
  • Managing CI (continuous integration) systems and pipelines
  • Designing and implementing infrastructure
What we offer
  • Stable employment
  • “Office as an option” model
  • Workation
  • Great Place to Work® certified employer
  • Flexibility regarding your preferred form of contract
  • Comprehensive online onboarding program with a “Buddy” from day 1
  • Cooperation with top-tier engineers and experts
  • Unlimited access to the Udemy learning platform from day 1
  • Certificate training programs
  • Upskilling support
Employment type: Full-time

Senior DevOps Engineer (Observability)

You will enable our machine learning team, data engineers, and applications team...
Location: United States, New York
Salary: 180,000 - 225,000 USD / Year
Company: EvolutionIQ
Expiration Date: Until further notice
Requirements
  • 7+ years of DevOps experience
  • Extensive experience designing and running production systems on GCP
  • Deep exposure to and familiarity with networking concepts, Kubernetes clusters, Docker, containerized development, Terraform, Helm, Dagster (DE), and ArgoCD
  • Experience with production operations and working with product engineering teams
  • Experience integrating with SIEM and security software, such as vulnerability scanners
  • You know the critical questions to ask in order to understand a client’s business problem and can show the business impact of your technical solutions
  • Team player who is solutions-oriented
  • You have crisp written and verbal communication skills
Job Responsibility
  • Improve and further our observability stack across GCP infrastructure and applications
  • Drive consistency and operational excellence across all teams
  • Enable the data engineering team to use Dagster efficiently
  • Leverage tools like Terraform, GitHub Actions, Helm, and ArgoCD to build efficient infrastructure-as-code pipelines
  • Ensure industry-standard security controls in our cloud environments
  • Institute a culture of reliability in a federated ownership environment
What we offer
  • Medical, dental, vision, short & long-term disability, life insurance and AD&D, and 401k matching
  • Additional family, wellness, and pet benefits
  • Paid time off and sick leave, 100% paid parental leave (16 weeks for primary caregivers and 12 weeks for secondary caregivers)
  • We offer a flexible schedule for new parents returning to work
  • Catered lunches, happy hours, pet-friendly spaces, and monthly technology stipend
  • $1,000/year for each employee for professional development, as well as opportunities for tuition reimbursement
  • An annual bonus plan and company equity plan (RSUs) are also included in our compensation package
Employment type: Full-time

Site Reliability Engineer / Observability Engineer

Rackspace is building up its Professional Services Center of Excellence on Appli...
Location: Egypt, Giza
Salary: Not provided
Company: Rackspace
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in engineering/computer science or equivalent
  • Senior-level experience with site reliability engineering; DevOps; code-level application support and troubleshooting; AWS infrastructure design, implementation, and optimization; and automation for deployment, scaling, and reliability
  • Experience with observability tools such as Splunk, Datadog, SignalFx, etc.
  • Experience deploying, maintaining and supporting software applications/services in the AWS ecosystem
  • Proactive approach to identifying problems and solutions
  • Experience writing code with one or more interpreted languages such as Python, PHP, Perl, Ruby, Linux Shell
  • Experience with Terraform or CloudFormation scripting
  • Experience with configuration management tools like Ansible, Chef or Puppet
  • Experience with standard software development best practices and tools such as code repositories (Git preferred)
  • Experience executing in an agile software development environment
Job Responsibility
  • Work with customers and implement Observability solutions
  • Build and maintain scalable systems and robust automation that supports engineering goals
  • Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance
  • Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, capacity planning and fault isolation
  • Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability, security and performance standards
  • Collaborate with team members to document and share solutions
  • Maintain a deep understanding of the customer’s business as well as their technical environment
  • Identify performance bottlenecks and anomalous system behavior, and resolve the root cause of service issues
Employment type: Full-time

Sr. DevOps Engineer

We are looking for a Sr. DevOps Engineer to help strengthen and scale a cloud-ba...
Location: United States, San Francisco
Salary: Not provided
Company: Robert Half
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in DevOps, platform engineering, infrastructure engineering, site reliability engineering, or a closely related field
  • Hands-on experience working with AWS in production environments
  • Strong background using Terraform for infrastructure automation; exposure to Ansible is a plus
  • Practical experience building and supporting CI/CD pipelines within Git-based development workflows
  • Programming ability in Python and at least one additional scripting or programming language used for automation
  • Experience supporting PostgreSQL databases in production settings
  • Familiarity with serverless architectures and containerized services in cloud environments
Job Responsibility
  • Develop, manage, and optimize cloud infrastructure that supports the company's live SaaS environment
  • Create and maintain delivery pipelines for application services and machine learning workloads to enable efficient releases
  • Oversee deployments across serverless components and container-based applications while ensuring consistency and reliability
  • Administer and support a PostgreSQL data environment, including performance, availability, and operational upkeep
  • Strengthen system dependability by improving monitoring, alerting, logging, and overall platform observability
  • Automate provisioning and release processes through infrastructure-as-code practices using tools such as Terraform
  • Collaborate with backend and machine learning teams to address infrastructure needs and enable smooth deployment workflows
  • Participate in production support activities, including incident response and an on-call rotation
  • Contribute to operational improvements that reduce manual effort and increase platform stability over time
What we offer
  • Medical, vision, dental, and life and disability insurance
  • 401(k) plan
Employment type: Full-time

Senior DevOps Engineer

NTT Data is looking for a Senior DevOps Engineer with strong experience in infra...
Location: Mexico, Guadalajara
Salary: Not provided
Company: NTT DATA
Expiration Date: Until further notice
Requirements
  • Strong hands-on DevOps experience with GitHub Actions and Terraform
  • Experience building and supporting cloud platforms, preferably on AWS
  • Strong understanding of cloud deployment, reliability, monitoring, scalability, and security practices
  • Experience setting up multi-source CI/CD pipelines
  • Hands-on experience with Docker and Kubernetes for model/application deployment
  • Experience with Datadog for application monitoring, observability, dashboards, alerts, logs, and metrics
  • Experience integrating SonarQube for code quality, static analysis, and quality gates
  • Exposure to multi-cloud environments, including Azure and GCP integration
  • Strong troubleshooting skills across infrastructure, CI/CD, containers, cloud services, and monitoring tools
  • Ability to work with cross-functional teams including developers, architects, security, QA, and operations
Job Responsibility
  • Design, build, and maintain scalable DevOps platforms and CI/CD pipelines using GitHub Actions and related automation tools
  • Develop and manage Infrastructure as Code using Terraform for cloud infrastructure provisioning, deployment, and lifecycle management
  • Support cloud platform creation, deployment, reliability, and security, with AWS as the preferred primary cloud platform
  • Build and maintain multi-source pipelines integrating code repositories, infrastructure modules, security scans, quality gates, deployment workflows, and monitoring
  • Deploy and manage containerized applications using Docker and Kubernetes
  • Implement application and infrastructure monitoring using Datadog, including dashboards, logs, metrics, alerts, and operational visibility
  • Integrate code quality and security controls using SonarQube within CI/CD workflows
  • Support multi-cloud integrations across AWS, Azure, and GCP, where required
  • Collaborate with application teams to improve deployment reliability, release automation, environment consistency, and operational efficiency
  • Partner with security teams to enforce DevSecOps practices, vulnerability controls, secrets management, and compliance requirements
Employment type: Full-time

AWS Cloud DevOps Engineer - VOIS

We are seeking a cloud and DevOps professional to design, implement, and operate...
Location: India, Pune
Salary: Not provided
Company: Vodafone
Expiration Date: Until further notice
Requirements
  • Degree in IT or a related discipline
  • Experienced in cloud computing or enterprise IT environments
  • Proficient in AWS services, Kubernetes, Terraform, and CI/CD tooling
  • Comfortable working with Git, GitOps practices, and automation-first approaches
  • Knowledgeable in cloud security, compliance, and vulnerability management
  • Able to analyse, troubleshoot, and resolve complex infrastructure issues
  • Clear communicator who values documentation, collaboration, and continuous improvement
  • Familiar with Agile ways of working and enterprise service management tools
Job Responsibility
  • Manage project-driven integration and day-to-day administration of AWS cloud solutions
  • Design, build, and deploy AWS cloud infrastructure using Terraform and Terragrunt
  • Develop automation through CI/CD pipelines using tools such as GitHub Actions and ArgoCD
  • Operate and maintain AWS EKS, CET-EKS, and CET-EKS-O11Y platforms, including lifecycle management and upgrades
  • Design and deploy hybrid connectivity solutions across CET-based AWS accounts
  • Administer Kubernetes clusters, including capacity management, security, and compliance
  • Implement GitOps practices using GitHub, ArgoCD, Helm charts, and EKS
  • Manage observability platforms, including Grafana dashboards and monitoring stacks
  • Support AWS RDS infrastructure operations, including upgrades, rightsizing, and maintenance
  • Identify and implement cost-optimisation opportunities using AWS Cost Explorer and Budgets
What we offer
  • Opportunities to work on large-scale, business-critical cloud platforms
  • Exposure to modern DevOps, GitOps, and cloud-native practices in a global environment
  • A collaborative culture focused on learning, automation, and continuous improvement
  • The chance to influence platform stability and cloud adoption across Vodafone
Employment type: Full-time