DevOps Engineer (Observability) Job at Teradata

DevOps Engineer (Observability)

Working on a team of professionals, you will manage, configure, and support obse...

Location

Mexico

Salary:

Not provided

Teradata

Expiration Date

Until further notice

Requirements

Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three
2+ years of administrative-level experience with Grafana or an equivalent observability tool, including but not limited to onboarding users; defining group policies; authoring monitors, alerts, and dashboards; and integrating with other enterprise applications such as ServiceNow
Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform
Strong scripting skills with a modern programming language such as Python
Experience with a configuration management tool such as Ansible or Puppet
Experience with a build/deployment automation tool such as Jenkins or Bamboo
Experience with at least one modern source control tool, preferably Git

Job Responsibility

Manage, configure, and support observability tooling for Teradata's product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud)
Define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed
Build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences
Monitor all layers of Teradata's application stack, from the customer-facing interface through the backend, including all services, network layers, databases, and cloud service provider integrations
Constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata's observability tooling

Fulltime

AWS Agent Core Engineer / DevOps (Observability Focus)

We are looking for an experienced engineer to build and enhance observability ca...

Location

Poland

Salary:

Not provided

Intellias

Expiration Date

Until further notice

Requirements

5+ years of experience working as a DevOps / Platform Engineer
Strong experience with AWS (EKS, EC2, VPC, RDS, Route53, API Gateway, Lambda)
Hands-on experience with Terraform (AWS, Kubernetes/Helm, HashiCorp Vault)
Hands-on experience with observability tools: New Relic, OpenTelemetry
Strong knowledge of Kubernetes
Strong programming skills in Python (scripting, FastAPI, Swagger) and Bash / PowerShell
Solid understanding of monitoring, logging, and distributed tracing concepts
Experience with containerization (Docker, Kubernetes)
Experience with CI/CD tools (Jenkins, GitLab)
Experience with configuration management tools (Ansible, Chef, Puppet)

Job Responsibility

Design and implement observability frameworks for agent-based and distributed systems
Build and maintain monitoring, logging, and tracing pipelines
Develop dashboards and alerts to ensure system health and performance visibility
Analyze system behavior and identify performance bottlenecks and anomalies
Ensure high availability and reliability of runtime components
Integrate observability tools with AWS infrastructure and CI/CD pipelines
Support incident response, troubleshooting, and root cause analysis
Collaborate with platform and AI teams to improve system transparency and operability

Fulltime

DevOps Engineer - Observability

Location

Poland

Salary:

Not provided

Lingaro

Expiration Date

Until further notice

Requirements

Azure - cloud management experience
Knowledge of the Azure cloud and core services
Ability to design the architecture of cloud environments (nice to have)
Grafana, Prometheus - ability to work with metrics monitoring and visualization tools
Docker - experience managing applications running in containers
Kubernetes - experience working with a container orchestration platform
Terraform - knowledge and experience in Infrastructure as Code
GitHub
Azure Repos
Knowledge of CI/CD processes

Job Responsibility

Monitoring infrastructure (creating Grafana dashboards and alerts, collecting Prometheus metrics)
Implementing automated management features, such as performance monitoring, diagnostics, and failover
Configuring implemented solutions in accordance with system security processes
Managing CI (continuous integration) systems and pipelines
Designing and implementing infrastructure

What we offer

Stable employment
“Office as an option” model
Workation
Great Place to Work® certified employer
Flexibility regarding your preferred form of contract
Comprehensive online onboarding program with a “Buddy” from day 1
Cooperation with top-tier engineers and experts
Unlimited access to the Udemy learning platform from day 1
Certificate training programs
Upskilling support

Fulltime

Senior DevOps Engineer (Observability)

You will enable our machine learning team, data engineers, and applications team...

Location

United States, New York

Salary:

180000.00 - 225000.00 USD / Year

EvolutionIQ

Expiration Date

Until further notice

Requirements

7+ years of DevOps experience
Extensive experience designing and running production systems on GCP
Deep exposure to and familiarity with networking concepts, Kubernetes clusters, Docker, containerized development, Terraform, Helm, Dagster (DE), and ArgoCD
Experience with production operations and working with product engineering teams
Experience integrating with SIEM and security software, such as vulnerability scanners
You know the critical questions to ask in order to understand a client’s business problem and can show the business impact of your technical solutions
Team player who is solutions-oriented
You have crisp written and verbal communication skills

Job Responsibility

Improve and further our observability stack across GCP infrastructure and applications
Drive consistency and operational excellence across all teams
Enable the data engineering team to use Dagster efficiently
Leverage tools like Terraform, GitHub Actions, Helm, and ArgoCD to build efficient infrastructure-as-code pipelines
Ensure industry-standard security controls in our cloud environments
Institute a culture of reliability in a federated ownership environment

What we offer

Medical, dental, vision, short & long-term disability, life insurance and AD&D, and 401k matching
Additional family, wellness, and pet benefits
Paid time off and sick leave, 100% paid parental leave (16 weeks for primary caregivers and 12 weeks for secondary caregivers)
We offer a flexible schedule for new parents returning to work
Catered lunches, happy hours, pet-friendly spaces, and monthly technology stipend
$1,000/year for each employee for professional development, as well as opportunities for tuition reimbursement
An annual bonus plan and company equity plan (RSUs) are also included in our compensation package

Fulltime

Site Reliability Engineer / Observability Engineer

Rackspace is building up its Professional Services Center of Excellence on Appli...

Location

Egypt, Giza

Salary:

Not provided

Rackspace

Expiration Date

Until further notice

Requirements

Bachelor’s degree in engineering/computer science or equivalent
Senior-level experience with site reliability engineering; DevOps; code-level application support and troubleshooting; AWS infrastructure design, implementation, and optimization; and automation for deployment, scaling, and reliability
Experience with observability tools such as Splunk, Datadog, and SignalFx
Experience deploying, maintaining and supporting software applications/services in the AWS ecosystem
Proactive approach to identifying problems and solutions
Experience writing code with one or more interpreted languages such as Python, PHP, Perl, Ruby, Linux Shell
Experience with Terraform or CloudFormation scripting
Experience with configuration management tools like Ansible, Chef or Puppet
Experience with standard software development best practices and tools such as code repositories (Git preferred)
Experience executing in an agile software development environment

Job Responsibility

Work with customers and implement Observability solutions
Build and maintain scalable systems and robust automation that supports engineering goals
Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance
Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, capacity planning and fault isolation
Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability, security and performance standards
Collaborate with team members to document and share solutions
Maintain a deep understanding of the customer’s business as well as their technical environment
Identify performance bottlenecks and anomalous system behavior, and resolve the root cause of service issues

Fulltime

Sr. DevOps Engineer

We are looking for a Sr. DevOps Engineer to help strengthen and scale a cloud-ba...

Location

United States, San Francisco

Salary:

Not provided

Robert Half

Expiration Date

Until further notice

Requirements

5+ years of experience in DevOps, platform engineering, infrastructure engineering, site reliability engineering, or a closely related field
Hands-on experience working with AWS in production environments
Strong background using Terraform for infrastructure automation; exposure to Ansible is a plus
Practical experience building and supporting CI/CD pipelines within Git-based development workflows
Programming ability in Python and at least one additional scripting or programming language used for automation
Experience supporting PostgreSQL databases in production settings
Familiarity with serverless architectures and containerized services in cloud environments

Job Responsibility

Develop, manage, and optimize cloud infrastructure that supports the company's live SaaS environment
Create and maintain delivery pipelines for application services and machine learning workloads to enable efficient releases
Oversee deployments across serverless components and container-based applications while ensuring consistency and reliability
Administer and support a PostgreSQL data environment, including performance, availability, and operational upkeep
Strengthen system dependability by improving monitoring, alerting, logging, and overall platform observability
Automate provisioning and release processes through infrastructure-as-code practices using tools such as Terraform
Collaborate with backend and machine learning teams to address infrastructure needs and enable smooth deployment workflows
Participate in production support activities, including incident response and an on-call rotation
Contribute to operational improvements that reduce manual effort and increase platform stability over time

What we offer

Medical, vision, dental, and life and disability insurance
401(k) plan

Fulltime

Senior DevOps Engineer

NTT Data is looking for a Senior DevOps Engineer with strong experience in infra...

Location

Mexico, Guadalajara

Salary:

Not provided

NTT DATA

Expiration Date

Until further notice

Requirements

Strong hands-on DevOps experience with GitHub Actions and Terraform
Experience building and supporting cloud platforms, preferably on AWS
Strong understanding of cloud deployment, reliability, monitoring, scalability, and security practices
Experience setting up multi-source CI/CD pipelines
Hands-on experience with Docker and Kubernetes for model/application deployment
Experience with Datadog for application monitoring, observability, dashboards, alerts, logs, and metrics
Experience integrating SonarQube for code quality, static analysis, and quality gates
Exposure to multi-cloud environments, including Azure and GCP integration
Strong troubleshooting skills across infrastructure, CI/CD, containers, cloud services, and monitoring tools
Ability to work with cross-functional teams including developers, architects, security, QA, and operations

Job Responsibility

Design, build, and maintain scalable DevOps platforms and CI/CD pipelines using GitHub Actions and related automation tools
Develop and manage Infrastructure as Code using Terraform for cloud infrastructure provisioning, deployment, and lifecycle management
Support cloud platform creation, deployment, reliability, and security, with AWS as the preferred primary cloud platform
Build and maintain multi-source pipelines integrating code repositories, infrastructure modules, security scans, quality gates, deployment workflows, and monitoring
Deploy and manage containerized applications using Docker and Kubernetes
Implement application and infrastructure monitoring using Datadog, including dashboards, logs, metrics, alerts, and operational visibility
Integrate code quality and security controls using SonarQube within CI/CD workflows
Support multi-cloud integrations across AWS, Azure, and GCP, where required
Collaborate with application teams to improve deployment reliability, release automation, environment consistency, and operational efficiency
Partner with security teams to enforce DevSecOps practices, vulnerability controls, secrets management, and compliance requirements

Fulltime

AWS Cloud DevOps Engineer - VOIS

We are seeking a cloud and DevOps professional to design, implement, and operate...

Location

India, Pune

Salary:

Not provided

Vodafone

Expiration Date

Until further notice

Requirements

Degree in IT or a related discipline
Experienced in cloud computing or enterprise IT environments
Proficient in AWS services, Kubernetes, Terraform, and CI/CD tooling
Comfortable working with Git, GitOps practices, and automation-first approaches
Knowledgeable in cloud security, compliance, and vulnerability management
Able to analyse, troubleshoot, and resolve complex infrastructure issues
Clear communicator who values documentation, collaboration, and continuous improvement
Familiar with Agile ways of working and enterprise service management tools

Job Responsibility

Manage project-driven integration and day-to-day administration of AWS cloud solutions
Design, build, and deploy AWS cloud infrastructure using Terraform and Terragrunt
Develop automation through CI/CD pipelines using tools such as GitHub Actions and ArgoCD
Operate and maintain AWS EKS, CET-EKS, and CET-EKS-O11Y platforms, including lifecycle management and upgrades
Design and deploy hybrid connectivity solutions across CET-based AWS accounts
Administer Kubernetes clusters, including capacity management, security, and compliance
Implement GitOps practices using GitHub, ArgoCD, Helm charts, and EKS
Manage observability platforms, including Grafana dashboards and monitoring stacks
Support AWS RDS infrastructure operations, including upgrades, rightsizing, and maintenance
Identify and implement cost-optimisation opportunities using AWS Cost Explorer and Budgets

What we offer

Opportunities to work on large-scale, business-critical cloud platforms
Exposure to modern DevOps, GitOps, and cloud-native practices in a global environment
A collaborative culture focused on learning, automation, and continuous improvement
The chance to influence platform stability and cloud adoption across Vodafone

Fulltime
