ABOUT THE JOB
Job Responsibilities:
Act as the Level 3 escalation point for complex application, network, and infrastructure issues
Monitor system performance and proactively identify root causes of slowness or downtime
Implement and enhance observability practices (logs, metrics, traces)
Automate repetitive tasks to reduce manual effort and improve system reliability
Define, monitor, and improve SLAs, SLOs, and SLIs
Collaborate with architects and teams to improve CI/CD pipelines and deployment processes
Manage and support hybrid environments (AWS, Azure, and on-premise systems)
Coordinate with cross-functional teams to ensure smooth operations and incident resolution
Requirements:
3–4 years of experience in Site Reliability Engineering, Cloud Infrastructure, or L3 Application Support
Strong experience with hybrid infrastructure (on-premise and AWS cloud)
Hands-on experience with monitoring and observability tools (Dynatrace, Datadog, AWS CloudWatch, CloudTrail)
Experience with logs, metrics, and distributed tracing concepts
Familiarity with load testing tools (e.g., k6 or similar)
Experience with scripting languages such as Python or Bash
Solid understanding of network/system monitoring and API integrations
Strong analytical and problem-solving skills, especially in identifying performance bottlenecks
Nice to have:
Experience with Azure environments
Professional proficiency in English
What we offer:
Global Diversity: Be part of an international team representing 110+ nationalities
Trust and Growth: 70% of our leaders started at entry level
Continuous Learning: internal Academy and over 250 training modules
Vibrant Culture: afterworks, networking events
Meaningful Impact: CSR initiatives, including the WeCare Together program