Shape the future of trust in the age of AI. At Oscilar, we're building the most advanced AI Risk Decisioning™ Platform. Banks, fintechs, and digitally native organizations rely on us to manage their fraud, credit, and compliance risk with the power of AI. Oscilar is growing fast, and so is the complexity of our systems. We're looking for an experienced SRE to take ownership of reliability across our multi-region, cloud-native platform. You'll have the mandate and autonomy to design, implement, and evolve systems that stay performant and resilient through traffic spikes, dependency failures, and global deployments. You'll shape how we scale, how we build observability, and how we run infrastructure that supports billions of events and large-scale data pipelines.
Job Responsibilities:
Architect and operate resilient cloud infrastructure (AWS, Pulumi, Kubernetes)
Lead initiatives to improve availability, latency, and performance at scale
Design and evolve our CI/CD pipelines to optimize for speed, safety, and repeatability
Define the metrics, alerts, and runbooks that form our observability backbone
Run chaos experiments and failure simulations to harden the platform
Mentor engineers and set best practices for SRE across the company
Requirements:
Proven track record as a senior SRE or Infrastructure Engineer in high-scale environments
Expert-level skills in AWS and Infrastructure as Code (Pulumi, Terraform)
Strong programming ability in Go or Python (we use Go)
Deep understanding of distributed systems (Kafka, ClickHouse) and microservices architecture
Mastery of container orchestration (Kubernetes) and production debugging
Strong sense of ownership, and the judgment to balance velocity with reliability