Platform Engineer - Compute

Portugal · Job Posted March 22, 2026

Job Description

The Platform Engineering Compute team is responsible for the overall cloud platform architecture that supports the cloud services provided to our customers and internal developers. As a Professional Platform Engineer you will own meaningful areas of our platform, contribute to design decisions, and help improve our automation and tooling. You'll work with other platform engineers to develop and maintain the systems that support Feedzai's cloud service and enable faster, more nimble product delivery.

Job Responsibility

  • Design, build, and maintain Kubernetes Operators and platform services, including deployment, monitoring, and operations
  • Develop in Go or similar languages, following team standards and contributing to best practices
  • Automate cloud infrastructure and incident response; improve self-healing and reliability
  • Develop and refine playbooks, runbooks, and alerting to streamline response procedures
  • Maintain and improve the product deployment pipeline and GitOps practices (e.g. FluxCD, Argo CD)
  • Participate in incident response, root cause analysis, and resolution; contribute to post-incident improvements
  • Work with AI-assisted development tools (e.g. Cursor) as part of your daily workflow to ship faster and iterate effectively
  • Maintain and extend Infrastructure as Code (IaC) and platform lifecycle (monitoring, alerting, security, cost, configuration, backup) in production
  • Help improve developer experience and platform capabilities for product teams

Requirements

  • A bachelor's degree in Computer Science, Information Systems, or the equivalent combination of education, experience, and training
  • 2+ years of hands-on experience in platform engineering, DevOps, or cloud infrastructure
  • Strong programming skills in Go, Java, or similar, with experience building and maintaining systems
  • Hands-on experience with container technologies and orchestration (Docker, Kubernetes)
  • Experience with CI/CD (e.g. Jenkins, GitLab) and GitOps tools (e.g. FluxCD, Argo CD)
  • Experience working with at least one major cloud provider (AWS or GCP) and cloud-native patterns
  • Experience with monitoring and observability (e.g. Grafana, Prometheus)
  • Experience with Infrastructure-as-Code (e.g. Terraform, Crossplane) and platform lifecycle management
  • Self-driven, collaborative, and motivated to learn and improve how we build and run the platform

Nice to have

  • Excellent communication skills, both written and verbal
  • Comfort with AI-augmented development tools (e.g. Cursor) and willingness to adopt new tooling
  • Kubernetes, cloud, or programming certifications or equivalent are valued

Similar Jobs for Platform Engineer - Compute

8 matching positions

Advanced Platform Engineer - Compute

Feedzai is the world’s first RiskOps platform for financial risk management, and...
Location: Portugal
Salary: Not provided
Feedzai
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Information Systems, or the equivalent combination of education, experience, and training
  • 6+ years of hands-on experience in platform engineering, DevOps, or cloud infrastructure
  • Strong programming skills in Go, Java, or similar, with a track record of designing and delivering maintainable systems
  • Deep experience with container technologies and orchestration (Docker, Kubernetes), including operator development or ecosystem tooling
  • Proven experience with CI/CD (e.g. Jenkins, GitLab) and GitOps (e.g. FluxCD, Argo CD)
  • Substantial experience with at least one major cloud provider (AWS, GCP, Azure) and familiarity with cloud-native patterns
  • Strong experience with monitoring and observability (e.g. Grafana, Prometheus) and using data to drive reliability and performance
  • Solid experience with Infrastructure-as-Code (e.g. Terraform, Crossplane) and platform lifecycle management
  • Track record of leading projects, driving technical decisions, and mentoring others
  • Self-driven, collaborative, and motivated to improve how we build and run the platform
Job Responsibility
  • Lead the design, implementation, and evolution of Kubernetes Operators and platform services, including deployment, monitoring, and operations
  • Drive development in Go or similar languages, setting standards and best practices for the team
  • Own and evolve automation for cloud infrastructure and incident response, and champion self-healing and reliability improvements
  • Define and improve playbooks, runbooks, and alerting strategies to streamline response and reduce toil
  • Own and advance the product deployment pipeline and GitOps practices (e.g. FluxCD, Argo CD)
  • Lead or coordinate incident response, root cause analysis, and post-incident reviews; drive preventive measures
  • Work with AI-assisted development tools (e.g. Cursor) as part of your daily workflow to ship faster and iterate effectively
  • Own and extend Infrastructure as Code (IaC) and platform lifecycle (monitoring, alerting, security, cost, configuration, backup) in production
  • Contribute to developer experience and internal platform capabilities so product teams can ship with less friction
Employment type: Full-time

Software Engineer, Compute Platform

We are seeking talented distributed systems engineers who are passionate about b...
Location: United States, Foster City
Salary: 130000.00 - 290000.00 USD / Year
Replit
Expiration Date: Until further notice
Requirements
  • Distributed systems: Track record of working with platform-as-a-service, distributed storage, or information retrieval systems. Experience in designing scalable architectures and optimizing systems for latency or cost
  • Problem-solving mindset: Ability to approach complex challenges pragmatically and devise effective solutions
  • Self-directed and autonomous: Able to work independently, set priorities, and drive projects forward
  • Versatility and flexibility: Able to wear multiple hats and tackle a wide range of challenges
  • Continuous learning and adaptability: Passionate about staying up-to-date with industry trends and expanding your skill set
Job Responsibility
  • Expand Replit's cloud infrastructure offerings: Launch new cloud products to be used by Replit Agent to build complex apps
  • Enhance reliability and scalability: Identify bottlenecks, optimize critical paths, and implement robust monitoring and alerting systems
  • Improve utilization of cloud infrastructure: Analyze our infrastructure costs and identify opportunities for optimization
What we offer
  • Competitive Salary & Equity
  • 401(k) Program with a 4% match
  • Health, Dental, Vision and Life Insurance
  • Short Term and Long Term Disability
  • Paid Parental, Medical, Caregiver Leave
  • Commuter Benefits
  • Monthly Wellness Stipend
  • Autonomous Work Environment
  • In Office Set-Up Reimbursement
  • Flexible Time Off (FTO) + Holidays
Employment type: Full-time

Senior Software Engineer - Compute Platform

We are seeking a strong Senior Engineer to contribute to the design, development...
Location: India, Bangalore
Salary: Not provided
Uber
Expiration Date: Until further notice
Requirements
  • 8+ years of software engineering experience, including expertise in distributed systems or infrastructure engineering
  • Bachelor's degree in Computer Science or a related field
  • Experience in Golang, Java, Python, C/C++
  • Background in large-scale backend infrastructure
  • Knowledge of cluster management solutions such as Mesos or Kubernetes
  • Understanding of container technologies such as Docker or containerd
  • Knowledge of operating systems and the Linux kernel
Job Responsibility
  • Design, build, and enhance core components of Uber’s Kubernetes-based Compute Platform, focusing on reliability, scalability, and global availability
  • Implement and optimize Kubernetes controllers, operators, CRDs, and multi-cluster management features to support diverse workloads across on-prem and cloud environments
  • Work on runtime systems—containerd, Docker, CRI-O—improving image lifecycle, sandboxing, security, and end-to-end pod execution performance
  • Develop and evolve the infrastructure abstraction layers and APIs that enable developers to deploy, manage, and scale stateful, batch, and mission-critical services with minimal operational overhead
  • Lead technical initiatives around scheduling, autoscaling, resource management, and workload placement to improve cluster efficiency and ensure high availability
  • Collaborate with cross-functional teams including Networking, Storage, ML Infra, Developer Productivity, and Data Platform to build solutions and elevate the overall developer experience
  • Debug, troubleshoot, and resolve complex issues across Linux systems, container runtimes, Kubernetes control plane, and distributed compute workflows
  • Contribute to architectural discussions, influence long-term design decisions, and help maintain a high technical bar within the Compute Platform team

Sr Staff Software Engineer - Compute Platform

We are seeking a highly experienced Senior Staff Engineer to lead the technical ...
Location: United States, Sunnyvale
Salary: 267000.00 - 297000.00 USD / Year
Uber
Expiration Date: Until further notice
Requirements
  • 10+ years of software engineering experience, including expertise in distributed systems or infrastructure engineering
  • Deep expertise in Kubernetes internals, container runtimes, and cloud-native compute platforms
  • Strong background in containerization, resource scheduling, and cluster management at scale
  • Hands-on experience with performance tuning, reliability engineering, and cost optimization in compute environments
  • Excellent leadership, communication, and organizational skills, with a track record of building and mentoring high-performing teams
  • Strong coding proficiency in one or more languages such as Go, Java, or Python
  • Demonstrated ability to drive cross-functional technical initiatives and deliver impactful results
Job Responsibility
  • Own the technical vision, architecture, and strategy for the global compute platform org
  • Define and execute the roadmap for our compute platform, focusing on scalability, performance, and efficiency
  • Drive architectural decisions and set technical direction for compute scheduling, resource allocation, and container orchestration systems
  • Ensure high availability and reliability of the compute platform through best-in-class observability, automation, and incident response practices
  • Drive adoption of best practices in scalability, availability, and security for multi-tenant compute environments
  • Evaluate emerging technologies in cloud-native ecosystems and guide their integration into the platform
  • Partner with product and infrastructure teams to deliver high-impact, cross-organizational initiatives
  • Mentor and coach engineers, helping grow their technical depth and leadership skills
  • Influence company-wide engineering standards and practices
What we offer
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • All full-time employees are eligible to participate in a 401(k) plan
  • Eligible for various benefits
Employment type: Full-time

Senior Software Engineer II - Cloud Compute Platform

As a Software Engineer on the Compute Platform team, you will be a key technical...
Location: United States
Salary: 197400.00 - 232000.00 USD / Year
Confluent
Expiration Date: Until further notice
Requirements
  • 8+ years of experience delivering scalable software solutions
  • Proven track record of leading the delivery of large-scale, highly available, low-latency systems
  • Deep expertise in Kubernetes including controller development, operator patterns, and multi-cluster architectures
  • Strong proficiency in Go with experience building production-grade distributed systems
  • Experience with multi-tenant platform architectures and security isolation patterns
  • Familiarity with gRPC, Protobuf, and API design for internal platform services
  • Experience with observability tools and operational excellence practices
  • Experience with multi-cloud environments (AWS, GCP, Azure) and cloud-provider integrations
  • Track record of providing technical leadership and mentorship
  • Track record of working collaboratively across teams including product management, SRE, and other engineering teams
Job Responsibility
  • Drive the overall technical charter for the Compute Platform, including multi-cluster orchestration, workload placement, and security architecture
  • Design and implement platform APIs and Kubernetes operators using Go to support evolving workload requirements
  • Work closely with product management and engineering leadership to build and drive the roadmap for Confluent's Compute Platform, enabling new business opportunities across Confluent
  • Deliver high-impact initiatives in areas such as workload scheduling, disruption management, network isolation, rolling update strategies, and cross-cluster resource management
  • Lead technical design reviews and drive architectural decisions across organizational boundaries
  • Mentor and grow other engineers on the team through code reviews, pairing, and technical guidance
  • Own operational aspects including availability, reliability, performance monitoring, emergency response, and disaster recovery for our global compute infrastructure
What we offer
  • Remote-First Work
  • Robust Insurance Benefits
  • Flexible Time Away
  • The Best Teammates
  • Experience Ambassadors
  • Open and Honest Culture
  • Well-Being and Growth
  • Offers Equity
Employment type: Full-time

Site Reliability Engineer Platform Engineer

Join a mission-driven, national financial services organization at the heart of ...
Location: United States, Reston
Salary: Not provided
Tier4 Group
Expiration Date: Until further notice
Requirements
  • 5+ years hands-on operating and managing Kubernetes and OpenShift clusters
  • Strong experience with Microsoft Azure (compute, networking, storage, and data services)
  • Proven skills in automation and Infrastructure-as-Code (Terraform, Ansible, GitOps)
  • Proficiency with observability tooling (Datadog, Prometheus, Grafana)
  • Scripting/coding ability in Bash, Python, or Go
Job Responsibility
  • Operate, tune, and optimize OpenShift/Kubernetes clusters (scheduling, ingress, upgrades, quotas, policies)
  • Stand up and/or refine observability (Datadog, Prometheus, Grafana)—dashboards, alerts, SLOs, runbooks
  • Map current hybrid topology and critical delivery pipelines; identify toil and prioritize automation (Terraform/Ansible)
  • Begin supporting Azure environments (compute, networking, storage, data services) used by analytics teams
  • Drive GitOps-first workflows; harden CI/CD with ArgoCD/Jenkins/GitHub Actions and policy-as-code guardrails
  • Implement or enhance platform services (Vault, Kafka/AMQ, ingress, service mesh) for dev and data teams
  • Lead incident response and postmortems; institutionalize RCA, blameless learning, and continuous improvement
Employment type: Full-time

Senior ML Platform Engineer, AI Platform

We are seeking a skilled and passionate ML Platform Engineer to join our team an...
Location: Singapore, Singapore
Salary: Not provided
Airwallex
Expiration Date: Until further notice
Requirements
  • 5+ years in backend software development
  • At least 2 years focused on AI/ML platform or MLOps infrastructure
  • Deep expertise in MLOps practices, including automated deployment pipelines, model optimization, and production lifecycle management
  • Proven experience designing and implementing low-latency model serving solutions
  • Proficiency in Python
  • Skill in writing high-quality, maintainable code
  • Experience in the design and development of large-scale distributed, high-concurrency, low-latency inference, and high-availability systems
  • Excellent communication and mentoring abilities
  • A relevant degree in Computer Science, Mathematics, or related fields
Job Responsibility
  • Platform Development: Design, build, and maintain the end-to-end MLOps platform using Kubernetes and Cloud Services
  • Infrastructure as Code (IaC): Use Terraform or similar tools to manage, provision, and scale all ML-related infrastructure securely and efficiently
  • Pipeline Automation: Implement and optimize CI/CD/CT (Continuous Integration, Delivery, Training) pipelines to automate model training, testing, packaging, and deployment using tools like Argo and Kubeflow Pipelines
  • Serving Infrastructure: Build highly available, low-latency, and high-throughput model serving infrastructure
  • Observability: Implement robust monitoring, alerting, and logging solutions to track infrastructure health, model performance, and data/model drift
  • Tooling & Support: Evaluate, integrate, and support ML tools such as Feature Stores and distributed model training pipelines
  • Security & Compliance: Ensure platform security, implement RBAC (Role-Based Access Control), and manage secrets for sensitive data and production environments
  • Collaboration: Work closely with Data Scientists and ML Engineers to understand their needs and provide technical guidance on best practices for scaling their models
Employment type: Full-time

Platform Engineer - Data Science Platform

Location: United States, Columbus, OH or Dallas, TX or Minneapolis, MN
Salary: Not provided
Robert Half
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science or a related field, or equivalent practical experience
  • 5+ years of experience supporting Data Science infrastructure
  • 5+ years of hands-on experience with AWS-hosted Data Lake, Data Science, or AI/ML platforms
  • 5+ years of working knowledge with Kubernetes
  • AWS services such as SageMaker, Glue, Lambda, Athena
  • CI/CD tools such as Azure DevOps
  • Infrastructure as Code tools such as Terraform
  • Container technologies including Docker and Amazon ECR
  • Security tools such as AQUA and Kenna
  • Experience producing technical documentation and written solutions
Job Responsibility
  • Support and maintain ongoing Data Science infrastructure operations
  • Design, build, and deploy AWS environments using automated CI/CD pipelines
  • Manage and scale large, secure cloud environments to support current and future Data Science initiatives
  • Implement, own, and improve the image management lifecycle process
  • Assist with the setup and ongoing management of AWS accounts dedicated to the Data Science platform
  • Develop and maintain infrastructure pipelines using CI/CD tools (e.g., Azure DevOps)
  • Build and manage environments using Infrastructure as Code (IaC) tools such as Terraform
  • Develop scripts and applications using programming languages such as Python
  • Manage and support database technologies including Athena, Oracle, MySQL, and PostgreSQL
  • Leverage AWS services to enable Data Lake, Data Science, and AI/ML workloads
What we offer
  • Medical
  • Vision
  • Dental
  • Life and disability insurance
  • 401(k) plan
Employment type: Full-time