Advanced Platform Engineer

United States, Austin, Texas · Job Posted February 20, 2026

Job Description

We’re looking for a Senior Platform Engineer with strong product instincts – someone who can spot friction in developer workflows, understand their needs, and take the initiative to build solutions that remove barriers. You'll work with a diverse tech stack and collaborate with developers from multiple teams to design, implement and maintain the systems that support our game development. This role combines systems thinking, practical engineering, and a strong bias towards developer experience. You should be comfortable moving between infrastructure, backend services, automation and occasional frontend or tooling work as needed.

Job Responsibility

  • Build and evolve the Internal Developer Platform: Design and implement platform capabilities supporting CI/CD, environment management, observability, and secure-by-default infrastructure
  • Identify & Eliminate Developer Friction: Proactively discover pain points through conversations, telemetry, and hands-on investigation; apply pragmatic judgement to weigh quick wins against strategic investments; propose and deliver high-impact fixes based on business insights
  • Develop Shared Services & Tooling: Contribute to backend services, SDKs and UI components that enable teams to ship features faster and more safely
  • Drive Engineering Excellence: Champion reliability, simplicity, automation and maintainability across systems and workflows
  • Build Relationships Across Teams: Partner with developers of varied backgrounds to understand requirements, gather feedback, and build trust; use those relationships to find friction and identify incremental improvements aligned with long-term business goals
  • Operate What You Build: Help establish SLIs/SLOs, own operational readiness, and participate in on-call rotations for the services you support
  • Influence Technical Direction: Help shape architectural decisions, tooling standards, and long-term roadmaps for our developer platform. Use small successes to establish a pattern of trust and gradually move the Overton window of what’s achievable with future innovation

Requirements

  • 5-10+ years of experience in software engineering, preferably with platform, infrastructure, or DevOps focus
  • Passion for enabling developers
  • Strong ability to self-direct, identify opportunities, connect developers, and drive improvements to completion
  • Experience designing, building and operating reliable distributed systems
  • Excellent communication and collaboration skills, especially with cross-functional partners
  • Proficiency in at least one of our backend languages: C#, Go, Python or TypeScript
  • Hands-on experience with cloud-native infrastructure (AWS preferred) and containerized environments (Docker, Kubernetes)
  • BS in Computer Science, Computer Engineering or equivalent

Nice to have

  • Experience working with game development or creative product teams
  • Knowledge of security, compliance, or auditability considerations for internal platforms
  • Contributions to internal developer portals or platform engineering initiatives at scale

Similar Jobs for Advanced Platform Engineer

8 matching positions

Advanced Platform Engineer - Compute

Feedzai is the world’s first RiskOps platform for financial risk management, and...
Location: Portugal
Salary: Not provided
Feedzai
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Information Systems, or the equivalent combination of education, experience, and training
  • 6+ years of hands-on experience in platform engineering, DevOps, or cloud infrastructure
  • Strong programming skills in Go, Java, or similar, with a track record of designing and delivering maintainable systems
  • Deep experience with container technologies and orchestration (Docker, Kubernetes), including operator development or ecosystem tooling
  • Proven experience with CI/CD (e.g. Jenkins, GitLab) and GitOps (e.g. FluxCD, Argo CD)
  • Substantial experience with at least one major cloud provider (AWS, GCP, Azure) and familiarity with cloud-native patterns
  • Strong experience with monitoring and observability (e.g. Grafana, Prometheus) and using data to drive reliability and performance
  • Solid experience with Infrastructure-as-Code (e.g. Terraform, Crossplane) and platform lifecycle management
  • Track record of leading projects, driving technical decisions, and mentoring others
  • Self-driven, collaborative, and motivated to improve how we build and run the platform
Job Responsibility
  • Lead the design, implementation, and evolution of Kubernetes Operators and platform services, including deployment, monitoring, and operations
  • Drive development in Go or similar languages, setting standards and best practices for the team
  • Own and evolve automation for cloud infrastructure and incident response, and champion self-healing and reliability improvements
  • Define and improve playbooks, runbooks, and alerting strategies to streamline response and reduce toil
  • Own and advance the product deployment pipeline and GitOps practices (e.g. FluxCD, Argo CD)
  • Lead or coordinate incident response, root cause analysis, and post-incident reviews; drive preventive measures
  • Work with AI-assisted development tools (e.g. Cursor) as part of your daily workflow to ship faster and iterate effectively
  • Own and extend Infrastructure as Code (IaC) and platform lifecycle (monitoring, alerting, security, cost, configuration, backup) in production
  • Contribute to developer experience and internal platform capabilities so product teams can ship with less friction
Employment type: Full-time

Sr. Staff Software Engineer - Advanced Analytics Platform

At DISQO, we’re redefining how companies turn data into decisions. Our mission i...
Location: United States, Los Angeles, Glendale
Salary: 200000.00 - 240000.00 USD / Year
DISQO
Expiration Date: Until further notice
Requirements
  • 12+ years of professional software engineering experience
  • 5+ years architecting or building high-performance data systems or analytics platforms
  • 3+ years of production Rust experience
  • Deep expertise in Rust and strong experience in Java
  • Proven track record building large-scale data analytics or OLAP systems from the ground up
  • Deep understanding of columnar data engines, vectorized execution, and query/dataframe optimization
  • Hands-on experience with performance engineering, profiling, and hardware-aware optimization
  • Strong expertise with AWS - designing, deploying, and optimizing large-scale data and compute systems in the cloud
  • A systems-thinking mindset
  • Thrives in a fast-moving, startup environment
Job Responsibility
  • Architect and deliver a high-performance Advanced Analytics Engine
  • Design and build an Agentic AI system that leverages this Advanced Analytics Engine
  • Partner with product, engineering and data teams to power agentic AI analytics systems
  • Profile, benchmark, and optimize Rust components
  • Leverage AWS cloud services to architect scalable, reliable, and cost-efficient analytics infrastructure
  • Shape the evolution of DISQO’s broader data platform and its integration across our product ecosystem
  • Mentor and guide engineers
  • Contribute to open-source or internal frameworks that advance analytical systems and distributed computation
What we offer
  • 100% covered Medical/Dental/Vision for employee
  • Equity
  • 401K
  • Generous PTO policy
  • Flexible workplace policy
  • Team offsites, social events & happy hours
  • Life Insurance
  • Health FSA
  • Commuter FSA (for hybrid employees)
  • Catered lunch and fully stocked kitchen
Employment type: Full-time

Site Reliability Engineer Platform Engineer

Join a mission-driven, national financial services organization at the heart of ...
Location: United States, Reston
Salary: Not provided
Tier4 Group
Expiration Date: Until further notice
Requirements
  • 5+ years hands-on operating and managing Kubernetes and OpenShift clusters
  • Strong experience with Microsoft Azure (compute, networking, storage, and data services)
  • Proven skills in automation and Infrastructure-as-Code (Terraform, Ansible, GitOps)
  • Proficiency with observability tooling (Datadog, Prometheus, Grafana)
  • Scripting/coding ability in Bash, Python, or Go
Job Responsibility
  • Operate, tune, and optimize OpenShift/Kubernetes clusters (scheduling, ingress, upgrades, quotas, policies)
  • Stand up and/or refine observability (Datadog, Prometheus, Grafana)—dashboards, alerts, SLOs, runbooks
  • Map current hybrid topology and critical delivery pipelines; identify toil and prioritize automation (Terraform/Ansible)
  • Begin supporting Azure environments (compute, networking, storage, data services) used by analytics teams
  • Drive GitOps-first workflows; harden CI/CD with ArgoCD/Jenkins/GitHub Actions and policy-as-code guardrails
  • Implement or enhance platform services (Vault, Kafka/AMQ, ingress, service mesh) for dev and data teams
  • Lead incident response and postmortems; institutionalize RCA, blameless learning, and continuous improvement
Employment type: Full-time

GCP AI Platform Architect / Lead AI Platform Engineer

Our client is an innovative technology company specializing in the development o...
Location: Poland, Kraków
Salary: Not provided
TeamQuest Sp. z o. o.
Expiration Date: Until further notice
Requirements
  • GCP Expertise (verifiable - ask for production examples): GCP is their primary cloud, not secondary experience alongside AWS/Azure. Production deployments across most of: Vertex AI, Cloud Run or GKE, Pub/Sub, BigQuery, Secret Manager, VPC Service Controls, IAM + Workload Identity. Has designed for GCP from scratch, not migrated from another cloud, with end-to-end ownership.
  • AI / Backend Engineering: Python is the primary language - production-grade service/API development, not scripting or data science only. Strong track record building distributed systems and integrating LLMs.
  • Agentic Architecture (must be production, not PoC): Hands-on production experience with at least one: LangGraph, Google ADK, CrewAI, or custom multi-agent orchestration layer. RAG pipelines shipped to production. Google ADK: candidate must be able to explain what it is, when to use it, and how it compares to LangGraph and custom orchestration. AI agent workflows, ReAct prompting, and Function Calling in production environments
  • Multi-Tenant Architecture: Has designed a multi-tenant SaaS platform end-to-end - not just contributed. Can articulate tenant isolation strategies: IAM boundary design, data isolation per tenant, VPC controls.
  • API Design & Integrations: Proven ability to create secure, high-performance APIs capable of asynchronously managing traffic and communication between multiple decoupled services.
  • Enterprise Security: Practical knowledge of data isolation in multi-tenant SaaS architectures, IAM, and securing cloud-based environments.
  • Vector Databases: Hands-on experience with semantic search and at least one of: Pinecone, Weaviate, pgvector, or Vertex Matching Engine.
Job Responsibility
  • System Architecture: Design and develop a scalable, cloud-native architecture on Google Cloud Platform (GCP) that meets enterprise security and multi-tenant data isolation requirements for a SaaS environment
  • AI Agent Orchestration: Architect and implement autonomous, multi-step AI workflows with a clear separation of agent responsibilities (retrieval, analysis, reasoning, response generation)
  • Hands-on Core Development: Actively contribute to core system development: coding orchestration logic, designing services, optimizing performance, and building secure API integrations for routing queries across internal and external agents
  • Frontend Enablement: Design the backend layer, streaming protocols, and APIs to seamlessly support and integrate with advanced conversational UIs
  • Data Management & Extensibility: Build a robust backend capable of processing qualitative and social data, ensuring the platform is easily extensible to incorporate new data sources
What we offer
  • Attractive salary
  • Full remote work
  • Social benefits: sport card, healthcare insurance
Employment type: Full-time

GCP AI Platform Architect / Lead AI Platform Engineer

Our client is an innovative technology company specializing in the development o...
Location: Poland, Katowice
Salary: Not provided
TeamQuest Sp. z o. o.
Expiration Date: Until further notice
Requirements
  • GCP Expertise (verifiable - ask for production examples): production deployments across most of: Vertex AI, Cloud Run or GKE, Pub/Sub, BigQuery, Secret Manager, VPC Service Controls, IAM + Workload Identity
  • Has designed for GCP from scratch, not migrated from another cloud, end-to-end ownership
  • AI / Backend Engineering: Python is the primary language - production-grade service/API development, not scripting or data science only
  • Strong track record building distributed systems and integrating LLMs
  • Agentic Architecture (must be production, not PoC): Hands-on production experience with at least one: LangGraph, Google ADK, CrewAI, or custom multi-agent orchestration layer
  • RAG pipelines shipped to production
  • Google ADK: candidate must be able to explain what it is, when to use it, and how it compares to LangGraph and custom orchestration
  • AI agent workflows, ReAct prompting, and Function Calling in production environments
  • Multi-Tenant Architecture: Has designed a multi-tenant SaaS platform end-to-end - not just contributed
  • Can articulate tenant isolation strategies: IAM boundary design, data isolation per tenant, VPC controls
Job Responsibility
  • System Architecture: Design and develop a scalable, cloud-native architecture on Google Cloud Platform (GCP) that meets enterprise security and multi-tenant data isolation requirements for a SaaS environment
  • AI Agent Orchestration: Architect and implement autonomous, multi-step AI workflows with a clear separation of agent responsibilities (retrieval, analysis, reasoning, response generation)
  • Hands-on Core Development: Actively contribute to core system development: coding orchestration logic, designing services, optimizing performance, and building secure API integrations for routing queries across internal and external agents
  • Frontend Enablement: Design the backend layer, streaming protocols, and APIs to seamlessly support and integrate with advanced conversational UIs
  • Data Management & Extensibility: Build a robust backend capable of processing qualitative and social data, ensuring the platform is easily extensible to incorporate new data sources
What we offer
  • Attractive salary
  • Full remote work
  • Social benefits: sport card, healthcare insurance
Employment type: Full-time

Principal Data GenAI Platform Engineer - Senior Vice President

Location: India, Chennai
Salary: Not provided
Citi
Expiration Date: Until further notice
Requirements
  • 12+ years of relevant experience in enterprise application development, data engineering, or AI platform engineering, with a strong track record of leadership in regulated environments
  • 8+ years of experience leading multi-team Agile organizations (20+ engineers), including managing distributed and hybrid AI-assisted teams
  • Advanced expertise in Python, PySpark, and Databricks ecosystem for large-scale data processing and ELT/ETL pipelines
  • Proven experience architecting and implementing enterprise AI/GenAI platforms, including agentic AI frameworks, LLM integrations, and prompt engineering
  • Hands-on experience with AI-assisted development tools such as Devin.AI and GitHub Copilot and integrating them into engineering workflows
  • Strong experience with microservices architecture, APIs, and cloud-native deployment (Kubernetes/OpenShift)
  • Strong experience with event-driven architectures and streaming platforms (Kafka)
  • Deep understanding of data architecture, data mesh, data federation, and regulatory data requirements
  • Exceptional leadership, communication, stakeholder management, and decision-making capabilities
  • Experience with cloud platforms (AWS, Azure, GCP, Databricks) and modern data ecosystems
Job Responsibility
  • Lead multiple agile scrum teams comprising ~15+ engineers, including hybrid teams of human engineers and AI-assisted development (Devin.AI, Copilot), ensuring delivery excellence and alignment with business priorities
  • Define and execute the enterprise strategy for Python engineering, AI agent platforms, and full-stack data applications, aligned with Retail and Wealth Risk objectives
  • Serve as the senior architect and technical authority for enterprise-scale AI agents, data engineering pipelines, and microservices-based applications, ensuring scalability, resilience, and security
  • Drive the adoption and operationalization of AI Product Development Lifecycle (AI PDLC), including model governance, evaluation, deployment, monitoring, and compliance with Model Risk Management (MRM)
  • Lead development of high-volume data pipelines and data federation layers using PySpark, Databricks, Kafka, and Data Mesh architecture to support regulatory reporting (CCAR, FDIC) and risk analytics
  • Architect and oversee GenAI agent ecosystems using LLMs (Google ADK, Gemini/Flash), implementing Human-in-the-Loop (HITL) frameworks to ensure explainability, auditability, and compliance
  • Drive AI-augmented software development lifecycle, integrating tools such as Devin.AI, GitHub Copilot, and MCP platforms through advanced prompt engineering and governance guardrails
  • Lead microservices and cloud-native architecture using FastAPI/Spring Boot, Kubernetes/OpenShift, and CI/CD pipelines, ensuring high availability and performance
  • Drive engineering efficiency and standardization by reusing and repurposing enterprise-level frameworks, platforms, and tools, reducing duplication and accelerating delivery across teams
  • Ensure all engineering solutions incorporate data governance and non-functional requirements, including Data Quality (DQ), data lineage, data tracing, and auditability, aligned with enterprise governance processes and regulatory expectations
Employment type: Full-time

Platform Engineer – AIOps & Infrastructure

The Platform Engineer – AIOps & Infrastructure will be responsible for designing...
Salary: Not provided
Solvedex
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or equivalent experience
  • 5+ years of experience in Platform Engineering, DevOps, Cloud Infrastructure, SRE, MLOps, or related fields
  • Strong experience with AWS, Azure, or GCP
  • Hands-on expertise with Kubernetes, Docker, and Infrastructure-as-Code tools (Terraform, CloudFormation, or similar)
  • Experience building CI/CD pipelines and automation workflows
  • Strong scripting skills using Python, Bash, or similar languages
  • Experience with monitoring and observability platforms such as Grafana, Prometheus, Datadog, or ELK
  • Advanced English proficiency (B2 - C1)
  • Comfortable working remotely with minimal supervision
  • Proactive, detail-oriented, and collaborative
Job Responsibility
  • Design and maintain scalable cloud-native infrastructure for AI/ML workloads
  • Manage Kubernetes environments, container orchestration, and platform services
  • Build and optimize CI/CD pipelines and Infrastructure-as-Code frameworks
  • Support MLOps and LLMOps workflows, including deployment, monitoring, and lifecycle management
  • Implement monitoring, logging, alerting, and observability solutions
  • Drive DevSecOps, automation, security, and reliability best practices
  • Collaborate with AI Engineers, Data Scientists, and Infrastructure teams to support production AI systems
  • Participate in troubleshooting, incident response, and platform optimization initiatives
Employment type: Full-time

Platform Engineer

We are seeking a highly progressive Platform Engineer specializing in AI infrast...
Location: Canada, Vancouver
Salary: 43.79 - 58.39 USD / Hour
Randstad
Expiration Date: July 25, 2026
Requirements
  • 3-5 years of dedicated cloud platform engineering or SRE experience working with high-volume distributed systems natively in AWS and Azure
  • Elite proficiency with Terraform, with an emphasis on creating modular, reusable code structures and multi-environment pipelines
  • Coding proficiency in Python or Go, with a solid history of integrating with complex REST/JSON APIs
  • Strong operational working knowledge of GitLab CI/CD, Docker containerization, and cloud orchestration layers
  • Proven, hands-on exposure to AI/LLM development concepts (advanced prompting, tool/skill integration, and Retrieval-Augmented Generation [RAG])
  • Extensive experience leveraging AI and Agentic Coding tools to accelerate software delivery and maintain platform scripts
Job Responsibility
  • Build integration patterns, API mediation layers, and approval workflows supporting autonomous AI agent tool execution and runtime function calling
  • Integrate advanced distributed telemetry for agent runs (execution traces, evaluation metrics, latency logs, and token cost analytics)
  • Establish runtime safety controls for AI applications, embedding automated rollback scripts, cost control ceilings, and master kill-switches
  • Build and scale highly secure, automated multi-cloud landing zones (AWS and Azure) utilizing reusable Terraform modules
  • Construct and maintain robust GitLab CI/CD pipelines, package registries, and automated infrastructure release strategies
  • Implement strict automated infrastructure guardrails using Open Policy Agent (OPA), Conftest, or Azure Policies to guarantee security without breaking developer velocity
  • Embed least-privileged access, zero-trust network segmentation, private endpoints, KMS encryption keys, and advanced secrets management
  • Champion Site Reliability Engineering standards by managing Service Level Objectives (SLOs), calculating error budgets, configuring autoscaling matrices, and leading chaos engineering simulations
  • Apply cloud financial management protocols (structured resource tagging, budget alarms, anomaly detection, and cluster right-sizing)
  • Author clear, accessible developer guides and self-service templates that streamline the adoption of core AI platform features
What we offer
  • Pioneering Technical Landscape
  • Elite Multi-Cloud Exposure
  • High Extensibility Indicators
  • Premier Workspace
Employment type: Full-time