If you are excited by the challenge of designing distributed systems that process petabytes of data for the world's most advanced AI models, this is your team. We are not looking for someone to just write queries or maintain legacy pipelines. We are looking for Systems Builders—engineers who understand the internals of distributed compute, who treat data infrastructure as a product, and who want to architect the backbone of Microsoft Copilot. Join us to build the "Paved Road" for AI. You will own the platform that transforms raw, massive-scale signals into the fuel that powers training, inference, and evaluation for millions of users. We need someone who is energized by solving hard problems in stream processing, lakehouse architecture, and developer experience.
Job Responsibilities:
Core Platform Engineering: Design and build the underlying frameworks (based on Spark/Databricks) that allow internal teams to process massive datasets efficiently
Distributed Systems Architecture: Modernize our data stack by moving from batch-heavy patterns to event-driven architectures
Unstructured AI Data Pipelines: Architect high-throughput pipelines capable of processing complex, non-tabular data (documents, code repositories, chat logs) for LLM pre-training, fine-tuning, and evaluation datasets
AI Feedback Loops: Engineer the high-throughput telemetry systems that capture user interactions with Copilot
Infrastructure as Code: Treat the data platform as software. Define and deploy all storage, compute, and networking resources using IaC (Bicep/Terraform)
Data Reliability Engineering: Move beyond simple "validation checks" to build automated governance and observability systems
Compute Optimization: Deep-dive into query execution plans and cluster performance. Optimize shuffle operations, partition strategies, and resource allocation (an illustrative sketch of this kind of tuning follows this list)
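For illustration only, here is a minimal PySpark sketch of the kind of shuffle and partition tuning referenced under "Compute Optimization". The table paths, column names, and configuration values are assumptions for the example, not details of the role.

```python
# Hedged sketch: tuning a wide join on Spark. Paths, columns, and settings are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("copilot-signals-join")                 # hypothetical job name
    .config("spark.sql.shuffle.partitions", "1024")  # size shuffles to the cluster, not the default 200
    .config("spark.sql.adaptive.enabled", "true")    # let AQE coalesce small post-shuffle partitions
    .getOrCreate()
)

interactions = spark.read.format("delta").load("/lake/raw/interactions")  # hypothetical path
sessions = spark.read.format("delta").load("/lake/raw/sessions")          # hypothetical path

# Repartition on the join key so the wide join shuffles evenly instead of skewing
joined = (
    interactions.repartition(1024, "session_id")
    .join(sessions, "session_id")
)

# Inspect the physical plan to verify the exchange/shuffle strategy before running at scale
joined.explain(mode="formatted")
```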
Requirements:
Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 3+ years of experience in business analytics, data science, software development, data modeling, or data engineering; OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years of experience in business analytics, data science, software development, data modeling, or data engineering; OR equivalent experience
Proficiency in Python, Scala, Java, or Go
Deep Distributed Systems Knowledge: Demonstrated technical understanding of massive-scale compute engines (e.g., Apache Spark, Flink, Ray, Trino, or Snowflake)
Experience architecting Lakehouse environments at scale (using Delta Lake, Iceberg, or Hudi)
Experience building internal developer platforms or "Data-as-a-Service" APIs
Strong background in streaming technologies (Kafka, Azure Event Hubs, Pulsar) and stateful stream processing (see the sketch after this list)
Experience with container orchestration (Kubernetes) for deploying data applications
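As a point of reference for the stateful stream processing called out above, the sketch below shows a windowed aggregation over a Kafka topic with Spark Structured Streaming. Broker addresses, topic names, the event schema, and sink paths are illustrative assumptions.

```python
# Hedged sketch: stateful windowed aggregation over Kafka with Structured Streaming.
# Broker, topic, schema, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("copilot-telemetry-agg").getOrCreate()

event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("action", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "copilot-telemetry")          # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Stateful aggregation: the watermark bounds state, dropping events more than 10 minutes late
counts = (
    events.withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"), col("action"))
    .count()
)

query = (
    counts.writeStream.outputMode("append")
    .format("delta")
    .option("checkpointLocation", "/lake/checkpoints/telemetry")  # hypothetical path
    .start("/lake/curated/action_counts")                         # hypothetical path
)
query.awaitTermination()
```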