Microsoft’s Azure Data engineering team is leading the transformation of analytics in the world of data with products like databases, data integration, big data analytics, messaging & real-time analytics, and business intelligence. The products in our portfolio include Microsoft Fabric, Azure SQL DB, Azure Cosmos DB, Azure PostgreSQL, Azure Data Factory, Azure Synapse Analytics, Azure Service Bus, Azure Event Grid, and Power BI. Our mission is to build the data platform for the age of AI, powering a new class of data-first applications and driving a data culture.
Within Azure Data, the big data analytics team provides a range of products that enable data engineers and data scientists to extract intelligence from all data – structured, semi-structured, and unstructured. We build the Data Engineering, Data Science, and Data Integration pillars of Microsoft Fabric.
The Cosmos Analytics Platform team within Azure Data is hiring a Senior Software Engineer to drive the evolution, reliability, and performance of Cosmos, Microsoft's hyperscale big data platform. This team builds and operates foundational infrastructure that powers mission-critical data analytics workloads across Microsoft. The team works on core platform components, complex distributed systems problems, modernizing the compute platforms, live-site excellence, and developer experience improvements.
Job Responsibilities:
Design and evolve core execution, scheduling, and resource management systems that power Cosmos Analytics at hyperscale, ensuring high performance, predictability, and operational excellence.
Evolve core platform capabilities for performance-sensitive and ML/AI-heavy workloads, including large-scale shuffle data management, ARM-based compute, GPU-accelerated execution paths, and secure containerization.
Collaborate across Azure services (Fabric, Storage, ACI, and Capacity teams) to land cross-service features, remove architectural bottlenecks, and ensure platform readiness for large-scale customer scenarios.
Lead critical reliability and live-site improvements by diagnosing deep distributed systems issues, strengthening failover paths, and driving measurable reductions in incident load and mitigation times.
Raise engineering quality and velocity by contributing to diagnostics, tooling, automated validation, and mentorship that strengthens the overall technical bar of the team.
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 5+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Experience designing scalable, reliable, secure services and debugging complex, multi-component production issues.
Familiarity with cloud environments (e.g., Azure) and service deployment/operations.
Hands-on experience with big data execution engines (Spark, SCOPE) and cluster orchestration.
Experience with shuffle systems and data movement pipelines (concepts like partitioning, spill/merge, locality).
Practical exposure to containerization (Docker/OCI), orchestration (Kubernetes/Service Fabric), and image/build pipelines.
Background in ARM compute and/or GPU acceleration; performance tuning on heterogeneous hardware.
Proven cross-team collaboration, ability to drive clarity in ambiguous spaces, and excellent technical communication.
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check.