Core AI is at the forefront of Microsoft’s mission to redefine how software is built and experienced. We build core tools that have defined software engineering for years—VS Code, GitHub, Visual Studio, AI Foundry—and we are infusing AI deeply into all parts of this portfolio. Within Core AI, the CloudMine team serves as the data backbone of Microsoft's engineering intelligence. Our mission is to make Microsoft’s engineering teams the most productive, secure, and agile in the industry by building the authoritative engineering data platform that powers decisions at scale. We mine, validate, and serve trusted datasets derived from engineering artifacts—pull requests, work items, build pipelines, security signals, AI telemetry, and more—used daily in shiproom operations, security and compliance enforcement, organization-wide productivity programs, and AI-powered insights across Windows, Azure, Microsoft 365, Xbox, and other major product lines. Our data directly shapes how Microsoft ships software.
Job Responsibilities:
Lead the design and delivery of critical data mining and data pipeline infrastructure
Build and operate large-scale data ingestion, transformation, and validation systems
Architect and implement data pipelines in C# running on Azure
Build automated data quality and validation frameworks
Design schema evolution and lineage tracking
Integrate new engineering signal sources into the platform
Set technical direction for your area
Mentor other engineers
Drive cross-team alignment on data contracts and quality standards
Requirements:
Bachelor's Degree in Computer Science or a related technical field AND 4+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years
Nice to have:
Demonstrated experience building and operating production data pipelines, ETL/ELT systems, or data processing frameworks at scale
Strong proficiency in C# or a comparable strongly-typed backend language, with the ability to write performant, maintainable data-processing code
Hands-on experience with data validation, data quality practices, or data governance (e.g., schema enforcement, anomaly detection, correctness monitoring)
Experience with Azure data and cloud services (e.g., Azure Data Lake, Azure Data Explorer/Kusto, Cosmos DB, Event Hubs, Azure Storage, or similar)
Track record of operating production systems: monitoring, incident response, performance analysis, and driving reliability improvements
Experience with Microsoft-internal big data systems (e.g., Cosmos/SCOPE, Azure Data Explorer/Kusto) or equivalent large-scale analytics platforms
Deep familiarity with data pipeline patterns: idempotency, backfill, schema evolution, data lineage, and late-arriving data handling
Experience building data quality frameworks with automated validation, data profiling, and regression detection
Familiarity with engineering systems signals (source control, CI/CD, work tracking, security scanning) and deriving metrics and insights from them
Experience setting technical direction, mentoring engineers, and driving cross-team alignment on data standards and contracts
Background in data mining, data analysis, or working with large heterogeneous datasets to extract structured, actionable information