Senior Data Engineer Job at Microsoft Corporation (Redmond)

Job Description

If you love the pursuit of excellence and are inspired by the challenges that come through driving innovations that impact how the world lives, works and plays, then we invite you to learn more about Microsoft Business Operations (MBO) - and the value we deliver across Microsoft and to our customers and partners. We offer unique opportunities to work on interesting global projects in an environment that appreciates diversity, focuses on talent development and recognizes and rewards great work. As a Senior Data Engineer, you’ll build and operate the platforms that power how Microsoft does business—solving complex, high-impact problems at global scale and shipping technology used every day across the company.

Job Responsibilities

Own the end-to-end engineering lifecycle for key components and services—designing, coding, testing, deploying, and operating solutions that are secure, reliable, and maintainable
Use AI tools across the full SDLC in a disciplined way. Own the quality of AI-generated requirements, designs, and code — yours and your teammates' — and apply Responsible AI practices
Build the training, feature, retrieval, grounding, and evaluation datasets that LLMs and agents depend on. Partner with PM and engineering to make data contracts, freshness, drift signals, and offline/online consistency first-class concerns
Lead design discussions for your project area and evaluate tradeoffs across batch vs. streaming, warehouse vs. lakehouse, ELT vs. ETL, and storage choices for analytical, feature, and vector workloads. Own architectural decisions with minimal oversight
Partner with PM and engineering to define data requirements, and ensure feedback loops on data quality, usage, model performance, and downstream product impact are in place
Write extensible, secure, performant code for pipelines, transformations, and supporting services. Apply modern patterns including GenAI-assisted development. Drive code reviews and best practices at the product level
Own the test strategy for your area, including data contract tests, schema validation, freshness checks, distribution and drift monitoring, and offline/online parity. Improve the test suite and use AI tools for test automation
Identify cross-team data dependencies, manage upstream producer and downstream consumer impact, and resolve conflicts when semantics or schemas change in ways that affect models or downstream products
Drive your workgroup's project and release plans. Break work into a roadmap including backfills, migrations, and model-impacting changes, and coach others on estimation
Design and run experiments — A/B tests, shadow pipelines, offline replays, evaluation harnesses — and interpret results to guide ship decisions for data and for the models that depend on it
Drive deployment automation toward zero-touch, and strengthen CI/CD for data systems, including reversible migrations, safe backfills, and controlled rollouts of semantic changes
Participate in on-call rotation as a DRI for data pipelines and serving paths. Use telemetry to diagnose, mitigate, and lead retrospectives. Drive metrics that improve reliability, data quality, and customer impact — including model and agent behavior traced back to data
Apply security-as-code, threat modeling, and breach-drill practices. Ensure AI safety controls and data governance for the AI features your data supports. Meet privacy, compliance, and accessibility standards
Lead by example. Mentor engineers on data engineering craft and AI fluency, raise the team's bar, and foster an inclusive culture

Requirements

Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 3+ years experience in business analytics, data science, software development, data modeling, or data engineering OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years experience in business analytics, data science, software development, data modeling, or data engineering OR equivalent experience

Nice to have

Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6+ years experience in business analytics, data science, software development, data modeling, or data engineering OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 8+ years experience in business analytics, data science, software development, data modeling, or data engineering OR equivalent experience
2+ years experience with data governance, data compliance and/or data security
4+ years of hands-on software development experience in one or more general purpose programming languages (e.g., C#, Java, C++, Python, JavaScript/TypeScript)
Hands-on experience designing and operating production data pipelines and platforms at scale
Working understanding of how GenAI systems consume data — training datasets, features and labels, embeddings, retrieval and grounding data, evaluation harnesses, and the data-side failure modes that drive model and agent regressions
Hands-on experience using AI-assisted development tools (e.g., GitHub Copilot, agentic coding workflows, GenAI-based code review and test generation) in a disciplined, production-grade way
Experience integrating AI capabilities (LLMs, agents, model-backed features) into production systems, including familiarity with Responsible AI principles and applying AI safety controls in production
Experience owning a feature area end-to-end, from design through deployment, monitoring, and on-call ownership
Experience designing and operating large-scale distributed data systems in a cloud environment (e.g., Azure), including data model and pipeline design, performance tuning, and cost optimization
Engineering fundamentals: data structures and algorithms, object-oriented and systems design, and building resilient services (reliability, availability, scalability)
Experience with DevOps practices and tooling (CI/CD, infrastructure as code, monitoring and alerting, incident response) and a track record of driving toward zero-touch deployment
Experience building observable systems: designing telemetry, metrics, and dashboards — including data quality and drift signals — that drive reliability, performance, and customer-impact decisions
Experience building secure software, including secure coding practices, threat modeling, premortems, and privacy/compliance considerations relevant to data systems
Demonstrated technical leadership through design reviews, mentoring, and driving improvements to code quality and engineering processes
Experience collaborating in cross-functional teams and communicating technical concepts clearly to engineering, product, and executive audiences
