Senior Data Engineer Job at Microsoft Corporation (Vancouver)

Job Description

Are you passionate about shaping the future of AI and empowering millions of users to unlock their full potential? The Copilot Pages and Artifacts team is leading an exciting transformation with Copilot: intelligent, dynamic experiences infused with powerful AI that act as a true 'second brain.' Imagine effortlessly capturing ideas, intuitively understanding complex information, and seamlessly taking informed action. This is the heart of our mission.

As a Senior Data Engineer in the Office Product Group, you will play a critical role in ensuring product telemetry is piped, aggregated, verified, and delivered to Data Scientists. Your work will span multiple hosts and endpoints of Copilot Pages and Artifacts and scale to data coming in from around the world. The clarity driven by this data will give us deep insight into how customers are using the product, influence decision making, and shape the future of the product.

This opportunity will build your experience working directly with Software Engineers, Data Scientists & Product Managers, all working together to build an innovative product that delights users. The complexity and planet-scale nature of the data will challenge and increase your data pipeline design skills. The direct connection to both the customer and the building of the product will advance your skills in truly delivering product visions & missions.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Job Responsibility

Compliance: Anticipates the need for data governance and designs data modeling and data handling procedures, with direct support and partnership with Corporate, External, and Legal Affairs (CELA), to ensure compliance with applicable laws and policies across all aspects of the development process
Tags data based on categorization (e.g., personally identifiable information [PII], pseudo-anonymized, financial)
Documents data types and builds data dictionaries, classifications, and lineage to ensure traceability
Governs accessibility of data within assigned data pipelines
Provides guidance on contributions to the data glossary to document the origin, usage, and format of data for each program
Independently implements data governance and least-privilege access practices leveraging security tools
Builds responsible AI-compliant data products and/or applications
Data Management and Transformation: Plans and creates efficient techniques and operations (e.g., inserting, aggregating, joining) to transform raw data into a form (e.g., dimensional data model) that is compatible with downstream data consumers, databases, and formats that support applications, analytics and reporting
Independently uses software, query languages, and computing tools (e.g., cloud-based) to transform raw data across end-to-end pipelines
Evaluates data to ensure data quality and completeness using queries, data wrangling, and statistical techniques
Merges data into distributed systems, products, or tools for further processing
Identifies opportunities to leverage and contribute to the development of data tools that are used to transform, manage, and access data, scaling with efficiency and reduced time to new data insights
Writes, implements, and validates code to test storage and availability of data platforms and drives the implementation of sustainable design patterns to make data platforms more usable and robust to failure and change
Analyzes relevant data sources that allow others to develop insights into data architecture designs or solution fixes
Collaborates with appropriate stakeholders across teams and escalates concerns around data requirements by assessing and conducting feature estimation
Assesses data costs, access, usage, use cases, dependencies across products, and availability for business or customer scenarios related to one or more product features
Informs clients on feasibility of data needs and suggests transformations or strategies to acquire data if requirements cannot be met
Negotiates agreements with partners and system owners to align on project delivery, data ownership between both parties, and the shape and cadence of data extraction for one or more features
Proposes new data metrics or measures to assess data across varied service lines
Defines data source contracts (e.g., nature of data, data schemas, data latency, data availability, data privacy, ethical use of data)
Performs root cause analysis in response to detected problems/anomalies to identify the reason for alerts and implement solutions that minimize points of failure
Implements and monitors self-healing processes across multiple product features to prevent issues from recurring in the future and retain data quality and optimal performance (e.g., latency, cost) throughout the data lifecycle
Uses cost analysis to drive product/program level solutions that reduce budgetary risks
Documents the problem and associated solutions through postmortem reports and shares insights with team and the customer
Provides data-based insights into the health of data products owned by the team according to SLAs across multiple features
Implements and follows both agile and data operations (DataOps) practices
Maintains involvement with, and awareness of, current and upcoming data engineering practices (e.g., tools, technology) through Microsoft's internal data community with the purpose of connecting into a data mesh
Writes code to implement performance monitoring protocols across data pipelines
Builds visualizations and smart aggregations (e.g., advanced statistics) to monitor issues with patterns in data quality and pipeline health that could threaten pipeline performance
Develops and updates troubleshooting guides (TSGs) and operating procedures for reviewing, addressing, and/or fixing advanced problems/anomalies flagged by automated testing
Supports and monitors platforms, analyzing telemetry data to understand the health of the systems and takes proactive steps for live site improvement

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Nice to have

Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6+ years experience in business analytics, data science, software development, data modeling, or data engineering OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 8+ years experience in business analytics, data science, software development, data modeling, or data engineering OR equivalent experience
2+ years experience with data governance, data compliance and/or data security
2+ years experience with Azure Data Factory and Azure Data Explorer
2+ years experience with Azure Data Lake
