As a Data Engineer, you’ll build and refine the pipelines, data models, and services that make Microsoft’s research content discoverable and usable across modern applications and emerging AI scenarios. You’ll define the architecture and core data systems largely from scratch, creating the foundation for both web and AI-powered experiences. You’ll use existing AI models and out-of-the-box Microsoft tools to turn unstructured content into structured, high-quality data; the work is practical, model-assisted enrichment rather than heavy ML research. You’ll partner with a PM and a full stack engineer while independently driving backend and data direction. The work begins with an initial infrastructure lift to establish the environment, followed by iterative development as the MVP grows.
Job Responsibilities:
Build and maintain core data pipelines:
Build and maintain end-to-end ingestion pipelines for documents, datasets, code repositories, videos, transcripts, and internal knowledge sources
Clean, normalize, structure, and store data in formats that support both web applications and AI-driven use cases
Use out-of-the-box Microsoft tools, such as Fabric, Azure services, Cosmos DB, or Copilot Studio, to create reliable, maintainable systems
Enrich and model research data:
Use AI models to transform unstructured content into structured metadata and durable knowledge assets
Design the architecture and foundational data systems, establishing the patterns and infrastructure for a new, scalable environment
Develop and refine embeddings, vector indexes, and retrieval components to support semantic search and grounding scenarios
Build backend and data services:
Build data services, APIs, and backend components that power internal applications and agent-supported workflows
Iterate on systems after the initial MVP, improving reliability, performance, and scalability over time
Collaborate and translate requirements:
Collaborate with a PM and full stack engineer to understand requirements and translate them into actionable data solutions
Work cross-functionally to define data needs and align systems with downstream consumers and discovery workflows
Requirements:
Proven ability to design and build end-to-end data systems, from ingestion through cleaning, structuring, storage, and serving
Experience building and shipping data products that deliver practical value
Demonstrated impact using AI models in data workflows (applied use, not ML research)
5+ years of software or data engineering experience, including at least 2 years of hands-on work with data pipelines
Comfortable defining architecture and starting systems from scratch, working independently in a small cross-functional team
Proficiency in Python, SQL, or similar languages used in data engineering workflows
Nice to have:
Experience with Microsoft Fabric, Cosmos DB, Azure data services, or Copilot Studio
Background building data that supports embeddings, semantic search, or retrieval use cases
Familiarity with metadata frameworks, taxonomies, or knowledge modeling
Experience shaping ambiguous information into structured datasets and iterating quickly after an MVP
What we offer:
Flexible time-off plan
100% employer-paid medical, dental, and vision insurance
Employer-paid life insurance for those enrolled in medical coverage
401(k) plan with company match
Fertility, surrogacy, and adoption benefits
Fitness and caregiver benefits
Employee Assistance Program
100% employer-paid short- and long-term disability coverage