Member of Technical Staff - Data Platform

Microsoft Corporation

Location:
United States, Mountain View

Contract Type:
Not provided

Salary:

119800.00 - 234700.00 USD / Year

Job Description:

If you are excited by the challenge of designing distributed systems that process petabytes of data for the world's most advanced AI models, this is your team. We are not looking for someone to just write queries or maintain legacy pipelines. We are looking for Systems Builders—engineers who understand the internals of distributed compute, who treat data infrastructure as a product, and who want to architect the backbone of Microsoft Copilot. Join us to build the "Paved Road" for AI. You will own the platform that transforms raw, massive-scale signals into the fuel that powers training, inference, and evaluation for millions of users. We need someone who is energized by solving hard problems in stream processing, lakehouse architecture, and developer experience.

Job Responsibility:

  • Core Platform Engineering: Design and build the underlying frameworks (based on Spark/Databricks) that allow internal teams to process massive datasets efficiently
  • Distributed Systems Architecture: Modernize our data stack by moving from batch-heavy patterns to event-driven architectures
  • Unstructured AI Data Pipelines: Architect high-throughput pipelines capable of processing complex, non-tabular data (documents, code repositories, chat logs) for LLM pre-training, fine-tuning, and evaluation datasets
  • AI Feedback Loops: Engineer the high-throughput telemetry systems that capture user interactions with Copilot
  • Infrastructure as Code: Treat the data platform as software. Define and deploy all storage, compute, and networking resources using IaC (Bicep/Terraform)
  • Data Reliability Engineering: Move beyond simple "validation checks" to build automated governance and observability systems
  • Compute Optimization: Deep-dive into query execution plans and cluster performance. Optimize shuffle operations, partition strategies, and resource allocation

Requirements:

  • Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 3+ years experience in business analytics, data science, software development, data modeling, or data engineering OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years experience in business analytics, data science, software development, data modeling, or data engineering OR equivalent experience
  • Proficiency in Python, Scala, Java, or Go
  • Deep Distributed Systems Knowledge: Demonstrated technical understanding of massive-scale compute engines (e.g., Apache Spark, Flink, Ray, Trino, or Snowflake)
  • Experience architecting Lakehouse environments at scale (using Delta Lake, Iceberg, or Hudi)
  • Experience building internal developer platforms or "Data-as-a-Service" APIs
  • Strong background in streaming technologies (Kafka, Azure EventHubs, Pulsar) and stateful stream processing
  • Experience with container orchestration (Kubernetes) for deploying data applications
  • Experience enabling AI/ML workloads (Feature Stores, Vector Databases)

Nice to have:

  • Bachelor's or Master's Degree in Computer Science, Software Engineering, or related technical field
  • 4+ years of experience in Software Engineering or Data Infrastructure

Additional Information:

Job Posted:
February 13, 2026

Employment Type:
Fulltime
Work Type:
On-site work

Similar Jobs for Member of Technical Staff - Data Platform

Member of Technical Staff - Platform Engineer

Platform Engineer to join our team building backend infrastructure for new ML-po...
Location:
United States, Palo Alto
Salary:
175000.00 - 350000.00 USD / Year
Inflection AI
Expiration Date:
Until further notice
Requirements:
  • Backend engineering experience with Python, TypeScript, or Node.js
  • Hands-on experience working with production PyTorch models, model checkpoints, and inference logic
  • Strong knowledge of building APIs and services that are scalable, stable, and secure
  • Passion for bridging backend engineering and ML systems, especially at the infrastructure layer
  • Familiarity with tools such as FastAPI, Postgres, Redis, Kubernetes, and React
  • Desire to be hands-on and contribute to shaping the foundation of a new enterprise ML product
  • Bachelor’s degree or equivalent in a field related to the position
Job Responsibility:
  • Build and maintain backend services to support LLM integration, inference orchestration, and data flow
  • Write clean, reliable Python code for experimentation, model integration, and production systems
  • Collaborate closely with ML researchers to rapidly iterate on product ideas and deploy features
  • Design and implement infrastructure to handle scalable inference workloads and enterprise-level use cases
  • Own system components and ensure reliability, observability, and maintainability from day one
What we offer:
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Competitive stock options

Member of Technical Staff - Backend Engineer

Endor Labs is building the Application Security platform for the software develo...
Location:
Netherlands
Salary:
70000.00 - 100000.00 EUR / Year
Endor Labs
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or higher degree in engineering, with 6–8 years of experience building scalable backends for product/SaaS companies
  • At least 3 years of experience in Golang programming, with a focus on microservices and distributed architecture
  • Practical experience designing APIs with one or more frameworks (gRPC [preferred], REST, GraphQL, Thrift, etc.)
  • Affinity with modern AI platforms (OpenAI, Google Gemini, LangChain, etc.)
  • Ability to build and design technical solutions from scratch, with code and documentation that exemplify best practices at Endor
  • Scalable distributed systems experience—understanding microservices, domain-driven design, load balancing, horizontal/vertical scaling, and stateless architectures
  • Strong architectural knowledge, with a keen eye for scalable and extensible systems. Able to apply data-driven techniques to evaluate and recommend architectural choices
  • Ability to discuss trade-offs between architectural decisions and influence teams toward the right direction
  • Comfort working in a fast-moving environment with evolving requirements
  • Creative and independent problem-solving skills, especially in uncharted or ambiguous contexts
Job Responsibility:
  • Work closely with the R&D team to help integrate novel solutions and scale them to production
  • Design and implement AI-first platforms
  • Have the autonomy and responsibility to design and implement high-quality features used by customers
  • Lead and contribute to large-scale technical projects, ensuring scalability, reliability, and performance
  • Design, architect, and build features end-to-end—including unit and integration tests—while working closely with Product Management and our distributed engineering team
Employment Type:
Fulltime

Member of Technical Staff, Cloud Infrastructure

As a Software Engineer on our Cloud Infrastructure team, you'll be at the forefr...
Location:
United States, New York, NY; San Mateo, CA; Redwood City, CA
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 5+ years of experience designing and building backend infrastructure in cloud environments (e.g., AWS, GCP, Azure)
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, TensorFlow, Vertex AI, SageMaker, Kubernetes, etc.)
  • Strong software development skills in languages like Python or C++
  • Deep understanding of distributed systems fundamentals: scheduling, orchestration, storage, networking, and compute optimization
Job Responsibility:
  • Architect and build scalable, resilient, and high-performance backend infrastructure to support distributed training, inference, and data processing pipelines
  • Lead technical design discussions, mentor other engineers, and establish best practices for building and operating large-scale ML infrastructure
  • Design and implement core backend services (e.g., job schedulers, resource managers, autoscalers, model serving layers) with a focus on efficiency and low latency
  • Drive infrastructure optimization initiatives, including compute cost reduction, storage lifecycle management, and network performance tuning
  • Collaborate cross-functionally with ML, DevOps, and product teams to translate research and product needs into robust infrastructure solutions
  • Continuously evaluate and integrate cloud-native and open-source technologies (e.g., Kubernetes, Ray, Kubeflow, MLFlow) to enhance our platform’s capabilities and reliability
  • Own end-to-end systems from design to deployment and observability, with a strong emphasis on reliability, fault tolerance, and operational excellence
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
Employment Type:
Fulltime

Member of Technical Staff, AI Training Infrastructure

As a Training Infrastructure Engineer, you'll design, build, and optimize the in...
Location:
United States, San Mateo
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience
  • 3+ years of experience with distributed systems and ML infrastructure
  • Experience with PyTorch
  • Proficiency in cloud platforms (AWS, GCP, Azure)
  • Experience with containerization and orchestration (Docker, Kubernetes)
  • Knowledge of distributed training techniques (data parallelism, model parallelism, FSDP)
Job Responsibility:
  • Design and implement scalable infrastructure for large-scale model training workloads
  • Develop and maintain distributed training pipelines for LLMs and multimodal models
  • Optimize training performance across multiple GPUs, nodes, and data centers
  • Implement monitoring, logging, and debugging tools for training operations
  • Architect and maintain data storage solutions for large-scale training datasets
  • Automate infrastructure provisioning, scaling, and orchestration for model training
  • Collaborate with researchers to implement and optimize training methodologies
  • Analyze and improve efficiency, scalability, and cost-effectiveness of training systems
  • Troubleshoot complex performance issues in distributed training environments
What we offer:
  • Meaningful equity in a fast-growing startup
  • Comprehensive benefits package
Employment Type:
Fulltime

Member of Technical Staff, Infrastructure Data & Analytics

We are seeking experienced Infrastructure Data & Analytics Engineers to join our...
Location:
United States, Multiple Locations; Mountain View; San Francisco Bay area; New York City metropolitan area
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in computer science or related technical field AND 8+ years of technical engineering experience in data engineering, analytics, or data science, with increasing technical ownership in a startup environment, AND 6+ years of experience with distributed data processing frameworks and large-scale data systems, OR equivalent experience
  • Master's Degree in Computer Science or related technical field AND 12+ years of technical engineering experience in data engineering, analytics, or data science, with increasing technical ownership in a startup environment, AND 10+ years of experience with distributed data processing frameworks and large-scale data systems, OR equivalent experience
  • Proven technical leadership in data engineering, analytics platforms, or large-scale telemetry systems
  • Hands-on experience with ETL orchestration frameworks such as Airflow, Dagster, or similar
  • Strong communication skills; can explain complex systems clearly to senior leaders
Job Responsibility:
  • Act as the technical lead and owner for infrastructure analytics across compute, storage, and networking
  • Design and build durable, scalable data pipelines that ingest telemetry from clusters, schedulers, health systems, and capacity trackers into the data warehouse
  • Define and standardize core metrics and semantics (e.g., utilization, occupancy, MFU, goodput, capacity readiness, delivery-to-production)
  • Architect and maintain self-service dashboards and APIs for fleet, cluster, and squad-level visibility
  • Partner closely with stakeholders across Supercomputing Infra, Researchers, Strategy and Executives to ensure metrics reflect operational and business reality
  • Implement robust and fault-tolerant systems for data ingestion and processing
  • Lead data architecture and engineering decisions, applying strong technical judgment to proactively shape executive-level discussions and decisions
  • Identify data gaps and instrumentation issues; drive fixes by influencing upstream engineering teams
  • Establish data quality, validation, documentation, and governance so metrics are trusted and repeatable
Employment Type:
Fulltime

Staff Machine Learning Engineer

Join PagerDuty as a Staff Machine Learning Engineer to tackle complex problems, ...
Location:
Canada, Toronto
Salary:
156000.00 - 232000.00 CAD / Year
PagerDuty
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience building, designing, and evolving data architecture for large-scale systems
  • Excellent communication skills
  • Experience working with Product teams, ensuring and driving a timely delivery
  • Have a deep understanding of the trade-offs to be considered when designing and delivering machine learning solutions to production
  • Experience leading cross-team architecture discussions, building technical prototypes, and driving the adoption of best practices across diverse teams
  • Demonstrated experience with data engineering processes, working with unstructured data and cloud-based data infrastructures
  • Passionate about ML engineering and interested in driving discussions with stakeholders and executives
Job Responsibility:
  • Build and improve the capabilities of the data platform that enable and accelerate the production of ML/AI-based solutions
  • Drive and define standards for AI/ML across the organization
  • Provide guidance, technical leadership, and mentoring to other members of the team
  • Mentor junior members and participate in scaling up the existing team
  • Proactively recommend improvements and new approaches addressing potential systemic pain points and technical debt
  • Anticipate technical demands on the data platform based on the organization’s roadmap and systematically drive the evolution of the architecture toward those ends
  • Develop a long-term plan for ML/AI investments
What we offer:
  • Competitive salary
  • Comprehensive benefits package from day one
  • Flexible work arrangements
  • Company equity
  • ESPP (Employee Stock Purchase Program)
  • Retirement or pension plan
  • Generous paid vacation time
  • Paid holidays and sick leave
  • Dutonian Wellness Days & HibernationDuty - companywide paid days off in addition to PTO
  • Paid parental leave: 22 weeks for pregnant parent, 12 weeks for non-pregnant parent
Employment Type:
Fulltime

Specialist, Data Visualization

The Data Visualization Specialist reports directly to the Data Visualization Man...
Location:
United States, New York
Salary:
Not provided
Amsive
Expiration Date:
Until further notice
Requirements:
  • 1-2 years working experience in a marketing-focused data role involving a data visualization platform (Datorama, Looker, Tableau, PowerBI, etc.)
  • Bachelor’s degree or equivalent work experience
  • Experience with Datorama, Looker, PowerBI, or Tableau, or some other data visualization platform (Datorama preferred)
  • Analytical and problem-solving skills
  • Ability to work independently and with team members from different backgrounds
  • Self-starter and eager to learn new skills
  • An understanding of various methods for visualizing analysis and presenting data in a way that can be easily consumed by less technical staff
  • Intermediate Excel skills; ability to integrate disparate data sets into clear, concise reports
  • Ability to handle multiple projects and prioritize responsibilities
Job Responsibility:
  • Build out new dashboards and update existing dashboards in Datorama, Looker, and PowerBI
  • Learn all associated platforms that feed into the client dashboards
  • Handle all client requests correctly and on time
  • Work with your manager and internal marketing teams to determine what we should track to develop reporting & data visualizations that meet their needs
  • Work closely with Ad Ops team to ensure tagging implemented supports reporting needs
  • Where necessary, problem-solve for API connection issues, tracking and attribution disagreements between ad platforms, analytics platforms, and CRM systems (in conjunction with Ad Ops), and load time improvements
Employment Type:
Fulltime

Member of Technical Staff, Research Tooling & Data Platform

We're looking for an engineer to own Runway's internal exploratory data analysis...
Location:
United States
Salary:
240000.00 - 290000.00 USD / Year
Runway
Expiration Date:
Until further notice
Requirements:
  • 4+ years of industry experience in a backend focused software engineering role
  • Strong experience in at least 2 of 3 areas (platform/infrastructure, ML domain knowledge, frontend/product engineering) with eagerness to learn the third
  • Platform/infrastructure: experience with vector databases, cloud primitives (e.g., SQS, ECR, Kinesis) and container orchestration (Kubernetes, ECS)
  • ML domain knowledge: Understanding of ML workflows, model training, evaluation, testing, dataset management, feature engineering, or research tooling
  • Product engineering: Ability to build clean, intuitive user experiences with product thinking and user empathy. You care deeply about building tools people love to use (TypeScript/React experience is a plus)
  • Comfortable setting up and maintaining production infrastructure and services
  • Self-starter who can navigate ambiguity and make pragmatic technical decisions
  • Humility and open-mindedness
Job Responsibility:
  • Own the EDA platform end-to-end: Take full ownership of architecture, infrastructure, feature development, and operations
  • Optimize for scale: Improve query performance and write efficiency for vector search, integrate with new data warehouses, and optimize our custom query parsing/suggestion system
  • Build for researchers: Design and ship features that help ML researchers source data faster, run more effective evaluations, and iterate quickly
  • Enable cross-functional users: Work with design, product, and creative teams to build intuitive evaluation workflows
  • Manage infrastructure: Deploy and maintain services across ECS and Kubernetes, including embedding services and database integrations
  • Provide support: Be responsive to user needs, debug issues quickly, and gather feedback to prioritize improvements
Employment Type:
Fulltime