Senior Software Engineer - Real-Time Workflows & ML Serving Job at Microsoft Corporation (Bangalore)

Job Description

Modern ads platforms run on always-on, real-time data: streaming events, feature computation, near-real-time aggregations, and low-latency serving, all of which power ML models that operate at massive scale under strict freshness, cost, and reliability requirements. Microsoft Ads builds and operates large-scale, latency-sensitive systems that serve billions of requests.

We are looking for a Senior Software Engineer who is hands-on with production coding and system design to build the real-time data pipelines and feature/embedding materialization systems that feed online stores/caches and integrate tightly with ML inference serving.

This role is ideal for engineers who enjoy building robust streaming and ETL systems (correctness, idempotency, backfills, late data), owning SLOs with strong observability and operational maturity, and optimizing end-to-end performance and cost across compute, storage, and serving integrations. Primary success metrics are freshness, correctness, latency, reliability, and cost in production.

Job Responsibilities

Design and implement real-time streaming ETL / feature pipelines (e.g., Flink or Spark Structured Streaming) that meet strict freshness and correctness constraints
Build and operate reliable messaging and ingestion with Kafka/Pulsar (partitioning strategy, retries, ordering guarantees, DLQs, backpressure handling)
Own data contracts between producers, pipelines, and consumers: schema evolution, versioning, compatibility, validation, and safe rollout
Implement production-grade backfill/replay workflows
Define and meet SLOs using OpenTelemetry/Prometheus/Grafana for metrics, tracing, dashboards, alerting, and incident response readiness
Integrate pipelines with online stores/caches and ML consumers (feature stores, embedding pipelines, LLM API calls, online/offline consistency patterns)
Partner with applied scientists on feature/embedding definitions, validation, and end-to-end quality measurement
Optimize end-to-end performance and efficiency: CPU/memory/I/O, serialization, caching, network overhead, concurrency, and pipeline compute cost
Contribute to serving/inference integrations where needed (e.g., Triton/ONNX Runtime/TensorRT) including batching and latency/cost tradeoffs
Ship safely with CI/CD, automated testing (unit/integration/data quality), and operational playbooks/runbooks

Requirements

Bachelor’s or Master’s degree in Computer Science, Electrical/Computer Engineering, or a related field, with 6+ years of related experience
Strong programming skills in at least one of C++, C#, or Python
Hands-on experience in one or more of the following: building and operating streaming data pipelines in production (Flink or Spark Structured Streaming); distributed systems engineering with strong reliability and operational rigor; messaging systems such as Kafka/Pulsar
Experience operating services with Kubernetes/containers and production readiness practices (deployments, scaling, rollbacks)
Experience with observability stacks such as OpenTelemetry, Prometheus, Grafana

Nice to have

Experience with feature stores, embedding pipelines, and online/offline consistency (freshness guarantees, correctness validation)
Experience with data lakehouse/table formats and optimizations (e.g., partitioning, compaction, and incremental processing)
Experience with GPU inference serving (Triton, ONNX Runtime/TensorRT) and performance techniques (batching, request shaping, tail-latency reduction)
Background in cost/performance modeling, capacity planning, and reliability improvements for high-scale data platforms
Experience in Ads/search/recommendations or other high-scale systems where freshness, latency, and cost are important
