You will architect and deliver high‑performance data ingestion and AI‑ready pipelines across AMD’s lakehouse environment (Apache Iceberg). Your work will shape how enterprise data connects, relates, and becomes useful for analytics and AI/ML. You’ll partner with platform engineers, data scientists, and technical leaders to curate AMD‑specific corpora, unify signals across storage, network, and operations, and operationalize models that infer root causes and improve decision‑making.
Job Responsibilities:
Build AI‑ready data pipelines: Design scalable ingestion and transformation workflows for structured and unstructured data in Apache Iceberg, enabling schema evolution, partitioning, and performance optimization (see the Iceberg sketch after this list)
Model data relationships across platforms: Connect signals across storage, network, and applications to enable cross‑system anomaly tracing and contextual insights
Operationalize anomaly detection and inference: Implement pipelines/services that detect anomalies, enrich events, route signals to LLMs/SLMs, and surface probable root causes (see the anomaly‑scoring sketch after this list)
Enable LLM/SLM use cases on AMD data: Curate domain‑specific corpora, define retrieval/feature pipelines, and support training, fine‑tuning, and evaluation workflows (see the retrieval sketch after this list)
Deliver batch and streaming pipelines at scale: Build near real‑time and scheduled processing using Spark or similar engines with a focus on reliability and low latency (see the streaming sketch after this list)
Ensure governance, lineage, and quality: Apply metadata, lineage, and data‑quality checks across pipelines; maintain CI/CD processes and versioned codebases (see the quality‑gate sketch after this list)
Lead platform stewardship: Operate and troubleshoot Linux‑based environments; monitor lakehouse performance and cost
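To make the first responsibility concrete, here is a minimal sketch of creating a partitioned Apache Iceberg table and evolving its schema from PySpark. The catalog name (lakehouse), warehouse path, and the ops.telemetry table with its columns are hypothetical placeholders, not AMD’s actual configuration.

```python
# Minimal Iceberg sketch: hypothetical catalog "lakehouse" and table
# "ops.telemetry"; catalog/warehouse settings are deployment-specific.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-ingest")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.lakehouse",
            "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lakehouse.type", "hadoop")
    .config("spark.sql.catalog.lakehouse.warehouse", "/tmp/warehouse")
    .getOrCreate()
)

# Hidden partitioning: Iceberg derives the partition from event_ts, so
# writers and queries never manage partition columns by hand.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lakehouse.ops.telemetry (
        event_ts TIMESTAMP,
        host     STRING,
        metric   STRING,
        value    DOUBLE
    )
    USING iceberg
    PARTITIONED BY (days(event_ts))
""")

# Schema evolution: adding a column is a metadata-only change, with no
# rewrite of existing data files.
spark.sql("ALTER TABLE lakehouse.ops.telemetry ADD COLUMN source STRING")
```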
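For the anomaly‑detection responsibility, a rolling z‑score is one simple stand‑in for whatever detection logic production would actually use. This sketch reuses the Spark session and hypothetical telemetry table from the Iceberg example above.

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Score each reading against the trailing 100 readings for the same
# host/metric pair; |z| > 3 flags a candidate anomaly.
w = (Window.partitionBy("host", "metric")
     .orderBy("event_ts")
     .rowsBetween(-100, -1))

scored = (
    spark.table("lakehouse.ops.telemetry")
    .withColumn("mu", F.avg("value").over(w))
    .withColumn("sigma", F.stddev("value").over(w))
    .withColumn("z", (F.col("value") - F.col("mu")) / F.col("sigma"))
)

# Candidate anomalies would then be enriched with context and routed to
# an LLM/SLM to propose probable root causes.
anomalies = scored.filter(F.abs(F.col("z")) > 3.0)
```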
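For the LLM/SLM responsibility, retrieval is the piece most easily sketched. The example below uses TF‑IDF purely as a lightweight stand‑in for a vector store with learned embeddings; the corpus snippets are invented, not an AMD corpus.

```python
# Toy retrieval step for a RAG-style workflow; TF-IDF stands in for
# learned embeddings, and the documents are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "switch port errors spiked after the firmware update",
    "storage latency increased during the nightly backup window",
    "application timeouts correlate with NFS mount failures",
]

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(corpus)

def retrieve(query: str, k: int = 2) -> list[tuple[str, float]]:
    """Return the k corpus snippets most similar to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_vecs).ravel()
    return [(corpus[i], float(scores[i])) for i in scores.argsort()[::-1][:k]]

# The retrieved snippets would be packed into the model's prompt.
print(retrieve("why is storage slow at night?"))
```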
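For the streaming responsibility, a Structured Streaming job from Kafka into the Iceberg table above is one typical shape. The broker address, topic name, and checkpoint path are placeholders.

```python
# Kafka -> Iceberg streaming sketch; requires the spark-sql-kafka and
# iceberg-spark-runtime packages on the classpath. Broker, topic, and
# checkpoint path are placeholders.
from pyspark.sql import functions as F
from pyspark.sql.types import (DoubleType, StringType, StructType,
                               TimestampType)

schema = (StructType()
          .add("event_ts", TimestampType())
          .add("host", StringType())
          .add("metric", StringType())
          .add("value", DoubleType()))

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "telemetry-raw")
       .load())

parsed = (raw
          .select(F.from_json(F.col("value").cast("string"), schema).alias("r"))
          .select("r.*"))

# Checkpointing gives exactly-once appends into the Iceberg table.
query = (parsed.writeStream
         .format("iceberg")
         .outputMode("append")
         .option("checkpointLocation", "/tmp/checkpoints/telemetry")
         .toTable("lakehouse.ops.telemetry"))
query.awaitTermination()
```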
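Finally, for the governance and quality responsibility, the gate below shows the spirit of a data‑quality check over the same hypothetical table; in practice a framework such as Great Expectations or dbt tests, wired into CI/CD, would replace hand‑rolled assertions.

```python
# Hand-rolled quality gate over the hypothetical telemetry table; a real
# pipeline would run framework-managed checks from CI/CD instead.
from pyspark.sql import functions as F

df = spark.table("lakehouse.ops.telemetry")

# Completeness: no row may be missing its host identifier.
null_hosts = df.filter(F.col("host").isNull()).count()
assert null_hosts == 0, f"{null_hosts} rows missing host"

# Freshness: an empty table usually means upstream ingestion is down.
latest = df.agg(F.max("event_ts")).first()[0]
assert latest is not None, "telemetry table is empty"
```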
Requirements:
Delivered production‑grade AI solutions (e.g., anomaly detection, inference, predictive analytics)
Hands‑on work with LLMs/SLMs, including training, fine‑tuning, evaluation, or retrieval‑augmented workflows
Strong Python and SQL skills
Experience with Spark and the Hadoop ecosystem
Building tables on Iceberg (or a similar table format), including schema evolution and performance tuning
Designing data models that relate signals across storage, networking, and operations
Experience with data governance, metadata/lineage, data‑quality automation, and CI/CD for data/ML
Linux systems expertise, shell scripting, and performance troubleshooting
Bachelor’s or Master’s degree in Computer Science, Software Engineering, Data Engineering, or a related field preferred