In this role, you will lead sustaining engineering and performance improvements across Teradata’s memory and execution systems, ensuring correctness, reliability, and customer trust under complex, large-scale workloads. You will focus on diagnosing and resolving memory-driven system behaviors that surface in customer-impacting scenarios, particularly under AI-driven and high-concurrency workloads.
Job Responsibilities
Lead sustaining engineering and performance improvements across Teradata’s memory and execution systems, ensuring correctness, reliability, and customer trust under complex, large-scale workloads
Focus on diagnosing and resolving memory-driven system behaviors that manifest in customer-impacting scenarios, particularly under AI-driven and high-concurrency workloads
Drive long-term improvements by identifying systemic gaps in memory behavior and influencing design, validation, and operational practices across the platform
Own end-to-end resolution and prevention of classes of customer-impacting issues related to memory pressure, allocation, and execution-path behavior
Lead cross-team technical investigations involving SQLE, PDE, OS, networking, infrastructure, and performance teams to resolve complex, system-wide issues
Influence engineering decisions through deep technical insights derived from real-world system behavior under memory pressure
Diagnose and resolve complex system behaviors driven by memory pressure, resource contention, and cross-layer execution dynamics
Identify and resolve cases that appear as SQL Engine issues but originate in memory or execution-layer interactions
Provide PDE-adjacent L4 triage and escalation support, particularly for high-severity incidents
Drive root cause analysis and translate field learnings into systemic engineering improvements
Drive collaboration with System Performance Engineering to address cross-layer memory and concurrency challenges
Develop and scale memory-focused diagnostic playbooks and best practices across the organization
Deliver and drive targeted code fixes and enhancements in memory- and PDE-adjacent execution paths when required
Requirements
10+ years of experience in distributed systems, database engines, performance engineering, or large-scale data platforms
Strong expertise in diagnosing complex, cross-layer issues involving memory, OS, networking, infrastructure, concurrency, and execution behavior in production environments
Deep understanding of system internals (e.g., SQLE, PDE, or equivalent execution/runtime environments)
Demonstrated ability to lead technical investigations and influence engineering decisions across teams
Strong problem-solving and communication skills in cross-functional engineering environments
Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field
Nice to have
Experience with OS-level diagnostics and understanding of system resource behavior (e.g., memory, processes, I/O, scheduling)
Strong scripting skills (e.g., Python) to develop diagnostic tools, automate analysis, and process large-scale system data
Familiarity with networking and cloud infrastructure concepts to support end-to-end diagnosis of distributed system behavior
Experience with database execution internals or platform runtime systems
Strong background in memory behavior, resource management, and performance under concurrency
Experience working on high-severity production incidents (Sev1/Sev2)
Familiarity with performance analysis, observability tools, and diagnostic frameworks
Understanding of spill behavior, caching strategies, and execution-path optimization
Ability to distinguish between surface symptoms and underlying system causes (e.g., SQL vs. memory vs. execution)
Experience translating field issues into actionable engineering improvements
Experience working across global teams and influencing complex engineering organizations
Ability to stay current with emerging trends in engineering, AI, and cloud-native technologies