Sovereign AI Field Application Engineer Job at AMD

Sr Principal Site Reliability Engineer (Sovereign Cloud)

The Prisma Access team is seeking a seasoned Principal Site Reliability Engineer...

Location

Bulgaria, Sofia

Salary:

Not provided

Palo Alto Networks

Expiration Date

Until further notice

Requirements

10+ years of experience in Infrastructure, SRE, or DevOps roles
BS or MS in Computer Science, a related field, or equivalent professional experience
7+ years of experience with GCP, and expertise in its architecture, services and PKI concepts for cloud security
Expert troubleshooting skills to resolve cloud infrastructure and service issues, identifying root causes and devising effective solutions
Proficiency in automation using Python and shell scripting
Expertise in Infrastructure as Code (IaC) with Terraform and Helm, leveraging AI tools for development
Solid experience with Kubernetes, container networking, and container workloads
Strong Linux administration skills
Proficiency with CI/CD pipelines, GitOps principles, and tooling like GitLab and Jenkins
Excellent written and verbal communication skills, with the ability to collaborate effectively to drive outcomes

Job Responsibility

Design, build, and operate reliable, secure Cloud infrastructure across multi-cloud environments for our sovereign customers
Lead cross-functional initiatives to ensure applications are production-ready, scalable, secure, and resilient
Develop expertise in new technologies, embracing continuous learning and the adoption of AI tools
Develop tools and automation frameworks, championing Infrastructure as Code (IaC) and Monitoring as Code (MaC) principles
Automate robust deployments and orchestrate end-to-end monitoring and alerting solutions
Participate in on-call rotations to support critical business and production systems
Lead root cause analysis of critical issues, driving improvements and preventing recurrence
Champion the success of SRE and DevOps initiatives, aligning technical decisions with business goals

Fulltime

Senior Azure ACE Engineer - Sovereign Cloud

As a customer focused Advanced Cloud Engineer, you are the primary engineering c...

Location

Ireland, Dublin

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Engineering, Computer Science, Information Technology (IT), Data Analytics/Science, Artificial Intelligence (AI), or related field AND relevant experience in technology industry, cloud, technical support, and/or customer experience engineering OR equivalent experience.
Demonstrated experience supporting and troubleshooting enterprise level, mission-critical applications and infrastructure resolving complex issues/situations and driving technical resolution across cross-functional organizations.
Familiarity with operating in regulated or restricted‑access environments, such as sovereign cloud, high‑security, or compliance‑driven workloads is an advantage.
Experience with being on-call and driving mitigation for mission critical incidents.
Demonstrated hands-on experience and expertise in one or more of the following cloud technologies:
Core IaaS: Compute (Windows/Linux), Storage, Networking, Kubernetes, High Availability
Data Platform and Big Data: Azure SQL DB, Cosmos DB, PostgreSQL on Azure, Azure Databricks, Azure Data Factory, AI/ML
Azure PaaS Services: App Services, Azure Functions, Redis Cache, and Event Hub.
Monitoring technologies: Azure Monitor, Log Analytics, Grafana, Datadog, Confluent and similar technologies.
Communication skills: ability to empathize with customers and convey confidence. Able to explain highly technical issues to varied audiences. Able to prioritize and advocate customers’ needs to the proper channels. Take ownership and work towards a resolution.
Customer Obsession: Passion for customers and focus on delivering the right customer experience.

Job Responsibility

With minimal oversight, track customer incidents, engage with strategic customers and partners to understand issues, contribute to troubleshooting through diagnostics, communicate progress and next steps to customers with a focus on reducing time taken to mitigate critical incidents.
Use engineering and support tools, customer telemetry and/or direct customer input to detect and flag issues in the products or with the customer usage of the products.
Help customers stay current with best practices by sharing content.
Identify and leverage developmental opportunities across product areas and business processes (e.g., mentorships, shadowing, trainings) for professional growth and to develop technical skills to resolve customer issues.
With minimal guidance, serve as a connecting point between the product team and customers throughout the engagement life cycle, engage with customers to understand their business and availability needs, develop and offer proactive guidance on designing configurations and deploying solutions on Azure with support from subject matter experts.
Handle critical escalations on customer issues from the customer or support or field teams, conduct impact analysis, help customers with answers to their technical questions, and serve as an escalation resource in areas of subject matter expertise.
Conduct in-depth root cause analysis of issues, translate findings into opportunities for improvement, and track and drive them as repair items.
Act as the voice of customers and channel product feedback from strategic customers to product groups. Identify customer usage patterns and drive resolutions on recurring issues with product groups. Close the feedback loop with the customers on product features.
With minimal guidance, partner with other teams (e.g., program managers, software engineers, product, customer service support teams), prioritize, unblock, and resolve critical customer issues.
Collaborate with stakeholders to support delivery of solutions to strategic customers and resolve customer issues.

Fulltime

Principal AI Factory Solution Product Manager

Product Manager - AI Factory Solution

Location

United States, Spring

Salary:

152000.00 - 349000.00 USD / Year

Hewlett Packard Enterprise

Expiration Date

July 27, 2026

Requirements

Bachelor's degree in Computer Science, Engineering, Business, or a related field
MBA or advanced degree preferred
10+ years of product management experience, with at least 5 years focused on AI/ML products or solutions
Demonstrated ability to build large-scale AI solutions that bring together hardware, software and services into a cohesive offering
Strong understanding of AI technologies, including AI/ML lifecycle (training, tuning, inferencing), large language models, computer vision, and cloud-based AI platforms (e.g., AWS SageMaker, Microsoft AzureML, Google AI)
Proven track record of launching successful AI products, with experience in agile methodologies and tools like Jira
Background in High Performance Computing (HPC) and experience blending it with AI workloads will be an advantage
Excellent analytical skills, with proficiency in data analysis and market testing
Outstanding communication and stakeholder management abilities, capable of presenting to technical and non-technical audiences up to the senior executive/SVP levels
Ability to thrive in a startup-like fast-paced, innovative environment with strong problem-solving skills

Job Responsibility

Define and drive the overall AI factory at-scale and sovereign solution vision, roadmap, and features, while closely aligning with customer needs and HPE strategic goals
Define and drive the key software components necessary for the solution, which may be a mix of HPE developed, commercial and community IP
Conduct market research, competitive analysis, and customer interviews to identify AI factory opportunities and quickly validate solution ideas and software features
Collaborate with engineers, product managers and presales architects to translate requirements into technical specifications and prototypes
Oversee the software integration and end-to-end solution lifecycle, from feature ideation and MVP development to launch, iteration, and scaling
Monitor solution performance using KPIs like full-stack wins, product mix, and customer satisfaction, and iterate on the offering based on data insights
Work with legal, finance, pricing and supply chain to set up and manage resale contracts for commercial software
Partner with sales and marketing to develop go-to-market strategies, pricing models, support strategies and customer enablement materials
Ensure the solution complies with ethical AI standards and meets the highest level of data privacy and sovereignty requirements (e.g., GDPR, CCPA)
Stay abreast of AI trends, such as generative models, agentic AI, and industry applications

What we offer

Health & Wellbeing
Personal & Professional Development
Unconditional Inclusion

Fulltime

Senior Principal AI Infrastructure Architect

The Senior Principal AI Infrastructure Architect is a highly skilled and advance...

Location

Italy, Milano

Salary:

Not provided

NTT DATA

Expiration Date

Until further notice

Requirements

Significant experience in a consulting, presales or architecture role within a large-scale (preferably multi-national) technology services environment, with a track record of leading AI infrastructure pursuits
Demonstrable experience designing and delivering production AI platforms — from single multi-GPU servers through to multi-rack training clusters and inference factories
Strong working knowledge of the AI hardware vendor landscape (NVIDIA, AMD, Intel, Dell, HPE, Lenovo, Supermicro, Cisco, Pure, VAST, WEKA, DDN, NetApp) and how to position partner ecosystems competitively
Proven ability to translate AI workload requirements (model size, parameter count, sequence length, throughput SLOs, latency targets) into accurate hardware bills of materials and sizing justifications
Significant client engagement and consulting experience, including client needs assessment, change management and the ability to identify whitespace for follow-on AI infrastructure and managed-services work
Significant business development and presales experience on infrastructure-led deals, ideally including sovereign AI, AI Factory or regulated-industry GenAI programmes
Strong understanding of how AI infrastructure integrates with business processes, applications, data platforms and existing enterprise architecture
Bachelor's degree or equivalent in Information Technology, Engineering, Computer Science or a related field
Deep, hands-on knowledge of AI hardware: GPU and accelerator portfolios (NVIDIA Hopper / Blackwell, AMD MI300/MI325, Intel Gaudi 3, emerging custom silicon), host CPU platforms (Intel Xeon, AMD EPYC, NVIDIA Grace), system topologies (HGX, DGX, MGX, OAM) and how each choice maps to specific AI workloads
Strong understanding of AI-class storage: parallel filesystems, all-flash NVMe platforms, S3-class object stores, checkpoint and dataset pipelines and the I/O patterns of large-scale training and inference (VAST, WEKA, DDN EXAScaler, Pure FlashBlade, NetApp ONTAP AI, Dell PowerScale)

Job Responsibility

Lead the end-to-end design of large, complex AI infrastructure solutions — covering accelerated compute (NVIDIA H100/H200/B200 and GB200 NVL72, AMD Instinct MI300X/MI325X, Intel Gaudi 3), CPU host platforms (Intel Xeon, AMD EPYC, NVIDIA Grace), high-throughput storage tiers and lossless AI fabric — for enterprise, sovereign AI and AI Factory clients
Architect reference designs built on NVIDIA DGX/HGX SuperPOD, Dell AI Factory with NVIDIA, Cisco Nexus HyperFabric AI, HPE / Lenovo / Supermicro accelerated compute and equivalent platforms, balancing single-node performance with cluster-scale efficiency
Size and validate GPU clusters against real workloads — foundation-model pre-training, distributed fine-tuning, RAG, real-time and batch inference — using the right combination of NVLink/NVSwitch domains, InfiniBand NDR/XDR or Ultra Ethernet / NVIDIA Spectrum-X fabrics and tiered NVMe and parallel storage (VAST, WEKA, DDN, Pure FlashBlade, NetApp ONTAP AI, Dell PowerScale)
Define the supporting datacenter design: high-density power (50–140 kW/rack), direct-to-chip and rear-door liquid cooling, structured cabling for AI fabrics and modular deployment models across on-prem, colo and sovereign-cloud footprints
Work closely with the sales team to drive the presales process for AI infrastructure pursuits — client discovery, technical workshops, proposal writing, executive presentations and bid defence
Translate clients' AI ambitions and business outcomes into a hardware and platform roadmap, positioning NTT DATA's end-to-end portfolio — silicon, systems, storage, fabric, MLOps stack and managed services — to land service-led AI solutions
Lead integration of compute, storage, networking, the AI software stack (CUDA, ROCm, Triton, NIM, NVIDIA AI Enterprise, Run:ai, Slurm, Kubernetes / Kubeflow) and managed-service operating models across multiple domains, delivery units and geographies
Build business cases, TCO and unit-economics models (cost per token, cost per training run, GPU-hour economics) and end-to-end transition roadmaps for cloud-to-private AI migrations and sovereign AI deployments
Define architectural principles for AI infrastructure — accelerator utilisation, data gravity, multi-tenancy, model lifecycle, energy efficiency — and apply them to influence architectural outcomes and governance
Develop As-Is, Vision, FMO and To-Be AI platform architectures, identify gaps and develop transition roadmaps

Fulltime

Sr Cloud Solution Architect - Cloud & AI Data

Join Microsoft’s US Public Sector Industries DIB Team—where mission meets innova...

Location

United States, St. Louis

Salary:

106400.00 - 203600.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor’s Degree in Computer Science, Information Technology, Engineering, Business, Liberal Arts, or a related field AND 4+ years of experience in cloud/infrastructure technologies, IT consulting or support, systems administration, network operations, software development or support, technology solutions, architecture, or consulting OR equivalent experience
Active U.S. Government Top Secret Security Clearance
U.S. citizenship
Ability to pass Microsoft Cloud background check
Technical expertise in Azure Data Services, Synapse, Postgres, SQL, Databricks, Fabric, and Purview
Proficiency in Azure Kubernetes Service (AKS) and Azure API Management (APIM)
Exposure to DevSecOps principles and practices
Familiarity with DIB mission priorities, including compliance frameworks such as FedRAMP High, ITAR, and DFARS
Experience delivering data platform solutions in regulated or classified environments, including Azure Government, GCC High, and sovereign cloud deployments
Strong understanding of Zero Trust architecture, agentic DevOps, and secure-by-design principles for disconnected and mission-critical environments

Job Responsibility

Lead the modernization of customer data estates using Microsoft’s cloud-native services by designing and deploying secure, scalable architectures with Azure Data Services, Synapse, Fabric, and Purview
Align data platform strategies to mission outcomes, especially in regulated and classified environments
Support both commercial defense contractors and federal agencies through tailored data solutions, integrating Zero Trust principles, data governance, and compliance frameworks (e.g., ITAR, CMMC, FedRAMP)
Ensure data security across GCC, GCCH, and sovereign cloud environments and collaborate with security CSAs to deliver Purview, Information Protection, and Insider Risk Management capabilities
Be proficient in the use of Azure Application services, including Azure AI Gateway, Azure AI Foundry, Azure Kubernetes Service (AKS), and GitHub Copilot
Support Azure Commercial, Government, Secret, Top Secret, and FedRAMP High environments with deep technical guidance on compliance, resiliency, and Zero Trust architectures

Fulltime

Senior Cloud Solution Architect - Apps

Join Microsoft’s US Public Sector Industries DIB Team—where mission meets innova...

Location

United States, St. Louis

Salary:

106400.00 - 203600.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor’s Degree in Computer Science, Information Technology, Engineering, Business, Liberal Arts, or a related field AND 4+ years of experience in cloud or infrastructure technologies, IT consulting or support, systems administration, network operations, software development or support, technology solutions, architecture, or consulting OR equivalent experience
Active U.S. Government Top Secret Security Clearance
U.S. citizenship
Ability to work on site in St. Louis, MO
Deep understanding of Azure Application services, including Azure AI Gateway, Azure AI Foundry, Azure Kubernetes Service (AKS), and GitHub Copilot
Support Azure Government, Secret, Top Secret, and FedRAMP High environments with deep technical guidance on compliance, resiliency, and Zero Trust architectures
Understanding of FedRAMP, ITAR, DFARS, and Zero Trust architectures for Azure Gov and Secret environments
Knowledge of Azure secure enclaves and MS-ISR, specifically application and data architecture, RMF/ATO awareness, IL6-aligned data and application patterns, and secure data movement (batch ingestion, controlled transfer models)
App Platform Expertise: Proficiency in Azure App modernization, Logic Apps, containerization patterns, API-based integration, DevSecOps pipelines, CI/CD under disconnected or semi-connected conditions, and integration with M365 workloads
Demonstrated technical depth in Azure application services, including Azure Functions, Logic Apps, Power Platform, and AI integration

Job Responsibility

Architect and deliver agentic AI applications and secure DevOps pipelines tailored to DIB mission platforms, systems integrators, and digital-native defense startups
Lead technical engagements that accelerate secure, AI-powered transformation across mission-critical defense workloads in an air-gapped cloud environment
Collaborate with engineering, delivery, and account teams to modernize platforms and applications in enclave-based deployments and drive innovation aligned to national security priorities
Translate mission workloads (Apps & Data) into deployable architectures, supporting secure data platforms, app hosting patterns, and DevSecOps pipelines
Deliver deep technical expertise in Azure application modernization and agentic AI, drive usage excellence across mission workloads, and accelerate adoption of Microsoft’s cloud and AI platforms within classified, sovereign, and disconnected environments

Fulltime

Principal AI Ops Architect

Scale’s rapidly growing Global Public Sector team is focused on using AI to addr...

Location

Qatar; United Kingdom, Doha; London

Salary:

Not provided

Scale

Expiration Date

Until further notice

Requirements

6+ years in a high-impact technical role (SRE, FDE or MLOps) with experience in the public sector
Familiarity with international government security standards and the complexities of deploying sovereign AI
Proven experience maintaining production-grade applications with a deep understanding of the full request lifecycle, connecting frontend/API layers to the backend and AI core
Proficiency in coding and modern AI infrastructure, including Kubernetes, vector databases, agentic development, and LLM observability tools
Ownership: You treat every production deployment as your own. You race toward solving hard problems before the customer even sees them
Reliability: You understand that in the public sector, a model failure may be a risk to public safety or privacy
Customer communication: The ability to explain to a high-ranking official why the performance of the system has degraded and how we are fixing it

Job Responsibility

Own the production outcome: Take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies
Ensure Full-Stack integrity: Oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment
Scale the feedback loop: Build automated systems to monitor model performance and data drift across geographically dispersed environments, ensuring the right levels of reliability
Navigate global compliance: Manage the technical lifecycle within diverse regulatory frameworks
Incident command: Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building the guardrails to prevent them from happening again
Bridge the gap: Translate deep technical performance metrics into clear insights for senior international government officials
Drive product evolution: Partner with our Engineering and ML teams to ensure the lessons learned in the field directly influence the technical architecture and decisions of future use cases

AI Applications Ops Lead

Scale’s rapidly growing International Public Sector team is focused on using AI ...

Location

Qatar; United Kingdom, Doha; London

Salary:

Not provided

Scale

Expiration Date

Until further notice

Requirements

6+ years in a high-impact technical role (SRE, FDE or MLOps) with experience in the public sector
Familiarity with international government security standards and the complexities of deploying sovereign AI
Proven experience maintaining production-grade applications with a deep understanding of the full request lifecycle, connecting frontend/API layers to the backend and AI core
Proficiency in coding and modern AI infrastructure, including Kubernetes, vector databases, agentic development, and LLM observability tools
Ownership: You treat every production deployment as your own. You race toward solving hard problems before the customer even sees them
Reliability: You understand that in the public sector, a model failure may be a risk to public safety or privacy
Customer communication: The ability to explain to a high-ranking official why the performance of the system has degraded and how we are fixing it

Job Responsibility

Own the production outcome: Take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies
Ensure Full-Stack integrity: Oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment
Scale the feedback loop: Build automated systems to monitor model performance and data drift across geographically dispersed environments, ensuring the right levels of reliability
Navigate global compliance: Manage the technical lifecycle within diverse regulatory frameworks
Incident command: Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building the guardrails to prevent them from happening again
Bridge the gap: Translate deep technical performance metrics into clear insights for senior international government officials
Drive product evolution: Partner with our Engineering and ML teams to ensure the lessons learned in the field directly influence the technical architecture and decisions of future use cases
