Senior Software Development Engineer, AI Open-Source Software

United States, Santa Clara · Employment contract · 204000.00 - 306000.00 USD / Year · Job Posted May 14, 2026

Job Description

AMD is looking for a principal software developer to join our growing team. As a key contributor you will be part of our ROCm GPU-compute mathematical libraries team working on innovative hardware and software technologies. You will help deliver exceptional performance and feature enhancements via maintainable code development, optimizations/tuning, and collaboration.

Job Responsibility

  • Develop software in C++, Python, HIP, assembly, and state-of-the-art programming technologies to enable key mathematical operations on GPUs
  • Design GPU computational software libraries for AI and HPC applications
  • Aid management in planning and delivering industry-leading software for current and future processors
  • Supervise a small development team
  • Carry out performance optimizations and projections for important use cases to maximize hardware utilization
  • Support development of programs to sustain seamless performance analysis and performance/functional test coverage
  • Identify and help resolve quality issues, working closely with library development teams and other internal engineering teams

Requirements

  • 10+ years of professional software development experience
  • Demonstrated capacity to provide technical leadership and people management for junior to mid-level developers
  • Proficiency in C/C++ and Python programming, employing best software design practices
  • GPU software development or validation involving HIP, CUDA, or OpenCL
  • Experience with software libraries and API design
  • Exposure to matrix/tensor operations and numerical work
  • Software emulation to support FP numerical formats is a plus
  • Experience in software performance estimation, optimization, and debugging
  • Ability to closely interact with technical leads, developers, and test teams to maintain and release production software

Nice to have

  • Software emulation to support FP numerical formats

What we offer

Benefits offered are described in "AMD benefits at a glance".

Similar Jobs for

Senior Software Development Engineer, AI Open-Source Software

8 matching positions

Senior Research Software Development Engineer, MSR AI for Science

We are on the cusp of a new frontier in which machine learning and artificial in...
Location: Netherlands, Schiphol
Salary: Not provided
Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Master's degree or equivalent work experience in Computer Science, Physics, Engineering, Chemistry, Mathematics or a related field
  • Strong familiarity with Linux and the open-source ecosystem
  • Proficient working with large datasets in a cloud or HPC environment
  • Proficient in building and optimizing distributed systems and large-data applications, including those using tensor accelerators or GPUs
  • Strong analytical, problem-solving, and communication skills
  • Passionate about pushing the boundaries of science
Job Responsibility
  • Architect, design, and implement scalable and robust solutions for machine learning and scientific research involving large volumes of heterogeneous data
  • Build and optimize distributed data processing and model building pipelines
  • Develop and maintain tools and technologies for building, training, optimizing, and scaling machine learning solutions
  • Collaborate with cross-functional teams, including scientists, researchers, and software engineers
  • Document and share best practices across the organization
  • Maintain the highest standards in code quality and software design
  • Fulltime

Senior Software Engineer, Managed AI - AI model LifeCycle

The Senior Software Engineer for the Model LifeCycle team will contribute to bui...
Location: United States, San Francisco
Salary: 172425.00 - 209000.00 USD / Year
Crusoe
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Engineering, or a related field
  • Experience delivering production-ready features
  • Familiarity with essential cloud-based services (e.g., compute, storage, networking)
  • Familiarity with Generative AI (Large Language Models, Multimodal)
  • Experience with AI infrastructure components (training, inference)
  • 4-5+ years of industry experience with demonstrated history of consistent success leading a varied portfolio of initiatives across your function
Job Responsibility
  • Implement and maintain systems for fine-tuning large foundation models (SFT, PEFT, LoRA, adapters), including multi-node orchestration, checkpointing, failure recovery, and cost-efficient scaling
  • Implement and maintain end-to-end training pipelines for Large Language Models
  • Implement components for distillation and reinforcement learning pipelines (e.g., preference optimization, policy optimization, reward modeling)
  • Develop and maintain core agent execution infrastructure
  • Implement features for dataset, model, and experiment management, focusing on versioning, lineage, evaluation, and reproducible fine-tuning
  • Work closely with Senior Engineers and Principal Engineers, as well as product and platform teams, to implement system abstractions and APIs
  • Contribute to technical discussions on training runtimes, scheduling, storage, and model lifecycle management
  • Engage with the open-source LLM ecosystem
What we offer
  • Restricted Stock Units
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Fulltime

Senior Software Engineer, Managed AI - AI Platform

Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'...
Location: United States, San Francisco, CA; Sunnyvale, CA
Salary: 172425.00 - 209000.00 USD / Year
Crusoe
Expiration Date: Until further notice
Requirements
  • Advanced degree in Computer Science/Engineering
  • 4-5+ years of industry experience with demonstrated history of consistent success leading a varied portfolio of initiatives across your function
  • Experience with distributed systems, cloud services (compute, storage, networking, database), and delivering early-stage projects quickly
  • Experience with Generative AI (LLMs, Multimodal) and familiarity with AI infrastructure (training, inference, ETL pipelines)
  • Proficient with container orchestration (e.g., Kubernetes), microservices, REST APIs, gRPC, and the full software development lifecycle including CI/CD
Job Responsibility
  • Lead the design and implementation of core AI services, including resilient fault-tolerant queues for efficient task distribution, model catalogs for managing and versioning AI models, and scheduling mechanisms optimized for cost and performance
  • Architect and scale infrastructure to handle millions of API requests per second
  • Implement robust monitoring and alerting to ensure system health and 24/7 availability
  • Collaborate closely with product management, business strategy, and other engineering teams to define the AI platform roadmap
  • Influence the long-term vision and architectural decisions of the platform
  • Contribute to open-source AI frameworks and actively participate in the AI community
  • Prototype and rapidly iterate on emerging technologies and new features
What we offer
  • Restricted Stock Units
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Fulltime

Senior Software Engineer- AI and Data Governance

At GEICO, we offer a rewarding career where your ambitions are met with endless ...
Location: United States, Palo Alto
Salary: 100000.00 - 215000.00 USD / Year
Geico
Expiration Date: Until further notice
Requirements
  • Advanced knowledge of at least one modern OOP language such as Go, Python, Java, etc.
  • Advanced knowledge of web technologies such as HTML, CSS, JavaScript is preferred
  • Understanding of open-source databases like MySQL, PostgreSQL, etc., and familiarity with NoSQL databases like Cassandra, MongoDB, Elasticsearch, etc.
  • Experience in architecting, designing, and building automation, workflows, custom objects/apps, declarative functionality, triggers, and migration tools in the BMC Helix platform; transitioning such a platform to open source is a big plus
  • Experience building and configuring flows and process builders
  • Strong understanding of web service integration (gRPC / REST) and enterprise middleware integration tiers
  • Ability to articulate channel dataflow and process flow including email, messaging, chat, mobile push, and SDKs
  • Excellent communication skills – needs to be able to lead projects from the front and interact with clients and sponsors on a regular basis
  • Experience partnering with engineering teams and transferring research to production
  • Experience with continuous delivery (CI/CD) and Infrastructure as Code
Job Responsibility
  • Collaborate with product managers, team members, customers, and other engineering teams to solve our toughest problems
  • Develop and execute technical software development strategy for the Platform Engineering domain including Service Management, Business Continuity, Recovery, Incident Response and Paging platforms
  • Accountable for the quality, usability, and performance of the solutions
  • Deep hands-on experience in complex system design, data pipelines and architectures, and scale and performance tuning, with good knowledge of Docker and Kubernetes
  • Consistently share best practices and improve processes within and across teams
  • Willing to take on-call and operational support
  • Experience designing recommendation systems, ranking, personalization, similarity search and embeddings
  • Experience with NLP, LLMs and RAG, as well as translating natural language into graph or data queries
  • Experience designing scalable AI systems and Data pipelines
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime

Senior Software Development Engineer

We are seeking an experienced and highly technical SMTS Software Development Eng...
Location: United Kingdom
Salary: Not provided
AMD
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or related technical field
  • 8+ years of software engineering experience in systems software, runtime libraries, GPU programming, or compiler/runtime interfaces
  • Strong proficiency in modern C++ (C++14/C++17 or newer), templates, memory models, and low‑level systems programming
  • Deep understanding of at least one GPU computing model (HIP, CUDA, SYCL, OpenCL, OpenMP offload)
  • Hands‑on experience with runtime systems, driver interfaces, or high‑performance compute libraries
  • Strong debugging skills using tools such as gdb, sanitizers, profilers, and GPU debugging tools
  • Solid understanding of parallel programming concepts—memory hierarchy, synchronization, concurrency, thread scheduling
Job Responsibility
  • Architect, implement, and optimize features in the HIP runtime, including memory management, kernel dispatch, device abstraction, multi‑GPU coordination, and synchronization primitives
  • Contribute to the evolution of the HIP programming model and interoperability with ROCr, HSA runtime, and compiler toolchains
  • Ensure functional correctness, performance, and scalability of runtime APIs across different GPU generations
  • Conduct root‑cause analysis and systems‑level debugging across the runtime, driver, compiler, and hardware layers
  • Profile GPU applications and internal runtime components to identify bottlenecks and design performance improvements
  • Optimize HIP runtime behavior for large-scale AI, HPC, and cloud workloads
  • Work closely with compiler teams (LLVM/Clang), driver teams, GPU architecture, and systems engineers to deliver end‑to‑end GPU software solutions
  • Contribute to API specifications and collaborate with upstream open-source communities where appropriate
  • Define and drive technical strategy for correctness, reliability, and conformance of the HIP runtime
  • Support enhancements in automated testing, CI, and stress/failure scenarios in the HIP test suite

Senior Principal Software Engineer - AI Governance

As a Senior Principal Software Engineer, you will serve as a technical leader fo...
Location: United States, San Francisco
Salary: 165000.00 - 220000.00 USD / Year
OneTrust
Expiration Date: Until further notice
Requirements
  • Bachelor's or master's degree in Computer Science, Engineering, or a related technical or business field
  • 12+ years of professional software engineering/development experience
  • Strong expertise in Java/J2EE, Spring, design patterns, microservices architecture, and cloud-native distributed systems
  • Proven experience building production-grade agentic AI systems with robust tool-use, guardrails, and observability for enterprise-scale workloads
  • Solid understanding of RAG pipelines, agent workflows, model orchestration, and evaluation practices
  • Working experience with one or more AI platforms like Amazon SageMaker, Google Vertex, AWS Bedrock, etc.
  • Experience with Elasticsearch and data streaming tools like Kafka
  • Good understanding of web services and SOA-related standards like REST/OAuth/JSON
  • Moderate understanding of coding and scripting (Python, Bash)
  • Good experience with SQL and NoSQL databases
Job Responsibility
  • Lead the design and development of Java/Python microservices and shared libraries integrating with AI platforms for OneTrust's AI Governance product
  • Design, build, and test cloud-native applications deployed on Microsoft Azure using Core Java, REST, and the Spring ecosystem
  • Build features with RAG, agent workflows, and model orchestration
  • Own technical design for critical systems, ensuring scalability, security, and reliability
  • Maintain strong automated unit/integration test coverage and engineering standards
  • Work closely with UX, Product Managers and/or Product Owners, as well as other developers to contribute to planning and grooming sessions and drive team's discussions on system architecture and component design
  • Partner with Product, UX, and Customer Success to understand customer AI use cases and governance needs
  • Lead architecture discussions and technical planning
  • Drive alignment across teams on platform design and priorities
  • Contribute to sprint planning and delivery
What we offer
  • Comprehensive healthcare coverage
  • Flexible PTO
  • Equity RSUs
  • Annual performance bonus opportunities
  • Retirement account support
  • 14+ weeks of paid parental leave
  • Career development opportunities
  • Company-paid privacy certification exam fees
  • Fulltime

Senior Principal Software Engineer - AI Governance

As a Senior Principal Software Engineer, you will serve as a technical leader fo...
Location: United States, Atlanta
Salary: 165000.00 - 220000.00 USD / Year
OneTrust
Expiration Date: Until further notice
Requirements
  • Bachelor's or master's degree in Computer Science, Engineering, or a related technical or business field
  • 12+ years of professional software engineering/development experience
  • Strong expertise in Java/J2EE, Spring, design patterns, microservices architecture, and cloud-native distributed systems
  • Proven experience building production-grade agentic AI systems with robust tool-use, guardrails, and observability for enterprise-scale workloads
  • Solid understanding of RAG pipelines, agent workflows, model orchestration, and evaluation practices
  • Working experience with one or more AI platforms like Amazon SageMaker, Google Vertex, AWS Bedrock, etc.
  • Experience with Elasticsearch and data streaming tools like Kafka
  • Good understanding of web services and SOA-related standards like REST/OAuth/JSON
  • Moderate understanding of coding and scripting (Python, Bash)
  • Good experience with SQL and NoSQL databases
Job Responsibility
  • Lead the design and development of Java/Python microservices and shared libraries integrating with AI platforms for OneTrust’s AI Governance product
  • Design, build, and test cloud-native applications deployed on Microsoft Azure using Core Java, REST, and the Spring ecosystem
  • Build features with RAG, agent workflows, and model orchestration
  • Own technical design for critical systems, ensuring scalability, security, and reliability
  • Maintain strong automated unit/integration test coverage and engineering standards
  • Work closely with UX, Product Managers and/or Product Owners, as well as other developers to contribute to planning and grooming sessions and drive team’s discussions on system architecture and component design
  • Partner with Product, UX, and Customer Success to understand customer AI use cases and governance needs
  • Lead architecture discussions and technical planning
  • Drive alignment across teams on platform design and priorities
  • Contribute to sprint planning and delivery
What we offer
  • Comprehensive healthcare coverage
  • Flexible PTO
  • Equity RSUs
  • Annual performance bonus opportunities
  • Retirement account support
  • 14+ weeks of paid parental leave
  • Career development opportunities
  • Company-paid privacy certification exam fees
  • Fulltime

Senior Staff Software Engineer - AI

GEICO is seeking an experienced Engineer with a passion for building high-perfor...
Location: United States, Seattle, WA; Austin, TX; Palo Alto, CA; Chicago, IL; Dallas, TX
Salary: 110000.00 - 230000.00 USD / Year
Geico
Expiration Date: Until further notice
Requirements
  • Experience building and deploying ML systems in production with cross-functional engineering teams
  • Fluency in at least two modern languages such as Python, Go, Java, C++, or C# including object-oriented design
  • Experience architecting multi-component ML platforms using open-source/cloud-agnostic components, including datastores (PostgreSQL; NoSQL such as MongoDB, Cassandra, CosmosDB) and streaming (Kafka, Flink, or Spark Streaming)
  • Experience with end-to-end ML lifecycle: version control, CI/CD, Kubernetes, testing, monitoring, and production support
  • Experience with cloud providers (Azure, AWS or GCP) in production ML environments
  • Experience with observability tools and distributed systems monitoring, logging, tracing, and root cause analysis
  • Experience building multi-agent systems using LLMs and agentic frameworks (e.g., LangChain, LangGraph, AutoGen, Semantic Kernel, CrewAI)
  • Hands-on experience with RAG, semantic search, and vector databases (e.g., Milvus, pgvector, Qdrant, ElasticSearch)
  • Experience designing human-in-the-loop workflows and safety controls for autonomous systems
  • Strong architecture and design skills with ability to influence technical direction and roadmap
Job Responsibility
  • Design and build a multi-agent AI platform where specialized agents autonomously detect, diagnose, and resolve issues through agent-to-agent (A2A) collaboration
  • Develop intelligent agents using LLMs and agentic frameworks that coordinate detection, diagnostic, remediation, and knowledge tasks with minimal human intervention
  • Define agent interaction protocols, A2A communication standards, and evaluation frameworks for agent decision quality and autonomous action safety
  • Architect vector database solutions (Milvus, pgvector, Qdrant) for semantic search and RAG to enable context-aware agent decision-making
  • Build end-to-end ML pipelines for severity classification, anomaly detection, failure pattern recognition, and impact forecasting using observability data
  • Establish scalable orchestration infrastructure for multi-agent workflows with CI/CD, automated evaluation, canary releases, and rollback strategies
  • Implement monitoring for agent interactions, A2A communication patterns, decision quality, data drift, and system reliability
  • Lead technical architecture ensuring scalability, observability, and integration with existing alerting, logging, and monitoring systems
  • Define standards for agent safety, explainability, governance, and human-in-the-loop controls for high-impact automated actions
  • Partner with SRE, Product, and Engineering teams to translate reliability goals into measurable ML objectives and maintain pragmatic technical roadmaps
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime