Devops Software Development Engineer

China, Shanghai · Job Posted May 30, 2026

Job Description

The AI/ML Frameworks team is hiring a Software Development Engineer to build and maintain scalable DevOps infrastructure that accelerates AMD's AI software development. You will design and own CI/CD pipelines, manage Kubernetes-based GPU environments, and automate systems using Python, Go, and Ansible. The role involves creating and maintaining production-grade automation and tooling that enables fast, reliable software delivery across teams.

Job Responsibility

  • Develop deep expertise in build tools and flows (CMake, Bazel, Make, compiler toolchains)
  • Triage complex build failures by understanding the full build pipeline
  • Identify root causes across infrastructure, toolchain, and code-level issues
  • Train and mentor team members on build systems, CI/CD workflows, and debugging techniques
  • Create documentation, runbooks, and training sessions
  • Understand the architecture and codebase of ML frameworks (PyTorch, TensorFlow, ROCm stack)
  • Review, debug, and contribute code changes as needed
  • Design and develop internal tools, automation scripts, and services primarily in Python and Go
  • Design, implement, and manage efficient continuous integration and delivery pipelines using Buildkite, GitHub Actions, and Jenkins
  • Deploy and maintain robust Kubernetes-based environments across both on-premise and cloud platforms
  • Automate provisioning, configuration, and management of infrastructure using Ansible, Python, and Bash
  • Administer application and service deployment in Kubernetes using Helm charts
  • Configure, manage, and maintain GPU-based compute environments
  • Interact with MySQL databases to support dynamic data updates and integrate data sources into Grafana dashboards
  • Work closely with ML framework developers, SREs, and project stakeholders
  • Integrate automated testing frameworks into CI pipelines

Requirements

  • Strong understanding of CMake, Bazel, Make, and compiler toolchains (GCC, Clang, LLVM)
  • Ability to debug complex build failures, understand dependency resolution, and optimize build performance
  • Strong proficiency in Python and Go for building tools, services, and automation
  • The ability to read and modify C++ code is a plus
  • Understanding of ML framework architecture (PyTorch, TensorFlow, JAX, or similar)
  • Ability to navigate large codebases, understand their build systems, and contribute fixes or improvements
  • Experience documenting complex systems and training team members
  • Ability to break down technical concepts and create effective learning materials
  • Proficient with Buildkite, GitHub Actions, Jenkins, Ansible, and scripting for streamlining DevOps workflows
  • Strong experience with Docker, Kubernetes, and Helm for deploying and managing scalable, containerized applications
  • Hands-on experience automating infrastructure provisioning and configuration to ensure reproducibility and scalability across environments
  • Familiarity with GPU server lifecycle management, ROCm/CUDA toolchains, and integration of GPU resources into CI test workflows
  • Experience using tools like Checkmk, Prometheus, and Grafana to monitor infrastructure health and application performance
  • Advanced knowledge of Git-based version control, including branching strategies and CI/CD integration
  • Solid background in Linux environments, including shell scripting and system-level troubleshooting across distributed systems
  • Comfort working in Agile teams and partnering with software, infrastructure, and product teams
  • Bachelor's or Master's degree in Computer Science, Software Engineering, or related technical discipline

Nice to have

Familiarity with C++

What we offer

Benefits are described in: AMD benefits at a glance

Similar Jobs for Devops Software Development Engineer

8 matching positions

Software Development Engineer, DevOps - Lifecycle Platform (US Federal)

About the Team Join a creative and collaborative team delivering platform engin...
Location
United States, McLean
Salary
135200.00 - 202900.00 USD / Year
Workday
Expiration Date
Until further notice
Requirements
  • A minimum of 5 years of development experience with building services and tooling in one or more of the following languages: Go, Python, Java.
  • A minimum of 2 years of SRE/DevOps experience.
  • A minimum of 2 years of experience automating deployment, scaling, and/or management of containerized applications with Kubernetes
  • A minimum of 2 years of experience with public Cloud platforms: AWS or GCP
  • This role may require a security clearance at the TS/SCI w/CI Poly level. Applicants must have the ability to obtain and maintain a U.S. government issued security clearance. An active TS/SCI w/CI Poly is preferred.
  • Ability to architect, analyze, and support complex distributed systems
  • A degree in Computer Science or related field, or equivalent practical experience
  • Experience debugging and optimizing systems and code
  • Desire to optimize continuous integration and deployment pipelines
  • Skill in automating routine tasks to eliminate toil
Job Responsibility
  • Build, run, and deploy large-scale, fault-tolerant systems using Workday's private cloud technology, Amazon Web Services (AWS), and Google Cloud Platform (GCP)
  • Focus on the scalability, availability, automation, performance and security of distributed web services
  • Participate in the team's paid on-call rotation
  • Work closely with engineering teams in the US, Canada, and Ireland to deliver exciting next generation applications and services for Workday
  • Engage in a culture of learning and innovation through hackathons, online course offerings, and employee-led special interest guilds
  • Be a part of Workday's amazing people-first culture and experience fun benefits like clubs and team events
  • Fulltime

Senior Software Development Engineer, DevOps

Equip's engineering culture emphasizes agility, collaboration, and ownership, fo...
Location
United States
Salary
144000.00 - 160000.00 USD / Year
Equip Health
Expiration Date
Until further notice
Requirements
  • Bachelor's degree or equivalent training and work experience in Computer Science, Software Engineering, or a related field
  • 5–10 years of experience in DevOps, SRE, Platform Engineering, or Software Engineering roles
  • Deep expertise in AWS and its ecosystem of services
  • Proven track record building cloud infrastructure using Infrastructure as Code (Terraform, CloudFormation)
  • Strong experience with container orchestration and serverless architectures, including ECS/Fargate and Docker
  • Solid understanding of AWS networking concepts, including VPCs, subnets, security groups, route tables, and load balancers
  • Hands-on experience creating and maintaining CI/CD pipelines (e.g., CircleCI, GitLab CI, etc.)
  • Strong experience with scalable backend systems, including microservices, APIs, caching layers, and various databases
  • Experience deploying and managing React and other JavaScript applications using AWS services like CloudFront and S3
  • Experience setting up comprehensive monitoring and alerting for infrastructure, services, and data pipelines
Job Responsibility
  • Design and build a robust, scalable cloud platform to empower web and data engineering teams to deliver high-quality applications
  • Partner with engineering and data teams to improve developer velocity, ensure system reliability, and embed operational excellence
  • Lead best practices in cloud infrastructure architecture, CI/CD automation, monitoring, and backend systems reliability
  • Develop tools and automation across a variety of frameworks and languages to enhance the performance, availability, and scalability of services
  • Contribute to a culture of continuous improvement through proactive monitoring, root cause analysis, and knowledge sharing
  • Perform other duties as assigned
What we offer
  • Flex PTO policy (3-5 wks/year recommended) + 11 paid company holidays
  • Competitive Medical, Dental, Vision, Life, and AD&D insurance
  • Equip pays for a significant percentage of benefits premiums for individuals and families
  • Maven, a company paid reproductive and family care benefit for all employees
  • Employee Assistance Program (EAP), a company paid resource for mental health, legal services, financial support, and more
  • $50/month stipend added directly to an employee’s paycheck to cover home internet expenses
  • One-time work from home stipend of up to $500
  • Fulltime

Devops Lead Software Development Engineer

AMD is looking for an influential software engineer who is passionate about DevO...
Location
India, Bangalore
Salary
Not provided
AMD
Expiration Date
Until further notice
Requirements
  • Prior experience in a DevOps/Infrastructure engineering position, preferably involving use of cloud platforms such as AWS and GCP
  • 15-20 years of experience is a must
  • Experience of delivering solutions using virtualization and containerisation technologies such as Kubernetes, Docker, OpenStack, Rancher, and Harvester
  • Experience in Unix (Linux) Systems Administration
  • Knowledge of automation, orchestration and “Infrastructure as Code” tooling such as Terraform & Ansible
  • Strong technical knowledge with prior experience of leading a team
  • Excellent written and spoken English
  • Experience of developing CI/CD pipelines using common DevOps tooling (Jenkins, Gitlab, BuildBot, Gerrit)
  • Experience of DevSecOps, infrastructure security, zero trust capabilities, and architecture
  • Familiarity with open source project development cycles and contribution processes, particularly around CI/CD infrastructure
Job Responsibility
  • Lead the design, implementation, operation and support of one or more of our developer platforms
  • Provision of software build and test solutions catering to different market segments such as Server & DCGPU
  • Lifecycle management of core project infrastructure, from design through to deployment, maintenance, and performance optimization
  • Continuous improvement of workflow quality, security, and efficiency
  • Communication & collaboration with engineering IT, software development teams and the open source community
  • Design and define the overall DevOps architecture/framework for project/module delivery as per the client requirement
  • Manage and drive the DevOps pipeline that supports the application life cycle across the DevOps toolchain from planning, coding and building, to testing, to staging, to release, configuration and monitoring

DevOps Engineer for Software Planning & Development

The I-Talent Program at Scania is an inspiring six-month journey designed for IT...
Location
Sweden, Södertälje
Salary
Not provided
Academic Work
Expiration Date
Until further notice
Requirements
  • Post-secondary education in computer science etc.
  • Knowledge of one or more of the following areas: Terraform, Ansible, AWS, Kubernetes, GitLab, Python and Bash
  • Graduating in spring 2026, or at most 2 years of work experience when the program starts in September 2026
  • Very good knowledge of English, as it is the primary language you will use in your daily work
Job Responsibility
  • Managing and optimizing infrastructure using tools like Terraform, Ansible, AWS, and Kubernetes
  • Overseeing GitLab
  • Proactively monitoring system stability and security
  • Developing automation scripts in Python and Bash to streamline operations
  • Working with PostgreSQL, Redis, and other modern database solutions
What we offer
  • Hands-on learning and growth opportunities
  • Supportive and inclusive team
  • Collaboration through mob and pair programming
  • Buddy assigned for the first time
  • Good network building opportunities
  • Multicultural work environment
  • Company that invests in leadership
  • Room for everyone to grow
  • Fulltime

DevOps Engineer for Software Planning & Development

The I-Talent Program at Scania is an inspiring six-month journey designed for IT...
Location
Sweden, Södertälje
Salary
Not provided
Academic Work
Expiration Date
Until further notice
Requirements
  • A post-secondary education in computer science etc.
  • Knowledge of one or more of the following areas: AWS, Kubernetes, Grafana, GitLab, Python and Bash.
  • Graduating in spring 2026, or at most 2 years of work experience when the program starts in September 2026
  • Very good knowledge of English, as it is the primary language you will use in your daily work
Job Responsibility
  • Managing and optimizing infrastructure using tools like AWS, Pulumi, FluxCD and Kubernetes
  • Overseeing Grafana platform
  • Proactively monitoring system stability and security
  • Developing automation scripts in Python and Bash to streamline operations
What we offer
  • Six-month I-Talent Program with hands-on experience
  • Training
  • Supportive and inclusive team
  • Collaboration through mob and pair programming
  • Opportunities to develop into leadership and specialist roles
  • Fulltime

Software Development Advisor - DevOps-Focused Backend Engineer

We are seeking a Backend Engineer with a strong DevOps mindset to join our engin...
Location
Mexico, GDL (Guadalajara)
Salary
Not provided
NTT DATA
Expiration Date
Until further notice
Requirements
  • Proven experience in backend development with a focus on Java
  • Practical experience building and managing CI/CD pipelines
  • Direct experience implementing workflow automation via GitHub Actions
  • Proficiency in Bash scripting for automation and system administration
  • Professional background in release management and deployment orchestration
  • Working knowledge of JavaScript and modern web technologies
  • A strong understanding of DevOps principles, focusing on the 'developer-first' approach to infrastructure
  • Experience operating effectively within Agile or high-growth environments
Job Responsibility
  • Design, develop, and maintain robust backend services primarily utilizing Java
  • Support and enhance CI/CD pipelines, driving automation across build, test, and deployment phases
  • Architect and maintain automated workflows using GitHub Actions to streamline development cycles
  • Author and manage complex Bash scripts to support deployment orchestration and routine operational tasks
  • Coordinate application releases, ensuring processes are reliable, repeatable, and documented
  • Partner with Engineering and Product teams to improve system uptime and refine deployment methodologies
  • Troubleshoot production and deployment-related issues, performing root cause analysis to prevent recurrence

Software Engineer / Senior Software Engineer - CoreAI

Azure DevOps is a suite of modern development services that enables software dev...
Location
Czech Republic, Prague
Salary
Not provided
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • Solid software development experience
  • Demonstrable experience with C#, C++, Java or any other OOP language
  • Strong analytical skills as well as communication skills both verbal and written
  • Ability to understand unfamiliar code bases, debug client and service side applications (including database stored procedures)
  • Knowledge and experience with Microsoft Azure, AWS or similar cloud computing platforms is preferred
  • Experience with SQL performance tuning (preferably Microsoft SQL Server)
  • Solid understanding of testing principles
  • Ability to prioritize and handle multiple tasks completely and independently and generate clarity in ambiguous situations
  • Troubleshooting skills across network, application, caching, queuing, load-balancing, storage, and distributed services layers
Job Responsibility
  • Design, develop, test, and support features and experiences
  • Collaborate on the design and development of features and solutions, contributing to technical direction across business scenarios
  • Support highly available services used by top companies and millions of developers on a daily basis
  • Troubleshooting of complex issues through the entire tech stack including frontend and database layers
  • Participate in on-call rotations with your team. Triage and respond to issues and advocate for opportunities to improve service health
  • Collaborate through pairing and code reviews and contribute to a culture of learning and growth
  • Fulltime

Software Engineer / Senior Software Engineer

ARiA is looking for highly motivated self-starters and low-ego team players to j...
Location
United States, Madison; Alexandria; Seattle
Salary
Not provided
Applied Research in Acoustics
Expiration Date
Until further notice
Requirements
  • Applicants selected for employment will be subject to a government security investigation and must meet eligibility requirements, including U.S. citizenship, for access to sensitive information
  • Bachelor’s degree or greater in a relevant technical field (Computer Science, Engineering, or equivalent)
  • Expertise designing and developing code using modern programming/scripting languages such as C, C++, Golang, JavaScript (and variants), and Python
  • Expertise developing and deploying software in an agile, continuous-integration (CI) framework across a variety of hardware platforms (desktop, server, cloud) using modern tools including containerization (e.g., Docker, Kubernetes)
  • Exceptional ability and desire to acquire new knowledge and skills to solve challenges
  • Ability to work independently but collaboratively
  • Ability to manage multiple projects in a fast-paced professional office environment
  • Ability to communicate technical solutions to colleagues and customers
  • Superior oral and written communications skills
Job Responsibility
  • Algorithm and software design, development, research, and testing to support prototypes and products
  • Supporting the transition of research algorithms to fielded systems
  • Preparing documentation to summarize design and status of prototypes and products
  • Assisting with in-field integration, testing, and support, with some local travel required
  • Developing an interface between a C++ underwater-acoustics physics engine and a video game for education and training
  • Developing a JavaScript backend for a scenario-design and management tool for players and integration of that system with a learning-management system (LMS)
  • Developing algorithms and software for a cloud-deployed cognitive tool that allows natural-language query of legal documents to answer user questions about government regulations and supporting the DevOps process for deployment of the prototype
  • Fulltime