Platform Engineer (AI/LLM Infrastructure) Job at NTT DATA (Santa Clara)

Principal Software Engineer, AI Developer Tools

At Docker, we make app development easier so developers can focus on what matter...

Location

United States, Seattle

Salary:

232000.00 - 319000.00 USD / Year

Docker

Expiration Date

Until further notice

Requirements

10+ years software engineering experience with 3+ years in Staff or Principal Engineer roles
Deep expertise in AI/ML technologies with hands-on production experience building LLM-powered applications, AI agents, or AI-assisted developer tools
Strong understanding of LLM APIs (OpenAI, Anthropic, etc.), prompt engineering, agent orchestration frameworks, and practical applications of AI in software development workflows
Proven track record of architecting and building highly scalable distributed systems and developer-facing platforms
Production experience with modern cloud-native infrastructure including Kubernetes, GitOps deployment patterns, observability systems, and CI/CD pipelines
Proficiency in Go (preferred), Rust, Java, or Python with strong software engineering fundamentals
Experience designing developer tools, platform engineering systems, or internal tools that enable other teams
Exceptional product and platform mindset considering business outcomes, developer experience, and technical trade-offs
Strong communication skills with ability to influence technical and non-technical stakeholders across the organization
Track record of technical mentorship and elevating engineering teams' capabilities

Job Responsibility

Define the long-term technical vision and architecture for AI-powered developer tools and the self-service platform that enables teams to build their own AI agents
Establish architectural patterns, technical standards, and best practices for LLM integration, AI agent development, and production AI systems serving developers
Lead technical strategy for platform capabilities including deployment frameworks (ArgoCD/GitOps), observability integration (Grafana), security controls, and operational tooling for AI developer tools
Design highly available, scalable infrastructure for hosting AI agents and developer tools with predictable performance and intelligent resource management
Drive technical decisions on AI technology choices, LLM provider strategies, prompt engineering approaches, and agent orchestration frameworks
Partner with Senior Manager and product leadership to align technical architecture with business objectives and productization opportunities
Architect and build production-ready AI agents for developer productivity including code review assistants, test generators, deployment diagnostics, and incident response automation
Design and implement the self-service platform infrastructure that reduces time-to-production for new AI tools from weeks to days
Build systems that accelerate adoption of AI-native development tools (Claude Code, Cursor, Warp) across Docker's engineering organization
Establish reliability, security, and performance standards for AI systems including SLOs, monitoring, incident response, and cost management

What we offer

Freedom & flexibility: fit your work around your life
Designated quarterly Whaleness Days plus end of year Whaleness break
Home office setup: we want you comfortable while you work
16 weeks of paid Parental leave
Technology stipend equivalent to $100 net/month
PTO plan that encourages you to take time to do the things you enjoy
Training stipend for conferences, courses and classes
Equity

Full-time

Software Engineer II, AI Developer Tools

At Docker, we make app development easier so developers can focus on what matter...

Location

United States, Seattle

Salary:

128000.00 - 181500.00 USD / Year

Docker

Expiration Date

Until further notice

Requirements

2+ years building backend systems, APIs, or developer-facing tools with strong software engineering fundamentals
Proficiency in Go (preferred), Rust, Java, or Python with understanding of data structures, algorithms, and design patterns
Basic understanding of AI/ML concepts with eagerness to learn about LLM APIs, prompt engineering, and AI agent development through hands-on work
Experience with cloud platforms (AWS, GCP, or Azure) and understanding of distributed systems or microservices
Familiarity with CI/CD pipelines, automated testing, version control (Git), and modern development workflows
Strong problem-solving skills with ability to work through technical challenges with guidance from senior engineers
Good communication skills in remote, asynchronous environments with ability to document technical decisions
Collaborative mindset with eagerness to learn from code reviews and feedback
Self-motivated with ability to work autonomously while knowing when to ask for help
Passion for developer tools and user experience

Job Responsibility

Build AI Developer Tool Features: Implement features for AI-powered developer tools such as code review assistants, test generators, deployment diagnostics, and on-call assistance tools
Implement LLM Integrations: Build integrations with LLM APIs (OpenAI, Anthropic, etc.), including prompt engineering, response handling, error management, and performance optimization
Contribute to Platform Infrastructure: Help build self-service platform capabilities such as deployment pipelines, observability integration, security controls, and operational tooling that enable teams to rapidly deploy AI developer tools
Support AI-Native Development Adoption: Contribute to tools and programs that help teams adopt AI developer tools such as Claude Code, Cursor, and Warp across Docker's engineering organization
Write Quality Code: Develop well-tested code with unit and integration tests; follow team coding standards and participate actively in code reviews to learn best practices
Maintain Production Systems: Assist with monitoring, alerting, and troubleshooting production AI systems; participate in incident response and learn operational best practices
Collaborate and Learn: Work closely with Senior Engineers and the Principal Engineer on technical designs; ask questions, seek feedback, and continuously improve your skills in AI/LLM technologies and platform engineering

What we offer

Freedom & flexibility: fit your work around your life
Designated quarterly Whaleness Days plus end of year Whaleness break
Home office setup: we want you comfortable while you work
16 weeks of paid Parental leave
Technology stipend equivalent to $100 net/month
PTO plan that encourages you to take time to do the things you enjoy
Training stipend for conferences, courses and classes
Equity

Full-time

Senior Software Engineer, Frontend Platform

At Vanta, our mission is to help businesses earn and prove trust. We believe tha...

Location

United States

Salary:

179000.00 - 211000.00 USD / Year

Vanta

Expiration Date

Until further notice

Requirements

Have experience building and maintaining software services and infrastructure platforms
Have a deep understanding of TypeScript and React
Have a deep understanding of writing and testing performant web client code
Have experience scaling platform systems
Have experience writing codemods and migration automations
Demonstrate empathy for the developer experience (DX) and have a strong product sense for developer tooling
Are open to using AI to amplify their skills and strengthen their work, demonstrating curiosity, a willingness to learn, and sound judgment in applying AI responsibly to improve efficiency and impact

Job Responsibility

Lead complex projects with multiple stakeholders and engineers to enable our business and team to scale
Evolve our frontend build system: bundlers, static analysis, caching, package management, and monorepo tooling
Develop and scale our frontend testing strategy (unit, integration, visual regression, and E2E)
Maintain and evolve GraphQL tooling: code patterns, developer tooling, scalability, and reliability
Improve web performance and observability: set SLOs, implement monitoring and alerting, and drive remediation
Plan and run modernization and migration initiatives leveraging codemods and AI/LLM tooling
Work with talented and kind engineers to make a significant impact on our customer base, enabling them to improve their security and prove it
Contribute to building Vanta’s engineering culture as we grow

What we offer

Equity, medical benefits, 401(k) plan, and other company perk programs
Comprehensive medical, dental, and vision coverage, with 100% of employee-only benefit premiums covered for most medical plans
16 weeks fully-paid Parental Leave for all new parents
Health & wellness stipend
Remote workspace, internet, and cellphone stipend
Commuter benefits for team members who report to the SF and NYC office
Family planning benefits

Full-time

LLM & AI DevOps Engineer

Join our team as a DevOps Engineer specializing in Artificial Intelligence (AI) ...

Location

United States, Remote

Salary:

Not provided

Robert Half

Expiration Date

Until further notice

Requirements

Proven experience as a DevOps Engineer, preferably supporting AI or machine learning platforms
Hands-on expertise with Kubernetes (EKS, AKS, GKE, or on-prem), Docker, Terraform, and Ansible
Experience with monitoring/observability tools such as Grafana and Prometheus
Familiarity with NVIDIA GPU drivers, CUDA, and hardware provisioning for machine learning tasks
Proficiency in at least one scripting language (Python, Bash, etc.)
Cloud platform experience (AWS, GCP, Azure); hybrid/on-premises a plus
Previous work with MLOps tools and data pipeline automation is highly desirable
Bachelor’s degree in Computer Science or related field, or equivalent professional experience

Job Responsibility

Build, automate, and manage CI/CD pipelines for deploying and maintaining AI/LLM workloads
Collaborate with AI engineers and data scientists to streamline model deployment, versioning, and monitoring
Design and maintain cloud infrastructure using Infrastructure as Code (IaC) platforms such as Terraform and Ansible
Orchestrate and manage containerized AI environments using Kubernetes
Implement robust monitoring and logging solutions utilizing Grafana and Prometheus
Optimize AI model inference and training workloads, especially for NVIDIA GPU-powered environments
Apply strict security and compliance standards for all infrastructure components
Diagnose and resolve production issues, continuously improving reliability and scalability of AI services

What we offer

medical
vision
dental
life and disability insurance
401(k) plan
