IT Network Engineer - Network AI Automation & Full Stack Infrastructure

Valvoline Global

Location:
United States

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are seeking a highly skilled IT Network Engineer – Network AI Automation & Full Stack Infrastructure to join our Network Engineering team and help evolve our global enterprise network. This role is ideal for a strong mid-level engineer with a minimum of 6 years of enterprise global network engineering experience, deep networking expertise, and hands-on automation experience. You will support and engineer enterprise infrastructure across routing, switching, firewalls, SASE, cloud networking, Global WAN, wireless, and proxy services, and be capable of troubleshooting and triaging issues across the full network stack. Technologies in this environment include Palo Alto Networks, Cisco/Meraki networking, Aviatrix cloud networking, Megaport connectivity, and observability platforms such as Sumo Logic. This role will also participate in the AI & Automation Network Committee, helping advance initiatives focused on AI-driven observability, automated workflows, GenAI, Agentic AI, and Infrastructure as Code (IaC) to modernize network operations and improve operational visibility.

Job Responsibility:

  • Engineer and support global enterprise network infrastructure across routing, switching, firewalls, SASE, WAN/SD-WAN, wireless, and proxy services
  • Troubleshoot complex network issues across multiple infrastructure layers
  • Support hybrid cloud networking architectures across Azure, AWS, or GCP
  • Manage and optimize next-generation firewall policies using platforms such as Palo Alto Networks
  • Support Cisco and Meraki switching and wireless infrastructure
  • Design and maintain global WAN connectivity, including high-performance cloud connectivity such as Megaport
  • Support cloud networking solutions leveraging platforms such as Aviatrix
  • Implement network automation and Infrastructure as Code (IaC) to improve operational efficiency
  • Leverage observability platforms such as Sumo Logic to improve monitoring and visibility
  • Contribute to AI-driven network operations and automation initiatives
  • Partner with Cloud, Security, and Infrastructure teams on modern network architecture

Requirements:

  • 6+ years of enterprise global network engineering experience
  • Bachelor's degree in Information Technology, CIS, or a related field, or equivalent experience
  • Strong experience with routing and switching protocols (BGP, OSPF, VLANs)
  • Experience with next-generation firewalls and SASE, preferably Palo Alto
  • Experience with Cisco or Meraki switching and wireless
  • Cloud networking experience in Azure, AWS, or GCP
  • Experience supporting global WAN or SD-WAN environments
  • Familiarity with Aviatrix, Megaport, and observability platforms such as Sumo Logic
  • Practical experience with network automation (Python, Terraform, Ansible)

Nice to have:

  • Experience integrating automation, observability, and ITSM workflows to improve operational efficiency

Additional Information:

Job Posted:
April 22, 2026

Employment Type:
Full-time
Work Type:
Remote work

Similar Jobs for IT Network Engineer - Network AI Automation & Full Stack Infrastructure

Consultant A2 - Infra

Microsoft Industry Solution - Global Center for Innovation and Delivery (GCID) d...
Microsoft Corporation
Location:
India, Hyderabad
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • 4-10 years of industry experience with a minimum of 3+ years of Azure Infrastructure experience
  • Bachelor’s degree in Computer Science, Engineering, or equivalent professional experience
  • Advanced technical certifications or higher education preferred
  • Mandatory Microsoft Azure Administrator certification (AZ-104)
  • Azure Solutions Architect Expert (AZ-305) highly preferred
  • Terraform Associate certification strongly valued, with proven experience in Infrastructure-as-Code (IaC) automation
  • Cloud AI certification such as Microsoft Azure AI Fundamentals (AI‑900) or equivalent is a strong advantage, with practical understanding of AI/ML workloads on cloud platforms
  • Experience across both application architecture and cloud infrastructure domains
  • Proficiency in Python for automation, AI integration, and scripting across infra workflows
  • Experience with Azure AI Services (AI Foundry, AI Agent services, Azure AI Search, Vision, Speech, Translation) leveraged as infrastructure components for applications
Job Responsibility:
  • As a Full Stack Infrastructure Consultant, design, build, and optimize end-to-end cloud and on-premises infrastructure solutions, ensuring secure, scalable, and high-performing environments across the entire technology stack
  • Responsible for technical quality assurance, identification, and mitigation of technical risk across customer or partner deliverables, delivering solutions aligned with technology strategy
  • Applies technical experience and industry-specific knowledge to develop solutions, based on an analysis of how the proposed approach affects the business objectives of customers and partners
  • Applies information-compliance and assurance policies to ensure stakeholder confidence
  • Identifies new processes and innovations to help customers or partners build and accelerate capabilities by using Microsoft technologies
  • Identifies the best practice approach for a project, across a wide scope of technical issues, and develops or reuses intellectual capital with customers, world-wide, and for programs and initiatives across Microsoft
  • Defines engagements and opportunities to use Intellectual Property (IP) and address product gaps, while leveraging existing IP and community resources to ensure consistency and improve predictability
Employment Type:
Full-time

Senior Consultant - AI & Infra

Microsoft Industry Solution - Global Center for Innovation and Delivery (GCID) d...
Microsoft Corporation
Location:
India, Hyderabad
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • 9+ years of overall IT industry experience with at least 5 years of hands-on expertise in Azure Infrastructure architecture, design, and operations.
  • Bachelor’s degree in Computer Science, Engineering, or equivalent professional experience
  • Advanced technical certifications or higher education preferred.
  • Mandatory Microsoft Azure Administrator certification (AZ-104)
  • Azure Solutions Architect Expert (AZ-305) highly preferred.
  • Terraform Associate certification strongly valued, with proven experience in Infrastructure-as-Code (IaC) automation.
  • Cloud AI certification such as Microsoft Azure AI Fundamentals (AI‑900) or equivalent is a strong advantage, with practical understanding of AI/ML workloads on cloud platforms.
  • Experience across both application architecture and cloud infrastructure domains, with the ability to bridge development and operations teams effectively.
  • Professional certifications in Delivery Management methodologies (Scrum, Agile, ITIL, Change/Project Management) considered a strong advantage.
  • Proficiency in Python for automation, AI integration, and scripting across infra workflows
Job Responsibility:
  • As a Full Stack Infrastructure Consultant, design, build, and optimize end-to-end cloud and on-premises infrastructure solutions, ensuring secure, scalable, and high-performing environments across the entire technology stack.
  • Collaborate with the customer/partner team of Chief Information Officers (CIOs), other C-level executives, and technical and business decision-makers to align customer vision with solutions.
  • Manages risk and Services business goals within engagements.
  • Responsible for technical quality assurance, identification, and mitigation of technical risk across customer or partner deliverables, delivering solutions aligned with technology strategy.
  • Applies technical experience and industry-specific knowledge to develop solutions, based on an analysis of how the proposed approach affects the business objectives of customers and partners.
  • Applies information-compliance and assurance policies to ensure stakeholder confidence.
  • Identifies new processes and innovations to help customers or partners build and accelerate capabilities by using Microsoft technologies.
  • Identifies the best practice approach for a project, across a wide scope of technical issues, and develops or reuses intellectual capital with customers, world-wide, and for programs and initiatives across Microsoft.
  • Defines engagements and opportunities to use Intellectual Property (IP) and address product gaps, while leveraging existing IP and community resources to ensure consistency and improve predictability.
  • Drives new ways of thinking, across the division and subsidiary, to improve quality, engineering productivity, and responsiveness to feedback and changing priorities.
Employment Type:
Full-time

Senior Software Engineer - M365 Copilot

We are looking for a Full-Stack Software Engineer with strong infrastructure fun...
Microsoft Corporation
Location:
China, Beijing
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Solid experience designing distributed systems (fault tolerance, scalability, consistency tradeoffs, performance, and operability)
  • Hands-on experience with Kubernetes in production (cluster architecture, networking, storage, security, scaling)
  • Solid database and data-system expertise (e.g., PostgreSQL/MySQL, NoSQL, caching, messaging/streaming) with proven performance and reliability tuning experience
  • Proficiency in at least one of: Python, C++, Rust, or Java (production-quality coding)
  • Experience with at least one major cloud platform: Azure and/or AWS (compute, networking, IAM, managed Kubernetes, storage, monitoring)
  • Solid engineering practices: testing, CI/CD, code quality, design docs, and operational ownership (on-call/incident response)
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Architect, build, and operate a scalable and reliable sandbox infrastructure for AI Agent execution, including isolation, scheduling, lifecycle management, and resource governance
  • Design and implement distributed backend services/APIs to orchestrate high-throughput, low-latency sandbox sessions and system integrations
  • Build and optimize Kubernetes-based platform capabilities (multi-tenancy, autoscaling, networking, storage, admission control, policy enforcement)
  • Own and evolve the data layer (relational/NoSQL/cache/queue), including schema design, indexing, performance tuning, and reliability strategies (backups, replication, failover)
  • Develop full-stack features: implement secure, performant backend endpoints and build modern web UIs for platform workflows (e.g., session management, policy configuration, debugging, monitoring views)
  • Drive UX excellence: collaborate with designers and product managers, translate user needs into clear interaction flows, iterate based on feedback and usage telemetry, and maintain consistent UI patterns/design systems
  • Improve system robustness via SLO-driven engineering, capacity planning, incident response, and continuous hardening of reliability and security
  • Implement end-to-end observability (metrics, logs, traces), define dashboards/alerts, and reduce operational toil with automation and self-service tooling
  • Identify bottlenecks and lead performance/cost optimizations across compute, storage, and network
  • Maintain high engineering standards through code reviews, automated testing, CI/CD, documentation, and well-defined runbooks
Employment Type:
Full-time

Infrastructure Software Engineer

Building cutting-edge model-specific ASICs requires crafting custom infrastructu...
Etched
Location:
Taiwan, Taipei
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • Are a systems-minded software engineer who loves building foundational platforms, working close to the metal and cloud, solving high-leverage problems at scale
  • Are a deeply technical engineer who treats infrastructure as a software problem - prioritizing clean abstractions, version control, small change lists, easy roll backs, testing, and long-term maintainability over ad hoc configuration
  • Have strong programming skills in languages such as Python, Go, Rust, and C++, and are comfortable building production-grade tooling
  • Have experience in hardware manufacturing, working with big-name firms in Taiwan
  • Possess expert-level knowledge of Linux, virtualization, containerization, and CI/CD pipelines, with a deep understanding of how to debug, optimize, and scale complex systems
  • Are familiar with Infrastructure as Code tools like OpenTofu, Ansible, or Puppet, and enjoy designing declarative, reproducible infrastructure systems
  • Understand and use PromQL and other telemetry/query languages, have used LLMs to extract insight from real-time metrics, and know how to architect and tune observability stacks
  • Have a track record of debugging and resolving difficult hardware-software integration problems across bare-metal systems, networks, and distributed workloads
  • Can lead and mentor technical teams, guiding design decisions and helping others develop sound engineering instincts
  • Have 4+ years of experience in infrastructure engineering, systems programming, or backend software development - ideally in environments where performance, scale, or hardware interaction mattered
Job Responsibility:
  • Architect and Scale Distributed Compute Systems: Design and build the orchestration layers that drive our hybrid high-performance clusters—enabling simulation, synthesis, and continuous integration of AI ASICs at unprecedented scale
  • Build Infrastructure-as-Code Systems: Develop and maintain a fully programmable infrastructure control plane to ensure reproducibility, auditability, and rapid iteration across the entire stack
  • Optimize End-to-End Developer Experience: Create tools and abstractions that empower engineers to harness massive parallelism without worrying about the underlying complexity
  • Workload Elasticity, Reliability, and Efficiency: Prototype and execute workload orchestration and migration strategies between on-premise and cloud environments, balancing performance, storage availability and replication, uptime, and cost across heterogeneous hardware and compute backends
  • Implement real-time telemetry and tracing systems that surface insights from millions of metrics, enabling proactive debugging and system optimization
  • Push the Limits of Observability: Build a full observability stack that includes dashboards, alerting, automated responses, and a synthetic testing framework to proactively test infrastructure performance and reliability for various application and data flows, ensuring we remain ahead of issues impacting development and productivity workflows
  • Build an integrated, world-class manufacturing infrastructure in close collaboration with partners to design, test, and ship the highest-quality AI acceleration hardware
What we offer:
  • Competitive compensation packages including generous equity packages
  • Comprehensive insurance coverage and other top-of-market benefits
Employment Type:
Full-time

Infrastructure Software Engineer

Building cutting-edge model-specific ASICs requires crafting custom infrastructu...
Etched
Location:
United States, San Jose
Salary:
150000.00 - 250000.00 USD / Year
Expiration Date:
Until further notice
Requirements:
  • Are a systems-minded software engineer who loves building foundational platforms, working close to the metal and cloud, solving high-leverage problems at scale
  • Are a deeply technical engineer who treats infrastructure as a software problem - prioritizing clean abstractions, version control, small change lists, easy roll backs, testing, and long-term maintainability over ad hoc configuration
  • Have strong programming skills in languages such as Python, Go, Rust, and C++, and are comfortable building production-grade tooling
  • Possess expert-level knowledge of Linux, virtualization, containerization, and CI/CD pipelines, with a deep understanding of how to debug, optimize, and scale complex systems
  • Are familiar with Infrastructure as Code tools like OpenTofu, Ansible, or Puppet, and enjoy designing declarative, reproducible infrastructure systems
  • Understand and use PromQL and other telemetry/query languages, have used LLMs to extract insight from real-time metrics, and know how to architect and tune observability stacks
  • Have a track record of debugging and resolving difficult hardware-software integration problems across bare-metal systems, networks, and distributed workloads
  • Can lead and mentor technical teams, guiding design decisions and helping others develop sound engineering instincts
  • Have 8+ years of experience in infrastructure engineering, systems programming, or backend software development - ideally in environments where performance, scale, or hardware interaction mattered
  • Are driven by curiosity, take initiative, and have an innate sense of ownership — you thrive in uncharted territory, design for edge cases, and love making systems more powerful, reliable, and elegant
Job Responsibility:
  • Architect and Scale Distributed Compute Systems: Design and build the orchestration layers that drive our hybrid high-performance clusters—enabling simulation, synthesis, and continuous integration of AI ASICs at unprecedented scale
  • Build Infrastructure-as-Code Systems: Develop and maintain a fully programmable infrastructure control plane to ensure reproducibility, auditability, and rapid iteration across the entire stack
  • Optimize End-to-End Developer Experience: Create tools and abstractions that empower engineers to harness massive parallelism without worrying about the underlying complexity
  • Workload Elasticity, Reliability, and Efficiency: Prototype and execute workload orchestration and migration strategies between on-premise and cloud environments, balancing performance, storage availability and replication, uptime, and cost across heterogeneous hardware and compute backends
  • Implement real-time telemetry and tracing systems that surface insights from millions of metrics, enabling proactive debugging and system optimization
  • Push the Limits of Observability: Build a full observability stack that includes dashboards, alerting, automated responses, and a synthetic testing framework to proactively test infrastructure performance and reliability for various application and data flows, ensuring we remain proactive against issues impacting development and productivity workflows
What we offer:
  • Medical, dental, and vision packages with generous premium coverage
  • $500 per month credit for waiving medical benefits
  • Housing subsidy of $2k per month for those living within walking distance of the office
  • Relocation support for those moving to San Jose (Santana Row)
  • Various wellness benefits covering fitness, mental health, and more
  • Daily lunch + dinner in our office
Employment Type:
Full-time

Site Reliability Engineering Specialist

The Site Reliability Engineering Specialist independently executes activities th...
Plusnet
Location:
India, Bengaluru
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • A degree in IT, Maths or Science
  • A deep understanding of full stack monitoring solutions such as Dynatrace
  • Strong proficiency in one or more programming languages (e.g. Java, Python)
  • Experience with cloud platforms (AWS, Azure, or GCP)
  • Solid understanding of software architecture, design patterns, and microservices
  • Familiarity with CI/CD tools and DevOps practices
  • High levels of quality presentation and reporting capabilities
  • Resilience to ensure support teams are engaged 24x7x365
  • Ability to adapt to latest industry trends
  • CI/CD/CT Pipeline management
Job Responsibility:
  • Executes the implementation of new software development life cycle automation tools, frameworks, and code pipelines
  • Coordinates a diverse team and creates the initial test schedule
  • Executes the implementation of automation technologies
  • Proactively identifies and manages risk
  • Leads scale testing to measure, tune and optimise system performance
  • Executes metric/monitoring analysis
  • Designs, analyses, develops and troubleshoots highly distributed large-scale production systems
  • Executes approaches that scale systems sustainably
  • Writes and delivers infrastructure as code software
  • Implements robust monitoring and alerting systems and performs root cause analysis
Employment Type:
Full-time

Senior Infrastructure Engineer

Build the backbone of next-gen defense technology! We are seeking a Senior Infra...
Randstad
Location:
Greece, Athens
Salary:
Not provided
Expiration Date:
May 28, 2026
Requirements:
  • 6+ years in designing, implementing and maintaining distributed infrastructures
  • At least 6 years of experience in complex, high-performance distributed environments
  • Deep expertise in networking, including routing, switching, VLAN/VXLAN, firewalls, load balancing and software-defined networking
  • Strong experience with VMware or similar hypervisors
  • Extensive experience designing and operating Kubernetes clusters for large-scale distributed workloads
  • Experience with storage systems such as distributed storage, SAN/NAS, and software-defined storage
  • Deep knowledge of server architecture and GPU configurations
  • Strong knowledge of Linux operating systems and system internals
Job Responsibility:
  • Own the full stack (Compute, Storage, Network, Virtualization) for a highly available, on-premises data center
  • Deploy and manage K8s clusters for massive, distributed AI workloads
  • Automate everything using Terraform, Ansible, and CI/CD pipelines
  • Optimize hardware (GPUs, high-speed networking) for demanding ML/AI requirements
  • Manage databases, monitor capacity, and ensure 'mission-ready' security and resilience
Employment Type:
Full-time

Senior ML Infrastructure / ML DevOps Engineer

We are looking for a Senior ML Infrastructure / DevOps Engineer who loves Linux,...
Pathway
Salary:
Not provided
Expiration Date:
Until further notice
Requirements:
  • Former or current Linux / systems / network administrator comfortable living in the shell and debugging at OS and network layers (systemd, filesystems, iptables/security groups, DNS, TLS, routing)
  • 5+ years of experience in DevOps/SRE/Platform/Infrastructure roles running production systems, ideally with high‑performance or ML workloads
  • Deep familiarity with Linux as a daily driver, including shell scripting and configuration of clusters and services
  • Strong experience with workload management, containerization, and orchestration (Slurm, Docker, Kubernetes) in production environments
  • Solid understanding of CI/CD tools and workflows (GitHub Actions, GitLab CI, Jenkins, etc.), including building pipelines from scratch
  • Hands-on cloud infrastructure experience (AWS, GCP, Azure), especially around GPU instances, VPC/networking, storage, and managed ML services (e.g., SageMaker HyperPod, Vertex AI)
  • Proficiency with infrastructure as code (Terraform, CloudFormation, or similar) and a bias toward automation over manual operations
  • Experience with monitoring and logging stacks (Grafana, Prometheus, Loki, CloudWatch, or equivalents)
  • Familiarity with ML pipeline and experiment orchestration tools (MLflow, Kubeflow, Airflow, Metaflow, etc.) and with model/version management
  • Solid programming skills in Python, plus the ability to read and debug code that uses common ML libraries (PyTorch, TensorFlow) even if you are not a full‑time model developer
Job Responsibility:
  • Design, operate, and scale GPU and CPU clusters for ML training and inference (Slurm, Kubernetes, autoscaling, queueing, quota management)
  • Automate infrastructure provisioning and configuration using infrastructure‑as‑code (Terraform, CloudFormation, cluster‑tooling) and configuration management
  • Build and maintain robust ML pipelines (data ingestion, training, evaluation, deployment) with strong guarantees around reproducibility, traceability, and rollback
  • Implement and evolve ML‑centric CI/CD: testing, packaging, deployment of models and services
  • Own monitoring, logging, and alerting across training and serving: GPU/CPU utilization, latency, throughput, failures, and data/model drift (Grafana, Prometheus, Loki, CloudWatch)
  • Work with terabyte‑scale datasets and the associated storage, networking, and performance challenges
  • Partner closely with ML engineers and researchers to productionize their work, translating experimental setups into robust, scalable systems
  • Participate in on‑call rotation for critical ML infrastructure and lead incident response and post‑mortems when things break
What we offer:
  • Intellectually stimulating work environment
  • Be a pioneer: you get to work with real-time data processing & AI
  • Work in one of the hottest AI startups, with exciting career prospects
  • Team members are distributed across the world
  • Responsibility and the ability to make a significant contribution to the company’s success
  • Inclusive workplace culture
Employment Type:
Full-time