Meta's Lab Infrastructure, Network, Compliance, and Security (LINCS) team is seeking a network engineer to help build and scale the network infrastructure supporting Meta's global engineering labs. Our team is responsible for network design, deployment, and operations for Meta's global engineering labs where we support multiple engineering teams. With the importance of rapidly maturing new technologies like the Metaverse and Gen AI, there are significant opportunities to re-think traditional networking and iterate quickly in our environment. This role offers an opportunity to work directly with engineering teams that are maturing new hardware and software on the path to production.
Job Responsibilities
Own end-to-end frontend and backend network design, deployment, and operations for AI and compute lab clusters
Serve as a primary networking point of contact for backend fabrics, including scale-out networks that run on Arista and internally developed network operating systems and support AI workloads
Design, deploy, and support high-throughput, low-latency cluster networking, including congestion management (PFC/ECN), RDMA validation, and lossless transport
Perform hands-on troubleshooting and root-cause analysis across L1–L4 using packet captures, telemetry, and vendor tools to resolve complex lab issues
Support silicon, hardware, and software bring-ups, ensuring reliable connectivity and on-time validation
Lead and execute lab network lifecycle activities, including upgrades, migrations, capacity expansions, and decommissioning across regions
Develop and maintain network automation, configuration templates, and zero-touch provisioning (ZTP) workflows
Create and maintain MOPs, runbooks, and readiness checklists for internal teams and vendor executions
Provide direct consultation and training to cross-functional partners, enabling teams to operate and troubleshoot lab networks
Own projects end to end, from requirements definition through customer handoff
Collaborate closely with hardware, software, systems, and lab operations teams to validate new platforms, optics, and network designs
Support limited travel (about 10%) for critical lab builds, migrations, or escalations
Requirements
6+ years of experience designing, deploying, and operating network infrastructure in production or lab environments
Experience working in multi-vendor environments, including Arista, FBOSS-based platforms, and lab networking hardware
Experience with configuration management, code repositories, and zero-touch provisioning (ZTP) for network infrastructure
Experience with IPv4/IPv6, L2/L3 protocols, including STP, OSPF, BGP, TCP/IP, DHCP, DNS, VLANs, VRRP, LACP, MC-LAG, ACLs, MACsec, and EVPN/VXLAN
Working knowledge of scripting or programming languages (e.g., Python, shell) for automation and tooling
Demonstrated ability to work consistently on your own initiative in a global, time-critical environment, seeking feedback and input where appropriate while managing multiple priorities and mission-critical timelines
Nice to have
Understanding of physical infrastructure design, including structured cabling, space, power, and cooling systems
Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy review)
Networking L1 expertise in validating multi-vendor optics, with proficiency using the BCM shell and I2C utilities to troubleshoot hardware-level issues
Experience with network automation, CI/CD pipelines, audit frameworks, and validation tooling
Hands-on experience with backend cluster networking, including scale-out fabrics, RDMA networks, and congestion management
Experience supporting AI/ML or high-performance compute clusters in lab or pre-production environments
Hands-on experience with lab test equipment, optics qualification (e.g., 400G/800G), optical switches, and physical infrastructure
Networking certifications such as CCIE, JNCIE, or equivalent
Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies
Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)
Hands-on experience with disaggregated networking products and software, such as Meta's open network OS (FBOSS), SONiC, Cumulus Linux, or equivalent open networking platforms