Data Center Production Operations Engineer

Meta

Location:
Singapore

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Meta is seeking a forward-thinking, experienced engineer to join the Production Operations team within our Data Centers. These Data Centers are the foundation upon which our rapidly scaling infrastructure efficiently operates and upon which our innovative services are delivered. Meta is at the leading edge of the global data center industry in terms of both how data centers are designed and how they are operated. This person should enjoy working in a fast-paced, technical environment where adaptability and flexibility will be key to their success. We seek an IT professional with advanced, hands-on technical skills in server hardware and Linux, ideally in a Data Center environment. Extensive knowledge of server administration and experience working on complex projects in a large-scale distributed data center environment are core competencies for this role. The candidate should also have good knowledge of and experience in a few of the following core areas: Hardware repair, OS management, Tooling and Automation, Networking, or Technical Project Management.

Job Responsibility:

  • Support platform health by successfully resolving and closing complex tickets, while addressing the overall issue (i.e. addressing root cause) including, but not limited to, remote troubleshooting and physical inspection of services in data halls
  • Perform in-depth exploration and root cause analysis of complex technical issues within the data center, ranging from automated tooling to hardware failures and network issues
  • Facilitate collaboration with cross-functional teams on projects and initiatives related to topics such as process, hardware and automation
  • Lead the introduction of new platforms and hardware to the site and geographical area, in collaboration with partners and global resources, accelerating the time it takes to bring these products to sustained mass production
  • Use tools and data analysis effectively to identify issues that are larger in scope and which impact one or multiple Data Centers. Take actions to communicate with all stakeholders appropriately and manage or escalate as needed
  • Drive corrective actions of complex hardware issues, work with internal teams and vendors, provide an ownership stake, and influence future design changes to ensure ease of serviceability
  • Solve complex and systemic hardware and/or software issues at scale using scripting, automation, and tooling to drive global resolution
  • Continuously evaluate and identify areas for improvement in processes, tools, and systems to optimize efficiency and quality of repairs
  • Use data analytics to drive maximum server up-time and utilization rates, understanding hardware failure rates and service level agreements
  • Coach and mentor team members to evaluate and identify better ways to resolve issues, and define updates to tools and processes
  • Provide engineering support and be a go-to technical resource and Subject Matter Expert for the team, leadership, and cross-functional teams in all aspects of operating and maintaining data center servers
  • Maintain and update documentation such as procedures, runbooks, and guides
  • Build cross functional relationships and influence policies and procedures that improve global data center operations
  • Participate in 24/7 on-call rotation
  • Ability to travel up to 15% of the time

Requirements:

  • BS, BA or BEng in technical field or commensurate experience
  • 7+ years of technical IT experience within an infrastructure environment, in a role such as Systems Administrator, DevOps Engineer, or Site Reliability Engineer
  • Expert in Linux (or equivalent OS) in a complex IT environment with the ability to triage, debug, and troubleshoot complex, systemic issues
  • Hands-on experience and knowledge of server hardware and components, including storage
  • Experience of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, and network
  • Experience managing multiple technical issues concurrently and driving each to its root cause
  • Experience participating in or leading technical projects related to areas such as process improvement, technology, and/or automation. Brings peers, partners and other resources into the project where additional expertise is needed, and to provide growth and learning opportunities for others
  • Ability to communicate effectively, in a clear and concise manner, appropriately tailoring messages to the audience
  • Extensive technical knowledge of technologies such as HTTP, DNS, RAID, and DHCP
  • Experience in providing technical guidance to external vendors
  • Experience debugging, modifying, and developing code in at least one commonly used scripting or programming language: Bash, PHP, Python, SQL, Rust, Go, or Perl
  • Knowledge of out-of-band/lights-out server communication methods, such as IPMI and serial console
  • Experience using data and metrics to drive decisions

Nice to have:

  • Proven experience in fostering growth in others, and driving influence across all organizational levels
  • Experience in a large-scale data center environment
  • Experience with large-scale AI implementations
  • Six Sigma knowledge/certification

Additional Information:

Job Posted:
January 23, 2026

Similar Jobs for Data Center Production Operations Engineer

IS Data Center Operations Engineer

Bridging Information Technology (IT) and the Mechanical, Electrical, and Plumbin...
Location:
United States, New Albany
Salary:
91731.00 - 114948.00 USD / Year
Amgen
Expiration Date:
Until further notice
Requirements:
  • Master’s degree
  • Bachelor’s degree and 2 years of data center operations experience
  • Associate’s degree and 6 years of data center operations experience
  • High school diploma / GED and 8 years of data center operations experience
  • Hands-on experience with rack/stack, structured cabling, and IT hardware installation
  • Familiarity with Dell PowerEdge, Nutanix, NetApp, and Cisco platforms
  • Ability to interpret electrical and mechanical drawings (awareness-level competency)
  • Experience using monitoring, alerting, or automation systems (AI-enabled platforms preferred)
  • Solid understanding of IT operations concepts including hardware lifecycle management and disaster recovery
  • Ability to read and update documentation, diagrams, and cable records
Job Responsibility:
  • Serve as the liaison between IT teams and facilities staff, ensuring flawless communication
  • Interpret electrical one-line diagrams, distribution drawings, and cooling schematics to support incident response and planning
  • Install, rack, cable, and support enterprise IT systems including Dell PowerEdge, Nutanix, NetApp, and Cisco technologies
  • Support day-to-day moves, adds, and changes (MACs) in building IDF and VDER environments
  • Perform fiber and copper patch cabling in data centers, IDFs, and VDER closets
  • Trace and troubleshoot cabling issues to restore connectivity
  • Monitor infrastructure, proactively detect issues, and escalate them with urgency to the appropriate teams
  • Apply AI-enabled monitoring and automation platforms to enhance data center operations
  • Maintain documentation of infrastructure layouts, procedures, and operational standards
  • Participate in capacity planning, disaster recovery drills, and continuous improvement initiatives
What we offer:
  • A comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
  • A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
  • Stock-based long-term incentives
  • Award-winning time-off plans
  • Flexible work models, including remote and hybrid work arrangements, where possible
  • Fulltime

Data Center Operations Manager

As the Manager of our datacenter operations team you’ll contribute in the strate...
Location:
United States, Santa Clara
Salary:
122500.00 - 179630.00 USD / Year
Rackspace
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in computer science, computer engineering or a related field. Additional experience may substitute for the degree
  • 7+ years of experience as a data center operations technician
  • Previous people management experience within a Data Center is required
  • Demonstrated successful experience meeting data center production/operation schedules
Job Responsibility:
  • Manage a team of datacenter operation engineers and maintain a better than 99.999% uptime through impeccable housekeeping and robust operational discipline
  • Report on operational performance to the leadership team
  • Recommend changes in procedures or equipment that would increase productivity, reduce cost, and better serve Data Center requirements and customers
  • Train employees on policies and procedures and engage them in change
  • Recommend employees for hiring, firing, promotions and demotions
  • Provide input on pay reviews
  • Prepare and perform performance appraisals
  • Monitor and prioritize an internal ticketing system
  • Provide operating system storage troubleshooting, along with storage upgrades, hardware troubleshooting and RAID configuration changes
  • Provide hardware support and upgrades for servers running Microsoft Windows Server, Red Hat Enterprise Server, Ubuntu Linux or VMware ESX Server
What we offer:
  • Incentive compensation opportunities in the form of an annual bonus or incentives, equity awards and an Employee Stock Purchase Plan (ESPP)
  • Fulltime

Sr. Network Data Center Engineer

If you live and breathe networking, virtualization, and high-availability system...
Location:
United States
Salary:
150000.00 USD / Year
Corporate Tools
Expiration Date:
Until further notice
Requirements:
  • Experience with Proxmox or other hypervisors (VMware, KVM, Xen, Hyper-V)
  • 5+ years of network engineering, data center operations, or cloud infrastructure
  • Experience with Ceph or SAN-based storage solutions (iSCSI, NFS, ZFS)
  • Experience with containers and networking
  • Excellent problem-solving skills and a keen eye for detail
  • Ability to work on projects solo or with a team
  • Love for learning and improving code
  • Strong communication and collaboration skills
  • Understanding of Ceph storage architecture (OSDs, MONs, MDS, RADOS, etc.)
  • Experience in iSCSI/NFS/ZFS SAN setups and performance tuning
Job Responsibility:
  • Develop and design robust and scalable software solutions
  • Take ownership of projects from conception to deployment, ensuring timely delivery and meeting the specified requirements
  • Work closely with cross-functional teams, including IT, product management, and other software teams, to ensure seamless integration and alignment with business objectives
  • Stay updated with the latest industry trends, technologies, and best practices to bring innovative solutions to the table
  • Design, implement, and maintain a robust network architecture that supports Proxmox virtualization, Ceph/SAN storage, and container networking
  • Manage firewalls (iptables, pfSense, UFW, etc.) to secure access to virtualized environments and hosting services
  • Configure and optimize VLANs, subnets, and routing to ensure isolated and secure network segments for virtual machines, storage, and frontend applications
  • Configure and maintain VPNs, BGP, OSPF, or other routing protocols to ensure proper network redundancy and failover
  • Set up and maintain bridged, NAT, and VXLAN networking in Proxmox for efficient VM communication
  • Implement high-availability (HA) networking for Hypervisor networks and Ceph/SAN clusters
What we offer:
  • 100% employer-paid medical, dental and vision for employees
  • Annual review with raise option
  • 22 days Paid Time Off accrued annually, and 4 holidays
  • After 3 years, PTO increases to 29 days. Employees transition to flexible time off after 5 years with the company—not accrued, not capped, take time off when you want
  • The 4 holidays are: New Year’s Day, Fourth of July, Thanksgiving, and Christmas Day
  • Paid Parental Leave
  • Up to 6% company matching 401(k) with no vesting period
  • Quarterly allowance: use it to make your remote work setup more comfortable, for continuing education classes, a plant for your desk, coffee for your coworker, a massage for yourself... really, whatever
  • Open concept office with friendly coworkers
  • Fulltime

Data Center QA Engineer

Designs, develops, troubleshoots and debugs software programs for software enhan...
Location:
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • MS/BS degree in Computer Science or equivalent experience
  • Expert knowledge of Layer 2 and Layer 3 technologies through validation or deployment of networking products
  • Hands-on experience with at least one popular Networking OS: JunOS, NXOS, IOS, EOS, or SONiC
  • Solid understanding of Clos-based Data Center network architectures (3-stage and 5-stage)
  • Familiarity with Data Center protocols such as VXLAN and MP-BGP
  • Proficiency in Python programming
  • Strong grasp of Linux-based systems and network troubleshooting tools
  • A quality-focused mindset with a keen eye for identifying product and interaction limitations
  • 5+ years of relevant experience
Job Responsibility:
  • Test IP networking-related software products to ensure they operate as defined by requirements
  • Build network configurations to model well-optimized network reference designs
  • Plan, develop, and execute automated and manual test plans
  • Provide constructive feedback, report issues, and interact with developers to deliver superior product quality
  • Review requirements from the Product Management team
  • Utilize network troubleshooting tools (packet captures, monitoring devices, log files, customer input) to resolve issues effectively
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Fulltime

Resident Engineer

We are looking for a Resident Engineer to join our team in a highly technical, p...
Location:
United States, All, Maryland
Salary:
89400.00 - 206500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Minimum 7 - 10 years of hands-on network engineering experience
  • Network engineering experience in a consultative manner supporting full-stack enterprise solutions
  • Data Center Switching, Data Center Routing, Data Center Security, and EVPN/VXLAN
  • Hands-on experience supporting Juniper QFX and EX product family platforms
  • Intermediate to Advanced Knowledge of EVPN/VXLAN
  • Juniper Networking Specialist certification is required, Professional level preferred
  • Expert level relationship management, communications skills, and interpersonal skills to manage face-to-face communications daily with multiple levels of customer management and engineering staff across multiple departments within the host company in a responsible and professional fashion
  • Customer culture alignment in a strong technical team with deep expertise across multiple layers of technical responsibility
  • Expert-level troubleshooting methodology to isolate and identify configuration, design, and software anomalies
  • Ability to clearly articulate findings in written and verbal communications with development-level engineering staff
Job Responsibility:
  • Develop and maintain expertise on the products deployed within the customer’s network
  • Provide post-sales on-site support for Juniper’s networking products
  • Troubleshoot and identify configuration, design, and software anomalies
  • Manage multiple projects and customer engagements
  • Align with customer culture in a strong technical team environment
  • Suggest and implement Juniper solutions when appropriate
What we offer:
  • Health & Wellbeing suite supporting physical, financial, and emotional wellbeing
  • Specific programs for personal and professional development
  • Celebrating individual uniqueness through unconditional inclusion
  • Flexibility for work-life balance
  • Fulltime

Data Engineer

We are looking for a skilled and enthusiastic Data Engineer to help design and o...
Location:
United States, East Windsor
Salary:
Not provided
Beaconfire
Expiration Date:
Until further notice
Requirements:
  • Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases
  • Experience building and optimizing ‘big data’ data pipelines, architectures and data sets
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
  • Strong analytic skills related to working with unstructured datasets
  • Build processes supporting data transformation, data structures, metadata, dependency and workload management
  • A successful history of manipulating, processing and extracting value from large disconnected datasets
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores
  • Strong project management and organizational skills
  • Experience supporting and working with cross-functional teams in a dynamic environment
Job Responsibility:
  • Create and maintain optimal data pipeline architecture
  • Assemble large, complex data sets that meet functional / non-functional business requirements
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics
  • Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs
  • Keep our data separated and secure across national boundaries through multiple data centers and AWS regions
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader
  • Work with data and analytics experts to strive for greater functionality in our data systems
  • Fulltime

Resident Engineer

Resident Engineer position with HPE requiring TS/SCI with Full Scope Polygraph c...
Location:
United States, Columbia/Fort Meade
Salary:
101900.00 - 234500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Operations and maintenance of a data center network
  • Deep understanding of routing and switching technologies including EVPN, Multicast, OSPF, BGP, RSVP, LDP, pseudowires, L2VPN, L3VPN, MPLS Traffic Engineering (TE), Routing Policies, and tunneling technologies (GRE, VXLAN, etc.)
  • Experience with Juniper products: QFX, MX, EX, SRX
  • Engineering Design/Architecture, Operational, and Support experience in a medium to large scale data center network
  • Automation and DevOps knowledge
  • Automation access methods: XML API, NETCONF, REST API, gRPC
  • Python tools: RPCs, REST, PyEZ, ncclient
  • Ansible: Playbooks and Jinja2 templates
  • Automation scripts: Commit, Op, Event, or SNMP scripts
  • Python and SLAX
Job Responsibility:
  • Operations and maintenance of a data center network
  • Support customer engineers and JTAC to resolve hardware and software issues
  • Implement solutions
  • Troubleshoot equipment and network problems
  • Open/track JTAC cases through to problem resolution
  • Deliver root cause analysis
  • Support technology briefings and training sessions
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
  • Career development programs
  • Unconditional inclusion environment
  • Fulltime