Infrastructure Engineer (Performance Optimization)

United Kingdom, London · Employment contract · 600.00 - 650.00 GBP / Day · Job Posted May 27, 2026

Job Description

Seeking a hands-on Infrastructure Engineer to join a newly formed Performance Optimization Squad within a massive-scale production platform.

Job Responsibility

  • Execute Optimizations: Drive well-scoped initiatives to completion, including compute resource rightsizing, JVM tuning, and workload placement
  • Build Automation: Implement infrastructure changes and build automation to scale impact across the fleet
  • Collaborate: Work within a focused team of 4 engineers, a data analyst, and an engineering manager

Requirements

  • 5 years in Infrastructure, Platform, or Backend engineering roles
  • Solid experience with Kubernetes (ideally GKE)
  • Proficiency in at least two of Java, Go, and Python (strong scripting/automation skills preferred)
  • Comfortable with GCP (compute, networking, IAM, cost monitoring)
  • Familiarity with IaC (Terraform, Helm) and CI/CD pipelines

Nice to have

  • Background in Reliability Engineering / SRE (SLOs, error budgets, safe rollouts)
  • Experience with JVM-based services at scale
  • Familiarity with GCP Billing or BigQuery cost exports

Similar Jobs for Infrastructure Engineer (Performance Optimization)

8 matching positions

Performance Infrastructure Engineer - Data Center GPU

You will be part of a small, but dedicated team driving discrete GPU products’ p...
Location: United States, Santa Clara
Salary: 192000.00 - 288000.00 USD / Year
AMD
Expiration Date: Until further notice
Requirements
  • Strong development experience in Python and/or Bash (or equivalent scripting languages)
  • Experience with GitHub, Jenkins, or similar CI/CD and code review systems
  • Linux system administration experience preferred
  • Experience developing automated test infrastructure and orchestrating multisystem workflows is preferred
  • Ansible experience is a bonus
  • Strong analytical, problem solving, and debugging skills
  • Excellent communication skills
  • Must be a critical thinker and self-starter
  • Ability to quickly learn and apply new tools, technologies, and frameworks
  • Networking experience preferred, including common protocols and basic debugging
Job Responsibility
  • Technical team lead for a team of 5-6 engineers
  • Assess and understand the current automation and performance analysis infrastructure, identifying strengths, gaps, and opportunities for improvement
  • Collaborate with internal teams to gather technical requirements and understand evolving needs
  • Develop a forward-looking plan that balances reusing existing systems with building new infrastructure where appropriate
  • Design, develop, and maintain automation and performance analysis tooling using Python, Bash, Make, and related technologies
  • Build and enhance workflow automation solutions using internally developed tools to orchestrate ML workloads
  • Develop new techniques and tooling to optimize ML workload execution, profiling, and analysis at scale

Performance & Capacity Engineer - Planning Optimization

Meta is seeking a Performance & Capacity Engineer to join the Capacity Engineeri...
Location: United States, Bellevue
Salary: 154000.00 - 217000.00 USD / Year
Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 6+ years experience in any coding language and designing software systems
  • 4+ years experience in capacity, performance, software, or reliability engineering
  • Proven experience managing ambiguity and frequently learning new technical and business concepts
Job Responsibility
  • Own both technical as well as business outcomes for capacity planning for all of Meta: all software products/services and plans for how to scale server and data center resources most efficiently
  • Use the tools you build to own the business outcomes: develop and analyze a variety of business and technical scenarios to drive the highest levels of executive decision making around infrastructure/product, up to the CxO level
  • Partner across the engineering technical landscape to optimize at the intersection of hardware, infrastructure, and software. Work closely with software service owners, Production Engineering, Server Hardware Engineering, Server Supply Chain, Network Engineering, Data Center Design, Operations, and Planning teams to find the optimal ways to scale our infrastructure and place our services
  • Design and help build scalable, reliable planning software systems that connect business strategy with detailed technical execution, including regional and temporal bin-packing, optimal service placement, traffic shifts and service migrations, efficient hardware refresh, etc.
  • Partner with Finance to balance cost efficiency with technical and product considerations
  • Greenfield work: Work cross-functionally to define problem statements, collect data, build analytical models and make recommendations to drive change and optimization at the most strategic levels
  • A lot of other cool work: Identify capacity-related issues proactively and work across technical and business teams to define and implement solutions
What we offer
  • bonus
  • equity
  • benefits
  • Fulltime

Performance & Capacity Engineer - Planning Optimization

Meta is seeking a Performance & Capacity Engineer to join the Capacity Engineeri...
Location: United States, Bellevue
Salary: 184000.00 - 257000.00 USD / Year
Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 8+ years experience in any coding language and designing software systems
  • 8+ years experience in capacity, performance, software, or reliability engineering
  • Proven experience managing ambiguity and frequently learning new technical and business concepts
Job Responsibility
  • Own both technical as well as business outcomes for capacity planning for all of Meta: all software products/services and plans for how to scale server and data center resources most efficiently
  • Build automated, scalable data and analytics solutions by developing state-of-the-art automation, mathematical optimization, and/or AI models using Meta’s unparalleled data infrastructure
  • Use the tools you build to own the business outcomes: develop and analyze a variety of business and technical scenarios to drive the highest levels of executive decision making around infrastructure/product, up to the CxO level
  • Design and help build scalable, reliable planning software systems that connect business strategy with detailed technical execution, including regional and temporal bin-packing, optimal service placement, traffic shifts and service migrations, efficient hardware refresh, etc.
  • Partner across the engineering technical landscape to optimize at the intersection of hardware, infrastructure, and software. Work closely with software service owners, Production Engineering, Server Hardware Engineering, Server Supply Chain, Network Engineering, Data Center Design, Operations, and Planning teams to find the optimal ways to scale our infrastructure and place our services
  • Partner with Finance to balance cost efficiency with technical and product considerations
  • Greenfield work: Work cross-functionally to define problem statements, collect data, build analytical models and make recommendations to drive change and optimization at the most strategic levels
  • A lot of other cool work: Identify capacity-related issues proactively and work across technical and business teams to define and implement solutions
What we offer
  • bonus
  • equity
  • benefits
  • Fulltime

Performance & Capacity Engineer - Planning Optimization

Meta is seeking a Performance & Capacity Engineer to join the Capacity Engineeri...
Location: United States, Bellevue
Salary: 117000.00 - 181000.00 USD / Year
Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
  • Minimum 4 years of experience working with distributed systems at scale
  • Proficient in any coding language and designing software systems
  • Desire to learn about capacity planning and optimization
  • Experience managing ambiguity. Experience learning and applying new business and technical concepts
Job Responsibility
  • Own both technical as well as business outcomes for capacity planning for all of Meta: all software products/services and plans for how to scale server and data center resources most efficiently
  • Build automated, scalable data and analytics solutions by developing state-of-the-art automation, mathematical optimization, and/or AI models using Meta’s unparalleled data infrastructure
  • Use the tools you build to own the business outcomes: develop and analyze a variety of business and technical scenarios to drive the highest levels of executive decision making around infrastructure/product, up to the CxO level
  • Partner across the engineering technical landscape to optimize at the intersection of hardware, infrastructure, and software. Work closely with software service owners, Production Engineering, Server Hardware Engineering, Server Supply Chain, Network Engineering, Data Center Design, Operations, and Planning teams to find the optimal ways to scale our infrastructure and place our services
  • Partner with Finance to balance cost efficiency with technical and product considerations
  • Greenfield work: Work cross-functionally to define problem statements, collect data, build analytical models and make recommendations to drive change and optimization at the most strategic levels
  • A lot of other cool work: Identify capacity-related issues proactively and work across technical and business teams to define and implement solutions
What we offer
  • bonus
  • equity
  • benefits
  • Fulltime

Performance & Capacity Engineer - Capacity Planning Optimization

Meta is seeking a Performance & Capacity Engineer to join the Capacity Planning ...
Location: United States, Menlo Park
Salary: 219000.00 - 301000.00 USD / Year
Meta
Expiration Date: Until further notice
Requirements
  • 10+ years of experience in Performance, Capacity, or software engineering
  • Proficient in Python, C++, or other coding languages and designing large scale software systems
  • Demonstrated success leading large engineering projects and initiatives: defining goals, managing ambiguity, and inspiring and leading other engineers and non-technical contributors
  • Experience with large-scale technical infrastructure and distributed systems
Job Responsibility
  • Own infrastructure capacity planning for all of Meta: all software products/services and plans for how to scale server and data center resources most efficiently
  • Partner across the engineering technical landscape to optimize at the intersection of hardware, infrastructure, and software. Work closely with software service owners, Production Engineering, Server Hardware Engineering, Server Supply Chain, Network Engineering, Data Center Design, Operations, and Planning teams to find the optimal ways to scale our infrastructure and place our services
  • Design and help build scalable, reliable planning software systems that connect business strategy with detailed technical execution, including regional and temporal bin-packing, optimal service placement, traffic shifts and service migrations, efficient hardware refresh, etc.
  • Effectively lead large engineering efforts while implementing the most complex parts of the system and process design yourself
  • Partner with Finance and business teams to balance cost efficiency with technical and product considerations
  • Work cross-functionally to define problem statements, collect data, build software-driven models, and make recommendations to drive change and optimization at the most strategic levels
What we offer
  • bonus
  • equity
  • benefits
  • Fulltime

Research Engineer / Software Engineer (platform/core infrastructure)

Build the future of offensive security with XBOW. Attackers are already using AI...
Location: United States
Salary: 150000.00 - 350000.00 USD / Year
Xbow
Expiration Date: Until further notice
Requirements
  • Strong experience building and operating scalable, distributed systems on cloud infrastructure such as AWS or similar
  • Comfortable working with infrastructure as code (e.g., Terraform, CDK)
  • A track record of performance tuning across cloud services, databases, and compute layers
  • Eager to learn new tools, languages, and technologies as needed
  • A thoughtful communicator who values clarity and simplicity and is comfortable working in a fast-paced startup and navigating ambiguity
  • Strong problem-solving skills and the ability to work with incomplete information
  • Curious, practical, and eager to work across layers of the stack when needed
  • You think proactively about failure modes and bring experience implementing disaster recovery and business continuity plans that keep critical systems running
Job Responsibility
  • Design and implement infrastructure systems that scale reliably and securely, and can be deployed across multiple cloud environments (AWS, Azure, OCI, etc.) and contexts (SaaS, on-prem)
  • Tune and optimize cloud services across compute, storage, networking, and observability to drive performance, reliability and maintainability of core services
  • Develop our core services, written in TypeScript, Kotlin and Go
  • Support large-scale systems with event-driven architectures
  • Own problems end-to-end—from design through deployment to production support
  • Navigate ambiguity and help define how we build as much as what we build
  • Partner closely with other engineers, AI researchers, and security researchers to enable high-quality, high-velocity product development
  • Design for resilience by implementing disaster recovery and business continuity strategies that ensure uptime, even when things break
  • Improve how we build, deploy, and monitor services at scale
What we offer
  • Competitive salary and a generous equity package
  • Career Growth: Shape your role, lead the function, and grow with the company
  • Meaningful Work: You will tackle technically complex challenges and play a pivotal role in the growth of our business
  • Fulltime

Infrastructure Engineer

We are looking for an Infrastructure Engineer to join our team in an onsite cont...
Location: United States, Randolph
Salary: Not provided
Robert Half
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Information Technology, or a related technical field
  • At least 3 years of experience in network engineering, infrastructure engineering, or a similar technical role
  • Hands-on experience deploying and supporting servers and enterprise storage platforms
  • Strong working knowledge of virtualization technologies such as VMware or Hyper-V
  • Demonstrated experience administering Windows Server environments in an enterprise setting
  • Solid understanding of core networking concepts and protocols, with practical experience in routing, switching, and secure connectivity
  • Experience working with enterprise hardware and security solutions, including Cisco, Aruba, or Palo Alto technologies
  • Palo Alto Network Security Architect certification and Cisco CCNP certification, with scripting experience in Python, PowerShell, or Bash preferred
Job Responsibility
  • Design, implement, and support infrastructure solutions that promote secure, dependable connectivity and consistent system performance across the organization
  • Administer core server, storage, and virtualization environments while monitoring capacity, availability, and overall health
  • Strengthen infrastructure security by applying best practices, maintaining system hardening standards, and supporting compliance-related requirements
  • Investigate and resolve advanced infrastructure and network issues, serving as an escalation resource for complex technical incidents
  • Maintain and optimize Windows Server environments to improve reliability, manageability, and scalability
  • Support backup and disaster recovery readiness by contributing to recovery planning, testing activities, and continuity efforts
  • Work with enterprise networking and security hardware, including Cisco and Palo Alto platforms, to maintain stable and secure operations
  • Develop and use automation scripts to streamline administration, reduce manual effort, and improve consistency across infrastructure tasks
What we offer
  • Medical
  • Vision
  • Dental
  • Life and disability insurance
  • 401(k) plan

Infrastructure Engineer

Reducto is the agentic document platform for leading AI teams who demand enterpr...
Location: United States, San Francisco
Salary: 150000.00 - 300000.00 USD / Year
Reducto
Expiration Date: Until further notice
Requirements
  • Are your own worst critic—have an extremely high bar for quality and always aim for robust solutions rather than quick fixes
  • Have 5+ years of hands-on experience in building or supporting production-grade infrastructure and reliability processes for high-throughput systems
  • Are comfortable with Python or similar languages, and exceptional at working across cloud platforms, container orchestration (e.g., Kubernetes), networking, and storage technologies
  • Build your own tools on the fly to diagnose, experiment, and address reliability problems—whether it's an internal dashboard or an automated remediation workflow
  • Bring a quantitative, hands-on approach to system operations, automation, and continuous improvement
Job Responsibility
  • Designing, building, and maintaining highly available, scalable infrastructure to support intensive AI/ML workloads and real-time model deployments
  • Implementing robust monitoring, alerting, and observability systems to ensure system health, performance, and uptime across cloud and on-prem environments
  • Debugging, optimizing, and automating infrastructure for fast iteration and rapid deployment cycles, focusing on both reliability and developer velocity
  • Proactively identifying, investigating, and resolving incidents to minimize downtime and maintain world-class service levels for enterprise customers
  • Collaborating closely with engineers, ML specialists, and founders to shape product, infrastructure, and security strategies
What we offer
  • Unlimited PTO
  • Lunch
  • Reimbursed Transportation
  • Insurance: Generous health insurance covering medical, dental, and vision
  • Health and Wellness Budget: We provide up to $150/mo reimbursement for health and wellness spending, such as gym memberships, fitness classes, or similar
  • Parental Leave
  • Fulltime