AI Infrastructure Engineer

BlackRock Investments

Location:
United Kingdom, Edinburgh

Contract Type:
Not provided

Salary:
Not provided

Job Description:

At BlackRock, technology underpins everything we do. AI is a core strategic priority for the firm, embedded across Aladdin and our investment, client, and operational platforms. We are seeking an AI Infrastructure Engineer to help build and operate the foundational infrastructure that enables AI systems to scale safely, securely, and reliably across the enterprise. This role sits within Aladdin Platform Engineering and focuses on the infrastructure and platform services required to support machine learning models, large language models (LLMs), and emerging AI capabilities in production. The successful candidate will work closely with AI Engineers, Data Scientists, Platform Engineers, Security, and Product partners to deliver resilient, cloud-native AI platforms in a highly regulated environment.

Job Responsibility:

  • Design, build, and operate AI-focused infrastructure platforms supporting model development, training, evaluation, and inference
  • Engineer scalable, reliable, and secure cloud-native services to support AI workloads across AWS, Azure, and hybrid environments
  • Partner with AI Engineering and Data Science teams to improve developer experience, performance, and operational stability of AI systems
  • Enable production deployment of ML models and LLMs within governed enterprise environments, aligned with firmwide risk and compliance standards
  • Implement and maintain infrastructure as code and automation to ensure repeatable, auditable platform provisioning
  • Build and operate observability, monitoring, and alerting solutions for AI platforms, ensuring availability, performance, and cost transparency
  • Collaborate with Security and Risk partners to integrate identity, access controls, data protection, and governance into AI infrastructure
  • Contribute to architectural decisions and technical standards for AI platforms across Aladdin
  • Participate in on-call rotations and operational support as required for critical platforms
  • Continuously evaluate emerging AI infrastructure technologies and apply them pragmatically within BlackRock’s enterprise context

Requirements:

  • Strong experience in cloud infrastructure, platform engineering, or systems engineering roles
  • 4+ years of hands-on expertise with AWS, Azure, and/or GCP, including Azure ML, Azure Foundry, AWS Bedrock, Google Vertex, as well as cloud compute, networking, storage, and security services
  • Understanding of ML platform operations and governance concepts, including model deployment strategies, lifecycle management, monitoring/observability, and disaster recovery
  • Experience supporting LLMs, generative AI platforms, or model serving infrastructure
  • Experience supporting AI and machine learning workloads, with exposure to managed compute for model training and fine-tuning, experimentation over large datasets, and end-to-end MLOps pipeline flow including data ingestion, training, validation, and deployment
  • Proficiency with Infrastructure as Code tools (e.g., Terraform, ARM/Bicep, CloudFormation)
  • Strong programming or scripting skills (e.g., Python, Bash, or similar)
  • Experience building and operating containerized and Kubernetes-based platforms
  • Solid understanding of reliability, scalability, observability, and operational best practices
  • Ability to work effectively in cross-functional teams and communicate complex technical concepts clearly

Nice to have:

  • Familiarity with GPU or accelerator-based infrastructure
  • Experience working in financial services or other highly regulated industries
  • Familiarity with multicloud architectures and enterprise governance requirements

What we offer:

  • Retirement investment and tools designed to help you in building a sound financial future
  • Access to education reimbursement
  • Comprehensive resources to support your physical health and emotional well-being
  • Family support programs
  • Flexible Time Off (FTO)

Additional Information:

Job Posted:
February 20, 2026

Expiration:
May 27, 2026

Employment Type:
Full-time

Work Type:
Hybrid work

Similar Jobs for AI Infrastructure Engineer

Senior AI Infrastructure Engineer

This role will be responsible for designing, deploying, and maintaining high-per...
Location:
United States, Bothell; Overland Park; Bellevue
Salary:
113600.00 - 205000.00 USD / Year
T-Mobile
Expiration Date:
Until further notice
Requirements:
  • 5+ years technical engineering experience, preferably in multiple technology focus areas
  • Expert understanding of AI/ML infrastructure components or GPU-based systems – preferably in a high-availability, large-scale environment
  • Hands-on experience with NVIDIA DGX servers, BasePOD architectures, and advanced GPU technologies
  • Proficient in Linux/UNIX environments, including scripting/automation tools (Bash, Python, Ansible, Terraform)
  • Understanding of AI infrastructure security best practices
  • Experience with container orchestration (Kubernetes, Docker) and GPU workload management tools
  • Strong knowledge of networking (InfiniBand/Ethernet) and storage solutions in AI/ML contexts
Job Responsibility:
  • Technical System Expertise: Understands system protocols, how systems operate and data flows
  • Technical Engineering Services: Drives engineering projects by active contribution to the application of engineering techniques
  • Innovation: Contributes to designs that implement new ideas to improve existing and new systems, processes, and services
  • Technical Writing: Writes basic documentation on how technology works
  • Technical Leadership: Collaborates with technical teams and utilizes system expertise to deliver technical solutions
  • Technology Strategy: Contributes to new and existing technology options that support business goals
What we offer:
  • Competitive base salary and compensation package
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Medical, dental and vision insurance
  • Flexible spending account
  • Paid time off
  • Paid holidays
  • Paid parental and family leave
  • Full-time

Software Engineer, AI Infrastructure

As a Software Engineer on our AI Infrastructure team, you will help design the c...
Location:
United States, New York, NY; San Mateo, CA
Salary:
Not provided
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 3 years of experience in software engineering, with a focus on infrastructure or machine learning systems
  • Strong programming skills in Python, Go, or a similar language
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, MLflow, Vertex AI, SageMaker, Kubernetes)
  • Basic understanding of LLM concepts (e.g., context length, disaggregated prefill, KV cache memory estimation)
Job Responsibility:
  • Contribute to the design and development of scalable backend infrastructure that supports distributed training, inference, and data pipelines
  • Build and maintain core backend services such as LLM CI/CD pipeline, control plane, and model serving systems
  • Support performance optimization, cost efficiency, and reliability improvements across compute, storage, and networking layers
  • Build frameworks and safeguards to ensure Fireworks AI has the best model quality in the industry
  • Collaborate with performance, training, and product teams to translate research and product needs into infrastructure solutions
  • Participate in code reviews, technical discussions, and continuous integration and deployment processes
What we offer:
  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation
  • Full-time

AI Research Engineer, Data Infrastructure

As a Research Engineer in Infrastructure, you will design and implement a robust...
Location:
United States, Palo Alto
Salary:
180000.00 - 250000.00 USD / Year
1X Technologies
Expiration Date:
Until further notice
Requirements:
  • Strong experience in building data pipelines and ETL systems
  • Ability to design and implement systems for data collection and management from robotic fleets
  • Familiarity with architectures that span on-robot components, on-premise clusters, and cloud infrastructure
  • Experience with data labeling tools or building dataset visualization and annotation tooling
  • Proficiency in creating or applying machine learning models for dataset organization and automated labeling
Job Responsibility:
  • Optimize operational efficiency of data collection across the NEO robot fleet
  • Design intelligent triggers to determine when and what data should be uploaded from the robots
  • Automate ETL pipelines to make fleet-wide data easily queryable and training-ready
  • Collaborate with external dataset providers to prepare diverse multi-modal pre-training datasets
  • Build frontend tools for visualizing and automating the labeling of large datasets
  • Develop machine learning models for automatic dataset labeling and organization
What we offer:
  • Equity
  • Health, dental, and vision insurance
  • 401(k) with company match
  • Paid time off and holidays
  • Full-time

Senior Platform Engineer - CI/CD & AI Automation (AI-first)

Groupon is undergoing a critical platform transformation, modernizing its core d...
Location:
Czechia, Prague
Salary:
Not provided
Groupon
Expiration Date:
Until further notice
Requirements:
  • 5+ years of dedicated experience in Platform Engineering, DevOps, or Infrastructure roles
  • Deep expertise building, scaling, and migrating CI/CD systems, with strong practical experience in Jenkins and/or GitHub Actions
  • Expertise in scripting and automation (Python, Go, or Bash)
  • Solid understanding of container technologies, Kubernetes, and cloud build systems
  • Proven experience leveraging AI tooling (e.g., Claude Code, code analysis) to meaningfully increase developer output and optimize platform work
  • Excellent communication and ability to drive technical decisions across multiple platform and product teams
Job Responsibility:
  • Platform Transformation: Lead the design, planning, and execution of the Jenkins-to-GitHub Actions migration across a large portfolio of microservices
  • Pipeline Engineering: Design and optimize high-performance, secure, and observable CI/CD workflows across GitHub Actions, Jenkins, and Kubernetes environments
  • AI-First Automation: Drive an AI-First workflow by leveraging tools (e.g., Copilot, code generation) to eliminate infrastructure toil, accelerate development, and analyze pipeline failures
  • Core Automation: Develop robust platform automation (e.g., Python, Go, Bash) to improve build efficiency, artifact caching, reliability, and repository hygiene
  • Security & Compliance: Harden CI/CD infrastructure with robust controls for secrets management, RBAC, audit logging, and secure runner design
  • Observability: Implement and enhance CI/CD observability using tools like Prometheus, Grafana, and OpenTelemetry to provide deep insights into performance and reliability
  • Technical Leadership: Mentor engineers and partner across Cloud, Security, and Developer Experience teams to define and evolve our end-to-end delivery platform architecture

Engineering Manager, AI Platform

Lead Airtable's AI Platform pod, which builds the foundational infrastructure an...
Location:
United States, San Francisco; New York City
Salary:
240000.00 - 339900.00 USD / Year
Airtable
Expiration Date:
Until further notice
Requirements:
  • Platform builder at heart: think in systems and abstractions, with experience building infrastructure other teams depend on
  • Technical depth with strategic thinking
  • Systems thinker with shipping velocity
  • AI infrastructure experience: worked on ML platforms, agent frameworks, or AI infrastructure at scale
  • Quality through architecture
  • Strong technical and management growth trajectory: 5+ years experience as an engineer (previously in a staff or TL level IC position) and 1+ years as a manager, or a similar combination
Job Responsibility:
  • Build the AI platform foundation: own the core agent architecture, orchestration layer, and runtime
  • Design for platform scale: create robust abstractions and APIs
  • Establish AI reliability systems: build evaluation frameworks, monitoring, and quality assurance systems
  • Drive technical strategy: partner with Staff+ engineers to define the technical roadmap
  • Enable AI democratization: build platform capabilities that make sophisticated AI accessible to all Airtable users
What we offer:
  • Benefits
  • Restricted stock units
  • Incentive compensation
  • Full-time

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility:
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer:
  • Medical, dental, vision, and 401(k)
  • Full-time

Engineering Manager - Machine Learning Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
241200.00 - 400000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 8–10 years of experience in ML infrastructure, including direct hands-on expertise as an engineer, IC/TL
  • 2+ years of experience managing infrastructure or ML platform engineers
  • Proven experience delivering and operating ML or AI infrastructure at scale
  • Solid technical depth across ML/AI infrastructure domains (e.g., feature stores, pipelines, deployment, inference, observability)
  • Demonstrated ability to drive execution on complex technical projects with cross-team stakeholders
  • Strong communication and stakeholder management skills
Job Responsibility:
  • Lead and support the ML Infra team, driving project execution and ensuring delivery on key commitments
  • Build and launch Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Define and drive adoption of an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines, deployment tooling, and inference systems
  • Partner with ML product teams to understand requirements and deliver solutions that accelerate model development and iteration
  • Recruit, mentor, and develop engineers, fostering a collaborative and high-performing team culture
What we offer:
  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission
  • Full-time

Data Infrastructure Engineer

Data Infrastructure Engineer – New York or DC (hybrid) – Competitive Salary + Eq...
Location:
United States, New York or DC
Salary:
Not provided
Orbis Consultants
Expiration Date:
Until further notice
Requirements:
  • Startup Energy: You thrive in fast-paced environments, manage ambiguity well, and focus on what moves the needle
  • Designing and deploying intuitive, user-friendly APIs
  • Demonstrated ability to train and deploy models at scale
  • Successfully launching machine learning services, particularly those leveraging LLMs, embeddings, and inference, into production environments
  • Handling and securing large-scale production data
  • Demonstrated proficiency in Python, Go, or C
  • A proactive approach to tackling complex challenges in a fast-paced, early-stage environment
  • A passion for innovation and a collaborative spirit
Job Responsibility:
  • Joining as part of the founding Engineering team, you will be a key part of developing secure data sharing middleware
  • Their software will integrate seamlessly into the workflows of specialized professionals, ensuring secure and efficient data access throughout the asset recruitment process
  • The Data Infrastructure Engineer role requires a mix of software development and ML Ops practices, resulting in an exciting, fast-paced engineering role
  • You will be able to demonstrate experience building, shipping, and supporting mission-critical services that make up the Data platform
  • This role requires the ability to provide solutions for the full data stack, spanning the data management, software development, and model deployment lifecycles
What we offer:
  • Competitive Salary + Equity
  • Full-time