Senior Software Engineer - ML Infrastructure

United States, New York · 190800.00 - 286800.00 USD / Year · Job Posted March 22, 2026

Job Description

We build simple yet innovative consumer products and developer APIs that shape how everybody interacts with money and the financial system. We believe that the way people interact with their finances will drastically improve in the next few years. We’re dedicated to empowering this transformation by building the tools and experiences that thousands of developers use to create their own products.

Plaid powers the tools millions of people rely on to live a healthier financial life. We work with thousands of companies like Venmo, SoFi, several of the Fortune 500, and many of the largest banks to make it easy for people to connect their financial accounts to the apps and services they want to use. Plaid’s network covers 12,000 financial institutions across the US, Canada, UK and Europe. Founded in 2013, the company is headquartered in San Francisco with offices in New York, Washington D.C., London and Amsterdam.

Plaid is evolving into an AI-first company, where data and machine learning are the key enablers of smarter, more secure insight products built on top of Plaid’s vast financial data network. The Machine Learning Infrastructure team sits at the center of this transformation. We build the platforms that enable model developers to experiment, train, deploy, and monitor machine learning systems reliably and at scale, from feature stores and pipelines to deployment frameworks and inference tooling. We are in the midst of a pivotal shift: replacing legacy systems with a modern feature store, and establishing a standardized ML Ops “golden path.” Our mission is to enable Plaid’s product teams to move faster with trustworthy insights, deploy models with confidence, and unlock the next generation of AI-powered financial experiences.

As a Senior Software Engineer on the Machine Learning Infrastructure team, you will design, build, and operate the systems that power machine learning across Plaid. You will apply your deep technical expertise to create scalable, reliable, and secure ML platforms, and collaborate closely with ML product teams to accelerate the delivery of ML & AI-powered products. This is a highly technical, hands-on role where you’ll contribute to core infrastructure, influence architectural direction, and mentor peers while helping to define the “golden path” for ML development and deployment at Plaid.

Job Responsibility

  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance

Requirements

  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams

Nice to have

  • Experience with ML Ops tools such as MLFlow, SageMaker, or model registries
  • Exposure to modern AI infrastructure environments (LLMs, real-time inference, agentic models)
  • Background in scaling ML infrastructure in fast-paced product environments

Similar Jobs for Senior Software Engineer - ML Infrastructure

8 matching positions

Senior Software Engineer - ML Infrastructure

We are seeking a Senior Software Engineer to design and build the infrastructure...
Location: United States, Boston
Salary: 152000.00 - 224000.00 USD / Year
Company: SimpliSafe
Expiration Date: Until further notice
Requirements
  • 5+ years of experience building software systems and infrastructure
  • 3+ years of experience deploying and supporting production solutions on AWS
  • Experience building and operating production applications on Kubernetes
  • Experience with AWS data services such as Athena, Glue and Kinesis
  • Familiarity with AWS services such as Lambda, DynamoDB and IAM
  • Expertise in containers, infrastructure automation, and CI/CD tooling
Job Responsibility
  • Design, build, and maintain software systems and infrastructure that support the end-to-end ML lifecycle
  • Support the development and operation of production-grade machine learning solutions
  • Develop and operate microservices in a public cloud environment (AWS, Azure, or GCP)
  • Collaborate cross-functionally with ML and platform teams to deliver scalable solutions
  • Provide technical guidance and mentorship to engineers
  • Promote and practice high engineering standards, including unit, integration, and mock testing
  • Contribute to cloud infrastructure automation, CI/CD pipelines, and containerized deployments
  • Take ownership of projects with a proactive, “can-do” mindset
What we offer
  • A mission- and values-driven culture and a safe, inclusive environment where you can build, grow and thrive
  • A comprehensive total rewards package that supports your wellness and provides security for SimpliSafers and their families
  • Free SimpliSafe system and professional monitoring for your home
  • Employee Resource Groups (ERGs) that bring people together, give opportunities to network, mentor and develop, and advocate for change
Employment type: Full-time

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location: United States, San Francisco
Salary: 180000.00 - 270000.00 USD / Year
Company: Plaid
Expiration Date: Until further notice
Requirements
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer
  • medical, dental, vision, and 401(k)
Employment type: Full-time

Senior Software Engineer, ML Infrastructure

LMArena is seeking a Senior Software Engineer (Infrastructure) to lead the desig...
Location: United States, Bay Area
Salary: Not provided
Company: Arena Intelligence, Inc.
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in software engineering, with a focus on infrastructure or large-scale data and ML systems
  • Deep expertise in distributed systems, stream processing, and scalable backend architecture
  • Proven ability to design and operate low-latency, high-throughput, and fault-tolerant systems
  • Strong foundation in systems design, performance tuning, and building reliable, fault-tolerant services
  • Comfortable in a dynamic, high-ownership, fast-growth environment
Job Responsibility
  • Architect and scale high-performance, real-time API and data systems
  • Design and implement low-latency pipelines to process and analyze large-scale event streams
  • Ensure reliability through robust data integrity, availability, and consistency mechanisms
  • Mentor and guide engineers on infrastructure best practices, architecture, and performance tuning
  • Collaborate cross-functionally with AI researchers, product leaders, and engineers to anticipate evolving infrastructure needs and deliver resilient, extensible systems
What we offer
  • Comprehensive health and wellness benefits, including medical, dental, vision, and additional support programs.
  • The opportunity to work on cutting-edge AI with a small, mission-driven team
  • A culture that values transparency, trust, and community impact
Employment type: Full-time

Senior Software Engineer and Principal Software Engineer - Power Point AI Team

The PowerPoint team is embarking on an exciting new chapter - evolving a product...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • 8+ years of experience in backend service engineering, including work on high-scale infrastructures
  • Proficiency in one or more systems programming languages such as C#, C++
  • 1+ years of experience in software engineering, designing and developing systems (and APIs) that deploy and integrate with AI models
  • 2+ years of experience working with rich telemetry, making data driven decisions, and carrying out rapid experimentation
  • 2+ years of experience building software for scale, performance, and reliability
  • Academic or industry experience with building, finetuning, deploying or building eval-driven systems utilizing the models (any category)
Job Responsibility
  • Lead design and delivery of complex, scalable AI features ensuring resilience and exceptional user experience
  • Drive technical strategy and architecture decisions across multiple services, influencing partner teams and aligning with compliance and security requirements
  • Champion modern engineering practices, including AI-driven approaches, automation, and cloud-native patterns, across the full development lifecycle
  • Mentor and guide engineers, fostering technical excellence and continuous improvement in security, reliability, and performance
  • Collaborate cross-org to solve challenging technical problems, streamline processes, and reduce operational costs while improving live-site health
  • Design and implement scalable backend services optimized for machine learning workflows and large language model integration
  • Develop and maintain evaluation-driven systems that leverage text and multimodal inputs (e.g., images) to power visual-creation experiences
  • Build and optimize APIs and infrastructure to support high-performance model inference and experimentation at scale
  • Collaborate with product, ML, and design teams to integrate models into user-facing features, ensuring seamless functionality and performance
  • Conduct model evaluations and experiments, analyze results, and iterate on improvements to enhance accuracy and user experience
Employment type: Full-time

Senior ML Infrastructure Engineer, Inference Platform

About the Team: The ML Inference Platform is part of the AV ML Infrastructure or...
Location: United States (Austin, Texas; Mountain View, California; Sunnyvale, California)
Salary: 155420.00 USD / Year
Company: General Motors
Expiration Date: Until further notice
Requirements
  • 5+ years of industry experience, with focus on machine learning systems or high performance backend services
  • Expertise in either Python, C++ or other relevant coding languages
  • Expertise in ML inference and model serving frameworks (Triton, Ray Serve, vLLM, etc.)
  • Strong communication skills and a proven ability to drive cross-functional initiatives
  • Ability to thrive in a dynamic, multi-tasking environment with ever-evolving priorities
Job Responsibility
  • Design and implement core platform backend software components
  • Collaborate with ML engineers and researchers to understand critical workflows, parse them to platform requirements, and deliver incremental value
  • Lead technical decision-making on model serving strategies, orchestration, caching, model versioning, and auto-scaling mechanisms for highly optimized use of accelerators
  • Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization of inference services
  • Proactively research and integrate state-of-the-art model serving frameworks, hardware accelerators, and distributed computing techniques
  • Lead technical initiatives across GM’s ML ecosystem
  • Raise the engineering bar through technical leadership, establishing best practices
  • Contribute to open source projects and represent GM in relevant communities
What we offer
  • medical
  • dental
  • vision
  • Health Savings Account
  • Flexible Spending Accounts
  • retirement savings plan
  • sickness and accident benefits
  • life insurance
  • paid vacation & holidays
  • tuition assistance programs
Employment type: Full-time

Senior Software Engineer - ML Network Stack

We are seeking an experienced engineer to join our team that owns the network st...
Location: Israel, Tel Aviv
Salary: Not provided
Company: Amazon
Expiration Date: Until further notice
Requirements
  • 5+ years of non-internship professional software development experience
  • 5+ years of experience leading design or architecture (design patterns, reliability and scaling) of new and existing systems
  • 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • 3+ years as a mentor, tech lead or leading engineering teams
  • 3+ years of experience in SW/HW co-design
Job Responsibility
  • Be a senior engineer on a team that builds and maintains the infrastructure that monitors and reports on functionality and performance of massive testing workloads run at scale
  • Use internal Amazon CI/CD tools, Linux, and public AWS products to automate the delivery of our software to customers, saving developer time
  • Write Python code that effortlessly spools up large clusters and runs benchmarks and applications for ML and HPC workloads
  • Use AWS Managed Grafana and Athena to digest the massive amount of performance data generated by these workloads and create dashboards for developers and stakeholders
  • Invent automatic mechanisms to alert developers to functional and performance regressions so they never reach customers
  • Manage the complexity of infrastructure that covers many instance types, software stacks, Linux operating systems, cutting-edge releases and make it easy to evolve

Senior Software Engineer, ML Products

A Senior Software Engineer will work closely with Product Managers and Machine L...
Location: United States, Chicago
Salary: 171000.00 - 213000.00 USD / Year
Company: Arrive Logistics
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Engineering, or a related field or equivalent professional experience
  • 5+ years of experience with a backend language, object-oriented programming and building highly scalable backend services
  • 3+ years of experience with relational and NoSQL databases
  • 2+ years in a lead or senior-level capacity
  • 2+ years of experience designing maintainable and scalable systems
  • Proven expertise in system design with a focus on distributed systems and event-driven architectures
  • Experience developing cloud-native dockerized applications in Kubernetes
  • Experience working with online experimentation and platforms
  • Strong communication skills with the ability to articulate, diagram and document complex engineering concepts
  • Strong analytical, problem-solving, decision-making, and interpersonal skills
Job Responsibility
  • Design, build, and maintain scalable ML products and infrastructure using Python, Postgres, and Elasticsearch
  • Lead sprints, conduct rigorous code reviews, and set the 'gold standard' for engineering practices across the organization
  • Actively mentor junior and mid-level engineers, fostering a culture of technical excellence and professional growth
  • Partner closely with other engineers, product managers, data scientists, data engineers, and product engineers to ensure the successful delivery of strategic and roadmap initiatives
  • Independently and with relatively little oversight, own systems throughout the software development lifecycle, from design to development, deployment and monitoring
  • Maintain and improve performance of existing systems and processes while balancing maintainability, observability and readability
  • Demonstrate a deep sense of ownership by developing a thorough understanding of a domain
  • Proactively propose solutions to gaps or risks in process, technology, software design and architecture
  • Provide rigorous and detailed code reviews that uphold team standards, testing and software design best practices
  • Foster a culture of constant improvement and growth, engineering excellence, humility, positivity and curiosity
What we offer
  • Medical, dental, vision, life, and disability coverage
  • Matching 401(k) program
  • Employee Resource Groups
  • Office wide engagement activities, team events, happy hours
  • Casual dress code
  • Work in downtown Chicago, IL
  • Bike storage inside building
  • LifeStart gym with Peloton bikes and personal training
  • Free counseling sessions through Employee Assistance Program
  • Referral Program
Employment type: Full-time

Senior Software Engineer, ML Products

A Senior Software Engineer will work closely with Product Managers and Machine L...
Location: Mexico, Guadalajara
Salary: Not provided
Company: Arrive Logistics
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Engineering, or a related field or equivalent professional experience
  • 5+ years of experience with a backend language, object-oriented programming and building highly scalable backend services
  • 3+ years of experience with relational and NoSQL databases
  • 2+ years in a lead or senior-level capacity
  • 2+ years of experience designing maintainable and scalable systems
  • Proven expertise in system design with a focus on distributed systems and event-driven architectures
  • Experience developing cloud-native dockerized applications in Kubernetes
  • Experience working with online experimentation and platforms
  • Strong communication skills with the ability to articulate, diagram and document complex engineering concepts
  • Strong analytical, problem-solving, decision-making, and interpersonal skills
Job Responsibility
  • Design, build, and maintain scalable ML products and infrastructure using Python, Postgres, and Elasticsearch
  • Lead sprints, conduct rigorous code reviews, and set the "gold standard" for engineering practices across the organization
  • Actively mentor junior and mid-level engineers, fostering a culture of technical excellence and professional growth
  • Partner closely with other engineers, product managers, data scientists, data engineers, and product engineers to ensure the successful delivery of strategic and roadmap initiatives
  • Independently and with relatively little oversight, own systems throughout the software development lifecycle, from design to development, deployment and monitoring
  • Maintain and improve performance of existing systems and processes while balancing maintainability, observability and readability
  • Demonstrate a deep sense of ownership by developing a thorough understanding of a domain
  • Proactively propose solutions to gaps or risks in process, technology, software design and architecture
  • Provide rigorous and detailed code reviews that uphold team standards, testing and software design best practices
  • Foster a culture of constant improvement and growth, engineering excellence, humility, positivity and curiosity
What we offer
  • Monthly grocery vouchers
  • Vacation days
  • Savings fund
  • Medical insurance (including dental and vision plans)
  • Casual dress code
  • Office wide engagement activities, team events, happy hours
  • Work in our new Guadalajara office located in Torre 1500 (Av. Americas 1254)
  • Free coffee
  • Free counseling sessions through our Employee Assistance Program
  • Referral Program
Employment type: Full-time