Staff Software Engineer - AI/ML Infra

GEICO

Location:
United States, Chevy Chase

Contract Type:
Not provided

Salary:

115000.00 USD / Year

Job Description:

The GEICO AI Platform and Infrastructure team is seeking an exceptional Senior ML Platform Engineer to build and scale our machine learning infrastructure, with a focus on Large Language Models (LLMs) and AI applications. This role combines deep technical expertise in cloud platforms, container orchestration, and ML operations with strong leadership and mentoring capabilities. You will be responsible for designing, implementing, and maintaining scalable, reliable systems that enable our data science and engineering teams to deploy and operate LLMs efficiently at scale. The candidate must have excellent verbal and written communication skills and a proven ability to work both independently and in a team environment.

Job Responsibility:

  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or for specialized use cases
  • Design and maintain robust CI/CD pipelines for ML model deployment using Azure DevOps, GitHub Actions, and MLOps tools
  • Implement automated model training, validation, deployment, and monitoring workflows
  • Set up comprehensive observability using Prometheus, Grafana, Azure Monitor, and custom dashboards
  • Continuously optimize platform performance, reducing latency and improving throughput for ML workloads
  • Design and implement backup, recovery, and business continuity plans for ML platforms
  • Mentor junior engineers and data scientists on platform best practices, infrastructure design, and ML operations
  • Lead comprehensive code reviews focusing on scalability, reliability, security, and maintainability
  • Design and deliver technical onboarding programs for new team members joining the ML platform team
  • Establish and champion engineering standards for ML infrastructure, deployment practices, and operational procedures
  • Create technical documentation, runbooks, and deliver internal training sessions on platform capabilities
  • Work closely with data scientists to understand requirements and optimize workflows for model development and deployment
  • Collaborate with product engineering teams to integrate ML capabilities into customer-facing applications
  • Support research teams with infrastructure for experimenting with cutting-edge LLM techniques and architectures
  • Present technical solutions and platform roadmaps to leadership and cross-functional stakeholders

Requirements:

  • Bachelor’s degree in Computer Science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes, including custom operators, Helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
  • Hands-on experience with inference optimization using vLLM, TensorRT-LLM, Triton Inference Server, or similar
  • Advanced experience with Azure DevOps, GitHub Actions, Jenkins, or similar CI/CD platforms
  • Proficiency with Terraform, ARM templates, Pulumi, or CloudFormation
  • Deep understanding of Docker, container optimization, and multi-stage builds
  • Experience with Prometheus, Grafana, ELK stack, Azure Monitor, and distributed tracing
  • Knowledge of both SQL and NoSQL databases, data warehousing, and vector databases
  • Demonstrated track record of mentoring engineers and leading technical initiatives
  • Experience leading design reviews with focus on compliance, performance, and reliability
  • Excellent ability to explain complex technical concepts to diverse audiences
  • Strong analytical and troubleshooting skills for complex distributed systems
  • Experience managing cross-functional technical projects and coordinating with multiple stakeholders

Nice to have:

  • Master’s degree in Computer Science, Machine Learning, or related field
  • 8+ years of platform engineering or infrastructure experience
  • Experience with Staff Engineer or Tech Lead roles in ML/AI organizations
  • Background in distributed systems and high-performance computing
  • Open-source contributions to ML infrastructure projects or LLM frameworks
  • Multi-Cloud Experience: Hands-on experience with Azure, AWS (SageMaker, EKS) and/or GCP (Vertex AI, GKE)
  • Experience with specialized hardware (A100s, H100s, TPUs, TEEs) and optimization
  • RLHF & Fine-tuning: Experience with Reinforcement Learning from Human Feedback and LLM fine-tuning workflows
  • Experience with Milvus, Pinecone, Weaviate, Qdrant, or similar vector storage solutions
  • Deep experience with MLflow, Kubeflow, DataRobot, or similar platforms
  • Understanding of AI safety principles, model governance, and regulatory compliance
  • Background in regulated industries with understanding of data privacy requirements
  • Experience supporting ML research teams and academic partnerships
  • Deep understanding of GPU optimization, memory management, and high-throughput systems

What we offer:

  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation, a 401(k) savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time

Work Type:
Hybrid work