Senior Staff Machine Learning Engineer (AI Agent) Job at Cresta

Senior Staff Machine Learning Engineer

Help design our AI platform and develop our next generation of machine learning ...

Location

United States, San Francisco

Salary:

216500.00 - 324500.00 USD / Year

GoFundMe

Expiration Date

Until further notice

Requirements

9+ years of hands-on experience in machine learning engineering, AI development, software engineering, or related fields
Experience emphasizing secure, large-scale, distributed system design, AI/ML pipeline development, and implementation
Extensive experience designing, developing, and operating scalable backend systems
Experience applying software engineering best practices such as domain-driven design, event-driven architectures, and microservices
Deep expertise in agentic workflows, AI evaluation solutions, prompt management, and secure AI development and testing practices
Strong knowledge of relational and document-based databases, data storage paradigms, and efficient RESTful API design
Experience establishing robust CI/CD pipelines, automated testing (unit and integration), and deployment practices
Strong leadership skills, including effective planning and management of complex projects, mentoring of team members, and fostering a collaborative, high-performing engineering culture
Excellent communicator, able to articulate complex technical concepts clearly to both technical and non-technical stakeholders
Bachelor's degree in Computer Science, Software Engineering, or a related technical field (preferred)

Job Responsibility

Design and implement AI platforms to enable scalable and secure access to LLMs from multiple model providers for diverse use cases
Design and implement agentic workflows, agentic tool ecosystems, and LLM prompt management solutions
Design, build, and optimize scalable model training, fine tuning, and inference pipelines, ensuring robust integration with production systems
Influence technical strategy and approach to developing embedding stores, vector databases, and other reusable assets
Lead initiatives to streamline ML and AI workflows, improve operational efficiency, and establish standardized procedures to achieve consistent, high-quality results across our AI systems
Design and develop backend services and RESTful APIs using Python and FastAPI, integrating seamlessly with ML pipelines and services
Take operational responsibility for team-owned services, including performance monitoring, optimization, troubleshooting, and participation in an on-call rotation
Collaborate with both technical and non-technical colleagues, including data and applied scientists, software engineers, product managers, and business stakeholders, to deliver reliable and scalable ML-driven products
Coach and mentor fellow ML engineers, promoting a culture of collaboration, continuous improvement, and engineering excellence within the team
Employ a diverse set of tools and platforms including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, Coralogix, and GitHub to build, deploy, and maintain scalable, highly available machine learning infrastructure

What we offer

Competitive pay
Comprehensive healthcare benefits
Financial assistance for things like hybrid work, family planning
Generous parental leave
Flexible time-off policies
Mental health and wellness resources
Learning, development, and recognition programs

Fulltime

Senior Staff Engineer, Applied AI

GEICO is seeking a Senior Staff Engineer, Applied AI to provide technical archit...

Location

United States, Chevy Chase, MD; Palo Alto, CA

Salary:

130000.00 - 260000.00 USD / Year

Geico

Expiration Date

Until further notice

Requirements

8 or more years of professional software engineering or applied machine learning experience
2 or more years working with Generative AI or LLM-based systems in production
Proven track record of architecting and delivering complex AI/ML capabilities that span multiple teams and have measurable business impact
Deep hands-on expertise with Python and modern AI frameworks including LangChain, LangGraph, LangSmith, LlamaIndex, Hugging Face, OpenAI/Anthropic APIs, and emerging agentic frameworks
Demonstrated experience building and deploying production RAG (Retrieval-Augmented Generation) systems including document ingestion, chunking strategies, vector search, and context retrieval
Demonstrated experience designing and operating production AI systems including multi-agent architectures, intelligent automation, and workflow orchestration
Strong understanding of agent architectures, workflow orchestration, retrieval-augmented generation (RAG), vector databases, knowledge graphs, and semantic reasoning
Familiarity with Agent-to-Agent (A2A) communication protocols and Model Context Protocol (MCP) for building interoperable AI systems
Experience ensuring platform scalability, cross-domain coherence, and alignment with AI platform capabilities and strategy
Strong expertise in distributed systems, microservices architecture, service design, performance optimization, and reliability engineering

Job Responsibility

Specify architectures and system decompositions for AI/ML capabilities that involve significant integrations and cross-team collaboration across multiple product areas
Provide technical architecture and leadership for medium to large, complex, cross-functional AI initiatives with visibility at the tech VP level
Architect and lead implementation of advanced Generative AI solutions including agent-based systems, intelligent automation, document intelligence, and decision support systems that span multiple business domains
Design and implement sophisticated agentic workflows that orchestrate multiple AI agents, tools, APIs, reasoning steps, and business logic to automate complex enterprise processes at scale
Question status quo with an eye for simpler designs and more secure approaches, influencing tech VPs to set direction for multiple teams
Build systems and platforms that meet the highest standards for scalability, resilience, performance, availability, security, and compliance
Identify and scope opportunities for automating business processes using AI across multiple product areas and business domains
Advance the state-of-the-art in applied AI by integrating knowledge graphs, vector reasoning, retrieval architectures, and multi-agent systems to solve complex business problems
Drive innovation by exploring new models, frameworks, reasoning techniques, and AI architectures and applying them strategically to high-impact business challenges
Run rigorous experimentation programs including hypothesis definition, A/B testing, measurement frameworks, and iterative improvement across production AI systems

What we offer

Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
Financial benefits including market-competitive compensation, a 401K savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
Access to additional benefits like mental healthcare as well as fertility and adoption assistance
Flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year

Fulltime

Senior Staff Machine Learning Engineer – Agent Engineering

GEICO is seeking an experienced Sr Staff Machine Learning Engineer – Agent Engin...

Location

United States, New York City; Palo Alto; Chevy Chase

Salary:

130000.00 - 300000.00 USD / Year

Geico

Expiration Date

Until further notice

Requirements

10+ years of professional software development experience with at least two general-purpose programming languages such as Java, C++, Python, TypeScript, etc.
7+ years of experience architecting, building, and deploying end-to-end AI solutions utilizing open-source/cloud-agnostic components such as search engines (e.g., Elasticsearch, Qdrant), data warehouses (e.g., Snowflake), streaming platforms (e.g., Kafka), relational databases (e.g., PostgreSQL), NoSQL stores (e.g., Cassandra), distributed processing (e.g., Spark, Ray), and workflow orchestration (e.g., Airflow, Temporal)
5+ years of experience managing the end-to-end solution development life cycle, especially measurement and monitoring of operations metrics, analytical insights, and business outcomes via dashboards and other tools
Bachelor’s degree or above in Computer Science, Engineering, Statistics or a related field

Job Responsibility

Own design, development and maintenance of high-performance AI solutions that utilize agentic workflows to deliver concrete business value for internal stakeholders
Collaborate with cross-functional teams, including data scientists, ML engineers, software engineers, product managers, and designers, to gather requirements, define project scope, and prioritize feature backlogs
Contribute to the selection, evaluation, and implementation of software technologies, tools, and frameworks
Take ownership in project planning and stakeholder management
Mentor and guide junior engineers via code reviews and design sessions

What we offer

Comprehensive Total Rewards program
401K savings plan with 6% match
performance and recognition-based incentives
tuition assistance
mental healthcare
fertility and adoption assistance
workplace flexibility
GEICO Flex program (work from anywhere in the US for up to four weeks per year)

Fulltime

Sr. Distinguished AI Engineer (Agentic AI Platform)

At Capital One, we are creating responsible and reliable AI systems, changing ba...

Location

United States, San Jose, California; San Francisco, California

Salary:

343400.00 - 392000.00 USD / Year

Capital One

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, Engineering, or AI plus at least 10 years of experience developing AI and ML algorithms or technologies, or Master's degree plus at least 8 years of experience developing AI and ML algorithms or technologies
At least 10 years of experience programming with Python, Go, Scala, or Java
9 years of experience deploying scalable and responsible AI solutions on cloud platforms
2+ years of experience supporting Agentic Frameworks
2+ years of experience with LLMOps
8+ years of experience designing mission-critical machine learning platforms
2+ years of experience architecting, designing, developing, integrating, delivering, and supporting complex AI systems
Demonstrated ability to lead and mentor multiple engineering teams and influence cross-functional stakeholders up to the VP level
Experience developing AI and ML algorithms or technologies using Python, C++, C#, Java, or Golang
Master's degree in Computer Science, Computer Engineering, or relevant technical field

Job Responsibility

Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products
Contribute to the north star platform architecture, continuously publishing and refining living diagrams and canonical APIs
Standardize and automate agentic workflows
Contribute to crafting an end to end GenAI SDK, CLI and starter kits
Help bring together a vision of central guardrail services
Collaborate with cross organization architects to drive end to end performance
Accelerate innovation by incubating proof of concepts and driving RFCs
Own central Helm charts, operators and CRDs that auto scale agents to hit tenant SLAs
Coach and evangelize: host architecture office hours, mentor Staff, Principal, and Senior engineers, author technical design documents and blogs, and represent Capital One at Tier 1 AI conferences

What we offer

Performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
Comprehensive, competitive, and inclusive set of health, financial, and other benefits

Fulltime

Executive Director, Agentic AI

The Executive Director, Agentic AI will define and lead the enterprise strategy,...

Location

United States, Sacramento

Salary:

175100.00 - 334750.00 USD / Year

CVS Health

Expiration Date

May 30, 2026

Requirements

12+ years in software engineering, platforms, or AI/ML, with 5+ years in senior leadership roles
Hands-on experience delivering AI systems at enterprise scale (not just experimentation)
Deep understanding of LLMs, SLMs, RAG, embeddings, and vector databases; agent frameworks and orchestration patterns; and distributed systems, APIs, and event-driven architectures
Proven ability to operate in regulated, high-availability environments
Strong executive communication and stakeholder-management skills

Job Responsibility

Define the enterprise Agentic AI vision and roadmap, aligned to business outcomes (cost reduction, revenue growth, productivity, experience uplift)
Establish clear differentiation between LLM tools, copilots, workflows, and autonomous/multi-agent systems
Identify and prioritize high-value agentic use cases (e.g., customer support resolution, claims/prior auth automation, contract leakage reduction, operational orchestration, developer productivity)
Own the design and evolution of the Agentic AI Platform, including multi-agent frameworks (planner, executor, verifier, critic, retriever agents); tool/function calling and API orchestration; RAG, memory, state management, and context persistence; and human-in-the-loop / human-on-the-loop controls
Define standards for agent lifecycle management (design, testing, deployment, observability, rollback)
Partner with Digital Platform and Integration teams to ensure agents are API-first, event-driven, and scalable
Lead delivery of production-grade agentic solutions, not POCs

What we offer

Affordable medical plan options
401(k) plan (including matching company contributions)
Employee stock purchase plan
No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
Paid time off
Flexible work schedules
Family leave
Dependent care resources
Colleague assistance programs
Tuition assistance

Fulltime

Senior Staff Software Engineer - AI

GEICO is seeking an experienced Engineer with a passion for building high-perfor...

Location

United States, Seattle, WA; Austin, TX; Palo Alto, CA; Chicago, IL; Dallas, TX

Salary:

110000.00 - 230000.00 USD / Year

Geico

Expiration Date

Until further notice

Requirements

Experience building and deploying ML systems in production with cross-functional engineering teams
Fluency in at least two modern languages such as Python, Go, Java, C++, or C# including object-oriented design
Experience architecting multi-component ML platforms using open-source/cloud-agnostic components, including datastores (PostgreSQL; NoSQL such as MongoDB, Cassandra, CosmosDB) and streaming (Kafka, Flink, or Spark Streaming)
Experience with end-to-end ML lifecycle: version control, CI/CD, Kubernetes, testing, monitoring, and production support
Experience with cloud providers (Azure, AWS or GCP) in production ML environments
Experience with observability tools and distributed systems monitoring, logging, tracing, and root cause analysis
Experience building multi-agent systems using LLMs and agentic frameworks (e.g., LangChain, LangGraph, AutoGen, Semantic Kernel, CrewAI)
Hands-on experience with RAG, semantic search, and vector databases (e.g., Milvus, pgvector, Qdrant, ElasticSearch)
Experience designing human-in-the-loop workflows and safety controls for autonomous systems
Strong architecture and design skills with ability to influence technical direction and roadmap

Job Responsibility

Design and build a multi-agent AI platform where specialized agents autonomously detect, diagnose, and resolve issues through agent-to-agent (A2A) collaboration
Develop intelligent agents using LLMs and agentic frameworks that coordinate detection, diagnostic, remediation, and knowledge tasks with minimal human intervention
Define agent interaction protocols, A2A communication standards, and evaluation frameworks for agent decision quality and autonomous action safety
Architect vector database solutions (Milvus, pgvector, Qdrant) for semantic search and RAG to enable context-aware agent decision-making
Build end-to-end ML pipelines for severity classification, anomaly detection, failure pattern recognition, and impact forecasting using observability data
Establish scalable orchestration infrastructure for multi-agent workflows with CI/CD, automated evaluation, canary releases, and rollback strategies
Implement monitoring for agent interactions, A2A communication patterns, decision quality, data drift, and system reliability
Lead technical architecture ensuring scalability, observability, and integration with existing alerting, logging, and monitoring systems
Define standards for agent safety, explainability, governance, and human-in-the-loop controls for high-impact automated actions
Partner with SRE, Product, and Engineering teams to translate reliability goals into measurable ML objectives and maintain pragmatic technical roadmaps

What we offer

Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
Financial benefits including market-competitive compensation, a 401K savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
Access to additional benefits like mental healthcare as well as fertility and adoption assistance
Flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year

Fulltime

Member of technical staff - Research - Agent

About H: H exists to push the boundaries of superintelligence with agentic AI. B...

Location

France; United Kingdom, Paris; London

Salary:

Not provided

H Company

Expiration Date

Until further notice

Requirements

Senior Experience: Previous demonstrable role(s) as a Staff, Principal, or Senior Engineer (or equivalent Research Scientist) in a Frontier AI Lab with a proven track record of leading complex, end-to-end AI/ML projects from conception to production
Education / Publication: Preferably PhD (or equivalent research experience) in Machine Learning, Computer Science, or a related field, preferably with a strong publication record (e.g., NeurIPS, ICML, ICLR) in Computer Science
Core Expertise: Deep theoretical and practical expertise in Agentic AI and proven experience building, scaling, and shipping solutions involving foundation models (LLMs/VLMs)
Soft Skills: Collaborative: Enjoys collaboration and thrives in a teamwork-oriented, fast-paced research environment
High-Impact Communicator: Possesses impactful communication skills, with the ability to bridge the gap between research and engineering and articulate complex ideas clearly
Mission-Driven: Genuinely eager to explore and solve the new engineering and research challenges at the frontier of agentic AI

Job Responsibility

Research & Leadership: Design and develop new agents, proposing new research directions, e.g., combining state-of-the-art RL with foundation models (LLMs/VLMs)
Algorithm & Systems Design: Design, implement, and scale complex, high-performance systems for training large-scale agents. This includes both the foundational infrastructure and the novel algorithms, reward models, and sophisticated training environments
Research-to-Production: Collaborate closely with researchers and engineers to implement, test, and productionize new agent logics, learning algorithms, and system architectures
Evaluation & Reliability: Create, manage, and scale massive benchmarks and evaluation systems to rigorously track agent capabilities. You will own system reliability, scalability, and observability for our entire research infrastructure
Mentorship & Standards: Mentor and guide other engineers and researchers on the team, fostering technical excellence. You will establish and enforce engineering standards, tooling, and best practices for both code and research design
Innovation: Conduct thorough code and design reviews, champion technical innovation, and proactively address technical debt to accelerate the R&D lifecycle

What we offer

Join the exciting journey of shaping the future of AI, and be part of the early days of one of the hottest AI startups
Collaborate with a fun, dynamic, and multicultural team, working alongside world-class AI talent in a highly collaborative environment
Enjoy a competitive salary
Unlock opportunities for professional growth, continuous learning, and career development

Fulltime

AI-First Core IT Software Engineering: Software, ML & Data

This is a Unified Application for our AI-First IT Transformation portfolio. We r...

Location

United States, Santa Clara

Salary:

Not provided

Palo Alto Networks

Expiration Date

Until further notice

Requirements

Experience in Software Engineering, Data Science, or Machine Learning: 3-5+ years (Staff level), 6-8+ years (Senior Staff), or 8-12+ years (Principal level)
Expert-level server-side development (Python, Java, Go) OR deep expertise in statistical modeling, ML algorithms, and LLM fine-tuning
Direct experience with RAG architectures, LLM APIs, and Vector Databases (e.g., Pinecone, Milvus)
Hands-on experience with Kubernetes, CI/CD, and distributed systems for large-scale AI deployment

Job Responsibility

Lead the hands-on development of core Enterprise IT Business software leveraging AI components and LLM infrastructure with both traditional and Generative AI model deployment
Build and industrialize agentic AI systems and multi-agent frameworks, ensuring secure and effective use of GenAI technologies at the platform level
Design and implement robust foundational data pipelines, perform advanced statistical analysis, and develop new ML models to drive autonomous system behavior
Design large-scale, distributed AI/ML systems optimized for low latency, high throughput, and developer-friendliness (Inference optimization)
Establish evaluation frameworks to measure AI quality (accuracy, hallucination rates) and overall system reliability across the Enterprise AI Factory

Fulltime

Senior Staff Machine Learning Engineer (AI Agent)

Cresta

Location:
United States; Canada

Category:
IT - Software Development

Contract Type:
Not provided

Job Posted:
December 07, 2025

