Applied Research Engineer, Agents

Hebbia

Location:
United States, New York City

Contract Type:
Not provided

Salary:

160000.00 - 300000.00 USD / Year

Job Description:

As an Applied Research Engineer, you will be the bridge between research, industry, and application, shaping the future of our core natural language processing systems. You will be responsible for enabling agentic capabilities across the Hebbia product suite. You will own experiments and POCs that apply the latest research findings to specific high-value problems our customers encounter every day. You will leverage our deep relationships with foundation model providers, partnering to beta test models, experiment with new features, and develop guidance on relative model strengths.

Job Responsibility:

  • Focused on LLMs, you will play a crucial role in analyzing and interpreting complex data types to derive and implement cutting-edge insight-generation systems
  • Iterate on and explore new LLM and NLP techniques, maintaining our foothold as an industry leader
  • You will utilize your expertise in statistics, programming, and machine learning to develop and deploy data-driven models and algorithms
  • Your work will contribute to solving business problems, improving processes, and enhancing the overall performance of the company
  • Collaborate with cross-functional teams to improve NLP/LLM capabilities in the app
  • Stay up-to-date with the latest advancements and research in the space
  • Collaborate with software engineers to integrate agentic capabilities into existing systems or develop new applications
  • Ensure that systems are efficient, maintainable and well monitored
  • Iterate on validation and testing frameworks

Requirements:

  • Bachelor's degree in Computer Science, Engineering, or related field
  • 7+ years of software development experience at a venture-backed startup or top technology firm, with a focus on applied machine learning systems
  • Strong programming skills in Python
  • Experience with NLP and text processing libraries such as NLTK, SpaCy, or Apache Tika
  • Experience with search and indexing technologies
  • Proficient in machine learning techniques and algorithms
  • Experience working with foundation models and corresponding APIs
  • Knowledge of statistical analysis and data scraping techniques
  • Prior experience in developing NLP models and systems
  • Excellent problem-solving and analytical skills
  • Strong communication and teamwork abilities
  • Strong capability to translate research into production software systems

Nice to have:

  • Master’s degree in Computer Science, Mathematics, Machine Learning, or a related field
  • Experience with prompting and building LLM applications and agents
  • Experience building agentic systems or LLM-enabled products
  • Frequent user of AI products, especially during the development lifecycle (e.g., Cursor, Claude Code)
  • Experience building with foundation models and working with attention-based NLP models

What we offer:

  • PTO: Unlimited
  • Insurance: Medical + Dental + Vision + 401K
  • Eats: Catered lunch daily + DoorDash dinner credit if you ever need to stay late
  • Parental leave policy: 3 months for non-birthing parent, 4 months for birthing parent
  • Fertility benefits: $15k lifetime benefit
  • New hire equity grant: competitive equity package with unmatched upside potential

Additional Information:

Job Posted:
December 09, 2025

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Applied Research Engineer, Agents

AI Research Engineer

We're seeking a Research Engineer to conduct innovative research in key AI areas...
Location:
United Kingdom
Salary:
Not provided
Prolific
Expiration Date:
Until further notice
Requirements:
  • 5+ years of engineering experience with significant AI/ML focus
  • Demonstrated research experience through publications, open-source contributions, or impactful projects
  • Strong engineering fundamentals and experience implementing AI systems in production environments
  • Deep knowledge of LLM evaluation methodologies, alignment techniques, and model optimization approaches
  • Experience with model fine-tuning, adapters, quantization, and distillation frameworks
  • Self-motivation and ability to define and pursue research directions independently
  • Excellent understanding of current challenges in AI safety, reliability, and alignment
  • Strong communication skills and ability to explain complex research concepts clearly
  • Passion for staying current with the rapidly evolving AI research landscape
Job Responsibility:
  • Lead independent research projects in AI evaluation methodologies, alignment techniques, and synthetic data generation
  • Design and implement novel evaluation frameworks for LLMs and agent systems that are grounded in human data
  • Contribute to the academic AI community through publications and open-source contributions
  • Stay at the forefront of AI research and pioneer innovative approaches to tackle pressing open challenges in the field
  • Design and conduct rigorous experiments to study AI models and systems with sound methodological approaches
  • Develop scalable frameworks for systematic evaluation of model behaviours and capabilities
  • Create tools and frameworks that transform research insights into practical applications
  • Build infrastructure to support large-scale research experiments when needed
  • Apply knowledge of model fine-tuning, optimization techniques, distillation, and other ML engineering practices to support research goals
  • Work closely with ML engineers, data scientists, and product teams to translate research insights into practical applications
What we offer:
  • competitive salary
  • benefits
  • remote working
  • impactful, mission-driven culture

Senior Machine Learning Engineer, Agentic

Join us in building the future of finance. Our mission is to democratize finance...
Location:
United States, Bellevue; Menlo Park; New York; Washington; Denver; Westlake; Chicago; Lake Mary; Clearwater; Gainesville
Salary:
146000.00 - 220000.00 USD / Year
Robinhood
Expiration Date:
Until further notice
Requirements:
  • Strong technical expertise in software development, with understanding of agentic workflows—including reasoning loops, tool invocation, memory, and orchestration of autonomous AI agents
  • Hands-on experience using Large Language Models, including prompt engineering, fine-tuning, model distillation, and deploying optimized models (e.g. via DPO, PPO) into production environments
  • Proven ability to build and scale ML/AI systems, from experimentation to deployment—owning dataset generation, evaluation pipelines, A/B testing, and performance monitoring
  • Leadership and mentorship capabilities, with a track record of guiding complex technical projects and supporting the growth of teammates through code/design reviews and technical direction
  • Excellent communication and collaboration skills, with the ability to translate technical ideas into actionable plans and work effectively with cross-functional partners, including product and infrastructure teams
  • Innovation mindset and commitment to continuous learning and a bias toward action, staying at the forefront of ML/AI trends, agentic systems research, and best practices in tooling, safety, and evaluation
Job Responsibility:
  • Design and create tools and workflows for agent development that support rapid prototyping—define agents, compose toolchains, and construct reasoning loops with minimal overhead
  • Build platform solutions to support scalable experimentation, synthetic dataset generation, and multi-agent evaluation across diverse tasks and domains
  • Develop feedback and optimization pipelines that incorporate both automated metrics and human-in-the-loop evaluation signals to fine-tune agent behavior
  • Implement and scale optimization techniques such as Direct Preference Optimization (DPO), Proximal Policy Optimization (PPO), and reward modeling to improve agent performance
  • Launch and support fine-tuned models in production environments with robust evaluation, rollback strategies, and performance monitoring
  • Collaborate closely with applied AI/ML teams to translate state-of-the-art research in agentic reasoning, planning, and tool use into reliable, production-ready systems
What we offer:
  • Market competitive and pay equity-focused compensation structure
  • 100% paid health insurance for employees with 90% coverage for dependents
  • Annual lifestyle wallet for personal wellness, learning and development, and more
  • Lifetime maximum benefit for family forming and fertility benefits
  • Dedicated mental health support for employees and eligible dependents
  • Generous time away including company holidays, paid time off, sick time, parental leave, and more
  • Lively office environment with catered meals, fully stocked kitchens, and geo-specific commuter benefits
  • Bonus opportunities
  • Equity
Employment Type:
Full-time

Machine Learning Research Engineer

You will be part of Kiddom’s Data Science team, building the foundation of our s...
Location:
United States, San Francisco; New York
Salary:
175000.00 - 250000.00 USD / Year
Kiddom
Expiration Date:
Until further notice
Requirements:
  • Have 5+ years of industry experience applying machine learning to solve real-world problems with large, complex datasets, with 1–2 years in a technical leadership role
  • Proven track record designing, evaluating, and deploying ML/AI systems in production environments that drive measurable business impact, ideally in recommendation, personalization, search, or workflow optimization
  • Strong programming skills in Python and fluency in data manipulation (SQL, Pandas) and common ML toolkits (scikit-learn, XGBoost, TensorFlow/PyTorch)
  • Strong analytical skills and ability to break down complex problems into measurable hypotheses and experiments
  • Excellent communication skills with a history of cross-functional collaboration with product, design, and engineering stakeholders
Job Responsibility:
  • Architect and scale machine learning systems for search, personalization, and recommendations that power Kiddom’s teacher helper and insight engine
  • Develop evaluation-first development workflows to measure how models improve teaching efficiency, lesson planning, and student learning outcomes
  • Fine-tune machine learning models with feedback signals from teachers and students to align outputs with instructional goals and classroom needs
  • Design intelligent discovery pipelines that combine semantic retrieval, curriculum alignment, and real-time personalization
  • Build agentic assistants that help teachers plan lessons, adapt instruction, and reduce repetitive tasks
  • Collaborate closely with product managers, designers, and curriculum experts to translate high-level educational goals into scalable ML-powered systems
  • Coach and mentor junior ML engineers and data scientists, fostering technical and professional growth
What we offer:
  • Competitive salary
  • Meaningful equity
  • Health insurance benefits: medical (various PPO/HMO/HSA plans), dental, vision, disability and life insurance
  • One Medical membership (in participating locations)
  • Flexible vacation time policy (subject to internal approval). Average use 4 weeks off per year
  • 10 paid sick days per year (pro rated depending on start date)
  • Paid holidays
  • Paid bereavement leave
  • Paid family leave after birth/adoption. Minimum of 16 paid weeks for birthing parents, 10 weeks for caretaker parents. Meant to supplement benefits offered by State
  • Commuter and FSA plans
Employment Type:
Full-time

Research Engineer, GenAI

You will be part of Kiddom’s Data Science team, building the foundation of our s...
Location:
United States, San Francisco; New York
Salary:
175000.00 - 250000.00 USD / Year
Kiddom
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience applying machine learning to solve real-world problems with large, complex datasets
  • 1–2 years in a technical leadership role
  • Proven track record designing, evaluating, and deploying ML/AI systems in production environments that drive measurable business impact, ideally in recommendation, personalization, search, or workflow optimization
  • Strong programming skills in Python
  • Fluency in data manipulation (SQL, Pandas) and common ML toolkits (scikit-learn, XGBoost, TensorFlow/PyTorch)
  • Strong analytical skills and ability to break down complex problems into measurable hypotheses and experiments
  • Excellent communication skills with a history of cross-functional collaboration with product, design, and engineering stakeholders
Job Responsibility:
  • Architect and scale machine learning systems for search, personalization, and recommendations that power Kiddom’s teacher helper and insight engine
  • Develop evaluation-first development workflows to measure how models improve teaching efficiency, lesson planning, and student learning outcomes
  • Fine-tune machine learning models with feedback signals from teachers and students to align outputs with instructional goals and classroom needs
  • Design intelligent discovery pipelines that combine semantic retrieval, curriculum alignment, and real-time personalization
  • Build agentic assistants that help teachers plan lessons, adapt instruction, and reduce repetitive tasks
  • Collaborate closely with product managers, designers, and curriculum experts to translate high-level educational goals into scalable ML-powered systems
  • Coach and mentor junior ML engineers and data scientists, fostering technical and professional growth
What we offer:
  • Meaningful equity
  • Health insurance benefits: medical (various PPO/HMO/HSA plans), dental, vision, disability and life insurance
  • One Medical membership (in participating locations)
  • Flexible vacation time policy (subject to internal approval). Average use 4 weeks off per year
  • 10 paid sick days per year (pro rated depending on start date)
  • Paid holidays
  • Paid bereavement leave
  • Paid family leave after birth/adoption. Minimum of 16 paid weeks for birthing parents, 10 weeks for caretaker parents. Meant to supplement benefits offered by State
  • Commuter and FSA plans
Employment Type:
Full-time

Distinguished Applied Researcher

At Capital One, we are creating trustworthy and reliable AI systems, changing ba...
Location:
United States, McLean; San Francisco; New York; Cambridge; San Jose
Salary:
278400.00 - 381300.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements:
  • PhD in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 4 years of experience in Applied Research or M.S. in Electrical Engineering, Computer Engineering, Computer Science, AI, Mathematics, or related fields plus 6 years of experience in Applied Research
  • PhD in Computer Science, Machine Learning, Computer Engineering, Applied Mathematics, Electrical Engineering or related fields
  • LLM
  • PhD focus on NLP or Masters with 10 years of industrial NLP research experience
  • Core contributor to team that has trained a large language model from scratch (10B + parameters, 500B+ tokens) or through continued pre-training, post training pipeline for alignment and reasoning, LLM optimizations, complex reasoning with multi-agentic LLMs
  • Numerous publications at ACL, NAACL, EMNLP, NeurIPS, ICML, or ICLR on topics related to the pre-training of large language models (e.g. technical reports of pre-trained LLMs, SSL techniques, model pre-training optimization)
  • Has worked on an LLM (open source or commercial) that is currently available for use
  • Demonstrated ability to guide the technical direction of a large-scale model training team
  • Experience with common training optimization frameworks (DeepSpeed, NeMo)
  • Experience contributing to the team that has trained a large language model from scratch (10B + parameters, 500B+ tokens) or through continued pre-training, post training pipeline for alignment and reasoning, LLM optimizations, complex reasoning with multi-agentic LLMs
Job Responsibility:
  • Partner with a cross-functional team of data scientists, software engineers, machine learning engineers and product managers to deliver AI-powered products that change how customers interact with their money
  • Leverage a broad stack of technologies (PyTorch, AWS Ultraclusters, Hugging Face, Lightning, VectorDBs, and more) to reveal the insights hidden within huge volumes of numeric and textual data
  • Build AI foundation models through all phases of development, from design through training, evaluation, validation, and implementation
  • Engage in high impact applied research to take the latest AI developments and push them into the next generation of customer experiences
  • Flex your interpersonal skills to translate the complexity of your work into tangible business goals
  • Partner with a cross-functional team of scientists, machine learning engineers, software engineers, and product managers to deliver AI-powered platforms and solutions that change how customers interact with their money
What we offer:
  • comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
Employment Type:
Full-time

Research Engineer – Agentic Platforms

The Office of the Chief Technology Officer (OCTO) is building the next generatio...
Location:
Mexico
Salary:
Not provided
Teradata
Expiration Date:
Until further notice
Requirements:
  • Deep hands-on expertise with large language model APIs and inference frameworks (e.g., OpenAI, Anthropic, Mistral, vLLM, Ollama, HuggingFace Transformers)
  • Strong practical experience designing and building multi-agent systems using agentic SDKs and orchestration frameworks such as LangChain, LlamaIndex, AutoGen, CrewAI, or equivalent
  • Proficiency in Python and experience building production-grade AI/ML services with clean, well-documented, testable code
  • Solid understanding of RAG architectures, vector databases (e.g., Pinecone, Weaviate, pgvector), and knowledge retrieval patterns
  • Experience with prompt engineering, LLM evaluation methodologies, and strategies for improving agent reliability and reducing hallucination
  • Familiarity with LLM inference optimization techniques including quantization, batching, caching, and model serving infrastructure
  • 5+ years of software engineering experience, with at least 2 years focused on LLM-based systems, agentic workflows, or applied AI research
  • Bachelor’s degree in Computer Science, Artificial Intelligence, or a related field, or equivalent demonstrated expertise
  • Proven track record designing and shipping multi-agent or LLM-powered systems into production environments
  • Experience collaborating across research and engineering teams to move from prototype to scalable, maintainable platform capability
Job Responsibility:
  • Design and implement multi-agent systems that coordinate specialized LLM-powered agents to solve complex, multi-step analytical and operational tasks
  • Design agent orchestration patterns including task decomposition, inter-agent communication, tool use, memory management, and feedback loops
  • Build scalable agentic pipelines that integrate with Teradata’s data platform, enabling agents to query, analyze, and act on enterprise data autonomously
  • Design and manage LLM inference pipelines optimized for latency, throughput, and cost across cloud and on-premises deployments
  • Evaluate, benchmark, and select appropriate foundation models (open-source and proprietary) for specific agentic tasks within the Teradata ecosystem
  • Implement advanced prompting strategies including chain-of-thought, retrieval-augmented generation (RAG), and tool-augmented reasoning to maximize agent reliability and accuracy
  • Build and extend internal agentic SDKs and frameworks that enable research and product teams to rapidly develop and deploy agent-based applications
  • Integrate with leading agentic platforms and toolkits (e.g., LangChain, LlamaIndex, AutoGen, CrewAI, Anthropic Claude SDKs) and adapt them for enterprise-grade reliability
  • Develop reusable agent components, tool connectors, and evaluation harnesses that accelerate the path from prototype to production
  • Collaborate with research teams to translate cutting-edge LLM and agent research into production-ready platform capabilities
What we offer:
  • People-first culture
  • Flexible work model
  • Focus on well-being
  • Inclusive environment
Employment Type:
Full-time

Senior Research Engineer

As a Senior Research Engineer at Microsoft, you will advance Microsoft’s mission...
Location:
United States, Redmond
Salary:
119800.00 - 234700.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, Mathematics, Statistics, Physics, or a related field and 4 or more years in applied ML or AI research and product engineering
  • OR Master’s degree and 3 or more years in applied ML or AI research and product engineering
  • OR PhD in a relevant field and 2 or more years with generative AI, LLMs, or related ML algorithms
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Bringing State-of-the-Art Research to Products
  • Design and implement AI systems using foundation models, prompt engineering, retrieval-augmented generation, multi-agent architectures, and classic ML
  • Fine-tune large language models on domain-specific data and evaluate via offline and online methods such as A/B testing, telemetry, and shadow deployments
  • Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
  • Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities
  • Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces
  • End-to-End System Development
  • ML Design & Architecture: Own the end-to-end pipeline, including data prep, training, evaluation, deployment, and feedback loops
Employment Type:
Full-time

Agents Research Lead

At General Intelligence Company, we’re building highly-capable autonomous agents...
Location:
United States, New York
Salary:
400000.00 - 600000.00 USD / Year
The General Intelligence Company of New York
Expiration Date:
Until further notice
Requirements:
  • 5–8+ years in applied ML/AI, ML systems, or research engineering; high-ownership startup experience preferred
  • Deep experience with LLMs and agentic systems: prompting, function/tool calling, planning, retrieval/memory, evals
  • Track record moving paper → prototype → production (Python; comfort with distributed systems a plus)
  • Published papers in AI/ML that are directly applicable to modern agentic systems
  • Built reliable evaluation harnesses and curated datasets for multi-step tasks
  • Shipped improvements that moved core business metrics (success rate, latency, unit economics)
Job Responsibility:
  • Design and run experiments across model selection, prompting, tool-use, memory, planning, and multi-agent coordination
  • Lead evaluations: build datasets, success criteria, and continuous benchmarks for real workflows
  • Push performance: reduce tail latency and failure modes; improve determinism, throughput, and cost per successful run
  • Partner with eng/product to ship weekly
  • Mentor a small, senior team; set standards for experiment design, code quality, and documentation
  • Push state-of-the-art results on custom agent systems
  • Pursue advanced memory research for multi-agent systems
Employment Type:
Full-time