Backend Software Engineer, Observability Product (Agent)

NetBox Labs

Location:
United States; United Kingdom

Contract Type:
Not provided

Salary:
165000.00 - 195000.00 USD / Year

Job Description:

NetBox Labs is seeking a Backend Software Engineer to join our rapidly expanding Observability Product team, which owns the full suite of NetBox Labs observability products - from Assurance and Discovery to upcoming Telemetry and Monitoring capabilities - helping customers gain real-time network visibility, automatically discover and monitor their infrastructure, and keep configuration drift in check. This role is focused on our open source Observability Agent (including pktvisor and discovery).

Job Responsibility:

  • Work with a full-stack team to build and maintain open source, source-available, and closed-source software across our observability project portfolio, shipping to the community and delivering into our commercial cloud and on-premises products
  • Integrate closely with NetBox’s data model to drive workflows for reconciling observed vs intended state and enriching telemetry and monitoring data
  • Define and maintain data schemas and APIs shared across products
  • Ensure observability systems meet scalability and reliability goals (SLAs/SLOs)
  • Implement testing, CI/CD automation, and code quality standards across observability services

Requirements:

  • Deep knowledge of the OSI model, networks, and protocols, especially DPI, SNMP, sFlow/NetFlow, and gNMI
  • Linux system and network programming experience (e.g. system calls, IPC, processes, threads, sockets)
  • Experience with C++ (and/or Rust), as well as Go and Python
  • Experience with eBPF is helpful
  • 5+ years of professional experience as a software engineer, and 2+ years in a startup environment
  • Experience in distributed systems and backend microservices development
  • Strong understanding of gRPC, protobuf, event-driven architecture, and streaming data systems
  • Experience with Redis streams, Kafka, MQTT, AMQP or other messaging systems
  • Familiarity with programmatic interaction with network infrastructure via APIs, SSH/CLI automation (e.g., Netmiko, NAPALM), or other network automation frameworks
  • Familiarity with observability concepts (metrics, logs, traces) and related protocols, especially OpenTelemetry
  • Strong communication skills, including the ability to write clear technical specifications with diagrams
  • Familiarity with data visualization and analytics frameworks such as Grafana

Nice to have:

  • Experience building multi-tenant SaaS systems with security and compliance awareness (e.g., SOC 2)
  • Familiarity with Mimir, Loki, ClickHouse, Elastic, or other analytical data stores
  • Familiarity with AI/ML approaches for anomaly detection or performance prediction
  • Experience working with or contributing to open source projects, especially in observability

What we offer:

  • Equity
  • Bonus

Additional Information:

Job Posted:
February 20, 2026

Employment Type:
Full-time

Work Type:
Remote work

Similar Jobs for Backend Software Engineer, Observability Product (Agent)

Backend Engineer, AI Content Agents

We’re building the future of AI-driven marketing, one content agent at a time. A...
Location
United States
Salary
180000.00 - 320000.00 USD / Year
Hightouch
Expiration Date
Until further notice
Requirements
  • Strong backend architecture skills
  • Curiosity
  • Product intuition
  • Ability to bring ideas from napkin sketch to production reality
  • Comfortable shipping a full stack feature and experimenting with an LLM workflow
  • Ability to build APIs, data models, and orchestration layers
  • Experience building and maintaining backend services, APIs, and data models
  • Experience developing new agent behaviors from prototypes to scalable production systems
  • Experience extending web layers with new features
  • Experience improving reliability, observability, and performance across a stack
Job Responsibility
  • Design and ship intelligent systems that understand a company’s data, generate on-brand content, and collaborate with humans in real time to deliver creative work
  • Build APIs, data models, and orchestration layers that make our agents reliable, debuggable, and fast
  • Partner closely with product and design to deliver experiences that make AI feel natural, useful, and trustworthy
  • Build and maintain backend services, APIs, and data models that power AI-generated marketing content
  • Develop new agent behaviors, taking them from quick prototypes to scalable production systems
  • Extend our existing web layer with new features that connect directly to AI workflows
  • Improve reliability, observability, and performance across our stack
  • Collaborate cross-functionally to ship user-facing features end to end, from design to deployment
Employment Type
Full-time

Senior Backend Software Engineer

The Coaching team builds Highspot’s personalized, AI-enhanced coaching capabilit...
Location
Canada, Vancouver
Salary
146000.00 - 178000.00 CAD / Year
Highspot
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science or equivalent practical experience
  • 5+ years of experience in back-end software development building and maintaining large-scale web applications
  • At least 3 years of experience working with object-oriented programming languages (Ruby and Python preferred)
  • Experience architecting, building, and deploying mid-to-large scale web applications in a distributed environment
  • Strong understanding of API design, data modeling, and backend scalability
  • Experience integrating or working with AI/LLM platforms such as OpenAI, Anthropic (Claude), or Azure OpenAI
  • Familiarity with AI-powered development tools (e.g., Cursor, GitHub Copilot, Cody, etc.) and a demonstrated ability to incorporate them effectively into day-to-day workflows
  • Deep expertise in web performance, security, and reliability best practices
  • Proven ability to deconstruct complex technical problems and deliver elegant, maintainable solutions
Job Responsibility
  • Design, develop, and maintain high-quality, scalable, and user-centric backend systems using modern technologies
  • Architect and optimize backend infrastructure to power intelligent, AI-driven workflows and Agentic AI integrations
  • Build and maintain integrations with multiple large language models (LLMs) including ChatGPT, Claude, and other OpenAI and Microsoft models
  • Collaborate closely with AI/ML engineers to productionize agentic workflows and autonomous reasoning systems
  • Partner effectively with Product Management and UX Design to translate ideas and research into production-ready, AI-enhanced features
  • Leverage AI-assisted development tools such as Cursor, GitHub Copilot, and other code generation frameworks to accelerate development and improve code quality
  • Lead and mentor engineers through complex projects, emphasizing clean architecture, testing, and software craftsmanship
  • Drive backend infrastructure improvements that enhance reliability, observability, and performance
  • Collaborate cross-functionally to deliver differentiated customer value through AI and data-driven solutions
  • Troubleshoot and resolve critical production issues while contributing to internal documentation and best practices
What we offer
  • Comprehensive medical, dental, vision, disability, and life benefits
  • Group Retirement Savings Plan (RRSP) and matching employer contributions (DPSP) with immediate vesting
  • Flexible PTO
  • Generous Holiday Schedule + 5 Days for Annual Holiday Week
  • Quarterly Recharge Fridays (paid days off for mental health recharge)
  • Flexible work schedules
  • Access to Coaches and Therapists through Modern Health
  • 2 Volunteer days per year
  • Monthly transportation allowance for employees that work in our Vancouver Hub location
  • Employees are eligible to receive stock options
Employment Type
Full-time

Senior Software Engineer, AI Runtime

We’re seeking a Senior Software Engineer to help power the future of agentic AI ...
Location
United States
Salary
157000.00 - 198900.00 USD / Year
Apollo GraphQL
Expiration Date
Until further notice
Requirements
  • Expertise in agent-to-tool orchestration, routing, and coordination in scalable, fault-tolerant systems
  • Deep expertise in Rust programming language
  • Strong background in distributed systems, server architecture, and high-performance backend development
  • Proven experience with protocol design, message routing, and server-side orchestration frameworks
  • Experience building and maintaining robust runtime infrastructure that supports AI-driven workflows and enables reliable agent-to-tool interactions
  • Proven experience with protocol design, message routing, and building server-side frameworks that enable scalable, reliable multi-tool agent workflows
  • Hands-on experience with observability, monitoring, and debugging frameworks for complex systems
  • Passion for clean, maintainable code, high system reliability, and scalable architecture
  • Experience in strategic system design, making architectural trade-offs, and planning for long-term scalability and maintainability
  • Strong technical leadership and mentorship, including guiding junior engineers and driving engineering best practices across teams
Job Responsibility
  • Scale an enterprise AI/MCP Server and Gateway that powers multi-agent workflows across Apollo, including routing, orchestration, and integration boundaries
  • Implement robust server infrastructure to ensure reliability, performance, and security at scale
  • Build and maintain tools for agent discovery, communication, and coordination
  • Define deployment strategies and runtime optimizations to maximize efficiency and minimize operational overhead
  • Develop frameworks and patterns that enable seamless multi-agent collaboration and AI-driven orchestration
  • Integrate observability, logging, and monitoring for full visibility into server and agent behavior
  • Explore and implement AI-enhanced developer workflows to optimize orchestration and agent interactions
  • Collaborate with teams within our org to ensure the MCP Server meets evolving product and developer needs

Staff Software Engineer, AI Runtime

We’re seeking a Staff Software Engineer to help power the future of agentic AI w...
Location
United States
Salary
185000.00 - 215000.00 USD / Year
Apollo GraphQL
Expiration Date
Until further notice
Requirements
  • Expertise in agent-to-tool orchestration, routing, and coordination in scalable, fault-tolerant systems
  • Deep expertise in Rust programming language
  • Strong background in distributed systems, server architecture, and high-performance backend development
  • Proven experience with protocol design, message routing, and server-side orchestration frameworks
  • Experience building and maintaining robust runtime infrastructure that supports AI-driven workflows and enables reliable agent-to-tool interactions
  • Proven experience with protocol design, message routing, and building server-side frameworks that enable scalable, reliable multi-tool agent workflows
  • Hands-on experience with observability, monitoring, and debugging frameworks for complex systems
  • Passion for clean, maintainable code, high system reliability, and scalable architecture
  • Experience in strategic system design, making architectural trade-offs, and planning for long-term scalability and maintainability
  • Strong technical leadership and mentorship, including guiding junior engineers and driving engineering best practices across teams
Job Responsibility
  • Architect and scale an enterprise AI/MCP Server and Gateway that powers multi-agent workflows across Apollo, including routing, orchestration, and integration boundaries
  • Design and implement robust server infrastructure to ensure reliability, performance, and security at scale
  • Build and maintain tools for agent discovery, communication, and coordination
  • Define deployment strategies and runtime optimizations to maximize efficiency and minimize operational overhead
  • Develop frameworks and patterns that enable seamless multi-agent collaboration and AI-driven orchestration
  • Integrate observability, logging, and monitoring for full visibility into server and agent behavior
  • Explore and implement AI-enhanced developer workflows to optimize orchestration and agent interactions
  • Collaborate with teams across Apollo to ensure the MCP Server meets evolving product and developer needs
What we offer
  • Equity
  • Choice of 3 Anthem Blue Cross medical plans (California residents can also choose from an additional 2 Kaiser medical plans)
  • Dental and Vision benefits are provided by Sun Life Financial
Employment Type
Full-time

Senior Software Engineer, AI

As a Senior AI Engineer on our Core AI team, you will be a cornerstone of FloQas...
Location
India, Pune
Salary
Not provided
FloQast
Expiration Date
Until further notice
Requirements
  • 6+ years of professional software engineering experience
  • 3+ years focused on building backend systems for production applications
  • Mastery of Python
  • Familiarity with some AI application frameworks, context engineering, and scalable system design for AI products
  • Expertise in designing products that integrate with multiple technologies, APIs, and data sources in cloud-native environments (AWS preferred)
  • Strong desire to develop deep hands-on experience with LLM APIs, retrieval-augmented generation (RAG), conversational AI, document processing, and MCP integrations
  • Proven ability to lead tech product initiatives, establish technical standards and communicate complex system designs to both technical and business stakeholders
Job Responsibility
  • Architect and lead development of production AI products including intelligent chatbots, document processing systems, and agentic workflows using Python and modern AI frameworks
  • Design and implement our centralized AI platform including model routing, provider management, vector search, and AI application frameworks with seamless MCP (Model Context Protocol) integrations
  • Build scalable AI products that integrate with diverse technologies including accounting systems, document repositories, and external APIs while maintaining robust monitoring and observability
  • Master context engineering and system design for AI applications, ensuring optimal information retrieval, context assembly, and multi-turn conversation management
  • Collaborate with Product, Engineering, and Security teams to ensure AI products are robust, compliant, and aligned with business objectives in the regulated accounting space
  • Provide technical leadership and mentorship to the growing AI team, establishing best practices for AI product development, deployment, and governance
Employment Type
Full-time

Staff Software Engineer, Core AI

As a Staff AI Engineer on our Core AI team, you will be a cornerstone of FloQast...
Location
United States, San Jose
Salary
164000.00 - 246000.00 USD / Year
FloQast
Expiration Date
Until further notice
Requirements
  • 10+ years of professional software engineering experience
  • 4+ years focused on building backend systems for production applications
  • Mastery of Python
  • Familiarity with some AI application frameworks, context engineering, and scalable system design for AI products
  • Expertise in designing products that integrate with multiple technologies, APIs, and data sources in cloud-native environments (AWS preferred)
  • Strong desire to develop deep hands-on experience with LLM APIs, retrieval-augmented generation (RAG), conversational AI, document processing, and MCP integrations
  • Proven ability to lead tech product initiatives, establish technical standards and communicate complex system designs to both technical and business stakeholders
Job Responsibility
  • Architect and lead development of production AI products including intelligent chatbots, document processing systems, and agentic workflows using Python and modern AI frameworks
  • Design and implement our centralized AI platform including model routing, provider management, vector search, and AI application frameworks with seamless MCP (Model Context Protocol) integrations
  • Build scalable AI products that integrate with diverse technologies including accounting systems, document repositories, and external APIs while maintaining robust monitoring and observability
  • Master context engineering and system design for AI applications, ensuring optimal information retrieval, context assembly, and multi-turn conversation management
  • Collaborate with Product, Engineering, and Security teams to ensure AI products are robust, compliant, and aligned with business objectives in the regulated accounting space
  • Provide technical leadership and mentorship to the growing AI team, establishing best practices for AI product development, deployment, and governance
What we offer
  • Medical
  • Dental
  • Vision
  • Family Forming benefits
  • Life & Disability Insurance
  • Unlimited Vacation
Employment Type
Full-time

AI Engineer

In this role you will design and build intelligent, autonomous AI systems that e...
Location
United States, San Diego
Salary
199500.00 - 299300.00 USD / Year
Teradata
Expiration Date
Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field
  • 3–5+ years of experience in software architecture, backend development, or AI infrastructure
  • Strong Python skills and familiarity with Java, Go, and C++
  • Deep expertise in agent development, LLM integration, prompt engineering, runtime systems, and AI tooling
  • Experience with MCP servers, vector databases, RAG systems, graph-based memory, and NLP frameworks
  • Ability to design core agentic capabilities such as memory management, context handling, observability, and identity
  • Strong background in distributed systems, backend services, API design, and cloud-native deployments (AWS, Azure, GCP)
  • Proficiency with containerization, CI/CD pipelines, and scalable production infrastructures
  • Excellent communication skills, documentation habits, and ability to mentor or collaborate across teams
  • Passion for building safe, human-aligned, autonomous systems and extending open-source tools to innovate
Job Responsibility
  • Design and build intelligent, autonomous AI systems that enable Teradata to push the boundaries of enterprise-scale agentic technology
  • Lead the development of scalable, secure, cloud-native frameworks that allow AI agents to reason, plan, act, and collaborate in real-world production environments
  • Create the foundational runtime components, automation capabilities, and infrastructure that power next-generation GenAI and Agentic AI solutions
  • Work closely with AI researchers, platform teams, and product leadership to bring advanced agentic capabilities from concept to production across Teradata’s data and AI platform
  • Succeed in this role by enabling enterprise customers to leverage powerful, resilient, and safely governed AI agents that drive measurable business value
What we offer
  • Healthcare, life and disability insurance plans
  • 401(k) retirement savings plan
  • Time-off programs
  • Flexible work model
  • Well-being focus
  • Diversity, Equity, and Inclusion commitment
Employment Type
Full-time

Senior AI Engineer

Guidepoint is seeking an experienced Senior AI Engineer to join our Toronto-base...
Location
Canada, Toronto
Salary
Not provided
Modoras Accounting Syd
Expiration Date
Until further notice
Requirements
  • 6+ years of professional experience (or 5+ with a Master’s degree) designing, building, and scaling distributed, production-grade backend systems
  • 2+ years building and operating Generative AI and agentic systems in production
  • Strong software engineering fundamentals in Python, including building and scaling REST APIs using frameworks such as FastAPI, with experience in asynchronous programming and microservices
  • Hands-on experience building enterprise AI agents and workflows using LLM platforms such as OpenAI, Anthropic (Claude), or Google Gemini, and frameworks like LangChain or agent SDKs
  • Experience building and operating within the enterprise AI ecosystem, including custom GPTs or agents, agent builders, connectors/apps, and application or agent SDKs (e.g., OpenAI Apps SDK, ChatKit, or equivalents)
  • Experience designing and operating agent integration layers (e.g., MCP servers or similar) that connect AI agents to internal APIs, tools, and services, with secure authentication and authorization using enterprise identity platforms such as Okta, Microsoft Entra ID, or OAuth-based systems
  • Strong understanding of AI governance, compliance, and responsible AI practices, including access control, auditability, data handling, and secure deployment of AI systems in enterprise environments
  • Direct experience with RAG, vector search using databases such as Elasticsearch, multi-agent AI systems, tool-calling agents, prompt engineering, and agent evaluation in production environments
  • Cloud-native experience deploying and operating containerized applications on Azure (preferred) or AWS/GCP using Docker and Kubernetes
  • Proven ability to lead complex technical initiatives, make sound architectural decisions, and mentor engineers building production-ready AI systems
Job Responsibility
  • Design, build, and operate scalable, low-latency backend services and REST APIs that power Generative AI capabilities, including retrieval-augmented generation (RAG) pipelines, vector search, and enterprise-grade agentic systems
  • Own the full lifecycle of AI applications and agents, from system architecture and development to CI/CD, deployment, agent evaluation, monitoring, and ongoing optimization in production
  • Build production-grade research agents and enterprise AI workflows that integrate LLMs with proprietary knowledge, vector databases (e.g., Elasticsearch), internal tools, external APIs, and real-time data
  • Design and operate multi-agent AI systems, including tool-calling agents and agent orchestration patterns, to support complex research and enterprise workflows
  • Apply AIOps best practices for building, evaluating, deploying, and operating AI agents with strong observability, reliability, and quality controls
  • Continuously improve retrieval and generation quality using prompt engineering, retrieval tuning, re-ranking, advanced chunking strategies, and hallucination reduction techniques
  • Provide technical leadership through design discussions, code reviews, and mentorship, and partner closely with product and business stakeholders to influence the AI roadmap
What we offer
  • Paid Time Off
  • Comprehensive benefits plan
  • Company RRSP Match
  • Development opportunities through the LinkedIn Learning platform