Senior Software Engineer, Observability

Together AI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

160000.00 - 260000.00 USD / Year

Job Description:

The AI Infrastructure team at Together AI is at the forefront of building and scaling the foundational systems that power our generative AI platform. The storage and observability team is crucial for designing, implementing, and maintaining robust distributed storage solutions, ensuring seamless data access and management. They are also responsible for developing comprehensive observability platforms, providing critical insights into system performance and GPU utilization, and proactively identifying and resolving issues.

Job Responsibility:

  • Identify, design, and develop foundational backend services that power Together’s cloud platform
  • Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure
  • Partner with product teams to understand functional requirements and deliver solutions that meet business needs
  • Write clear, well-tested, and maintainable software and IaC for both new and existing systems
  • Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance
  • Participate in an on-call rotation to address critical incidents when necessary

Requirements:

  • 5+ years of demonstrated experience in building large-scale, fault-tolerant distributed systems and API microservices
  • Experience designing, analyzing and improving efficiency, scalability, and stability of various system resources
  • Excellent communication skills – able to write clear design docs and work effectively with both technical and non-technical team members
  • Demonstrated experience with building and operating high-performance and/or globally distributed microservice architectures across one or more cloud providers (AWS, Azure, GCP)

What we offer:

  • competitive compensation
  • startup equity
  • health insurance
  • flexibility in terms of remote work

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Senior Software Engineer, Observability

Senior Software Engineer, Observability

The Observability team at Airtable ensures that engineers have the tools they ne...
Location:
United States, San Francisco; New York; Seattle
Salary:
196000.00 - 270000.00 USD / Year
Airtable
Expiration Date
Until further notice
Requirements:
  • 6+ years of software engineering experience
  • 3+ years focused on observability or infrastructure at scale
  • Demonstrated success implementing and running production-grade logging, metrics, or tracing systems
  • Proficiency in distributed systems concepts, data streaming pipelines, and container orchestration (Kubernetes)
  • Deep hands-on knowledge of tools such as Prometheus, Grafana, Datadog, OpenTelemetry, ELK Stack, Loki, or ClickHouse
  • Comfort with at least one programming language (e.g., Go, Python, Java) to build and maintain observability tooling
  • Experience mentoring engineers and collaborating across multiple teams
  • Strong communication skills
  • Eagerness to own high-impact initiatives
  • Proven ability to balance short-term fixes with long-term strategic vision
Job Responsibility:
  • Architect and scale core observability systems
  • Lead the design and evolution of logging, metrics, and tracing pipelines
  • Evaluate and integrate new technologies (e.g., OpenTelemetry, ClickHouse, ELK stack)
  • Guide and mentor a growing team of infrastructure engineers
  • Define and uphold coding standards and operational excellence
  • Partner with Deploy Infrastructure, Service Orchestration, and Product teams
  • Align infrastructure decisions with business goals
  • Own end-to-end reliability for observability tools and establish SLAs, SLOs, and error budgets
  • Optimize performance and cost of large-scale data pipelines
  • Shape the observability roadmap
What we offer:
  • Opportunity to receive benefits
  • Restricted stock units
  • May include incentive compensation
  • Comprehensive benefit offerings
Employment Type:
Full-time

Senior Software Engineer, Release Engineering

We’re looking for a Senior Software Engineer to join our Release Engineering tea...
Location:
United States
Salary:
143000.00 - 203000.00 USD / Year
dbt Labs
Expiration Date
Until further notice
Requirements:
  • Experience designing, operating, or improving CI/CD systems for large-scale distributed applications
  • Proficiency with one or more of the following: Helm, ArgoCD, Terraform, GitHub Actions, or Kubernetes
  • Familiarity with infrastructure-as-code practices and the principles of reliable, observable systems
  • Background in Python (or other modern language) development for automation or platform tooling
  • A collaborative mindset and interest in enabling other developers through tooling and platform improvements
  • Worked asynchronously as part of a fully remote, distributed team
Job Responsibility:
  • Design, build, and maintain components of our CI/CD platform to make deployments safer, faster, and more reliable
  • Lead initiatives that improve automation, observability, and self-service capabilities for engineers
  • Collaborate across teams to identify friction points in our delivery process and build tools to eliminate them
  • Evolve our release architecture to support dbt Cloud’s multi-cloud, cell-based infrastructure at scale
  • Continuously improve developer experience by refining build pipelines, release workflows, and infrastructure-as-code practices
What we offer:
  • Unlimited vacation
  • 401k w/3% guaranteed contribution
  • Excellent healthcare
  • Paid Parental Leave
  • Wellness stipend
  • Home office stipend
Employment Type:
Full-time

Senior Software Engineer, Platform Observability

Everlaw is looking for a Senior Software Engineer that brings experience in buil...
Location:
United States, Oakland
Salary:
164000.00 - 208000.00 USD / Year
Everlaw
Expiration Date
Until further notice
Requirements:
  • BS or MS in Computer Science, or equivalent coursework
  • At least 3 years of experience building logging, metrics, and tracing infrastructure
  • Proficiency in coding in a language such as C, C++, C#, Java, Python, JavaScript, Go, or Rust
  • Experience with Infrastructure as Code and container solutions to manage cloud environments (e.g., Terraform, Ansible, Docker)
  • At least 1 year of experience leading multi-developer efforts, including planning, technical breakdown, and coordination
  • Excellent communication and collaboration skills
  • Please note that at this time, Everlaw is not sponsoring U.S. employment visas for this role. Due to federal contract requirements, Everlaw may only hire US citizens for this position.
Job Responsibility:
  • Build observability strategies to support application and infrastructure metrics, logs, traces, dashboards, and alerts
  • Develop and maintain infrastructure as code (IaC) using tools such as Terraform and Ansible
  • Monitor usage trends to identify opportunities to optimize efficiency and performance of our metrics database and logging tools
  • Improve our on-call and incident management processes by encouraging deeper understanding, communication, and trust
  • Support developer projects by influencing design and implementation of infrastructure features as well as providing technical guidance
  • Support compliance efforts by promoting continuous documentation of our processes and involvement in audits
  • Provide technical mentorship to other engineers by sharing your technical knowledge and becoming an expert in an area of our code base
What we offer:
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
  • Paid parental leave and approximately 10 days (80 hours) per year of sick leave
  • Seventeen paid vacation days plus 11 federal holidays
  • Membership to Modern Health to help employees prioritize mental health and wellness
  • Annual allocation for Learning & Development opportunities and applicable professional membership dues
  • Company-sponsored life and disability insurance
  • Work in Uptown Oakland, just steps from the BART line and dozens of restaurants and walking distance to Lake Merritt
Employment Type:
Full-time

Senior Software Engineer - Search

Truveta is the world’s first health provider led data platform with a vision of ...
Location:
United States, Seattle
Salary:
155000.00 - 190000.00 USD / Year
Truveta
Expiration Date
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Information Systems, or a related field (advanced degree a plus)
  • 5+ years of professional software engineering experience
  • Designing, building, and operating distributed systems at scale
  • Writing production-quality, efficient, multi-threaded code that runs reliably in cloud environments
  • Architecting and implementing search system features (indexing, querying, optimization), including building robust test frameworks
  • Reviewing data specifications and handling large-scale data storage and distribution using specialized protocols
  • Debugging and resolving complex production issues in distributed systems
  • Proven experience with cloud-native architectures and DevOps practices (preferably Azure, though AWS/GCP experience is relevant)
Job Responsibility:
  • Design, build, and maintain index, query, and search system features utilized to aggregate and analyze health data
  • Architect, implement, and test new index and query features
  • Optimize end-to-end index performance
  • Plan, architect, and deploy highly scalable and highly reliable search systems
  • Implement relevant compliance controls and conduct thorough security reviews
  • Drive observability, reliability, and automation across the infrastructure and platform
  • Monitor emerging technology in the search and infrastructure domains, evaluate applicability, and champion adoption where appropriate
  • Contribute to knowledge sharing and best practices within the team
What we offer:
  • Comprehensive benefits with strong medical, dental and vision insurance plans
  • 401K plan
  • Professional development & training opportunities for continuous learning
  • Work/life autonomy via flexible work hours and flexible paid time off
  • Generous parental leave
  • Regular team activities (virtual and in-person)
  • Additional compensation such as incentive pay and stock options
Employment Type:
Full-time

Senior Software Engineer

The Wikimedia Foundation is looking for a Senior Software Engineer to join our t...
Location:
United States of America
Salary:
141352.00 - 175725.00 USD / Year
Wikimedia Foundation
Expiration Date
Until further notice
Requirements:
  • Being comfortable working in a semi-ambiguous environment, similar to that of a startup
  • Experience in supporting complex web applications running on Amazon Web Services or other comparable cloud platforms
  • Experience working with Kafka or similar distributed event processing systems
  • Experience working with Node.js and Go applications
  • Comfortable with configuration management and orchestration tools (ECS, Kubernetes), and modern observability infrastructure (monitoring, metrics and logging)
  • Aptitude for automation and streamlining of tasks
  • Comfortable with shell and scripting languages used in an SRE/Operations engineering context (e.g. Python, Go, Bash, Ruby, etc.)
  • Good understanding of Linux/Unix fundamentals and debugging skills
  • Strong English language skills and ability to work independently, as an effective part of a globally distributed team
  • B.S. or M.S. in Computer Science or equivalent in related work experience
Job Responsibility:
  • Bringing your creativity to improve our current infrastructure
  • Being a key part of planning our future technical roadmap
  • Maintaining and improving the reliability of highly used commercial data feeds
  • Supporting new code/feature deployments
  • Troubleshooting, debugging and following-up on emerging issues in our application stack and its surroundings
  • Assisting in the architectural design of new services and making them operate at scale
  • Incident response, diagnosis and follow-up on system outages or alerts across Wikimedia Enterprise’s production infrastructure
  • Sharing our values and working in accordance with them
Employment Type:
Full-time

Senior Software Engineer II

The Entity Graph team builds the core knowledge graph and services that connect ...
Location:
United States, Seattle
Salary:
141000.00 - 225600.00 USD / Year
Axon
Expiration Date
Until further notice
Requirements:
  • Strong backend engineering experience (typically 8+ years) with proven technical leadership
  • Proficiency in one or more modern languages (e.g., Java/Kotlin, C#, Go, or similar) and cloud-native service development
  • Experience designing data models for complex, interrelated domains and working with relational and NoSQL/graph stores
  • Solid systems design skills for distributed, high-throughput services
  • Clear communicator who collaborates effectively across teams
  • Security- and privacy-conscious mindset
Job Responsibility:
  • Design and implement scalable services for entity modeling, ingestion, indexing, and query
  • Define and evolve data and schema models for interconnected records
  • Lead end-to-end projects: architecture, implementation, and delivery
  • Collaborate with product and data partners to translate requirements into technical solutions
  • Improve service reliability, testing, and observability
  • Mentor peers and contribute to engineering best practices
Employment Type:
Full-time

Manager, Software Engineering - Creation Engine

The Client Testing, Observability, and Performance (CTOP) team’s mission is to m...
Location:
United States, San Francisco; New York
Salary:
250000.00 - 350000.00 USD / Year
Figma
Expiration Date
Until further notice
Requirements:
  • 4+ years of engineering management experience leading high-output, high-performing teams
  • 4+ years as a hands-on engineer
  • Proven leadership in building, mentoring, and motivating senior engineers
  • Deeply passionate about the testing, observability, and tooling space
  • Demonstrated success delivering scalable, high-quality work and driving cross-functional initiatives in fast-paced, ambiguous environments
  • Empathetic leader with strong organizational and execution skills
Job Responsibility:
  • Manage and support a team of experienced engineers to deliver best-in-class testing and observability frameworks for Figma client developers
  • Partner with product, data science, and engineering leadership to set strategy, priorities and mission for teams and projects
  • Roll up your sleeves as needed to get involved in the technical details and operational strategy
  • Engage on broader company programs to up-level the team’s work on performance & quality
  • Build and support a culture of doing great work together for our engineering team by investing in team culture, mentorship, and meaningful work
  • Grow your career in a collaborative and creative engineering community
What we offer:
  • Equity
  • Health, dental & vision
  • Retirement with company contribution
  • Parental leave & reproductive or family planning support
  • Mental health & wellness benefits
  • Generous PTO
  • Company recharge days
  • Learning & development stipend
  • Work from home stipend
  • Cell phone reimbursement
Employment Type:
Full-time

Senior / Staff Software Engineer (Database)

Our database technology sits at the heart of the Materialize product—a product t...
Location:
United States, New York
Salary:
164050.00 - 250000.00 USD / Year
Materialize
Expiration Date
Until further notice
Requirements:
  • Several years of experience developing software
  • Passionate about distributed systems and/or databases
  • Excited to learn Rust if not already using it
  • Pride in owning work end-to-end
  • Ability to write clear design docs and well-documented code
  • Love solving hard problems in service of the customer
  • Excited about working at the intersection of frontier academic research and a venture-backed startup
Job Responsibility:
  • Design and deliver improvements to the Database, with an eye on correctness, reliability, and performance
  • Own projects end-to-end, from early stage design to holding the pager
  • Debug and resolve complex distributed systems issues, sometimes directly with customers
  • Continually improve system reliability, observability, and automation
  • Collaborate across your team, with Product, with Field Eng, and all other stakeholders to align on direction, carefully prioritize, and build the best product for our users
  • Share your work through mentorship, demos, blog posts, and any other relevant channels
What we offer:
  • Equity
Employment Type:
Full-time