Full-Stack Software Engineer, Inference

Cohere

Location:

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

Job Responsibility:

  • Improve the platform’s auth, billing, and payment systems
  • Add new features to the interactive Playground where customers can try our models
  • Implement new platform features for managing deployments
  • Write and ship minimal code that runs in low-resource environments and is subject to highly stringent deployment mechanisms
  • As security and privacy are paramount, you will sometimes need to reinvent the wheel, and won’t be able to use the most popular libraries or tooling

Requirements:

  • 5+ years of experience writing clean backend code
  • Experience with Golang and React
  • Experience building payment systems and working with subscription or usage-based SaaS, and/or products with a freemium model
  • Strong coding abilities and comfort working across the stack
  • Experience working in both large enterprises and startups
  • Ability to excel in fast-paced environments and execute while priorities and objectives are a moving target

What we offer:

  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)

Additional Information:

Job Posted:
February 20, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Full-Stack Software Engineer, Inference

Software Engineer, Full-Stack

We’re seeking a Full-Stack Software Engineer to play a highly impactful role in ...
Location:
United States, San Mateo
Salary:
Not provided
Fireworks AI
Expiration Date
Until further notice
Requirements
  • 3 - 7 years of software engineering experience
  • Deeply understand how a product fits into the business landscape
  • Proficiency in TypeScript and Python
  • Be a customer-obsessed engineer who loves talking to users and getting feedback
  • Strong ability to make design decisions and craft great experiences
  • Willing to think outside of the box and build a product from scratch for users to serve new needs and use cases
  • Understanding of responsive design, component-based architecture, and UX fundamentals
  • Strong communication and collaboration skills
Job Responsibility
  • Contribute to the Fireworks Platform (developer-facing web app, serverless and on-demand inference, Python SDK) alongside other team members
  • Design and implement full stack technical features to address business problems
  • Ship features that users care about, iterate rapidly and ideate constantly
  • Rapidly prototype and experiment with a data-driven focus
  • Own feature development from backend APIs to frontend user interfaces
  • Directly engage with users through various channels (Discord, meetups, etc.) and convert their needs into shipped features
  • Be able to explain why a feature matters to customers as well as its importance in the competitive landscape
  • Be a user of the inference platform to have a deep sense of what’s working and what’s not working in the product
What we offer
  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation
  • Fulltime

Full Stack Engineer

Join a small, mission-driven startup building voice-first AI technology to impro...
Location:
United States, San Francisco
Salary:
150000.00 - 185000.00 USD / Year
Orbis Consultants
Expiration Date
Until further notice
Requirements
  • 3+ years full-stack experience, strong backend focus (Node.js, Next.js/React, TypeScript/JavaScript, Nest.js)
  • Experience with REST APIs, SQL/PostgreSQL (Prisma/GraphQL a plus), Docker, CI/CD
  • Startup mindset, collaborative, proactive, and mission-driven
Job Responsibility
  • Build backend infrastructure, APIs, and data pipelines powering AI research and real-time inference
  • Ensure HIPAA-compliant, secure systems that integrate with healthcare software
  • Collaborate with clinical and regulatory teams while shaping startup architecture and strategy
  • Fulltime

Software Engineer, Research - Human Data

OpenAI’s mission is to ensure that artificial general intelligence (AGI) benefit...
Location:
United States; United Kingdom, San Francisco; London
Salary:
230000.00 - 385000.00 USD / Year
OpenAI
Expiration Date
Until further notice
Requirements
  • Strong software engineering fundamentals
  • Experience building production systems at scale
  • Enjoy full-stack development with end-to-end ownership
  • Motivated by high-impact collaboration with research teams and solving novel, ambiguous problems
  • Excited to shape how AI systems learn from human preferences and reflect a broad range of human values
  • Care deeply about inclusive tooling and building systems that enhance model safety, reliability, and usefulness
Job Responsibility
  • Build and maintain robust full-stack systems for feedback collection, data labeling, and evaluation pipelines, while maintaining high levels of security
  • Translate experimental alignment research into scalable production infrastructure, including inference and model training stacks
  • Design and iterate on user-facing tools and backend services to support high-quality data workflows
  • Partner with researchers, engineers, and program leads to shape feedback loops and model interaction paradigms
  • Drive infrastructure improvements that enable faster iteration and scaling across OpenAI’s frontier models, from internal research tooling all the way to production ChatGPT
What we offer
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Fulltime

Head of Inference Kernels

As a core member of the team, you will play a pivotal role in leading a high-per...
Location:
United States, San Jose
Salary:
200000.00 - 300000.00 USD / Year
Etched
Expiration Date
Until further notice
Requirements
  • Experience designing and optimizing GPU kernels for deep learning using CUDA and assembly (ASM)
  • Experience with low-level programming to maximize performance for AI operations, leveraging tools like Composable Kernel (CK), CUTLASS, and Triton for multi-GPU and multi-platform performance
  • Deep fluency with transformer inference architecture, optimization levers, and full-stack systems (e.g., vLLM, custom runtimes)
  • History of delivering tangible perf wins on GPU hardware or custom AI accelerators
  • Solid understanding of roofline models of compute throughput, memory bandwidth and interconnect performance
  • Experienced in running large-scale workloads on heterogeneous compute clusters, optimizing for efficiency and scalability of AI workloads
  • Scopes projects crisply, sets aggressive but realistic milestones, and drives technical decision-making across the team
  • Anticipates blockers and shifts resources proactively
Job Responsibility
  • Architect Best-in-Class Inference Performance on Sohu: Deliver continuous batching throughput exceeding B200 by ≥10x on priority workloads
  • Develop Best-in-Performance Inference Mega Kernels: Develop complex, fused kernels that increase chip utilization and reduce inference latency, and validate these optimizations through benchmarking and regression testing in production pipelines
  • Architect Model Mapping Strategies: Develop system-level optimizations using a mix of techniques such as tensor parallelism and expert parallelism for optimal performance
  • Hardware-Software Co-design of Inference-time Algorithmic Innovation: Develop and deploy production-ready inference-time algorithmic improvements (e.g., speculative decoding, prefill-decode disaggregation, KV cache offloading)
  • Build Scalable Team and Roadmap: Grow and retain a team of high-performing inference optimization engineers
  • Cross-Functional Performance Alignment: Ensure inference stack and performance goals are aligned with the software infrastructure teams, GTM and hardware teams for future generations of our hardware
What we offer
  • Medical, dental, and vision packages with generous premium coverage
  • $500 per month credit for waiving medical benefits
  • Housing subsidy of $2k per month for those living within walking distance of the office
  • Relocation support for those moving to San Jose (Santana Row)
  • Various wellness benefits covering fitness, mental health, and more
  • Daily lunch + dinner in our office
  • Significant equity package
  • Fulltime

Member of Technical Staff – Fullstack Engineer

As a fullstack engineer at Inflection, you will own the platforms, systems, and ...
Location:
United States, Palo Alto
Salary:
175000.00 - 350000.00 USD / Year
Inflection AI
Expiration Date
Until further notice
Requirements
  • 5+ years of professional software engineering experience, particularly in full-stack development
  • Prior experience in high-growth or early-stage startup environments
  • Strong proficiency across the modern web stack: Python, TypeScript, Node.js, and modern frontend frameworks (e.g., React, Tailwind)
  • Experience in designing complex architectures, including asynchronous workflows and integrations
  • Proven problem-solving, collaboration, and communication skills
  • Experience building or integrating AI/LLM-powered applications
  • Experience with modern cloud and workflow infrastructure, including orchestration frameworks (e.g., Temporal), containerization and Kubernetes, and CI/CD pipelines on AWS/GCP/Azure
  • Bachelor’s degree or equivalent in a field related to the position
Job Responsibility
  • Design and implement scalable backend systems and APIs that power production LLM experiences, including agentic workflows, memory systems, and tool integrations
  • Build and operate high-availability infrastructure to support real-time inference, retrieval, and conversation pipelines
  • Develop internal platforms to improve engineering productivity—CI/CD pipelines, service templates, observability frameworks, and rollout tooling
  • Collaborate closely with applied research and frontend teams to rapidly prototype, ship, and iterate on end-user features
  • Ensure systems meet our high bar for security, uptime, and latency through incident response, load testing, monitoring, and automation
  • Participate in on-call rotations to maintain the reliability of the services you build
What we offer
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Meaningful equity component
  • Fulltime

R&D Engineering Group Manager

This role will lead the R&D group responsible for Optimove’s Digital Personaliza...
Location:
United Kingdom, Dundee
Salary:
Not provided
Optimove
Expiration Date
Until further notice
Requirements
  • B.Sc. (or higher) in Computer Science, or equivalent
  • Excellent knowledge of software design and scalable, cloud-native architecture
  • 5+ years of experience in full-stack commercial software development, ideally including back-end services in Python, Rust, or Node.js and front-end services in JavaScript or TypeScript with a modern framework (e.g. React)
  • Experience of automated unit-testing and CI/CD pipelines
  • Experience with AWS (including ECS)
  • Knowledge of SQL and NoSQL databases
  • Experience mentoring junior developers
  • 2+ years of experience as an agile development team leader
  • Strong leadership skills to motivate and build a winning team
  • Working in a hybrid model from our Dundee office (at least 2 days per week)
Job Responsibility
  • Lead, mentor, motivate and inspire 4 software development teams (~25 engineers and QAs in total)
  • Work collaboratively with Architects, DevOps and Data Scientists to build and maintain a robust, scalable, secure, cloud-native architecture for model training and real-time inference
  • Together with Product Management and Development Team Leaders, establish a product roadmap for each development team and oversee the delivery of this
  • Review and refine our agile delivery processes on an ongoing basis
  • Encourage pragmatic adoption of modern agentic development tools and practices
  • Establish a quality-first culture, take an active role in code review and deployment, and triage customer-impacting issues promptly (with the client when required)
  • As a data-driven company, we work with multiple databases (e.g. DynamoDB, Postgres, Snowflake) and event streaming systems (e.g. Kinesis). We want you to identify and implement improvements to data architecture and data quality
  • Although this is predominantly a hands-off role, don’t be afraid to roll up your sleeves and get stuck in, helping scaffold solutions to gnarly or time-sensitive issues
  • See the full picture. Take ownership of Optimove Personalize from feature design to production and support, ensuring it is delivering value for our clients

AI Solutions Architect / Field Application Engineer

We are looking for an AI enthusiast with strong technical fundamentals and custo...
Location:
United States, Austin
Salary:
128400.00 - 192600.00 USD / Year
AMD
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Electrical Engineering, Computer Engineering, or a related field (or equivalent practical experience)
  • Strong interest in AI/ML technologies and a desire to work across hardware and software layers
  • Hands-on experience with Linux-based systems
  • Programming experience in one or more of the following: Python, C/C++, Bash
  • Familiarity with AI frameworks or tools (e.g., PyTorch, TensorFlow, ONNX, Hugging Face, or similar)
  • Strong communication skills with the ability to explain technical concepts clearly
  • Ability to work effectively in a team-oriented, cross-functional environment
Job Responsibility
  • Serve as a technical point of contact for customers, supporting AI and HPC workloads on AMD CPU and GPU platforms
  • Work directly with customers to understand their use cases, requirements, and constraints, and guide them through solution design and deployment
  • Deliver technical presentations, demos, and architecture walkthroughs to both technical and non-technical audiences
  • Program-manage customer opportunities as they grow in complexity, coordinating activities across internal and external stakeholders
  • Perform hands-on system bring-up including hardware installation, firmware configuration, OS installation, and driver setup
  • Deploy and validate open-source AI and HPC software stacks (e.g., Linux, ROCm, AI frameworks, containers)
  • Run functionality, performance, and scalability benchmarks on CPU and GPU workloads
  • Perform first-level profiling and analysis of applications to identify performance bottlenecks and optimization opportunities
  • Support AI workloads such as training, inference, and data preprocessing across CPU and GPU platforms
  • Develop working knowledge of AMD CPU and GPU architectures and how they impact real-world workloads
  • Fulltime

AI Systems Engineer - Agentic Autonomy

We are seeking an AI Systems Engineer with deep expertise in large language mode...
Location:
United States, Greater Boston
Salary:
140000.00 - 180000.00 USD / Year
HavocAI
Expiration Date
Until further notice
Requirements
  • Bachelor’s, Master’s, or PhD in Computer Science, Machine Learning, Robotics, or a related field
  • Deep hands-on experience building with LLMs and multi-agent/agentic AI frameworks
  • Strong software engineering background in modern ML frameworks, cloud orchestration, and API development
  • Experience integrating AI systems into larger software architectures or robotics/autonomy workflows
  • Understanding of RAG pipelines, tool-use frameworks, LLM function-calling, memory systems, and agent orchestration
  • Experience with safety evaluation, model alignment, or mission-critical AI system validation
  • Ability to lead system-level design discussions and coordinate across multiple engineering disciplines
  • Must be a U.S. Citizen and eligible to obtain a Secret Clearance
Job Responsibility
  • Lead the design and development of LLM-powered software modules for mission reasoning, planning, operator interaction, and autonomous decision support
  • Integrate LLMs and agentic systems into HavocAI’s autonomy architecture, including ROS/ROS2 systems, planning engines, and mission software
  • Build multi-agent, tool-using AI systems that interact with perception data, mission databases, simulation systems, and operator inputs
  • Develop APIs, wrappers, and orchestration layers enabling LLMs to interface safely with embedded, cloud, and edge compute environments
  • Optimize LLM inference pipelines for performance, latency, and reliability in field-deployed systems
  • Evaluate model behavior, perform safety testing, and develop guardrails for mission-critical use cases
  • Collaborate with autonomy, embedded, simulation, and full-stack teams to define requirements and ensure robust system-level integration
  • Guide strategic decisions on model selection, fine-tuning approaches, safety frameworks, and long-term AI architecture
  • Contribute to field testing, operator evaluations, and iterative deployment cycles for AI-augmented autonomy systems
What we offer
  • 100% Employer-paid Health, Dental and Vision Insurance for you and your family
  • Life Insurance (Employer Paid)
  • Ability to participate in the company’s 401k program (Matching)
  • Unlimited PTO policy with an enforced 2-week minimum
  • Equity Package
  • Work / Home Office Stipend
  • Global Entry
  • 16 Week Paid Parental Leave
  • Monthly Health and Wellness Stipend
  • Fulltime