Inference Software Engineer

Etched

Location:
United States, San Jose

Contract Type:
Not provided

Salary:

175000.00 - 275000.00 USD / Year

Job Description:

Etched is building the world’s first AI inference system purpose-built for transformers, delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep and parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest-growing industry in history.

Job Responsibility:

  • Support porting state-of-the-art models to our architecture
  • Help build programming abstractions and testing capabilities to rapidly iterate on model porting
  • Build, enhance, and scale Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling
  • Optimize routing and communication layers using Sohu’s collectives
  • Utilize performance profiling and debugging tools to identify bottlenecks and correctness issues

Requirements:

  • Proficiency in C++ or Rust
  • Understanding of performance-sensitive or complex distributed software systems, such as Linux internals, accelerator architectures (e.g. GPUs, TPUs), compilers, or high-speed interconnects (e.g. NVLink, InfiniBand)
  • Familiarity with PyTorch or JAX
  • Ported applications to non-standard accelerator hardware or hardware platforms

Nice to have:

  • Developed low-latency, high-performance applications using both kernel-level and user-space networking stacks
  • Deep understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols, consistency models, and communication patterns
  • Solid grasp of Transformer architectures, particularly Mixture-of-Experts (MoE)
  • Built applications with extensive SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths

What we offer:

  • Medical, dental, and vision packages with generous premium coverage
  • $500 per month credit for waiving medical benefits
  • Housing subsidy of $2k per month for those living within walking distance of the office
  • Relocation support for those moving to San Jose (Santana Row)
  • Various wellness benefits covering fitness, mental health, and more
  • Daily lunch + dinner in our office

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Fulltime
Work Type:
On-site work

Similar Jobs for Inference Software Engineer

AI Software Engineer

Join Qargo as an AI Software Engineer and help build intelligent, user-centric A...
Location:
Belgium, Ghent
Salary:
Not provided
Qargo
Expiration Date
Until further notice
Requirements
  • Min. 2 years of experience in software engineering, applied AI, or similar technical roles
  • Strong programming skills (preferably Python and/or modern backend languages)
  • Experience with AI/ML tools and frameworks such as PyTorch, Hugging Face, LangChain/LangGraph, vector databases, and inference tooling
  • Proven experience deploying and operating AI/ML systems in a production environment
  • Ability to experiment quickly, iterate fast, and validate assumptions
  • Strong problem-solving skills and the ability to work autonomously in a fast-paced environment
  • Clear communication skills and the ability to collaborate with engineers, product managers, and domain experts
Job Responsibility
  • Evaluate and prototype with new AI models and techniques to solve document, workflow, and conversational tasks
  • Bring AI prototypes to production, ensuring quality, scalability, and observability
  • Monitor and maintain AI systems running in production, optimising cost, latency, and reliability
  • Collaborate with cross-functional teams to define clear AI tasks (e.g., document classification, summarisation, task prediction)
  • Develop and enhance AI-driven features such as document extraction, matching flows, quality checks, chatbots, and automated bookings
  • Stay up to date with advancements in AI and identify opportunities to improve the product
What we offer
  • Real impact and ownership in a growing international scale-up
  • A supportive and collaborative team culture
  • Hybrid working setup with flexibility and trust
  • Opportunities to learn, grow, and expand your technical knowledge
  • Competitive salary and benefits package

Software Engineer Staff

This Software Engineer Staff will be engaged in data science-related research an...
Location:
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • Use analytical and programming skills and open-source systems such as Apache Storm, Apache Spark, Elasticsearch, Cassandra, and Graph DB to develop data processing pipelines that meet the required efficacy and latency
  • Good knowledge of and experience with big data tool sets and the techniques of distributed storage and computation engines
  • Experience developing reusable and highly scalable data processing components
  • Good knowledge of and experience working with cloud-based CI/CD tools and cloud DevOps teams to collect stats and create monitors for our data processing pipelines
  • Develop good-quality Python APIs to support microservices
  • Knowledge of the APIs of various NoSQL storage systems, such as Elasticsearch, Cassandra, and Redis
  • Good understanding of Python Flask web services and the ability to develop good-quality code
  • Troubleshoot production environments and customer-reported issues
  • Knowledge of multi-cloud production environments
  • Agility in troubleshooting open-source data processing engines such as Apache Spark, Apache Storm, and Apache Flink
Job Responsibility
  • Designs, develops, troubleshoots and debugs software programs for software enhancements and new products
  • Develops software including operating systems, compilers, routers, networks, utilities, databases and Internet-related tools
  • Determines hardware compatibility and/or influences hardware design
  • Engage in data science-related research, software application development, and engineering duties related to our enterprise-grade Wi-Fi technology and autonomous platform to provide unprecedented visibility into the user experience
  • Collaborate with other engineers and product managers to build the next generation of autonomous Wi-Fi networks leveraging big data and predictive models
  • Use knowledge of wireless communication networks, machine learning and software engineering to develop and implement scalable algorithms to process a large amount of streaming data to detect anomalies, predict problems, and classify them in real-time
  • Leverage the data collected from the Wi-Fi network to empower the inference engine of our Mist platform and systems, including the Mist virtual assistant chat bot
  • Determine the likelihood of failures across the Wi-Fi network and perform failure scope analysis
What we offer
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Fulltime

AI Software Engineer III

Planet DDS is a leading provider of a platform of cloud-based solutions that emp...
Location:
United Kingdom, Glasgow
Salary:
Not provided
Planet DDS
Expiration Date
Until further notice
Requirements
  • 5-7 years of professional software engineering experience
  • At least 4 years in AI/ML-focused roles
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Artificial Intelligence, or related field
  • Experience working in a SaaS or enterprise software environment
  • Publications or contributions to open-source AI/ML projects
  • Exposure to reinforcement learning, generative AI (LLMs, diffusion models), or real-time inference systems
Job Responsibility
  • Design, develop, and deploy AI and machine learning models in production environments
  • Architect scalable solutions that integrate AI capabilities into our products and services
  • Collaborate with data scientists, product managers, and backend/front-end engineers to translate prototypes into reliable, maintainable code
  • Own end-to-end development of AI systems, including data ingestion, model training, evaluation, and deployment
  • Implement best practices in model versioning, monitoring, and continuous improvement
  • Contribute to the evolution of our AI/ML infrastructure, including CI/CD pipelines and MLOps tools
  • Stay current on advancements in AI, ML, and deep learning and assess their applicability to business needs
  • Ensure AI solutions are ethical, interpretable, and aligned with regulatory requirements
  • Fulltime

Software Engineer, Full-Stack

We’re seeking a Full-Stack Software Engineer to play a highly impactful role in ...
Location:
United States, San Mateo
Salary:
Not provided
Fireworks AI
Expiration Date
Until further notice
Requirements
  • 3 - 7 years of software engineering experience
  • Deeply understand how a product fits into the business landscape
  • Proficiency in TypeScript and Python
  • Be a customer-obsessed engineer who loves talking to users and getting feedback
  • Strong ability to make design decisions and craft great experiences
  • Willing to think outside of the box and build a product from scratch for users to serve new needs and use cases
  • Understanding of responsive design, component-based architecture, and UX fundamentals
  • Strong communication and collaboration skills
Job Responsibility
  • Contribute to the Fireworks Platform (developer-facing web app, serverless and on-demand inference, Python SDK) alongside other team members
  • Design and implement full-stack technical features to address business problems
  • Ship features that users care about, iterate rapidly, and ideate constantly
  • Rapidly prototype and experiment with a data-driven focus
  • Own feature development from backend APIs to frontend user interfaces
  • Directly engage with users through various channels (Discord, meetups, etc.) and convert their needs into shipped features
  • Be able to explain why a feature matters to customers as well as its importance in the competitive landscape
  • Be a user of the inference platform to have a deep sense of what’s working and what’s not working in the product
What we offer
  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation
  • Fulltime

Senior Software Engineer - Network Enablement (Applied ML)

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements
  • Strong software engineering skills including systems design, APIs, and building reliable backend services (Go or Python preferred)
  • Production experience with batch and streaming data pipelines and orchestration tools such as Airflow or Spark
  • Experience building or operating real-time scoring and online feature-serving systems, including feature stores and low-latency model inference
  • Experience integrating model outputs into product flows (APIs, feature flags) and measuring impact through experiments and product metrics
  • Experience with model lifecycle and operations: model registries, CI/CD for models, reproducible training, offline & online parity, monitoring and incident response
Job Responsibility
  • Embed model inference into Network Enablement product flows and decision logic (APIs, feature flags, backend flows)
  • Define and instrument product + ML success metrics (fraud reduction, retention lift, false positives, downstream impact)
  • Design and run experiments and rollout plans (backtesting, shadow scoring, A/B tests, feature-flagged releases) to validate product hypotheses
  • Build and operate offline training pipelines and production batch scoring for bank intelligence products
  • Ship and maintain online feature serving and low-latency model inference endpoints for real-time partner/bank scoring
  • Implement model CI/CD, model/version registry, and safe rollout/rollback strategies
  • Monitor model/data health: drift/regression detection, model-quality dashboards, alerts, and SLOs targeted to partner product needs
  • Ensure offline and online parity, data lineage, and automated validation / data contracts to reduce regressions
  • Optimize inference performance and cost for real-time scoring (batching, caching, runtime selection)
  • Ensure fairness, explainability and PII-aware handling for partner-facing ML features
What we offer
  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission
  • Fulltime

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer
  • medical, dental, vision, and 401(k)
  • Fulltime

Senior Software Engineer

At JFrog, we’re reinventing DevOps and MLOps to help the world’s greatest compan...
Location:
Israel, Netanya/Tel Aviv
Salary:
Not provided
JFrog
Expiration Date
Until further notice
Requirements
  • 5+ years of proven experience in software development
  • Strong background in designing, developing, and debugging complex distributed systems (e.g., microservices, event-driven architectures)
  • Hands-on experience with containerized environments, microservices, and Kubernetes
  • Proven experience with at least one major cloud provider (e.g., AWS, GCP, Azure)
  • Ability to lead technical discussions, mentor engineers, and drive architectural decisions
Job Responsibility
  • Be an integral part of a highly skilled team working to build the leading MLOps platform in the industry
  • Maintain and evolve the Runtime team’s products, ensuring their reliability and scalability
  • Design and develop a complete hosting system that supports various types of inference, analytics, monitoring, distribution, and more – enabling customers to run large-scale real-time, batch, and streaming ML pipelines
  • Play a key role in shaping our cross-company engineering culture
  • Conduct high-quality design reviews with a strong emphasis on scalability, maintainability, security, and sound use of design patterns
  • Write maintainable, well-tested code in multiple programming languages
  • Continuously improve the efficiency, scalability, and stability of critical system components

AI Software Engineer I

We're looking for an AI engineer to help build the core features and pipelines t...
Location:
United States, Birmingham
Salary:
95700.00 - 160000.00 USD / Year
Daxko
Expiration Date
Until further notice
Requirements
  • 2+ years of software engineering experience
  • Hands-on experience integrating AI APIs (OpenAI, Azure Cognitive Services, AWS Bedrock/SageMaker)
  • Strong Python or JavaScript/TypeScript skills (C# a plus)
  • Experience with AI/ML frameworks: PyTorch, TensorFlow, scikit-learn, Hugging Face, LangChain
  • Familiarity with embeddings, vector databases, and basic RAG concepts
  • Understanding of microservices, REST/GraphQL APIs, and version control (Git)
  • Exposure to cloud environments and CI/CD pipelines
  • Ability to write clear, modular, maintainable code
  • Bachelor’s degree in Computer Science, Data Science, Software Engineering, or related experience
Job Responsibility
  • Build AI-enabled product features: chat, recommendations, anomaly detection, summarization, workflow automation
  • Contribute to RAG pipelines: ingestion, chunking, embeddings, vector search, retrieval logic
  • Integrate model APIs (OpenAI, Azure OpenAI, AWS Bedrock/SageMaker) into production systems
  • Implement reusable components for prompts, retrieval, and inference routing
  • Write clean, testable, secure code and participate in code reviews
  • Work with QA, DevOps, and Security to ensure reliable deployment and model behavior
  • Translate prototypes into maintainable production services and collaborate with product/UX to embed AI into user workflows
  • Participate in Agile ceremonies and contribute to a culture of high-quality engineering
What we offer
  • Flexible paid time off
  • Affordable health, dental, and vision insurance options
  • Monthly fitness reimbursement
  • 401(k) matching
  • New-Parent Paid Leave
  • Casual work environments
  • Remote work
  • Fulltime