Member of Technical Staff - Edge Inference Engineer

United States, San Francisco · Job Posted February 21, 2026

Job Description

Our Edge Inference team compiles Liquid Foundation Models into optimized machine code that runs on resource-constrained devices: phones, laptops, Raspberry Pis, and watches. We are core contributors to llama.cpp and build the infrastructure that makes efficient on-device AI possible. You will work directly with the technical lead on problems that require deep understanding of both ML architectures and hardware constraints. This is high-ownership work where your code ships to production and directly impacts model performance on real devices.

Job Responsibility

  • Implement and optimize inference kernels for CPU, NPU, and GPU architectures across diverse edge hardware
  • Develop quantization strategies (INT4, INT8, FP8) that maximize compression while preserving model quality under strict memory budgets
  • Contribute to llama.cpp and other open-source inference frameworks, including new model architectures (audio, vision)
  • Profile and optimize end-to-end inference pipelines to achieve sub-100ms time-to-first-token on target devices
  • Collaborate with ML researchers to understand model architectures and identify optimization opportunities specific to Liquid Foundation Models

Requirements

  • 5+ years of experience in systems programming with strong C++ proficiency
  • Embedded software engineering experience or work on resource-constrained systems
  • Understanding of ML fundamentals at the linear algebra level (how matrix operations, attention, and quantization work)
  • Experience with hardware architecture concepts: cache hierarchies, memory bandwidth, SIMD/vectorization

Nice to have

  • Contributions to llama.cpp, ExecuTorch, or similar inference frameworks
  • Experience with Rust for systems programming
  • Background in custom accelerator development (TPU, NPU) or work at companies like SambaNova, Cerebras, Groq, or Google/Amazon accelerator teams
  • Quantitative degree (mathematics, physics, or similar) combined with engineering experience

What we offer

  • Competitive base salary with equity in a unicorn-stage company
  • 100% of medical, dental, and vision premiums for employees and dependents
  • 401(k) matching up to 4% of base pay
  • Unlimited PTO plus company-wide Refill Days throughout the year

Similar Jobs for Member of Technical Staff - Edge Inference Engineer

8 matching positions

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location
United States, Chevy Chase; New York City; Palo Alto
Salary
115000.00 USD / Year
Geico
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation, a 401(k) savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Full-time

Staff Software Engineer - AI/ML Platform

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location
United States, Chevy Chase; New York City; Palo Alto
Salary
115000.00 USD / Year
Geico
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation, a 401(k) savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Full-time

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location
United States, Palo Alto
Salary
90000.00 USD / Year
Geico
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation, a 401(k) savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Full-time

Member of Technical Staff, Synthetic Data

As a Machine Learning Engineer specializing in synthetic data, you will play a p...
Salary
Not provided
Cohere
Expiration Date
Until further notice
Requirements
  • Strong software engineering skills, with proficiency in Python and experience building data pipelines
  • Familiarity with data processing frameworks such as Apache Spark, Apache Beam, Pandas, or similar tools
  • Experience working with LLMs through work projects, open-source contributions or personal experimentation
  • Familiarity with LLM inference frameworks such as vLLM and TensorRT
  • Experience working with large-scale datasets, including web data, code data, and multilingual corpora
  • A passion for bridging research and engineering to solve complex data-related challenges in AI model training
Job Responsibility
  • Design and build scalable inference pipelines that run on large GPU clusters
  • Conduct data ablations to assess data quality and experiment with data mixtures to enhance model performance
  • Research and implement innovative synthetic data curation methods, leveraging Cohere’s infrastructure to drive advancements in natural language processing
  • Collaborate with cross-functional teams, including researchers and engineers, to ensure data pipelines meet the demands of cutting-edge language models
What we offer
  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)
  • Full-time

Paid Search Manager

Location
United States, Newton; Newark
Salary
Not provided
80Twenty
Expiration Date
Until further notice
Requirements
  • 4+ years managing paid search at scale in acquisition-focused environments, with direct ownership of strategy and execution across Google Ads and Microsoft Advertising
  • Demonstrated experience managing $1M+ annual paid search budgets with consistent efficiency discipline under growth targets
  • Strong lead generation background with a focus on driving volume at efficient acquisition costs
  • Experience optimizing toward LTV or value-based metrics, with a solid understanding of marginal economics and incremental return evaluation
  • Advanced performance analysis skills, including forecasting, planning, and attribution across the lead generation funnel
  • Proactive approach to performance management, with a track record of identifying issues independently and acting without waiting to be directed
  • Strong understanding of automation-first campaign structures, including Smart Bidding and Performance Max
  • Experience with Performance Max, YouTube, or display advertising is a plus
  • Background in lead generation or marketing in the education or financial services vertical is a plus
  • Exposure to landing page testing, CRO, or SEO strategies is a plus
Job Responsibility
  • Own paid search strategy and execution across Google Ads and Microsoft Advertising, driving profitable lead volume across products and lifecycle stages
  • Lead execution across Search, Performance Max, YouTube, and remarketing, leveraging value-based bidding and structured signal inputs to maximize long-term return
  • Manage budget allocation and automated bidding strategies across a $1M+ annual paid search investment, with a focus on marginal efficiency and ROAS performance
  • Build and maintain scalable account structures aligned to acquisition, engagement, and revenue goals
  • Develop keyword, query, and audience expansion frameworks to drive incremental growth
  • Design and execute structured test-and-learn roadmaps across campaign types, ad formats, creative, targeting, bidding, and landing pages
  • Own paid search forecasting and investment planning aligned to business targets, tracking current run rate through 12 to 18 months out
  • Ensure accurate tracking, value signals, audience inputs, and attribution frameworks are properly configured to drive platform performance
  • Analyze performance data, identify trends, and implement ongoing optimizations to improve efficiency, scale, and conversion outcomes
  • Monitor campaign performance proactively, identify variances, and surface recommendations before issues compound
  • Full-time

Overhaul Engineer

As an Overhaul Engineer, you will play a critical role in supporting the overhau...
Location
United Kingdom, Manchester
Salary
Not provided
Morson Talent
Expiration Date
Until further notice
Requirements
  • HNC/HND (Level 4/5) in Engineering or above
  • Level 3 Engineering qualification and significant relevant experience
  • Proven experience within maintenance and/or production engineering
  • Rail industry experience highly desirable
  • Experience working with heavy industrial equipment such as: Lathes, Presses, Cranes, Blasting equipment
  • Strong planning, organisational, and technical documentation skills
  • Familiarity with structured problem-solving tools such as: 5 Whys, 5W2H
  • Knowledge of Lean Manufacturing, Six Sigma, TPM, or Autonomous Maintenance would be advantageous
  • Strong IT skills with the ability to quickly learn new systems and software platforms
Job Responsibility
  • Deliver production and maintenance targets through effective engineering support
  • Ensure compliance with OEM recommendations, procedures, and industry regulations
  • Support equipment installation, commissioning, and maintainability reviews
  • Conduct trials and demonstrations to identify design and documentation improvements
  • Drive continuous improvement across maintenance and production activities
  • Support facility readiness, system integration, and overhaul process development
  • Forecast materials and external service requirements
  • Produce and maintain technical documentation, procedures, scopes of work, and BOMs
  • Carry out root cause analysis and implement corrective actions
  • Monitor material usage and contribute to budget reporting
  • Full-time

AI Engineer

Location
India, Bangalore
Salary
Not provided
NTT DATA
Expiration Date
Until further notice
Requirements
  • Strong proficiency in Python for ML pipelines, model serving, automation, and framework development
  • Hands-on experience with Apache Spark / PySpark for large-scale data processing, feature engineering, and batch inference
  • Proven experience building and operating end-to-end ML pipelines (training, validation, packaging, deployment, monitoring)
  • Experience with config-driven pipelines and reusable ML frameworks
  • Experience deploying real-time models exposed as REST APIs
  • Hands-on knowledge of HTTP-based model serving (FastAPI, Flask, custom model servers, etc.)
  • Strong understanding of low-latency requirements, performance tuning, and scaling real-time inference services
  • Knowledge of streaming technologies such as Kafka, Pub/Sub, Spark Streaming, Logstash, or similar

Growth Marketing Manager

Our client is looking for Growth Marketing Manager to help build and scale our a...
Location
United States
Salary
150000.00 USD / Year
80Twenty
Expiration Date
Until further notice
Requirements
  • 4-7+ years of experience in performance and growth marketing
  • Deep expertise in paid digital (SEM, Social), SEO, and conversion rate optimization
  • Strong analytical and quantitative skills, with the ability to work comfortably in spreadsheets and dashboards, analyze performance data, and translate insights into clear actions
  • Demonstrated experience owning funnel metrics and unit economics (ROAS, CAC, CPA, conversion rates), and using data to inform budget allocation, optimization, and prioritization
  • Understanding of attribution and measurement frameworks and how to QA attribution systems to improve data fidelity and funnel reporting capabilities
  • Experience breaking down ambiguous problems, structuring analyses, and developing pragmatic solutions using a first-principles mindset
  • Ability to lead cross functional projects (e.g. with Sales, Finance, Operations, and external agencies), create alignment and clarity, and move projects forward to drive measurable outcomes
  • Strong communication skills, with the ability to clearly articulate data-driven insights, tradeoffs, and recommendations to both technical and non-technical stakeholders
  • High ownership mentality – you're comfortable operating in environments with limited structure, taking initiative, and driving work forward
Job Responsibility
  • Drive scalable growth by building and optimizing acquisition and lifecycle programs and strategies tied directly to funnel performance, unit economics, revenue growth, and profitability
  • Take the lead on launching new growth channels or finding ways to scale existing ones
  • Own funnel performance and unit economic reporting (ROAS, CAC, conversion rates) – identify & help resolve data infrastructure gaps, and build reporting & analytics that enable faster, more informed decision-making across growth, sales, finance, and executive leadership
  • Continuously optimize our funnel end-to-end by identifying conversion bottlenecks, designing data-driven experiments, applying technology and process, and partnering with cross functional teams to drive higher conversion
  • Manage team and agencies to drive performance across PPC, paid social, and organic channels, occasionally digging into the numbers yourself to verify, uncover insights, and develop actionable recommendations for optimization
  • Develop and implement go-to-market plans for new market launches, including local search and paid lead generation expansion, and playbooks for post-brand acquisition
  • Establish, improve, and QA attribution tracking, UTMs, and measurement frameworks to increase both the fidelity of our reporting and decision making quality
  • Partner with lifecycle and provide direction to improve conversion and drive customer re-engagement
  • Work cross functionally with Sales, Finance, and Operations to design and implement strategies and solutions to improve funnel performance and drive growth and profitability
What we offer
  • $150,000+ salary
  • 10% Compensation
  • 401(k)
  • Generous Paid Time Off (PTO) and paid holidays
  • Medical, dental, and vision insurance, and more
  • Full-time