Audio Inference Engineer, Model Efficiency

Cohere

Location:
United States; Canada, New York

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Our team is a fast-growing group of committed researchers and engineers. Its mission is to build reliable machine learning systems and optimize audio inference serving efficiency using innovative techniques. As an engineer on this team, you will advance core audio model serving metrics, including latency, throughput, and quality, by diving deep into our systems, identifying bottlenecks, and delivering creative solutions for audio processing and streaming workloads. You’ll collaborate closely with both the training and serving infrastructure teams to ensure seamless integration between model development and deployment, with a special focus on real-time and streaming audio inference.

Job Responsibility:

  • Advance core audio model serving metrics, including latency, throughput, and quality, by diving deep into our systems, identifying bottlenecks, and delivering creative solutions for audio processing and streaming workloads
  • Collaborate closely with both the training and serving infrastructure teams to ensure seamless integration between model development and deployment, with a special focus on real-time and streaming audio inference

Requirements:

  • Significant experience developing high-performance audio or machine learning inference systems
  • Proficiency with programming languages such as C++ and Python
  • Hands-on experience with deep learning models for audio, speech, or language applications
  • A bias for action and a strong results-oriented mindset

Nice to have:

  • GPU programming, low-level system optimization, and model parallelization techniques across multiple GPUs
  • Experience with duplex real-time streaming architectures
  • Internals of machine learning frameworks for audio (such as PyTorch, TensorFlow, or specialized audio libraries)
  • Experience with inference frameworks such as vLLM, SGLang, TensorRT-LLM, or custom distributed inference systems
  • Sequence modeling (e.g., transformers for audio/speech) and end-to-end audio pipeline optimization

What we offer:

  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)

Additional Information:

Job Posted:
February 20, 2026

Employment Type:
Full-time
Work Type:
Remote work

Similar Jobs for Audio Inference Engineer, Model Efficiency

Senior Inference ML Runtime Engineer

The Inference ML Engineering team at Cerebras Systems is dedicated to enabling o...
Location:
United States; Canada, Sunnyvale; Toronto
Salary:
Not provided
Cerebras Systems
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, Mathematics, or a related field
  • 8+ years of experience in large-scale software engineering, with a focus on deep learning or related domains
  • Proficiency in Python for building and maintaining scalable systems
  • Advanced proficiency in C++, with an emphasis on multi-threaded programming, performance optimization, and system-level development
  • Demonstrated experience driving cross-functional projects
  • Experience building and scaling large-scale inference systems for LLMs or multimodal models
  • Familiarity with LLM serving frameworks, such as vLLM, SGLang, and TensorRT-LLM
  • Solid understanding of software architectural patterns for large-scale, high-performance applications
  • Hands-on experience with ML frameworks, such as PyTorch, and a strong understanding of their underlying architectures
  • Strong problem-solving skills, with the ability to balance technical depth with practical implementation constraints
Job Responsibility:
  • Drive and provide technical guidance to a team of software engineers working on complex machine learning integration projects
  • Design and implement ML features (e.g., structured outputs, biased sampling, predicted outputs) that improve performance of generative AI models at inference time
  • Design and implement high-throughput, low-latency multimodal inference models that support delivery of image, audio, and video inputs and outputs
  • Maintain our scalable serving backend for handling many concurrent requests per minute
  • Scale our inference service by implementing detailed observability throughout the entire stack
  • Analyze and improve latency, throughput, memory usage, and compute efficiency on the service and the implementation of various features
  • Optimize software to accelerate generative LLM inference by achieving high throughput and low latency
  • Stay up-to-date with advancements in machine learning and deep learning, and apply state-of-the-art techniques to enhance our solutions
  • Evaluate trade-offs between different approaches, clearly articulate design choices, and develop detailed proposals for implementing new features
  • Uncover, scope, and prioritize significant areas of technical debt across the software stack to ensure continued high quality of the inference service
What we offer:
  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open source their cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • Our simple, non-corporate work culture that respects individual beliefs

Research Scientist Intern, Real-Time Multimodal AI

Reality Labs is building the future of connection through world-class AR/VR hard...
Location:
United States, Burlingame
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a PhD degree in Computer Science, Machine Learning, Electrical Engineering, or a related field
  • 2+ years of research experience in one or more of the following areas: multimodal learning, vision-language models, large language models, or foundation model fine-tuning
  • Hands-on experience fine-tuning large foundation models (e.g., LLaVA, InternVL, Qwen-VL, LLaMA, or similar)
  • Strong programming skills in Python
  • Experience with deep learning frameworks such as PyTorch
  • Excellent communication skills and ability to work independently
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility:
  • Research and develop novel approaches for fine-tuning large multimodal foundation models (vision-language, audio-visual) for real-time applications
  • Design and implement efficient inference pipelines for deploying fine-tuned models in real-time communication scenarios
  • Explore agentic architectures that leverage fine-tuned models as tools within larger AI systems
  • Collaborate with cross-functional teams to integrate models into prototype experiences
  • Document and present research progress with the goal of publishing findings at top-tier ML/CV conferences
  • Contribute to building working prototypes that demonstrate the capabilities of fine-tuned multimodal models

Research Engineer, RealTime AI, MSL PAR

We are seeking research engineers to join the Product and Applied Research (PAR)...
Location:
United States, Bellevue, WA
Salary:
257000.00 USD / Year
Meta
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 2+ years of industry experience in LLM/NLP, audio, or related AI/ML models
  • Experience as a formal technical lead, leading major technical initiatives with cross functional partners to impact, and/or influencing strategy across multiple teams
  • Skilled in model training, data, or inference & efficiency for LLMs
  • Experience building products/systems based on machine learning, reinforcement learning and/or deep learning methods
  • Programming experience in Python and hands-on experience with frameworks like PyTorch
Job Responsibility:
  • Collaborate with cross-functional teams to develop Meta’s AI Characters products
  • Lead the development of new algorithms and systems for LLM post-training, evaluation and efficiency
  • Support creative data sourcing, high-quality post-training data curation, and scale and optimize data pipelines for large language models (LLMs)
  • Develop and integrate models, orchestrations, and RAGs in production
  • Analyze and interpret experimental results, iterate on model architectures, and drive continuous improvement
  • Lead complex technical projects end-to-end
What we offer:
  • bonus
  • equity
  • benefits

Senior Data Scientist

We are seeking a Senior Data Scientist with deep expertise in unstructured data ...
Location:
Taiwan
Salary:
Not provided
Beyond Limits
Expiration Date:
Until further notice
Requirements:
  • 5+ years of hands-on experience in AI, Machine Learning, and Data Science, with a strong focus on production-scale AI
  • Expertise in LLMs, including fine-tuning, distributed training, quantization, and pruning techniques
  • Experience working with OCR, ASR, and TTS applications in real-world deployments
  • Proven experience deploying AI models in production, with real-world examples of scaled AI applications
  • Strong understanding of cloud computing, containerization (Docker, Kubernetes), and ML Ops best practices
  • Proficiency in Python, PyTorch, and ML libraries
  • Hands-on experience with vector databases and retrieval-augmented generation (RAG) architectures
  • Strong awareness of AI system performance benchmarks (latency, speed, throughput) and ability to optimize models accordingly
  • Experience working with AI agents, designing real-world intelligent automation solutions beyond just open-source experimentation
  • Proficiency in transformer-based architectures (BERT, GPT, LLaMA, Whisper, etc.), including pre-training, fine-tuning, and task-specific adaptation
Job Responsibility:
  • Develop and deploy AI models for unstructured data (text, speech, audio, images) with a focus on enterprise-scale performance
  • Fine-tune, optimize, and deploy LLMs and multimodal models, integrating distributed training, quantization, and pruning techniques for efficiency
  • Design and implement production-ready AI solutions, ensuring scalability, low-latency inference, and high throughput
  • Work with AI agents and automation frameworks to create intelligent, real-world AI applications for enterprise clients
  • Build and maintain end-to-end LLM Ops pipelines, ensuring efficient training, deployment, monitoring, and model updates
  • Implement vector search and retrieval-augmented generation (RAG) systems for large-scale data solutions
  • Monitor AI performance using key metrics such as speed, latency, and throughput, continuously refining models for real-world efficiency
  • Work with cloud-based AI infrastructure (AWS, GCP) and containerized environments (Docker, Kubernetes) to scale AI solutions
  • Collaborate with engineering, DevOps, and product teams to align AI solutions with business needs and client requirements
  • Implement data curation pipelines, including data collection, cleaning, deduplication, decontamination, etc. for training high-quality AI models


Pharmacy Technician

We’re building a world of health around every individual — shaping a more connec...
Location:
United States, Blaine
Salary:
17.00 - 27.00 USD / Hour
CVS Health
Expiration Date:
April 21, 2026
Requirements:
  • Must comply with any state board of pharmacy requirements or laws governing the practice of pharmacy, which includes but is not limited to, age, education, and licensure/certification
  • If the state board of pharmacy does not address or mandate a minimum age requirement, must be at least 16 years of age
  • If the state board of pharmacy does not address or mandate a minimum educational requirement, must have a high school diploma or equivalent, or be actively enrolled in high school or high school equivalency program
  • Regular and predictable attendance, including nights and weekends
  • Ability to complete required training within designated timeframe
  • Attention and Focus: Ability to concentrate on a task over a period of time
  • Ability to pivot quickly from one task to another to meet patient and business needs
  • Ability to confirm prescription information and label accuracy, ensuring patient safety
  • Customer Service and Team Orientation: Actively look for ways to help people, and do so in a friendly manner
  • Notice and understand patients’ reactions, and respond appropriately
Job Responsibility:
  • Living our purpose by following all company SOPs at each workstation to help our Pharmacists manage and improve patient health
  • Following pharmacy workflow procedures at each pharmacy workstation (i.e., production, pick-up, drive-thru, and drop-off) for safe and accurate prescription fulfillment
  • Contributing to positive patient experiences by showing empathy and genuine care: creating heartfelt and personalized moments while serving patients at pick-up, drive-thru, and over the phone; keeping patients healthy by offering immunizations and other services at the register and over the phone; and demonstrating compassionate care by solving or escalating patient problems
  • Completing basic inventory activities, as permitted by law, and as directed by the pharmacy leadership team, such as accurately putting away medication deliveries and completing cycle counts, returns-to-stocks, waiting bin inventories, etc.
  • Contributing to a high-performing team, embracing a growth mindset, and being receptive to feedback; actively seeking opportunities to expand clinical and technical knowledge needed to better assist patients
  • Remaining flexible for both scheduling and business needs, while contributing to a safe, inclusive, and engaging team dynamic; voluntarily traveling to stores in the market to work shifts as needed by the business
What we offer:
  • Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
  • Benefit solutions that address the different needs and preferences of our colleagues including paid time off, flexible work schedules, family leave, dependent care resources, colleague assistance programs, tuition assistance, retiree medical access and many other benefits depending on eligibility
  • Full-time

Distribution Training Manager

As the Distribution Training Manager, you will be a strategic partner to the Dis...
Location:
Hong Kong, Hong Kong
Salary:
Not provided
Randstad
Expiration Date:
March 19, 2026
Requirements:
  • University degree in any discipline
  • Minimum of 8 years in training with at least 3 years in a management capacity specifically within the life insurance industry (Agency Management or Sales Training)
  • Excellent command of both spoken and written English and Chinese
  • Proficient in MS Office (Word, Excel, PowerPoint)
  • At least three years in agency training is preferred
  • A financial services background would be a strong advantage
  • Professional insurance qualifications (FLMI, IIQA, etc.) are highly preferred
Job Responsibility:
  • Develop long-term training plans, strategies, and policies specifically for Agency Managers
  • Design and revamp comprehensive training curricula for agency managers across all levels (Junior, Middle, and Senior)
  • Lead the delivery of high-impact training programs and continuously update methodologies
  • Supervise and coach the training team, monitoring project progress and fostering professional development
  • Lead high-profile initiatives, including AI training projects and the implementation/optimization of the Learning Management System (LMS)
  • Oversee CPD accreditation, maintain training tracking reports, and manage audit or regulatory review projects
  • Lead the team in participating in industry award competitions to showcase the firm’s training excellence

Senior Marine Claims Handler

This is an individual contributor opportunity for an experienced professional to...
Location:
Singapore, Singapore River
Salary:
8000.00 - 9000.00 SGD / Month
Randstad
Expiration Date:
March 22, 2026
Requirements:
  • Extensive background in claims management in Marine insurance (ideally Marine Cargo)
  • Advanced proficiency in navigating legal principles and interpreting complex manuscript policy wordings
  • Strong commercial acumen paired with highly developed negotiation and relationship-building capabilities
  • Decisive problem-solving skills with a proven track record of managing a successful portfolio of results-driven outcomes
  • Excellent command of digital tools and database systems for managing technical workflows and stakeholder communications
Job Responsibility:
  • Manage the full life cycle of high-value Marine insurance claims
  • Lead the end-to-end investigation and strategic settlement of a complex portfolio of Marine claims
  • Negotiate high-level outcomes with a wide range of external partners including brokers, legal providers and co-insurers
  • Partner with internal divisions such as underwriting and risk engineering to provide technical insights and support account retention efforts
What we offer:
  • Bonus