Cairns Health is building an AI-powered care companion that seniors interact with entirely through voice. We’re looking for a highly skilled engineer to own and evolve the real-time audio and speech processing pipeline that makes these conversations feel natural, reliable, and responsive on embedded hardware. This role is ideal for someone with a strong foundation in C++ on embedded Linux and deep hands-on experience with audio signal processing for speech, whether your background is firmware-heavy with ML exposure or ML-leaning with strong systems skills.
Job Responsibilities:
Design and implement real-time streaming of speech audio to and from the OpenAI Realtime API
Build and tune audio buffering, latency management, and synchronization for conversational speech
Implement speech interruption detection (barge-in) to support natural, turn-based dialogue
Develop dynamic noise floor detection and related signal conditioning for in-home environments
Apply practical audio signal processing and ML techniques to improve speech quality and robustness
Evaluate and potentially re-architect our Linux audio stack (e.g., PulseAudio → PipeWire)
Optimize performance, memory usage, and reliability on constrained embedded devices
Collaborate closely with firmware, ML, and hardware teams to ship production-quality systems
Requirements:
Strong proficiency in C++ with experience building production, real-time systems
Hands-on experience with audio signal processing for speech, such as audio buffering and streaming, noise estimation/suppression, and voice activity detection or interruption handling
Experience developing on embedded Linux (Yocto preferred)
Solid understanding of multi-threaded, low-latency systems
Comfortable working close to the OS and audio stack
Nice to have:
Experience integrating with speech or conversational AI systems
Familiarity with ML tools or models used in audio/speech processing
Experience with PipeWire, PulseAudio, ALSA, or similar Linux audio frameworks
Background in embedded firmware, device bring-up, or kernel-adjacent development