Voice-Capture Device

M*Modal, an independent pioneer in cloud speech recognition and transcription for healthcare, originally asked us to develop a purpose-built ambient-audio capture device to streamline EHR workflows. The result was a fully functional, field-ready system that anticipated the expanding role of AI in clinical documentation. At the project’s successful conclusion, M*Modal was acquired by 3M. Today the hardware lives on as the Solventum™ Ambient Device, the room-capture endpoint that powers Fluency Align—Solventum’s ambient clinical-note service.

Our interdisciplinary team led industrial design, mechanical and electrical engineering, firmware, and acoustic integration. A key engineering thrust was ultra-low-power USB operation (deep sleep, wake-on-sound, and automatic power-down detection), so the unit can sit idle yet wake instantly when a clinician speaks or taps the surface. Core interaction elements include one-touch start/stop, dynamic LED feedback, and a sealed five-element MEMS microphone array protected by hydrophobic filters.
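
The sleep/wake behavior described above can be pictured as a small state machine. The sketch below is purely illustrative and is not the shipped firmware: the state names, event names, and transition table are all assumptions chosen to mirror the prose (deep sleep, wake on sound or tap, and a response to the host powering down).

```python
from enum import Enum, auto


class State(Enum):
    DEEP_SLEEP = auto()    # idle, drawing minimal power
    ACTIVE = auto()        # capturing audio
    POWERED_DOWN = auto()  # host has gone away; device shuts down


class Event(Enum):
    SOUND = auto()            # hypothetical wake-on-sound trigger
    TAP = auto()              # hypothetical surface-tap trigger
    IDLE_TIMEOUT = auto()     # hypothetical "no activity" timer
    HOST_POWER_DOWN = auto()  # hypothetical power-down detection
    HOST_POWER_UP = auto()    # hypothetical host return


# Assumed transition table; any (state, event) pair not listed is ignored.
TRANSITIONS = {
    (State.DEEP_SLEEP, Event.SOUND): State.ACTIVE,
    (State.DEEP_SLEEP, Event.TAP): State.ACTIVE,
    (State.ACTIVE, Event.IDLE_TIMEOUT): State.DEEP_SLEEP,
    (State.DEEP_SLEEP, Event.HOST_POWER_DOWN): State.POWERED_DOWN,
    (State.ACTIVE, Event.HOST_POWER_DOWN): State.POWERED_DOWN,
    (State.POWERED_DOWN, Event.HOST_POWER_UP): State.DEEP_SLEEP,
}


def next_state(state: State, event: Event) -> State:
    """Return the state after one event; unknown pairs leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state: State = State.DEEP_SLEEP) -> State:
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = next_state(state, event)
    return state
```

For example, run([Event.SOUND, Event.IDLE_TIMEOUT]) starts asleep, wakes on sound, and returns to deep sleep once the assumed idle timer fires. A table-driven design like this keeps every permitted transition visible in one place, which is one common way to reason about low-power behavior.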

After verifying alpha prototypes built from off-the-shelf components, we developed multiple custom electromechanical architectures to refine performance, acoustics, and manufacturability. In parallel, we produced a PoE-enabled network-appliance variant that can either stream encrypted audio to Solventum’s cloud or, where policy requires, run recognition workloads locally.

The final design balanced technical performance with cleanability and near-invisibility in the exam room: a compact, disinfectant-safe enclosure that can sit on a desk or mount flush to the wall, powered by a single USB-C or PoE cable. A limited pilot run of fully functional units—assembled and documented in-house by Daedalus—validated the hardware and integrated seamlessly with the client’s software stack.

Under Solventum, the Ambient Device now anchors a fast-growing Fluency Align services business, strengthened by partnerships with Epic, Amazon Web Services, and other ecosystem vendors.