Despite their highly accurate predictions in clean environments, keyword spotting systems lose substantial accuracy in noisy environments, where complex noises corrupt the input signal. We propose a domain-adaptation methodology to tackle accuracy losses caused by on-site environmental noises. Our approach refines keyword spotting models on prerecorded speech data augmented with noise samples collected on-site. With this approach, we measure accuracy improvements of 15% over noise-augmented pretrained models and 25% over noiseless keyword spotters. The adaptation runs completely on-device, leveraging backpropagation-based learning on the ten-core ultra-low-power GAP9 platform. With a memory adaptation cost of 10 kB, our specialised models outperform larger, already-robust baseline networks.
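The augmentation step described above, prerecorded speech combined with noise recorded on-site, can be illustrated with a minimal sketch. The abstract does not say how the two signals are mixed, so the fixed signal-to-noise ratio (SNR) mixing below is an assumption, as are the helper names `rms` and `mix_at_snr`, the 16 kHz sampling rate, the 5 dB target, and the synthetic tone and Gaussian noise that stand in for real recordings.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the mixture has the requested SNR in dB.

    Hypothetical helper: the source does not specify the mixing procedure.
    """
    # Repeat the (usually shorter) noise clip, then crop it to the speech length.
    reps = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * reps)[:len(speech)]

    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise clip is silent; cannot scale it to a target SNR")

    # Choose gain g so that 20 * log10(rms(speech) / (g * rms(noise))) == snr_db.
    gain = rms(speech) / (noise_rms * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


# Stand-in data (assumed): one second of a 440 Hz tone at 16 kHz as "speech",
# and a quarter second of Gaussian noise as the "on-site" recording.
random.seed(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [random.gauss(0.0, 1.0) for _ in range(4000)]

noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

In an adaptation loop of the kind the abstract describes, mixtures like `noisy` would replace the clean utterances as training inputs while the keyword labels stay unchanged. The SNR range that suits a particular site is not given in the source and would have to be chosen empirically.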
On-Device Domain Adaptation for Noise-Robust Keyword Spotting
Cristian CIOFLAN, Doctoral Student
ETH Zürich
Cristian Cioflan received his B.Sc. degree in Electrical Engineering and Information Technology from University Politehnica of Bucharest in 2018 and his M.Sc. degree from the Swiss Federal Institute of Technology Zürich (ETHZ) in 2020. He is currently pursuing a Ph.D. in the Digital Circuits and Systems group of Prof. Luca Benini. His research interests include audio processing in low-power embedded systems, on-device continual learning, and neural architecture search for energy-efficient learning.