Smoothed Frame-Level SINR and Its Estimation for Sensor Selection in Distributed Acoustic Sensor Networks
Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
In this work, we propose a speaker-dependent smoothed frame-level SINR estimation method for sensor selection in multi-speaker scenarios, designed to handle source movement within distributed acoustic sensor networks (DASNs). We also devise a similarity measurement approach that generates dynamic speaker embeddings robust to variations in the level of the reference speech. Finally, we introduce a novel loss function that integrates classification and ordinal regression within a unified framework.
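As a rough illustration of the quantity involved, the sketch below computes a frame-level SINR per sensor, smooths it across frames, and picks the best sensor in each frame. It is a generic oracle computation, not the estimator proposed in the paper: it assumes the target and the interference-plus-noise components are known separately at every sensor, and the function names, the frame length, the hop size, and the first-order recursive smoothing with factor `alpha` are all assumptions made for this example.

```python
import numpy as np

def frame_sinr_db(target, interference, frame_len=400, hop=200, eps=1e-12):
    """Per-frame SINR in dB from separately known target and
    interference-plus-noise signals (oracle setting, assumed here)."""
    n_frames = 1 + (len(target) - frame_len) // hop
    sinr = np.empty(n_frames)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + frame_len)
        p_target = np.mean(target[seg] ** 2)
        p_interf = np.mean(interference[seg] ** 2)
        sinr[i] = 10.0 * np.log10((p_target + eps) / (p_interf + eps))
    return sinr

def smooth(x, alpha=0.9):
    """First-order recursive smoothing across frames (an assumed smoother)."""
    y = np.empty_like(x)
    y[0] = x[0]
    for i in range(1, len(x)):
        y[i] = alpha * y[i - 1] + (1.0 - alpha) * x[i]
    return y

def select_sensor(targets, interferences, alpha=0.9):
    """For each frame, return the index of the sensor with the highest
    smoothed SINR, together with the smoothed SINR tracks."""
    smoothed = np.stack([
        smooth(frame_sinr_db(t, v), alpha)
        for t, v in zip(targets, interferences)
    ])
    return np.argmax(smoothed, axis=0), smoothed

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 16000
    # Sensor 0 is close to the talker (strong target); sensor 1 is far away.
    targets = [rng.standard_normal(n), 0.1 * rng.standard_normal(n)]
    interferences = [0.1 * rng.standard_normal(n), 0.1 * rng.standard_normal(n)]
    picks, _ = select_sensor(targets, interferences)
    print(picks[:10])
```

Smoothing trades responsiveness for stability: a larger `alpha` suppresses frame-to-frame fluctuations in the selection but reacts more slowly when a source moves.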
Recommended citation: S. Guan, M. Wang, Z. Bai, J. Wang, J. Chen and J. Benesty, "Smoothed Frame-Level SINR and Its Estimation for Sensor Selection in Distributed Acoustic Sensor Networks," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 4554-4568, 2024, doi: 10.1109/TASLP.2024.3477277. https://ieeexplore.ieee.org/document/10711254