Real-time speech source separation in a dynamic environment poses significant challenges for wearable augmented reality (AR) devices due to moving sources, head rotations, and adverse acoustic conditions. This seminar presents a robust bilinear framework that integrates minimum power distortionless response (MPDR) beamforming with weighted prediction error (WPE) dereverberation.
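To give a feel for the WPE component, the sketch below runs delayed linear prediction on a single synthetic STFT bin: late reverberation is predicted from delayed frames and subtracted, alternating between a per-frame variance estimate and a weighted least-squares filter fit. The tap count, prediction delay, and synthetic data are illustrative assumptions, not the seminar's bilinear implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
T, K, delay = 400, 10, 3      # frames, filter taps, prediction delay (assumptions)
# one STFT frequency bin over time (synthetic stand-in for a reverberant signal)
x = rng.standard_normal(T) + 1j * rng.standard_normal(T)

d = x.copy()
for _ in range(3):            # alternate variance and filter estimation (WPE iterations)
    lam = np.maximum(np.abs(d) ** 2, 1e-6)     # per-frame power estimate
    # delayed-tap matrix: row t holds x[t-delay], ..., x[t-delay-K+1]
    Xb = np.zeros((T, K), complex)
    for k in range(K):
        tap = np.roll(x, delay + k)
        tap[: delay + k] = 0
        Xb[:, k] = tap
    # weighted least squares for the prediction filter g (small ridge for stability)
    W = Xb.conj().T / lam
    g = np.linalg.solve(W @ Xb + 1e-6 * np.eye(K), W @ x)
    d = x - Xb @ g            # dereverberated bin: predicted late reverb removed
```

The prediction delay keeps the direct path and early reflections untouched, so only late reverberation is modeled and subtracted.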
By decoupling spatial and temporal filtering, we enable efficient recursive least squares (RLS) adaptation that tracks changes in the acoustic scene. To further improve robustness against steering vector errors caused by direction-of-arrival (DOA) mismatches, we introduce region-of-interest (ROI) beamforming.
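As context for the adaptive spatial filter, a minimal sketch of MPDR weights with an RLS-style (Sherman-Morrison) update of the inverse spatial covariance might look as follows. The array size, forgetting factor, steering vector, and synthetic snapshots are illustrative assumptions, not the seminar's actual system.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4                                 # microphones (illustrative)
d = np.ones(M, dtype=complex)         # assumed steering vector (broadside look)
lam = 0.98                            # RLS forgetting factor (assumption)
P = np.eye(M, dtype=complex) * 1e2    # inverse spatial covariance estimate R^{-1}

for _ in range(200):                  # stream of narrowband snapshots (synthetic)
    x = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    # Sherman-Morrison update of P = R^{-1} under exponential forgetting:
    # R_t = lam * R_{t-1} + x x^H
    Px = P @ x
    k = Px / (lam + np.conj(x) @ Px)  # RLS gain vector
    P = (P - np.outer(k, np.conj(Px))) / lam

# MPDR weights: w = R^{-1} d / (d^H R^{-1} d)
w = P @ d / (np.conj(d) @ P @ d)
print(abs(np.conj(d) @ w))            # distortionless constraint toward d: ≈ 1.0
```

The recursive update avoids re-inverting the covariance at every frame, which is what makes frame-rate tracking of a moving scene feasible.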
Additionally, we present a linearly constrained minimum power (LCMP) extension that enables flexible spatial control, e.g., imposing multiple gain or null constraints. A comprehensive analysis of the framework and its ROI and LCMP extensions, validated on real-world recordings from the SPEAR dataset, establishes a practical and efficient solution for real-time audio enhancement in wearable AR systems.
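The LCMP idea admits a compact closed form, w = R^{-1} C (C^H R^{-1} C)^{-1} g, sketched below with an assumed two-column constraint matrix (unit gain toward a target direction, a null toward an interferer) on a toy uniform linear array; the geometry and covariance are synthetic illustrations only.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 6                                   # microphones (illustrative ULA)
# assumed far-field steering vectors for a target and an interferer
theta_t, theta_i = 0.0, 0.6             # directions in radians (assumptions)
n = np.arange(M)
d_t = np.exp(-1j * np.pi * n * np.sin(theta_t))
d_i = np.exp(-1j * np.pi * n * np.sin(theta_i))

C = np.stack([d_t, d_i], axis=1)        # constraint matrix (M x 2)
g = np.array([1.0, 0.0])                # unit gain on target, null on interferer

X = rng.standard_normal((M, 500)) + 1j * rng.standard_normal((M, 500))
R = X @ X.conj().T / 500 + 1e-3 * np.eye(M)   # regularized sample covariance

# LCMP weights: w = R^{-1} C (C^H R^{-1} C)^{-1} g
RinvC = np.linalg.solve(R, C)
w = RinvC @ np.linalg.solve(C.conj().T @ RinvC, g)
print(np.abs(C.conj().T @ w))           # ≈ [1, 0]: both constraints satisfied
```

With a single constraint column, C = d and g = 1, the expression reduces to the MPDR weights, which is why LCMP is a natural extension for adding spatial control.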
M.Sc. student under the supervision of Prof. Israel Cohen