[CS.AI] FAConformer: Frequency-Aware CNN-Transformer for Auditory Attention Decoding

Published at: 2026-06-16 22:00
Last updated: 2026-06-17 01:38
#AI #Machine Learning #Neural

Abstract

Auditory attention decoding (AAD) aims to infer the attended speaker from neural responses in multi-speaker acoustic environments and is a key problem for neuro-steered hearing systems. Although recent studies have made encouraging progress, existing AAD models still do not fully exploit frequency-domain electroencephalography (EEG) information. Most approaches introduce multi-band information through handcrafted feature extraction or direct cross-band feature concatenation. They therefore exploit frequency information only at a shallow level and may overlook band-specific patterns and cross-band interactions.

To address these limitations, this paper proposes FAConformer, a frequency-aware CNN-Transformer framework for AAD that explicitly integrates band-specific encoding and adaptive cross-band interaction. Specifically, FAConformer first decomposes EEG signals into multiple frequency bands and assigns each band to an independent CNN-Transformer encoder for band-specific modeling. The resulting band-wise features are then adaptively fused by a frequency-aware attention (FAA) module that treats the band-wise features as tokens and models the dependencies between them. In addition, band-wise auxiliary supervision (BAS) is introduced to prevent weakly contributing branches from being under-optimized during joint training. In this way, FAConformer performs frequency-aware modeling that exploits frequency-domain information more effectively.
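
To make the "band-wise features as tokens" idea and the auxiliary supervision more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the single-head attention with identity projections, the mean pooling, the linear heads, the binary attended-speaker label, and the names `band_token_attention`, `joint_loss`, and `aux_weight` are all assumptions made for illustration. The per-band feature vectors stand in for the outputs of the band-specific CNN-Transformer encoders, which are not modeled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def band_token_attention(tokens):
    """Self-attention over band tokens (assumed: one head, identity projections).

    tokens: list of B feature vectors, one per frequency band, each of length D.
    Returns (fused, weights): a length-D fused feature obtained by mean-pooling
    the attended tokens, and the B x B attention matrix.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights, attended = [], []
    for query in tokens:
        row = softmax([dot(query, key) * scale for key in tokens])
        weights.append(row)
        attended.append(
            [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        )
    fused = [sum(tok[d] for tok in attended) / len(attended) for d in range(dim)]
    return fused, weights


def bce_with_logit(logit, label):
    """Binary cross-entropy on a raw logit (stable form); label is 0 or 1."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def joint_loss(band_features, fused_head, band_heads, label, aux_weight=0.5):
    """Main loss on the fused feature plus a weighted mean of per-band losses.

    fused_head and each entry of band_heads are hypothetical linear classifier
    weight vectors; aux_weight is a hypothetical trade-off coefficient.
    """
    fused, _ = band_token_attention(band_features)
    main = bce_with_logit(dot(fused_head, fused), label)
    aux = [
        bce_with_logit(dot(head, feat), label)
        for head, feat in zip(band_heads, band_features)
    ]
    return main + aux_weight * sum(aux) / len(aux)
```

In this sketch each band keeps its own loss term, so a branch whose features contribute little to the fused prediction still receives a direct gradient signal, which is the motivation the abstract gives for BAS. The exact attention design and loss weighting used in FAConformer are described in the paper, not here.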

Extensive experiments on two public AAD datasets with three decision-window lengths demonstrate that FAConformer consistently outperforms 12 competitive baselines, surpassing the current state-of-the-art model by 4.9%. Further analyses of band importance, ablation, and parameter sensitivity verify the effectiveness, robustness, and interpretability of the proposed framework. Code is available at GitHub.

Blogger's Review: FAConformer improves auditory attention decoding by encoding each EEG frequency band separately and letting the bands interact adaptively. This approach makes fuller use of frequency-domain information and strengthens band-specific modeling, offering a useful direction for future neuro-steered hearing systems.

Original Source: https://arxiv.org/abs/2606.14120
