This collection of preprints explores advancements in signal processing and communication, focusing on emerging technologies for biomedical applications, 6G networks, and radar systems. Energy efficiency is a key theme, with Samakovlis et al. (2024) benchmarking state-of-the-art low-power microcontrollers (MCUs) for biomedical wearables. Their analysis, spanning idle, data acquisition, and processing phases, identifies key hardware features impacting energy consumption and offers insights for optimized wearable design. Mehta et al. (2024) investigate resting-state EEG's potential to enhance motor imagery decoding models. Their findings reveal limited benefits in both within-user and across-user scenarios, highlighting the need for further research. Reidy et al. (2024) present Respiro, a novel wearable UWB radar system for continuous respiratory rate monitoring during motion. A controlled experiment demonstrates high accuracy, showcasing the potential of consumer-grade UWB radar for unobtrusive health monitoring.
6G networks are also a major focus. Guo et al. (2024) propose a movable antenna enhanced networked full-duplex ISAC system, optimizing beamforming and resource allocation via majorization-minimization. Wang et al. (2024) analyze multiple STAR-RIS assisted MIMO-NOMA systems using operator-valued free probability, deriving asymptotic information rate expressions and a projected gradient ascent method for STAR-RIS optimization. Thoota & Larsson (2024) introduce a flexible framework for grant-free random access in cell-free massive MIMO, accommodating variable pilot lengths and integration with grant-based systems. Guo & Liu (2024) propose DeepCAPA, a deep learning framework for beamforming in continuous aperture arrays, achieving superior spectral efficiency.
Several contributions explore radar and signal processing. He et al. (2024) analyze super-resolution ISAR imaging for space targets, deriving performance bounds and exploring parameter trade-offs. Gavras & Alexandropoulos (2024) present a circuit-compliant near-field localization framework with dynamic metasurface antennas, demonstrating improved performance. Menakath et al. (2024) introduce an orthogonal linear array based product beamforming method for real-time underwater 3D acoustical imaging, reducing computational complexity. Yao et al. (2024) investigate fluid antenna systems for secure and covert communications.
Specific communication challenges are addressed by various papers. Singh et al. (2024) propose a hybrid machine learning receiver for 5G NR PRACH. Shi et al. (2024) introduce an efficient method for computing and storing generalized scattering matrices. Jin et al. (2024) present AW-MinMax for range-free node localization in anisotropic networks. Alghotmi (2024) analyzes crosstalk interference in multi-stacked chips. Wang & Zeng (2024) propose a deep learning-based channel knowledge map construction method. Zhao et al. (2024) introduce a demand-aware scheme for load balancing in LEO satellite networks.
Finally, several works explore advanced techniques. Friot-Giroux et al. (2024) compare tomographic reconstruction algorithms. Kim et al. (2024) propose a multi-kernel ensemble diffusion model for EEG-based speech decoding. Talebi et al. (2024) present BlueME for underwater robot communication. Gao et al. (2024) investigate adversarial attacks against ASR systems. Ren et al. (2024) introduce MultiVital, a mmWave MIMO radar for vital sign monitoring. Liu et al. (2024) study joint rate splitting and beamforming. Guo et al. (2024) investigate secrecy energy efficiency in VLC networks. Sangston (2024) proposes a geometric algebra framework for multidimensional analytic signals and Sangston (2024) introduces a radar clutter model. Schmalen et al. (2024) review machine learning in optical communications. Iacovelli et al. (2024) discuss holographic MIMO. Aboulfotouh et al. (2024) explore transformer architectures for 6G foundation models. Sanjari & Aflatouni (2024) demonstrate a reconfigurable metasurface. Tao et al. (2024) propose AFDM with index modulation. Aboulfotouh et al. (2024) introduce self-supervised radio pre-training. Tung et al. (2024) develop a hybrid AI system for automated EEG analysis. Finally, Zhang & Liu (2024) analyze phase transitions with structured sparsity.
Building 6G Radio Foundation Models with Transformer Architectures by Ahmed Aboulfotouh, Ashkan Eshaghbeigi, Hatem Abou-Zeid https://arxiv.org/abs/2411.09996
Caption: This diagram illustrates the Masked Spectrogram Modeling (MSM) approach for pretraining a Vision Transformer (ViT). A masked spectrogram is fed into a ViT encoder-decoder structure to reconstruct the original spectrogram, and the pretrained encoder is then used for downstream tasks like human activity sensing and spectrogram segmentation with a task-specific head. This self-supervised learning approach allows the ViT to learn transferable representations that generalize well across diverse tasks and datasets.
This paper explores the innovative application of Vision Transformers (ViTs) as foundation models (FMs) for 6G radio signal processing. The authors introduce Masked Spectrogram Modeling (MSM), a self-supervised learning technique for pretraining the ViT. This involves masking a substantial portion (e.g., 80%) of the input spectrogram and training the model to reconstruct the masked sections based on the visible parts. This approach leverages the ViT's ability to handle variable-length sequences and capture long-range dependencies through its attention mechanisms. The authors posit that this self-supervised pretraining leads to more robust and adaptable models than traditional supervised methods, which often struggle to generalize to new, unseen data. The use of real-world datasets, including a real-time radio dataset (RRD), a CSI-based human activity sensing dataset (HSD), and a 5G/LTE spectrogram segmentation dataset (SD), reinforces the practical relevance of their approach.
The core of the methodology lies in the ViT architecture and MSM pretraining. The model employs an encoder-decoder structure. The encoder processes the visible spectrogram patches, generating feature tokens. The decoder then reconstructs the full spectrogram by attending to these tokens. The MSM loss function is defined as: L<sub>MSM</sub> = (1/NM) Σ<sub>n=1</sub><sup>N</sup> Σ<sub>ij</sub> ||vec(x<sub>ij</sub><sup>(n)</sup>) - vec(ŷ<sub>ij</sub><sup>(n)</sup>)||<sup>2</sup> I<sub>masked</sub><sup>(n,i,j)</sup>. Here, N represents the batch size, M the number of patches, x<sub>ij</sub><sup>(n)</sup> the input patch, ŷ<sub>ij</sub><sup>(n)</sup> the reconstructed patch, and I<sub>masked</sub> an indicator function for masked patches. After pretraining, the encoder serves as a feature extractor for downstream tasks, with only a task-specific head requiring finetuning.
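To make the loss concrete, here is a minimal NumPy sketch of the masked reconstruction objective as written above. The function names (`msm_loss`, `random_mask`), the flattened `(N, M, P)` patch layout, and the uniform random choice of masked patches are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def msm_loss(x, y_hat, masked):
    """Masked reconstruction loss following the formula in the text.

    x, y_hat: arrays of shape (N, M, P) holding flattened patches
              (N = batch size, M = patches per spectrogram, P = values per patch).
    masked:   boolean array of shape (N, M), True where a patch was hidden.
    """
    sq_err = np.sum((x - y_hat) ** 2, axis=-1)  # squared L2 norm per patch, shape (N, M)
    n, m = masked.shape
    # Only masked patches contribute; the sum is normalized by N * M as in the text.
    return float(np.sum(sq_err * masked) / (n * m))

def random_mask(n, m, ratio, rng):
    """Hide round(ratio * m) patches per sample (uniform sampling is an assumption)."""
    k = int(round(ratio * m))
    mask = np.zeros((n, m), dtype=bool)
    for row in mask:  # each row is a view, so this writes into `mask`
        row[rng.choice(m, size=k, replace=False)] = True
    return mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 16, 64))                   # toy "spectrogram patches"
    y_hat = x + 0.1 * rng.standard_normal(x.shape)         # stand-in for decoder output
    mask = random_mask(2, 16, 0.75, rng)
    print(msm_loss(x, y_hat, mask))
```

Because the indicator zeroes out visible patches, the model gets no credit for copying what it can already see; only the hidden regions drive the gradient.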
The authors evaluate their approach on two downstream tasks: CSI-based human activity sensing and spectrogram segmentation. Results show the pretrained ViT outperforming a four-times larger model trained from scratch on spectrogram segmentation, while also requiring less training time. It achieves competitive performance on the human activity sensing task. For instance, on spectrogram segmentation, the best model (ViT-M pretrained with 70% masking) achieves 97.9% accuracy, outperforming the larger, scratch-trained ViT-L. On the HSD task, the pretrained ViT-M achieves 93.9% accuracy compared to the scratch-trained model's 98.9%, with the difference attributed to the inherent differences between CSI and spectrogram data.
The masking ratio in MSM plays a crucial role. Experiments reveal that a 70-80% masking ratio yields optimal performance, balancing the reconstruction challenge with training efficiency. The consistent results across datasets demonstrate the effectiveness of the ViT-based FM approach, highlighting its ability to learn transferable representations. The use of over-the-air data captured with software-defined radios (SDRs) further validates the practical applicability of this method. This work presents a compelling case for ViTs as FMs in 6G, offering a promising path toward scalable and adaptable models for dynamic wireless environments.
Respiro: Continuous Respiratory Rate Monitoring During Motion via Wearable Ultra-Wideband Radar by Sebastian Reidy, Manuel Meier, Christian Holz https://arxiv.org/abs/2411.08898
Caption: This figure illustrates the signal processing pipeline of Respiro, a wearable UWB radar system for respiratory rate monitoring. The pipeline processes the channel impulse response (CIR) magnitudes, calibrates and aligns them, and then computes an optimal weighting to isolate the respiratory signal. This allows for the extraction of the respiratory rate as the dominant frequency.
Monitoring respiratory rate (RR) is crucial, as deviations can be early signs of health issues. Existing methods like spirometry are accurate but cumbersome, while less invasive options like respiration belts are susceptible to motion artifacts. Respiro, introduced in this paper, offers a novel solution: a wearable device leveraging consumer-grade ultra-wideband (UWB) radar for continuous RR monitoring, even during motion. Unlike other UWB-based RR systems, Respiro is a single-point-of-contact chest strap, making it practical for continuous, real-world use.
Respiro uses two off-the-shelf DWM3000 UWB radar modules, controlled by a microcontroller and linked to a computer for data processing. The system captures in-body reflections of UWB pulses, represented by complex channel impulse responses (CIRs). Processing begins by calculating the element-wise magnitude of the CIR and storing it in a matrix H, where rows represent reflections at different times and columns represent reflections from different spatial distances. The CIRs are then calibrated and aligned using cross-correlation maximization. An optimal weighting vector W<sub>opt</sub> is computed to maximize energy within the respiration frequency band (0.1 Hz to 0.7 Hz). The RR is then extracted as the dominant frequency in H W<sub>opt</sub>. The optimal weighting is calculated as:
W<sub>opt</sub> = argmax<sub>w</sub> (F<sub>1</sub>Hw)<sup>H</sup>(F<sub>1</sub>Hw) s.t. const. = (FHw)<sup>H</sup>(FHw)
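where F is the discrete Fourier transform (DFT) matrix, F<sub>1</sub> is the part of the DFT matrix covering the band of possible respiration frequencies, and the superscript H denotes the conjugate transpose (not to be confused with the CIR magnitude matrix H).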
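One way to read this constrained maximization is as a generalized eigenvalue problem: the best weighting is the eigenvector with the largest ratio of in-band energy to total energy. The NumPy sketch below follows that reading. It is not the authors' implementation; the function names, the per-tap mean removal, the small ridge term, and the restriction to real-valued weights are all assumptions made for illustration.

```python
import numpy as np

def respiration_weights(H, fs, band=(0.1, 0.7), ridge=1e-9):
    """H: (T, D) real matrix of CIR magnitudes (T time samples at fs Hz, D range taps).

    Returns (w, ratio): the weight vector and the in-band / total energy ratio it achieves.
    """
    Hc = H - H.mean(axis=0)                         # remove per-tap DC (assumption)
    T, D = Hc.shape
    spec = np.fft.fft(Hc, axis=0, norm="ortho")     # rows of F H, with unitary F
    freqs = np.fft.fftfreq(T, d=1.0 / fs)
    in_band = (np.abs(freqs) >= band[0]) & (np.abs(freqs) <= band[1])
    S1 = spec[in_band]                              # rows of F_1 H
    A = np.real(S1.conj().T @ S1)                   # in-band energy form
    B = np.real(spec.conj().T @ spec)               # total energy form
    B = B + ridge * np.trace(B) / D * np.eye(D)     # ridge keeps B positive definite
    L = np.linalg.cholesky(B)                       # B = L L^T
    Linv = np.linalg.inv(L)
    C = Linv @ A @ Linv.T                           # whitened problem: ordinary eigenproblem
    vals, vecs = np.linalg.eigh((C + C.T) / 2)      # eigenvalues in ascending order
    w = Linv.T @ vecs[:, -1]                        # map top eigenvector back
    return w, float(vals[-1])

def respiratory_rate_bpm(H, fs, band=(0.1, 0.7)):
    """Dominant in-band frequency of the weighted signal, in breaths per minute."""
    w, _ = respiration_weights(H, fs, band)
    y = (H - H.mean(axis=0)) @ w
    freqs = np.fft.rfftfreq(len(y), d=1.0 / fs)
    power = np.abs(np.fft.rfft(y)) ** 2
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(60.0 * freqs[in_band][np.argmax(power[in_band])])
```

The whitening step via the Cholesky factor turns the constraint into a unit-norm condition, so a standard symmetric eigensolver is enough; a library routine for generalized symmetric eigenproblems would serve the same purpose.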
A controlled experiment with 12 participants engaged in various activities validated Respiro's performance. Ground truth RR was recorded using a spirometer and a respiration belt. Respiro achieved a mean absolute error of 1.11 breaths per minute (bpm), with 71% of measurements within 1 bpm of the ground truth. This outperformed accelerometer-based systems in accuracy and robustness to motion. The study notes that the presence of a spirometer, while providing accurate ground truth, influences breathing patterns, leading to deeper breaths and higher signal-to-noise ratios (SNRs). A negative correlation (-0.29) was observed between the UWB signal's SNR and the RR estimation error, suggesting that SNR carries some information about measurement reliability.
Respiro demonstrates the potential of consumer-grade UWB radar for accurate and unobtrusive RR monitoring in daily life. The study highlights the importance of careful ground truth acquisition and the use of SNR as a reliability predictor. While limited by sample size and age diversity, Respiro paves the way for more comfortable continuous RR monitoring, potentially enabling earlier detection of health problems.
A Hybrid Artificial Intelligence System for Automated EEG Background Analysis and Report Generation by Chin-Sung Tung, Sheng-Fu Liang, Shu-Feng Chang, Chung-Ping Young https://arxiv.org/abs/2411.09874
Caption: This diagram illustrates the hybrid AI system's workflow for automated EEG analysis and report generation. The process begins with raw EEG preprocessing, followed by artifact repair, feature generation for PDR detection and AI text generation, and concludes with LLM-driven report creation. This system integrates deep learning, unsupervised artifact removal, and expert-designed algorithms for comprehensive EEG background analysis.
This paper introduces a hybrid AI system for automated EEG background analysis and report generation, addressing the limitations of manual EEG interpretation, especially in resource-constrained environments. The system combines deep learning models for posterior dominant rhythm (PDR) prediction, an unsupervised artifact removal method, and expert-designed algorithms for detecting abnormalities. Trained on 1530 labeled EEGs, the PDR prediction model achieves a mean absolute error (MAE) of 0.237 Hz, with 91.8% of predictions within 0.6 Hz and 99% within 1.2 Hz of the reference value. The system also incorporates large language models (LLMs) for report generation, with the generated reports rated 100% accurate by three independent LLMs.
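For readers unfamiliar with these metrics, the short sketch below shows how an MAE and a "within tolerance" accuracy could be computed for PDR frequencies. The function name, the inclusive tolerance check, and the sample values are invented for illustration and are not data from the paper.

```python
def pdr_metrics(pred_hz, true_hz, tolerances=(0.6, 1.2)):
    """Return (MAE in Hz, {tolerance: fraction of predictions within it})."""
    errors = [abs(p - t) for p, t in zip(pred_hz, true_hz)]
    mae = sum(errors) / len(errors)
    # Assumes "within X Hz" means absolute error <= X.
    within = {tol: sum(e <= tol for e in errors) / len(errors) for tol in tolerances}
    return mae, within

if __name__ == "__main__":
    # Made-up predictions and reference values, in Hz.
    predicted = [9.0, 10.5, 8.0, 11.0]
    reference = [9.2, 10.0, 9.0, 9.5]
    mae, within = pdr_metrics(predicted, reference)
    print(f"MAE = {mae:.3f} Hz, within = {within}")
```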
The methodology involves several key stages. EEG data undergoes preprocessing, including conversion to EDF format, reference electrode rebuilding using the REST method, and segmentation into 4-second epochs. An unsupervised outlier anomaly detection method, combined with a custom neighboring electrode comparison method, handles artifacts. Features extracted include band power spectra, anterior-posterior gradients, total power, slow band power ratios, and left-right ratios for alpha, theta, and delta bands. PDR prediction employs a supervised deep learning approach with three architectures (custom CNN, GoogleNet, ResNet) and an ensemble model. Abnormality detection utilizes expert-guided algorithms and statistical analysis, considering generalized background slowing (GBS), background asymmetry, and focal slow waves. Finally, LLMs generate reports, with prompt engineering ensuring clinical relevance and accuracy.
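The epoching and band-power features described above can be sketched as follows. This is a simplified illustration rather than the paper's pipeline: the band edges, the plain periodogram estimate, and the particular slow-band ratio are assumptions, and all function names are hypothetical.

```python
import numpy as np

# Assumed band edges in Hz; the paper's exact definitions may differ.
BANDS = {"delta": (1.0, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0)}

def epochs(signal, fs, seconds=4.0):
    """Split a 1-D signal into non-overlapping epochs, dropping any incomplete tail."""
    n = int(round(seconds * fs))
    count = len(signal) // n
    return np.asarray(signal[: count * n]).reshape(count, n)

def band_powers(epoch, fs):
    """Periodogram power summed over each assumed band."""
    freqs = np.fft.rfftfreq(len(epoch), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(epoch - np.mean(epoch))) ** 2 / len(epoch)
    return {
        name: float(psd[(freqs >= lo) & (freqs < hi)].sum())
        for name, (lo, hi) in BANDS.items()
    }

def slow_ratio(powers):
    """(delta + theta) / alpha, one possible form of a slow-band power ratio."""
    return (powers["delta"] + powers["theta"]) / powers["alpha"]

if __name__ == "__main__":
    fs = 200.0
    t = np.arange(int(10 * fs)) / fs
    # Toy channel: strong 10 Hz alpha rhythm plus a weak 2 Hz delta component.
    channel = np.sin(2 * np.pi * 10 * t) + 0.1 * np.sin(2 * np.pi * 2 * t)
    for ep in epochs(channel, fs):
        print(slow_ratio(band_powers(ep, fs)))
```

A left-right ratio could be built the same way by dividing the band power of a left-hemisphere channel by that of its right-hemisphere counterpart.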
Validation on an internal dataset and the Temple University Abnormal EEG Corpus (TUAB) demonstrates consistent performance, with F1 scores of 0.884 and 0.835, respectively. The AI system significantly outperforms neurologists in GBS detection (p=0.02; F1: AI 0.93, neurologists 0.82) and shows promising, though not statistically significant, improvement in focal abnormality detection (p=0.79; F1: AI 0.71, neurologists 0.55). The LLM-generated reports achieve 100% accuracy, verified by three independent LLMs.
The discussion emphasizes the superior performance of the deep learning architectures for PDR prediction compared to traditional methods, highlighting their ability to capture subtle features. The unsupervised artifact removal method proves effective, and the consistent performance across datasets and architectures demonstrates robustness and generalizability. The study acknowledges limitations, including reliance on a single-center dataset and exclusion of seizure detection. Future work will focus on expanding the dataset, incorporating multicenter data, and extending functionality to encompass other EEG features. This hybrid AI system offers a promising solution for automated EEG analysis, particularly in resource-limited settings, with potential to improve diagnostic accuracy and reduce misdiagnosis rates.
This newsletter highlights significant progress in signal processing and communication technologies. The trend towards intelligent systems is evident, from foundation models for 6G radio signal processing to hybrid AI systems for automated EEG analysis. The exploration of ViTs as foundation models by Aboulfotouh et al. offers a scalable approach to learning robust and adaptable representations for complex radio signals, promising better generalization in dynamic wireless environments. The development of Respiro by Reidy et al. demonstrates the potential of consumer-grade UWB radar for accurate and unobtrusive health monitoring, opening new possibilities for continuous respiratory rate tracking in everyday life. Finally, the hybrid AI system for EEG analysis by Tung et al. showcases the power of combining deep learning with expert-designed algorithms and LLMs, offering a practical solution for automated EEG interpretation and report generation, particularly beneficial for resource-constrained settings. These advancements collectively point towards a future where intelligent signal processing and communication systems play an increasingly critical role in various applications, from healthcare to next-generation networks.