Subject: Cutting-Edge Advancements in Wireless Communication, Sensing, and Signal Processing
Hi Elman,
This newsletter covers recent breakthroughs in wireless communication, sensing, and signal processing, focusing on emerging technologies for 6G and beyond.
Several papers focus on novel antenna technologies and their applications. Sheemar et al. (2025) propose a majorization-maximization (MM) based algorithm for secrecy rate maximization in reconfigurable holographic surface (RHS) assisted systems, jointly optimizing digital beamforming, artificial noise, and holographic beamforming. Similarly, Hu et al. (2025) investigate uplink transmission design for fluid antenna-enabled multiuser MIMO systems, employing a genetic algorithm for antenna position optimization under imperfect CSI. Liu et al. (2025) explore the potential of fluid antenna systems (FAS) for RSSI-based positioning, leveraging inter-port correlation to enhance location accuracy. Hua et al. (2025) introduce a hierarchically tunable 6DMA architecture, optimizing both antenna positions and rotations for communication and sensing scenarios. These works collectively demonstrate the potential of advanced antenna technologies to improve performance in communication and sensing tasks.
Another prominent theme is the development of novel signal processing techniques for challenging communication scenarios. Jiang et al. (2025) propose C2S-AE, an auto-encoder-based framework for extracting sensing information from CSI, demonstrating improved accuracy in delay and signal strength estimation. Liu et al. (2025) introduce a confidence-based asynchronous integrated communication and localization (ICL) network using pulsed UWB signals, achieving high localization accuracy. Xue et al. (2025) leverage large AI models for delay-Doppler domain channel prediction in 6G OTFS-based vehicular networks. Wang et al. (2025) propose a direct tracking approach for multi-object tracking, bypassing traditional preprocessing stages and achieving improved performance. These contributions highlight the growing role of advanced signal processing and machine learning in enhancing communication system performance.
Several papers address specific applications and challenges in different communication domains. Ma et al. (2025) propose a data-driven method for metering error estimation of fast-charging stations, while Hong et al. (2025) evaluate the performance of V2V visible light communication in motion scenarios. Zhang et al. (2025) introduce a novel framework for channel semantic characterization in integrated sensing and communication (ISAC) scenarios. Mizmizi et al. (2025) explore hybrid MIMO architectures in the upper mid-band, analyzing their energy efficiency and spectral efficiency trade-offs. Qiu et al. (2025) investigate rate splitting multiple access (RSMA) for simultaneous lightwave information and power transfer (SLIPT). These diverse applications demonstrate the breadth of research in the field and the ongoing efforts to address specific challenges in various communication contexts.
Further contributions include the work of Gomes et al. (2025), who review lossy neural compression for geospatial analytics, and Brodmann & Eifler (2025), who propose a Hamiltonian description of surface topography for tribological assessments. Sutrakar & P K (2025) analyze the impact of fasteners on the radar cross-section performance of radar absorbing air intake ducts. Trinh et al. (2025) discuss the potential of optical RISs in quantum-empowered SAGINs. These papers demonstrate the interdisciplinary nature of the field and the application of advanced techniques to diverse problems.
Finally, several papers focus on optimization and modeling techniques. Zhang et al. (2025) propose a net-zero ISAC model for backscatter systems, while Zhang et al. (2025) present a proof-of-concept for zero-power backscatter sensing and communication. Pani et al. (2025) use machine learning for predicting cascading failures in power systems. Binucci et al. (2025) introduce conformal Lyapunov optimization for resource allocation under reliability constraints. These papers highlight the importance of developing efficient optimization and modeling techniques for complex systems. Overall, this collection of papers showcases significant advancements in various areas of wireless communication and signal processing, paving the way for future innovations in 6G and beyond.
Foundation-Model-Boosted Multimodal Learning for fMRI-based Neuropathic Pain Drug Response Prediction by Wenrui Fan, L. M. Riza Rizky, Jiayang Zhang, Chen Chen, Haiping Lu, Kevin Teh, Dinesh Selvarajah, Shuo Zhou https://arxiv.org/abs/2503.00210
Caption: This figure illustrates the FMMTC framework, which leverages a foundation model (BrainLM) and multimodal learning to predict neuropathic pain drug response from rs-fMRI data. It processes both time series (TS) and functional connectivity (FC) data, fusing the extracted features for enhanced prediction accuracy. The framework incorporates external knowledge from a large pain-agnostic dataset and adapts its reliance on TS and FC based on the specific dataset.
Neuropathic pain is a widespread condition that affects a substantial portion of adults and is difficult to treat because of limited drug efficacy and potential adverse effects. Resting-state functional MRI (rs-fMRI) is a promising non-invasive tool for identifying brain biomarkers that can predict drug response, potentially paving the way for personalized treatment. However, the complexity and high dimensionality of fMRI data, coupled with the scarcity of data in neuropathic pain research, limit the application of powerful machine learning models. This paper introduces FMMTC (Foundation-Model-boosted Multimodal learning for Time series and functional Connectivity), a framework designed to overcome these limitations and improve the accuracy of neuropathic pain drug response prediction from rs-fMRI.
FMMTC leverages both internal and external sources of knowledge to boost its predictive power. Addressing the critical issue of data scarcity, FMMTC adopts a multimodal learning strategy, integrating information from two distinct rs-fMRI modalities: Time Series (TS) and Functional Connectivity (FC). Unlike conventional approaches that treat TS and FC as features within a single modality, FMMTC considers them as separate modalities, encoding them individually using a frozen fMRI foundation model (BrainLM) for TS and a trainable ResNet-18 for FC. The resulting features, denoted as R<sub>T</sub> and R<sub>C</sub>, are then combined to create a multimodal feature representation R<sub>TC</sub> = concat(R<sub>T</sub>, R<sub>C</sub>), which is subsequently fed into a single-layer perceptron for prediction. This multimodal approach enables FMMTC to capture both the fine-grained temporal dynamics from TS and the global spatial relationships from FC, maximizing the information extracted from limited pain-specific data.
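The fusion step described above can be illustrated with a minimal sketch. It only shows the concatenation R<sub>TC</sub> = concat(R<sub>T</sub>, R<sub>C</sub>) followed by a single linear layer. The feature sizes, the random stand-in features, the sigmoid output, and the function names are assumptions made for illustration; they are not taken from the paper, and no real BrainLM or ResNet-18 encoder is involved.

```python
import math
import random


def concat(r_t, r_c):
    """Join the TS and FC feature vectors into one multimodal vector R_TC."""
    return list(r_t) + list(r_c)


def single_layer_perceptron(features, weights, bias):
    """One linear layer plus a sigmoid (the sigmoid is an assumption here)."""
    if len(features) != len(weights):
        raise ValueError("feature and weight lengths must match")
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(0)
# Hypothetical stand-ins for encoder outputs; real features would come from
# a frozen BrainLM (TS) and a trainable ResNet-18 (FC).
r_t = [rng.gauss(0.0, 1.0) for _ in range(4)]
r_c = [rng.gauss(0.0, 1.0) for _ in range(3)]

r_tc = concat(r_t, r_c)
weights = [rng.gauss(0.0, 0.1) for _ in r_tc]  # untrained, illustrative weights
prob = single_layer_perceptron(r_tc, weights, bias=0.0)
print(f"fused dimension: {len(r_tc)}, predicted response probability: {prob:.3f}")
```

In this sketch the prediction head sees both feature sets at once, so training can weight TS and FC features differently per dataset.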
Furthermore, FMMTC harnesses external knowledge by transferring information from BrainLM, a foundation model pre-trained on a vast pain-agnostic fMRI dataset (UK Biobank). This knowledge transfer enriches the limited pain-specific data with insights from a broader population, leading to improved generalizability and robustness of the model.
The effectiveness of FMMTC was rigorously evaluated on both an in-house dataset and a publicly available dataset from OpenNeuro. In drug-agnostic response prediction, FMMTC consistently outperformed unimodal baselines, achieving improvements in Matthews Correlation Coefficient (MCC) of at least 2.80% across all experiments. In drug-specific response prediction, FMMTC demonstrated superior representation ability within the same drug domain and strong generalizability to other drug domains, outperforming BrainLM. For instance, in out-of-domain drug-specific response prediction, FMMTC achieved an MCC of 86.02%, significantly surpassing BrainLM. Ablation studies confirmed the crucial role of both multimodal learning and foundation-model-powered external knowledge transfer in enhancing prediction performance.
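For readers unfamiliar with the metric, the following is a small sketch of how the Matthews Correlation Coefficient is computed for binary labels using its standard definition. The labels below are made up for illustration and are unrelated to the paper's data; the summary above reports MCC as a percentage, which corresponds to multiplying this value by 100.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Standard binary MCC; returns 0.0 when the denominator is zero."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical responder (1) / non-responder (0) labels.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
mcc = matthews_corrcoef(y_true, y_pred)
print(f"MCC = {mcc:.4f} ({100 * mcc:.2f}%)")
```

MCC ranges from -1 to 1 and accounts for all four cells of the confusion matrix, which makes it informative on small or imbalanced datasets.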
The remarkable adaptability of FMMTC across datasets was further explored through integrated gradients (IG) analysis. This analysis revealed that FMMTC dynamically adjusts its reliance on TS and FC modalities based on the characteristics of the dataset. In scenarios where BrainLM (TS-focused) performed better, FMMTC prioritized TS features, while in scenarios favoring ResNet (FC-focused), FMMTC shifted its emphasis to FC features. This dynamic behavior enables FMMTC to effectively adapt to different data distributions and conditions, contributing to its robust performance across various scenarios.
FlowDec: A flow-based full-band general audio codec with high perceptual quality by Simon Welker, Matthew Le, Ricky T. Q. Chen, Wei-Ning Hsu, Timo Gerkmann, Alexander Richard, Yi-Chiao Wu https://arxiv.org/abs/2503.01485
Caption: The image contrasts the architectures of a traditional deep audio codec (DAC), which uses a discriminator during training, with the novel FlowDec. FlowDec replaces the discriminator with a stochastic postfilter trained using conditional flow matching, improving perceptual quality and reducing computational cost. Both architectures share a common inference path consisting of an encoder, quantizer, and decoder.
FlowDec, a novel neural audio codec, challenges the dominance of Generative Adversarial Networks (GANs) in the field. This full-band codec, designed for 48 kHz audio, utilizes conditional flow matching (CFM) instead of adversarial training, achieving high perceptual quality while addressing limitations of previous methods like ScoreDec. FlowDec significantly reduces computational requirements, needing only 6 Deep Neural Network (DNN) evaluations compared to ScoreDec's 60, while maintaining or exceeding quality.
FlowDec operates in two stages. First, a deterministic neural codec, derived from a modified non-adversarial version of the state-of-the-art DAC codec, creates an initial audio estimate. This initial decoder is trained with a combination of spectral and waveform losses, including a novel multiscale constant-Q transform (CQT) loss and an L¹ waveform loss. Second, a stochastic postfilter, based on a novel adaptation of CFM, refines this initial estimate, generating multiple enhanced versions. This postfilter learns to map samples from a simpler, tractable distribution to the complex distribution of clean audio, conditioned on the initial estimate. It utilizes a simplified formulation with a single tunable hyperparameter, σ<sub>y</sub>. The probability path used is a linear interpolation between the initial estimate y and the target clean signal x<sub>1</sub>, with linearly decreasing noise:
p<sub>t</sub>(x<sub>t</sub> | x<sub>1</sub>, y) = N(x<sub>t</sub>; μ<sub>t</sub>, σ<sub>t</sub>²I) := N(x<sub>t</sub>; y + t(x<sub>1</sub> − y), (1 − t)²σ<sub>y</sub>²I)
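As a rough illustration of this probability path, the sketch below draws a sample x<sub>t</sub> whose mean interpolates linearly from the initial estimate y to the clean signal x<sub>1</sub>, with a noise scale assumed to be (1 − t)·σ<sub>y</sub>, matching the "linearly decreasing noise" described above. The toy signals, the σ<sub>y</sub> value, and the function name are assumptions for illustration only, not the authors' implementation.

```python
import random


def sample_path(x1, y, t, sigma_y, rng):
    """Draw x_t with mean y + t*(x1 - y) and standard deviation (1 - t)*sigma_y."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(x1) != len(y):
        raise ValueError("x1 and y must have the same length")
    std = (1.0 - t) * sigma_y
    return [yi + t * (xi - yi) + std * rng.gauss(0.0, 1.0) for xi, yi in zip(x1, y)]


rng = random.Random(0)
x1 = [0.5, -0.25, 1.0]   # toy "clean" target samples (hypothetical)
y = [0.25, 0.0, 0.5]     # toy initial decoder estimate (hypothetical)

for t in (0.0, 0.5, 1.0):
    x_t = sample_path(x1, y, t, sigma_y=0.1, rng=rng)
    print(t, [round(v, 3) for v in x_t])
```

At t = 0 the sample is the initial estimate plus noise of scale σ<sub>y</sub>, and at t = 1 the noise vanishes and the sample coincides with x<sub>1</sub>.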
Objective evaluations, using metrics like Fréchet Audio Distance (FAD), demonstrate FlowDec's superior performance. Compared to EnCodec, DAC, and 2xDAC, FlowDec-75m achieves the best FAD scores across various bitrates. For example, at 7.5 kbit/s, FlowDec-75m achieves an FAD score of 0.0209, compared to 0.0408 for EnCodec at 12 kbit/s. While traditional spectral metrics might favor DAC, the perceptually weighted fwSSNR shows a smaller gap, highlighting FlowDec's focus on perceptual quality. Subjective listening tests (MUSHRA) corroborate these results, with FlowDec rated on par with DAC across different bitrates and feature rates.
FlowDec's efficiency at very low bitrates and feature rates is another key advantage. FlowDec-25s, operating at 25 Hz and 4.0 kbit/s, performs comparably to higher-bitrate versions, suggesting its potential for efficient generative audio modeling. Furthermore, FlowDec's real-time factor (RTF) of approximately 0.23 significantly surpasses ScoreDec's 1.7, making it more suitable for real-time applications.
Ablation studies confirm the effectiveness of FlowDec's design choices. The frequency-dependent σ<sub>y</sub> prevents oversmoothing, especially at a low number of function evaluations (NFE). The modified NCSN++ architecture suppresses high-frequency artifacts in music. The added CQT and waveform losses in the underlying codec improve low-frequency preservation and overall quality.
Nanosatellite Constellation and Ground Station Co-design for Low-Latency Critical Event Detection by Zhuo Cheng, Brandon Lucia https://arxiv.org/abs/2503.01756
Caption: This figure compares the capture latency of Planet's single-plane Dove constellation with the proposed GYRFALCON multi-plane design across varying swath widths. GYRFALCON demonstrates a 6.9x improvement in capture latency due to its optimized orbital design, highlighting the significance of orbital planes in minimizing the time to observe critical events.
While existing nanosatellite constellations provide extensive Earth observation coverage, they often experience significant delays in detecting time-sensitive events like forest fires or floods. This latency, sometimes reaching several hours, can hinder timely responses and exacerbate the impact of such events. This research identifies a surprising bottleneck: not data transmission, as previously believed, but the capture latency – the time it takes for a satellite to be positioned over the event area. This paper introduces GYRFALCON, a novel approach to constellation and ground station co-design that prioritizes minimizing capture latency.
The study utilizes a simulation-based approach, employing the Cote simulator and real-world datasets of events like fires, earthquakes, and city locations. It investigates the influence of various orbital parameters, including inclination, altitude, number of planes, plane phase, and plane distribution, on capture latency. For optimizing ground station placement, the research proposes an integer linear programming (ILP) algorithm, aiming to maximize coverage while minimizing transmission latency.
The results highlight the significant impact of the number of orbital planes on capture latency. Distributing satellites across 10 planes, as opposed to Planet's single-plane configuration, reduces capture latency by 7.9-10.5×. Contrary to intuitions derived from communication satellite designs, lower-inclination orbits, while beneficial for communication, offer minimal latency reduction for Earth observation. The proposed ground station selection algorithm significantly outperforms randomly distributed ground stations, reducing transmission latency by 4.9× and 7.7× compared to Planet and a sampled L2D2 deployment, respectively. Including a polar ground station further reduces transmission latency by 1.8-1.94×.
The study also analyzes the effect of altitude and swath width on capture latency. Lower altitudes, while reducing latency due to shorter orbital periods, are limited by atmospheric drag and mission lifetime considerations. Larger swath widths, constrained by sensor technology, also contribute to lower latency. The relationship between swath, altitude (H), and camera field of view (θ) is defined as: swath = H × tan(θ/2).
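The swath relation quoted above can be evaluated directly. The sketch below implements the formula exactly as stated in this summary, swath = H × tan(θ/2); the altitude and field-of-view values are hypothetical and chosen only to show how the swath scales, not taken from the paper.

```python
import math


def swath_km(altitude_km, fov_deg):
    """Swath per the relation above: swath = H * tan(theta / 2)."""
    if altitude_km <= 0:
        raise ValueError("altitude must be positive")
    if not 0.0 < fov_deg < 180.0:
        raise ValueError("field of view must be between 0 and 180 degrees")
    return altitude_km * math.tan(math.radians(fov_deg) / 2.0)


# Hypothetical altitudes (km) and camera fields of view (degrees).
for altitude in (400.0, 500.0, 600.0):
    for fov in (2.0, 5.0, 10.0):
        print(f"H = {altitude:5.0f} km, FOV = {fov:4.1f} deg -> swath = {swath_km(altitude, fov):6.2f} km")
```

For a fixed field of view the swath grows linearly with altitude, which is one side of the altitude trade-off described above: lowering the orbit shortens the orbital period but also narrows the swath of a given camera.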
Finally, the research demonstrates GYRFALCON's adaptability to incremental satellite deployments and its ability to be optimized for specific event locations by strategically positioning ground stations near historical event sites. This targeted approach can further minimize transmission latency, achieving near-zero values in some cases.
This newsletter highlights a range of advancements across wireless communication, sensing, and signal processing. From novel antenna technologies to advanced signal processing and AI-driven solutions, the research covered demonstrates a clear push toward 6G and beyond. The highlighted papers showcase impactful contributions, including FMMTC's innovative approach to predicting neuropathic pain drug response using multimodal learning and foundation models, FlowDec's challenge to GAN-based audio coding through conditional flow matching, and GYRFALCON's paradigm shift in nanosatellite constellation design for low-latency critical event detection. These advancements collectively pave the way for more efficient, reliable, and intelligent communication systems in the future.