This collection of preprints explores advancements in signal processing, communication systems, and machine learning applications across diverse domains, with a focus on enhancing existing technologies and applying deep learning to complex problems.
Several papers demonstrate improvements to established methods. Reshadati and Shirzaei (2024) introduce a model-based approach for transforming InSAR-derived vertical land motion data from local to global reference frames, eliminating the need for direct GNSS measurements. Lyu et al. (2024) present the first experimental study demonstrating the superiority of Rate-Splitting Multiple Access (RSMA) over Space Division Multiple Access (SDMA) for Integrated Sensing and Communications (ISAC) in MIMO systems, achieving higher throughput with comparable radar SNR. Yarahmadian et al. (2024) propose a novel wavelet-based algorithm for reconstructing time-domain impulse responses from band-limited scattering parameters, with applications to ship hull analysis. Zhang et al. (2024) introduce an improved importance sampling method based on stochastic particle flow and diffusion optimization for enhanced accuracy and reduced computational complexity in nonlinear source localization.
Deep learning emerges as a prominent theme. Cheng et al. (2024) utilize a feedforward neural network to quantify battery degradation modes from charging data, accurately estimating lithium inventory and active material loss. Pan et al. (2024) review deep learning for spectrum prediction in cognitive radio networks and propose ViTransLSTM, integrating visual self-attention and LSTM to capture spatiotemporal dependencies in spectrum usage. Yu et al. (2024) explore model-driven deep learning and physical layer foundation models for THz ultra-massive MIMO systems. Luo et al. (2024) introduce NormWear, a foundation model for multivariate wearable sensing of physiological signals, demonstrating generalization across health applications and zero-shot inference.
Specific applications in wireless communication and sensing are also addressed. Chen et al. (2024) propose an online adaptive real-time beamforming design for dynamic cell-free systems using a high-generalization network (HGNet). Chai and Wang (2024) introduce CSL-L2M for controllable song-level lyric-to-melody generation. Meng et al. (2024) address user identity protection in EEG-based brain-computer interfaces. Sridhara et al. (2024) investigate joint graph and sampling set selection, introducing Vertex Importance Sampling (VIS) and VISR algorithms for graph signal reconstruction.
Further applications span diverse fields. Oré et al. (2024a) develop a novel approach for soil moisture estimation using SAR. Bui et al. (2024) present a scalable data transmission framework for Earth observation satellites. Da Costa et al. (2024) evaluate vectocardiographic and ECG parameters for cardiology care allocation. Xia et al. (2024) propose acceleration methods for the ISRS EGN model. Hadj Djilani et al. (2024) introduce a method for interrogating mobile passive sensors.
Finally, several papers explore emerging theoretical frameworks. Lyu et al. (2024) analyze the capacity of OAM-based wireless communications from an electromagnetic information theory perspective. Xia et al. (2024) introduce Probabilistic GOSPA, a metric for multi-object filter performance evaluation. Xiu et al. (2024) investigate latency minimization in mobile edge computing systems. Eichen et al. (2024) discuss spectrum sharing between IMT and EESS.
Wireless Environmental Information Theory: A New Paradigm towards 6G Online and Proactive Environment Intelligence Communication by Jianhua Zhang, Li Yu, Shaoyi Liu, Yichen Cai, Yuxiang Zhang, Hongbo Xing, Tao Jiang https://arxiv.org/abs/2412.11479
Caption: This image illustrates the proposed Environment Intelligence Communication (EIC) framework for 6G wireless networks. It highlights the five key steps: multimodal sensing and environment reconstruction, knowledge mapping, AI-based channel fading prediction, proactive decision making, and optimal transmission strategy selection. The framework leverages Wireless Environmental Information (WEI) to enhance communication performance across various layers of the 6G network architecture.
This paper proposes a paradigm shift in 6G wireless communication, moving from traditional statistical channel modeling to an Environment Intelligence Communication (EIC) framework based on Wireless Environmental Information Theory (WEIT). The authors argue that the statistical paradigm, while effective for previous generations (1G-5G), relies on offline measurements and passive adaptation, hindering optimal performance, particularly in the complex and dynamic environments envisioned for 6G. WEIT introduces Wireless Environmental Information (WEI), encompassing the physical properties of environmental scatterers that influence channel characteristics. The authors define WEI, categorize it into static, dynamic, and random types, and propose quantifying it using an entropy-like concept.
The EIC architecture involves three steps: (1) multimodal environment sensing to acquire WEI; (2) AI-based channel fading prediction using WEI to reduce uncertainty; and (3) autonomous selection of the optimal air-interface transmission strategy based on real-time predictions. The paper addresses fundamental questions about WEI: its definition, quantification, and relationship to statistical communication information. The EIC framework, aided by WEI (EIC-WEI), is validated through simulations across several air-interface tasks, demonstrating significant performance gains over the statistical paradigm.
In cell coverage prediction, EIC-WEI provides results closer to true values than models relying solely on basic environmental features or traditional statistical models. In channel prediction, EIC-WEI achieves a 59.8% reduction in normalized mean square error (NMSE) compared to methods without WEI. In beam selection, top-5 beam accuracy improves by 23% and top-3 beam accuracy by 29%. Resource allocation also benefits, with EIC-WEI achieving a more balanced distribution and reducing throughput variance by over 50%. These results showcase the potential of incorporating WEI into the communication process.
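For readers less familiar with the metrics quoted above, the following is a minimal sketch of how NMSE, top-k beam accuracy, and a relative reduction are commonly computed. The function names and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
def nmse(h_true, h_pred):
    """Normalized MSE: sum |h - h_hat|^2 / sum |h|^2 (works for complex entries)."""
    err = sum(abs(t - p) ** 2 for t, p in zip(h_true, h_pred))
    ref = sum(abs(t) ** 2 for t in h_true)
    return err / ref


def top_k_accuracy(scores, best_beams, k):
    """Fraction of samples whose true best beam is among the k top-scored beams."""
    hits = 0
    for row, best in zip(scores, best_beams):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += best in ranked[:k]
    return hits / len(best_beams)


def relative_reduction(baseline, improved):
    """Relative reduction; 0.598 would mean the metric dropped by 59.8%."""
    return (baseline - improved) / baseline


# Toy channel coefficients, invented purely for illustration.
h_true = [1 + 1j, 2 - 1j, -1 + 0.5j]
h_without_wei = [0.5 + 1j, 2 - 0.5j, -1 + 1j]
h_with_wei = [0.9 + 1j, 2 - 0.9j, -1 + 0.6j]
print(relative_reduction(nmse(h_true, h_without_wei), nmse(h_true, h_with_wei)))
```

Under this definition, a 59.8% reduction means the NMSE with WEI is about 0.402 times the NMSE without it.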
The paper also acknowledges open challenges, such as improving WEI acquisition accuracy and efficiency, reducing real-time interaction complexity, and enhancing the framework's generalization capabilities. The development of Wireless Environment Knowledge (WEK), which maps WEI to channel characteristics, and the use of large language models are suggested as potential solutions.
Wearable Accelerometer Foundation Models for Health via Knowledge Distillation by Salar Abbaspourazad, Anshuman Mishra, Joseph Futoma, Andrew C. Miller, Ian Shapiro https://arxiv.org/abs/2412.11276
This preprint describes Apple's research on building generalizable foundation models for health using accelerometry data from the Apple Heart and Movement Study (AHMS). The key innovation is transferring the rich physiological information from resource-intensive biosignals like photoplethysmography (PPG) to readily available accelerometry data. This is achieved through a novel fully unsupervised representational knowledge distillation framework. A PPG encoder is first pre-trained using self-supervised learning (masked autoencoding or contrastive learning). Then, this knowledge is distilled to an accelerometry encoder using a multi-modal contrastive loss, aligning the representations learned from both modalities. This enables the accelerometry encoder to capture subtle physiological information typically associated with PPG.
The methodology involves a two-stage process. First, a PPG encoder is pre-trained on 20 million minutes of unlabeled data from ~172,000 AHMS participants. The second stage distills this knowledge to an accelerometry encoder using paired PPG and accelerometry data and a multi-modal contrastive loss. This loss is a weighted sum of the InfoNCE loss from teacher to student and from student to teacher, $L = \lambda L^{(ts)}_{\text{contrastive}} + (1-\lambda) L^{(st)}_{\text{contrastive}}$. It maximizes mutual information between paired PPG and accelerometry embeddings while minimizing mutual information between each accelerometry embedding and the non-paired PPG embeddings.
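The weighted two-direction contrastive loss described above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the cosine-similarity logits, the temperature of 0.1, the default λ of 0.5, and the function names `info_nce` and `distillation_loss` are all assumptions made for the example.

```python
import numpy as np


def info_nce(anchor, target, temperature=0.1):
    """InfoNCE over a batch: row i of `anchor` should match row i of `target`."""
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    t = target / np.linalg.norm(target, axis=1, keepdims=True)
    logits = a @ t.T / temperature                       # cosine similarities (assumed)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))                # positives sit on the diagonal


def distillation_loss(teacher_emb, student_emb, lam=0.5, temperature=0.1):
    """L = lam * L_ts + (1 - lam) * L_st, with teacher = PPG and student = accelerometry."""
    l_ts = info_nce(teacher_emb, student_emb, temperature)
    l_st = info_nce(student_emb, teacher_emb, temperature)
    return lam * l_ts + (1.0 - lam) * l_st


# Synthetic embeddings standing in for encoder outputs (illustration only).
rng = np.random.default_rng(0)
ppg = rng.normal(size=(8, 16))
accel_aligned = ppg + 0.01 * rng.normal(size=(8, 16))
accel_random = rng.normal(size=(8, 16))
print(distillation_loss(ppg, accel_aligned), distillation_loss(ppg, accel_random))
```

In this toy setup the loss is much lower when the student embeddings track their paired teacher embeddings than when they are unrelated, which is the behavior the distillation stage optimizes for.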
The results demonstrate strong cross-modal alignment, with near-perfect retrieval accuracy (99.17% top-1) between PPG and accelerometry embeddings. The distilled accelerometry encoders outperform self-supervised and supervised accelerometry encoders in predicting downstream health targets, with improvements of 23% to 49% in predicting heart rate and heart rate variability. These encoders also show strong predictive power for various health targets, demonstrating their potential as generalist foundation models. Even compressed models maintain strong performance, highlighting the efficiency of the knowledge distillation approach. Ablation studies confirm the importance of the pre-training methods, data augmentations, and the two-stage distillation process.
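Top-1 cross-modal retrieval accuracy, as quoted above, is typically measured by checking whether each embedding's nearest neighbor in the other modality is its own pair. The sketch below assumes cosine similarity and a hypothetical function name; the paper's exact retrieval setup (batch size, similarity measure) is not specified in this summary.

```python
import numpy as np


def top1_retrieval_accuracy(query_emb, gallery_emb):
    """Fraction of queries whose most similar gallery row has the same index."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    nearest = (q @ g.T).argmax(axis=1)  # index of the best match per query
    return float(np.mean(nearest == np.arange(len(q))))


# Synthetic stand-ins for accelerometry (query) and PPG (gallery) embeddings.
rng = np.random.default_rng(0)
ppg = rng.normal(size=(32, 16))
accel = ppg + 0.05 * rng.normal(size=(32, 16))
print(top1_retrieval_accuracy(accel, ppg))
```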
AI and Deep Learning for THz Ultra-Massive MIMO: From Model-Driven Approaches to Foundation Models by Wentao Yu, Hengtao He, Shenghui Song, Jun Zhang, Linglong Dai, Lizhong Zheng, Khaled B. Letaief https://arxiv.org/abs/2412.09839
This paper explores the potential of AI, specifically deep learning (DL), to tackle the challenges of THz Ultra-Massive MIMO (UM-MIMO) systems. The authors identify three key hurdles: 'hard to compute' (high dimensionality and complexity), 'hard to model' (complex channel characteristics), and 'hard to measure' (incomplete and noisy channel state information).
Two research roadmaps are proposed. Model-driven deep learning leverages existing signal processing frameworks and uses AI to enhance bottleneck modules. This involves selecting algorithmic frameworks (e.g., fixed-point networks, neural calibration), choosing basis algorithms and identifying bottlenecks, designing loss functions, and designing neural architectures (e.g., graph neural networks, hypernetworks). Case studies on beamforming and data detection illustrate this approach's effectiveness. For example, a neural-calibrated zero-forcing beamformer achieves performance close to that of the computationally expensive WMMSE algorithm while running nearly 118 times faster.
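To make the basis algorithm in the beamforming example concrete, here is a minimal sketch of a plain zero-forcing (ZF) beamformer for a multi-user downlink. It shows only the classical closed form, not the neural calibration step described in the paper. The channel layout (users by antennas), the total-power normalization, and the dimensions are assumptions chosen for illustration.

```python
import numpy as np


def zero_forcing_beamformer(H, total_power=1.0):
    """ZF precoder W = H^H (H H^H)^{-1}, scaled so that ||W||_F^2 = total_power.

    H has shape (num_users, num_antennas) with num_users <= num_antennas.
    """
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    W *= np.sqrt(total_power) / np.linalg.norm(W)  # Frobenius-norm power constraint
    return W


# Random Rayleigh-like channel: 4 users, 16 antennas (illustrative sizes).
rng = np.random.default_rng(0)
H = (rng.normal(size=(4, 16)) + 1j * rng.normal(size=(4, 16))) / np.sqrt(2)
W = zero_forcing_beamformer(H)

effective = H @ W  # should be diagonal: no inter-user interference
off_diagonal = effective - np.diag(np.diag(effective))
print(np.abs(off_diagonal).max())
```

ZF removes inter-user interference in closed form but ignores noise, which is one reason a learned calibration of its input or output can narrow the gap to iterative methods such as WMMSE.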
The second roadmap, physical layer foundation models, offers a more unified approach. A single, compact foundation model is trained to estimate the score function of wireless channels, $s_\theta(h) \approx \nabla_h \log p(h)$, acting as a prior for various transceiver modules. This involves defining general frameworks, conditioning, site-specific adaptation, and joint design with model-driven DL. A channel estimation case study demonstrates the potential, achieving near-oracle MMSE performance.
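As a rough illustration of why a score function is a useful prior, the sketch below uses Tweedie's formula, which turns the score of the noisy observation density into an MMSE denoiser. It is not the paper's method: a closed-form Gaussian score stands in for the learned $s_\theta$, the vectors are real-valued for simplicity, and the function names and variances are hypothetical.

```python
import numpy as np


def gaussian_score(y, var):
    """Score of N(0, var * I): grad_y log p(y) = -y / var."""
    return -y / var


def tweedie_denoise(y, noise_var, score_fn):
    """Tweedie's formula: E[h | y] = y + noise_var * grad_y log p(y),
    for y = h + n with n ~ N(0, noise_var * I)."""
    return y + noise_var * score_fn(y)


prior_var, noise_var = 4.0, 1.0  # illustrative values
y = np.array([1.0, -2.0, 0.5])   # noisy observation of h

# For a Gaussian prior, the noisy marginal is N(0, (prior_var + noise_var) * I).
estimate = tweedie_denoise(y, noise_var, lambda v: gaussian_score(v, prior_var + noise_var))
lmmse = prior_var / (prior_var + noise_var) * y  # classical linear MMSE estimate
print(estimate, lmmse)
```

In this Gaussian special case the score-based estimate coincides with the linear MMSE estimate, which gives some intuition for how a well-trained score model could approach oracle MMSE performance on more complex channel distributions.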
The paper concludes by highlighting the synergy between these roadmaps. Model-driven DL provides algorithmic frameworks and domain expertise, while foundation models offer a unified channel representation. Combining these approaches promises more efficient and robust AI-enabled solutions for THz UM-MIMO systems, emphasizing the importance of considering computational complexity and generalization capabilities.
This newsletter highlights a convergence of advanced techniques in signal processing, communication systems, and machine learning. The trend towards leveraging environmental information and deep learning is evident across various applications. The proposed EIC-WEI framework for 6G communication demonstrates a paradigm shift towards proactive interaction with the environment, promising significant performance gains. Simultaneously, the development of foundation models, particularly for wearable health monitoring and complex THz UM-MIMO systems, showcases the power of knowledge distillation and unified channel representations. These advancements underscore the ongoing effort to push the boundaries of these technologies through innovative theoretical frameworks, algorithmic improvements, and practical implementations. The integration of AI and domain expertise emerges as a key driver for future breakthroughs.