Subject: Cutting-Edge Research in Causal Inference, Statistical Modeling, and Machine Learning
Hi Elman,
This collection of preprints showcases a diverse array of methodological advancements and applications across various domains, with a particular focus on causal inference, statistical modeling, and machine learning. Several papers tackle challenging problems in causal inference. For example, Valtanen et al. (2025) propose a method for identifying mediational effects in longitudinal intervention studies with an ordinal treatment-dependent confounder, leveraging monotonicity assumptions and functional representations of the mediator. Russo et al. (2025) introduce a Bayesian decision-theoretic framework for incorporating auxiliary outcomes in randomized clinical trials, aiming for more efficient decision-making while controlling frequentist operating characteristics. Other notable contributions to causal inference include a large-sample framework for coarsened confounding by Ghosh & Wang (2025), and a study on monotone spillover effects using randomization tests by Huang et al. (2025).
Beyond causal inference, the collection features novel statistical modeling techniques. Seri et al. (2025) present Spherical Double K-Means, a co-clustering approach for text data analysis, demonstrating its application to US presidential inaugural addresses. Longjohn et al. (2025) address uncertainty quantification for aggregate performance metrics in machine learning benchmarks, employing bootstrapping and Bayesian hierarchical modeling. In time series analysis, Abdelrazeq et al. (2025) propose a method for verifying Lévy-Driven Ornstein-Uhlenbeck processes, while Hu et al. (2025) introduce Context-Alignment, a paradigm for leveraging LLMs in time series tasks by aligning time series data with linguistic components.
Applications of these statistical and machine learning methods span diverse fields. In healthcare, Zhu et al. (2025) model cell type developmental trajectories using multinomial unbalanced optimal transport, and Wang et al. (2025) apply transfer learning for individualized treatment rules in sepsis patients. Brewer et al. (2025) revisit cosmological time dilation in quasars, incorporating source properties and evolution. Nath et al. (2025) introduce MERCURY, a multi-resolution emulator for compound climate hazards. Kunz et al. (2025) utilize statistical distributions for transient transport analysis in heterogeneous catalytic systems.
Several preprints explore the intersection of data analysis and societal issues. Ross et al. (2025) reconstruct ecological community dynamics from limited observations using Bayesian inference, while Miaci & Seri (2025) analyze maternal narratives of Albanian women in Italy using text data analysis. Rong et al. (2025) investigate the historical evolution of interdisciplinary research. Omatoi et al. (2025) apply causal inference to study the impact of adjuvant therapy and skin/nipple involvement on breast cancer survival. Kumar et al. (2025) introduce a shape-based functional index for objective assessment of pediatric motor function. The remaining preprints delve into specific applications within distinct domains, ranging from customer lifetime value in NFT markets (Das, 2025) to gravitational-wave event rates (Wang et al., 2025). This diverse collection underscores the growing influence of advanced statistical and machine learning techniques across a wide spectrum of scientific and societal challenges.
An Algorithmic Approach for Causal Health Equity: A Look at Race Differentials in Intensive Care Unit (ICU) Outcomes by Drago Plecko, Paul Secombe, Andrea Clarke, Amelia Fiske, Samarra Toby, Donisha Duff, David Pilcher, Leo Anthony Celi, Rinaldo Bellomo, Elias Bareinboim https://arxiv.org/abs/2501.05197
Caption: Decomposition of Causal Effects of Race/Ethnicity on ICU Mortality
This study uses causal inference to examine racial disparities in intensive care unit (ICU) outcomes, moving beyond traditional statistical approaches for a more nuanced understanding of health equity. Analyzing over a million ICU admissions in Australia and the US, the research reveals consistent patterns across populations, despite differences in raw mortality rates. The key innovation is the Causal Fairness Analysis framework, based on structural causal models, which decomposes observed disparities in mortality rates into three causal pathways: confounded, indirect, and direct effects. This decomposition is expressed by the formula:
E[Y | X = x₁] - E[Y | X = x₀] = E[Y<sub>x₁, W<sub>x₀</sub></sub> - Y<sub>x₀</sub> | X = x₀] (direct) - E[Y<sub>x₁, W<sub>x₀</sub></sub> - Y<sub>x₁</sub> | X = x₀] (indirect) - (E[Y<sub>x₁</sub> | X = x₀] - E[Y<sub>x₁</sub> | X = x₁]) (confounded)
where X represents race/ethnicity, W represents mediators (e.g., chronic health, illness severity), and Y represents mortality.
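As a quick consistency check, the three pathway terms telescope back to the observed disparity. The sketch below assumes the usual consistency property, E[Y_x | X = x] = E[Y | X = x], and treats the confounded term as a single grouped difference; the labels DE, IE and SE are shorthand introduced here for illustration and may not match the paper's notation.

```latex
\begin{align*}
\mathrm{DE} &= E[Y_{x_1, W_{x_0}} - Y_{x_0} \mid X = x_0], \\
\mathrm{IE} &= E[Y_{x_1, W_{x_0}} - Y_{x_1} \mid X = x_0], \\
\mathrm{SE} &= E[Y_{x_1} \mid X = x_0] - E[Y_{x_1} \mid X = x_1], \\[4pt]
\mathrm{DE} - \mathrm{IE} &= E[Y_{x_1} \mid X = x_0] - E[Y_{x_0} \mid X = x_0], \\
\mathrm{DE} - \mathrm{IE} - \mathrm{SE} &= E[Y_{x_1} \mid X = x_1] - E[Y_{x_0} \mid X = x_0] \\
&= E[Y \mid X = x_1] - E[Y \mid X = x_0].
\end{align*}
```

The nested counterfactual Y with X set to x₁ and W held at its x₀ value cancels between the direct and indirect terms, which is why only single-intervention quantities remain in the last two lines.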
Confounded effects, stemming from factors like age and socioeconomic status, were protective for minority groups, largely due to younger age at admission. Indirect effects, transmitted through mediators, were detrimental, reflecting disparities in pre-ICU health and access to care. Surprisingly, direct effects, isolating the impact of race/ethnicity while controlling for other variables, revealed a protective effect for minority patients, especially for medical admissions. This counterintuitive finding suggests a "tipping over" effect: poorer access to primary care for minority groups leads to more ICU admissions for less severe, preventable conditions, thus lowering overall ICU mortality risk. This hypothesis was supported by the higher ICU admission rates for medical reasons among Indigenous Australians compared to non-Indigenous Australians. This disparity, combined with the protective direct effect, points to underutilization of primary care as a key driver. The researchers developed the Indigenous Intensive Care Equity (IICE) Radar to monitor ICU overutilization by Indigenous Australians geographically, aiming to identify areas with the greatest disparities in primary care access and inform targeted interventions.
Equipoise calibration of clinical trial design by Fabio Rigat https://arxiv.org/abs/2501.03009
This paper introduces equipoise calibration as a novel methodology to address the disconnect between statistical significance and clinical meaningfulness in clinical trial design. The central idea is to calibrate trials to establish strong clinical equipoise imbalance, ensuring a positive outcome is not only statistically unlikely under the null hypothesis but also improbable under perfect pre-study equipoise (genuine uncertainty about the preferred treatment).
The paper examines three probabilistic models of clinical equipoise, advocating for the least informed population distribution of pre-trial odds (the BP(1,1) model) as the most practical calibrator due to its minimal pre-study assumptions. Under this model, common phase 3 superiority designs (90% power at 5% false positive rate) demonstrate nearly 95% equipoise imbalance for positive outcomes. Designs with 95% power at 5% false positive rate show even stronger imbalance, offering an operational definition of a "robustly powered study." This suggests increasing power beyond the typical 80-90% could yield more conclusive results. Specifically, the paper proposes a more conservative definition of an adequately powered study: power > 95% at a 5% false positive rate to achieve post-study odds of at least 19:1, corresponding to the 95th percentile of the BP(1,1) pre-study odds distribution. The formula for post-study odds in favor of H₁ is: ro(+) = (P(H₁)/P(H₀)) * (p(+|H₁)/p(+|H₀)).
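The arithmetic behind the 19:1 threshold can be sketched in a few lines. This is an illustration only: it assumes perfect pre-study equipoise means prior odds of 1, and it reads BP(1,1) as a beta prime distribution with both shape parameters equal to 1 (the distribution of u/(1-u) for u uniform on (0,1)). The function names are hypothetical.

```python
def post_study_odds(prior_odds, power, false_positive_rate):
    """Post-study odds for H1 after a positive trial: prior odds times the likelihood ratio."""
    return prior_odds * (power / false_positive_rate)


def bp11_quantile(p):
    """Quantile of the odds u / (1 - u) when u is uniform on (0, 1).

    Assumption: this is what BP(1,1) denotes (beta prime with both shapes equal to 1).
    """
    return p / (1.0 - p)


# Two designs mentioned in the text, both at a 5% false positive rate,
# starting from perfect equipoise (prior odds of 1, an assumption of this sketch).
for label, power in [("90% power", 0.90), ("95% power", 0.95)]:
    odds = post_study_odds(prior_odds=1.0, power=power, false_positive_rate=0.05)
    print(f"{label}: post-study odds {odds:.1f}:1")

print(f"95th percentile of pre-study odds: {bp11_quantile(0.95):.1f}")
```

Under these assumptions, a positive result from the 95% power design moves the odds from 1:1 to the 95th percentile of the pre-study odds distribution, while the 90% power design falls just short of it.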
The methodology extends to clinical development plans (CDPs) with phase 2 and 3 trials. In oncology, typical designs provide high overall equipoise imbalance when both trials are positive. However, establishing imbalance favoring the null hypothesis when a positive phase 2 isn't confirmed in phase 3 requires impractically large sample sizes for clinically meaningful effect sizes. A key finding is that a CDP with 80% power at 10% false positive rate in phase 2 and 95% power at 5% family-wise error rate (FWER) in phase 3 is the smallest option ensuring strong equipoise imbalance for both double-positive and double-negative outcomes. The paper highlights the potential of equipoise calibration and identifies areas for further research, particularly regarding different definitions of equipoise and scenarios where perfect pre-study equipoise is unattainable.
A data-driven merit order: Learning a fundamental electricity price model by Paul Ghelasi, Florian Ziel https://arxiv.org/abs/2501.02963
Caption: This graph visualizes a data-driven merit order model for electricity price forecasting. It displays the variable cost against the quantity of electricity generated, showcasing the non-linear shape of the estimated merit order curve and the resulting market price (dashed line) compared to the load (dotted line).
Accurate electricity price forecasting is crucial. This paper introduces a hybrid approach, a data-driven fundamental model, that combines the strengths of fundamental models (simulating market mechanics) and data-driven models (learning from historical patterns). The core is a merit order (supply stack) model. Instead of relying on fixed expert estimates for power plant parameters (e.g., efficiency, marginal costs), the model estimates them directly from historical data by minimizing the mean absolute error (MAE) between predicted and actual prices, as purely data-driven models do. The optimization problem is:
Θ = argmin<sub>Θ</sub> (1/N) Σ<sup>N</sup><sub>t=1</sub> |MO<sub>t</sub>(q<sub>t</sub>; Θ) - Price<sub>t</sub>|,
subject to parameter bounds. MO<sub>t</sub>(q<sub>t</sub>; Θ) is the merit order model's price prediction, q<sub>t</sub> is the load, Price<sub>t</sub> is the observed price, and Θ are the optimizable parameters (efficiencies (η), bidding prices (b), capacity correction factors (cf)). The classical merit order model is embedded within this framework, with expert guesses as initial values, allowing the model to generalize beyond these initial assumptions.
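To make the idea concrete, here is a toy sketch of fitting one merit order parameter by minimizing MAE. Everything in it is hypothetical: the three-plant stack, the capacities, the fuel price, the synthetic prices, and the use of a bounded grid search over a single gas efficiency. The paper's model has many more parameters and its own optimizer.

```python
def merit_order_price(load, plants):
    """Return the marginal cost of the last plant needed to cover the load.

    plants is a list of (capacity_mw, marginal_cost_eur_per_mwh) tuples.
    """
    ordered = sorted(plants, key=lambda plant: plant[1])  # cheapest first
    cumulative = 0.0
    for capacity, cost in ordered:
        cumulative += capacity
        if load <= cumulative:
            return cost
    return ordered[-1][1]  # load exceeds total capacity: most expensive plant sets the price


def build_plants(gas_efficiency):
    """Hypothetical three-technology stack; only the gas efficiency is estimated."""
    gas_fuel_price = 30.0  # EUR per MWh of fuel, invented for illustration
    return [
        (20.0, 0.0),                              # renewables
        (30.0, 25.0),                             # coal
        (30.0, gas_fuel_price / gas_efficiency),  # gas: cost = fuel price / efficiency
    ]


def mae(gas_efficiency, loads, prices):
    """Mean absolute error between model prices and observed prices."""
    plants = build_plants(gas_efficiency)
    errors = [abs(merit_order_price(q, plants) - p) for q, p in zip(loads, prices)]
    return sum(errors) / len(errors)


# Synthetic "observations" generated with a known efficiency of 0.55.
loads = [10.0, 25.0, 45.0, 60.0, 70.0, 75.0]
prices = [merit_order_price(q, build_plants(0.55)) for q in loads]

# Bounded grid search over efficiencies 0.30 to 0.65 in steps of 0.01.
candidates = [k / 100 for k in range(30, 66)]
best_efficiency = min(candidates, key=lambda e: mae(e, loads, prices))

expert_guess = 0.45  # hypothetical starting value
print(f"MAE at expert guess: {mae(expert_guess, loads, prices):.2f} EUR/MWh")
print(f"Estimated efficiency: {best_efficiency:.2f}")
```

Because the objective is piecewise and non-smooth in the parameters, a derivative-free search is a natural fit for a sketch like this; a grid is only practical here because a single parameter is being estimated.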
Applied to German day-ahead prices, the extended hybrid model (incorporating a gas stack split, hydro power forecasts, and a must-run stack) achieved the lowest MAE (15.23 EUR/MWh) on the test set, a 22% improvement over the econometric expert model and 61% over the naive model. Even the basic classical merit order model outperformed the expert model on average. The hybrid model's parameter estimation provides valuable insights. The estimated merit order curve is distinctly non-linear, unlike the more linear classical merit order, capturing price dynamics more effectively. Estimated parameters revealed plausible adjustments to expert guesses (e.g., higher efficiencies, more negative RES bidding prices). The model also identifies fuel switches, crucial events driven by fluctuating fuel and CO2 prices, not directly observable in econometric models.
Reconstructing ecological community dynamics from limited observations by Chandler Ross, Ville Laitinen, Moein Khalighi, Jarkko Salojärvi, Willem de Vos, Guilhem Sommeria-Klein, Leo Lahti https://arxiv.org/abs/2501.03820
Understanding complex ecological dynamics from limited data is a major challenge. This research introduces a flexible Bayesian model that leverages information across multiple short time series, providing a robust approach to understanding stability and resilience. The method combines information from short time series using Gaussian process priors for both the drift (f(x(t))) and diffusion (g(x(t))) functions of a stochastic differential equation:
dx(t) = f(x(t))dt + g(x(t))dW(t).
This non-parametric characterization avoids strong assumptions about the stability landscape and fluctuation intensity. Crucially, it recognizes that short, independent time series can effectively cover the state space, unlike methods requiring long, continuous observations.
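The data setting described above, many short trajectories of one stochastic differential equation, can be illustrated with an Euler-Maruyama simulation. This is a sketch under stated assumptions, not the paper's model: the bistable drift, the state-dependent diffusion, the step size, and the series lengths are all invented for illustration, and no Gaussian process inference is performed.

```python
import math
import random


def drift(x):
    """Hypothetical bistable drift f(x): stable states near -1 and +1, tipping point at 0."""
    return x - x ** 3


def diffusion(x):
    """Hypothetical state-dependent noise intensity g(x)."""
    return 0.3 + 0.2 * abs(x)


def simulate(x0, n_steps, dt, rng, noise_scale=1.0):
    """Euler-Maruyama scheme: x_next = x + f(x) dt + g(x) dW, with dW ~ N(0, dt)."""
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(x + drift(x) * dt + noise_scale * diffusion(x) * dw)
    return path


rng = random.Random(0)

# Fifty short series (six observations each) started at scattered points,
# so that together they cover the state space even though each one is brief.
short_series = [
    simulate(rng.uniform(-2.0, 2.0), n_steps=5, dt=0.1, rng=rng)
    for _ in range(50)
]

# With the noise switched off, a path relaxes to the nearest stable state.
noise_free = simulate(0.5, n_steps=2000, dt=0.01, rng=rng, noise_scale=0.0)
print(f"Noise-free path from 0.5 ends at {noise_free[-1]:.3f}")
```

Pooling the increments from all short series is what would give an estimator information about f and g across the whole range of x, which is the point the text makes about short, independent time series.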
Validated using simulated data from the cusp catastrophe model and a custom stochastic differential equation, the model accurately identified stable and tipping regions even with sparse data, achieving comparable performance to existing techniques with significantly fewer data points. Applied to human gut microbiota data, it identified bistability in five genus-level groups, challenging previous findings based on cross-sectional data. The model also provides a probabilistic estimate of exit time (a measure of resilience). This research highlights the importance of distinguishing modality (number of peaks in a distribution) from stability, demonstrating how non-constant diffusion can lead to misleading interpretations of stability based solely on observed frequencies.
Knowledge Distillation with Adapted Weight by Sirong Wu, Xi Luo, Junjie Liu, Yuhui Deng https://arxiv.org/abs/2501.02705
Large language models (LLMs) are powerful but resource-intensive. Knowledge distillation addresses this by transferring knowledge from a large teacher to a smaller student model. This paper introduces Knowledge Distillation with Adaptive Influence Weight (KD-AIF), which uses influence functions to weight training data based on their impact on the student's generalization ability. KD-AIF prioritizes data that positively contributes to performance on unseen data. Influence functions estimate the impact of each training point on the model's parameters. By incorporating these scores, the teacher adapts dynamically, providing personalized guidance. The weight function is:
w<sub>i</sub> = 2 / (1 + φ<sub>i</sub>(θ<sub>s</sub>)),
where φ<sub>i</sub>(θ<sub>s</sub>) is the normalized influence score of the i-th data point. This downweights data points that negatively impact generalization.
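A minimal sketch of how such a weight could enter a training loss follows. It assumes the normalized influence scores lie in [0, 1], so that weights fall between 1 and 2; the summary does not state the normalization, and the function names and example values are hypothetical.

```python
def influence_weights(normalized_scores):
    """Apply w_i = 2 / (1 + phi_i) to each normalized influence score phi_i."""
    return [2.0 / (1.0 + phi) for phi in normalized_scores]


def weighted_mean_loss(per_example_losses, weights):
    """Average the per-example losses after scaling each one by its weight."""
    total = sum(w * loss for w, loss in zip(weights, per_example_losses))
    return total / len(per_example_losses)


# Assumed score range [0, 1]: a higher score yields a lower weight.
scores = [0.0, 0.5, 1.0]
weights = influence_weights(scores)
losses = [0.2, 0.4, 0.9]  # invented per-example distillation losses

print("weights:", [round(w, 3) for w in weights])
print("weighted loss:", round(weighted_mean_loss(losses, weights), 3))
```

The weight decreases monotonically in the score, so under this assumed convention the examples with the highest scores contribute least to the student's loss.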
Evaluated on image classification (CIFAR-100, CIFAR-10-4k, SVHN-1k) and NLP (GLUE) tasks, KD-AIF consistently outperformed other methods. On CIFAR-100, it achieved accuracy improvements of 1-3%. On GLUE, it achieved state-of-the-art results. KD-AIF also demonstrated robustness to noisy labels, maintaining superior performance compared to traditional methods. Extended to semi-supervised learning, it showed significant improvements on CIFAR-10-4k and SVHN-1k.
Caption: This diagram illustrates the Knowledge Distillation with Adaptive Influence Weight (KD-AIF) framework. A teacher model, trained on labeled data, generates pseudo-labels for unlabeled data. The student model is then trained on both labeled and pseudo-labeled data, using influence weights (ω<sub>l</sub> and ω<sub>u</sub>) derived from the student model's sensitivity to each data point to improve generalization performance.
This newsletter highlights recent advances in statistical modeling and machine learning. The causal fairness analysis decomposes racial disparities in ICU mortality into confounded, indirect, and direct pathways, and equipoise calibration offers a stricter definition of an "adequately powered" clinical trial. The data-driven merit order model improves electricity price forecasts for the German day-ahead market while keeping its parameters interpretable, and a Bayesian approach reconstructs ecological dynamics from many short time series. Finally, adaptive influence weights in knowledge distillation improve the accuracy and noise robustness of smaller student models. Together, these contributions show how the same statistical ideas carry across healthcare, energy, ecology, and machine learning.