Your paper timeline
Scroll AI takes the way you would scroll a great paper aggregator: quick signal first, deeper critique when something earns your attention, and challenges when a claim feels off.
181 papers in cs.LG
Trending mixes fresh papers with community signal.
0
cs.RO cs.LG Ruiqi Xian, Jing Liang, He Yin et al. · Mar 23, 2026

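GaussianSSC is a two-stage, triplane-guided method for monocular semantic scene completion that brings Gaussian-style anisotropic support into the voxel grid without maintaining a separate Gaussian set. Gaussian Anchoring aggregates fused FPN features with sub-pixel Gaussian weights to tighten voxel-image alignment, and a Gaussian-Triplane Refinement module combines local gathering with global aggregation. On SemanticKITTI, the abstract reports +1.8% IoU for Stage 1 occupancy and +1.8% IoU / +0.8% mIoU for Stage 2 semantics over prior baselines.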

Sections:
1. Verdict: Overall assessment - solid incremental contribution, hybrid approach is interesting, results are good but limited scope.
2. What holds up: Gaussian anchoring mechanism, two-stage design, ablation studies showing component effectiveness.
3. Main concerns: Single-frame limitation, dataset limitation (only SemanticKITTI), missing comparison with GaussianFormer, efficiency trade-offs not fully characterized, limited discussion of failure modes.
4. Evidence and comparison: Fair comparison with ETFormer/VoxFormer using same backbone, but missing key Gaussian baselines; ablations validate design choices; qualitative results show improvements.
5. Reproducibility: Good implementation details provided, standard dataset, but no code release mentioned; hyperparameters mostly specified.

We present GaussianSSC, a two-stage, grid-native and triplane-guided approach to semantic scene completion (SSC) that injects the benefits of Gaussians without replacing the voxel grid or maintaining a separate Gaussian set. We introduce Gaussian Anchoring, a sub-pixel, Gaussian-weighted image aggregation over fused FPN features that tightens voxel-image alignment and improves monocular occupancy estimation. We further convert point-like voxel features into a learned per-voxel Gaussian field and refine triplane features via a triplane-aligned Gaussian-Triplane Refinement module that combines local gathering (target-centric) and global aggregation (source-centric). This directional, anisotropic support captures surface tangency, scale, and occlusion-aware asymmetry while preserving the efficiency of triplane representations. On SemanticKITTI (Behley et al., 2019), GaussianSSC improves Stage 1 occupancy by +1.0% Recall, +2.0% Precision, and +1.8% IoU over state-of-the-art baselines, and improves Stage 2 semantic prediction by +1.8% IoU and +0.8% mIoU.
0
cs.SD cs.LG Kazuki Matsumoto, Ren Uchida, Kohei Yatabe · Mar 23, 2026

Existing Lipschitz-constrained DNNs don't directly apply to audio amplitude modifiers (AMs) because the complex-valued reconstruction breaks continuity. This paper proves that AMs are generally not Lipschitz continuous, derives sufficient conditions for Lipschitz continuity (Assumption 3), and proposes LipsAM architectures that enforce these bounds via element-wise minimum and ReLU operations. The work matters because it enables certified robust amplitude modification and stabilizes Plug-and-Play algorithms where conventional AMs diverge.

The robustness of deep neural networks (DNNs) can be certified through their Lipschitz continuity, which has made the construction of Lipschitz-continuous DNNs an active research field. However, DNNs for audio processing have not been a major focus due to their poor compatibility with existing results. In this paper, we consider the amplitude modifier (AM), a popular architecture for handling audio signals, and propose its Lipschitz-continuous variants, which we refer to as LipsAM. We prove a sufficient condition for an AM to be Lipschitz continuous and propose two architectures as examples of LipsAM. The proposed architectures were applied to a Plug-and-Play algorithm for speech dereverberation, and their improved stability is demonstrated through numerical experiments.
0
cs.LG stat.ML Paolo Toccaceli · Mar 23, 2026

This paper addresses conditional distribution estimation for regression by proposing a non-parametric binning approach. Observations sorted by a one-dimensional covariate are partitioned into contiguous bins via dynamic programming, minimizing a closed-form leave-one-out CRPS cost function. The method produces conformal prediction sets with finite-sample marginal coverage guarantees and connects to Venn predictors, offering substantially narrower intervals than standard split-conformal methods on heteroscedastic and bimodal benchmarks.

We propose a method for non-parametric conditional distribution estimation based on partitioning covariate-sorted observations into contiguous bins and using the within-bin empirical CDF as the predictive distribution. Bin boundaries are chosen to minimise the total leave-one-out Continuous Ranked Probability Score (LOO-CRPS), which admits a closed-form cost function with $O(n^2 \log n)$ precomputation and $O(n^2)$ storage; the globally optimal $K$-partition is recovered by a dynamic programme in $O(n^2 K)$ time. Minimising the within-sample LOO-CRPS turns out to be inappropriate for selecting $K$, as it results in in-sample optimism. We instead select $K$ by evaluating test CRPS on an alternating held-out split, which yields a U-shaped criterion with a well-defined minimum. Having selected $K^*$ and fitted the full-data partition, we form two complementary predictive objects: the Venn prediction band and a conformal prediction set based on CRPS as the nonconformity score, which carries a finite-sample marginal coverage guarantee at any prescribed level $\varepsilon$. On real benchmarks against split-conformal competitors (Gaussian split conformal, CQR, and CQR-QRF), the method produces substantially narrower prediction intervals while maintaining near-nominal coverage.
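The binning idea can be illustrated with a minimal sketch. It assumes the responses are already sorted by the covariate, and it computes the leave-one-out CRPS of each candidate bin by brute force, using the standard identity CRPS(F, y) = E|X - y| - 0.5 E|X - X'| for an empirical CDF. This is not the closed-form cost the abstract describes and is far slower; the function names `loo_crps_cost` and `best_partition` are hypothetical. Only the dynamic programme over contiguous bins mirrors the stated $O(n^2 K)$ recursion.

```python
import math


def loo_crps_cost(ys):
    """Total leave-one-out CRPS of one bin (brute force, illustration only)."""
    if len(ys) < 2:
        return math.inf  # a single point has no leave-one-out predictive CDF
    total = 0.0
    for i, y in enumerate(ys):
        others = ys[:i] + ys[i + 1:]
        k = len(others)
        spread_to_y = sum(abs(o - y) for o in others) / k
        internal = sum(abs(a - b) for a in others for b in others) / (2 * k * k)
        total += spread_to_y - internal
    return total


def best_partition(ys, num_bins):
    """Optimal split of ys into num_bins contiguous bins, as (start, end) slices."""
    n = len(ys)
    # cost[i][j] is the cost of the bin ys[i:j]
    cost = [[math.inf] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 2, n + 1):
            cost[i][j] = loo_crps_cost(ys[i:j])
    # best[k][j] is the minimal cost of covering ys[:j] with k bins
    best = [[math.inf] * (n + 1) for _ in range(num_bins + 1)]
    back = [[0] * (n + 1) for _ in range(num_bins + 1)]
    best[0][0] = 0.0
    for k in range(1, num_bins + 1):
        for j in range(1, n + 1):
            for i in range(j):
                candidate = best[k - 1][i] + cost[i][j]
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    back[k][j] = i
    if math.isinf(best[num_bins][n]):
        raise ValueError("no feasible partition with at least two points per bin")
    bounds, j = [], n
    for k in range(num_bins, 0, -1):
        i = back[k][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[num_bins][n], bounds
```

Calling `best_partition(ys, K)` returns the total cost and the bin boundaries; choosing `K` itself would still need the held-out criterion described in the abstract.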
0
cs.LG Weilin Wan, Jingtao Han, Weizhong Zhang et al. · Mar 23, 2026

This paper tackles the combinatorial explosion in Mixture-of-Experts (MoE) architecture design, where traditional scaling laws either add too many variables to fit reliably or isolate MoE components while ignoring global interactions. The authors propose a holistic framework that uses algebraic constraints and a rank-preserving property of the hidden dimension $d$ to collapse the search space from $\mathcal{O}(n^{16})$ to manageable two-phase searches of $\mathcal{O}(n^3)+\mathcal{O}(n^2)$. They derive closed-form scaling laws mapping compute budgets to optimal configurations across $10^{18}$ to $3 \times 10^{20}$ FLOPs, revealing that near-optimal architectural bands widen at larger scales—providing actionable guidance for resource-efficient MoE deployment.

Scaling laws for Large Language Models govern macroscopic resource allocation, yet translating them into precise Mixture-of-Experts (MoE) architectural configurations remains an open problem due to the combinatorially vast design space. Existing MoE scaling studies are constrained by experimental budgets to either augment scaling formulas with extra MoE variables, risking unreliable fits, or fix all non-MoE factors, ignoring global interactions. We propose a reusable framework for holistic MoE architectural optimization that bridges this gap. We first show that FLOPs per token alone is an inadequate fairness metric for MoE models because differing computational densities across layer types can inflate parameters without proportional compute cost, and establish a joint constraint triad of FLOPs per token, active parameters, and total parameters. We then reduce the 16-dimensional architectural search space to two sequential low-dimensional phases through algebraic constraints and a rank-preserving property of the hidden dimension. Validated across hundreds of MoE models spanning six orders of magnitude in compute, our framework yields robust scaling laws that map any compute budget to a complete, optimal MoE architecture. A key finding is that the near-optimal configuration band widens with scale, giving practitioners quantitative flexibility to balance scaling law recommendations against infrastructure constraints.
0
stat.ML cs.LG MD Ruiz-Medina, AE Madrid, A Torres-Signes et al. · Mar 22, 2026

The paper addresses functional Gaussian Process regression on compact Riemannian manifolds, proposing a time-adaptive Empirical Bayes framework that exploits invariance of covariance kernels under isometries and spectral decomposition via Laplace–Beltrami eigenfunctions. The core idea is to work in the time-varying angular spectral domain, truncating the infinite-dimensional expansion based on functional sample size (typically logarithmic) to balance computational cost with approximation accuracy. This matters because it extends GP regression to infinite-dimensional functional settings on non-Euclidean domains while attempting to maintain computational tractability through spectral truncation schemes.

This paper proposes a new formulation of functional Gaussian Process regression in manifolds, based on an Empirical Bayes approach, in the spatiotemporal random field context. We apply the machinery of tight Gaussian measures in separable Hilbert spaces, exploiting the invariance property of covariance kernels under the group of isometries of the manifold. The identification of these measures with infinite-product Gaussian measures is then obtained via the eigenfunctions of the Laplace-Beltrami operator on the manifold. The involved time-varying angular spectra constitute the key tool for dimension reduction in the implementation of this regression approach, adopting a suitable truncation scheme depending on the functional sample size. The simulation study and synthetic data application undertaken illustrate the finite sample and asymptotic properties of the proposed functional regression predictor.
0
cond-mat.mtrl-sci cond-mat.mes-hall cs.LG Claudia Islas-Vargas, L. Ricardo Montoya, Carlos A. Vital-José et al. · Mar 23, 2026

Sodium-ion batteries need high-capacity anodes with fast ion transport, but hard carbon suffers from structural disorder and slow diffusion. This computational study uses the SpookyNet machine-learning force field with DFT to characterize aminobenzene-functionalized Janus graphene at room temperature. The work identifies a three-stage sodium storage mechanism and predicts a high capacity of ~400 mAh g$^{-1}$ with diffusion coefficients two to three orders of magnitude above hard carbon.

Sodium-ion batteries require anodes that combine high capacity, low operating voltage, fast Na-ion transport, and mechanical stability, which conventional anodes struggle to deliver. Here, we use the SpookyNet machine-learning force field (MLFF) together with all-electron density-functional theory calculations to characterize Na storage in aminobenzene-functionalized Janus graphene (Na$_x$AB) at room temperature. Simulations across state of charge reveal a three-stage storage mechanism (site-specific adsorption at aminobenzene groups and Na$_n$@AB$_m$ structure formation, followed by interlayer gallery filling), in contrast to the multi-stage pore-, graphite-interlayer-, and defect-controlled behavior in hard carbon. This leads to an OCV profile with an extended low-voltage plateau of 0.15 V vs. Na/Na$^{+}$, an estimated gravimetric capacity of $\sim$400 mAh g$^{-1}$, negligible volume change, and Na diffusivities of $\sim10^{-6}$ cm$^{2}$ s$^{-1}$, two to three orders of magnitude higher than in hard carbon. Our results establish Janus aminobenzene-graphene as a promising, structurally defined high-capacity Na-ion anode and illustrate the power of MLFF-based simulations for characterizing electrode materials.
0
stat.ML cs.LG Shailesh Garg, Souvik Chakraborty · Mar 23, 2026

Time-dependent reliability analysis for nonlinear dynamical systems under stochastic loading is computationally prohibitive with Monte Carlo simulation. CoNBONet proposes a surrogate combining DeepONet operator learning with Variable Spiking Neurons (VSNs) for sparse computation, Bayesian variational inference for uncertainty, and split conformal prediction for calibration. The goal is fast, energy-efficient inference with theoretical guarantees on reliability estimates.

Time-dependent reliability analysis of nonlinear dynamical systems under stochastic excitations is a critical yet computationally demanding task. Conventional approaches, such as Monte Carlo simulation, necessitate repeated evaluations of computationally expensive numerical solvers, leading to significant computational bottlenecks. To address this challenge, we propose CoNBONet, a neuroscience-inspired surrogate model that enables fast, energy-efficient, and uncertainty-aware reliability analysis, providing a scalable alternative to techniques such as Monte Carlo simulation. CoNBONet, short for Conformalized Neuroscience-inspired Bayesian Operator Network, leverages the expressive power of deep operator networks while integrating neuroscience-inspired neuron models to achieve fast, low-power inference. Unlike traditional surrogates such as Gaussian processes, polynomial chaos expansions, or support vector regression, which may face scalability challenges for high-dimensional, time-dependent reliability problems, CoNBONet offers fast and energy-efficient inference enabled by a neuroscience-inspired network architecture, calibrated uncertainty quantification with theoretical guarantees via split conformal prediction, and strong generalization capability through an operator-learning paradigm that maps input functions to system response trajectories. Validation of the proposed CoNBONet on various nonlinear dynamical systems demonstrates that it preserves predictive fidelity and achieves reliable coverage of failure probabilities, making it a powerful tool for robust and scalable reliability analysis in engineering design.
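The calibration step named in the abstract, split conformal prediction, can be sketched generically for a scalar regression output. This is the textbook procedure with absolute residuals as the nonconformity score, not CoNBONet's actual pipeline; the function names are hypothetical and the surrogate that produces the predictions is assumed to exist elsewhere.

```python
import math


def split_conformal_radius(cal_residuals, alpha):
    """Half-width q such that [pred - q, pred + q] has marginal coverage >= 1 - alpha.

    cal_residuals: absolute errors |y - prediction| on a held-out calibration set.
    """
    n = len(cal_residuals)
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest residual
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this coverage level
    return sorted(cal_residuals)[rank - 1]


def conformal_interval(prediction, radius):
    """Symmetric prediction interval around a point prediction."""
    return prediction - radius, prediction + radius
```

The guarantee is marginal and relies on the calibration and test points being exchangeable; it says nothing about coverage conditional on a particular input.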
0
cs.SC cs.LG Andrzej Odrzywołek · Mar 23, 2026

The paper establishes that a single binary operator $\operatorname{eml}(x,y)=\exp(x)-\ln(y)$, together with the constant $1$, suffices to generate all elementary functions—trigonometric, exponential, logarithmic, and arithmetic operations. This provides a continuous analog to the Sheffer stroke in Boolean logic, enabling uniform binary-tree representations of mathematical expressions and opening avenues for gradient-based symbolic regression using identical computational nodes.

A single two-input gate suffices for all of Boolean logic in digital hardware. No comparable primitive has been known for continuous mathematics: computing elementary functions such as sin, cos, sqrt, and log has always required multiple distinct operations. Here I show that a single binary operator, eml(x,y)=exp(x)-ln(y), together with the constant 1, generates the standard repertoire of a scientific calculator. This includes constants such as $e$, $\pi$, and $i$; arithmetic operations including $+$, $-$, $\times$, $/$, and exponentiation as well as the usual transcendental and algebraic functions. For example, $e^x=\operatorname{eml}(x,1)$, $\ln x=\operatorname{eml}(1,\operatorname{eml}(\operatorname{eml}(1,x),1))$, and likewise for all other operations. That such an operator exists was not anticipated; I found it by systematic exhaustive search and established constructively that it suffices for the concrete scientific-calculator basis. In EML (Exp-Minus-Log) form, every such expression becomes a binary tree of identical nodes, yielding a grammar as simple as $S \to 1 \mid \operatorname{eml}(S,S)$. This uniform structure also enables gradient-based symbolic regression: using EML trees as trainable circuits with standard optimizers (Adam), I demonstrate the feasibility of exact recovery of closed-form elementary functions from numerical data at shallow tree depths up to 4. The same architecture can fit arbitrary data, but when the generating law is elementary, it may recover the exact formula.
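The two identities quoted in the abstract can be checked numerically. The sketch below is restricted to positive real arguments and uses only the standard `math` module; the helper names `exp_via_eml` and `ln_via_eml` are made up for this illustration, and the constant `E_CONST` follows from setting $x = 1$ in $e^x = \operatorname{eml}(x, 1)$.

```python
import math


def eml(x, y):
    """Exp-Minus-Log operator: exp(x) - ln(y), real-valued, requires y > 0."""
    return math.exp(x) - math.log(y)


def exp_via_eml(x):
    """e^x = eml(x, 1), because ln(1) = 0."""
    return eml(x, 1.0)


def ln_via_eml(x):
    """ln(x) = eml(1, eml(eml(1, x), 1)) for x > 0.

    Inner step:  eml(1, x)        = e - ln(x)
    Middle step: eml(e - ln x, 1) = exp(e - ln x) = e^e / x
    Outer step:  eml(1, e^e / x)  = e - (e - ln x) = ln(x)
    """
    return eml(1.0, eml(eml(1.0, x), 1.0))


E_CONST = eml(1.0, 1.0)  # e itself, built from the constant 1 alone
```

Each function is a small binary tree whose only leaves are `1` and the input, which matches the grammar $S \to 1 \mid \operatorname{eml}(S,S)$ given in the abstract.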
0
cs.LG stat.ML Shreeram Murali, Cristian R. Rojas, Dominik Baumann · Mar 23, 2026

The paper proposes a non-parametric classifier based on the Nadaraya-Watson (NW) estimator that achieves linear $O(n)$ computational complexity while providing frequentist uncertainty bounds on predictions. By reformulating kernel regression for multi-class classification and deriving error bounds under Lipschitz continuity or separability assumptions, the authors bridge the gap between efficient "black box" methods and computationally expensive approaches like Gaussian Processes that offer formal guarantees. The method achieves $>96\%$ accuracy on MIT-BIH ECG data with uncertainty intervals that flag low-confidence predictions, making it suitable for safety-critical applications.

While both classical and neural network classifiers can achieve high accuracy, they fall short of offering uncertainty bounds on their predictions, making them unfit for safety-critical applications. Existing kernel-based classifiers that provide such bounds scale with $\mathcal{O}(n^{\sim 3})$ in time, making them computationally intractable for large datasets. To address this, we propose a novel, computationally efficient classification algorithm based on the Nadaraya-Watson estimator, for whose estimates we derive frequentist uncertainty intervals. We evaluate our classifier on synthetically generated data and on electrocardiographic heartbeat signals from the MIT-BIH Arrhythmia database. We show that the method achieves competitive accuracy $>96\%$ at $\mathcal{O}(n)$ and $\mathcal{O}(\log n)$ operations, while providing actionable uncertainty bounds. These bounds can, e.g., aid in flagging low-confidence predictions, making them suitable for real-time settings with resource constraints, such as diagnostic monitoring or implantable devices.
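For intuition, a Nadaraya-Watson estimator can be turned into a classifier by averaging one-hot labels with kernel weights. The sketch below assumes one-dimensional inputs and a Gaussian kernel with a hand-picked bandwidth; it costs $O(n)$ per query and does not reproduce the paper's uncertainty intervals or its $O(\log n)$ variant. The function names are hypothetical.

```python
import math


def nw_class_scores(x, train_x, train_y, num_classes, bandwidth=1.0):
    """Kernel-weighted class frequencies at query point x (they sum to 1)."""
    scores = [0.0] * num_classes
    for xi, yi in zip(train_x, train_y):
        # Gaussian kernel weight of training point xi, added to its class
        scores[yi] += math.exp(-0.5 * ((x - xi) / bandwidth) ** 2)
    total = sum(scores)
    if total == 0.0:
        # All weights underflowed (query far from the data): fall back to uniform
        return [1.0 / num_classes] * num_classes
    return [s / total for s in scores]


def nw_predict(x, train_x, train_y, num_classes, bandwidth=1.0):
    """Predicted label: the class with the largest kernel-weighted score."""
    scores = nw_class_scores(x, train_x, train_y, num_classes, bandwidth)
    return max(range(num_classes), key=scores.__getitem__)
```

A score vector close to uniform is one informal signal of a low-confidence prediction; the formal bounds in the paper are a separate construction.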
0
cs.LG Koichi Tanaka, Kazuki Kawamura, Takanori Muroi et al. · Mar 23, 2026

The paper tackles Off-Policy Evaluation (OPE) for ranking policies when the logging policy is deterministic—a common industrial scenario where existing estimators fail due to lack of common support. The key insight is to replace action-propensity weighting with click-propensity weighting, yielding the Click-based IPS (CIPS) estimator that leverages intrinsic user stochasticity even when the logging policy has none. This shifts the support requirement from ranking-wise or position-wise action overlap to click-wise overlap, enabling low-bias estimation in previously intractable deterministic settings.

Off-Policy Evaluation (OPE) is an important practical problem in algorithmic ranking systems, where the goal is to estimate the expected performance of a new ranking policy using only offline logged data collected under a different, logging policy. Existing estimators, such as the ranking-wise and position-wise inverse propensity score (IPS) estimators, require the data collection policy to be sufficiently stochastic and suffer from severe bias when the logging policy is fully deterministic. In this paper, we propose novel estimators, Click-based Inverse Propensity Score (CIPS), exploiting the intrinsic stochasticity of user click behavior to address this challenge. Unlike existing methods that rely on the stochasticity of the logging policy, our approach uses click probability as a new form of importance weighting, enabling low-bias OPE even under deterministic logging policies where existing methods incur substantial bias. We provide theoretical analyses of the bias and variance properties of the proposed estimators and show, through synthetic and real-world experiments, that our estimators achieve significantly lower bias compared to strong baselines, for a range of experimental settings with completely deterministic logging policies.
0
cs.CR cs.AI cs.CL Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera · Mar 23, 2026

SecureBreak introduces a response-level safety dataset designed to detect harmful LLM outputs that bypass alignment mechanisms. Unlike existing benchmarks that classify prompts, this work focuses on binary classification of generated responses (safe vs. unsafe) across 3,059 samples from multiple model families including Llama, Qwen, Gemma, and Mistral. The core value proposition is providing a 'last-line defense' layer for post-generation filtering and supervisory signals to guide security re-alignment, addressing the growing threat of jailbreak attacks.

Large language models are becoming pervasive core components in many real-world applications. As a consequence, security alignment represents a critical requirement for their safe deployment. Although previous related works focused primarily on model architectures and alignment methodologies, these approaches alone cannot ensure the complete elimination of harmful generations. This concern is reinforced by the growing body of scientific literature showing that attacks, such as jailbreaking and prompt injection, can bypass existing security alignment mechanisms. As a consequence, additional security strategies are needed both to provide qualitative feedback on the robustness of the obtained security alignment at the training stage, and to create an "ultimate" defense layer to block unsafe outputs possibly produced by deployed models. To provide a contribution in this scenario, this paper introduces SecureBreak, a safety-oriented dataset designed to support the development of AI-driven solutions for detecting harmful LLM outputs caused by residual weaknesses in security alignment. The dataset is highly reliable due to careful manual annotation, where labels are assigned conservatively to ensure safety. It performs well in detecting unsafe content across multiple risk categories. Tests with pre-trained LLMs show improved results after fine-tuning on SecureBreak. Overall, the dataset is useful both for post-generation safety filtering and for guiding further model alignment and security improvements.
0
cs.LG cs.SY eess.SY Ehimare Okoyomon, Christoph Goebel · Mar 23, 2026

This paper introduces BOOST-RPF, which tackles power flow analysis in distribution grids by reformulating voltage prediction from a global graph regression task into a sequential path-based learning problem. The key insight is leveraging the radial (tree) topology of distribution networks to decompose them into root-to-leaf paths, then using XGBoost to predict local voltage drops between parent-child bus pairs. This approach aims to combine the speed of machine learning with the size-agnostic, recursive inductive bias of classical solvers like DistFlow.

Accurate power flow analysis is critical for modern distribution systems, yet classical solvers face scalability issues, and current machine learning models often struggle with generalization. We introduce BOOST-RPF, a novel method that reformulates voltage prediction from a global graph regression task into a sequential path-based learning problem. By decomposing radial networks into root-to-leaf paths, we leverage gradient-boosted decision trees (XGBoost) to model local voltage-drop regularities. We evaluate three architectural variants: Absolute Voltage, Parent Residual, and Physics-Informed Residual. This approach aligns the model architecture with the recursive physics of power flow, ensuring size-agnostic application and superior out-of-distribution robustness. Benchmarked against the Kerber Dorfnetz grid and the ENGAGE suite, BOOST-RPF achieves state-of-the-art results with its Parent Residual variant which consistently outperforms both analytical and neural baselines in standard accuracy and generalization tasks. While global Multi-Layer Perceptrons (MLPs) and Graph Neural Networks (GNNs) often suffer from performance degradation under topological shifts, BOOST-RPF maintains high precision across unseen feeders. Furthermore, the framework displays linear $O(N)$ computational scaling and significantly increased sample efficiency through per-edge supervision, offering a scalable and generalizable alternative for real-time distribution system operator (DSO) applications.
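The path decomposition step can be sketched on a toy radial feeder. The example assumes the network is given as a parent-to-children mapping rooted at the substation; the bus names and function names are invented for illustration, and the regression model that would consume the parent-child pairs (XGBoost in the paper) is left out.

```python
def root_to_leaf_paths(children, root):
    """All root-to-leaf bus sequences of a radial (tree-shaped) network."""
    paths = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        kids = children.get(path[-1], [])
        if not kids:
            paths.append(path)  # reached a leaf bus
        for kid in kids:
            stack.append(path + [kid])
    return paths


def parent_child_pairs(paths):
    """Unique (parent, child) edges along the paths: one training sample per edge."""
    pairs = set()
    for path in paths:
        pairs.update(zip(path, path[1:]))
    return pairs


# Toy feeder: substation -> bus_a -> bus_c, and substation -> bus_b
feeder = {"substation": ["bus_a", "bus_b"], "bus_a": ["bus_c"]}
paths = root_to_leaf_paths(feeder, "substation")
```

Because each sample is a local parent-child relation, the same model can in principle be applied along paths of any length, which is the size-agnostic property the abstract highlights.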
0
cs.LG stat.ML Mohammed Abdullah, George Iosifidis, Salah Eddine Elayoubi et al. · Mar 22, 2026

The paper tackles Constrained Online Convex Optimization with Memory (COCO-M), where both losses and constraints depend on a window of past decisions, capturing realistic scenarios like smart-grid budgets and battery health limits. The authors propose the first algorithms achieving sublinear regret and cumulative constraint violation (CCV) under adversarial, time-varying constraints, both with and without unreliable predictions of future gradients. This work bridges the gap between classical constrained OCO and practical memory-dependent control problems.

We study Constrained Online Convex Optimization with Memory (COCO-M), where both the loss and the constraints depend on a finite window of past decisions made by the learner. This setting extends the previously studied unconstrained online optimization with memory framework and captures practical problems such as the control of constrained dynamical systems and scheduling with reconfiguration budgets. For this problem, we propose the first algorithms that achieve sublinear regret and sublinear cumulative constraint violation under time-varying constraints, both with and without predictions of future loss and constraint functions. Without predictions, we introduce an adaptive penalty approach that guarantees sublinear regret and constraint violation. When short-horizon and potentially unreliable predictions are available, we reinterpret the problem as online learning with delayed feedback and design an optimistic algorithm whose performance improves as prediction accuracy improves, while remaining robust when predictions are inaccurate. Our results bridge the gap between classical constrained online convex optimization and memory-dependent settings, and provide a versatile learning toolbox with diverse applications.
0
cs.IR cs.LG Ounnaci Iddir, Ahmed-ouamer Rachid, Tai Dinh · Mar 22, 2026

This paper addresses personalized information retrieval for XML documents by representing users, queries, and documents as weighted concept vectors derived from a domain ontology. The core idea is a hierarchical weighting scheme that favors specific (deeper) ontology concepts combined with a dynamic profile update mechanism that reinforces concepts based on user interactions. The work targets the limitation of traditional keyword-based systems that return identical results regardless of user knowledge or preferences.

This paper addresses the challenge of improving information retrieval from semi-structured eXtensible Markup Language (XML) documents. Traditional information retrieval systems (IRS) often overlook user-specific needs and return identical results for the same query, despite differences in users' knowledge, preferences, and objectives. We integrate external semantic resources, namely a domain ontology and user profiles, into the retrieval process. Documents, queries, and user profiles are represented as vectors of weighted concepts. The ontology applies a concept-weighting mechanism that emphasizes highly specific concepts, as lower-level nodes in the hierarchy provide more precise and targeted information. Relevance is assessed using semantic similarity measures that capture conceptual relationships beyond keyword matching, enabling personalized and fine-grained matching among user profiles, queries, and documents. Experimental results show that combining ontologies with user profiles improves retrieval effectiveness, achieving higher precision and recall than keyword-based approaches. Overall, the proposed framework enhances the relevance and adaptability of XML search results, supporting more user-centered retrieval.
0
cs.LG Ghifari Adam Faza, Jolan Wauters, Fabio Cuzzolin et al. · Mar 22, 2026

Interval uncertainty propagation typically requires solving expensive optimization problems for each input, making it infeasible for high-fidelity physics simulations. This paper proposes Direct Interval Propagation (DIP), reframing the task as interval-valued regression using neural surrogates to bypass optimization entirely. The authors extend DeepONet architectures to handle interval inputs and benchmark three distinct approaches, namely naive regression, bound propagation (IBP/CROWN), and interval neural networks, demonstrating orders-of-magnitude speedups on benchmark problems.

In engineering, uncertainty propagation aims to characterise system outputs under uncertain inputs. For interval uncertainty, the goal is to determine output bounds given interval-valued inputs, which is critical for robust design optimisation and reliability analysis. However, standard interval propagation relies on solving optimisation problems that become computationally expensive for complex systems. Surrogate models alleviate this cost but typically replace only the evaluator within the optimisation loop, still requiring many inference calls. To overcome this limitation, we reformulate interval propagation as an interval-valued regression problem that directly predicts output bounds. We present a comprehensive study of neural network-based surrogate models, including multilayer perceptrons (MLPs) and deep operator networks (DeepONet), for this task. Three approaches are investigated: (i) naive interval propagation through standard architectures, (ii) bound propagation methods such as Interval Bound Propagation (IBP) and CROWN, and (iii) interval neural networks (INNs) with interval weights. Results show that these methods significantly improve computational efficiency over traditional optimisation-based approaches while maintaining accurate interval estimates. We further discuss practical limitations and open challenges in applying interval-based propagation methods.
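Of the three approaches listed, Interval Bound Propagation is the simplest to show in a few lines. The sketch below pushes an input box through one affine layer followed by a ReLU, using plain Python lists; the weights are arbitrary illustrative numbers, and nothing here reflects the paper's DeepONet extension or its trained surrogates.

```python
def affine_interval(lower, upper, weights, bias):
    """Bounds of W x + b when each x[i] lies in [lower[i], upper[i]]."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                # A negative weight swaps which endpoint gives the extreme value
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_interval(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Input box [0, 1] x [1, 2] through a 2x2 layer and a ReLU
W = [[1.0, -2.0], [0.5, 0.5]]
b = [0.0, -1.0]
pre_lo, pre_hi = affine_interval([0.0, 1.0], [1.0, 2.0], W, b)
post_lo, post_hi = relu_interval(pre_lo, pre_hi)
```

The resulting bounds are guaranteed to enclose the true output range but tend to loosen as layers are stacked, which is one motivation for tighter methods such as CROWN.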
0
cs.LG Dip Roy, Rajiv Misra, Sanjay Kumar Singh et al. · Mar 22, 2026

This paper investigates whether mechanistic interpretability findings from image-domain VAEs transfer to tabular data using 75 independent training runs across five architectures and four tabular benchmarks. It introduces posterior-calibrated Causal Effect Strength (CES) and Feature-Group Disentanglement (FGD) to compare circuit structures across modalities, finding that tabular VAEs exhibit ~50% lower modularity and that β-VAEs suffer catastrophic capacity collapse on heterogeneous tabular data (260× CES reduction) compared to images.

Although mechanism-based interpretability has generated an abundance of insight for discriminative network analysis, generative models are less understood -- particularly outside of image-related applications. We investigate how much of the causal circuitry found within image-related variational autoencoders (VAEs) will generalize to tabular data, as VAEs are increasingly used for imputation, anomaly detection, and synthetic data generation. In addition to extending a four-level causal intervention framework to four tabular and one image benchmark across five different VAE architectures (with 75 individual training runs per architecture and three random seed values for each run), this paper introduces three new techniques: posterior-calibration of Causal Effect Strength (CES), path-specific activation patching, and Feature-Group Disentanglement (FGD). The results from our experiments demonstrate that: (i) Tabular VAEs have circuits with modularity that is approximately 50% lower than their image counterparts. (ii) $\beta$-VAE experiences nearly complete collapse in CES scores when applied to heterogeneous tabular features (0.043 CES score for tabular data compared to 0.133 CES score for images), which can be directly attributed to reconstruction quality degradation (r = -0.886 correlation coefficient between CES and MSE). (iii) CES successfully captures nine of eleven statistically significant architecture differences using Holm-Šidák corrections. (iv) Interventions with high specificity predict the highest downstream AUC values (r = 0.460, p < .001). This study challenges the common assumption that architectural guidance from image-related studies can be transferred to tabular datasets.
physics.comp-ph cs.LG Shailesh Garg, Luis Mandl, Somdatta Goswami et al. · Mar 23, 2026

Physics-informed neural operators enable rapid surrogate modeling of PDEs but incur substantial energy costs during repeated inference, limiting deployment on edge devices. This paper proposes SPINONet, which embeds Variable Spiking Neurons (VSNs) into the branch network of a separable DeepONet architecture to enable sparse, event-driven computation while preserving continuous coordinate pathways for derivative calculation. The core insight is that structural decoupling—spiking for input encoding and dense differentiability for coordinate encoding—allows physics-informed training without redundant multiply-accumulate operations.

Energy efficiency remains a critical challenge in deploying physics-informed operator learning models for computational mechanics and scientific computing, particularly in power-constrained settings such as edge and embedded devices, where repeated operator evaluations in dense networks incur substantial computational and energy costs. To address this challenge, we introduce the Separable Physics-informed Neuroscience-inspired Operator Network (SPINONet), a neuroscience-inspired framework that reduces redundant computation across repeated evaluations while remaining compatible with physics-informed training. SPINONet incorporates regression-friendly neuroscience-inspired spiking neurons through an architecture-aware design that enables sparse, event-driven computation, improving energy efficiency while preserving the continuous, coordinate-differentiable pathways required for computing spatio-temporal derivatives. We evaluate SPINONet on a range of partial differential equations representative of computational mechanics problems, with spatial, temporal, and parametric dependencies in both time-dependent and steady-state settings, and demonstrate predictive performance comparable to conventional physics-informed operator learning approaches despite the induced sparse communication. In addition, limited data supervision in a hybrid setup is shown to improve performance in challenging regimes where purely physics-informed training may converge to spurious solutions. Finally, we provide an analytical discussion linking architectural components and design choices of SPINONet to reductions in computational load and energy consumption.
cs.LG cs.CL Pawel Batorski, Paul Swoboda · Mar 22, 2026

The paper addresses the brittleness of in-context learning (ICL) to example ordering, an intractable $n!$ search problem. It proposes PLR, which reframes discrete permutation search as learning a Plackett-Luce distribution that concentrates probability mass on high-performing orderings. Using Gumbel perturb-and-sort for efficient sampling, PLR optimizes task-level metrics directly without requiring finite label spaces, extending naturally to open-ended reasoning tasks like mathematical problem solving.

In-context learning (ICL) adapts large language models by conditioning on a small set of ICL examples, avoiding costly parameter updates. Among other factors, performance is often highly sensitive to the ordering of the examples. However, exhaustive search over the $n!$ possible orderings is infeasible. Therefore, more efficient ordering methods either use model confidence measures (e.g., label-probability entropy) over label sets or search directly for the best ordering. We propose PLR, a probabilistic approach to in-context example ordering that replaces discrete ordering search with learning a Plackett-Luce distribution over orderings. PLR iteratively updates the distribution's parameters to concentrate probability mass on high-performing orderings under a task-level metric. Candidate orderings are sampled efficiently via a Gumbel perturb-and-sort procedure. Experiments on multiple classification benchmarks show that PLR consistently improves few-shot accuracy for $k \in \{4, 8, 16, 32\}$ examples, and we further demonstrate gains on mathematical reasoning tasks where label-based ordering methods are not applicable. Our code is available at https://github.com/Batorskq/PLR.
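The Gumbel perturb-and-sort step mentioned above can be sketched as follows. This is a generic illustration of sampling from a Plackett-Luce distribution, not the PLR implementation: the names `sample_ordering` and `ordering_log_prob` are hypothetical, and the parameter update that PLR applies between sampling rounds is not specified in the abstract and is left out.

```python
import math
import random


def _logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def sample_ordering(log_scores, rng):
    """Draw one ordering from a Plackett-Luce distribution.

    Adding independent standard Gumbel noise to each log-score and sorting
    in descending order gives an exact Plackett-Luce sample.
    """
    keys = []
    for index, score in enumerate(log_scores):
        # -log(Exp(1)) is a standard Gumbel draw; the floor avoids log(0).
        exp_draw = max(rng.expovariate(1.0), 1e-300)
        keys.append((score - math.log(exp_draw), index))
    keys.sort(reverse=True)
    return [index for _, index in keys]


def ordering_log_prob(log_scores, ordering):
    """Log-probability of an ordering under the same Plackett-Luce model."""
    remaining = list(ordering)
    total = 0.0
    for item in ordering:
        # Each position picks `item` from the not-yet-placed items
        # with probability softmax(log_scores) restricted to them.
        total += log_scores[item] - _logsumexp([log_scores[j] for j in remaining])
        remaining.remove(item)
    return total
```

Under these assumptions, a training loop would call `sample_ordering(log_scores, random.Random(0))` to propose candidate orderings, score each with the task metric, and use `ordering_log_prob` when adjusting `log_scores` toward the better candidates.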
cs.AI cs.LG Syed Usama Imtiaz, Mitra Nasr Azadani, Nasrin Alamdari · Mar 23, 2026

Foundation models for Earth observation risk learning spurious correlations when pretraining with random masking. This paper proposes SpecTM (Spectral Targeted Masking), which deterministically masks pigment-sensitive spectral bands (phycocyanin, chlorophyll-a, red-edge) to enforce physics-based cross-spectral learning. Validated on microcystin concentration prediction using NASA PACE hyperspectral imagery over Lake Erie, the method achieves $R^2=0.695$ (current week) and $R^2=0.620$ (8-day-ahead), showing strong label efficiency but limited geographic validation.

Foundation models are increasingly being developed for Earth observation (EO), yet they often rely on stochastic masking that does not explicitly enforce physics constraints. This is a critical trustworthiness limitation, particularly for predictive models that guide public health decisions. In this work, we propose SpecTM (Spectral Targeted Masking), a physics-informed masking design that encourages the reconstruction of targeted bands from cross-spectral context during pretraining. To achieve this, we developed an adaptable multi-task (band reconstruction, bio-optical index inference, and 8-day-ahead temporal prediction) self-supervised learning (SSL) framework that encodes spectrally intrinsic representations via joint optimization, and evaluated it on a downstream microcystin concentration regression task using NASA PACE hyperspectral imagery over Lake Erie. SpecTM achieves $R^2 = 0.695$ (current week) and $R^2 = 0.620$ (8-day-ahead), surpassing all baseline models by +34% (Ridge, 0.51) and +99% (SVR, 0.31), respectively. Our ablation experiments show that targeted masking improves predictions by +0.037 $R^2$ over random masking. Furthermore, it outperforms strong baselines with 2.2x superior label efficiency under extreme scarcity. SpecTM enables physics-informed representation learning across EO domains and improves the interpretability of foundation models.
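To make the contrast with random masking concrete, the block below sketches a deterministic band mask over a hyperspectral wavelength grid. Everything specific in it is an assumption for illustration: the target wavelengths (approximate textbook absorption features near 620 nm, 665 nm, and 705 nm), the window half-width, and the function names are not taken from the paper.

```python
# Hypothetical target centers in nanometres (illustrative assumptions,
# not the band definitions used by SpecTM).
TARGET_NM = {
    "phycocyanin": 620.0,
    "chlorophyll_a": 665.0,
    "red_edge": 705.0,
}


def targeted_band_mask(wavelengths_nm, targets_nm=None, half_width_nm=5.0):
    """Return one boolean per band: True where the band should be masked.

    The mask depends only on the wavelength grid, so the same bands are
    hidden for every sample, unlike random masking.
    """
    targets = TARGET_NM if targets_nm is None else targets_nm
    return [
        any(abs(w - center) <= half_width_nm for center in targets.values())
        for w in wavelengths_nm
    ]


def apply_mask(spectrum, mask, fill_value=0.0):
    """Replace masked bands with fill_value, leaving the input untouched."""
    return [fill_value if hidden else value for value, hidden in zip(spectrum, mask)]
```

A pretraining objective would then ask the model to reconstruct the hidden bands from the remaining spectral context; that model and its loss are outside this sketch.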
cs.CL cs.LG eess.AS Kai-Wei Chang, Yi-Cheng Lin, Huang-Cheng Chou et al. · Mar 23, 2026

This paper introduces TaigiSpeech, the first intent recognition dataset for Taiwanese Hokkien—a low-resource language spoken by 65% of Taiwanese elders. With 3,000+ utterances from 21 elderly speakers across emergency and smart-home scenarios, it addresses a critical gap in speech technology for aging populations. The authors also propose keyword-based and audio-visual mining strategies to bootstrap training data from unlabeled video sources.

Speech technologies have advanced rapidly and serve diverse populations worldwide. However, many languages remain underrepresented due to limited resources. In this paper, we introduce TaigiSpeech, a real-world speech intent dataset in Taiwanese Taigi (also known as Taiwanese Hokkien or Southern Min), a low-resource and primarily spoken language. The dataset is collected from older adults, comprising 21 speakers with a total of 3k utterances. It is designed for practical intent detection scenarios, including healthcare and home assistant applications. To address the scarcity of labeled data, we explore two data mining strategies with two levels of supervision: keyword-match data mining with LLM pseudo-labeling via an intermediate language, and an audio-visual framework that leverages multimodal cues with minimal textual supervision. This design enables scalable dataset construction for low-resource and unwritten spoken languages. TaigiSpeech will be released under the CC BY 4.0 license to facilitate broad adoption and research on low-resource and unwritten languages. The project website and the dataset can be found at https://kwchang.org/taigispeech.