This paper investigates whether large language models exhibit metacognitive control—specifically, whether they use internal confidence signals to guide abstention decisions (knowing when to answer versus withhold responses). The authors develop a rigorous four-phase paradigm combining behavioral analysis, activation steering, and computational modeling to demonstrate that abstention arises from a two-stage confidence-decision pathway involving confidence representation formation followed by threshold-based policy implementation. Their findings suggest that LLMs deploy native confidence signals in a structured manner paralleling biological metacognition, with substantial implications for safe AI deployment.
This paper investigates which static analysis alert removals actually reduce bug rates—a critical question since developers constantly face noisy linting warnings. The author employs three complementary methods: a randomized controlled trial with 521 manual interventions, labeling functions to identify intervention-like events in 8,245 natural commits, and supervised learning to predict beneficial removals. The core finding is that removing complexity alerts (too-many-branches, too-many-nested-blocks) via method extraction reduces bug tendency by 4.1–5.5 percentage points, offering evidence-based guidance for prioritizing refactoring efforts.
Soft robot simulators suffer from a sim-to-real gap that widens when optimizing morphology, because calibration parameters identified on one geometry often fail to transfer to unseen shapes. This paper proposes Residual Acceleration Field Learning (RAFL), which learns local corrective accelerations defined on quadrature elements rather than global nodal forces. By operating on deformation and velocity gradients in material space, the model becomes independent of mesh topology and discretization, enabling zero-shot generalization across geometries.
This paper introduces the Distributed Human Data Engine (DHDE), a socio-technical framework tackling 'under-vibrancy'—a condition of low visitor density suppressing economic activity—in declining regions like Fukui, Japan. Contrasting with overtourism literature, it integrates Google Business Profile search intent, Japan Meteorological Agency micro-climate data, edge-AI cameras, and 97,719 survey responses to forecast tourism flows and quantify economic leakage. The work promises algorithmic governance via 'dual-nudge' interventions to redirect visitors and coordinate merchant behavior, backed by claims of $R^2=0.810$ explanatory power.
Physics-informed neural networks typically enforce boundary conditions via penalty terms, leading to approximate satisfaction and training pathologies. This paper proposes a systematic method to enforce Dirichlet, Neumann, and Robin conditions exactly on curved quadrilateral domains using Theory of Functional Connections (TFC) combined with transfinite interpolation. The key innovation is handling compatibility constraints at vertices where mixed boundary conditions meet, particularly when two Neumann/Robin boundaries intersect, by decomposing the problem into a four-step procedure.
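To make the idea of exact enforcement concrete, here is a minimal sketch of a constrained expression built by transfinite interpolation. It assumes the simplest possible setting, which is narrower than the paper's: the unit square rather than a curved quadrilateral, and Dirichlet data only. The names `boundary`, `free_fn`, and `constrained` are hypothetical, and the Neumann/Robin handling and vertex compatibility procedure described above are not reproduced.

```python
import math

def boundary(x, y):
    """Dirichlet data (assumption for illustration: the trace of x**2 + y)."""
    return x * x + y

def free_fn(x, y):
    """Stand-in for the network output; any function may be used here."""
    return math.sin(3.0 * x) * math.cos(2.0 * y) + 0.5

def constrained(x, y, g=free_fn, b=boundary):
    """Return g plus a transfinite interpolant of the boundary mismatch e = b - g.

    On each edge of the unit square the result equals b exactly,
    whatever g is, as long as b is consistent at the four corners.
    """
    def e(s, t):
        return b(s, t) - g(s, t)

    # Blend the mismatch along the four edges.
    edges = ((1 - x) * e(0.0, y) + x * e(1.0, y)
             + (1 - y) * e(x, 0.0) + y * e(x, 1.0))
    # Subtract the bilinear corner term that the edge blend counts twice.
    corners = ((1 - x) * (1 - y) * e(0.0, 0.0) + x * (1 - y) * e(1.0, 0.0)
               + (1 - x) * y * e(0.0, 1.0) + x * y * e(1.0, 1.0))
    return g(x, y) + edges - corners
```

Because the boundary values hold for any `free_fn`, a training loss built on `constrained` would need only the PDE residual and no boundary penalty term.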
Generative policies represent actions as multi-step denoising trajectories, rendering standard PPO's single-step action-space ratios mismatched to the policy structure. This paper proposes GSB-PPO, a path-space formulation inspired by Generalized Schrödinger Bridge that lifts proximal updates from terminal actions to full generation paths. The central finding is that a penalty-based objective substantially outperforms the direct clipping extension, establishing trajectory-level regularization as the preferred inductive bias for on-policy generative RL.
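The contrast between the two objectives can be sketched in a few lines. This is an illustrative reading, not the paper's formulation: it assumes the path-space ratio is the product of per-step denoising ratios (a sum of log-probability differences), and it uses a common nonnegative per-sample KL estimate as the penalty. The function names and the `eps` and `beta` defaults are hypothetical.

```python
import math

def path_log_ratio(new_logps, old_logps):
    """Log-ratio of a whole generation path: sum of per-step log-prob differences."""
    return sum(n - o for n, o in zip(new_logps, old_logps))

def clipped_surrogate(log_ratio, advantage, eps=0.2):
    """PPO-style clipping applied to the path-level ratio."""
    r = math.exp(log_ratio)
    r_clipped = max(1.0 - eps, min(1.0 + eps, r))
    return min(r * advantage, r_clipped * advantage)

def penalized_surrogate(log_ratio, advantage, beta=1.0):
    """Unclipped surrogate minus a KL-style penalty on the path-level ratio.

    (r - 1) - log r is a nonnegative per-sample estimate of KL(old || new)
    when the path was sampled from the old policy.
    """
    r = math.exp(log_ratio)
    kl_est = r - 1.0 - log_ratio
    return r * advantage - beta * kl_est
```

Both functions return a per-sample value to be maximized. The clipped form flattens once the ratio leaves the trust interval, while the penalized form keeps a smooth gradient that pulls the path ratio back toward one.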
Training machine learning interatomic potentials (MLIPs) requires costly quantum mechanical calculations to label atomic configurations. This paper proposes using determinantal point processes (DPPs) to select diverse, informative subsets of configurations, mitigating the computational bottleneck while maintaining model accuracy. Experiments on hafnium oxide systems demonstrate that DPP-based subselection achieves competitive or superior performance compared to existing methods like k-means clustering and MaxVol, offering a probabilistic framework that naturally handles variable training set sizes.
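As a rough illustration of diversity-driven subselection, the sketch below performs greedy log-determinant maximization over a similarity kernel. This is a deterministic stand-in for the DPP, not the sampling procedure the paper evaluates, and the RBF kernel on generic feature vectors is an assumption. The names `rbf_kernel` and `greedy_dpp` are hypothetical.

```python
import math

def rbf_kernel(points, length_scale=1.0):
    """Similarity matrix L[i][j] = exp(-||xi - xj||^2 / (2 * length_scale^2))."""
    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [[math.exp(-sqdist(a, b) / (2.0 * length_scale ** 2)) for b in points]
            for a in points]

def greedy_dpp(L, k, tol=1e-10):
    """Greedily pick up to k indices that maximize det(L restricted to the picks).

    gain[i] is the conditional variance of item i given the current selection,
    which is exactly the factor by which the determinant grows if i is added.
    """
    n = len(L)
    gain = [L[i][i] for i in range(n)]
    chol = [[] for _ in range(n)]  # incremental Cholesky rows, one per item
    remaining = list(range(n))
    selected = []
    while remaining and len(selected) < k:
        j = max(remaining, key=lambda i: gain[i])
        if gain[j] < tol:
            break  # every remaining item is numerically redundant
        selected.append(j)
        remaining.remove(j)
        root = math.sqrt(gain[j])
        for i in remaining:
            e = (L[j][i] - sum(a * b for a, b in zip(chol[j], chol[i]))) / root
            chol[i].append(e)
            gain[i] -= e * e
    return selected
```

For example, given two pairs of near-duplicate configurations and one outlier, `greedy_dpp(rbf_kernel(points), 3)` returns one index from each group, because adding a near-duplicate barely increases the determinant.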
This paper addresses mesa-optimization by defining agency as a balance between curiosity (KL divergence) and empowerment (mutual information), proposing an optimization-friendly agency function and an STEC-based metric to detect mesa-optimizers. The work claims that agency functions are convex, smooth, and exhibit logarithmic convergence—suggesting high probability of spontaneous emergence in modern models.
This paper tackles the challenge of deploying traffic forecasting models in resource-constrained Wi-Fi controllers that manage thousands of access points (APs). The core idea is to use feature-based clustering (k-means on PCA-reduced features) to group APs by traffic behavior, then deploy cluster-specific LSTM models only to high-activity clusters while using a lightweight global model for low-activity clusters. The approach reduces memory footprint by approximately 40% compared to deploying complex models for all clusters, while preserving prediction accuracy through selective specialization.
This paper investigates why linear steering methods for transformers sometimes fail silently by leaking probability mass to unintended tokens. The authors show that softmax induces a Bregman geometry governed by the Hessian $H(\lambda) = \operatorname{Cov}[\gamma \mid \lambda]$, and when this Hessian is degenerate at intermediate layers, Euclidean steering becomes unreliable. Using a carefully controlled $2 \times 2$ factorial design crossing stream separation (CASCADE architecture) with per-layer supervision, they find that maintaining a frozen token stream improves Hessian conditioning by up to $22\times$ compared to standard single-stream transformers. The work provides both a diagnostic tool (cosine similarity between primal and dual directions with threshold $\sim$0.3) and an architectural fix for safer linear interventions.
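The diagnostic can be sketched under two assumptions that go beyond the summary: that $\gamma$ is the one-hot token indicator and $\lambda$ the logits, so that $H(\lambda) = \operatorname{diag}(p) - p p^\top$ with $p = \operatorname{softmax}(\lambda)$, and that the "dual direction" of a steering vector $v$ is $Hv$. The function names are hypothetical, and the 0.3 threshold is taken from the description above rather than derived here.

```python
import math

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hessian_vec(logits, v):
    """Compute H v with H = diag(p) - p p^T, without forming H explicitly."""
    p = softmax(logits)
    pv = sum(pi * vi for pi, vi in zip(p, v))
    return [pi * (vi - pv) for pi, vi in zip(p, v)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0.0 and nb > 0.0 else 0.0

def steering_alignment(logits, v):
    """Cosine between a primal direction v and its assumed dual direction H v."""
    return cosine(v, hessian_vec(logits, v))
```

A value near one means the Euclidean step and the geometry-aware step agree; under the stated assumptions, a value below the reported threshold of roughly 0.3 would flag a direction where linear steering is likely to leak probability mass.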
Bayesian neural networks (BNNs) suffer from fragmented, high-dimensional posteriors due to weight-space symmetries, raising doubts about the practicality of sampling-based inference. This paper demonstrates that overparametrization—using more hidden units than necessary—actually transforms the posterior geometry in beneficial ways. The authors identify three key phenomena induced by redundancy: balancedness (norm equalization across layers), weight reallocation on equal-probability manifolds (following Dirichlet distributions), and prior conformity (marginals aligning with zero-mean Gaussian priors). Through theory for ReLU networks and extensive experiments with up to 10 million posterior samples, the work explains why recent sampling methods succeed and provides a principled foundation for understanding weight priors in overparametrized regimes.
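One weight-space symmetry behind these effects can be shown directly. The sketch below assumes, for illustration only, that balancedness refers to equalizing a hidden unit's incoming and outgoing weight norms along the positive rescaling symmetry of ReLU; whether this matches the paper's exact definition is an assumption. The network is a toy one-hidden-layer model with scalar input, and all names are hypothetical.

```python
import math

def relu(z):
    return max(0.0, z)

def forward(x, units):
    """One-hidden-layer ReLU net with scalar input; units = [(w_in, b, w_out), ...]."""
    return sum(w_out * relu(w_in * x + b) for w_in, b, w_out in units)

def rescale(unit, c):
    """Scale incoming weights by c > 0 and the outgoing weight by 1/c.

    Since relu(c * z) = c * relu(z) for c > 0, the network output is unchanged.
    """
    w_in, b, w_out = unit
    return (c * w_in, c * b, w_out / c)

def balance(unit):
    """Move along the rescaling orbit until sqrt(w_in^2 + b^2) equals |w_out|."""
    w_in, b, w_out = unit
    n_in = math.hypot(w_in, b)
    if n_in == 0.0 or w_out == 0.0:
        return unit  # dead unit: nothing to balance
    return rescale(unit, math.sqrt(abs(w_out) / n_in))
```

Every point on a rescaling orbit defines the same function, so the likelihood is constant along it and only the prior distinguishes the points; the balanced point is the one where the norms on both sides of the unit match.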
This paper addresses stability and bifurcation analysis of nonlinear PDEs using Physics-Informed Random Projection Neural Networks (PI-RPNNs). The core innovation is a matrix-free shift-invert Krylov-Arnoldi method that operates directly in weight space, circumventing the exponential singular value decay of the random collocation matrix $\Psi$. This enables reliable computation of the leading eigenpairs needed to detect saddle-node, Hopf, and pitchfork bifurcations without any PDE solves beyond the initial training.
The paper addresses adaptive broadcast of data-intensive sensory streams (e.g., camera/LiDAR) to heterogeneous edge devices with diverse channel conditions and computational budgets. It proposes Nonlinear Transform Rateless Source-Channel Coding (NTRSCC), integrating learned nonlinear transforms with physical-layer Luby Transform (LT) codes to enable receivers to adaptively adjust the number of received symbols and belief propagation iterations. This achieves an explicit, controllable tradeoff between distortion, transmission rate, and decoding complexity—addressing key limitations of fixed-rate DeepJSCC schemes that either underserve capable devices or require costly retransmissions.
This paper investigates amortized Bayesian inference (ABI) for estimating coupling parameters in Kuramoto oscillator networks—a nonlinear dynamical system widely used to study synchronization. The authors apply neural posterior estimation via BayesFlow to learn an amortized approximation of the posterior distribution from simulated phase dynamics. While the method succeeds for simple single-parameter networks, the paper's central finding is that it fails for complex multi-node networks due to structural non-identifiability and data inefficiency—making the title's focus on 'limitations' well-earned.
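For readers unfamiliar with the forward model, the following sketch simulates the simplest case mentioned above: an all-to-all Kuramoto network with a single global coupling parameter $K$, integrated with explicit Euler steps. The step size, the function names, and the all-to-all topology are assumptions for illustration; the BayesFlow inference pipeline itself is not shown.

```python
import math

def kuramoto_step(theta, omega, K, dt):
    """One Euler step of d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)."""
    n = len(theta)
    new_theta = []
    for i in range(n):
        coupling = sum(math.sin(theta[j] - theta[i]) for j in range(n))
        new_theta.append(theta[i] + dt * (omega[i] + K / n * coupling))
    return new_theta

def simulate(theta0, omega, K, dt=0.01, steps=2000):
    """Integrate the phases forward and return the final state."""
    theta = list(theta0)
    for _ in range(steps):
        theta = kuramoto_step(theta, omega, K, dt)
    return theta

def order_parameter(theta):
    """Magnitude of the mean phase vector: 1 means full synchrony, 0 incoherence."""
    n = len(theta)
    re = sum(math.cos(t) for t in theta) / n
    im = sum(math.sin(t) for t in theta) / n
    return math.hypot(re, im)
```

A simulator of this kind is what an amortized estimator would be trained on: draw $K$ from a prior, simulate phases, and learn to map the trajectories back to $K$. With one scalar coupling the mapping is well behaved, which is consistent with the reported success in the single-parameter case.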
This paper addresses paper-code consistency detection in bioinformatics, tackling the reproducibility crisis where algorithmic descriptions in publications often diverge from software implementations. The authors introduce BioCon, a benchmark of 48 bioinformatics projects with expert-annotated sentence-code pairs, and propose a cross-modal framework using UniXcoder with weighted focal loss. While the task is important for computational biology reproducibility, claims of novelty require qualification given concurrent efforts in the broader scientific community.
ThinkJEPA addresses the limitation of JEPA-style latent world models that rely on short, densely sampled windows, which bias predictions toward local dynamics while missing long-horizon semantics. The paper proposes a dual-temporal architecture combining a dense-frame V-JEPA branch for fine-grained motion with a sparsely sampled VLM "thinker" branch that provides semantic guidance via multi-layer feature pyramids. This matters because it attempts to marry the physical consistency of latent world models with the general knowledge of vision-language models for robust trajectory forecasting.
This paper establishes information-theoretic limits on LLM steganography, proving that any semantic-preserving embedding of a payload $P$ into a covertext $M_1$ to produce stegotext $M_2$ must increase Kolmogorov complexity by at least $K(P) - O(\log n)$. Since Kolmogorov complexity is uncomputable, the authors propose perplexity ratios (specifically the Binoculars score) as a practical proxy and validate the approach on a color-based encoding scheme with 300 samples.
SparseDVFS tackles energy-efficient DNN inference on edge devices by bridging the gap between coarse model-level DVFS and prohibitively costly operator-level DVFS. The core insight is to use operator sparsity to distinguish compute-bound from memory-bound phases and to apply specialized frequency triplets through a block-level strategy. A white-box offline modeler, a greedy graph partitioner with amortization constraints, and a unified co-governor with look-ahead pipelining together achieve substantial energy savings while managing switching overheads.
This paper extends In-Context Operator Networks (ICONs)—which learn PDE solution operators via in-context learning without retraining—to higher-order and higher-dimensional PDEs. The authors test on 19 problem types including the heat equation and 3D linear PDEs, finding that while point-wise accuracy degrades for complex OOD problems, the model retains qualitative solution behavior.
This paper studies the coupling between three design axes in audio representation learning: input frontend (raw waveform vs. spectrogram), backbone architecture (Mamba vs. attention), and sequence length. The authors introduce HELIX, a minimal hybrid architecture with five bidirectional Mamba layers and one attention bottleneck at a matched capacity of 8.3M parameters. The key finding is that these choices are not independent. Raw waveforms help with Mamba but not with attention. Attention hurts on short environmental sounds but becomes critical at 30,000 tokens (5 minutes), where pure attention fails with OOM errors and HELIX closes an 11.5-point gap over pure Mamba on speaker identification.