Feed - arxlens

DSPA: Dynamic SAE Steering for Data-Efficient Preference Alignment

cs.LG cs.AI cs.CL James Wedgwood, Aashiq Muhamed, Mona T. Diab et al. · Mar 23, 2026

Preference alignment typically requires expensive weight-updating training like RLHF or DPO, which lacks mechanistic interpretability. This paper proposes DSPA, an inference-time method that dynamically steers sparse autoencoder (SAE) features based on prompt content without modifying base-model weights. By computing a sparse conditional-difference map $\mathbf{A}$ from preference triples that links prompt features to generation-control features, DSPA edits only token-active latents during decoding. The method achieves competitive open-ended generation quality with up to $4.47\times$ fewer alignment-stage FLOPs than training-based alternatives, while offering direct auditability of which features are modified and revealing that preference directions are dominated by discourse and stylistic signals.

Preference alignment is usually achieved by weight-updating training on preference data, which adds substantial alignment-stage compute and provides limited mechanistic visibility. We propose Dynamic SAE Steering for Preference Alignment (DSPA), an inference-time method that makes sparse autoencoder (SAE) steering prompt-conditional. From preference triples, DSPA computes a conditional-difference map linking prompt features to generation-control features; during decoding, it modifies only token-active latents, without base-model weight updates. Across Gemma-2-2B/9B and Qwen3-8B, DSPA improves MT-Bench and is competitive on AlpacaEval while preserving multiple-choice accuracy. Under restricted preference data, DSPA remains robust and can rival the two-stage RAHF-SCIT pipeline while requiring up to $4.47\times$ fewer alignment-stage FLOPs. Finally, we audit the SAE features DSPA modifies, finding that preference directions are dominated by discourse and stylistic signals, and provide theory clarifying the conditional-difference map estimate and when top-$k$ ablation is principled.

Counterfactual Credit Policy Optimization for Multi-Agent Collaboration

cs.AI Zhongyi Li, Wan Tian, Yikun Ban et al. · Mar 23, 2026

Collaborative multi-agent LLM systems struggle with credit assignment during RL training: shared terminal rewards obscure individual contributions, encouraging free-riding and high gradient variance. This paper introduces CCPO (Counterfactual Credit Policy Optimization), which estimates each agent's marginal contribution by contrasting actual team performance with counterfactual outcomes—simulating performance without that specific agent. The method targets efficient discrete generation for LLMs and demonstrates improved reasoning accuracy across mathematical and logical benchmarks.

Collaborative multi-agent large language models (LLMs) can solve complex reasoning tasks by decomposing roles and aggregating diverse hypotheses. Yet, reinforcement learning (RL) for such systems is often undermined by credit assignment: a shared global reward obscures individual contributions, inflating update variance and encouraging free-riding. We introduce Counterfactual Credit Policy Optimization (CCPO), a framework that assigns agent-specific learning signals by estimating each agent's marginal contribution through counterfactual trajectories. CCPO builds dynamic counterfactual baselines that simulate outcomes with an agent's contribution removed, yielding role-sensitive advantages for policy optimization. To further improve stability under heterogeneous tasks and data distributions, we propose a global-history-aware normalization scheme that calibrates advantages using global rollout statistics. We evaluate CCPO on two collaboration topologies: a sequential Think--Reason dyad and multi-agent voting. Across mathematical and logical reasoning benchmarks, CCPO mitigates free-riding and outperforms strong multi-agent RL baselines, yielding finer-grained and more effective credit assignment for collaborative LLM training. Our code is available at https://github.com/bhai114/ccpo.
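
The counterfactual-baseline idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `GlobalStats`, `counterfactual_advantages`, and `normalize`, the running mean/std normalizer, and the toy rewards are all hypothetical. Each agent's advantage is taken as the team reward minus the reward of a counterfactual rollout with that agent's contribution removed, and is then calibrated with statistics accumulated over all rollouts seen so far.

```python
import math


class GlobalStats:
    """Running mean/std over every advantage seen so far (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Population standard deviation; 0.0 until two samples are seen.
        return math.sqrt(self.m2 / self.n) if self.n > 1 else 0.0


def counterfactual_advantages(team_reward, counterfactual_rewards):
    """Marginal contribution per agent: actual reward minus reward without that agent."""
    return {agent: team_reward - r_without
            for agent, r_without in counterfactual_rewards.items()}


def normalize(advantages, stats, eps=1e-8):
    """Calibrate advantages with global (cross-rollout) statistics."""
    for a in advantages.values():
        stats.update(a)
    return {agent: (a - stats.mean) / (stats.std() + eps)
            for agent, a in advantages.items()}


# Toy rollout (hypothetical numbers): the team succeeds with reward 1.0.
# Removing the "reason" agent breaks the solution; removing "think" does not.
raw = counterfactual_advantages(1.0, {"think": 1.0, "reason": 0.0})
stats = GlobalStats()
scaled = normalize(raw, stats)
```

In this toy case the free-riding "think" agent receives a negative normalized advantage and the "reason" agent a positive one, which is the role-sensitive signal a shared terminal reward alone would not provide.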

Optimizing Feature Extraction for On-device Model Inference with User Behavior Sequences

cs.LG cs.AI cs.HC Chen Gong, Zhenzhe Zheng, Yiliu Chen et al. · Mar 23, 2026

Machine learning models on mobile devices spend 61-86% of execution time extracting features from user behavior logs rather than running inference. This paper introduces AutoFeature, a graph-based engine that eliminates redundant operations across features and consecutive executions using directed acyclic graph optimization and intelligent caching. Tested across five industrial services including TikTok and e-commerce platforms, it achieves 1.33×-4.53× end-to-end latency reduction without accuracy loss.

Machine learning models are widely integrated into modern mobile apps to analyze user behaviors and deliver personalized services. Ensuring low-latency on-device model execution is critical for maintaining high-quality user experiences. While prior research has primarily focused on accelerating model inference with given input features, we identify an overlooked bottleneck in real-world on-device model execution pipelines: extracting input features from raw application logs. In this work, we explore a new direction of feature extraction optimization by analyzing and eliminating redundant extraction operations across different model features and consecutive model inferences. We then introduce AutoFeature, an automated feature extraction engine designed to accelerate the on-device feature extraction process without compromising model inference accuracy. AutoFeature comprises three core designs: (1) graph abstraction to formulate the extraction workflows of different input features as one directed acyclic graph; (2) graph optimization to identify and fuse redundant operation nodes across different features within the graph; and (3) efficient caching to minimize operations on overlapping raw data between consecutive model inferences. We implement a system prototype of AutoFeature and integrate it into five industrial mobile services spanning search, video, and e-commerce domains. Online evaluations show that AutoFeature reduces end-to-end on-device model execution latency by 1.33x-3.93x during daytime and 1.43x-4.53x at night.
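
The graph abstraction and node fusion described above can be illustrated with a small Python sketch. It is a simplified stand-in, not AutoFeature itself: the operation names in `OPS`, the helpers `build_node` and `evaluate`, and the toy logs are hypothetical, and the cross-inference caching design is not modeled. Each feature is a chain of operations over the raw logs; identical prefixes map to the same node key, so a shared node runs once.

```python
# Hypothetical extraction operations over a list of log rows.
OPS = {
    "filter_type": lambda rows, t: [r for r in rows if r["type"] == t],
    "last_n": lambda rows, n: rows[-n:],
    "count": lambda rows, _: len(rows),
    "mean_duration": lambda rows, _: (
        sum(r["duration"] for r in rows) / len(rows) if rows else 0.0
    ),
}


def build_node(chain):
    """Turn a list of (op, param) steps into a nested, hashable node key."""
    key = "RAW"
    for op, param in chain:
        key = (op, param, key)
    return key


def evaluate(key, raw_logs, memo, counter):
    """Evaluate a node once; nodes shared across features hit the memo."""
    if key == "RAW":
        return raw_logs
    if key in memo:
        return memo[key]
    op, param, parent = key
    rows = evaluate(parent, raw_logs, memo, counter)
    counter[0] += 1  # count operations actually executed
    memo[key] = OPS[op](rows, param)
    return memo[key]


# Two features that share the "filter clicks, keep last 3" prefix.
features = {
    "click_count": [("filter_type", "click"), ("last_n", 3), ("count", None)],
    "click_mean_duration": [
        ("filter_type", "click"), ("last_n", 3), ("mean_duration", None)
    ],
}
logs = [
    {"type": "click", "duration": 2.0},
    {"type": "view", "duration": 9.0},
    {"type": "click", "duration": 4.0},
]
memo, executed = {}, [0]
values = {name: evaluate(build_node(chain), logs, memo, executed)
          for name, chain in features.items()}
```

Run separately, the two features would execute six operations; with the shared prefix fused, `executed[0]` ends at four.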

When Documents Disagree: Measuring Institutional Variation in Transplant Guidance with Retrieval-Augmented Language Models

cs.IR cs.AI Yubo Li, Ramayya Krishnan, Rema Padman · Mar 23, 2026

This paper introduces a scalable framework to measure institutional variation in solid-organ transplant patient education materials using retrieval-augmented generation (RAG). The authors ground 1,115 patient questions across 102 handbooks from 23 U.S. centers, then classify answer pairs into a five-label taxonomy (Absent, Consistent, Complementary, Divergent, Contradictory). The work exposes critical information gaps: 96.2% of question-handbook pairs miss relevant content, and 20.8% of non-absent pairs show clinically meaningful divergence, with reproductive health nearly absent (95.1%) across all materials.

Patient education materials for solid-organ transplantation vary substantially across U.S. centers, yet no systematic method exists to quantify this heterogeneity at scale. We introduce a framework that grounds the same patient questions in different centers' handbooks using retrieval-augmented language models and compares the resulting answers using a five-label consistency taxonomy. Applied to 102 handbooks from 23 centers and 1,115 benchmark questions, the framework quantifies heterogeneity across four dimensions: question, topic, organ, and center. We find that 20.8% of non-absent pairwise comparisons exhibit clinically meaningful divergence, concentrated in condition monitoring and lifestyle topics. Coverage gaps are even more prominent: 96.2% of question-handbook pairs miss relevant content, with reproductive health at 95.1% absence. Center-level divergence profiles are stable and interpretable, where heterogeneity reflects systematic institutional differences, likely due to patient diversity. These findings expose an information gap in transplant patient education materials, with document-grounded medical question answering highlighting opportunities for content improvement.

Stabilizing Iterative Self-Training with Verified Reasoning via Symbolic Recursive Self-Alignment

cs.AI Xinyu Zhang · Mar 23, 2026

Recursive self-improvement promises sustained capability growth but faces recursive drift—the compounding of errors when models train on self-generated outputs. This paper proposes Neuro-Symbolic Recursive Self-Alignment (NSRSA), which stabilizes iterative self-training by filtering training data through symbolic verification at the reasoning step level. The core claim is that eliminating lucky guesses (correct answers with flawed reasoning) prevents recursive collapse and enables sustained improvement over multiple iterations.

Recursive self-improvement--where a model iteratively trains on its own outputs--promises sustained capability growth but faces a fundamental obstacle: recursive drift. As models train on self-generated data across multiple iterations, errors in intermediate reasoning compound, leading to mode collapse and performance degradation. We propose Neuro-Symbolic Recursive Self-Alignment (NSRSA), which stabilizes iterative self-training by embedding a symbolic verification subsystem that gates training data quality at the reasoning step level. Unlike outcome-only filtering (which admits "lucky guesses" with flawed reasoning), NSRSA verifies each arithmetic operation via sympy, checks logical flow consistency across reasoning steps, and enforces domain constraints. We evaluate NSRSA on GSM8K using Qwen3-4B-Thinking across 5 self-training iterations under five conditions: no verification, outcome verification, majority voting, full NSRSA symbolic verification, and NSRSA with DPO. Our filtering analysis shows that NSRSA rejects approximately 34% of correct-answer solutions that pass outcome verification, eliminating "lucky guesses" with flawed reasoning from the training set. We further demonstrate that constructing DPO preference pairs from NSRSA verification teaches the model to distinguish sound from flawed reasoning (reward accuracy 46% to 63%). NSRSA provides an extensible framework that demonstrates how external symbolic verification can make recursive self-improvement measurable and reliable within domains where automated verification is available.

Generalized Discrete Diffusion from Snapshots

stat.ML cs.AI cs.CL Oussama Zekri, Théo Uscidda, Nicolas Boullé et al. · Mar 22, 2026

Discrete diffusion models have been limited to simplistic noising schemes like uniform corruption or masking, restricting their ability to leverage semantic structure in large vocabularies. This paper introduces GDDS (Generalized Discrete Diffusion from Snapshots), a framework supporting arbitrary continuous-time Markov chain noising processes via exact uniformization-based sampling and a tractable snapshot-level ELBO. The work achieves state-of-the-art results on large-scale language modeling tasks, claiming to surpass autoregressive baselines for the first time at this scale.

We introduce Generalized Discrete Diffusion from Snapshots (GDDS), a unified framework for discrete diffusion modeling that supports arbitrary noising processes over large discrete state spaces. Our formulation encompasses all existing discrete diffusion approaches, while allowing significantly greater flexibility in the choice of corruption dynamics. The forward noising process relies on uniformization and enables fast arbitrary corruption. For the reverse process, we derive a simple evidence lower bound (ELBO) based on snapshot latents, instead of the entire noising path, that allows efficient training of standard generative modeling architectures with clear probabilistic interpretation. Our experiments on large-vocabulary discrete generation tasks suggest that the proposed framework outperforms existing discrete diffusion methods in terms of training efficiency and generation quality, and beats autoregressive models for the first time at this scale. We provide the code along with a blog post on the project page: https://oussamazekri.fr/gdds.

Safety as Computation: Certified Answer Reuse via Capability Closure in Task-Oriented Dialogue

cs.AI Cosimo Spera · Mar 22, 2026

The paper addresses inefficiency in task-oriented dialogue systems that recompute answers via retrieval or generation each turn, even when answers are already derivable from prior state. It proposes framing safety certification as a computational primitive where the fixed-point closure $cl(A_t)$ contains all derivable capabilities, enabling a Certified Answer Store with Pre-Answer Blocks that eliminates redundant RAG calls through formal containment checks. This matters because it reduces mean RAG calls from 13.7 to 1.31 and latency from 18.8s to 340ms while eliminating unsafe cache hits that plague embedding-based approaches.

We introduce a new paradigm for task-oriented dialogue systems: safety certification as a computational primitive for answer reuse. Current systems treat each turn independently, recomputing answers via retrieval or generation even when they are already derivable from prior state. We show that in capability-based systems, the safety certification step computes a fixed-point closure $cl(A_t)$ that already contains every answer reachable from the current configuration. We operationalize this insight with a Certified Answer Store (CAS) augmented by Pre-Answer Blocks (PAB): at each certified turn, the system materializes all derivable follow-up answers together with minimal provenance witnesses. Subsequent queries are answered in sub-millisecond time via formal containment checks, eliminating redundant retrieval and generation.
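
A fixed-point closure with containment checks, as described above, can be sketched as plain forward chaining. The rule format, the capability names, and the functions `closure` and `is_certified` below are hypothetical choices for illustration, not the paper's CAS/PAB design; the recorded witness is one set of premises that derives an item and is not guaranteed to be minimal.

```python
def closure(capabilities, rules):
    """Least fixed point: keep applying rules until nothing new is derivable.

    rules: list of (frozenset_of_premises, conclusion) pairs.
    Returns the closed set and, per derived item, the premises that produced it.
    """
    closed = set(capabilities)
    witness = {}
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in closed and premises <= closed:
                closed.add(conclusion)
                witness[conclusion] = premises
                changed = True
    return closed, witness


def is_certified(query, closed):
    """Containment check: reuse an answer only if it lies inside the closure."""
    return query in closed


# Hypothetical dialogue state and derivation rules.
rules = [
    (frozenset({"order_id", "auth"}), "order_status"),
    (frozenset({"order_status"}), "delivery_date"),
    (frozenset({"payment_token"}), "refund"),
]
closed, witness = closure({"order_id", "auth"}, rules)
```

Here a follow-up question about `delivery_date` passes the containment check without a new retrieval call, while `refund` does not, because `payment_token` was never part of the certified state.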

RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models

cs.AI Dongyoung Kim, Sumin Park, Woomin Song et al. · Mar 22, 2026

RoboAlign addresses the modality gap between high-level language reasoning and low-level robot control in Vision-Language-Action (VLA) models. The framework first uses supervised fine-tuning to teach a multimodal LLM to generate FAST action tokens through zero-shot chain-of-thought reasoning, then applies Group Relative Policy Optimization (GRPO) to refine reasoning based on token-level action accuracy. This matters because prior work showed that improving embodied reasoning via language supervision often fails to translate into better robot performance or even degrades it.

Improving embodied reasoning in multimodal large language models (MLLMs) is essential for building vision-language-action models (VLAs) on top of them to readily translate multimodal understanding into low-level actions. Accordingly, recent work has explored enhancing embodied reasoning in MLLMs through vision-question-answering-style supervision. However, these approaches have been reported to result in unstable VLA performance, often yielding only marginal or even negative gains. In this paper, we propose a more systematic MLLM training framework, RoboAlign, that reliably improves VLA performance. Our key idea is to sample action tokens via zero-shot natural language reasoning and refine this reasoning using reinforcement learning (RL) to improve action accuracy. As a result, RoboAlign bridges the modality gap between language and low-level actions in MLLMs, and facilitates knowledge transfer from MLLM to VLA. To validate the effectiveness of RoboAlign, we train VLAs by adding a diffusion-based action head on top of an MLLM backbone and evaluate them on major robotics benchmarks. Remarkably, by performing RL-based alignment after SFT using less than 1% of the data, RoboAlign achieves performance improvements of 17.5%, 18.9%, and 106.6% over SFT baselines on LIBERO, CALVIN, and real-world environments, respectively.

Quotient Geometry, Effective Curvature, and Implicit Bias in Simple Shallow Neural Networks

cs.LG cs.AI Hang-Cheng Dong, Pengcheng Cheng · Mar 23, 2026

This paper develops a differential-geometric framework for shallow neural networks that treats predictor classes rather than raw parameters as the fundamental objects. By quotienting out permutation and scaling symmetries on a regular set $\Theta_{\mathrm{reg}}$, the authors define a function-induced metric $g_\theta$ and an effective Hessian that removes spurious curvature degeneracies along symmetry orbits. The work connects implicit bias to quotient-level geometry, with concrete analysis for quadratic-activation models where parameters map explicitly to symmetric matrices $Q(\theta)=\sum_{i=1}^m a_i w_i w_i^\top$.

Overparameterized shallow neural networks admit substantial parameter redundancy: distinct parameter vectors may represent the same predictor due to hidden-unit permutations, rescalings, and related symmetries. As a result, geometric quantities computed directly in the ambient Euclidean parameter space can reflect artifacts of representation rather than intrinsic properties of the predictor. In this paper, we develop a differential-geometric framework for analyzing simple shallow networks through the quotient space obtained by modding out parameter symmetries on a regular set. We first characterize the symmetry and quotient structure of regular shallow-network parameters and show that the finite-sample realization map induces a natural metric on the quotient manifold. This leads to an effective notion of curvature that removes degeneracy along symmetry orbits and yields a symmetry-reduced Hessian capturing intrinsic local geometry. We then study gradient flows on the quotient and show that only the horizontal component of parameter motion contributes to first-order predictor evolution, while the vertical component corresponds purely to gauge variation. Finally, we formulate an implicit-bias viewpoint at the quotient level, arguing that meaningful complexity should be assigned to predictor classes rather than to individual parameter representatives. Our experiments confirm that ambient flatness is representation-dependent, that local dynamics are better organized by quotient-level curvature summaries, and that in underdetermined regimes, implicit bias is most naturally described in quotient coordinates.
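
For the quadratic-activation model mentioned in the summary, the parameter redundancy can be checked by hand. The short derivation below assumes the predictor has the form $f_\theta(x)=\sum_i a_i (w_i^\top x)^2$, which is consistent with the matrix $Q(\theta)$ given above but is not spelled out in the listing; the rescaling factors $c_i$ and the permutation $\sigma$ are notation introduced here for illustration.

```latex
\begin{align*}
f_\theta(x) &= \sum_{i=1}^m a_i\,(w_i^\top x)^2 = x^\top Q(\theta)\,x,
\qquad Q(\theta)=\sum_{i=1}^m a_i\, w_i w_i^\top,\\
Q(\theta') &= \sum_{i=1}^m \frac{a_i}{c_i^2}\,(c_i w_i)(c_i w_i)^\top
= \sum_{i=1}^m a_i\, w_i w_i^\top = Q(\theta),
\qquad \theta' = \bigl(a_i/c_i^2,\; c_i w_i\bigr)_{i=1}^m,\ c_i \neq 0,\\
Q(\theta_\sigma) &= \sum_{i=1}^m a_{\sigma(i)}\, w_{\sigma(i)} w_{\sigma(i)}^\top = Q(\theta)
\quad \text{for any permutation } \sigma \text{ of } \{1,\dots,m\}.
\end{align*}
```

Distinct parameter vectors on the same orbit therefore define the same predictor, which is why curvature measured along these directions in the ambient parameter space carries no information about the predictor itself.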

KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning

cs.CL cs.AI Shuai Wang, Yinan Yu · Mar 22, 2026

KG-Hopper addresses Knowledge Base Question Answering (KBQA) by training compact 7B LLMs to perform multi-hop reasoning over Knowledge Graphs in a single inference round. Unlike sequential multi-step approaches that suffer from error cascades, it embeds the entire KG traversal process into a unified "thinking" stage using reinforcement learning. The core innovation is using GRPO (Group Relative Policy Optimization) with composite rewards to teach models to autonomously invoke retrieval tools via special tokens and reason across multiple hops without predefined pipelines.

Large Language Models (LLMs) demonstrate impressive natural language capabilities but often struggle with knowledge-intensive reasoning tasks. Knowledge Base Question Answering (KBQA), which leverages structured Knowledge Graphs (KGs), exemplifies this challenge due to the need for accurate multi-hop reasoning. Existing approaches typically perform sequential reasoning steps guided by predefined pipelines, restricting flexibility and causing error cascades due to isolated reasoning at each step. To address these limitations, we propose KG-Hopper, a novel Reinforcement Learning (RL) framework that empowers compact open LLMs with the ability to perform integrated multi-hop KG reasoning within a single inference round. Rather than reasoning step-by-step, we train a Reasoning LLM that embeds the entire KG traversal and decision process into a unified "thinking" stage, enabling global reasoning over cross-step dependencies and dynamic path exploration with backtracking. Experimental results on eight KG reasoning benchmarks show that KG-Hopper, based on a 7B-parameter LLM, consistently outperforms larger multi-step systems (up to 70B) and achieves competitive performance with proprietary models such as GPT-3.5-Turbo and GPT-4o-mini, while remaining compact, open, and data-efficient. The code is publicly available at: https://github.com/Wangshuaiia/KG-Hopper.

What Do World Models Learn in RL? Probing Latent Representations in Learned Environment Simulators

cs.LG cs.AI Xinyu Zhang · Mar 23, 2026

World models for reinforcement learning learn to simulate environment dynamics, yet what they represent internally remains unclear. This paper probes two architecturally distinct models—IRIS (a discrete token transformer) and DIAMOND (a continuous diffusion UNet)—on Atari Breakout and Pong using linear and MLP probes, causal interventions, and attention analysis to test whether they develop structured, interpretable representations of game state. The core finding is that world models develop approximately linear representations of salient state variables (ball position, score) that are not merely correlated but functionally used during prediction.

World models learn to simulate environment dynamics from experience, enabling sample-efficient reinforcement learning. But what do these models actually represent internally? We apply interpretability techniques--including linear and nonlinear probing, causal interventions, and attention analysis--to two architecturally distinct world models: IRIS (discrete token transformer) and DIAMOND (continuous diffusion UNet), trained on Atari Breakout and Pong. Using linear probes, we find that both models develop linearly decodable representations of game state variables (object positions, scores), with MLP probes yielding only marginally higher $R^2$, confirming that these representations are approximately linear. Causal interventions--shifting hidden states along probe-derived directions--produce correlated changes in model predictions, providing evidence that representations are functionally used rather than merely correlated. Analysis of IRIS attention heads reveals spatial specialization: specific heads attend preferentially to tokens overlapping with game objects. Multi-baseline token ablation experiments consistently identify object-containing tokens as disproportionately important. Our findings provide interpretability evidence that learned world models develop structured, approximately linear internal representations of environment state across two games and two architectures.

LLM-Powered Workflow Optimization for Multidisciplinary Software Development: An Automotive Industry Case Study

cs.SE cs.AI Shuai Wang, Yinan Yu, Earl Barr et al. · Mar 22, 2026

This paper tackles the persistent bottleneck of Multidisciplinary Software Development (MSD), where domain experts and software developers must manually coordinate across heterogeneous artifacts and incompatible formalisms. The authors model MSD workflows as a directed dependency graph $\mathcal{G}=(\mathcal{V},\mathcal{R})$ and propose an iterative optimization framework that replaces manual translation nodes with LLM-powered services. This matters because their approach reduces per-API development time from approximately 5 hours to under 7 minutes while maintaining production-quality code, demonstrating that workflow-level automation—not just coding assistance—can unlock substantial efficiency gains in industrial settings.

Multidisciplinary Software Development (MSD) requires domain experts and developers to collaborate across incompatible formalisms and separate artifact sets. Today, even with AI coding assistants like GitHub Copilot, this process remains inefficient; individual coding tasks are semi-automated, but the workflow connecting domain knowledge to implementation is not. Developers and experts still lack a shared view, resulting in repeated coordination, clarification rounds, and error-prone handoffs. We address this gap through a graph-based workflow optimization approach that progressively replaces manual coordination with LLM-powered services, enabling incremental adoption without disrupting established practices. We evaluate our approach on spapi, a production in-vehicle API system at Volvo Group involving 192 endpoints, 420 properties, and 776 CAN signals across six functional domains. The automated workflow achieves 93.7% F1 score while reducing per-API development time from approximately 5 hours to under 7 minutes, saving an estimated 979 engineering hours. In production, the system received high satisfaction from both domain experts and developers, with all participants reporting full satisfaction with communication efficiency.

The Myhill-Nerode Theorem for Bounded Interaction: Canonical Abstractions via Agent-Bounded Indistinguishability

cs.AI Anthony T. Nixon · Mar 22, 2026

The paper addresses state-space explosion in partially observable environments by formalizing a bounded-interaction analogue of the Myhill-Nerode theorem for finite POMDPs. Its core insight is that two observation histories are equivalent if no bounded finite-state controller can distinguish them via closed-loop interaction, inducing a canonical quotient that is minimal and unique for that observer capacity. This yields a principled separation between exact decision sufficiency for observation-measurable objectives and approximate bounds for latent-state rewards, with the canonical object strictly requiring clock-aware probe families.

Any capacity-limited observer induces a canonical quotient on its environment: two situations that no bounded agent can distinguish are, for that agent, the same. We formalise this for finite POMDPs. A fixed probe family of finite-state controllers induces a closed-loop Wasserstein pseudometric on observation histories and a probe-exact quotient merging histories that no controller in the family can distinguish. The quotient is canonical, minimal, and unique: a bounded-interaction analogue of the Myhill-Nerode theorem. For clock-aware probes, it is exactly decision-sufficient for objectives that depend only on the agent's observations and actions; for latent-state rewards, we use an observation-Lipschitz approximation bound. The main theorem object is the clock-aware quotient; scalable deterministic-stationary experiments study a tractable coarsening with gap measured on small exact cases and explored empirically at larger scale. We validate theorem-level claims on Tiger and GridWorld. We also report operational case studies on Tiger, GridWorld, and RockSample as exploratory diagnostics of approximation behavior and runtime, not as theorem-facing evidence when no exact cross-family certificate is available; heavier stress tests are archived in the appendix and artifact package.

ARYA: A Physics-Constrained Composable & Deterministic World Model Architecture

cs.AI cs.DC Seth Dobrin, Lukasz Chmiel · Mar 22, 2026

ARYA presents a world model architecture using "nano models"—small specialized components orchestrated by an autonomous agent (AARA)—rather than monolithic neural networks. The system claims physics-constrained determinism, sub-20-second training cycles, and an "unfireable" safety kernel that cannot be bypassed. The authors position this as production-deployed across seven industry domains from aerospace to pharma, achieving state-of-the-art results on six of nine benchmarks with "zero neural network parameters."

This paper presents ARYA, a composable, physics-constrained, deterministic world model architecture built on five foundational principles: nano models, composability, causal reasoning, determinism, and architectural AI safety. We demonstrate that ARYA satisfies all canonical world model requirements, including state representation, dynamic prediction, causal and physical awareness, temporal consistency, generalization, learnability, and planning and control. Unlike monolithic foundation models, the ARYA foundation model implements these capabilities through a hierarchical system-of-system-of-systems of specialized nano models, orchestrated by AARA (ARYA Autonomous Research Agent), an always-on cognitive daemon that executes a continuous sense-decide-act-learn loop. The nano model architecture provides linear scaling, sparse activation, selective untraining, and sub-20-second training cycles, resolving the traditional tension between capability and computational efficiency. A central contribution is the Unfireable Safety Kernel: an architecturally immutable safety boundary that cannot be disabled or circumvented by any system component, including its own self-improvement engine. This is not a social or ethical alignment statement; it is a technical framework ensuring human control persists as autonomy increases. Safety is an architectural constraint governing every operation, not a policy layer applied after the fact. We present formal alignment between ARYA's architecture and canonical world model requirements, and a report summarizing its state-of-the-art performance across 6 of 9 competitive benchmarks head-to-head with GPT-5.2, Opus 4.6, and V-JEPA-2, all with zero neural network parameters, across seven active industry domain nodes spanning aerospace, pharma manufacturing, oil and gas, smart cities, biotech, defense, and medical devices.

Evolutionary Biparty Multiobjective UAV Path Planning: Problems and Empirical Comparisons

cs.NE cs.AI Kesheng Chen, Wenjian Luo, Xin Lin et al. · Mar 23, 2026

Unmanned aerial vehicle (UAV) path planning traditionally treats efficiency and safety objectives as a single multiobjective optimization problem. This paper proposes a biparty multiobjective formulation with separate decision-makers for efficiency and safety, adapting immune algorithms (NNIA, HEIA, AIMA) into BPNNIA, BPHEIA, and BPAIMA to find common Pareto optimal solutions. The work addresses the practical scenario where regulatory and operational departments have independent criteria.

Unmanned aerial vehicles (UAVs) have been widely used in urban missions, and proper planning of UAV paths can improve mission efficiency while reducing the risk of potential third-party impact. Existing work has considered all efficiency and safety objectives for a single decision-maker (DM) and regarded this as a multiobjective optimization problem (MOP). However, there is usually not a single DM but two DMs, i.e., an efficiency DM and a safety DM, and the DMs are only concerned with their respective objectives. The final decision is made based on the solutions of both DMs. In this paper, for the first time, biparty multiobjective UAV path planning (BPMO-UAVPP) problems involving both efficiency and safety departments are modeled. The existing multiobjective immune algorithm with nondominated neighbor-based selection (NNIA), the hybrid evolutionary framework for the multiobjective immune algorithm (HEIA), and the adaptive immune-inspired multiobjective algorithm (AIMA) are modified for solving the BPMO-UAVPP problem, and then biparty multiobjective optimization algorithms, including the BPNNIA, BPHEIA, and BPAIMA, are proposed and comprehensively compared with traditional multiobjective evolutionary algorithms and typical multiparty multiobjective evolutionary algorithms (i.e., OptMPNDS and OptMPNDS2). The experimental results show that BPAIMA performs better than ordinary multiobjective evolutionary algorithms such as NSGA-II and multiparty multiobjective evolutionary algorithms such as OptMPNDS, OptMPNDS2, BPNNIA and BPHEIA.
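
The biparty formulation above can be made concrete with a small Python sketch. It assumes that a "common Pareto optimal" solution is one that is nondominated from the viewpoint of both decision-makers, which is a reading of the summary and not a definition quoted from the paper. The candidate paths, their objective values, and the helper names are hypothetical, and a brute-force filter stands in for the immune algorithms the authors adapt.

```python
def dominates(a, b):
    """a dominates b under minimization: no worse everywhere, better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_set(candidates, objectives):
    """Names of candidates not dominated by any other candidate."""
    scores = {name: objectives(c) for name, c in candidates.items()}
    return {n for n, s in scores.items()
            if not any(dominates(t, s) for m, t in scores.items() if m != n)}


# Hypothetical candidate paths: (length, energy, fatality_risk, property_risk).
paths = {
    "A": (10, 5, 0.30, 0.4),
    "B": (12, 4, 0.01, 0.5),
    "C": (15, 6, 0.20, 0.6),
    "D": (11, 6, 0.05, 0.2),
}

# Each decision-maker only sees its own objectives.
efficiency_front = pareto_set(paths, lambda p: (p[0], p[1]))
safety_front = pareto_set(paths, lambda p: (p[2], p[3]))

# Solutions acceptable to both parties.
common = efficiency_front & safety_front
```

In this toy instance the efficiency front is {A, B}, the safety front is {B, D}, and only path B is nondominated for both parties. Pooling all four objectives into one problem would also keep A and D on the front, which shows how the single-decision-maker view can hide the disagreement between departments.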

A Framework for Closed-Loop Robotic Assembly, Alignment and Self-Recovery of Precision Optical Systems

cs.RO cs.AI physics.optics Seou Choi, Sachin Vaidya, Caio Silva et al. · Mar 23, 2026

Precision free-space optics demands sub-millimeter and sub-degree tolerances where traditional robotic pick-and-place fails. This work introduces a closed-loop robotics framework integrating hierarchical computer vision, Newton-based spatial optimization, and Bayesian angular optimization to autonomously construct, align, and maintain optical systems. The authors demonstrate this by building a tabletop laser cavity from randomly distributed components—achieving beam alignment, mode selection, and self-recovery without human intervention. The system bridges the gap between coarse robotic manipulation and the extreme precision required for functional optical experiments.

Robotic automation has transformed scientific workflows in domains such as chemistry and materials science, yet free-space optics, a high-precision domain, remains largely manual. Optical systems impose strict spatial and angular tolerances, and their performance is governed by tightly coupled physical parameters, making generalizable automation particularly challenging. In this work, we present a robotics framework for the autonomous construction, alignment, and maintenance of precision optical systems. Our approach integrates hierarchical computer vision systems, optimization routines, and custom-built tools to achieve this functionality. As a representative demonstration, we perform the fully autonomous construction of a tabletop laser cavity from randomly distributed components. The system performs several tasks such as laser beam centering, spatial alignment of multiple beams, resonator alignment, laser mode selection, and self-recovery from induced misalignment and disturbances. By achieving closed-loop autonomy for highly sensitive optical systems, this work establishes a foundation for autonomous optical experiments for applications across technical domains.

Persona Vectors in Games: Measuring and Steering Strategies via Activation Vectors

cs.AI cs.GT Johnathan Sun, Andrew Zhang · Mar 22, 2026

The paper tackles the challenge of controlling high-level behavioral traits in LLM agents deployed in strategic settings. Rather than treating models as black boxes via prompting, the authors construct 'persona vectors'—linear directions in activation space—for traits like altruism and forgiveness using contrastive activation addition. Applied to six canonical games, these vectors allow both measurement of behavioral tendencies and causal steering of decisions, offering a mechanistic handle on strategic behavior.

Large language models (LLMs) are increasingly deployed as autonomous decision-makers in strategic settings, yet we have limited tools for understanding their high-level behavioral traits. We use activation steering methods in game-theoretic settings, constructing persona vectors for altruism, forgiveness, and expectations of others by contrastive activation addition. Evaluating on canonical games, we find that activation steering systematically shifts both quantitative strategic choices and natural-language justifications. However, we also observe that rhetoric and strategy can diverge under steering. In addition, vectors for self-behavior and expectations of others are partially distinct. Our results suggest that persona vectors offer a promising mechanistic handle on high-level traits in strategic environments.
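The contrastive construction mentioned above can be sketched with plain lists standing in for hidden activations. This is a minimal illustration under stated assumptions: the persona vector is taken as the difference of mean activations between trait-positive and trait-negative prompts, steering adds a scaled copy of it to a hidden state, and measurement is a projection onto it. The function names, the toy numbers, and the steering strength `alpha` are hypothetical and not drawn from the paper.

```python
import math
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean over a list of equal-length activation vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def persona_vector(pos: List[Vector], neg: List[Vector]) -> Vector:
    """Difference of means: activations with the trait minus activations without it."""
    return [p - n for p, n in zip(mean_vector(pos), mean_vector(neg))]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Add the scaled persona direction to a hidden state (alpha is a hypothetical strength)."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


def project(hidden: Vector, direction: Vector) -> float:
    """Scalar projection of a hidden state onto the persona direction (a simple trait score)."""
    norm = math.sqrt(sum(d * d for d in direction))
    return sum(h * d for h, d in zip(hidden, direction)) / norm


# Toy 2-D "activations" for prompts that do and do not express the trait.
altruistic = [[1.0, 0.0], [3.0, 0.0]]
selfish = [[0.0, 0.0], [0.0, 2.0]]

v = persona_vector(altruistic, selfish)
h = [0.0, 0.0]
h_steered = steer(h, v, alpha=0.5)
```

With a real model the vectors would come from a chosen layer's residual stream, and the same direction would serve both for scoring (projection) and for intervention (addition).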

Behavioural feasible set: Value alignment constraints on AI decision support

cs.AI econ.GN q-fin.EC Taejin Park · Mar 22, 2026

Organizations deploying commercial AI systems inherit vendor-imposed value constraints that limit which recommendations the system can produce. This paper formalizes these boundaries as a "behavioural feasible set" and demonstrates through controlled experiments that alignment training compresses this set, making AI systems structurally unable to endorse certain legitimate organizational actions even under strong contextual pressure. The work reframes AI governance from a capability question to a constraint diagnosis problem, showing that vendor selection partially determines which trade-offs remain negotiable for adopting firms.

When organisations adopt commercial AI systems for decision support, they inherit value judgements embedded by vendors that are neither transparent nor renegotiable. The governance puzzle is not whether AI can support decisions but which recommendations the system can actually produce given how its vendor has configured it. I formalise this as a behavioural feasible set, the range of recommendations reachable under vendor-imposed alignment constraints, and characterise diagnostic thresholds for when organisational requirements exceed the system's flexibility. In scenario-based experiments using binary decision scenarios and multi-stakeholder ranking tasks, I show that alignment materially compresses this set. Comparing pre- and post-alignment variants of an open-weight model isolates the mechanism: alignment makes the system substantially less able to shift its recommendation even under legitimate contextual pressure. Leading commercial models exhibit comparable or greater rigidity. In multi-stakeholder tasks, alignment shifts implied stakeholder priorities rather than neutralising them, meaning organisations adopt embedded value orientations set upstream by the vendor. Organisations thus face a governance problem that better prompting cannot resolve: selecting a vendor partially determines which trade-offs remain negotiable and which stakeholder priorities are structurally embedded.

RuntimeSlicer: Towards Generalizable Unified Runtime State Representation for Failure Management

cs.SE cs.AI Lingzhe Zhang, Tong Jia, Weijie Hong et al. · Mar 23, 2026

Modern failure management pipelines tightly couple task-specific models with modality-specific encoders, blocking reuse across systems. RuntimeSlicer proposes a unified runtime state representation that encodes metrics, traces, and logs into a single embedding via Unified Runtime Contrastive Learning, then adapts to downstream tasks through State-Aware Task-Oriented Tuning. The core value is decoupling representation learning from failure management tasks—if it generalizes, teams could freeze the embedding backbone and ship lightweight task heads.

Modern software systems operate at unprecedented scale and complexity, where effective failure management is critical yet increasingly challenging. Metrics, traces, and logs provide complementary views of system runtime behavior, but existing failure management approaches typically rely on task-oriented pipelines that tightly couple modality-specific preprocessing, representation learning, and downstream models, resulting in limited generalization across tasks and systems. To fill this gap, we propose RuntimeSlicer, a unified runtime state representation model towards generalizable failure management. RuntimeSlicer pre-trains a task-agnostic representation model that directly encodes metrics, traces, and logs into a single, aligned system-state embedding capturing the holistic runtime condition of the system. To train RuntimeSlicer, we introduce Unified Runtime Contrastive Learning, which integrates heterogeneous training data sources and optimizes complementary objectives for cross-modality alignment and temporal consistency. Building upon the learned system-state embeddings, we further propose State-Aware Task-Oriented Tuning, which performs unsupervised partitioning of runtime states and enables state-conditioned adaptation for downstream tasks. This design allows lightweight task-oriented models to be trained on top of the unified embedding without redesigning modality-specific encoders or preprocessing pipelines. Preliminary experiments on the AIOps 2022 dataset demonstrate the feasibility and effectiveness of RuntimeSlicer for system state modeling and failure management tasks.

PivotRL: High Accuracy Agentic Post-Training at Low Compute Cost

cs.AI Junkeun Yi, Damon Mosk-Aoyama, Baihe Huang et al. · Mar 22, 2026

PivotRL addresses the compute-generalization trade-off in agentic post-training by extracting "pivot" states—intermediate turns with high outcome variance—from existing SFT trajectories and applying functional-equivalence rewards rather than strict string matching. The method achieves comparable accuracy to end-to-end RL on SWE-Bench with roughly one-quarter the rollout cost, while avoiding the catastrophic forgetting typical of supervised fine-tuning on long-horizon tool-use tasks.

Post-training for long-horizon agentic tasks has a tension between compute efficiency and generalization. While supervised fine-tuning (SFT) is compute efficient, it often suffers from out-of-domain (OOD) degradation. Conversely, end-to-end reinforcement learning (E2E RL) preserves OOD capabilities, but incurs high compute costs due to many turns of on-policy rollout. We introduce PivotRL, a novel framework that operates on existing SFT trajectories to combine the compute efficiency of SFT with the OOD accuracy of E2E RL. PivotRL relies on two key mechanisms: first, it executes local, on-policy rollouts and filters for pivots: informative intermediate turns where sampled actions exhibit high variance in outcomes; second, it utilizes rewards for functional-equivalent actions rather than demanding strict string matching with the SFT data demonstration. We theoretically show that these mechanisms incentivize strong learning signals with high natural gradient norm, while maximally preserving policy probability ordering on actions unrelated to training tasks. In comparison to standard SFT on identical data, we demonstrate that PivotRL achieves +4.17% higher in-domain accuracy on average across four agentic domains, and +10.04% higher OOD accuracy in non-agentic tasks. Notably, on agentic coding tasks, PivotRL achieves competitive accuracy with E2E RL with 4x fewer rollout turns. PivotRL is adopted by NVIDIA's Nemotron-3-Super-120B-A12B, acting as the workhorse in production-scale agentic post-training.
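The pivot-filtering step can be pictured as a variance filter over local rollouts. The sketch below is only an illustration of that idea: the turn labels, the reward values, the use of population variance, and the `threshold` parameter are all assumptions, since the abstract does not state the exact selection criterion.

```python
from statistics import pvariance
from typing import Dict, List


def select_pivots(rollout_rewards: Dict[str, List[float]], threshold: float) -> List[str]:
    """Keep turns whose sampled actions lead to high-variance outcomes.

    rollout_rewards maps a turn identifier to the rewards of several
    on-policy actions sampled at that turn (hypothetical data layout).
    """
    return [
        turn for turn, rewards in rollout_rewards.items()
        if pvariance(rewards) > threshold
    ]


# Toy rewards from four sampled actions at each of three turns.
rewards_by_turn = {
    "turn_0": [1.0, 1.0, 1.0, 1.0],  # always succeeds: little to learn
    "turn_3": [1.0, 0.0, 1.0, 0.0],  # outcome depends on the action: a pivot
    "turn_5": [0.0, 0.0, 0.0, 0.0],  # always fails: little to learn
}
pivots = select_pivots(rewards_by_turn, threshold=0.1)
```

Turns where every sampled action succeeds or every one fails carry no contrast between actions, which is why only the mixed-outcome turn survives the filter here.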
