Feed - arxlens


cs.LG cs.AI cs.GT Yurong Chen, Zhiyi Huang, Michael I. Jordan et al. · Mar 23, 2026

The paper studies calibeating—post-processing external forecasts online to minimize cumulative losses while matching an informativeness-based benchmark. Unlike prior work that used loss-specific arguments, the authors reduce calibeating to standard online learning primitives, showing it is minimax-equivalent to regret minimization. This yields optimal rates for general proper losses and improves bounds for simultaneous calibration and calibeating.

We study calibeating, the problem of post-processing external forecasts online to minimize cumulative losses and match an informativeness-based benchmark. Unlike prior work, which analyzed calibeating for specific losses with specific arguments, we reduce calibeating to existing online learning techniques and obtain results for general proper losses. More concretely, we first show that calibeating is minimax-equivalent to regret minimization. This recovers the $O(\log T)$ calibeating rate of Foster and Hart [FH23] for the Brier and log losses and its optimality, and yields new optimal calibeating rates for mixable losses and general bounded losses. Second, we prove that multi-calibeating is minimax-equivalent to the combination of calibeating and the classical expert problem. This yields new optimal multi-calibeating rates for mixable losses, including Brier and log losses, and general bounded losses. Finally, we obtain new bounds for achieving calibeating and calibration simultaneously for the Brier loss. For binary predictions, our result gives the first calibrated algorithm that at the same time also achieves the optimal $O(\log T)$ calibeating rate.
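To make the idea of post-processing external forecasts concrete, here is a minimal sketch for binary outcomes under the Brier loss. It assumes the simple running-average rule (predict the mean of past outcomes on rounds where the external forecast took the same value); the function names, the `prior` parameter, and the toy data are illustrative assumptions, and this is not the reduction to regret minimization described in the abstract.

```python
from collections import defaultdict


def brier(p, y):
    """Brier (squared) loss for a probability p and a binary outcome y."""
    return (p - y) ** 2


def calibeat(forecasts, outcomes, prior=0.5):
    """Post-process external forecasts online.

    For each external forecast value b, predict the running mean of the
    outcomes observed on earlier rounds where the external forecast was b.
    `prior` (an assumption) is used the first time a value b is seen.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    predictions = []
    for b, y in zip(forecasts, outcomes):
        p = sums[b] / counts[b] if counts[b] else prior
        predictions.append(p)
        # Update the bin only after predicting, so the rule stays online.
        sums[b] += y
        counts[b] += 1
    return predictions


# Toy data: the external forecaster always says 0.9, but outcomes alternate.
external = [0.9] * 100
outcomes = [1, 0] * 50
post = calibeat(external, outcomes)

external_loss = sum(brier(p, y) for p, y in zip(external, outcomes))
post_loss = sum(brier(p, y) for p, y in zip(post, outcomes))
```

On this toy sequence the post-processed forecasts incur a lower cumulative Brier loss than the external ones, because the running mean settles near the empirical frequency of 0.5.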


One Model, Two Markets: Bid-Aware Generative Recommendation

cs.IR cs.AI cs.GT Yanchen Jiang, Zhe Feng, Christopher P. Mah et al. · Mar 23, 2026

Generative recommender systems like TIGER excel at semantic retrieval but ignore the economic realities of monetization via sponsored content. This paper proposes GEM-Rec, a unified framework that augments semantic IDs with control tokens (<ORG>, <AD>) to factorize slot allocation from item generation, and introduces Bid-Aware Decoding to inject real-time auction bids into inference. The work bridges the gap between generative recommendation and computational advertising, offering theoretical guarantees such as allocation monotonicity while allowing dynamic trade-offs between user relevance and platform revenue.

Generative recommender systems using semantic IDs, such as TIGER (Rajput et al., 2023), have emerged as a widely adopted competitive paradigm in sequential recommendation. However, existing architectures are designed solely for semantic retrieval and do not address concerns such as monetization via ad revenue and incorporation of bids for commercial retrieval. We propose GEM-Rec, a unified framework that integrates commercial relevance and monetization objectives directly into the generative sequence. We introduce control tokens to decouple the decision of whether to show an ad from which item to show. This allows the model to learn valid placement patterns directly from interaction logs, which inherently reflect past successful ad placements. Complementing this, we devise a Bid-Aware Decoding mechanism that handles real-time pricing, injecting bids directly into the inference process to steer the generation toward high-value items. We prove that this approach guarantees allocation monotonicity, ensuring that higher bids weakly increase an ad's likelihood of being shown without requiring model retraining. Experiments demonstrate that GEM-Rec allows platforms to dynamically optimize for semantic relevance and platform revenue.
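The monotonicity property can be illustrated with a small sketch. The abstract does not give the form of the bid adjustment, so the additive `weight * log(1 + bid)` bonus, the function name `bid_aware_probs`, and the item names below are all assumptions chosen for illustration, not the GEM-Rec mechanism itself.

```python
import math


def bid_aware_probs(logits, bids, weight=1.0):
    """Softmax over model logits plus a bid bonus (hypothetical form).

    logits: dict mapping item -> model score.
    bids:   dict mapping ad item -> bid >= 0; organic items are absent.
    weight: non-negative strength of the bid bonus (an assumption).
    """
    scores = {
        item: score + weight * math.log1p(bids.get(item, 0.0))
        for item, score in logits.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {item: math.exp(s - top) for item, s in scores.items()}
    total = sum(exps.values())
    return {item: e / total for item, e in exps.items()}


logits = {"org_1": 2.0, "ad_a": 1.0, "ad_b": 0.5}
low = bid_aware_probs(logits, {"ad_a": 1.0, "ad_b": 1.0})
high = bid_aware_probs(logits, {"ad_a": 5.0, "ad_b": 1.0})
```

Because a softmax probability increases in the item's own score and the bonus increases in the bid (for `weight >= 0`), raising the bid for `ad_a` with everything else fixed raises its selection probability in this sketch, and no retraining is involved since only the decoding scores change.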


Persona Vectors in Games: Measuring and Steering Strategies via Activation Vectors

cs.AI cs.GT Johnathan Sun, Andrew Zhang · Mar 22, 2026

The paper tackles the challenge of controlling high-level behavioral traits in LLM agents deployed in strategic settings. Rather than treating models as black boxes via prompting, the authors construct 'persona vectors'—linear directions in activation space—for traits like altruism and forgiveness using contrastive activation addition. Applied to six canonical games, these vectors allow both measurement of behavioral tendencies and causal steering of decisions, offering a mechanistic handle on strategic behavior.

Large language models (LLMs) are increasingly deployed as autonomous decision-makers in strategic settings, yet we have limited tools for understanding their high-level behavioral traits. We use activation steering methods in game-theoretic settings, constructing persona vectors for altruism, forgiveness, and expectations of others by contrastive activation addition. Evaluating on canonical games, we find that activation steering systematically shifts both quantitative strategic choices and natural-language justifications. However, we also observe that rhetoric and strategy can diverge under steering. In addition, vectors for self-behavior and expectations of others are partially distinct. Our results suggest that persona vectors offer a promising mechanistic handle on high-level traits in strategic environments.
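A minimal sketch of the contrastive idea, using plain lists in place of real model activations: the vector is the difference between mean activations on trait-positive and trait-negative prompts, and steering adds a scaled copy of it to a hidden state. The toy numbers, the function names, and the scale `alpha` are assumptions; a real setup would read and write activations at a chosen layer of the model.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def persona_vector(pos_acts, neg_acts):
    """Difference of mean activations: trait-positive minus trait-negative."""
    mean_pos = mean_vector(pos_acts)
    mean_neg = mean_vector(neg_acts)
    return [p - q for p, q in zip(mean_pos, mean_neg)]


def steer(hidden, vector, alpha):
    """Add a scaled persona vector to a hidden state (causal steering)."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


def project(hidden, vector):
    """Scalar projection of a hidden state onto the vector (measurement)."""
    norm = sum(v * v for v in vector) ** 0.5
    return sum(h * v for h, v in zip(hidden, vector)) / norm


# Toy 2-D "activations" for prompts that do / do not express the trait.
pos = [[1.0, 0.0], [3.0, 2.0]]
neg = [[0.0, 1.0], [2.0, 1.0]]
vec = persona_vector(pos, neg)          # [1.0, 0.0]
steered = steer([0.5, 0.5], vec, 2.0)   # [2.5, 0.5]
score = project(steered, vec)           # 2.5
```

The same vector serves both roles mentioned in the abstract: `project` gives a measurement of how strongly a state expresses the trait, and `steer` shifts the state along that direction.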


The Intelligent Disobedience Game: Formulating Disobedience in Stackelberg Games and Markov Decision Processes

cs.AI cs.GT cs.LG Benedikt Hornig, Reuth Mirsky · Mar 22, 2026

This paper addresses the challenge of "intelligent disobedience" in shared autonomy — when assistive AI must override human commands to prevent harm but remain helpful. The authors formalize this as the Intelligent Disobedience Game (IDG), a sequential Stackelberg game where a human leader proposes actions and an assistive follower with superior environmental awareness decides whether to obey or intervene. The framework aims to provide the mathematical foundations for training safety-critical assistive systems.

In shared autonomy, a critical tension arises when an automated assistant must choose between obeying a human's instruction and deliberately overriding it to prevent harm. This safety-critical behavior is known as intelligent disobedience. To formalize this dynamic, this paper introduces the Intelligent Disobedience Game (IDG), a sequential game-theoretic framework based on Stackelberg games that models the interaction between a human leader and an assistive follower operating under asymmetric information. It characterizes optimal strategies for both agents across multi-step scenarios, identifying strategic phenomena such as "safety traps," where the system indefinitely avoids harm but fails to achieve the human's goal. The IDG provides a needed mathematical foundation that enables both the algorithmic development of agents that can learn safe non-compliance and the empirical study of how humans perceive and trust disobedient AI. The paper further translates the IDG into a shared control Multi-Agent Markov Decision Process representation, forming a compact computational testbed for training reinforcement learning agents.
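The leader-follower interaction and the safety-trap phenomenon can be illustrated with a toy corridor. Everything here is an assumption made for illustration (the 1-D world, the veto rule, the names `run_episode` and `human_policy`); it has no payoffs or equilibrium computation and is not the paper's formal IDG.

```python
def run_episode(goal, hazards, human_policy, max_steps=20):
    """Toy 1-D corridor with a human leader and an assistive follower.

    The human proposes a step (-1, 0, or +1) without seeing hazards.
    The follower sees hazards and obeys unless the proposed move would
    enter a hazard cell, in which case it overrides by staying put.
    """
    pos = 0
    overrides = 0
    for _ in range(max_steps):
        if pos == goal:
            break
        proposed = pos + human_policy(pos)
        if proposed in hazards:
            overrides += 1  # intelligent disobedience: veto the command
        else:
            pos = proposed  # obey
    return {"reached": pos == goal, "overrides": overrides, "pos": pos}


def always_forward(pos):
    """A human who keeps commanding one step forward."""
    return 1


clear = run_episode(goal=3, hazards=set(), human_policy=always_forward)
trapped = run_episode(goal=3, hazards={2}, human_policy=always_forward)
```

In the `trapped` run the follower never enters the hazard but vetoes the same command on every remaining step, so the episode ends short of the goal: harm is avoided indefinitely while the human's objective is never met.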
