This paper addresses interactive text-to-image retrieval (I-TIR) where diffusion models generate visual proxies from dialogue, but static additive fusion of text and generated images introduces harmful noise. The core idea is ADaFuSE, a lightweight plug-in module combining adaptive gating (to dynamically weight modalities per instance) with a semantic-aware mixture-of-experts branch (to capture fine-grained cross-modal cues). The work matters because it challenges the assumption that diffusion-augmented retrieval always benefits from generated images, showing that up to 55.62% of queries suffer degradation under static fusion.
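The contrast between static additive fusion and per-instance gating can be sketched as follows. This is a minimal illustration, not ADaFuSE itself: the scalar sigmoid gate, the function names, and the `gate_weights` parameter are assumptions made for the example, and the semantic-aware mixture-of-experts branch is omitted.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def static_fusion(text_emb, image_emb):
    """Static additive fusion: every query gets the same fixed mix."""
    return [t + v for t, v in zip(text_emb, image_emb)]


def gated_fusion(text_emb, image_emb, gate_weights, gate_bias=0.0):
    """Per-instance gated fusion (assumed form, for illustration only).

    A scalar gate g in (0, 1) is computed from the concatenated
    embeddings. g weights the text embedding and (1 - g) weights the
    generated-image embedding, so a noisy image can be down-weighted.
    """
    features = list(text_emb) + list(image_emb)
    logit = sum(w * f for w, f in zip(gate_weights, features)) + gate_bias
    g = sigmoid(logit)
    fused = [g * t + (1.0 - g) * v for t, v in zip(text_emb, image_emb)]
    return fused, g


if __name__ == "__main__":
    text = [1.0, 0.0]
    image = [0.0, 1.0]
    print(static_fusion(text, image))
    # Hypothetical gate parameters; in practice these would be learned.
    print(gated_fusion(text, image, gate_weights=[0.0] * 4, gate_bias=2.0))
```

With a large positive gate logit the fused vector approaches the text embedding alone, which is the behaviour one would want for queries where the generated image is harmful.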
This paper identifies "semantic shift", the intrinsic evolution of meaning within a text, as the root cause of embedding pathologies such as anisotropy and length-induced collapse. The authors argue that pooling-based aggregation forces "semantic smoothing," where diverse sentences compromise into a diluted representation. They formalize semantic shift as the product of local evolution and global dispersion ($\mathrm{Shift}(k) = \mathrm{Local}(k) \cdot \mathrm{Disp}(k)$), showing through controlled concatenation experiments that it predicts embedding concentration and retrieval degradation better than text length alone. The work reframes geometric pathologies not as inherent model defects but as consequences of content structure interacting with pooling mechanics.
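The product form of the shift measure can be illustrated with a small sketch. The summary does not define $\mathrm{Local}(k)$ or $\mathrm{Disp}(k)$, so both definitions below are assumptions chosen only to make the product concrete: local evolution is taken as the mean cosine distance between consecutive sentence embeddings, and global dispersion as the mean cosine distance to the centroid, each over the first $k$ sentences.

```python
import math


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # treat a zero vector as maximally distant
    return 1.0 - dot / norm


def local_evolution(embs):
    """Assumed Local(k): mean distance between consecutive sentences."""
    if len(embs) < 2:
        return 0.0
    steps = [cosine_distance(embs[i], embs[i + 1]) for i in range(len(embs) - 1)]
    return sum(steps) / len(steps)


def global_dispersion(embs):
    """Assumed Disp(k): mean distance of each sentence to the centroid."""
    dim = len(embs[0])
    centroid = [sum(e[d] for e in embs) / len(embs) for d in range(dim)]
    return sum(cosine_distance(e, centroid) for e in embs) / len(embs)


def semantic_shift(sentence_embs, k):
    """Shift(k) = Local(k) * Disp(k) over the first k sentence embeddings."""
    prefix = sentence_embs[:k]
    return local_evolution(prefix) * global_dispersion(prefix)


if __name__ == "__main__":
    repeated = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    drifting = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
    print(semantic_shift(repeated, 3), semantic_shift(drifting, 3))
```

Under these assumed definitions, a text that repeats one sentence has zero shift at any length, while a text whose sentences drift has positive shift, which matches the claim that content structure rather than length alone drives the effect.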
Code retrieval currently relies on dense embeddings. This paper proposes SPLADE-Code, the first large-scale family of learned sparse retrieval (LSR) models for code search (600M–8B parameters). The authors address challenges specific to code, including subword fragmentation, the semantic gap between natural language and code, and the latency cost of long code documents. Their lightweight single-stage training achieves 75.4 nDCG@10 on MTEB Code under 1B parameters (state-of-the-art for that size) and 79.0 with 8B parameters, while enabling sub-millisecond retrieval via inverted indices.
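To make the inverted-index retrieval step concrete, here is a minimal sketch of scoring sparse vectors by dot product over posting lists. It assumes the term weights have already been produced by some learned sparse encoder; the toy vocabulary, document IDs, and function names are hypothetical, and nothing here reproduces SPLADE-Code's model or its latency figures.

```python
from collections import defaultdict


def build_inverted_index(doc_vectors):
    """Map each term to a posting list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, vector in doc_vectors.items():
        for term, weight in vector.items():
            if weight > 0:
                index[term].append((doc_id, weight))
    return index


def search(index, query_vector, top_k=3):
    """Score documents by sparse dot product, touching only the posting
    lists of terms that appear in the query."""
    scores = defaultdict(float)
    for term, q_weight in query_vector.items():
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical sparse vectors over a toy vocabulary.
    docs = {
        "quicksort.py": {"sort": 2.0, "list": 1.0},
        "client.py": {"http": 1.5, "list": 0.5},
    }
    index = build_inverted_index(docs)
    print(search(index, {"sort": 1.0, "list": 1.0}))
```

Only documents sharing at least one term with the query are visited, which is why sparse representations pair naturally with inverted indices.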
This paper addresses cross-lingual knowledge graph fusion, where heterogeneous KGs in different languages must be unified without expensive manually-curated seed alignments. The core idea is to use Large Language Models as a universal semantic bridge by linearizing graph triplets into natural language sequences and sequentially agglomerating multiple graphs. This matters because it promises zero-shot alignment capability for low-resource languages where traditional embedding-based methods fail due to lack of training data.
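Linearizing triplets into text can be sketched in a few lines. The sentence template, the underscore handling, and the prompt wording below are all assumptions for illustration; the summary does not specify the paper's actual template or prompt, and no LLM call is made here.

```python
def linearize_triple(head, relation, tail):
    """Render one (head, relation, tail) triple as a short sentence."""
    return f"{head} {relation.replace('_', ' ')} {tail}."


def linearize_graph(triples):
    """Concatenate the linearized triples of one graph into a sequence."""
    return " ".join(linearize_triple(h, r, t) for h, r, t in triples)


def build_alignment_prompt(graph_a, graph_b):
    """Hypothetical prompt asking an LLM to align entities across graphs."""
    return (
        "Graph A: " + linearize_graph(graph_a) + "\n"
        "Graph B: " + linearize_graph(graph_b) + "\n"
        "List the entities in Graph A and Graph B that refer to the same thing."
    )


if __name__ == "__main__":
    english = [("Paris", "capital_of", "France")]
    german = [("Paris", "Hauptstadt_von", "Frankreich")]
    print(build_alignment_prompt(english, german))
```

Sequential agglomeration would then repeat this step, feeding the fused result back in as one of the two graphs when the next graph arrives.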
This paper addresses personalized information retrieval for XML documents by representing users, queries, and documents as weighted concept vectors derived from a domain ontology. The core idea is a hierarchical weighting scheme that favors specific (deeper) ontology concepts combined with a dynamic profile update mechanism that reinforces concepts based on user interactions. The work targets the limitation of traditional keyword-based systems that return identical results regardless of user knowledge or preferences.
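A depth-favoring weight and an interaction-driven profile update might look like the sketch below. Both formulas are assumptions, since the summary gives no equations: the weight scales concept frequency by relative ontology depth, and the update adds a fraction of each consulted document's concept weights to the user profile.

```python
def concept_weight(frequency, depth, max_depth):
    """Assumed hierarchical weight: deeper (more specific) concepts in
    the ontology count more than shallow, generic ones."""
    return frequency * (depth / max_depth)


def update_profile(profile, doc_vector, learning_rate=0.1):
    """Assumed update rule: reinforce the concepts of a document the
    user interacted with. Returns a new profile; the input is unchanged."""
    updated = dict(profile)
    for concept, weight in doc_vector.items():
        updated[concept] = updated.get(concept, 0.0) + learning_rate * weight
    return updated


if __name__ == "__main__":
    # Hypothetical concepts with (frequency, depth) in a 4-level ontology.
    doc = {
        "Vehicle": concept_weight(frequency=3, depth=1, max_depth=4),
        "ElectricCar": concept_weight(frequency=2, depth=4, max_depth=4),
    }
    profile = update_profile({"Vehicle": 0.2}, doc, learning_rate=0.5)
    print(doc, profile)
```

In this toy case the specific concept outweighs the generic one even though it occurs less often, which is the intended effect of favoring depth.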
Generative recommender systems like TIGER excel at semantic retrieval but ignore the economic realities of monetization via sponsored content. This paper proposes GEM-Rec, a unified framework that augments semantic IDs with control tokens (<ORG>, <AD>) to factorize slot allocation from item generation, and introduces Bid-Aware Decoding to inject real-time auction bids into inference. The work bridges the gap between generative recommendation and computational advertising, offering theoretical guarantees like allocative monotonicity while allowing dynamic trade-offs between user relevance and platform revenue.
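One simple way to picture bid injection at inference time is an additive bonus on the model's log-probabilities. The additive form, the trade-off parameter `lam`, and the item names below are assumptions for illustration; the summary does not give GEM-Rec's actual scoring rule.

```python
def bid_aware_scores(log_probs, bids, lam):
    """Assumed scoring rule: relevance log-probability plus lam * bid.

    Organic items have no bid and keep their original score. With
    lam >= 0, raising an ad's bid can never lower its score, which is
    the intuition behind allocative monotonicity.
    """
    return {item: lp + lam * bids.get(item, 0.0) for item, lp in log_probs.items()}


def pick_item(log_probs, bids, lam):
    """Choose the highest-scoring candidate for the current slot."""
    scores = bid_aware_scores(log_probs, bids, lam)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical candidates for a single slot.
    log_probs = {"organic_1": -0.5, "ad_1": -1.5}
    bids = {"ad_1": 2.0}
    for lam in (0.0, 0.25, 1.0):
        print(lam, pick_item(log_probs, bids, lam))
```

Sweeping `lam` from zero upward moves the system from pure relevance toward revenue, which mirrors the dynamic trade-off described above.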
AgenticRec attacks a key gap in LLM-based recommenders: existing agents rely on frozen reasoning chains and cannot learn from ranking feedback to refine tool use. The paper proposes a two-stage training framework that combines ReAct-style tool invocation with list-wise Group Relative Policy Optimization (GRPO) and Progressive Preference Refinement (PPR) for hard-negative mining. The work matters because it demonstrates that end-to-end reinforcement learning can align multi-step tool use with ranking objectives, moving beyond prompt-engineered agent workflows.
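The group-relative part of GRPO can be sketched independently of the agent: several rollouts are sampled for the same user, each is scored by a ranking reward, and rewards are standardized within the group. Using NDCG as the list-wise reward is an assumption here, as are the function names; the summary does not state the paper's exact reward.

```python
import math


def dcg(relevances):
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """List-wise reward for one ranked list (assumed reward choice)."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within a group of rollouts for one prompt.

    Rollouts that beat the group mean get a positive advantage, so no
    separate value network is needed to form a baseline.
    """
    mean = sum(rewards) / len(rewards)
    variance = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(variance)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Relevance labels of the ranked lists from three sampled rollouts.
    rollouts = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    rewards = [ndcg(r) for r in rollouts]
    print(rewards, group_relative_advantages(rewards))
```

The rollout that ranks the relevant item first receives the largest advantage, so the policy gradient pushes the tool-use trajectory that produced it.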
This paper tackles the lack of shared formalism for comparing hierarchical memory systems in language agents. It proposes a unifying theory based on three operators: extraction (α) that maps raw data to atomic units, coarsening (C = (π, ρ)) that partitions and summarizes units, and traversal (τ) that selects content under a token budget. The core insight is the self-sufficiency spectrum of representatives ρ, which constrains viable retrieval strategies, an observation the authors call the coarsening-traversal (C–T) coupling.
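The three operators can be mimicked with toy stand-ins to show how the choice of representative affects traversal. Everything below is an illustrative assumption: sentence splitting for α, fixed-size grouping for π, "first unit" as the representative ρ, whitespace word counts as tokens, and a greedy in-order traversal for τ that ignores query relevance.

```python
def extract(raw_text):
    """alpha: map raw data to atomic units (here, sentences)."""
    return [s.strip() for s in raw_text.split(".") if s.strip()]


def coarsen(units, group_size):
    """C = (pi, rho): partition units into groups and attach a
    representative to each group (here, simply the first unit)."""
    groups = [units[i:i + group_size] for i in range(0, len(units), group_size)]
    return [{"rep": group[0], "members": group} for group in groups]


def traverse(nodes, budget, self_sufficient):
    """tau: select content under a token budget.

    If representatives are self-sufficient, return them directly;
    otherwise descend into the members, which costs more tokens per
    group and so covers fewer groups within the same budget.
    """
    selected, used = [], 0
    for node in nodes:
        items = [node["rep"]] if self_sufficient else node["members"]
        for text in items:
            cost = len(text.split())
            if used + cost > budget:
                return selected
            selected.append(text)
            used += cost
    return selected


if __name__ == "__main__":
    memory = coarsen(extract("a b. c d. e f. g h."), group_size=2)
    print(traverse(memory, budget=4, self_sufficient=True))
    print(traverse(memory, budget=4, self_sufficient=False))
```

With the same budget, self-sufficient representatives let traversal cover both groups, whereas non-self-sufficient ones force it to spend the whole budget inside the first group. That dependence of τ on ρ is the coupling in miniature.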
This paper introduces a scalable framework to measure institutional variation in solid-organ transplant patient education materials using retrieval-augmented generation (RAG). The authors ground 1,115 patient questions across 102 handbooks from 23 U.S. centers, then classify answer pairs into a five-label taxonomy (Absent, Consistent, Complementary, Divergent, Contradictory). The work exposes critical information gaps: 96.2% of question-handbook pairs miss relevant content, and 20.8% of non-absent pairs show clinically meaningful divergence, with reproductive health nearly absent (95.1%) across all materials.
Understanding collective human intent from noisy, conflicting public discourse represents a frontier AI challenge that extends beyond individual instruction-following. This paper introduces COIN-Bench, a live-updating benchmark comprising 200k+ real consumer discussions across 1,400+ products, which operationalizes an Active Probing Paradigm requiring LLMs to act as meta-analysts and reconstruct chaotic feedback into structured questionnaires. The work matters because it shifts evaluation from transactional action prediction to hierarchical consensus synthesis, testing whether models can resolve contradictions and infer latent trends from swarm-like intelligence.
This paper introduces ECI (Effective Contrastive Information), a training-free metric for evaluating hard-negative mining strategies in dense retrieval. The core idea is to leverage the logarithmic InfoNCE bound on mutual information combined with a harmonic mean of signal (hardness) and safety (margin) to predict downstream retrieval quality without expensive fine-tuning. The proposed metric addresses a real pain point in retrieval research: practitioners currently must run end-to-end ablation studies to evaluate negative sampling strategies, which is computationally wasteful.
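The two ingredients named above can be written down separately. The InfoNCE bound on mutual information, $I \geq \log N - \mathcal{L}_{\mathrm{InfoNCE}}$, is standard; how ECI combines it with the harmonic mean is not given in the summary, so the sketch below does not attempt the combination. The assumption that signal and safety are scores in $[0, 1]$ is also ours.

```python
import math


def infonce_lower_bound(num_candidates, infonce_loss):
    """Standard InfoNCE bound: I(query; doc) >= log(N) - loss, where N
    counts the positive plus the negatives. The bound can grow only
    logarithmically as more negatives are added."""
    return math.log(num_candidates) - infonce_loss


def harmonic_mean(signal, safety):
    """Harmonic mean of hardness (signal) and margin (safety).

    It is dominated by the smaller value, so very hard negatives that
    are unsafe (likely false negatives) still score low.
    """
    if signal <= 0 or safety <= 0:
        return 0.0
    return 2 * signal * safety / (signal + safety)


if __name__ == "__main__":
    # Hypothetical (signal, safety) pairs for two mining strategies.
    print(harmonic_mean(0.9, 0.1), harmonic_mean(0.6, 0.6))
    print(infonce_lower_bound(num_candidates=8, infonce_loss=0.5))
```

The harmonic mean penalizes an imbalanced strategy more than an arithmetic mean would, which is a plausible reason for choosing it when both hardness and safety must hold at once.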