Agentic Personas for Adaptive Scientific Explanations with Knowledge Graphs
This paper tackles the limitation that XAI systems assume static user models, ignoring diverse epistemic stances among domain experts. The authors propose agentic personas—structured representations of expert reasoning strategies derived from clustered feedback and instantiated via LLMs—to condition reinforcement learning-based explanation generation on knowledge graphs. This enables adaptive explanations that align with specific interpretive preferences (mechanistic rigor vs. focused clarity) without requiring extensive individual-level human feedback, demonstrated in drug discovery with 22 expert participants.
The paper offers a conceptually elegant framework for adaptive explainability that combines RLHF principles with KG-based reasoning. However, the empirical validation is severely weakened by small samples, most notably the Leo persona, which is derived from only two expert responses, and by negative inter-annotator agreement (ICC) values for that persona, which indicate poor reliability. The approach demonstrates feasibility, but the evidence does not fully support the claims of scalable, faithful expert proxying.
The conceptual framework rigorously formalizes adaptive explanations as $\mathcal{X}(h|\theta)=\{p_1,\dots,p_m\}$ where paths maximize $f(\alpha(p),\beta(p),\gamma(p,\theta))$, incorporating epistemic alignment $\gamma$ alongside fidelity $\alpha$ and relevance $\beta$. The Elena persona (based on 13 experts) shows credible alignment with human ratings ($r=0.74$–$0.80$) and the RL formulation preserves predictive performance (Hits@1 0.358 vs. REx 0.338) while yielding expert-preferred explanations (63–76% preference). The curriculum learning strategy for gradually increasing the relevance threshold $\tau_{\text{relevance}}(t)$ is technically sound.
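To make the formalization concrete, the following is a minimal sketch of selecting $m$ paths by a combined score and of a relevance threshold that rises over training. It is an illustration only, not the paper's RL procedure: the weighted-sum form of $f$, the linear schedule for $\tau_{\text{relevance}}(t)$, and all names and default values (`Path`, `score`, `relevance_threshold`, `select_explanation`, `tau_start`, `tau_end`) are assumptions, since the review does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    fidelity: float   # alpha(p)
    relevance: float  # beta(p)
    alignment: float  # gamma(p, theta) for one fixed persona theta


def score(p, weights=(1.0, 1.0, 1.0)):
    """Hypothetical f: a weighted sum of alpha, beta and gamma (assumed form)."""
    w_a, w_b, w_g = weights
    return w_a * p.fidelity + w_b * p.relevance + w_g * p.alignment


def relevance_threshold(t, total_steps, tau_start=0.2, tau_end=0.8):
    """Hypothetical linear curriculum: the threshold rises from tau_start to tau_end."""
    frac = min(max(t / total_steps, 0.0), 1.0)
    return tau_start + frac * (tau_end - tau_start)


def select_explanation(paths, m, tau):
    """Keep paths whose relevance meets the threshold, then return the top-m by score."""
    eligible = [p for p in paths if p.relevance >= tau]
    return sorted(eligible, key=score, reverse=True)[:m]


if __name__ == "__main__":
    candidates = [
        Path("a", fidelity=0.9, relevance=0.1, alignment=0.9),
        Path("b", fidelity=0.5, relevance=0.6, alignment=0.7),
        Path("c", fidelity=0.4, relevance=0.9, alignment=0.2),
    ]
    # Early in training the threshold is low, so path "a" is still eligible.
    print([p.name for p in select_explanation(candidates, m=2, tau=relevance_threshold(0, 100))])
    # Halfway through training the threshold excludes the low-relevance path "a".
    print([p.name for p in select_explanation(candidates, m=2, tau=relevance_threshold(50, 100))])
```

In this toy setting, switching the persona $\theta$ would change only the `alignment` values, which is the sense in which $\gamma$ makes the explanation set adaptive.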
The Leo persona rests on statistically insufficient data (2 experts) yet generalizes to 12 evaluation participants, raising serious validity concerns. Inter-annotator agreement, measured by the intraclass correlation coefficient (ICC), is negative for Leo on both relevance ($-1.40$) and validity ($-0.17$), indicating that the persona fails to elicit consistent evaluations across participants. The authors acknowledge that this reflects how 'different participants applied their own interpretive frameworks,' which undermines the persona's utility as a stable proxy. The claim of a 'two orders of magnitude' feedback reduction conflates speed with quality: replacing human feedback with LLM judgments changes the method, not just its cost. The claim that 5,000 human interactions would be needed remains hypothetical, with no empirical evidence that training on such feedback would produce equivalent results.
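For readers unfamiliar with negative ICC values, the sketch below shows how they arise when raters disagree more within an item than the items differ from each other. The review does not say which ICC form the paper reports, so the one-way random-effects forms used here (single-rater and average-of-$k$-raters) are an assumption, and the ratings are made-up toy data. The function name `one_way_icc` is hypothetical.

```python
def one_way_icc(ratings):
    """One-way random-effects ICC for a table of n items rated by k raters.

    Returns (single-rater ICC, average-of-k-raters ICC).
    Assumes the between-item mean square is non-zero.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-item and within-item mean squares from a one-way ANOVA.
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    icc_single = (msb - msw) / (msb + (k - 1) * msw)
    icc_average = (msb - msw) / msb
    return icc_single, icc_average


if __name__ == "__main__":
    # Toy data: two raters who agree perfectly on three items.
    print(one_way_icc([[1, 1], [3, 3], [5, 5]]))
    # Toy data: two raters who disagree strongly while item means barely differ.
    print(one_way_icc([[1, 5], [5, 1], [2, 5]]))
```

The single-rater form is bounded below by $-1/(k-1)$, while the average-rater form has no lower bound, so a value such as $-1.40$ is only possible under some ICC definitions. This is one reason the reported variant matters when interpreting the Leo results.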
Comparisons to the non-adaptive REx baseline are methodologically fair (identical architecture), but the study lacks a true human-feedback-trained baseline, making the 'scalability' claims unverified extrapolations. Task-dependent degradation in persona-expert correlations (Elena's validity drops from $r=0.74$ to $r=0.56$ on DTI tasks) suggests limited generalizability across biomedical contexts. The preference study uses a within-subjects design that may bias against the baseline. Correlation with expert ratings is presented as validation, but correlation does not imply that the persona captures the underlying reasoning strategy—only that rankings align on specific instances.
Critical implementation details are relegated to supplementary materials with no accessible link in the main text, including prompt templates for persona generation and evaluation, precise RL hyperparameters, and curriculum learning schedules. The reliance on proprietary APIs (OpenAI o3-pro for persona synthesis, GPT-4o-mini for evaluation) creates barriers to reproduction. While Hetionet is public, the specific train/validation splits and the 10 hypotheses used in the user study are not specified, making exact reproduction impossible. The claimed 187× speedup (250 hours vs. 1.34 hours) depends on unspecified assumptions about expert interaction time.
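The speedup figure can be checked with simple arithmetic. The sketch below recomputes it from the two durations quoted above and, under the assumption that the 250 hours correspond to the 5,000 hypothetical human interactions mentioned earlier in this review, derives the implied time per interaction. That pairing is an inference made here, not something the paper is confirmed to state.

```python
import math

human_hours = 250.0    # estimated time for human feedback (from the review)
persona_hours = 1.34   # time for persona-based feedback (from the review)
interactions = 5000    # hypothetical human interactions (assumed to match the 250 hours)

speedup = human_hours / persona_hours
orders_of_magnitude = math.floor(math.log10(speedup))
minutes_per_interaction = human_hours * 60 / interactions

print(f"speedup: {speedup:.1f}x")
print(f"orders of magnitude: {orders_of_magnitude}")
print(f"implied minutes per human interaction: {minutes_per_interaction}")
```

Under that assumption, the estimate amounts to a fixed few minutes per expert judgment, so the headline speedup scales directly with whatever per-interaction time is assumed.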
AI explanation methods often assume a static user model, producing non-adaptive explanations regardless of expert goals, reasoning strategies, or decision contexts. Knowledge graph-based explanations, despite their capacity for grounded, path-based reasoning, inherit this limitation. In complex domains such as scientific discovery, this assumption fails to capture the diversity of cognitive strategies and epistemic stances among experts, preventing explanations that foster deeper understanding and informed decision-making. However, the scarcity of human experts limits the use of direct human feedback to produce adaptive explanations. We present a reinforcement learning approach for scientific explanation generation that incorporates agentic personas, structured representations of expert reasoning strategies, that guide the explanation agent towards specific epistemic preferences. In an evaluation of knowledge graph-based explanations for drug discovery, we tested two personas that capture distinct epistemic stances derived from expert feedback. Results show that persona-driven explanations match state-of-the-art predictive performance while persona preferences closely align with those of their corresponding experts. Adaptive explanations were consistently preferred over non-adaptive baselines (n = 22), and persona-based training reduces feedback requirements by two orders of magnitude. These findings demonstrate how agentic personas enable scalable adaptive explainability for AI systems in complex and high-stakes domains.