Politics of Questions in News: A Mixed-Methods Study of Interrogative Stances as Markers of Voice and Power

cs.CL cs.CY Bros Victor, Barbini Matilde, Gerard Patrick, Gatica-Perez Daniel · Mar 23, 2026
Plain-language introduction

This paper investigates how interrogative stances function as markers of voice and power in French-language digital news. Analyzing over 1.2 million articles from 24 outlets (2023–2024) through a mixed-methods pipeline combining LLM pseudo-labeling and qualitative annotation, the authors operationalize pragmatic concepts like answerhood and dialogicity at scale. The study reveals that questions are sparse but structurally significant, predominantly serving framing functions rather than information-seeking, and centering elite actors over diffuse publics.

Critical review
Verdict
Bottom line

The paper makes a valuable contribution by bridging computational linguistics and media sociology to study questioning practices in news. The mixed-methods design, in which Qwen3-30B generates pseudo-labels for fine-tuning CamemBERT-large classifiers, offers a pragmatic balance between scale and interpretability. Classifier quality is uneven, however: against human gold standards, the binary detector achieves F1 0.78 for the interrogative class, while the six-way stance classifier reaches only macro-F1 0.51. In addition, the reliance on embedding-based answer detection risks conflating semantic similarity with pragmatic resolution, and the teacher-student pipeline may propagate systematic biases from the LLM into downstream analyses.

“the binary interrogative detector achieves 0.97 accuracy, with precision 0.76, recall 0.80, and F1-score 0.78 for the positive (interrogative) class ... the six-way stance classifier attains a macro-averaged F1 of 0.51”
Bros et al., Sec. 4.4
“ChatGPT outperforms crowd-workers for several annotation tasks, including relevance, stance, topics, and frames detection ... zero-shot accuracy of ChatGPT exceeds that of crowd-workers for four out of five tasks”
Gilardi et al. 2023, Sec. 1
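As a quick consistency check on the quoted figures, the sketch below recomputes F1 from the reported precision and recall, and then uses a set of hypothetical confusion counts (not taken from the paper) to illustrate how a rare positive class can yield accuracy near 0.97 alongside a much lower F1. The function names and the counts are assumptions made for illustration only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int):
    """Return (accuracy, precision, recall, F1) for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return accuracy, precision, recall, f1_score(precision, recall)


# Precision and recall as quoted from Sec. 4.4 for the interrogative class.
reported_f1 = f1_score(0.76, 0.80)
print(round(reported_f1, 2))  # 0.78

# HYPOTHETICAL counts: 1,000 sentences, of which 50 are interrogative.
# These are illustrative only and are not the paper's confusion matrix.
accuracy, precision, recall, f1 = metrics_from_counts(tp=40, fp=13, fn=10, tn=937)
print(round(accuracy, 3), round(f1, 3))
```

Because interrogatives are sparse, the large pool of true negatives dominates accuracy, which is why the positive-class F1 is the more informative figure here.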
What holds up

The conceptual framework successfully integrates pragmatic theories of questions with computational text analysis, demonstrating that interrogative stances can be operationalized meaningfully at corpus scale. The finding that framing-procedural questions dominate (accounting for "just over half of all interrogatives") while explicitly leading and tag questions remain rare ("about 2%" each) offers robust evidence that journalistic questions primarily structure exposition. The systematic variation across editorial scales, with higher interrogative density in thematic outlets ($\bar{I} \approx 0.036$) than in transnational ones ($\bar{I} \approx 0.020$), holds up across sensitivity checks: the authors note that varying the confidence cutoff from 0.6 to 0.8 keeps mean article-level density in a narrow band ($ID_a = 0.0236$–$0.0261$).

“framing-procedural ... 52.1 ... leading ... 2.1 ... tag ... 2.1”
Bros et al., Sec. 4.1 · Table 1
“varying the binary/stance confidence cutoff from 0.6 to 0.8 changes the absolute number of detected interrogatives but keeps mean article-level interrogative density in a narrow band ($ID_a=0.0236$-$0.0261$)”
Bros et al., Sec. 4.1 · Appendix Table 5
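The sensitivity check can be pictured with a small sketch. It assumes that article-level interrogative density is the share of an article's sentences whose interrogative confidence meets the cutoff; the review does not state the paper's exact definition, so this formula, the function names, and the toy confidence scores are all assumptions.

```python
from statistics import mean


def article_density(confidences, cutoff):
    """Share of sentences whose interrogative confidence meets the cutoff."""
    if not confidences:
        return 0.0
    return sum(c >= cutoff for c in confidences) / len(confidences)


def mean_density(corpus, cutoff):
    """Mean of per-article densities across a corpus."""
    return mean(article_density(article, cutoff) for article in corpus)


# TOY corpus: one list of per-sentence confidences per article (made up).
toy_corpus = [
    [0.95, 0.10, 0.05, 0.02],
    [0.70, 0.65, 0.03, 0.01, 0.04],
    [0.02, 0.01, 0.03],
]

# Sweep the same cutoff range the authors report (0.6 to 0.8).
for cutoff in (0.6, 0.7, 0.8):
    print(cutoff, round(mean_density(toy_corpus, cutoff), 4))
```

In the toy data the density moves a lot because several scores sit near the cutoffs. The narrow band reported by the authors would correspond to a corpus in which few sentences have confidences between 0.6 and 0.8.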
Main concerns

Several limitations constrain the interpretive validity of the quantitative measures. First, the answer identification heuristic relies on cosine similarity between CamemBERT embeddings ($\geq 0.40$ threshold) to detect "answer-like" spans, which the authors acknowledge captures "semantic relatedness rather than manually verified resolution"; the 95.6% answerability rate should therefore be read as an "upper bound on local textual uptake rather than a literal estimate of fully resolved answerhood." Second, the six-way stance classifier exhibits substantial confusion between pragmatically adjacent categories, with "information-seeking and rhetorical cases often absorbed into framing-procedural," and the teacher-student pipeline risks propagating LLM biases into the final models. Third, the NER-based "addressivity" indices conflate mention with address: detecting that an entity appears near a question does not establish an accountability relation, a limitation the authors note but do not fully resolve methodologically.

“the 95.6% figure as a heuristic upper bound on local textual uptake rather than a literal estimate of fully resolved answerhood”
Bros et al., Sec. 4.2
“information-seeking and rhetorical cases are often absorbed into framing-procedural, while leading questions are frequently mapped to rhetorical or framing-procedural”
Bros et al., Sec. 4.4
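To make the first concern concrete, here is a minimal sketch of a threshold-based answer heuristic of the kind described: it scans windows of 1 to 5 sentences following a question and keeps the most similar window whose cosine similarity is at least 0.40. The toy two-dimensional vectors stand in for CamemBERT embeddings, and averaging sentence vectors to represent a window is an assumption; the paper's actual pooling and search procedure may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors (assumed pooling)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def find_answer_like_span(question_vec, following_vecs,
                          threshold=0.40, window_lengths=(1, 2, 3, 4, 5)):
    """Return (start, end, similarity) of the best window at or above the
    threshold, or None if no window qualifies. `end` is exclusive."""
    best = None
    for start in range(len(following_vecs)):
        for length in window_lengths:
            end = start + length
            if end > len(following_vecs):
                break
            sim = cosine(question_vec, mean_vector(following_vecs[start:end]))
            if sim >= threshold and (best is None or sim > best[2]):
                best = (start, end, sim)
    return best


# TOY vectors (hypothetical): the second sentence is close to the question.
question = [1.0, 0.0]
following = [[0.0, 1.0], [1.0, 0.1], [0.0, 1.0]]
print(find_answer_like_span(question, following))
```

Nothing in this procedure checks whether the selected span resolves the question. A sentence that merely restates the topic would score just as well, which is why the resulting rate is best read as an upper bound on uptake.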
Evidence and comparison

The evidence supports the broad claim that interrogatives cluster around elite actors rather than diffuse publics, with "58.5% of questions mention at least one person" while "only 2.7% of questions mention a public or audience." The comparison to prior work on broadcast interviews (Clayman and Heritage) appropriately notes the shift from spoken interaction to written news, though the paper could more explicitly address how the absence of sequential turn-taking in written news affects the applicability of conversational accountability concepts. The authors effectively position their work against large-scale news studies that "treat all sentences alike, without distinguishing interrogatives from declaratives," but under-cite recent computational pragmatics work on question-answering in dialogue that might offer alternative methodological approaches.

“58.5% of questions mention at least one person ... only 2.7% of questions mention a public or audience”
Bros et al., Sec. 4.3
“Large-scale NLP studies of news ... focus mainly on topics, sentiment, or frames and usually treat all sentences alike, without distinguishing interrogatives from declaratives or separating different interrogative functions”
Bros et al., Sec. 2
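The mention shares quoted above are, in effect, simple proportions over questions. The sketch below shows one way such a share could be computed, assuming each question has already been reduced to the set of entity types found in its context. The labels "PER", "ORG", "LOC", and "PUBLIC" and the toy data are hypothetical and do not reproduce the paper's tag set or figures.

```python
def mention_share(questions, entity_type):
    """Fraction of questions whose context contains the given entity type."""
    if not questions:
        return 0.0
    hits = sum(1 for entity_types in questions if entity_type in entity_types)
    return hits / len(questions)


# TOY input: one set of detected entity types per question (made up).
toy_questions = [
    {"PER", "ORG"},
    {"PER"},
    {"LOC"},
    set(),
]

print(mention_share(toy_questions, "PER"))     # 0.5
print(mention_share(toy_questions, "PUBLIC"))  # 0.0
```

A measure of this kind counts co-occurrence only. It cannot distinguish a question addressed to a person from one that merely names them.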
Reproducibility

The study provides substantial methodological transparency: code and derived annotations are available at https://gitlab.idiap.ch/socialcomputing/politics-of-questions, with detailed hyperparameters for CamemBERT training (learning rate $2 \times 10^{-5}$, batch sizes 16 for training and 32 for validation, early stopping with patience 3) and answer identification (cosine threshold 0.40, window lengths $L \in \{1,2,3,4,5\}$). However, full reproducibility is blocked by copyright restrictions on the raw news corpus (1.2M articles from CCNews and Swiss outlets), and the paper's checklist acknowledges that hardware details and total compute are not reported, citing time constraints. The use of a local Qwen3-30B-A3B-Instruct instance for pseudo-labeling hinders external verification of the training data generation step, which is critical given that student models inherit "systematic biases from the teacher model."

“learning rate $2\times 10^{-5}$ ... batch sizes 16 (train) and 32 (validation) ... early stopping (patience 3) ... cosine similarity $\geq 0.40$ ... window length $L\in\{1,2,3,4,5\}$”
Bros et al., Appendix B
“the teacher-student setup may propagate some of the pseudo-labeler's preferences into the final classifiers”
Bros et al., Sec. 5
“Did you include the total amount of compute and the type of resources used ... No, because the current draft does not yet specify hardware details or total compute. Time constraints for the publication prevented a reliable estimation of the compute resources”
Bros et al., Checklist · Checklist 4d
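For readers attempting a re-run, the reported training settings can be collected in one place, as in the sketch below. The numeric defaults come from the Appendix B quote; the class name, field names, and the choice of a higher-is-better validation score as the early-stopping criterion are assumptions, since the review does not say which metric the authors monitored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as quoted from Appendix B (field names are assumed)."""
    learning_rate: float = 2e-5
    train_batch_size: int = 16
    eval_batch_size: int = 32
    early_stopping_patience: int = 3


def stopping_epoch(val_scores, patience):
    """Return the 0-based epoch at which training would stop.

    Assumes a higher-is-better validation score. Training stops once the
    score has failed to improve for `patience` consecutive epochs; if that
    never happens, the last epoch is returned.
    """
    best = float("-inf")
    stale = 0
    for epoch, score in enumerate(val_scores):
        if score > best:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_scores) - 1


config = FineTuneConfig()
# HYPOTHETICAL validation scores, for illustration only.
scores = [0.40, 0.45, 0.50, 0.49, 0.50, 0.48, 0.47]
print(stopping_epoch(scores, config.early_stopping_patience))  # 5
```

Even with these settings fixed, results may differ across hardware and random seeds, which is one reason the missing compute details matter for reproducibility.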
Abstract

Interrogatives in news discourse have been examined in linguistics and conversation analysis, but mostly in broadcast interviews and relatively small, often English-language corpora, while large-scale computational studies of news rarely distinguish interrogatives from declaratives or differentiate their functions. This paper brings these strands together through a mixed-methods study of the "Politics of Questions" in contemporary French-language digital news. Using over one million articles published between January 2023 and June 2024, we automatically detect interrogative stances, approximate their functional types, and locate textual answers when present, linking these quantitative measures to a qualitatively annotated subcorpus grounded in semantic and pragmatic theories of questions. Interrogatives are sparse but systematically patterned: they mainly introduce or organize issues, with most remaining cases being information-seeking or echo-like, while explicitly leading or tag questions are rare. Although their density and mix vary across outlets and topics, our heuristic suggests that questions are overwhelmingly taken up within the same article and usually linked to a subsequent answer-like span, most often in the journalist's narrative voice and less often through quoted speech. Interrogative contexts are densely populated with named individuals, organizations, and places, whereas publics and broad social groups are mentioned much less frequently, suggesting that interrogative discourse tends to foreground already prominent actors and places and thus exhibits strong personalization. We show how interrogative stance, textual uptake, and voice can be operationalized at corpus scale, and argue that combining computational methods with pragmatic and sociological perspectives can help account for how questioning practices structure contemporary news discourse.
