Chronological Contrastive Learning: Few-Shot Progression Assessment in Irreversible Diseases

cs.CV cs.AI Clemens Watzenböck, Daniel Aletaha, Michaël Deman, Thomas Deimel, Jana Eder, Ivana Janickova, Robert Janiczek, Peter Mandl, Philipp Seeböck, Gabriela Supp, Paul Weiser, Georg Langs · Mar 23, 2026

AI Review

Plain-language introduction

This paper introduces ChronoCon, a self-supervised method that repurposes Rank-N-Contrast learning to use temporal ordering of longitudinal medical scans instead of expert severity labels. By assuming monotonic progression in irreversible diseases, the method learns progression-aware representations from routinely archived clinical metadata. The core finding is that under few-shot scenarios—using labels from only 5 patients—the model achieves an ICC of 86% for disease severity prediction on rheumatoid arthritis radiographs, potentially reducing reliance on costly expert annotations.

Critical review

Verdict

Bottom line

The paper presents a technically sound and clinically motivated approach to leveraging longitudinal imaging archives. The two-stage pipeline—first learning from chronological order, then fine-tuning on scarce labels—is well-designed and the empirical gains in low-data regimes are substantial. However, the work is limited by single-center validation and reliance on strong monotonicity assumptions that may not hold across all irreversible diseases or treatment scenarios. The claim of parity with a model trained on 367 patients lacks a clear citation in the provided text, which weakens an otherwise compelling comparison.

“fine-tuning ChronoCon on expert scores from only five patients yields an intraclass correlation coefficient of 86% for severity score prediction”

paper · Abstract

“our experiments are based on a single-center dataset, and broader multi-center validation will be necessary”

paper · Section 4 (Discussion)
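
For readers unfamiliar with the ICC figure quoted in this section, the following is a minimal sketch of one common variant, ICC(2,1) in the Shrout and Fleiss taxonomy (two-way random effects, absolute agreement, single rater). The review text does not state which ICC variant the paper reports, so the choice of ICC(2,1) is an assumption, and the function name `icc_2_1` and the toy scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per subject and one column per
    rater (for example: column 0 = expert score, column 1 = model score).
    Assumption: this variant is chosen for illustration only.
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subject mean square
    msc = ss_cols / (k - 1)                # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical toy data: the second rater is always one point higher.
perfect = [[1, 1], [2, 2], [3, 3]]
offset = [[1, 2], [2, 3], [3, 4]]
print(icc_2_1(perfect))  # 1.0
print(icc_2_1(offset))   # about 0.667
```

The second example shows why the variant matters: an absolute-agreement ICC penalizes a constant bias between model and expert, whereas a consistency-type ICC would not.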

What holds up

The methodological extension of Rank-N-Contrast (RnC) from label space to temporal ordering is innovative and well-justified. The authors correctly identify that absolute time intervals are not meaningful distances in nonlinear disease progression, and their formulation of chronological vs. anti-chronological negatives ($\mathcal{S}^{<}_{ap}$ and $\mathcal{S}^{>}_{ap}$) elegantly handles this. The ablation studies distinguishing ChronoCon from RnC:t (which uses temporal distance) and SimCLR demonstrate that the specific ordering-based strategy drives performance. The observation that longitudinal performance peaks at intermediate label fractions (6–15%) and degrades with full supervision is intriguing and suggests the pretrained features capture the progression signal more faithfully than noisy full labels.

“longitudinal performance peaks at 6–15% of labels rather than at full supervision, suggesting that strong (and noisy) labels may override the features learned during pretraining”

paper · Section 3.1

“We explicitly avoid imposing a metric on $t$”

paper · Section 2 (Methods)
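
The difference between treating $t$ as a metric and treating it as an ordering can be made concrete with a small sketch. The metric rule below is the RnC:t condition $|t_a - t_n| \geq |t_a - t_p|$ as described in this review. The order-only rule is an assumption made for illustration: it is not the paper's definition of $\mathcal{S}^{<}_{ap}$ and $\mathcal{S}^{>}_{ap}$, which the review does not reproduce. Function names and visit times are hypothetical.

```python
def metric_comparison_set(times, a, p):
    """RnC:t-style rule: keep n if |t_a - t_n| >= |t_a - t_p|."""
    ref = abs(times[a] - times[p])
    return [n for n in range(len(times))
            if n != a and abs(times[a] - times[n]) >= ref]


def ordinal_comparison_set(times, a, p):
    """Order-only rule (an assumption of this sketch, not the paper's).

    Keep n if it lies at or beyond p on the same side of the anchor as p.
    Only comparisons between visit times are used, never differences.
    """
    if times[p] > times[a]:
        return [n for n in range(len(times))
                if n != a and times[n] >= times[p]]
    return [n for n in range(len(times))
            if n != a and times[n] <= times[p]]


# Hypothetical visit times (months since baseline) for one patient.
times = [0, 1, 12, 14]
a, p = 2, 3  # anchor at month 12, positive at month 14

print(metric_comparison_set(times, a, p))   # [0, 1, 3]
print(ordinal_comparison_set(times, a, p))  # [3]
```

Both sets contain the positive itself, mirroring the RnC denominator. The metric rule also asserts that the scans at months 0 and 1 are further from the anchor than the month-14 scan, which need not hold if the disease was stable for the first year and then progressed quickly. The order-only rule makes no such claim.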

Main concerns

The primary limitation is the single-center dataset of 778 patients, which raises concerns about generalizability to different imaging protocols and populations. While the authors acknowledge this, the method's reliance on monotonic progression—a 'clinically plausible assumption' for RA—is not empirically validated against cases where treatment might arrest or reverse apparent progression. The comparison claim that the method 'performs on par with a recently published model (RMSE = 23.6) trained on 367 patients' appears in the text with a mangled footnote marker (444), and the cited work is not clearly identified in the bibliography provided, making the claim unverifiable. Additionally, the paper does not address whether the learned representations transfer to other joint diseases or imaging modalities beyond radiographs.

“The usefulness of ChronoCon depends on the presence of a valid ordering variable $t$ within subgroups of shared id”

paper · Section 4 (Discussion)

“performs on par with a recently published model (RMSE = 23.6) trained on 367 patients 444and substantially outperforms [Moradmand2025]”

paper · Section 3.1

“our experiments are based on a single-center dataset”

paper · Section 4 (Discussion)

Evidence and comparison

The internal comparisons are rigorous: ChronoCon outperforms SimCLR, DAE pretraining alone, and a supervised ImageNet baseline in low-label settings, with statistically significant differences ($p<10^{-4}$) in MSE. The distinction between ChronoCon and RnC:t (which uses $|t_a - t_n| \geq |t_a - t_p|$ for negative selection) is particularly important, as it demonstrates that treating time as an ordinal rather than metric quantity matters for nonlinear disease trajectories. However, external comparisons to prior RA scoring work are uneven: while the Moradmand2025 reference provides a clear benchmark (RMSE = 44.28), the claim of parity with a 367-patient model lacks proper citation. The comparison to 'a recently published model' with RMSE = 23.6 cannot be verified from the text provided.

“When all labels are used during pretraining (via $L^{\mathrm{OrdinalCon:Y}}$ or $L^{\mathrm{RnC}}$), these methods perform best”

paper · Section 3.3

“applying $L^{\mathrm{OrdinalCon:Y}}$ to the ordinal JSN/ERO scores yields the strongest overall results, improving cross-sectional MSE over RnC ($p=0.016$)”

paper · Section 3.3
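
For context on the $L^{\mathrm{RnC}}$ baseline referenced in this section, the following is a minimal sketch of the general Rank-N-Contrast form on label distances. It is not the authors' implementation. The similarity function (negative Euclidean distance), the averaging over pairs, and the default temperature are assumptions of this sketch. The function name `rank_contrast_loss` is hypothetical.

```python
import math


def rank_contrast_loss(features, labels, tau=2.0):
    """Rank-N-Contrast-style loss, averaged over all anchor/positive pairs.

    For anchor i and positive j, the denominator sums over every k != i
    whose label is at least as far from y_i as y_j is. That set includes
    j itself, so each pair's loss is non-negative.
    Assumption: similarity is negative Euclidean distance.
    """
    n = len(features)

    def sim(u, v):
        return -math.dist(u, v)

    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            d_ij = abs(labels[i] - labels[j])
            denom = sum(
                math.exp(sim(features[i], features[k]) / tau)
                for k in range(n)
                if k != i and abs(labels[i] - labels[k]) >= d_ij
            )
            total += math.log(denom) - sim(features[i], features[j]) / tau
            pairs += 1
    return total / pairs


# Hypothetical 1-D embeddings: one ordered like the labels, one shuffled.
labels = [0, 1, 2, 3]
ordered = [[0.0], [1.0], [2.0], [3.0]]
shuffled = [[0.0], [3.0], [1.0], [2.0]]
print(rank_contrast_loss(ordered, labels, tau=1.0)
      < rank_contrast_loss(shuffled, labels, tau=1.0))  # True
```

Embeddings whose geometry follows the label ranking receive a lower loss than shuffled ones. As described in this review, ChronoCon keeps this ranking idea but derives the comparison sets from within-patient visit order rather than from label distances.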

Reproducibility

The code is publicly available at https://github.com/cirmuw/ChronoCon, and the training procedure—including the two-stage protocol, data augmentation pipeline, and hyperparameter ranges ($\tau \in [0.1,5]$, encoder LR $4\cdot 10^{-4}$)—is documented in detail in Appendix C. However, the dataset is proprietary: radiographs from the Medical University of Vienna are not publicly available due to ethical and legal constraints, which prevents independent reproduction of the exact results. The method relies on automatic joint localization using a Spatial Configuration Network (SCN), and while the landmark detection code is available separately, the lack of public data means the few-shot learning experiments cannot be independently validated.

“Due to ethical, legal, and data protection constraints, the data are not publicly available. Access may be granted upon reasonable request”

paper · Data Availability (end of paper)

“Code is available at https://github.com/cirmuw/ChronoCon”

paper · Abstract

Abstract

Quantitative disease severity scoring in medical imaging is costly, time-consuming, and subject to inter-reader variability. At the same time, clinical archives contain far more longitudinal imaging data than expert-annotated severity scores. Existing self-supervised methods typically ignore this chronological structure. We introduce ChronoCon, a contrastive learning approach that replaces label-based ranking losses with rankings derived solely from the visitation order of a patient's longitudinal scans. Under the clinically plausible assumption of monotonic progression in irreversible diseases, the method learns disease-relevant representations without using any expert labels. This generalizes the idea of Rank-N-Contrast from label distances to temporal ordering. Evaluated on rheumatoid arthritis radiographs for severity assessment, the learned representations substantially improve label efficiency. In low-label settings, ChronoCon significantly outperforms a fully supervised baseline initialized from ImageNet weights. In a few-shot learning experiment, fine-tuning ChronoCon on expert scores from only five patients yields an intraclass correlation coefficient of 86% for severity score prediction. These results demonstrate the potential of chronological contrastive learning to exploit routinely available imaging metadata to reduce annotation requirements in the irreversible disease domain. Code is available at https://github.com/cirmuw/ChronoCon.
