Suiren-1.0 Technical Report: A Family of Molecular Foundation Models
Suiren-1.0 introduces a family of molecular foundation models designed to bridge the gap between microscopic 3D quantum-mechanical conformations and macroscopic 2D molecular property prediction. The framework comprises Suiren-Base (a 1.8B-parameter SE(3)-equivariant GNN pre-trained on 70M DFT samples), Suiren-Dimer (continued pre-training on intermolecular interactions), and Suiren-ConfAvg (a lightweight 2D model distilled via a novel Conformation Compression Distillation diffusion framework). This work matters because it attempts to unify quantum-accurate representations with practical cheminformatics workflows where only SMILES or graph inputs are available.
The paper presents a technically sound and ambitious approach to molecular foundation modeling, achieving strong empirical results across 50+ tasks. The three-stage pipeline (microscopic pre-training → CCD distillation → macroscopic fine-tuning) is physically motivated and effectively addresses the modality gap between 3D conformations and 2D graphs. However, the magnitude of the claimed improvements must be weighed against the significantly larger model scale (1.8B parameters) and training data (70M samples) relative to many baselines, which were trained on a subset of the data. The work represents a valuable contribution to open molecular AI, though some claims about the novelty of specific architectural components warrant deeper scrutiny against concurrent work.
The Conformation Compression Distillation (CCD) framework is a compelling innovation that uses diffusion-based generation to distill 3D structural knowledge into 2D representations, enabling Suiren-ConfAvg to infer conformational ensembles from simple SMILES strings. The empirical evaluation is extensive, covering 43 property prediction tasks on the newly introduced MoleHB benchmark and 18 ADMET tasks from TDC, with the model achieving best or second-best performance on the vast majority. The commitment to open science is commendable: all model weights, training code, and benchmarks are publicly released. The incorporation of physical priors through EMPP (masked atom reconstruction) and EST (Equivariant Spherical Transformer) within an MoE architecture demonstrates thoughtful integration of geometric deep learning principles.
The comparison against baselines like EquiformerV2 and eSCN is confounded by scale: Suiren-Base uses 1.8B parameters trained on 70M samples, while competing architectures were trained on only 20M samples due to compute constraints (Section 3.3). This makes it difficult to attribute gains to architectural innovations rather than scale. Several performance claims obscure the full picture. For instance, although 41/43 SOTA results on MoleHB are reported, many improvements are marginal (e.g., 1-5%), and some properties show substantial degradation (e.g., -187.9% on Henry's law constant for gases, Section 5.2.1). The TDC ADMET results show SOTA on only 9/18 metrics, with the paper acknowledging 'negligible' gaps on others, which contradicts the broad 'consistent SOTA' narrative. Ablation studies isolating the contribution of specific components (MoE routing, basis-rotation, contrastive learning in Stage 2) are notably absent, making it difficult to assess which innovations drive performance versus standard scaling effects.
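To make the "marginal versus degraded" distinction concrete, the following is a minimal sketch of how per-task relative improvements could be tabulated. It assumes the percentages are relative MAE reductions against a baseline, which the paper may define differently; the function names, the 5% "marginal" threshold, and the example numbers are hypothetical and are not taken from the paper's tables.

```python
from statistics import median


def relative_improvement(baseline_mae: float, model_mae: float) -> float:
    """Percent reduction in MAE relative to the baseline (negative = worse)."""
    if baseline_mae <= 0:
        raise ValueError("baseline_mae must be positive")
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


def summarize(pairs, marginal_pct: float = 5.0) -> dict:
    """Summarize (baseline_mae, model_mae) pairs, one pair per task.

    marginal_pct is an assumed cut-off for calling a win 'marginal'.
    """
    gains = [relative_improvement(b, m) for b, m in pairs]
    return {
        "wins": sum(g > 0 for g in gains),
        "marginal_wins": sum(0 < g <= marginal_pct for g in gains),
        "regressions": sum(g < 0 for g in gains),
        "median_gain_pct": median(gains),
        "worst_gain_pct": min(gains),
    }


if __name__ == "__main__":
    # Hypothetical MAE pairs for illustration only.
    example = [(1.0, 0.97), (2.0, 1.0), (1.0, 2.879)]
    print(summarize(example))
```

Reporting the median and worst-case gain alongside the win count would let a reader see at a glance whether a headline such as "41/43" is driven by large or marginal improvements.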
The evidence broadly supports the claim that Suiren-1.0 advances molecular property prediction, particularly on energetically sensitive quantities (enthalpy, Gibbs energy) where the quantum-mechanical pre-training objective provides clear inductive bias. The scaffold-split evaluation (Appendix C) provides rigorous evidence of generalization, with Suiren-ConfAvg achieving best MAE on 31/38 properties under distribution shift. However, comparisons to Uni-Mol and MoleBERT on MoleHB are complicated by differences in model capacity and pre-training data volume. The TDC comparisons are fairer (fixed hyperparameters per task type), though the model underperforms on specific clinically critical endpoints like Caco2 and AqSol compared to Uni-QSAR. The paper honestly reports these limitations but frames them as trade-offs for reproducibility rather than methodological failures.
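For readers unfamiliar with scaffold splitting, the sketch below shows one common variant: molecules sharing a scaffold are kept together, and the largest scaffold groups are assigned to the training set first. This is an assumption about the general technique, not a description of the exact protocol in Appendix C. The scaffold keys are taken as precomputed strings (in practice they are often Bemis-Murcko scaffolds computed with a cheminformatics toolkit), and the split fractions are hypothetical defaults.

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac_train=0.8, frac_valid=0.1):
    """Split molecule indices so that no scaffold spans two subsets.

    scaffolds: one precomputed scaffold key (string) per molecule.
    Returns (train, valid, test) lists of molecule indices.
    """
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Largest groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= frac_train * n:
            train.extend(group)
        elif len(valid) + len(group) <= frac_valid * n:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


if __name__ == "__main__":
    # Hypothetical scaffold keys for ten molecules.
    keys = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
    print(scaffold_split(keys))
```

Because rare scaffolds end up in the validation and test sets, performance under this split reflects generalization to unseen chemotypes rather than interpolation within familiar ones.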
Reproducibility is strong in terms of artifact availability: all weights are hosted on HuggingFace, training code is on GitHub, and the MoleHB benchmark is publicly released. Table 2 provides detailed hyperparameters (AdamW, lr=4e-4, cosine schedule, 200 epochs) for downstream fine-tuning. However, critical details for full reproduction are sparse: exact preprocessing pipelines for the 70M pre-training dataset (beyond citing Qo2mol), the specific chemical scope of the 13.5M dimer dataset, and computational cost estimates (total GPU-hours for the three-stage pipeline) are omitted. The MoE routing implementation details, specifically whether load-balancing auxiliary losses were used, are not specified. While the CCD diffusion framework is described mathematically, the exact architecture of the lightweight dynamics network $\varphi_{\theta}$ (parameter count, layer dimensions) is not specified, impeding exact reconstruction of the distillation stage.
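As an illustration of the fine-tuning schedule reported in Table 2, the sketch below computes a per-epoch cosine learning rate from the stated base rate (4e-4) and epoch count (200). The absence of warmup, the per-epoch (rather than per-step) update, and the final learning rate of zero are assumptions, since the review does not state them; `cosine_lr` is a hypothetical helper, not a function from the released code.

```python
import math


def cosine_lr(epoch: int, total_epochs: int = 200,
              base_lr: float = 4e-4, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate for a given epoch.

    Assumes no warmup and decay to min_lr at total_epochs.
    """
    # Clamp so that out-of-range epochs map to the schedule endpoints.
    progress = min(max(epoch, 0), total_epochs) / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for epoch in (0, 50, 100, 150, 200):
        print(epoch, cosine_lr(epoch))
```

Under these assumptions the rate starts at 4e-4, passes through 2e-4 at epoch 100, and reaches zero at epoch 200. A reproduction would need to confirm the warmup and minimum-rate settings against the released training code.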
We introduce Suiren-1.0, a family of molecular foundation models for the accurate modeling of diverse organic systems. Suiren-1.0, comprising three specialized variants (Suiren-Base, Suiren-Dimer, and Suiren-ConfAvg), is integrated within an algorithmic framework that bridges the gap between 3D conformational geometry and 2D statistical ensemble spaces. We first pre-train Suiren-Base (1.8B parameters) on a 70M-sample Density Functional Theory dataset using spatial self-supervision and SE(3)-equivariant architectures, achieving robust performance in quantum property prediction. Suiren-Dimer extends this capability through continued pre-training on 13.5M intermolecular interaction samples. To enable efficient downstream application, we propose Conformation Compression Distillation (CCD), a diffusion-based framework that distills complex 3D structural representations into 2D conformation-averaged representations. This yields the lightweight Suiren-ConfAvg, which generates high-fidelity representations from SMILES or molecular graphs. Our extensive evaluations demonstrate that Suiren-1.0 establishes state-of-the-art results across a range of tasks. All models and benchmarks are open-sourced.