GTSR: Subsurface Scattering Awared 3D Gaussians for Translucent Surface Reconstruction

cs.CV Youwen Yuan, Xi Zhao · Mar 23, 2026

Why it matters

The method achieves state-of-the-art surface reconstruction on the NeuralTO Syn dataset while training in approximately 2.5 hours, significantly faster than prior neural implicit approaches.

AI Review

Plain-language introduction

Reconstructing translucent objects from multi-view images is challenging because subsurface scattering causes standard surface reconstruction methods to fail. This paper proposes GTSR, a 3D Gaussian Splatting (3DGS) pipeline that separates surface geometry from scattering effects by using two Gaussian sets—surface Gaussians for geometry and interior Gaussians for scattering—blended via a Fresnel term. A physically-based rendering (PBR) module with deferred shading further constrains the geometry. The method achieves state-of-the-art surface reconstruction on the NeuralTO Syn dataset while training in approximately 2.5 hours, significantly faster than prior neural implicit approaches.

Critical review

Verdict

Bottom line

GTSR presents a pragmatic solution to the inherent conflict between rendering quality and geometric accuracy for translucent objects. The decomposition into surface and interior Gaussians, combined with Fresnel-weighted blending, effectively prevents the 'hollow shell' problem where standard 3DGS places high-opacity Gaussians inside objects to model scattering. The quantitative results on Chamfer distance are compelling, with the method outperforming NeuralTO on most scenes (e.g., reducing error from 13.2 to 3.3 on 'Nail'). However, the reliance on co-located lighting (flash photography setup) and the omission of refraction effects significantly limit applicability to general scenes with environmental illumination or transparent materials like glass.

“First, when reconstructing translucent objects, the optimization objective for achieving higher-quality rendering may conflict with the requirements of surface reconstruction.”

paper · Section 1

“Our model also simplifies light transport in translucent objects, omitting refraction and back-side illumination, which limits performance on transparent objects like glass.”

paper · Section 6
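The Fresnel-weighted blending described in the bottom line can be illustrated with a minimal sketch. It assumes Schlick's approximation for the Fresnel term and a simple linear blend between a surface color and an interior (scattering) color; the function names, the default `f0 = 0.04`, and the blend form are illustrative assumptions, not the paper's exact formulation.

```python
def schlick_fresnel(cos_theta: float, f0: float = 0.04) -> float:
    """Schlick's approximation to Fresnel reflectance.

    f0 is the reflectance at normal incidence. 0.04 is a common default
    for dielectrics and is an assumption here, not a value from the paper.
    """
    c = min(max(cos_theta, 0.0), 1.0)  # clamp to [0, 1]
    return f0 + (1.0 - f0) * (1.0 - c) ** 5


def blend_colors(surface_rgb, interior_rgb, normal, view_dir, f0=0.04):
    """Blend surface and interior colors for one pixel (hypothetical form).

    normal and view_dir are assumed to be unit vectors. The reflected
    fraction F weights the surface color; the transmitted fraction
    (1 - F) weights the interior (scattering) color.
    """
    cos_theta = sum(n * v for n, v in zip(normal, view_dir))
    f = schlick_fresnel(cos_theta, f0)
    return tuple(f * s + (1.0 - f) * i for s, i in zip(surface_rgb, interior_rgb))


if __name__ == "__main__":
    # Head-on view: most of the weight goes to the interior color.
    print(blend_colors((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0, 0, 1), (0, 0, 1)))
```

Under this sketch the surface color dominates at grazing angles and the interior color dominates at head-on views, which is the usual behavior of a Fresnel weight.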

What holds up

The dual-Gaussian representation is well-motivated and addresses a genuine failure mode of prior 3DGS methods. The ablation study rigorously validates each component: removing the Fresnel term degrades Chamfer distance from 0.0020 to 0.0075, while removing interior Gaussians degrades it to 0.0086. The PBR module provides necessary geometric constraints where multi-view photometric consistency fails for translucent materials. The deferred rendering approach is computationally efficient and stable. The evaluation on NeuralTO Syn is comprehensive, covering six objects with varying material properties.

“w/o Fresnel: 0.0075 ... w/o G_{in}: 0.0086 ... full: 0.0020”

paper · Table 3

“In contrast, 3DGS assigns low opacity (often below 0.2) to most surface Gaussians while placing high-opacity Gaussians inside the object.”

paper · Section 4.1
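To put the ablation numbers quoted from Table 3 in relative terms, the short calculation below divides each ablated Chamfer distance by the full model's value. The three figures come from the quoted table; the dictionary keys and the helper name `degradation_factors` are illustrative.

```python
# Chamfer distances as quoted from Table 3 of the paper.
ablation = {"full": 0.0020, "w/o Fresnel": 0.0075, "w/o G_in": 0.0086}


def degradation_factors(results, baseline="full"):
    """Return each variant's Chamfer distance as a multiple of the baseline."""
    base = results[baseline]
    return {name: value / base for name, value in results.items() if name != baseline}


factors = degradation_factors(ablation)
for name, factor in factors.items():
    print(f"{name}: {factor:.2f}x the full model's Chamfer distance")
```

Removing the Fresnel term therefore raises the error by a factor of roughly 3.75, and removing the interior Gaussians by a factor of roughly 4.3.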

Main concerns

The method assumes co-located lighting (a flash at the camera position), which the dataset strictly enforces but casual capture rarely provides. This is a significant limitation compared to general surface reconstruction methods. The scattering model is also simplified: Equation (14) uses a single hyperparameter $\gamma = 0.3$ to blend diffuse and scattering terms, and the paper admits that subsurface diffusion is simplified. Real-world validation is minimal (only 3 scenes, no quantitative metrics), raising concerns about generalization to complex real-world scattering effects. The paper claims 'real-time rendering' but does not quantify it with FPS metrics in the main text.

“we follow NeuralTO(Cai et al., 2024) to adopt a co-located lighting assumption, which assumes that there is only a point light located at the camera position and no ambient lighting.”

paper · Section 4.3

“Subsurface diffusion is too complex and difficult to implement based on 3DGS rendering pipeline, so we only use a simplified approach.”

paper · Section 4.3
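The practical consequence of the co-located lighting assumption is that the light direction equals the view direction at every surface point, so the cosine term depends only on the angle between the normal and the camera ray. The sketch below shows this for a single point, with inverse-square falloff and no ambient term. The convex combination of diffuse and scattering colors weighted by `gamma` is an assumption made for illustration; this review does not reproduce Equation (14), and the paper's actual form may differ.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def colocated_shade(point, normal, camera_pos, diffuse, scatter,
                    gamma=0.3, intensity=1.0):
    """Shade one surface point lit only by a point light at the camera.

    ASSUMPTION: diffuse and scatter are mixed as (1 - gamma) * diffuse
    + gamma * scatter. This blend form is hypothetical.
    """
    to_cam = tuple(c - p for c, p in zip(camera_pos, point))
    dist2 = sum(c * c for c in to_cam)
    view_dir = normalize(to_cam)
    light_dir = view_dir  # co-located: the light sits at the camera
    cos_term = max(0.0, sum(n * l for n, l in zip(normalize(normal), light_dir)))
    irradiance = intensity * cos_term / dist2  # inverse-square falloff, no ambient
    return tuple(irradiance * ((1.0 - gamma) * d + gamma * s)
                 for d, s in zip(diffuse, scatter))


if __name__ == "__main__":
    print(colocated_shade((0, 0, 0), (0, 0, 1), (0, 0, 2),
                          diffuse=(1.0, 1.0, 1.0), scatter=(0.0, 0.0, 0.0)))
```

Because the light moves with the camera, any capture with fixed or environmental lighting violates this model, which is why the assumption restricts the method to flash-style acquisition.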

Evidence and comparison

The comparison to NeuralTO is fair, since both methods operate under the same co-located lighting assumption and use the same synthetic dataset. GTSR achieves a significantly lower average Chamfer distance than NeuralTO (Table 1). However, comparisons to general 3DGS methods like PGSR are somewhat misleading: PGSR is not designed for translucent materials, so its catastrophic failure on this dataset (e.g., Chamfer distance 23.4 vs 3.3 on 'Nail') is expected and says little about GTSR's relative strength. The ablation study is thorough, isolating the contributions of the Fresnel term, interior Gaussians, and PBR module. Novel view synthesis results (Table 2) show a PSNR of 45.28, outperforming baselines, though the paper notes that this figure uses $C_{\mathrm{SH}}$ rendering, which differs from the PBR path used for geometry optimization.

“Our method performs best on most scenes and has the lowest average Chamfer distance.”

paper · Table 1

“We use $C_{\mathrm{SH}}$ as our rendering results in the evaluation.”

paper · Section 5.2
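For readers unfamiliar with the two metrics cited above, the following sketch shows one common definition of each. The Chamfer variant used here (sum of the two mean nearest-neighbor distances, unsquared) is an assumption; the review does not state which variant or scale the paper uses, so absolute values from this code are not comparable to Table 1.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point sets.

    Brute-force O(len(a) * len(b)); fine for small illustrative inputs.
    """
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


def psnr(img, ref, max_val=1.0):
    """Peak signal-to-noise ratio in dB for two flat sequences of pixel values."""
    mse = sum((x - y) ** 2 for x, y in zip(img, ref)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)


if __name__ == "__main__":
    recon = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    truth = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0)]
    print(chamfer_distance(recon, truth))
    print(psnr([0.5, 0.5], [0.5, 0.6]))
```

Chamfer distance measures geometric error on the reconstructed surface, while PSNR measures image fidelity, so a method can score well on one and poorly on the other.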

Reproducibility

The paper provides detailed hyperparameters ($\lambda_1 = 0.2$, $\lambda_3, \lambda_4, \lambda_5 = 0.01$, etc.) and describes the two-stage training strategy. The training time (~2.5 hours) and memory requirements (<8GB) are specified for an Nvidia RTX A6000, providing a baseline for resource planning. However, there is no mention of code availability or dataset release, which significantly hinders reproducibility. The reliance on random point cloud initialization (due to COLMAP failing on textureless translucent objects) is a critical detail that must be exactly replicated to achieve similar results, yet the specific initialization protocol is not fully detailed.

“COLMAP fails to reconstruct the initial points for 3DGS because of the lack of surface textures, so we adopted random point clouds to initialize the scene.”

paper · Section 5

“The training takes approximately 2.5 hours and consumes less than 8 GB of video memory.”

paper · Section 5
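Since the paper's initialization protocol is not fully specified, anyone attempting a reproduction has to choose one. A minimal sketch of a seeded, uniform random point cloud inside an axis-aligned bounding box is shown below; the point count, bounds, and seed are hypothetical choices, not values from the paper.

```python
import random


def random_point_cloud(num_points, bbox_min, bbox_max, seed=0):
    """Sample points uniformly inside an axis-aligned bounding box.

    A fixed seed makes the initialization repeatable across runs.
    """
    rng = random.Random(seed)
    return [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(bbox_min, bbox_max))
        for _ in range(num_points)
    ]


if __name__ == "__main__":
    # Hypothetical settings: 1000 points in a unit cube centered at the origin.
    cloud = random_point_cloud(1000, (-0.5, -0.5, -0.5), (0.5, 0.5, 0.5), seed=42)
    print(len(cloud), cloud[0])
```

Recording the seed, the point count, and the bounding box alongside the reported hyperparameters would remove this source of run-to-run variation.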

Abstract

Reconstructing translucent objects from multi-view images is a difficult problem. Previously, researchers have used differentiable path tracing and the neural implicit field, which require relatively large computational costs. Recently, many works have achieved good reconstruction results for opaque objects based on a 3DGS pipeline with much higher efficiency. However, such methods have difficulty dealing with translucent objects, because they do not consider the optical properties of translucent objects. In this paper, we propose a novel 3DGS-based pipeline (GTSR) to reconstruct the surface geometry of translucent objects. GTSR combines two sets of Gaussians, surface and interior Gaussians, which are used to model the surface and scattering color when lights pass translucent objects. To render the appearance of translucent objects, we introduce a method that uses the Fresnel term to blend two sets of Gaussians. Furthermore, to improve the reconstructed details of non-contour areas, we introduce the Disney BSDF model with deferred rendering to enhance constraints of the normal and depth. Experimental results demonstrate that our method outperforms baseline reconstruction methods on the NeuralTO Syn dataset while showing great real-time rendering performance. We also extend the dataset with new translucent objects of varying material properties and demonstrate our method can adapt to different translucent materials.
