GTSR: Subsurface Scattering Awared 3D Gaussians for Translucent Surface Reconstruction
Reconstructing translucent objects from multi-view images is challenging because subsurface scattering causes standard surface reconstruction methods to fail. This paper proposes GTSR, a 3D Gaussian Splatting (3DGS) pipeline that separates surface geometry from scattering effects by using two Gaussian sets—surface Gaussians for geometry and interior Gaussians for scattering—blended via a Fresnel term. A physically-based rendering (PBR) module with deferred shading further constrains the geometry. The method achieves state-of-the-art surface reconstruction on the NeuralTO Syn dataset while training in approximately 2.5 hours, significantly faster than prior neural implicit approaches.
GTSR presents a pragmatic solution to the inherent conflict between rendering quality and geometric accuracy for translucent objects. The decomposition into surface and interior Gaussians, combined with Fresnel-weighted blending, effectively prevents the 'hollow shell' problem where standard 3DGS places high-opacity Gaussians inside objects to model scattering. The quantitative results on Chamfer distance are compelling, with the method outperforming NeuralTO on most scenes (e.g., reducing error from 13.2 to 3.3 on 'Nail'). However, the reliance on co-located lighting (flash photography setup) and the omission of refraction effects significantly limit applicability to general scenes with environmental illumination or transparent materials like glass.
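The Fresnel-weighted blending described above can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes Schlick's approximation for the Fresnel term, assumes the reflected fraction weights the surface color while the transmitted fraction weights the interior (scattering) color, and works on already-rendered per-pixel colors. The names `schlick_fresnel`, `fresnel_blend`, and the default `f0 = 0.04` (a common value for dielectrics) are hypothetical choices made for illustration.

```python
import math

def schlick_fresnel(cos_theta, f0=0.04):
    """Schlick's approximation to Fresnel reflectance (assumed stand-in)."""
    # Clamp so back-facing or slightly noisy inputs stay well-defined.
    c = min(max(cos_theta, 0.0), 1.0)
    return f0 + (1.0 - f0) * (1.0 - c) ** 5

def fresnel_blend(surface_rgb, interior_rgb, normal, view_dir, f0=0.04):
    """Blend a surface color and an interior color by a Fresnel weight.

    normal and view_dir are non-zero 3-vectors; view_dir points from the
    surface toward the camera. Neither needs to be normalized.
    """
    dot = sum(n * v for n, v in zip(normal, view_dir))
    norm = (math.sqrt(sum(n * n for n in normal))
            * math.sqrt(sum(v * v for v in view_dir)))
    f = schlick_fresnel(dot / norm, f0)
    # Assumption: reflected share -> surface, transmitted share -> interior.
    return tuple(f * s + (1.0 - f) * i
                 for s, i in zip(surface_rgb, interior_rgb))
```

Under these assumptions, a pixel viewed head-on is dominated by the interior color, while a pixel at a grazing angle is dominated by the surface color, which matches the intuition that contours are where surface reflection is strongest.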
The dual-Gaussian representation is well-motivated and addresses a genuine failure mode of prior 3DGS methods. The ablation study rigorously validates each component: removing the Fresnel term degrades Chamfer distance from 0.0020 to 0.0075, while removing interior Gaussians degrades it to 0.0086. The PBR module provides necessary geometric constraints where multi-view photometric consistency fails for translucent materials. The deferred rendering approach is computationally efficient and stable. The evaluation on NeuralTO Syn is comprehensive, covering six objects with varying material properties.
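For readers unfamiliar with the metric, a brute-force sketch of a symmetric Chamfer distance between two point sets follows. The convention used here (unsquared Euclidean distances, averaged per direction, then averaged across both directions) is an assumption; the paper's exact convention and scaling are not stated in this review, and absolute values are only comparable under the same convention.

```python
import math

def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Brute force, O(len(a) * len(b)); real evaluations typically use a
    spatial index such as a k-d tree instead.
    """
    def one_sided(src, dst):
        # Mean distance from each source point to its nearest neighbour.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (one_sided(points_a, points_b) + one_sided(points_b, points_a))

# Usage: points sampled from a reconstructed mesh vs. the ground truth.
reconstructed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
error = chamfer_distance(reconstructed, ground_truth)
```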
The method assumes co-located lighting (flash at camera position), which the dataset strictly enforces but which is rarely available in casual capture scenarios. This is a significant limitation compared to general surface reconstruction methods. The scattering model is simplified: Equation (14) uses a single hyperparameter $\gamma = 0.3$ to blend diffuse and scattering terms, and the paper admits subsurface diffusion is simplified. Real-world validation is minimal (only 3 scenes, no quantitative metrics), raising concerns about generalization to complex real-world scattering effects. The paper claims 'real-time rendering' but does not quantify it with FPS metrics in the main text.
The comparison to NeuralTO is fair as both methods operate under the same co-located lighting assumption and use the same synthetic dataset. GTSR improves upon NeuralTO's average Chamfer distance significantly (Table 1). However, comparisons to general 3DGS methods like PGSR are somewhat misleading because PGSR is not designed for translucent materials and fails catastrophically on this dataset (e.g., Chamfer distance 23.4 vs 3.3 on 'Nail'); this is an expected outcome rather than an informative failure. The ablation study is thorough, isolating the contributions of the Fresnel term, interior Gaussians, and PBR module. Novel view synthesis results (Table 2) show PSNR of 45.28, outperforming baselines, though the paper notes this uses $C_{\mathrm{SH}}$ rendering, which differs from the PBR path used for geometry optimization.
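To put the reported PSNR in context, the sketch below computes PSNR from mean squared error and inverts the relation. It assumes pixel values in $[0, 1]$ (so the peak value is 1.0) and flattened images of equal length; the paper's exact evaluation protocol is not described in this review, and the function names are hypothetical.

```python
import math

def psnr(image_a, image_b, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flattened images."""
    mse = sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

def mse_from_psnr(psnr_db, max_value=1.0):
    """Invert the PSNR definition to recover the mean squared error."""
    return max_value ** 2 / (10.0 ** (psnr_db / 10.0))

# Usage: a uniform per-pixel error of 0.1 gives roughly 20 dB.
rendered = [0.1, 0.1, 0.1, 0.1]
reference = [0.0, 0.0, 0.0, 0.0]
score = psnr(rendered, reference)
```

Under the $[0, 1]$ assumption, a PSNR of 45.28 dB corresponds to a mean squared error on the order of $3 \times 10^{-5}$, which is why small differences in the rendering path can matter when comparing against baselines.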
The paper provides detailed hyperparameters ($\lambda_1 = 0.2$, $\lambda_3, \lambda_4, \lambda_5 = 0.01$, etc.) and describes the two-stage training strategy. The training time (~2.5 hours) and memory requirements (<8 GB) are specified for an NVIDIA RTX A6000, providing a baseline for resource planning. However, there is no mention of code availability or dataset release, which significantly hinders reproducibility. The reliance on random point cloud initialization (due to COLMAP failing on textureless translucent objects) is a critical detail that must be replicated exactly to achieve similar results, yet the specific initialization protocol is not fully detailed.
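To make the reproducibility concern concrete, the following sketch shows what a fully specified random initialization might look like. It is purely illustrative: the uniform distribution, the axis-aligned bounding box, the point count, and the seed are all assumptions, since the paper's actual protocol is not detailed. Each of these is a parameter that a reproduction would need the authors to state.

```python
import random

def random_point_cloud(num_points, bbox_min, bbox_max, seed=0):
    """Sample points uniformly inside an axis-aligned bounding box.

    A fixed seed makes the initialization repeatable across runs.
    All parameters here are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)  # local generator; global state is untouched
    return [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(bbox_min, bbox_max))
        for _ in range(num_points)
    ]

# Usage: 1000 points in a unit cube centred at the origin.
points = random_point_cloud(1000, (-0.5, -0.5, -0.5), (0.5, 0.5, 0.5), seed=42)
```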
Reconstructing translucent objects from multi-view images is a difficult problem. Previously, researchers have used differentiable path tracing and the neural implicit field, which require relatively large computational costs. Recently, many works have achieved good reconstruction results for opaque objects based on a 3DGS pipeline with much higher efficiency. However, such methods have difficulty dealing with translucent objects, because they do not consider the optical properties of translucent objects. In this paper, we propose a novel 3DGS-based pipeline (GTSR) to reconstruct the surface geometry of translucent objects. GTSR combines two sets of Gaussians, surface and interior Gaussians, which are used to model the surface and scattering color when lights pass translucent objects. To render the appearance of translucent objects, we introduce a method that uses the Fresnel term to blend two sets of Gaussians. Furthermore, to improve the reconstructed details of non-contour areas, we introduce the Disney BSDF model with deferred rendering to enhance constraints of the normal and depth. Experimental results demonstrate that our method outperforms baseline reconstruction methods on the NeuralTO Syn dataset while showing great real-time rendering performance. We also extend the dataset with new translucent objects of varying material properties and demonstrate our method can adapt to different translucent materials.