SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs


CVPR 2025

1Peking University, 2Pengcheng Laboratory, 3University of Nottingham

TL;DR: We propose SPC-GS, an early and novel attempt at jointly reconstructing and understanding open-world indoor scenes from sparse-input images.

Abstract

Indoor open-world free-view synthesis approaches based on 3D Gaussian Splatting perform well with dense input images. However, they perform poorly with sparse inputs, primarily because the Gaussian points are sparsely distributed and view supervision is insufficient. To address these challenges, we propose SPC-GS, which leverages Scene-layout-based Gaussian Initialization (SGI) and Semantic-Prompt Consistency (SPC) Regularization for open-world free-view synthesis with sparse inputs. Specifically, SGI provides a dense, scene-layout-based Gaussian distribution by using view-changed images generated by a video generation model together with view-constrained densification of Gaussian points. Additionally, SPC mitigates limited view supervision through semantic-prompt-based consistency constraints built with SAM2. It uses the semantics available from training views as instructive prompts to optimize visually overlapping regions in novel views under 2D and 3D consistency constraints. Extensive experiments demonstrate the superior performance of SPC-GS on the Replica and ScanNet benchmarks. Notably, SPC-GS achieves a 3.06 dB gain in PSNR for reconstruction quality and a 7.3% improvement in mIoU for open-world semantic segmentation.
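For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of their standard definitions in NumPy. It is not the evaluation code behind the reported numbers: the function names, the `[0, 1]` image range, and the choice to skip classes absent from both maps are assumptions made for illustration.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))


def mean_iou(pred_labels, gt_labels, num_classes):
    """Mean intersection-over-union across classes.

    Assumption: classes that appear in neither map are skipped
    rather than counted as IoU 0 or 1.
    """
    pred_labels = np.asarray(pred_labels)
    gt_labels = np.asarray(gt_labels)
    ious = []
    for c in range(num_classes):
        pred_c = pred_labels == c
        gt_c = gt_labels == c
        union = np.logical_or(pred_c, gt_c).sum()
        if union == 0:
            continue  # class absent from both prediction and ground truth
        inter = np.logical_and(pred_c, gt_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious)) if ious else float("nan")
```

Because PSNR is logarithmic in the mean squared error, a fixed gain in dB corresponds to a fixed multiplicative reduction in MSE, independent of the starting quality.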

Framework Overview

Framework of SPC-GS. (a) We first generate images adjacent to each training image using a video generation model. These generated images, combined with the original training images, yield a denser set of initial SfM points. The points are then optimized into a scene-layout Gaussian distribution via Gaussian densification and outlier removal. (b) Building on the scene-layout Gaussian initialization, SPC uses semantic information from training views as instructive semantic prompts to optimize adjacent rendered pseudo views, establishing semantic consistency constraints that improve semantic understanding of 3D scenes from sparse inputs.
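The page does not give the form of the consistency constraints, so the sketch below shows only one plausible 2D variant: a cross-entropy between the semantics rendered for a pseudo view and labels propagated from training-view prompts, restricted to the overlapping region. The function name, the array layout, and the use of cross-entropy are all assumptions, and the prompt-based propagation step itself (for example with SAM2) is treated as given input rather than implemented here.

```python
import numpy as np


def masked_semantic_consistency(logits, prompt_labels, overlap_mask):
    """Hypothetical 2D consistency loss for one rendered pseudo view.

    logits:        (H, W, C) rendered per-pixel class scores (assumed layout)
    prompt_labels: (H, W) integer labels propagated from training-view prompts
    overlap_mask:  (H, W) bool, True where the pseudo view overlaps a training view
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(prompt_labels, dtype=np.int64)
    mask = np.asarray(overlap_mask, dtype=bool)
    if not mask.any():
        return 0.0  # no overlapping pixels, so no supervision from prompts

    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    # Log-probability assigned to the prompt-derived label at each pixel.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]

    # Average negative log-likelihood over the overlapping region only.
    return float(-picked[mask].mean())
```

Restricting the average to `overlap_mask` mirrors the idea that prompts from a training view can only supervise the parts of a pseudo view that the training view also sees; pixels outside that region contribute nothing to this term.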

Visual Comparisons

Side-by-side comparison pairs: Ours vs. Feature 3DGS; Ours vs. Gaussian Grouping; Ours vs. FSGS.

Visual reconstruction comparison with Gaussian Grouping.

Visual segmentation comparison with Gaussian Grouping.

BibTeX

@inproceedings{liao2025spcgs,
      title={SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs},
      author={Guibiao Liao and Qing Li and Zhenyu Bao and Guoping Qiu and Kanglin Liu},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      year={2025},
}