HUBEI AGRICULTURAL SCIENCES ›› 2026, Vol. 65 ›› Issue (2): 195-201.doi: 10.14088/j.cnki.issn0439-8114.2026.02.029

• Information Engineering •

3D indoor semantic scene completion method based on multi-channel difference fusion

WANG Chang-shuan1, LU Yun-he2, JIANG Jian-wu2   

  1. Guangxi Institute of Surveying Mapping and Geoinformation, Liuzhou 545006, Guangxi, China;
    2. College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, Guangxi, China
  • Received: 2025-09-15  Online: 2026-03-04  Published: 2026-03-04

Abstract: To address the loss of 3D perceptual information and insufficient semantic understanding caused by object occlusion and compact spatial structures in complex indoor environments, a multi-channel difference fusion network for semantic scene completion (MCDFNet) based on RGB-D input was proposed. A multi-channel difference fusion (MCDF) module was designed that, building on a unified RGB-D representation, extracted differential features among the RGB, Depth, and fused channels, effectively enhancing the modeling of geometric structure and semantic consistency in occluded regions. Experiments on the NYUCAD dataset showed that MCDFNet achieved an accuracy of 72.8%, a precision of 77.1%, and a mean Intersection over Union (mIoU) of 43.4% while keeping single-scene completion inference time at 1.9 s, outperforming mainstream models such as AICNet, DDRNet, and GRFNet. Ablation studies showed that introducing the MCDF module improved mIoU by 1.5 percentage points, confirming its critical role in completion accuracy. The model operated stably in highly occluded indoor environments, improving the completeness and practical value of 3D maps, and was suitable for a range of typical indoor application scenarios.
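The abstract does not give the MCDF module's equations, so the following is only an illustrative sketch of the differencing-then-fusion idea it describes: pairwise difference maps are computed among the RGB, Depth, and fused feature channels, stacked, and projected back to the original channel width. The function name `mcdf_fuse`, the choice of difference pairs, and the 1x1-convolution-style projection are all assumptions, not the paper's actual architecture.

```python
import numpy as np

def mcdf_fuse(f_rgb, f_depth, f_fused, w_proj):
    """Illustrative multi-channel difference fusion (assumed form, not the paper's).

    f_rgb, f_depth, f_fused : (C, H, W) feature maps from the three branches.
    w_proj : (C, 5*C) matrix standing in for a learned 1x1 convolution.
    """
    # Pairwise differential features between the channels
    d_rd = f_rgb - f_depth     # RGB vs. Depth
    d_rf = f_rgb - f_fused     # RGB vs. fused
    d_df = f_depth - f_fused   # Depth vs. fused
    # Stack fused and RGB features with the three difference maps: (5C, H, W)
    stacked = np.concatenate([f_fused, f_rgb, d_rd, d_rf, d_df], axis=0)
    c5, h, w = stacked.shape
    # Per-pixel linear projection back to C channels (a 1x1 "conv" as matmul)
    return (w_proj @ stacked.reshape(c5, h * w)).reshape(-1, h, w)

# Example with C = 4 channels on a 2x2 spatial grid
rng = np.random.default_rng(0)
f_rgb, f_depth, f_fused = (rng.standard_normal((4, 2, 2)) for _ in range(3))
w_proj = rng.standard_normal((4, 20)) * 0.1
out = mcdf_fuse(f_rgb, f_depth, f_fused, w_proj)
```

In a trained network `w_proj` would be a learned layer; the point here is only that occlusion cues can be exposed explicitly as channel differences before fusion.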

Key words: multi-channel difference fusion, 3D, indoor, semantic scene completion, RGB, Depth
