Hubei Agricultural Sciences ›› 2026, Vol. 65 ›› Issue (2): 195-201. doi: 10.14088/j.cnki.issn0439-8114.2026.02.029

• Information Engineering •

3D indoor semantic scene completion method based on multi-channel difference fusion

WANG Chang-shuan¹, LU Yun-he², JIANG Jian-wu²

  1. Guangxi Institute of Surveying Mapping and Geoinformation, Liuzhou 545006, Guangxi, China;
  2. College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, Guangxi, China
  • Received: 2025-09-15 Published: 2026-03-04 Online: 2026-03-04
  • About the author: WANG Chang-shuan (1980-), male, from Shiyan, Hubei; professor-level senior engineer, mainly engaged in applied research on surveying, mapping and geoinformation. (Tel) 18077225183, (E-mail) 9798727@qq.com.
  • Funding:
    Guangxi Natural Science Foundation (2025GXNSFBA069341; 2023GXNSFBA026325); Central Government Guidance Funds for Local Science and Technology Development (2022SRZ0101)

Abstract: Object occlusion and compact spatial layouts in complex indoor environments leave 3D perception incomplete and semantic understanding insufficient. To address this, a multi-channel difference fusion network for semantic scene completion (MCDFNet) based on RGB-D input is proposed. The model introduces a multi-channel difference fusion (MCDF) module that, starting from a unified RGB-D representation, extracts difference features among the RGB, Depth, and fused channels to strengthen modeling of the geometric structure and semantic consistency of occluded regions. In experiments on the NYUCAD dataset, MCDFNet achieved an accuracy of 72.8%, a precision of 77.1%, and a mean Intersection over Union (mIoU) of 43.4% with a single-scene completion inference time of 1.9 s, outperforming mainstream models such as AICNet, DDRNet, and GRFNet. Ablation experiments showed that adding the MCDF module raised mIoU by 1.5 percentage points, indicating that the module contributes substantially to completion accuracy. The model runs stably in highly occluded indoor environments, improves the completeness and practical value of 3D maps, and is suitable for a range of typical indoor application scenarios.

Key words: multi-channel difference fusion, 3D, indoor, semantic scene completion, RGB, Depth
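
For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how per-class precision and mIoU are conventionally computed from voxel labels. It is not the authors' evaluation code: the function names, the toy label lists, and the class numbering are hypothetical, and the abstract does not state which voxels or classes the paper's protocol includes.

```python
def confusion_counts(pred, truth, cls):
    """Count true positives, false positives and false negatives for one class."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p == cls and t == cls:
            tp += 1
        elif p == cls:
            fp += 1
        elif t == cls:
            fn += 1
    return tp, fp, fn


def precision(pred, truth, cls):
    """Precision = TP / (TP + FP); returns 0.0 if the class is never predicted."""
    tp, fp, _ = confusion_counts(pred, truth, cls)
    return tp / (tp + fp) if tp + fp else 0.0


def iou(pred, truth, cls):
    """Intersection over Union = TP / (TP + FP + FN) for one class."""
    tp, fp, fn = confusion_counts(pred, truth, cls)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def mean_iou(pred, truth, classes):
    """mIoU: unweighted average of the per-class IoU values."""
    return sum(iou(pred, truth, c) for c in classes) / len(classes)


# Hypothetical toy scene flattened to 8 voxels: 0 = empty, 1 = wall, 2 = chair.
truth = [0, 1, 1, 2, 2, 2, 0, 1]
pred = [0, 1, 2, 2, 2, 0, 0, 1]

print(precision(pred, truth, 1))                  # 1.0
print(iou(pred, truth, 2))                        # 0.5
print(round(mean_iou(pred, truth, [1, 2]), 3))    # 0.583
```

In this toy case the wall class has IoU 2/3 and the chair class 1/2, so the mean is 7/12. Under this definition, a 1.5 percentage point mIoU gain such as the one reported for the MCDF module means the class-averaged value rises by 0.015.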
