3D indoor semantic scene completion method based on multi-channel difference fusion

Hubei Agricultural Sciences, 2026, Vol. 65, Issue (2): 195-201. doi: 10.14088/j.cnki.issn0439-8114.2026.02.029

WANG Chang-shuan¹, LU Yun-he², JIANG Jian-wu²

1. Guangxi Institute of Surveying Mapping and Geoinformation, Liuzhou 545006, Guangxi, China;
2. College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, Guangxi, China

Received: 2025-09-15 Published: 2026-03-04 Online: 2026-03-04
About the author: WANG Chang-shuan (1980-), male, from Shiyan, Hubei; professor-level senior engineer, mainly engaged in applied research on surveying, mapping, and geographic information. (Tel) 18077225183; (E-mail) 9798727@qq.com.
Funding:
Guangxi Natural Science Foundation (2025GXNSFBA069341; 2023GXNSFBA026325); Central Government Guided Local Science and Technology Development Fund Project (2022SRZ0101)

Abstract

Abstract: Object occlusion and compact spatial layouts in complex indoor environments lead to missing 3D perceptual information and insufficient semantic understanding. To address this, a multi-channel difference fusion network for semantic scene completion (MCDFNet) based on RGB-D input is proposed. The model introduces a multi-channel difference fusion (MCDF) module that, on top of a unified RGB-D representation, extracts difference features among the RGB, Depth, and fused channels to strengthen modeling of geometric structure and semantic consistency in occluded regions. In experiments on the NYUCAD dataset, MCDFNet achieved an accuracy of 72.8%, a precision of 77.1%, and a mean Intersection over Union (mIoU) of 43.4% with a single-scene completion inference time of 1.9 s, outperforming mainstream models such as AICNet, DDRNet, and GRFNet. Ablation experiments showed that adding the MCDF module raised mIoU by 1.5 percentage points, indicating that the module contributes substantially to completion accuracy. The model operates stably in heavily occluded indoor environments, improves the completeness and practical value of 3D maps, and is suitable for a range of typical indoor application scenarios.

Key words: multi-channel difference fusion, 3D, indoor, semantic scene completion, RGB, Depth
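
The precision and mIoU figures reported in the abstract are standard label-overlap metrics. As a minimal sketch of how such metrics are commonly computed over flattened voxel labels, the following may help; it is not the authors' evaluation code. The function names, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, and the abstract does not state which voxels or classes the paper's protocol evaluates.

```python
from typing import List, Optional, Sequence


def per_class_iou(pred: Sequence[int], truth: Sequence[int],
                  num_classes: int) -> List[Optional[float]]:
    """Intersection over Union for each class over flattened voxel labels.

    Returns None for a class that appears in neither pred nor truth
    (an assumption of this sketch, so such classes do not skew the mean).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if p == t:
            # Correct voxel: counts once in both intersection and union.
            inter[p] += 1
            union[p] += 1
        else:
            # Wrong voxel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    return [inter[c] / union[c] if union[c] else None
            for c in range(num_classes)]


def mean_iou(pred: Sequence[int], truth: Sequence[int],
             num_classes: int) -> float:
    """Mean of the per-class IoU values, ignoring absent classes."""
    ious = [v for v in per_class_iou(pred, truth, num_classes) if v is not None]
    return sum(ious) / len(ious) if ious else 0.0


def precision(pred: Sequence[int], truth: Sequence[int], positive: int) -> float:
    """Precision for one class: TP / (TP + FP)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == positive and t == positive)
    fp = sum(1 for p, t in zip(pred, truth) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy example with three hypothetical classes (0, 1, 2) on six voxels.
truth_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred_labels, truth_labels, 3))
print(precision(pred_labels, truth_labels, positive=1))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. A gain of 1.5 percentage points in mIoU, as reported for the ablation, corresponds to an increase of 0.015 in this mean.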

CLC number: TP391

WANG Chang-shuan, LU Yun-he, JIANG Jian-wu. 3D indoor semantic scene completion method based on multi-channel difference fusion[J]. Hubei Agricultural Sciences, 2026, 65(2): 195-201.

