Hubei Agricultural Sciences, 2026, Vol. 65, Issue (2): 202-208. doi: 10.14088/j.cnki.issn0439-8114.2026.02.030

• Information Engineering •

A non-contact estimation method for feed residue based on dual-modal MobileViTv2

CAI Xiao-jin1, BAI Tao1,2,3, LI Xiang1, QIAO Rui-qiang1

  1. College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China;
    2. Engineering Research Center of Intelligent Agriculture, Ministry of Education, Urumqi 830052, China;
    3. Xinjiang Engineering Research Center for Agricultural Informatization, Urumqi 830052, China
  • Received: 2025-09-22; Published: 2026-03-04; Online: 2026-03-04
  • Corresponding author: BAI Tao (1979-), male, from Urumqi, Xinjiang; professor; research interests: agricultural big data and data mining; e-mail: bt@xjau.edu.cn
  • First author: CAI Xiao-jin (2000-), female, from Luohe, Henan; master's degree; research interest: computer vision; e-mail: cxj1558403@163.com
  • Funding:
    Major Science and Technology Special Project of Xinjiang Uygur Autonomous Region (2022A02011-4); Science and Technology Innovation 2030 Major Project of the Ministry of Science and Technology (2022ZD0115800); Basic Scientific Research Operating Expenses Project for Universities of Xinjiang Uygur Autonomous Region (XJEDU2022J009)



Abstract: To address the problems of traditional feed residue detection methods, namely their reliance on contact sensors, high cost, and the need to modify feeding troughs, a lightweight convolutional fusion regression model based on dual-modal MobileViTv2 (dual-modal MobileViTv2 + CMFIM + SE) was proposed to achieve non-contact, high-precision automatic estimation of feed residue. Taking RGB images and depth images as input, the model extracted multi-scale features from each modality through the dual-modal MobileViTv2 backbone and introduced a cross-modal multi-scale feature interaction module (CMFIM) at four levels to achieve joint spatial-channel interaction between RGB and depth features. An SE module was employed to adaptively recalibrate channel weights and enhance high-level semantic representation, and predictions were output through a multilayer perceptron regression head. On a self-built dataset, the mean absolute error (MAE) and root mean square error (RMSE) of the dual-modal MobileViTv2 + CMFIM + SE model were 98.24 g and 140.21 g, respectively, reductions of 21.65% and 16.73% compared with the dual-modal MobileViTv2 model without the CMFIM and SE modules, while the parameter count was only 9.9×10⁶. Combining high accuracy, strong robustness, and a lightweight design, the model provides a feasible technical path for precision feeding in smart livestock farming.
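The SE (squeeze-and-excitation) recalibration step mentioned in the abstract can be illustrated with a minimal NumPy sketch; this is not the authors' implementation, and the channel count, reduction ratio, and weight shapes below are illustrative assumptions only:

```python
import numpy as np

def se_recalibrate(feat, w1, b1, w2, b2):
    """Squeeze-and-Excitation channel recalibration on a (C, H, W) feature map.

    Squeeze: global average pooling reduces each channel to one scalar.
    Excitation: two fully connected layers (ReLU, then sigmoid) map those
    scalars to per-channel weights in (0, 1), which rescale the input.
    """
    z = feat.mean(axis=(1, 2))                    # squeeze: (C,)
    s = np.maximum(0.0, w1 @ z + b1)              # FC + ReLU: (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ s + b2)))      # FC + sigmoid: (C,)
    return feat * s[:, None, None]                # rescale each channel

# Toy example: C = 8 channels, reduction ratio r = 4 (sizes are assumed).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
w1, b1 = rng.standard_normal((2, 8)) * 0.1, np.zeros(2)
w2, b2 = rng.standard_normal((8, 2)) * 0.1, np.zeros(8)
out = se_recalibrate(feat, w1, b1, w2, b2)
print(out.shape)  # same shape as the input feature map
```

Because the sigmoid bounds each channel weight in (0, 1), the module can only attenuate channels relative to one another, which is how it emphasizes the more informative high-level semantic channels.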

Key words: dual-modal MobileViTv2, feed residue, non-contact estimation, RGB images, depth images

CLC number: