HUBEI AGRICULTURAL SCIENCES ›› 2026, Vol. 65 ›› Issue (2): 202-208. doi: 10.14088/j.cnki.issn0439-8114.2026.02.030

• Information Engineering •

A non-contact estimation method for feed residue based on dual-modal MobileViTv2

CAI Xiao-jin1, BAI Tao1,2,3, LI Xiang1, QIAO Rui-qiang1   

  1. College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China;
  2. Engineering Research Center of Intelligent Agriculture, MOE, Urumqi 830052, China;
  3. Xinjiang Engineering Research Center for Agricultural Informatization, Urumqi 830052, China
  • Received: 2025-09-22 Online: 2026-03-04 Published: 2026-03-04

Abstract: Traditional feed residue detection methods rely on contact sensors, are costly, and require modification of feeding troughs. To address these problems, a lightweight convolutional fusion regression model based on dual-modal MobileViTv2 (dual-modal MobileViTv2 + CMFIM + SE) was proposed to achieve non-contact, high-precision automatic estimation of feed residue. Taking RGB images and depth images as input, the model extracted multi-scale features from each modality through the dual-modal MobileViTv2, and a cross-modal multi-scale feature interaction module (CMFIM) was introduced at four levels to achieve spatial-channel dual interaction between RGB and depth features. A squeeze-and-excitation (SE) module was employed to adaptively calibrate channel weights and enhance high-level semantic representation capability. The prediction was output through a multilayer perceptron regression head. On the self-built dataset, the mean absolute error (MAE) and root mean square error (RMSE) of the dual-modal MobileViTv2 + CMFIM + SE model were 98.24 g and 140.21 g, respectively, which were 21.65% and 16.73% lower than those of the dual-modal MobileViTv2 model without the CMFIM and SE modules. The model had only 9.9×10^6 parameters. It combined high accuracy, strong robustness, and a lightweight design, providing a feasible technical pathway for precision feeding in intelligent livestock farming.
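
The abstract reports two error metrics (MAE and RMSE, in grams) and the relative reduction against a baseline model. The sketch below shows how these quantities are conventionally computed. It is not the authors' evaluation code. The function names `mae`, `rmse`, `relative_reduction`, and `implied_baseline` are hypothetical, as are the sample weights in the usage section. Only the figures 98.24 g and 21.65% come from the abstract.

```python
import math


def _check(y_true, y_pred):
    # Both sequences must be non-empty and of equal length.
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the inputs (grams here)."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the inputs (grams here)."""
    _check(y_true, y_pred)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


def implied_baseline(improved, reduction_pct):
    """Baseline error implied by an improved error and its reported reduction."""
    return improved / (1.0 - reduction_pct / 100.0)


if __name__ == "__main__":
    # Hypothetical residue weights in grams, for illustration only.
    true_g = [500.0, 1200.0, 800.0]
    pred_g = [450.0, 1300.0, 800.0]
    print(mae(true_g, pred_g), rmse(true_g, pred_g))

    # Baseline MAE implied by the reported 98.24 g and 21.65% reduction.
    print(implied_baseline(98.24, 21.65))
```

Under the usual definition of relative reduction, the reported figures imply a baseline MAE of roughly 125 g for the model without CMFIM and SE. This is an arithmetic consequence of the abstract's numbers, not a value stated in the source.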

Key words: dual-modal MobileViTv2, feed residue, non-contact estimation, RGB images, depth images

CLC Number: