Hubei Agricultural Sciences, 2026, Vol. 65, Issue (5): 196-204. doi: 10.14088/j.cnki.issn0439-8114.2026.05.030

• Agricultural Engineering •

Research on the whole life cycle target detection of pitaya in natural environment based on improved YOLOv5s

ZOU Wei, LI Li-jie, YUE Yan-bin, HAN Wei, WANG Hu, ZHAO Ze-ying

  1. Guizhou Agricultural Science and Technology Information Institute, Guiyang 550006, China
  • Received: 2026-02-06 Published: 2026-05-25 Online: 2026-05-26
  • Corresponding author: ZHAO Ze-ying (1975-), male, research fellow, master's degree, mainly engaged in research on agricultural information and artificial intelligence, (E-mail) 605538133@qq.com.
  • About the author: ZOU Wei (1997-), female, trainee researcher, master's degree, mainly engaged in research on image processing and plant phenotyping, (E-mail) 171465192@qq.com.
  • Funding:
    Special Project for Innovation Capacity Building of Scientific Research Institutions (Qiankehe Fuqi [2021] No. 15); Guizhou Provincial Science and Technology Program Project (Qiankehe Zhicheng [2024] General 149)

Abstract: Traditional orchard management relies heavily on manual labor, which makes it difficult to identify pitaya at different growth stages quickly and accurately across large orchards. To address this problem, the study divided the whole life cycle of pitaya into seven stages based on its growth characteristics. First, the dataset was augmented with the generative adversarial network WGAN-GP, using a conditional discrimination mechanism to strengthen rare samples and improve dataset balance. Second, taking the YOLOv5s object detection network as the basis, its backbone network was replaced with the lightweight network MobileViT to raise inference speed while maintaining detection accuracy. The model with the improved dataset and network structure achieved a precision of 85.7% and a recall of 77.6% on the test set, which were 2.5 and 0.9 percentage points higher than the original model, respectively. The average detection time per frame was 18.64 ms, 3.87 ms shorter than that of the original model. The model had 5.87×10^6 parameters, a mAP50 of 82.5%, and a mAP50-95 of 62.4%. The model can detect pitaya in natural environments in real time and provides a feasible visual detection solution for subsequent growth-status monitoring and automated operations.

Key words: pitaya, whole life cycle, YOLOv5s, WGAN-GP, MobileViT, object detection
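
The abstract reports the improved model's figures together with their changes relative to the original YOLOv5s, so the baseline values can be recovered by simple arithmetic. The sketch below is only an illustration of that arithmetic: the variable and function names are hypothetical, the frames-per-second estimate assumes frames are processed one at a time with no batching or I/O overhead, and the F1 score is a derived quantity that the abstract does not report.

```python
# Figures reported in the abstract for the improved model (test set).
improved = {"precision": 85.7, "recall": 77.6, "latency_ms": 18.64}

# Reported changes relative to the original YOLOv5s:
# precision and recall in percentage points, latency in milliseconds.
delta = {"precision": 2.5, "recall": 0.9, "latency_ms": -3.87}

# Baseline (original model) values implied by the reported changes.
baseline = {key: improved[key] - delta[key] for key in improved}


def fps(latency_ms: float) -> float:
    """Frames per second, assuming one frame at a time and no other overhead."""
    return 1000.0 / latency_ms


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (derived, not reported)."""
    return 2.0 * precision * recall / (precision + recall)


for name, model in (("original", baseline), ("improved", improved)):
    print(
        f"{name}: P={model['precision']:.1f}% R={model['recall']:.1f}% "
        f"F1={f1(model['precision'], model['recall']):.1f}% "
        f"latency={model['latency_ms']:.2f} ms "
        f"(~{fps(model['latency_ms']):.1f} frames/s)"
    )
```

Under these assumptions, a per-frame time below about 33 ms corresponds to more than 30 frames per second, which is consistent with the abstract's description of the detector as real-time.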

CLC number: