Hubei Agricultural Sciences ›› 2025, Vol. 64 ›› Issue (10): 195-200. doi: 10.14088/j.cnki.issn0439-8114.2025.10.030

• Information Engineering •

Tobacco leaf color classification based on factor analysis and AdaBoost algorithm

ZHANG Qian-zi, DENG Shao-wen, WANG Wen-hao, HE Qian, GUO Yan, GAO Yun-cai, LI Xiang-wei, TANG Xiao-yan, CHANG Yu-long, YANG Su, DU Qi-xia, LUO Xiao-zhi

  1. Hongta Tobacco (Group) Co., Ltd., Yuxi 653100, Yunnan, China
  • Received: 2025-03-17 Published: 2025-10-25 Online: 2025-11-14
  • Corresponding author: DENG Shao-wen (1991-), born in Yuxi, Yunnan; agronomist; mainly engaged in research on tobacco leaf quality control and management. E-mail: 422193434@qq.com
  • About the author: ZHANG Qian-zi (1994-), female, born in Yuxi, Yunnan; agronomist, master's degree; mainly engaged in research on tobacco leaf quality control and management. E-mail: 526461905@qq.com
  • Funding:
    Science and Technology Project of Hongta Tobacco (Group) Co., Ltd. (2023YL03)

Abstract: Representative tobacco leaf samples of three colors (orange-yellow, lemon-yellow, and red-brown) were collected and preprocessed. Factor analysis (FA) was used to extract color-related features, reducing the data dimensionality while retaining the key information. An AdaBoost classifier was built on the extracted features, its predictions were compared with those of other algorithms, and the performance of the resulting FA-AdaBoost model was evaluated and validated. Factor analysis selected three bands (380, 460, and 740 nm) as the key spectral features for tobacco leaf color classification. Compared with gradient boosting, Bagging, and random forest, AdaBoost reached the lowest test error rate in fewer iterations. The FA-AdaBoost model performed well in tobacco leaf color classification, with high precision, recall, and F1-score; for red-brown leaves, all three metrics reached 100%. The support values showed clear class imbalance: red-brown leaves (3 samples) were far fewer than the other categories. Nevertheless, the FA-AdaBoost model achieved an overall accuracy of 86%, indicating that it retained strong overall classification ability under class imbalance. The AdaBoost model recognized tobacco leaf color efficiently and accurately, with relatively balanced performance across categories and robust generalization ability.

Key words: tobacco grading, factor analysis, AdaBoost algorithm, tobacco leaf color, classification
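The workflow summarized in the abstract (factor analysis to pick a few spectral bands, then an AdaBoost classifier scored by per-class precision, recall, F1-score, and support) can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the spectra are synthetic, the wavelength grid, class sizes, and `make_spectra` helper are hypothetical, and the band-selection rule (the band with the largest absolute loading on each factor) is only one plausible reading of "selecting bands by factor analysis".

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# ASSUMPTION: hypothetical wavelength grid (nm) and class sizes; the real
# instrument settings and sample counts are not given in the abstract.
wavelengths = np.arange(380, 781, 20)
class_names = ["orange-yellow", "lemon-yellow", "red-brown"]
class_sizes = [60, 60, 12]  # deliberately imbalanced, red-brown is rare


def make_spectra(label, n):
    """Synthetic reflectance curves: a class-dependent sigmoid plus noise."""
    center = 520 + 60 * label
    base = 1.0 / (1.0 + np.exp(-(wavelengths - center) / 40.0))
    return base + rng.normal(0.0, 0.05, size=(n, wavelengths.size))


X = np.vstack([make_spectra(k, n) for k, n in enumerate(class_sizes)])
y = np.concatenate([np.full(n, k) for k, n in enumerate(class_sizes)])

# Stratified split so the rare class appears in both train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the scaler and factor analysis on the training spectra only.
scaler = StandardScaler().fit(X_train)
fa = FactorAnalysis(n_components=3, random_state=0)
fa.fit(scaler.transform(X_train))

# ASSUMED selection rule: for each factor, keep the not-yet-chosen band
# with the largest absolute loading.
selected = []
for loadings in np.abs(fa.components_):
    for idx in np.argsort(loadings)[::-1]:
        if int(idx) not in selected:
            selected.append(int(idx))
            break
selected_bands = wavelengths[selected]

# AdaBoost (default decision-stump base learner) on the selected bands.
model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X_train[:, selected], y_train)
y_pred = model.predict(X_test[:, selected])

accuracy = accuracy_score(y_test, y_pred)
report = classification_report(
    y_test, y_pred, target_names=class_names, zero_division=0
)
print("Selected bands (nm):", selected_bands)
print("Overall accuracy:", round(accuracy, 3))
print(report)
```

The `support` column of the report is the number of test samples per class. When a class has a support of only a few leaves, as reported for red-brown leaves, each single prediction moves that class's recall by a large step, so per-class scores for such a class are best read alongside its support.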
