Hubei Agricultural Sciences ›› 2025, Vol. 64 ›› Issue (6): 197-206. doi: 10.14088/j.cnki.issn0439-8114.2025.06.033

• Bioengineering •

Full-length transcriptome sequencing and bioinformatics analysis of Bombyx mandarina

MENG Gang, WANG Rui-xian, CHU Qu, PENG Yun-wu, YANG Jin-hong, CHEN An-li, ZHANG Sheng-yuan, LING Jun

  1. Shaanxi Key Laboratory of Sericulture, Ankang University, Ankang 725000, Shaanxi, China
  • Received: 2024-12-16 Published: 2025-06-25 Online: 2025-07-18
  • About the author: MENG Gang (b. 1984), male, from Suizhou, Hubei; associate researcher, PhD; mainly engaged in research on silkworm germplasm resources and genetic breeding. Tel: 0915-3352237; E-mail: nsymg@aku.edu.cn.
  • Funding:
    Shaanxi Province Key Research and Development Program (2023-JC-YB-206); Key Scientific Research Program of the Shaanxi Provincial Department of Education (20JS002); Open Project of the State Key Laboratory of Silkworm Genome Biology (SKLSQB1819-4)

Abstract: Full-length transcriptome sequencing of the wild silkworm (Bombyx mandarina) was performed using third-generation sequencing (PacBio ISO-Seq) reads corrected with second-generation sequencing (Illumina RNA-Seq) data, in order to characterize gene expression in pupae at different diapause stages and to explore functional information in the genome. Sequencing and assembly yielded 93 616 full-length transcripts, 327 to 33 273 bp long, with an average length of 2 631 bp and an N50 of 3 204 bp. Integrating the annotations from the COG, GO, KEGG, KOG, Pfam, Swiss-Prot, eggNOG, and NR functional databases gave 82 796 functionally annotated genes. From the full-length transcript data, 17 189 lncRNAs, 87 921 SSR markers, and 49 432 open reading frames (ORFs) were identified; the ORF-encoded proteins ranged from 0 to 1 522 aa in length, with an average of 305 aa. Expression analysis of pupae at different diapause stages identified 5 780 differentially expressed genes, of which 2 269 had GO annotations and 1 590 had KEGG annotations.

Key words: Bombyx mandarina, full-length transcriptome, sequencing, bioinformatics analysis
