Research on Structural Variation Detection Methods of Wheat Genome Based on Deep Learning

Haiping Shi1, Yanling Li1, Zijing Dong1, Yuhong Li2, Fernando Bacao3
1College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan, 450002, China
2School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore
3NOVA Information Management School (NOVA IMS), Campus de Campolide, Universidade Nova de Lisboa, Lisboa, 1070-312, Portugal

Abstract

Owing to the complexity of the wheat genome and the limitations of current technology, structural variation in the wheat genome has not yet been comprehensively and accurately detected, nor have its genetic effects been evaluated. This study aims to develop a deep learning-based method for accurately detecting structural variation in the wheat genome. The method first converts genomic data into images using a structural variation image generation algorithm. A deep learning-based structural variation prediction model then automatically extracts and analyzes the variation features in these images to predict structural variants efficiently and accurately. Experimental results show that the method outperforms other structural variation detection methods based on third-generation sequencing data, particularly on the structural variation detection task of the “Sequencing and Assembly of Spring Wheat Genome in China” project; its accuracy, precision, and recall all exceed 90%. This study provides a novel deep learning framework for efficiently detecting structural variants in the wheat genome and offers technical support for wheat genetic improvement and breeding research.

Keywords: wheat genes, structural variation, deep learning, image generation algorithm, variation prediction