Gleditsia sinensis is a deciduous tree native to China with high economic and medicinal value. However, little is known about the molecular processes responsible for the medicinal properties of this species, owing to a lack of bioinformatic resources such as a whole-genome sequence. In the present study, RNA sequencing data were used to analyze the transcriptome of G. sinensis, and a series of bioinformatic tools was used to explore the main genes involved in important molecular processes. A total of 75.57 million paired-end reads, each 101 bp in length, were acquired from G. sinensis. Using the assembly tool Trinity, 233,751 transcripts were assembled. Among these, 85,795 were identified as unique transcripts, and 59,326 of the unique transcripts were found to contain coding regions. Gene ontology analysis identified 27,637 unique transcripts that were clustered into 56 functional groups. Genes involved in flavonoid and terpenoid backbone biosynthesis and those encoding transcription factors were further analyzed. Sequence analysis revealed four putative G. sinensis chalcone isomerase genes (GsCHI) encoding enzymes for flavonoid biosynthesis. GsCHI1 was found to be phylogenetically related to the chalcone isomerases of the family Leguminosae, and its transcript levels in different tissues were higher than those of GsCHI2, GsCHI3, and GsCHI4. Furthermore, 15,014 simple sequence repeat (SSR) markers were discovered in the transcript library, and 5,170 primers were generated for the SSR loci. The genetic and genomic information presented in this study will be helpful for future studies on gene discovery and molecular processes in G. sinensis.
Genetics and Molecular Research has received 74,024 citations according to Google Scholar.