To characterize the molecular functions of the DEGs, we first searched the genome against available protein databases with BLAST to obtain a functional annotation for each gene, and then matched the DEGs against the annotated genome to extract the corresponding functions. Protein functions of the annotated genome were assigned with Mercator4 v2.0 (Schwacke et al., 2019). Transcription factors were predicted with the online tool PlantTFDB v5.0 (Jin et al., 2016). Amino acid sequences of novel transcripts produced by StringTie were extracted using IsoformSwitchAnalyzeR v1.12.0 (Vitting-Seerup and Sandelin, 2019) and searched against the Pfam-A protein database using PfamScan to obtain domain information (Mistry et al., 2007). The novel transcript sequences were also annotated with eggNOG-mapper v2 to obtain more complete information on the genes (Huerta-Cepas et al., 2017). Annotation information for Arabidopsis thaliana, including GO slim and gene association files, was downloaded from The Arabidopsis Information Resource (TAIR). The reference genome was first aligned to A. thaliana using BLASTp (e-value < 10⁻⁵), and the genes were then mapped to their A. thaliana matches to obtain gene ontology (GO) terms grouped by biological process, cellular component, or molecular function. GO term enrichment analysis was conducted with GOATOOLS v1.0.15 using Bonferroni correction with a cut-off threshold of P < 0.01 (Klopfenstein et al., 2018). KEGG annotation was performed with the KAAS server, and gene enrichment analysis was done with the R package “clusterProfiler” v3.18.1 (Yu et al., 2012).
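The GO mapping and enrichment steps can be illustrated with a minimal Python sketch. It is not the pipeline used here and does not call GOATOOLS. It assumes BLAST tabular output (`-outfmt 6`), keeps the best hit per gene below the stated e-value cut-off, transfers the GO terms of the matched A. thaliana gene, and then applies a one-sided hypergeometric over-representation test with Bonferroni correction at P < 0.01. All function names (`best_hits`, `transfer_go`, `hypergeom_sf`, `enrich`) and all identifiers in the toy data are hypothetical.

```python
import csv
import io
from math import comb

EVALUE_CUTOFF = 1e-5  # e-value threshold stated in the text


def best_hits(blast_tsv, cutoff=EVALUE_CUTOFF):
    """Return {query: subject}, keeping the lowest e-value hit below cutoff.

    Assumes BLAST tabular output (-outfmt 6): column 1 is the query,
    column 2 the subject, column 11 the e-value.
    """
    best = {}
    for row in csv.reader(blast_tsv, delimiter="\t"):
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue >= cutoff:
            continue  # strict "<" as in the text
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {q: s for q, (s, _) in best.items()}


def transfer_go(hits, ath_go):
    """Give each gene the GO terms of its A. thaliana best hit."""
    return {gene: set(ath_go.get(ath, ())) for gene, ath in hits.items()}


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enrich(study, gene2go, alpha=0.01):
    """Bonferroni-corrected over-representation test for every GO term.

    The population is every gene in gene2go; study genes outside it are
    ignored. Returns (term, study_count, population_count, adjusted_p).
    """
    population = set(gene2go)
    study = set(study) & population
    terms = set().union(*gene2go.values())
    results = []
    for term in sorted(terms):
        K = sum(term in gos for gos in gene2go.values())
        k = sum(term in gene2go[g] for g in study)
        p = hypergeom_sf(k, len(population), K, len(study))
        p_adj = min(1.0, p * len(terms))  # Bonferroni: multiply by test count
        if p_adj < alpha:
            results.append((term, k, K, p_adj))
    return results


if __name__ == "__main__":
    # Toy BLAST rows with placeholder identifiers, 12 tab-separated columns.
    toy = io.StringIO(
        "g1\tATH_A\t90\t100\t0\t0\t1\t100\t1\t100\t1e-30\t200\n"
        "g2\tATH_B\t30\t50\t0\t0\t1\t50\t1\t50\t1e-3\t40\n"
    )
    print(transfer_go(best_hits(toy), {"ATH_A": ["GO:TERM1"]}))
```

In practice the DEG list would be passed as `study` and the GO-annotated genome as `gene2go`. A dedicated tool such as GOATOOLS additionally propagates terms through the GO hierarchy, which this sketch omits.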