Raw reads were quality-filtered with fastp (version 0.19.4) (Chen et al., 2018) to obtain clean reads. The paired-end clean reads were then aligned to the reference genome sequence (G. hirsutum acc. TM-1, version 2.1) (Hu et al., 2019) with HISAT2 (version 2.2.1) (Kim et al., 2015). Gene expression abundance in each sample was calculated using StringTie (version 2.1.4) (Pertea et al., 2015), and the fragments per kilobase of exon per million fragments mapped (FPKM) value obtained for each gene was used as the measure of its expression level. One replicate sample of Hai7124 at 25 DPA had a mapped ratio of only 55.66% and was therefore excluded from subsequent analyses (Supplementary Table 1). In this study, genes with FPKM > 1 were considered expressed. Principal component analysis (PCA) and Pearson’s correlation coefficient (PCC) analysis were performed using the prcomp and cor functions in R (version 3.6.4). Gene Ontology (GO) enrichment analysis was conducted with the R package clusterProfiler (version 3.14.3) (Yu et al., 2012). GO terms with a p-value < 0.05 were considered significantly enriched. The corresponding phenotype data were retrieved from http://mascotton.njau.edu.cn/info/1058/1132.htm (Fang et al., 2017) and http://cotton.zju.edu.cn/download.html (Fang et al., 2021), respectively. The TF family members among the G. hirsutum genes were those identified by Hu et al. (2019).
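To make the expression measure and the filtering rule concrete, the following minimal Python sketch illustrates the textbook FPKM definition, the FPKM > 1 criterion for expressed genes, and a Pearson correlation between two samples. It is an illustration only: StringTie estimates FPKM internally from read coverage, and the analyses in this study were run in R. The function names, gene identifiers, and all numbers below are hypothetical.

```python
from math import sqrt


def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of exon per million mapped fragments.

    Dividing by (length / 1e3) and (total / 1e6) is the same as multiplying by 1e9.
    """
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def expressed_genes(fpkm_by_gene, threshold=1.0):
    """Return sorted gene IDs whose FPKM is strictly above the threshold."""
    return sorted(g for g, v in fpkm_by_gene.items() if v > threshold)


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical toy data: fragment counts and total exon lengths for three genes.
counts = {"geneA": 500, "geneB": 10, "geneC": 2000}
exon_lengths = {"geneA": 2000, "geneB": 1500, "geneC": 4000}
total_mapped = 20_000_000

sample_fpkm = {g: fpkm(counts[g], exon_lengths[g], total_mapped) for g in counts}
kept = expressed_genes(sample_fpkm)  # genes passing the FPKM > 1 criterion

# Correlation between two hypothetical replicate expression profiles.
rep1 = [12.5, 0.3, 25.0, 4.1]
rep2 = [11.9, 0.4, 26.2, 3.8]
pcc = pearson(rep1, rep2)
```

In this toy example geneB falls below the threshold and is dropped, while geneA and geneC are retained as expressed.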