The original image data produced by sequencing were converted into sequence data (raw reads) by base calling. To obtain clean reads for further analysis, the raw reads were filtered by removing adaptor sequences, low-quality reads, and reads in which unknown bases (N) made up more than 10% of the sequence. The clean reads were then mapped to the developing-flower transcriptome reference database using SOAPaligner/SOAP2, allowing no more than two mismatches per alignment. The expression level of each gene was determined from the number of reads uniquely mapped to that gene and the total number of uniquely mapped reads in the sample, and was calculated using the RPKM method (reads per kilobase per million mapped reads) [67]. If a gene had more than one transcript, the longest transcript was used to calculate its expression level and coverage.
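The unknown-base filter and the RPKM calculation described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function names and arguments are hypothetical, the adaptor and low-quality filters are omitted because their criteria are not given here, and RPKM is taken in its usual form, 10^9 × C / (N × L), where C is the number of reads uniquely mapped to the gene, N is the total number of uniquely mapped reads in the sample, and L is the gene length in bases.

```python
def passes_n_filter(read: str, max_n_fraction: float = 0.10) -> bool:
    """Keep a read unless more than 10% of its bases are unknown (N).

    Hypothetical helper; an empty read is rejected.
    """
    if not read:
        return False
    n_fraction = read.upper().count("N") / len(read)
    return n_fraction <= max_n_fraction


def rpkm(unique_reads_on_gene: int,
         gene_length_bp: int,
         total_unique_reads: int) -> float:
    """Reads per kilobase per million mapped reads.

    gene_length_bp is assumed to be the length of the longest
    transcript when a gene has several transcripts.
    """
    if gene_length_bp <= 0 or total_unique_reads <= 0:
        raise ValueError("gene length and total read count must be positive")
    return 1e9 * unique_reads_on_gene / (total_unique_reads * gene_length_bp)


# Example: 500 unique reads on a 2,000 bp gene, 10 million unique reads in total.
example_rpkm = rpkm(500, 2000, 10_000_000)
```

With these example numbers the gene has 250 reads per kilobase, and dividing by 10 million mapped reads (10 units of one million) gives an RPKM of 25.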
A false discovery rate (FDR) ≤ 0.001 and an absolute value of log2Ratio ≥ 1 were used as the thresholds to identify the DEGs. The DEGs were then subjected to GO and KEGG Orthology (KO) enrichment analysis using hypergeometric testing. The enrichment P-values were calculated as follows:

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

where N is the number of genes with GO or KO annotation; n is the number of DEGs in N; M is the number of genes annotated to a given GO or KO term; and m is the number of DEGs in M. A Bonferroni correction was applied to the P-values, and terms with a corrected P-value ≤ 0.05 were considered significantly enriched.
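The DEG thresholds and the enrichment test can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the software used in the study: the function names are hypothetical, and the P-value is computed as the hypergeometric upper tail P(X ≥ m), which equals one minus the cumulative probability of observing fewer than m DEGs in the term, using the symbols N, n, M, and m as defined in the text.

```python
from math import comb


def is_deg(fdr: float, log2_ratio: float) -> bool:
    """Apply the stated thresholds: FDR <= 0.001 and |log2Ratio| >= 1."""
    return fdr <= 0.001 and abs(log2_ratio) >= 1


def enrichment_p_value(N: int, n: int, M: int, m: int) -> float:
    """Hypergeometric upper-tail probability P(X >= m).

    N: annotated genes; n: DEGs among them;
    M: genes in the term; m: DEGs in the term.
    """
    if not (0 <= M <= N and 0 <= n <= N and 0 <= m <= min(M, n)):
        raise ValueError("inconsistent counts")
    # Sum exact integer counts of favourable draws, then divide once.
    tail = sum(comb(M, i) * comb(N - M, n - i)
               for i in range(m, min(M, n) + 1))
    return tail / comb(N, n)


def bonferroni(p_values: list[float]) -> list[float]:
    """Multiply each P-value by the number of tests, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


# Toy example: 10 annotated genes, 3 DEGs, a term with 4 genes, 2 of them DEGs.
p_toy = enrichment_p_value(N=10, n=3, M=4, m=2)
```

Summing the upper tail directly, rather than subtracting a cumulative sum from 1, avoids loss of precision when the P-value is very small. In the toy example the tail contains 40 of the 120 possible draws, so the P-value is 1/3.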