To remove adapter sequences and low-quality data from the transcriptome data, the raw reads were filtered with fastp (https://github.com/OpenGene/fastp), discarding reads containing adapters, reads with an N ratio of 10% or more, and reads in which more than 50% of bases were low quality (quality value below 20). FastQC was then used to check the quality of the clean data, and subsequent analysis proceeded only after the data passed quality control. All downstream analyses were based on the clean sequence data. We downloaded the lemon reference genome and gene model annotation files directly from the Citrus Genome Database (https://www.citrusgenomedb.org/jbrowse/index.html?data=data/Climon_Alt_v1.0f) (Bao et al., 2023). The reference genome index was built, and paired-end clean reads were aligned to the reference genome, using HISAT2 v2.0.5 (Kim, Langmead & Salzberg, 2015; Shirasawa et al., 2017). The mapped reads of each sample were assembled with StringTie v1.3.3b (Pertea et al., 2015) using a reference-based approach. Differential expression analysis of two conditions/groups (two biological replicates per condition) was performed with the DESeq2 R package (v1.20.0), using |log2 fold change| ≥ 1 and padj ≤ 0.05 as the thresholds. Gene Ontology (GO) and KEGG pathway enrichment analyses of differentially expressed genes (DEGs) were performed with the clusterProfiler R package.
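The numeric criteria stated above (N ratio, low-quality base ratio, and the DEG thresholds) can be written out as a minimal Python sketch. This only illustrates the decision rules and is not the way fastp or DESeq2 implement them; the function names, the Phred+33 quality encoding, and the choice to treat a missing padj as non-significant are assumptions made for the example.

```python
import math


def phred33(qual_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding (assumption)."""
    return [ord(c) - 33 for c in qual_string]


def passes_read_filter(seq, quals, max_n_frac=0.10, min_q=20, max_low_frac=0.50):
    """Return True if a read survives the two filters described in the text.

    A read is dropped when its N ratio is 10% or more, or when more than
    50% of its bases have a quality value below 20.
    """
    if not seq or len(seq) != len(quals):
        raise ValueError("seq and quals must be non-empty and of equal length")
    n_frac = seq.upper().count("N") / len(seq)
    low_frac = sum(q < min_q for q in quals) / len(seq)
    return n_frac < max_n_frac and low_frac <= max_low_frac


def is_deg(log2fc, padj, lfc_cut=1.0, padj_cut=0.05):
    """Apply |log2 fold change| >= 1 and padj <= 0.05.

    Genes with a missing padj are treated as not significant (assumption).
    """
    if padj is None or math.isnan(padj):
        return False
    return abs(log2fc) >= lfc_cut and padj <= padj_cut
```

For example, `passes_read_filter("ACGTACGTAC", phred33("IIIIIIIIII"))` keeps the read, while a gene with a log2 fold change of 0.9 is not called a DEG regardless of its padj.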