RNA-sequencing analysis of differential gene expression

JT J. Vladimir Torres-Rodríguez
MS M. Nancy Salazar-Vidal
RM Ricardo A. Chávez Montes
JM Julio A. Massange-Sánchez
CG C. Stewart Gillmor
RS Ruairidh J. H. Sawers
request Request a Protocol
ask Ask a question
Favorite

RNA-sequencing analysis was carried out on roots and leaves for the 4 nutrient treatments (Full, LowN, LowP and lowNP) and two replicates, for a total of 2 tissues × 4 treatments × 2 replicates = 16 samples. Libraries were prepared by the Laboratorio de Servicios Genomicos, LANGEBIO, Mexico (www.langebio.cinvestav.mx/labsergen/). Libraries were prepared using the TruSeq RNA Sample Prep Kit v2 (https://support.illumina.com/sequencing/sequencing_kits/truseq_rna_sample_prep_kit_v2.html) and sequenced using the Illumina HiSeq4000 platform at the Vincent J. Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 OD018174 Instrumentation Grant, and at Labsergen on the Illumina NextSeq 550 equipment. Transcriptome data are available in the NCBI Sequence Read Archive under study SRP287300 at https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP287300

RNA sequencing reads were aligned against the AGPV3.30 maize gene model set available at Ensembl Plants [106] using kallisto version 0.43.1 [107]). Transcript-level abundance data was pre-processed using R/tximport [108] and summarized at the gene-level before further analysis. Count data were analyzed using a linear model approach in edgeR [109, 110]. We fitted the complete model counts ~ intercept + tissue * N-level * P-level + error across the 16 samples. We selected genes-of-interest based on evidence of a non-zero coefficient for at least one model term containing N-level or P-level (the coef argument to R/edgeR::glmQLFTest included all model coefficients except for the intercept and tissue main effect; adjusted FDR < 0.01; absolute log fold change (LFC) > 1; log counts per million (CPM) > 1). An additional subset of 81 NxP interaction genes was selected based on the coefficients N-level x P-level and tissue x N-level x P-level (adjusted FDR < 0.05; |LFC| > 1; logCPM > 1). Genes-of-interest were further categorized based on pairwise LFC for each stress treatment with respect to the full nutrient control for either root or leaves. LFC for each tissue was extracted from the model counts ~ treatment + error, a threshold of + 1 and − 1 being used for up- and down- regulation, respectively. Gene functional annotations were assigned as the functional annotation of the blastp reciprocal best hits versus Araport11 [10.1111/tpj.13415] and uniprot proteins, and the description from the PANNZER2 [10.1093/nar/gky350] functional annotation webserve. Upset diagrams were generated using R/UpSetR and R/ComplexHeatmap [111, 112]. GO analysis was performed with BiNGO 3.0.3 [113] in the Cytoscape 3.7.2 environment [114] using a hypergeometric test, Benjamini & Hochberg FDR correction and a significance level of 0.05. The Gene ontology file (go.obo) was retrieved from the gene ontology web page (http://geneontology.org/docs/download-ontology/). For each GO category, the mean LFC of the associated genes-of-interest was calculated with respect to each tissue/treatment combination using the pairwise values described above.

Do you have any questions about this protocol?

Post your question to gather feedback from the community. We will also invite the authors of this article to respond.

post Post a Question
0 Q&A