Data analysis for RNAseq

Charles Viau
Orçun Haçariz
Farial Karimian
Jianguo Xia

Raw data for each sample were received in FASTQ format from the McGill University and Génome Québec Innovation Centre. Read quality was checked with FastQC (version 0.11.3) and adapter-related sequences were removed using Trim Galore (version 0.4.5) (https://www.bioinformatics.babraham.ac.uk/projects/). The C. elegans genome sequence and GTF file (Caenorhabditis_elegans.WBcel235.91.gtf) were downloaded from ENSEMBL (https://www.ensembl.org/). Reads were aligned to the C. elegans genome with HISAT2 (version 2.1.0) (Kim, Langmead & Salzberg, 2015) and sorted alignment files were generated with SAMtools (version 1.7) (Li et al., 2009). Raw read counts were extracted using HTSeq (version 0.9.1) in intersection-strict mode (Anders, Pyl & Huber, 2015). Entrez IDs were extracted from a Bioconductor package (org.Ce.eg.db) (Carlson, 2018) and assigned to the WormBase gene sequences using R. Sample distribution by principal component analysis was visualised using NetworkAnalyst 3.0 (Zhou et al., 2019). Differential gene expression analysis between the nanoparticle treatments and the control was carried out using edgeR, where data were normalised by the trimmed mean of M-values (TMM) and tag-wise dispersion parameters were estimated using the empirical Bayes method (Robinson, McCarthy & Smyth, 2010). For gene set enrichment analysis, genes were ranked by the expression ratio (a combination of log2 fold change and FDR) and the normalized enrichment score (NES) was determined using GSEAPreranked in Gene Set Enrichment Analysis (GSEA; version 3.0) (Subramanian et al., 2005). The "min size: exclude smaller sets" parameter was set to 0, the number of permutations was set to 1000 and the enrichment statistic was set to classic. For use with GSEAPreranked, GO-derived gene sets for C. elegans in MSigDB format were downloaded from GO2MSIG (Powell, 2014), and the C. elegans KEGG database was extracted from a current Bioconductor package (version 3.7) (Luo et al., 2009) and converted to a *.gmt file. Pathway interaction was investigated using ClueGO (Bindea et al., 2009). Furthermore, enrichment analyses for GO terms and for newly determined terms were carried out using GOATOOLS (Klopfenstein et al., 2018) and WormExp (Yang, Dierking & Schulenburg, 2016), respectively. Differentially expressed genes in the toxicity groups (compared to the control) were further searched to ascertain whether they had been reported in connection with metal toxicity in previous studies (Caito et al., 2012; Cui et al., 2007; Roh, Lee & Choi, 2006; Kumar et al., 2015; Anbalagan et al., 2012).
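
As an illustration of the GSEAPreranked inputs described above, the following minimal Python sketch builds a ranked gene list and serialises gene sets in the tab-separated GMT layout (set name, description, then gene identifiers). It is not the authors' code. The protocol does not give the exact formula used to combine log2 fold change and FDR, so the metric here, sign(log2FC) × −log10(FDR), is an assumption chosen only because it is a common convention. The function names (rank_score, build_rnk, to_gmt) and the example gene identifiers and values are hypothetical.

```python
import math


def rank_score(log2fc, fdr, floor=1e-300):
    """ASSUMED ranking metric: sign(log2FC) * -log10(FDR).

    The protocol only says the ranking combines log2 fold change and FDR;
    this particular formula is an illustrative assumption.
    """
    sign = (log2fc > 0) - (log2fc < 0)  # +1, 0 or -1
    # Clamp the FDR so that an FDR of exactly 0 does not break log10.
    return sign * -math.log10(max(fdr, floor))


def build_rnk(results):
    """Turn (gene_id, log2fc, fdr) tuples into (gene_id, score) pairs,
    sorted from the most up-regulated to the most down-regulated gene."""
    scored = [(gene, rank_score(lfc, fdr)) for gene, lfc, fdr in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def to_gmt(gene_sets):
    """Serialise {set_name: (description, [genes])} as GMT text:
    one set per line, fields separated by tabs."""
    lines = []
    for name, (description, genes) in gene_sets.items():
        lines.append("\t".join([name, description, *genes]))
    return "\n".join(lines) + "\n"


# Made-up values for illustration only (not data from the study).
example_results = [
    ("geneA", 2.0, 0.001),
    ("geneB", -1.5, 0.01),
    ("geneC", 0.3, 0.5),
]
ranked = build_rnk(example_results)

example_sets = {"SET_1": ("illustrative set", ["geneA", "geneB"])}
gmt_text = to_gmt(example_sets)

for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
print(gmt_text, end="")
```

Writing the ranked pairs to a two-column, tab-separated file gives the kind of ranked list that GSEAPreranked accepts, and the GMT text can be saved with a .gmt extension for use as the gene set database.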
