Data processing

Sandrine Auger
Virginie Mournetas
Hélène Chiapello
Valentin Loux
Philippe Langella
Jean-Marc Chatel
Brenda A Wilson

The workflow for this study is presented in S1 Fig. A read count table was generated from the raw sequencing reads. After trimming reads with fastp v.0.20.0 (default parameters, [34]), FASTQ-formatted reads were aligned to the genome of F. prausnitzii strain A2-165 (Genome Assembly ASM273414v1) using BWA v.0.7.17 [35], allowing a single mismatch per read. The SAM-formatted alignments were then sorted and converted to BAM files using SAMtools v.1.10 [36]. The number of reads per transcript in each sample was counted using htseq-count (HTSeq v.0.12.4) [37] and GFF-formatted gene annotations downloaded from NCBI. We checked the distribution of raw counts and performed principal component analysis on each dataset (S2 Fig). Gene expression values were normalized using the DESeq2 package v.1.34.0 in R [38]. The count table, containing 2,950 genes (S1 Table), was filtered to eliminate non-expressed genes. The resulting final dataset contained 2,902 genes and was further processed with WGCNA [39]. The quality of the expression matrix was evaluated by hierarchical clustering of the samples, using between-sample distances derived from Spearman’s correlation. No outliers were detected (S3 Fig).
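
The filtering and sample-distance steps can be illustrated with a minimal Python sketch. It is not the authors' code: the study used R packages, and the text does not state the threshold that defined a non-expressed gene, so the cutoff below (zero reads summed over all samples) is an assumption. The gene names, sample names, counts, and the function names filter_expressed, average_ranks, pearson, and spearman_distance are hypothetical. The distance is taken as 1 minus Spearman's rho, which is one common choice; the resulting pairwise distances would then be passed to a hierarchical clustering routine.

```python
from itertools import combinations


def filter_expressed(counts, min_total=1):
    """Keep genes whose count summed over all samples is at least min_total.

    The cutoff is an assumption; the source only says that non-expressed
    genes were removed.
    """
    return {gene: row for gene, row in counts.items() if sum(row) >= min_total}


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_distance(counts, sample_names):
    """Return {(sample_a, sample_b): 1 - Spearman rho} for every sample pair."""
    columns = {
        name: [row[i] for row in counts.values()]
        for i, name in enumerate(sample_names)
    }
    # Spearman's rho is the Pearson correlation of the ranked values.
    ranked = {name: average_ranks(col) for name, col in columns.items()}
    return {
        (a, b): 1 - pearson(ranked[a], ranked[b])
        for a, b in combinations(sample_names, 2)
    }


# Hypothetical toy count table: gene -> read counts per sample.
samples = ["s1", "s2", "s3"]
raw_counts = {
    "geneA": [10, 12, 1],
    "geneB": [0, 0, 0],  # never detected, so it is filtered out
    "geneC": [5, 6, 9],
    "geneD": [100, 90, 3],
    "geneE": [1, 2, 50],
}

filtered = filter_expressed(raw_counts)
dist = spearman_distance(filtered, samples)
for pair, d in dist.items():
    print(pair, round(d, 3))
```

Because Spearman's correlation depends only on ranks, two samples that order the genes identically are at distance 0 even when their absolute counts differ, which makes the measure fairly robust to differences in sequencing depth when screening for outlier samples.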
