IDseq z-score and aggregate score metrics

Katrina L Kalantar
Tiago Carvalho
Charles F A de Bourcy
Boris Dimitrov
Greg Dingle
Rebecca Egger
Julie Han
Olivia B Holmes
Yun-Fang Juan
Ryan King
Andrey Kislyuk
Michael F Lin
Maria Mariano
Todd Morse
Lucia V Reynoso
David Rissato Cruz
Jonathan Sheu
Jennifer Tang
James Wang
Mark A Zhang
Emily Zhong
Vida Ahyong
Sreyngim Lay
Sophana Chea
Jennifer A Bohl
Jessica E Manning
Cristina M Tato
Joseph L DeRisi

Given the sensitivity of mNGS, it is common to identify contaminating microbial sequences derived from laboratory contaminants, reagents, collection tubes, etc. There exist numerous approaches to assist in distinguishing background contaminants from true microbes [35, 60, 69]. IDseq implements a previously described z-score method for background correction [60]. Researchers can create a background model by selecting control samples sequenced via their standard laboratory protocols or select from a default set of publicly available water controls. From the selected set of samples, the distribution of reads for each taxon is computed. The z-score field of the IDseq sample report is calculated as the z-score for each taxonomic ID based on its prevalence in the selected background model. Specifically, the z-score for a taxon in sample A is computed as follows:

$$ z_{A} = \frac{x_{A} - \mu_{\mathrm{bg}}}{\sigma_{\mathrm{bg}}} $$

where $x_{A}$ is the abundance of the taxon in sample A, and $\mu_{\mathrm{bg}}$ and $\sigma_{\mathrm{bg}}$ are the mean and standard deviation of that taxon's abundance across the samples in the selected background model.

Thus, if a taxon is more abundant in the sample than it is on average in the controls, it will have a z-score >0. If a particular taxonomic ID is not found in the set of control samples, the z-score is set to 100. If the taxonomic ID is not found in the sample, the z-score is set to −100. The z-score metric also feeds into the “aggregate score,” which combines NT rPM, NR rPM, NT z-score, and NR z-score to estimate the “microbial importance” of a taxon in a particular sample, based on its relative abundance both within the sample and in the background. This experimental metric aims to rank rare organisms that may be implicated in an infection more highly, even if they are present only at low abundance.
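
The z-score rules described above can be illustrated with a minimal Python sketch. This is not IDseq's implementation: the function name `taxon_z_score`, the use of rPM as the abundance measure, the choice of the sample standard deviation, the order in which the two special cases are checked, and the handling of zero variance are all assumptions made for illustration.

```python
from statistics import mean, stdev

# Sentinel values stated in the text above.
NOT_IN_BACKGROUND_Z = 100.0
NOT_IN_SAMPLE_Z = -100.0


def taxon_z_score(sample_rpm, background_rpms):
    """Z-score of one taxon in one sample against a background model.

    sample_rpm:      abundance of the taxon in the sample (assumed to be rPM).
    background_rpms: abundance of the same taxon in each control sample
                     (hypothetical input layout; at least two controls).
    """
    if len(background_rpms) < 2:
        raise ValueError("need at least two control samples")

    # Taxon absent from the sample (assumption: this check comes first).
    if sample_rpm == 0:
        return NOT_IN_SAMPLE_Z

    # Taxon absent from every control sample.
    if not any(background_rpms):
        return NOT_IN_BACKGROUND_Z

    mu = mean(background_rpms)
    sigma = stdev(background_rpms)  # assumption: sample standard deviation
    if sigma == 0:
        # The text does not say how identical non-zero controls are handled.
        raise ValueError("background has zero variance for this taxon")

    return (sample_rpm - mu) / sigma


if __name__ == "__main__":
    print(taxon_z_score(10.0, [2.0, 4.0, 6.0]))  # 3.0
    print(taxon_z_score(5.0, [0.0, 0.0]))        # 100.0
    print(taxon_z_score(0.0, [1.0, 2.0]))        # -100.0
```

In the first call the background mean is 4 and the standard deviation is 2, so a sample value of 10 lies three standard deviations above the background.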
