The high-quality reads from each sample were aligned against the gene catalog using SOAP2.21 with an identity threshold of > 90% [63]. In our sequence-based profiling analysis, an alignment was accepted if it met one of the following previously described criteria [67]: (i) an entire paired-end read mapped onto a gene with the correct insert size; or (ii) one end of the paired-end read mapped onto the end of a gene while the other end mapped outside the genic region. In both cases, the mapped read was counted as one copy. The formula used in this study to calculate gene relative abundance is similar to the RPKM/FPKM value (reads per kilobase of exon model per million mapped reads/fragments per kilobase of exon model per million mapped fragments). Accordingly, for any sample S, we calculated the abundance as follows:

Step 1: Calculation of the copy number of each gene:

$$b_i = \frac{x_i}{L_i}$$

Step 2: Calculation of the relative abundance of gene i:

$$a_i = \frac{b_i}{\sum_j b_j} = \frac{x_i / L_i}{\sum_j x_j / L_j}$$

$a_i$: The relative abundance of gene i in sample S.

$L_i$: The length of gene i.

$x_i$: The number of times gene i is detected in sample S (the number of mapped reads).

$b_i$: The copy number of gene i in the sequenced data from S.

$j$: The index running over all genes in the iHSMGC.

The value of $b_i$ standardizes the effect of gene length in Step 1. The ratio $b_i / \sum_j b_j$ standardizes the effect of sequencing depth in Step 2.
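
The two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name relative_abundance, the dictionary-based inputs, and the example gene names and counts are all hypothetical and serve only to show the arithmetic. It assumes the per-gene mapped-read counts have already been obtained from the alignments.

```python
from typing import Dict


def relative_abundance(
    read_counts: Dict[str, int], gene_lengths: Dict[str, int]
) -> Dict[str, float]:
    """Compute per-gene relative abundance for one sample.

    Hypothetical helper for illustration only:
      read_counts  maps gene ID -> number of mapped reads (x_i)
      gene_lengths maps gene ID -> gene length in bp (L_i)
    Genes missing from read_counts are treated as having zero reads.
    """
    for gene, length in gene_lengths.items():
        if length <= 0:
            raise ValueError(f"gene length must be positive: {gene}")

    # Step 1: copy number b_i = x_i / L_i (corrects for gene length).
    copy_number = {
        gene: read_counts.get(gene, 0) / length
        for gene, length in gene_lengths.items()
    }

    # Step 2: a_i = b_i / sum_j b_j (corrects for sequencing depth).
    total = sum(copy_number.values())
    if total == 0:
        # No reads mapped to any gene; report zero abundance everywhere.
        return {gene: 0.0 for gene in gene_lengths}
    return {gene: b / total for gene, b in copy_number.items()}


# Made-up example values, not data from the study.
example_counts = {"geneA": 100, "geneB": 50, "geneC": 0}
example_lengths = {"geneA": 1000, "geneB": 250, "geneC": 500}
abundance = relative_abundance(example_counts, example_lengths)
```

In this made-up example, geneB receives fewer reads than geneA but is four times shorter, so its copy number (0.2) is twice that of geneA (0.1), and it ends up with roughly two thirds of the total relative abundance. Because each a_i is divided by the sum over all genes, the abundances in a sample sum to 1 regardless of sequencing depth.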
