Read counts were downloaded from GEO (accession numbers GSE129240, GSE86337, and GSE74246). Each sample was normalized by library size and multiplied by a size factor 106, or count per million (CPM), and then log2-transformed with a pseudocount of 1, i.e., log2(CPM+1). For the ENCODE samples, the downloaded data were already normalized (Fragments Per Kilobase of transcript per Million mapped reads, or FPKM) and log2-transformed.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.