K562 and GM12878 GRO-seq data from a previous report were acquired for re-analysis [45]. For activated mouse B cells and mES V6.5 cells, GRO-seq data were acquired from published reports [15, 69]. Adapter-trimmed reads were aligned using bowtie (-l 25 -v 1 -k 1 -m 1 -S -q --best) [82]. Reads mapped to rRNA were omitted. For K562 and GM12878 cells, the read density (reads per kilobase, RPK) normalized to 10 million sequencing reads in the gene bodies or promoter regions was calculated via analyzeRepeats.pl from Homer (analyzeRepeats.pl rna hg19 -count genes -condenseGenes -strand + -norm 1e7) [83]. For activated mouse B cells and mES V6.5 cells, the read density was obtained from Homer (analyzeRepeats.pl rna mm10 -count genes -condenseGenes -strand + -norm 1e6).
Active genes in K562 or GM12878 cells were defined as those with read density > 0 at the promoter region and RPK > 4 at the gene body in the GRO-seq data. Silent genes were defined as those with no reads at the promoter region and RPK ≤ 1 at the gene body. In activated mouse B cells and mES V6.5 cells, active genes were defined as those with RPK > 1, and silent genes as those with zero RPK in the GRO-seq data.
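The thresholds above can be written as a small classifier. This is a minimal sketch, not the authors' code: the function names and the `None` return for genes that meet neither definition are assumptions made for illustration, and the inputs are assumed to be the normalized promoter and gene-body densities reported by Homer.

```python
from typing import Optional


def classify_human_gene(promoter_density: float, gene_body_rpk: float) -> Optional[str]:
    """Classify a K562/GM12878 gene using the thresholds stated in the text.

    Returns "active", "silent", or None when neither definition applies
    (the None case is an assumption; the text does not name such genes).
    """
    if promoter_density > 0 and gene_body_rpk > 4:
        return "active"
    if promoter_density == 0 and gene_body_rpk <= 1:
        return "silent"
    return None


def classify_mouse_gene(rpk: float) -> Optional[str]:
    """Classify an activated B cell or mES V6.5 gene by its RPK alone."""
    if rpk > 1:
        return "active"
    if rpk == 0:
        return "silent"
    return None
```

Genes falling between the two definitions (for example, a human gene with promoter signal but a gene-body RPK of 2) are left unclassified in this sketch.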
The non-transcribed regions used in heatmaps were defined as interval regions between two active genes within A compartments via bedtools subtract. Non-transcribed regions with a size of 20–100 kb that overlapped with ERIZs were used for the non-TR-related heatmaps. The active genes flanked by ERIZ-occupied non-TRs were defined as transcribed regions (TRs) for display in heatmaps (Fig. 4f–h).
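The interval logic can be illustrated in plain Python. This is a rough sketch under stated assumptions, not the bedtools-based procedure used in the study: it works on one chromosome, treats intervals as half-open `(start, end)` pairs, and omits the A-compartment restriction and the ERIZ overlap filter. The function name `gaps_between_genes` is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) on a single chromosome


def gaps_between_genes(
    genes: List[Interval],
    min_size: int = 20_000,
    max_size: int = 100_000,
) -> List[Interval]:
    """Return gaps between consecutive active genes whose size is within
    [min_size, max_size]. Overlapping genes are treated as one block."""
    gaps: List[Interval] = []
    prev_end = None
    for start, end in sorted(genes):
        if prev_end is not None and start > prev_end:
            size = start - prev_end
            if min_size <= size <= max_size:
                gaps.append((prev_end, start))
        # Track the furthest gene end seen so far to handle overlaps.
        prev_end = end if prev_end is None else max(prev_end, end)
    return gaps
```

For example, with active genes at `(0, 1000)`, `(31000, 40000)`, `(45000, 50000)` and `(200000, 210000)`, only the 30 kb gap `(1000, 31000)` passes the size filter; the 5 kb and 150 kb gaps are discarded.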
The non-transcribed regions were further trimmed by 200 bp at both ends to exclude the TSS and TTS regions (Figs. 3b, d, f and 4c, e; Additional file 1: Figure S4f). The flanking transcribed regions were likewise trimmed to the active gene body (TSS + 200 bp to TTS − 200 bp). The read density was calculated as follows: for each non-transcribed region, reads in the region were counted and then normalized by region length. Read counts in the transcribed regions upstream and downstream of each non-transcribed region were summed and normalized by the summed length of those transcribed regions. To compare the signal enrichment of non-transcribed and transcribed regions for NAIL-seq and ORC2 ChIP-seq, the log2 ratio of the read density in the non-transcribed region to that in the transcribed regions was calculated. For MCM ChIP-seq, the fold enrichment of the read density relative to that of the input control was calculated first and then used to calculate the log2 fold change between the non-transcribed and transcribed regions.
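The trimming and density comparison described above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, read counts are assumed to be already computed per region, and regions with zero density are not handled because the text does not say how they were treated.

```python
import math
from typing import Sequence, Tuple


def trim_region(start: int, end: int, pad: int = 200) -> Tuple[int, int]:
    """Trim `pad` bp from both ends of a region (200 bp in the text)."""
    if end - start <= 2 * pad:
        raise ValueError("region too short to trim")
    return start + pad, end - pad


def log2_ntr_vs_tr(
    ntr_count: float,
    ntr_length: int,
    tr_counts: Sequence[float],
    tr_lengths: Sequence[int],
) -> float:
    """log2 of non-transcribed density over flanking transcribed density.

    Counts and lengths of the upstream and downstream transcribed regions
    are summed before normalizing, as described in the text.
    """
    ntr_density = ntr_count / ntr_length
    tr_density = sum(tr_counts) / sum(tr_lengths)
    return math.log2(ntr_density / tr_density)
```

For instance, 400 reads in a 1,000 bp non-transcribed region against 100 reads across 1,000 bp of flanking transcribed sequence gives a density ratio of 4, that is, a log2 fold change of about 2. For MCM ChIP-seq, the same function could be applied to input-normalized fold-enrichment values in place of raw densities.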