Average CAGE, MNase-seq, H3K4me3 ChIP-seq, PRO-Cap, PRO-Seq and NET-seq signal calculations

Malte Thodberg
Axel Thieffry
Jette Bornholdt
Mette Boyd
Christian Holmberg
Ajuna Azad
Christopher T Workman
Yun Chen
Karl Ekwall
Olaf Nielsen
Albin Sandelin

MNase-seq and H3K4me3 ChIP-Seq data were obtained from (39,40) (accession numbers GSM1374060 and GSE49575). Only wild-type strains were used in both cases. PRO-Cap and PRO-Seq signals were obtained from (41) (accession number GSE76142). NET-seq data were obtained from (42) (BAM files were generously supplied by the authors). The GenomicAlignments package (34) was used to import the data and calculate the average signal.

In all cases, data were imported with rtracklayer and coverage was calculated with the coverage function from IRanges (34). CAGE, PRO-Cap, PRO-Seq and NET-Seq profiles were plotted after removing the 1% most highly expressed sites, to reduce the influence of a few extreme outliers.
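The outlier trimming and averaging described above can be illustrated with a minimal sketch. This is not the authors' R code (they used rtracklayer, IRanges and GenomicAlignments); it is a plain-Python illustration under stated assumptions: `Site`, `drop_top_fraction` and `average_profile` are hypothetical names, coverage is held as a per-base list for each chromosome, sites are ranked by a single expression score, and the number of sites dropped is rounded down.

```python
from typing import Dict, List, NamedTuple


class Site(NamedTuple):
    chrom: str
    pos: int      # 0-based index into the per-base coverage list (assumed)
    strand: str   # "+" or "-"
    score: float  # expression level used for ranking (assumed)


def drop_top_fraction(sites: List[Site], fraction: float = 0.01) -> List[Site]:
    """Remove the `fraction` most highly expressed sites, keeping input order."""
    n_drop = int(len(sites) * fraction)  # rounds down; an assumption
    by_score = sorted(range(len(sites)), key=lambda i: sites[i].score, reverse=True)
    dropped = set(by_score[:n_drop])
    return [s for i, s in enumerate(sites) if i not in dropped]


def average_profile(
    sites: List[Site], coverage: Dict[str, List[float]], flank: int
) -> List[float]:
    """Mean coverage at each offset in [-flank, +flank], oriented by strand."""
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for site in sites:
        cov = coverage[site.chrom]
        if site.pos - flank < 0 or site.pos + flank >= len(cov):
            continue  # skip windows that run off the chromosome
        # For "-" sites the window is read in reverse so that negative
        # offsets always mean "upstream of the site".
        step = 1 if site.strand == "+" else -1
        for k in range(width):
            offset = k - flank
            totals[k] += cov[site.pos + step * offset]
        used += 1
    return [t / used for t in totals] if used else totals
```

A typical call would be `average_profile(drop_top_fraction(sites), coverage, flank=500)`, where the flank width is an arbitrary illustrative choice.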

For Figure 3A and B, only CAGE-defined TSSs at annotated promoters were used. For Figure 3C and Supplementary Figure S3C and D, TSSs were selected based on the following criteria: (i) the TSSs must have an upstream CAGE TSS within 250 bp on the opposite strand and (ii) the TSSs must not be within 500 bp of an annotated TSS, on either strand. TSSs were grouped by the amount of NET-Seq sense signal in the nucleosome-depleted region (NDR), defined as −250:−50 bp relative to the CAGE-defined TSS peak.
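The two selection criteria and the NDR window can be sketched as follows. This is an illustration only, not the authors' implementation: `Tss` and all function names are hypothetical, positions are treated as 0-based indices, distance boundaries are assumed to be inclusive, and "upstream" is interpreted relative to the strand of the TSS being tested. The thresholds used to group TSSs by NDR signal are not given in the text, so no grouping step is shown.

```python
from typing import Dict, List, NamedTuple, Tuple


class Tss(NamedTuple):
    chrom: str
    pos: int     # position of the TSS peak (0-based, assumed)
    strand: str  # "+" or "-"


def has_upstream_antisense(tss: Tss, cage_tss: List[Tss], max_dist: int = 250) -> bool:
    """Criterion (i): an opposite-strand CAGE TSS lies within max_dist bp upstream."""
    for other in cage_tss:
        if other.chrom != tss.chrom or other.strand == tss.strand:
            continue
        # Upstream means lower coordinates for "+" TSSs, higher for "-" TSSs.
        dist = tss.pos - other.pos if tss.strand == "+" else other.pos - tss.pos
        if 0 < dist <= max_dist:
            return True
    return False


def near_annotated(tss: Tss, annotated: List[Tss], max_dist: int = 500) -> bool:
    """Criterion (ii), negated: an annotated TSS on either strand is within max_dist bp."""
    return any(
        a.chrom == tss.chrom and abs(a.pos - tss.pos) <= max_dist for a in annotated
    )


def select_tss(cage_tss: List[Tss], annotated: List[Tss]) -> List[Tss]:
    """Keep TSSs that satisfy criterion (i) and are not near an annotated TSS."""
    return [
        t for t in cage_tss
        if has_upstream_antisense(t, cage_tss) and not near_annotated(t, annotated)
    ]


def ndr_window(tss: Tss, far: int = 250, near: int = 50) -> Tuple[int, int]:
    """Genomic (start, end), inclusive, of the -250:-50 window relative to the TSS."""
    if tss.strand == "+":
        return tss.pos - far, tss.pos - near
    return tss.pos + near, tss.pos + far


def ndr_sense_signal(tss: Tss, coverage: Dict[Tuple[str, str], List[float]]) -> float:
    """Sum same-strand coverage over the NDR window; coverage is keyed by (chrom, strand)."""
    start, end = ndr_window(tss)
    cov = coverage[(tss.chrom, tss.strand)]
    return sum(cov[max(start, 0):end + 1])
```

Sorting or binning the selected TSSs by `ndr_sense_signal` would then give groups of the kind the text describes, with cut-offs left to the analyst.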

Analysis of TSS distribution shape and bidirectional transcription in Schizosaccharomyces pombe. (A) Relation between CAGE TC width and CpG content. X-axes show CAGE TC width as defined in Supplementary Figure S1A. Left Y-axes and black histograms show the distribution of widths in five pooled Mus musculus lung CAGE libraries (left panel) and this study (S. pombe, right panel). Red trend lines and right Y-axes show the average number of CpG dinucleotides per bp (200 bp centered on TSS peaks). The 95% confidence intervals are shown as dotted lines. Only CAGE-defined TSSs corresponding to annotated TSSs were analyzed. (B) Relation between TATA-box occurrence and TC width. X-axes show the −50 to −10 region relative to TSS peaks. Y-axes show the average TATA-box motif match score (0–100% of maximal score). TSSs are stratified by their TC width, indicated by line color. CAGE-defined TSSs from M. musculus and S. pombe are shown in the left and right panels, respectively. Only CAGE-defined TSSs corresponding to annotated TSSs were analyzed. (C) Analysis of bidirectional transcription initiation. X-axes show distance relative to S. pombe CAGE-defined TSSs in bp. Each panel shows a different assay (MNase-Seq, NET-Seq or PRO-Cap). Y-axes show the average signal in the forward (blue) or reverse (red) direction relative to the TSS, or unstranded (green). TSSs analyzed were selected to have bidirectional transcription based on CAGE, and subsequently filtered to retain only those with little sense NET-Seq transcription in the NDR (highlighted in gray). Supplementary Figure S3C and D show TSSs with higher NET-Seq signal in the NDR.
