The ATAC-seq and H3K4me2 ChIPmentation data were processed using well-established bioinformatics pipelines and then subjected to the same quality control approach as the RNA-seq data (described above). The two data types were processed separately and subsequently integrated as described in the next section.
Raw reads were trimmed with trimmomatic (version 0.32) and aligned to the mouse reference genome (mm10) using bowtie2 (version 2.2.4). Primary alignments with mapping quality greater than 30 were retained. ATAC-seq peaks were called using MACS (version 2.7.6) on each individual sample. H3K4me2 ChIPmentation peaks were called using MACS against input controls, which were obtained for each organ by pooling equal amounts of input control data from the three types of structural cells. Peaks were aggregated into a list of consensus peaks using the function reduce of the package GenomicRanges (version 1.22.4) in R. Consensus peaks that overlapped with blacklisted genomic regions (downloaded from http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm10-mouse/) were discarded.
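The consensus-peak step can be illustrated with a minimal Python sketch. It is not the R code used in this study. It assumes peaks are given as (chromosome, start, end) tuples in half-open coordinates, and the function names `reduce_peaks` and `drop_blacklisted` are hypothetical. Overlapping or directly adjacent peaks are merged, which is roughly the default behavior of reduce in GenomicRanges, and merged peaks that overlap a blacklisted region are then discarded.

```python
from collections import defaultdict


def reduce_peaks(peaks):
    """Merge overlapping or adjacent peaks on each chromosome.

    peaks: iterable of (chrom, start, end) with half-open coordinates
    (an assumption of this sketch).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Overlaps or touches the current interval: extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def drop_blacklisted(peaks, blacklist):
    """Keep only peaks that overlap no blacklisted region."""
    black_by_chrom = defaultdict(list)
    for chrom, start, end in blacklist:
        black_by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in peaks:
        regions = black_by_chrom.get(chrom, [])
        # Two half-open intervals overlap if each starts before the other ends.
        if not any(start < b_end and b_start < end for b_start, b_end in regions):
            kept.append((chrom, start, end))
    return kept


# Toy peaks pooled across samples (made-up coordinates).
sample_peaks = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 10, 20),
]
consensus = reduce_peaks(sample_peaks)
filtered = drop_blacklisted(consensus, [("chr1", 450, 460)])
```

The linear scan over blacklist regions is adequate for a toy example. Genome-scale data would normally use an interval tree or a dedicated genomics library.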
Quantitative measurements were obtained by counting reads within consensus peaks using the function summarizeOverlaps from the GenomicAlignments (version 1.6.3) package in R. Samples with detectable contamination by hematopoietic immune cells were identified in the same way as for the RNA-seq data, using epigenomic signals in promoter peaks to calculate immune cell signatures. Contaminated samples were automatically removed and replaced by new samples. Finally, we computationally corrected for residual contamination in the retained samples by regressing out epigenomic signatures of hematopoietic immune cells from the matrix of signal intensity values across peaks, using the function removeBatchEffect from the limma package in R.
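The idea behind regressing out a contamination signature can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a reimplementation of removeBatchEffect: it handles a single immune signature score per sample, whereas the analysis described here may have used several, and it fits an ordinary least-squares line per peak. The function name `regress_out_signature` and the toy values are hypothetical.

```python
def regress_out_signature(signal, signature):
    """Remove the linear effect of one per-sample covariate from each peak.

    signal: list of rows (one per peak), each a list of per-sample values.
    signature: one contamination score per sample (assumed single covariate).
    The covariate is mean-centered, so each peak keeps its original mean.
    """
    n_samples = len(signature)
    mean_sig = sum(signature) / n_samples
    centered = [s - mean_sig for s in signature]
    denom = sum(c * c for c in centered)
    if denom == 0:
        # A constant signature carries no information; return a copy unchanged.
        return [list(row) for row in signal]

    corrected = []
    for row in signal:
        # Least-squares slope of the peak signal on the centered signature.
        slope = sum(c * y for c, y in zip(centered, row)) / denom
        corrected.append([y - slope * c for y, c in zip(row, centered)])
    return corrected


# Toy data: the first peak tracks the signature, the second does not.
immune_signature = [0.0, 1.0, 2.0, 3.0]
peak_signal = [
    [5.0, 7.0, 9.0, 11.0],
    [1.0, 1.0, 1.0, 1.0],
]
corrected_signal = regress_out_signature(peak_signal, immune_signature)
```

In this toy case the first peak becomes flat at its mean after correction, while the second peak, which is unrelated to the signature, is left unchanged. With several signatures, the per-peak fit would become a multiple regression over all of them.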