Whole-genome bisulfite sequencing and analysis

IA Ihab Ansari
LS Llorenç Solé-Boldo
MR Meshi Ridnik
JG Julian Gutekunst
OG Oliver Gilliam
MK Maria Korshko
TL Timur Liwinski
BJ Birgit Jickeli
NW Noa Weinberg-Corem
MS Michal Shoshkes-Carmel
EP Eli Pikarsky
EE Eran Elinav
FL Frank Lyko
YB Yehudit Bergman

Whole-genome bisulfite sequencing (WGBS) was performed on three biological replicates from Tet2/3fl/fl and Tet2/3fl/fl VillinCre mice. Library preparation for bisulfite sequencing was performed as described previously (ref. 93). Reads were trimmed to a maximal length of 80 bp, and stretches of bases with a quality score <30 at the ends of the reads were removed.

Reads were mapped using BSMAP 2.5 (ref. 94), with the mm9 assembly of the mouse genome as the reference sequence. Only read pairs in which both mates mapped at the correct distance were used; the correct distance was defined by setting the minimum value to 50 bp and the maximum value to 800 bp. If a read pair mapped to multiple sites on the reference sequence, a random hit was chosen by setting the BSMAP option -r to 1. The maximum number of mismatches allowed was set to 4% of the number of bases of a read. The full list of parameters used for mapping with BSMAP 2.5 is: -d mm9, -s 16, -v 0.04, -w 100, -r 1, -q 0, -z 33, -f 5, -A none, -B 1, -E 4294967295, -L 144, -D none, -I 4, -S 0, -n 1, -M TC, -p 4, -m 50, -x 800.

Duplicates were removed using the Picard tool (http://broadinstitute.github.io/picard). Methylation ratios were determined using a Python script (methratio.py) distributed together with the BSMAP package. For both the forward and reverse strands, all cytosine bases in the CG context were called independently. Low-methylated regions (LMRs) were identified with MethylSeekR (ref. 95).
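The read-trimming step can be illustrated with a minimal Python sketch. The text does not say which tool was used for trimming, so this is not the authors' implementation. The function name `trim_read`, the order of operations (truncate first, then strip low-quality ends), and the Phred+33 quality encoding are assumptions made for illustration.

```python
def trim_read(seq, qual, max_len=80, min_q=30, offset=33):
    """Hypothetical helper: truncate a read to max_len bases, then strip
    stretches of bases with quality < min_q from both ends.

    seq  -- base sequence of the read
    qual -- FASTQ quality string (assumed Phred+33 encoded)
    """
    # Step 1: cap the read at the maximal length (80 bp in the protocol).
    seq, qual = seq[:max_len], qual[:max_len]

    # Step 2: decode the quality characters into Phred scores.
    scores = [ord(ch) - offset for ch in qual]

    # Step 3: move inward from each end while the score is below the cutoff.
    start, end = 0, len(scores)
    while start < end and scores[start] < min_q:
        start += 1
    while end > start and scores[end - 1] < min_q:
        end -= 1

    return seq[start:end], qual[start:end]


# Usage: '#' encodes Phred 2 and 'I' encodes Phred 40 under Phred+33.
trimmed_seq, trimmed_qual = trim_read("ACGTACGT", "##IIII##")
```

Low-quality bases in the interior of a read are left untouched here; only the terminal stretches are removed, which matches the wording of the protocol.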

For further analysis, we used ChromHMM (ref. 96), which performs hidden Markov modeling of input data. As training datasets, we used available ChIP-seq tracks (http://genome.ucsc.edu) for H3K4me1, H3K4me3, H3K27ac, H3K27me3, and H3K36me3. This resulted in a segmentation of the genome into 15 different types of compartments, which were annotated based on the combination of related histone patterns. Motif analysis was performed using HOMER software (ref. 97).

To compare the methylation levels of Tet2/3fl/fl and Tet2/3fl/fl VillinCre with developmental stages of the SI, we used publicly available datasets from ENCODE (ENCSR089FFK, ENCSR842QTB, ENCSR217TMK, ENCSR353IFP, ENCSR211VXF; mm10 genome assembly), removed CpGs covered by fewer than three reads, and calculated the mean methylation level for each LMR. Finally, we performed a liftOver (ref. 98) of the Tet2/3fl/fl and Tet2/3fl/fl VillinCre samples to the mm10 genome assembly, followed by a combined analysis of the developmental stages with our data.
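The coverage filter and per-LMR averaging can be sketched as follows. This is a minimal illustration, not the authors' script. The function name `mean_lmr_methylation`, the tuple layouts, the half-open interval convention, and the choice to average per-CpG ratios (rather than pooling read counts across the region) are all assumptions.

```python
from statistics import mean


def mean_lmr_methylation(cpgs, lmrs, min_cov=3):
    """Hypothetical helper: mean methylation level per LMR.

    cpgs    -- list of (chrom, pos, n_methylated_reads, n_total_reads)
    lmrs    -- list of (chrom, start, end), treated as half-open [start, end)
    min_cov -- CpGs covered by fewer reads than this are discarded
    """
    results = {}
    for chrom, start, end in lmrs:
        # Keep only sufficiently covered CpGs that fall inside this region
        # and convert their read counts into a methylation ratio.
        ratios = [
            n_meth / n_total
            for c, pos, n_meth, n_total in cpgs
            if c == chrom and start <= pos < end and n_total >= min_cov
        ]
        # Regions with no usable CpG get None instead of a misleading zero.
        results[(chrom, start, end)] = mean(ratios) if ratios else None
    return results


# Usage with made-up toy data (not taken from the study).
toy_cpgs = [("chr1", 100, 3, 4), ("chr1", 150, 1, 2), ("chr1", 180, 0, 5)]
toy_lmrs = [("chr1", 90, 200)]
levels = mean_lmr_methylation(toy_cpgs, toy_lmrs)
```

The nested scan is quadratic and only suitable for small inputs; for genome-scale data an interval index or a sorted sweep would be the more practical choice.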
