Raw sequencing data were trimmed using Trim Galore (v0.4.0) (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) to remove low-quality bases and adaptor sequences. Trimmed reads were mapped onto the reference genome (hg19 for samples of human origin and mm9 for samples of mouse origin) using bsmap (v2.74) (32), followed by removal of PCR duplicates. Methylation signals were extracted using methratio.py, a script in the bsmap package (v3.4.2) (33). An R script was generated to maintain chromosome coordinates consistent for methylation sites between each pair of BS and OXBS samples, allowing us to identify 5mC and 5hmC signals in these common sites with the mlml script in methpipe using the default settings. To calculate 5fC/5caC sites, we used the following formula: 5fC/5caC = NT/(NC + NT), where NT and NC are the number of Ts and Cs in the CG context, respectively. M.SssI methylase error rate and bisulfite conversion inefficiency was 1.64%, which was determined in lambda genome in our pilot experiment in mESC WT and mESC Tdg KO cell lines (table S8B). Error rate was used in binomial distribution and q value package to adjust for the 5fC/5caC signal. The number of 5fC/5caC sites was determined by a binomial test, as described previously (34). Only 5fC/5caC sites with cutoff of coverage ≥ 10, P < 0.01, and false discovery rate < 0.01 were kept for the downstream analyses. It is possible that our samples may carry mutations different from the reference genome, which could result in higher false positives. To address this issue, we used biscuit package (https://github.com/zwdzwd/biscuit) to identify these mutations from trimmed mapped reads and removed these potential mutations from 5mC, 5hmC, and 5fC/5caC sites. The remaining sites were used for the downstream analysis.

To call for DMRs, DHMRs, and DFCRs between the two groups of methylomes, the methpipe package (http://smithlabresearch.org/software/methpipe/) was used (33). We started by assembling a proportion table containing read proportions for all target methylomes with “merge-methcounts” program included in methpipe. After creating the proportion table and specifying the design matrix, we performed differential methylation analyses, which consists of (i) regression (“radmeth regression” default parameters), (ii) combining significance, and (iii) multiple testing adjustment steps (“radmeth adjust -bins 1:200:1”). All DMRs/DHMRs/DFCRs were filtered by P < 0.01 and contain at least five differentially methylated CG sites in each region. To call for DMRs, DHMRs, and DFCRs between a pair of methylomes, we generated an R script that enables us to use 200-bp step size across the entire genome with 2-kb bin and calculate the methylation difference along with P value (Student’s t test) in both samples. If adjacent bins had continual methylation differences between samples, which were identified with cutoff of fold change > 2 and P < 0.05, then these bins were iteratively merged together, and methylation difference was calculated for the merged region.

Note: The content above has been extracted from a research article, so it may not display correctly.

Please log in to submit your questions online.
Your question will be posted on the Bio-101 website. We will send your questions to the authors of this protocol and Bio-protocol community members who are experienced with this method. you will be informed using the email address associated with your Bio-protocol account.

We use cookies on this site to enhance your user experience. By using our website, you are agreeing to allow the storage of cookies on your computer.