Published: Jun 5, 2021, Vol 11, Iss 11. DOI: 10.21769/BioProtoc.4033
Reviewed by: Prashanth N Suravajhala, Octavio Morante-Palacios, Ernesto Aparicio
Abstract
DNA methylation in gene promoters plays a major role in regulating gene expression, and alterations in methylation patterns have been associated with several diseases. In this context, different software suites and statistical methods have been proposed to analyze differentially methylated positions and regions. Among them, the statistical method implemented in the mCSEA R package provides a framework for detecting subtle but consistent methylation differences. Here, we provide an easy-to-use pipeline covering all the steps necessary to detect differentially methylated promoters with mCSEA from Illumina 450K and EPIC methylation BeadChip data. This protocol covers the download of data from public repositories, quality control, data filtering and normalization, estimation of cell type proportions, and statistical analysis. In addition, we show how to compare disease vs. normal phenotypes to obtain differentially methylated regions, including promoters or CpG islands. The entire protocol is based on the R programming language, which can be used on any operating system and does not require advanced programming skills.
Keywords: Methylation
Background
DNA methylation plays an important role in many cellular processes and is widely studied to gain a better understanding of human development and disease (Robertson, 2005). Most epigenome-wide association studies (EWAS) search for associations between DNA methylation and disease (Flanagan, 2015). For this purpose, Illumina's BeadChip arrays are widely used to measure DNA methylation in humans. Methylation in promoters is associated with repression of gene expression (Boyes and Bird, 1992). The mechanisms of repression include impeding the binding of transcription factors and recruiting transcriptional repressors (Cedar and Bergman, 2012). Aberrant DNA methylation in these regions has been linked to several diseases, including cancer (Ehrlich and Lacey, 2013) and autoimmune disorders (Dozmorov et al., 2014). Several R packages are designed to detect differentially methylated regions using either de novo or predefined strategies, as previously reviewed (Martorell-Marugán et al., 2019). Most of the predefined methods can be applied directly to detect differentially methylated promoters. In contrast, de novo methods search for differentially methylated regions along the entire genome, and the resulting regions must then be annotated to determine which of them are located in promoters.
In this protocol, we present the complete data processing and statistical analysis pipeline to detect differentially methylated promoters in disease phenotypes from Illumina BeadChip data, based on the mCSEA R package (Martorell-Marugán et al., 2019), which applies a predefined-regions strategy. We used previously published data from patients with two rare neurodevelopmental diseases, Williams syndrome (WS) and 7q11.23 duplication syndrome (Dup7), as well as typically developing (TD) patients. Methylation in blood cells was measured in all the samples by the authors of the original study (Strong et al., 2015). This pipeline can be easily adapted to study other genomic regions, such as gene bodies or CpG islands (CGIs). The complete code for this protocol is available as the Supplemental_script file.
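As a rough illustration of the predefined-regions strategy described above, the core mCSEA step can be sketched as follows. This is a minimal sketch, not the authors' Supplemental_script: the wrapper name `detect_promoters` and its argument names are hypothetical, and the default `platform = "450k"` is an assumption that should be checked against the array type of the data being analyzed. The calls to `rankProbes()` and `mCSEATest()` follow the usage shown in the mCSEA package vignette.

```r
# Hypothetical wrapper around the two main mCSEA calls.
# beta:      matrix of beta values (probes in rows, samples in columns)
# pheno:     data frame with the phenotype of each sample
# ref_group: label of the reference (control) group in pheno
# platform:  "450k" or "EPIC" (assumed default; verify for your data)
detect_promoters <- function(beta, pheno, ref_group, platform = "450k") {
  # Attached inside the function so that defining it does not require mCSEA
  library(mCSEA)

  # Rank all CpG probes by their differential methylation between groups
  probe_rank <- rankProbes(beta, pheno, refGroup = ref_group)

  # Test predefined promoter regions for enrichment in the ranked list
  results <- mCSEATest(probe_rank, beta, pheno,
                       regionsTypes = "promoters",
                       platform = platform)

  # Return the per-promoter results table
  results[["promoters"]]
}

# Example with the test data shipped with mCSEA (requires the package):
# library(mCSEA)
# data(mcseadata)  # loads betaTest and phenoTest, among other objects
# head(detect_promoters(betaTest, phenoTest,
#                       ref_group = "Control", platform = "EPIC"))
```

Changing `regionsTypes` to `"genes"` or `"CGI"` would target gene bodies or CpG islands instead of promoters.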
Equipment
Personal computer with Windows, macOS, or Unix-based operating system
Software
R software environment 4.0.2 (https://www.r-project.org/)
RStudio integrated development environment 1.3.1056 (not required, but strongly recommended, https://rstudio.com/)
GEOquery R package 2.56.0 (Davis and Meltzer, 2007), https://www.bioconductor.org/packages/release/bioc/html/GEOquery.html
minfi R package 1.34.0 (Aryee et al., 2014), https://www.bioconductor.org/packages/release/bioc/html/minfi.html
M3C R package 1.10.0 (John et al., 2020), https://www.bioconductor.org/packages/release/bioc/html/M3C.html
ggfortify R package 0.4.11 (Yuan et al., 2016), https://cran.r-project.org/web/packages/ggfortify/index.html
DMRcate R package 2.2.3 (Peters et al., 2015), https://www.bioconductor.org/packages/release/bioc/html/DMRcate.html
wateRmelon R package 1.32.0 (Pidsley et al., 2013), https://www.bioconductor.org/packages/release/bioc/html/wateRmelon.html
FlowSorted.Blood.450k R package 1.26.0, http://bioconductor.org/packages/release/data/experiment/html/FlowSorted.Blood.450k.html
mCSEA R package 1.8.0 (Martorell-Marugán et al., 2019), http://bioconductor.org/packages/release/bioc/html/mCSEA.html
Methylation data generated previously (Strong et al., 2015). GEO identifier: GSE66552 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE66552)
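The packages listed above can be installed from within R. The following is a minimal sketch, assuming an internet connection and a working R installation; the helper names `missing_pkgs` and `install_missing` are hypothetical. Note that `BiocManager::install()` installs the package versions matching the Bioconductor release in use, which may differ from the versions listed above.

```r
# Package names taken from the Software list; helper names are hypothetical.
bioc_pkgs <- c("GEOquery", "minfi", "M3C", "DMRcate", "wateRmelon",
               "FlowSorted.Blood.450k", "mCSEA")
cran_pkgs <- c("ggfortify")

# Return the subset of pkgs that is not installed in the current library
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs,
                      function(p) nzchar(system.file(package = p)),
                      logical(1))
  pkgs[!installed]
}

# Install whatever is missing; BiocManager handles both Bioconductor
# and CRAN packages
install_missing <- function(pkgs = c(bioc_pkgs, cran_pkgs)) {
  if (!nzchar(system.file(package = "BiocManager"))) {
    install.packages("BiocManager")
  }
  to_install <- missing_pkgs(pkgs)
  if (length(to_install) > 0) {
    BiocManager::install(to_install)
  }
  invisible(to_install)
}

# Uncomment to run the installation:
# install_missing()
```

Running `missing_pkgs(c(bioc_pkgs, cran_pkgs))` beforehand shows which packages would be installed without changing anything on the system.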
Procedure
Article Information
Copyright
© 2021 The Authors; exclusive licensee Bio-protocol LLC.
How to cite
Martorell-Marugán, J. and Carmona-Sáez, P. (2021). Detecting Differentially Methylated Promoters in Genes Related to Disease Phenotypes Using R. Bio-protocol 11(11): e4033. DOI: 10.21769/BioProtoc.4033.
Category
Systems Biology > Epigenomics > DNA methylation
Systems Biology > Genomics