Whole-genome bisulfite sequencing (WGBS) data generation and processing

Himisha Beltran
Alessandro Romanel
Vincenza Conteduca
Nicola Casiraghi
Michael Sigouros
Gian Marco Franceschini
Francesco Orlando
Tarcisio Fedrizzi
Sheng-Yu Ku
Emma Dann
Alicia Alonso
Juan Miguel Mosquera
Andrea Sboner
Jenny Xiang
Olivier Elemento
David M. Nanus
Scott T. Tagawa
Matteo Benelli
Francesca Demichelis

cfDNA (5 ng) and germline DNA (100 ng) were sonicated to approximately 180–220 bp using a Covaris S220 (Covaris) and bisulfite converted using the EZ DNA Methylation-Gold Kit (catalog D5005, Zymo Research Corporation). The resulting single-stranded DNA was processed for library construction using the Accel-NGS Methyl-Seq DNA Library Kit (catalog 36024, Swift BioSciences) per the manufacturer's instructions. Briefly, truncated adapter sequences were incorporated into the single-stranded DNA in a template-independent reaction through sequential steps using the Adaptase module (Swift BioSciences). DNA was then enriched by PCR with primers compatible with Illumina sequencing, using 9 cycles for cfDNA and 6 cycles for genomic DNA. The libraries were clustered at 12 pM on a paired-end read flow cell and sequenced for 125 cycles on an Illumina HiSeq 2500 or 4000.

Primary processing of sequencing images was done using Illumina's Real Time Analysis software (RTA). CASAVA 1.8.2 software was then used to demultiplex samples and generate raw reads and their respective quality scores. The raw WGBS data were quality filtered and adapter trimmed using Flexbar with the following parameters: minimal overlap of adapter and read sequence, 6; minimal read length after adapter removal/trimming, 21; and allowed mismatches and gaps per 10 bp overlap, 2. Reads were aligned to the unmasked human genome build GRCh37/hg19, and methylation calls were generated with Bismark (59) as described in the data analysis section of the Weill Cornell Epigenomics Core in-house bisulfite sequencing analysis pipeline (60). The average conversion rate in WGBS samples was 99.6%, the average CpG coverage was 14.3×, and the average mapping efficiency was 76% (Supplemental Table 8). For plasma-related analysis, only CpG sites covered by at least 10 reads and read in at least 10% of the study samples were considered for downstream analysis. For each sample, the percentage of methylation per site (β value) was computed. ctDNA fraction was estimated by purity assessment from clonal methylation sites (PAMES), using 10 hypermethylated prostate-specific CpG islands (30). CpG-wise differential methylation analysis (CRPC-NE versus CRPC-Adeno) was performed by area under the curve (AUC) analysis. Hyper- and hypomethylated sites were identified as those with AUC = 1 and AUC = 0, respectively. Genomic annotation of methylation sites was performed with the annotatePeaks tool included in the HOMER package (Supplemental Table 11) (61).
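
The site filtering, β value, and AUC steps described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the input layout (methylated and unmethylated read counts per sample), and the reading of the filter (a sample contributes a β value only when the site has at least 10 reads, and a site is kept when at least 10% of samples contribute) are assumptions made for illustration. The AUC is computed here as the probability that a CRPC-NE β value exceeds a CRPC-Adeno β value, with ties counted as one half, so AUC = 1 means complete separation with higher methylation in CRPC-NE.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical input layout: (methylated reads, unmethylated reads) at one CpG.
Counts = Tuple[int, int]

MIN_READS = 10              # minimum read depth per site, from the text
MIN_SAMPLE_FRACTION = 0.10  # minimum fraction of samples, from the text


def beta_value(counts: Counts, min_reads: int = MIN_READS) -> Optional[float]:
    """Percentage of methylated reads, or None if depth is below min_reads."""
    meth, unmeth = counts
    depth = meth + unmeth
    if depth < min_reads:
        return None
    return 100.0 * meth / depth


def site_betas(
    per_sample: Dict[str, Counts],
    n_samples: int,
    min_fraction: float = MIN_SAMPLE_FRACTION,
) -> Optional[Dict[str, float]]:
    """Beta values for one CpG site, or None if too few samples cover it.

    Assumed interpretation of the filter: the site is kept when at least
    min_fraction of all study samples have sufficient depth at this site.
    """
    betas: Dict[str, float] = {}
    for sample, counts in per_sample.items():
        beta = beta_value(counts)
        if beta is not None:
            betas[sample] = beta
    if len(betas) < min_fraction * n_samples:
        return None
    return betas


def auc(case: Sequence[float], control: Sequence[float]) -> float:
    """Fraction of (case, control) pairs with case > control; ties count 0.5."""
    wins = 0.0
    for x in case:
        for y in control:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case) * len(control))


def classify(auc_value: float) -> str:
    """Label a site using the AUC = 1 / AUC = 0 rule from the text."""
    if auc_value == 1.0:
        return "hyper"
    if auc_value == 0.0:
        return "hypo"
    return "none"


if __name__ == "__main__":
    # Toy counts for a single CpG; sample names are made up.
    ne = [beta_value((18, 2)), beta_value((27, 3))]
    adeno = [beta_value((2, 18)), beta_value((1, 19))]
    print(classify(auc(ne, adeno)))
```

In practice the per-sample counts would come from the Bismark methylation calls, and the same comparison would be repeated for every retained CpG site.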
