Aug 2021


Whole-genome Methylation Analysis of APOBEC Enzyme-converted DNA (~5 kb) by Nanopore Sequencing


Abstract

In recent years, DNA methylation research has been accelerated by the advent of nanopore sequencers. However, read length has been limited by the constraints of base conversion using the bisulfite method, making analysis of chromatin content difficult. The read length of the previous method combining bisulfite conversion and long-read sequencing was ~1.5 kb, even using targeted PCR. In this study, we extended the read length to ~5 kb by converting unmethylated cytosines to uracils with APOBEC enzymes, which reduces DNA fragmentation. The converted DNA was then sequenced using a PromethION nanopore sequencer. We have also developed a new analysis pipeline that accounts both for base conversions, which are not present in conventional nanopore sequencing, and for the errors produced by nanopore sequencing.

Keywords: DNA methylation, Epigenetics, Nanopore sequencing, Long-read sequencing, Next generation sequencing

Background

DNA methylation is an important mechanism for epigenetic regulation of gene expression (Greenberg and Bourc’his, 2019). It has a wide range of effects on genes via several biological processes. DNA methylation is usually detected and analyzed using bisulfite sequencing short reads. However, it is difficult to align these short reads (~150 bp) to some chromosomal regions, such as repetitive sequences and structural variants (Goerner-Potvin and Bourque, 2018). Similarly, short reads also constrain detection of chromosome-specific methylation patterns, such as imprinted regions, in polyploid organisms (Akbari et al., 2021). A comprehensive understanding of epigenetic regulation by DNA methylation will therefore require complementary methods.


The bisulfite method distinguishes between unmethylated and methylated cytosine (C vs. mC) by chemically converting unmethylated C to uracil (U), which is then read as thymine (T) after amplification (Lister et al., 2009). However, since this reaction is carried out under chemically severe conditions, a large proportion of the DNA in the reaction is fragmented and degraded. The genomic regions that undergo this degradation show biased representation (Olova et al., 2018), further limiting the experimental conclusions this method can provide. The read length of the previous method combining bisulfite conversion and long-read sequencing was only ~1.5 kb, even using targeted PCR (Yang et al., 2015). Recently, enzymatic methyl sequencing (EMseq) was developed as an alternative to the bisulfite method of base conversion (Vaisvila et al., 2021). EMseq involves the oxidation of mC by ten-eleven translocation (TET) enzymes to protect it, followed by conversion of unmethylated C to U by APOBEC enzymes. During amplification, U is replaced by T, as in the bisulfite method. Because this reaction is performed under milder conditions than the bisulfite reaction, longer DNA fragments can be obtained. Indeed, a previous study showed that DNA fragments over 5 kb long can be obtained using EMseq and target-specific PCR, and that these fragments can be sequenced successfully on a long-read sequencer (Sun et al., 2021).


Nanopore sequencers read nucleic acid sequences by measuring the change in electric current while the nucleic acids pass through the nanopore. The maximum read length of nanopore sequencing is over 100 kb (Sakamoto et al., 2020). By recognizing specific electrical patterns for modified bases, base modifications can also be detected (Rand et al., 2017; Simpson et al., 2017). However, while the base-reading accuracy of nanopore sequencers is currently up to 90%, this is not quite high enough to accurately infer methylation patterns (Sakamoto et al., 2020). Furthermore, direct detection requires about 500 ng–1 µg of DNA input, reducing its practical utility for rare samples, such as clinical specimens and biopsies. Although several methods combining base conversion and long-read sequencing have been developed, thus far all have employed gene-specific amplification (Yang et al., 2015; Liu et al., 2020; Sun et al., 2021). Neither a whole-genome methylation analysis method based on this combination nor a bioinformatic pipeline to process the sequence data it generates has previously been developed.


Here, we report a method for whole-genome long-read methylation sequencing that uses a relatively small amount of input DNA and is based on nanopore sequencing of DNA base-converted by APOBEC enzymes (Figure 1) (Sakamoto et al., 2021). Our method, which we designate nanoEM, allows for whole-genome long-read methylation analysis with 10–100 ng of DNA. In addition, we have developed a data analysis pipeline for nanoEM reads by adapting a three-letter alignment approach to long-read alignment. NanoEM is a useful approach for detecting the methylation status of structural variants (SVs), repetitive regions, and imprinted regions, which are difficult to analyze using short-read sequencing (Sakamoto et al., 2021).



Figure 1. Flow chart of the experimental procedure.

Materials and Reagents

  1. Filter pipette tips 10, 20, 200, and 1,000 µL [e.g., Pipette Tips RT UNV F (RAININ, catalog numbers: 30389172, 30389189, 30389186, and 30389165)]

  2. 1.5 mL tubes [e.g., DNA LoBind Tube 1.5 mL (Eppendorf, catalog number: 0030108051)]

  3. PCR tubes [e.g., Temp Assure 0.2 mL PCR 8-Tube Strips, Att. Optical Caps (USA Scientific, catalog number: 1402-4700)]

  4. MagAttract HMW DNA Kit (QIAGEN, catalog number: 67563)

  5. g-TUBE (Covaris, catalog number: 520079)

  6. Ethanol (e.g., FUJIFILM Wako Pure Chemical Corporation, catalog number: 057-00456)

  7. Nuclease-free water (Thermo Fisher Scientific, catalog number: AM9930)

  8. Formamide (FUJIFILM Wako Pure Chemical Corporation, catalog number: 064-00423)

  9. NEBNext Enzymatic Methyl-seq kit (New England Biolabs, catalog number: E7120S)

  10. KOD One PCR Master Mix (TOYOBO, catalog number: KMM-101)

  11. Ligation Sequencing kit (Oxford Nanopore Technologies, catalog number: SQK-LSK110)

  12. PromethION flowcell (Oxford Nanopore Technologies, catalog number: FLO-PRO002)

  13. NEBNext Ultra II End Repair/dA-Tailing Module (New England Biolabs, catalog number: E7546)

  14. NEBNext FFPE DNA Repair Mix (New England Biolabs, catalog number: M6630)

  15. NEBNext Quick Ligation Module (New England Biolabs, catalog number: E6056)

  16. Qubit dsDNA HS Assay kit (Thermo Fisher Scientific, catalog number: Q32854)

  17. Agilent DNA 12000 kit (Agilent Technologies, catalog number: 5067-1508)

  18. Agencourt AMPure XP (Beckman Coulter, catalog number: BC-A63880)

  19. DNA Clean & Concentrator-5 (Zymo Research, catalog number: D4013)

  20. ProNex Size-Selective DNA Purification System (Promega, catalog number: NG2001)

  21. TET2 Reaction Buffer with supplement (see Recipes)

  22. 70% and 80% (v/v) ethanol (see Recipes)

  23. Wash Buffer of ProNex Size-Selective DNA Purification System (NG2001) (see Recipes)

Equipment

  1. PromethION sequencing device (Oxford Nanopore Technologies, catalog number: PRM48BasicSP)

  2. 2100 Bioanalyzer Instrument (Agilent Technologies, catalog number: G2939BA)

  3. Thermal cycler [e.g., T100 thermal cycler (Bio-Rad, catalog number: 1861096)]

  4. Qubit 4 Fluorometer (Thermo Fisher Scientific, catalog number: Q33238)

  5. Vortex Mixer [e.g., Vortex-Genie 2 (Scientific Industries, catalog number: SI-0236)]

  6. High speed centrifuge (e.g., MDX-310 with rack for 2 mL × 24 tubes, TOMY SEIKO)

  7. Tabletop centrifuge for 1.5 and 0.2 mL tubes [e.g., MyFuge mini centrifuge (Benchmark Scientific, catalog number: C1008-B)]

  8. Magnetic stand for 1.5 and 2 mL tubes [e.g., DynaMag-2 (Thermo Fisher Scientific, catalog number: 12321D)]

  9. Magnetic stand for 0.2 mL tubes [e.g., 10× Magnetic Separator (10× Genomics, catalog number: 120250)]

  10. Pipettes for 10, 20, 200, and 1,000 μL tips

  11. Racks for 0.2 mL PCR tubes and 1.5 mL tubes

Software

  1. Python3 (version 3.8.6, https://www.python.org/downloads/)

  2. pysam (version 0.17.0, https://github.com/pysam-developers/pysam)

  3. minimap2 (version 2.22) (Li, 2018)

  4. sambamba (version 0.7.1) (Tarasov et al., 2015)

  5. samtools (version 1.9) (Li et al., 2009)

  6. Integrative Genomics Viewer (IGV) (version 2.5.3) (Thorvaldsdóttir et al., 2013)

Procedure

  1. DNA Extraction

    The MagAttract HMW DNA Kit is used for DNA extraction from cultured cells (<2 × 10⁹ cells) and/or clinical specimens (<25 mg of tissue), in accordance with the manufacturer's instructions without modification (Note 1).


  2. DNA Fragmentation

    For fragmentation of genomic DNA, add 150 µL of DNA (<4 µg) diluted with nuclease-free water (NFW) to a g-TUBE, and centrifuge twice at 4,700 × g and room temperature (RT) for 1 min. Using the Bioanalyzer with the Agilent DNA 12000 kit, measure the concentration and the length distribution of the fragmented DNA following the manufacturer’s protocol (Note 2). Apply 10–50 ng of fragmented DNA to the next step, diluted with NFW to a 50 µL volume.


  3. End Repair and Adaptor Ligation

    1. Set up the programs of Steps 3, 5, and 12 in a thermal cycler.

    2. Combine 50 µL of the fragmented DNA, 7 µL of NEBNext Ultra II End-Prep Reaction Buffer, and 3 µL of NEBNext Ultra II End-Prep Enzyme Mix in a PCR tube. Mix by pipetting.

    3. Incubate at 20°C for 30 min, at 65°C for 30 min, then hold at 4°C in a thermal cycler with the heated lid set to 75°C.

    4. Add 2.5 µL of NEBNext EMseq Adaptor, 1 µL of NEBNext Ligation Enhancer, and 30 µL of NEBNext Ultra II Ligation Master Mix to the sample. Mix by pipetting.

    5. Incubate at 20°C for 15 min, then hold at 4°C in a thermal cycler with the heated lid off.

    6. Add 110 µL of NEBNext Sample Purification Beads to the sample. Mix by pipetting. Incubate at RT for 5 min.

    7. Place the tube on a magnetic stand for 0.2 mL tubes until it becomes clear (it takes ~2 min) (Figure 2). Remove and discard the supernatant.

    8. Add 200 µL of 80% ethanol to the tube. After 30 s, remove and discard the supernatant.

    9. Repeat Step 8 once.

    10. Remove the tube from the magnetic stand and spin down on a tabletop centrifuge (~2,000 × g at RT). Place the tube on the magnetic stand. Remove and discard the remaining supernatant completely.

    11. Air dry the pellet for 1 min.

    12. Remove the tube from the magnetic stand. Elute the DNA from the beads by adding 29 µL of Elution Buffer from the EMseq kit and incubating at 37°C in a thermal cycler for 10 min.

    13. Place the tube on the magnetic stand until it becomes clear (it takes ~2 min) (Figure 2). Transfer 28 µL of the supernatant to a new PCR tube.



      Figure 2. Bead separation by magnetic stand.

      (A) Suspended beads before magnetic separation. (B) Insufficient separation of beads. The solutions are slightly cloudy. (C) Sufficient separation of beads. The solutions are clear.


  4. Oxidation of 5mC and 5hmC

    1. Set up the programs of Steps 5, 7, and 14 in a thermal cycler.

    2. Add 10 µL of TET2 Reaction Buffer with supplement, 1 µL of Oxidation Supplement, 1 µL of DTT, 1 µL of Oxidation Enhancer, and 4 µL of TET2 from the EMseq kit. Mix by pipetting.

    3. Dilute 1 µL of 500 mM Fe(II) Solution from the EMseq kit in 1,249 μL of NFW in a new 1.5 mL tube.

    4. Add 5 µL of the diluted Fe(II) Solution to the sample. Mix by pipetting.

    5. Incubate at 37°C in a thermal cycler with the heated lid set to 45°C for 1 h.

    6. Add 1 µL of Stop Reagent to the sample. Mix by pipetting.

    7. Incubate at 37°C in the thermal cycler with the heated lid set to 45°C for 30 min.

    8. Add 90 µL of NEBNext Sample Purification Beads from the EMseq kit to the sample. Mix by pipetting. Incubate at RT for 5 min.

    9. Place the tube on a magnetic stand for 0.2 mL tubes until it becomes clear (it takes ~2 min) (Figure 2). Remove and discard the supernatant.

    10. Add 200 µL of 80% ethanol to the tube. After 30 s, remove and discard the supernatant.

    11. Repeat Step 10 once.

    12. Remove the tube from the magnetic stand and spin down on a tabletop centrifuge. Place the tube on the magnetic stand. Remove and discard the remaining supernatant completely.

    13. Air dry the pellet for 1 min.

    14. Remove the tube from the magnetic stand. Elute the target DNA from the beads by adding 17 µL of Elution Buffer from the EMseq kit and incubating at 37°C in the thermal cycler for 10 min.

    15. Place the tube on the magnetic stand until it becomes clear (it takes ~2 min) (Figure 2). Transfer 16 µL of the supernatant to a new PCR tube.
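The Fe(II) dilution and the resulting reaction volume in this section can be checked with a few lines of arithmetic. The sketch below is only a worked calculation from the volumes stated in the steps above. The function names are hypothetical, and the final Fe(II) concentration is a derived value, not a figure quoted from the kit manual.

```python
def diluted_concentration_mm(stock_mm, stock_ul, diluent_ul):
    """Concentration (mM) after mixing stock_ul of stock with diluent_ul of diluent."""
    return stock_mm * stock_ul / (stock_ul + diluent_ul)


def final_concentration_um(added_mm, added_ul, total_ul):
    """Concentration (µM) of a component once it is part of a total_ul reaction."""
    return added_mm * 1000.0 * added_ul / total_ul


# Step 3: 1 µL of 500 mM Fe(II) Solution in 1,249 µL of NFW (a 1:1,250 dilution).
diluted_mm = diluted_concentration_mm(500.0, 1.0, 1249.0)

# Reaction volume after Step 4: 28 µL of eluate from the previous section,
# the Step 2 reagents (10 + 1 + 1 + 1 + 4 µL), and 5 µL of diluted Fe(II) Solution.
reaction_ul = 28 + 10 + 1 + 1 + 1 + 4 + 5

final_um = final_concentration_um(diluted_mm, 5.0, reaction_ul)
print(f"diluted Fe(II): {diluted_mm:.2f} mM")
print(f"reaction volume: {reaction_ul} µL")
print(f"final Fe(II): {final_um:.0f} µM")
```

With the protocol's volumes, the diluted solution is 0.4 mM, the oxidation reaction is 50 µL, and the Fe(II) concentration in the reaction works out to 40 µM.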


  5. Denaturation of DNA

    1. Set up the program of Step 3 in a thermal cycler.

    2. Add 4 µL of formamide to the sample. Mix by pipetting.

    3. Incubate at 85°C in a thermal cycler with the heated lid set to 95°C for 10 min, then place on ice immediately.


  6. Deamination of Cytosines

    1. Set up the programs of Steps 3 and 10 in a thermal cycler.

    2. Add 68 µL of NFW, 10 µL of APOBEC Reaction Buffer, 1 µL of BSA, and 1 µL of APOBEC from the EMseq kit to the sample. Mix by pipetting.

    3. Incubate at 37°C for 3 h, then hold at 4°C, in a thermal cycler with the heated lid set to 45°C.

    4. Add 100 µL of NEBNext Sample Purification Beads from the EMseq kit to the sample. Mix by pipetting. Incubate for 5 min at RT.

    5. Place the tube on the magnetic stand for 0.2 mL tubes until it becomes clear (it takes ~2 min) (Figure 2). Remove and discard the supernatant.

    6. Add 200 µL of 80% ethanol to the tube. After 30 s, remove and discard the supernatant.

    7. Repeat Step 6 once.

    8. Remove the tube from the magnetic stand and spin down on a tabletop centrifuge. Place the tube on the magnetic stand. Remove and discard the remaining supernatant completely.

    9. Air dry the pellet for 1 min.

    10. Remove the tube from the magnetic stand. Elute the target DNA from the beads by adding 41 µL of NFW and incubating at 37°C in a thermal cycler with the heated lid set to 45°C for 10 min (Note 3).

    11. Place the tube on the magnetic stand until it becomes clear (it takes ~2 min) (Figure 2). Transfer 20 µL of the supernatant to each of two PCR tubes (each tube contains 20 µL of the supernatant).


  7. PCR Amplification

    1. Set up the PCR program of Step 3 in a thermal cycler.

    2. Add 5 µL of the custom primer mix (10 µM each, described in Table 2), and 25 µL of KOD One PCR Master Mix to each tube. Mix by pipetting.

    3. Perform PCR amplification of both tubes using the following PCR program (Table 1): 13–16 cycles of 94°C for 15 s, 57°C for 5 s, and 68°C for 15 min, then hold at 4°C. The number of PCR cycles depends on the amount of input DNA (16 cycles for 10 ng DNA input, 13 cycles for 50 ng DNA input) and the quality of DNA.


      Table 1. PCR program

      Temperature    Time      Cycles
      94°C           15 s      13–16
      57°C           5 s       13–16
      68°C           15 min    13–16
      4°C            Hold      1


    4. Combine the separately amplified samples into one tube. Purify the sample by using a purification column of DNA Clean & Concentrator-5, according to the manufacturer’s instructions. Elute the DNA from the column by adding 52 µL of NFW, pre-incubated at 70°C. Repeat the elution step by adding the 52 µL of flowthrough back to the column. The quality of the purified DNA is measured using the Agilent DNA 12000 kit (Figure 3A).


      Table 2. Custom primer sequences

      Primer            Sequence
      Forward primer    CAAGCAGAAGACGGCATACGAGATCGAGTAATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
      Reverse primer    AATGATACGGCGACCACCGAGATCTACACTATAGCCTACACTCTTTCCCTACACGACGCTCTTCCGATCT


  8. Size Selection

    1. Set up the program of Step 8 in a thermal cycler.

    2. Add 41–45 µL of ProNex Chemistry (0.82–0.9×) to 50 µL of the DNA. Incubate at RT for 10 min.

    3. Place the tube on a magnetic stand for 0.2 mL tubes until it becomes clear (it takes ~2 min) (Figure 2). Remove and discard the supernatant.

    4. Add 200 µL of wash buffer to the tube. After 30 s, remove and discard the supernatant.

    5. Repeat Step 4 once.

    6. Remove the tube from the magnetic stand and spin down on a tabletop centrifuge. Place the tube on the magnetic stand. Remove and discard the remaining supernatant completely.

    7. Air dry the pellet for 1 min.

    8. Remove the tube from the magnetic stand. Elute the target DNA from the beads by adding 51 µL of NFW and incubating at 37°C in the thermal cycler with the heated lid set to 45°C for 10 min.

    9. Place the tube on a magnetic stand until it becomes clear (it takes ~2 min) (Figure 2). Transfer 50 µL of the supernatant to a new PCR tube. Assess the quality and quantity of the purified DNA using the Agilent DNA 12000 kit (Note 2) and the Qubit dsDNA HS Assay kit (Figure 3B) (Note 4).



    Figure 3. Amplicon of base-converted DNA.

    (A and B) The amplicon distribution of base-converted DNA (A) before and (B) after size selection. DNA was quantified with the Agilent DNA 12000 kit. The reaction was performed with 50 ng of fragmented DNA from the breast cancer cell line BT-474 (Lasfargues et al., 1978). The amplification was performed with 13 cycles of polymerase chain reaction, using the KOD One PCR Master Mix and the primers described in Table 2. Size selection of the amplified DNA was performed using a 0.82× volume of ProNex Chemistry.


  9. Library Preparation for Nanopore Sequencing

    1. Set up the programs of Steps 3 and 22 in a thermal cycler.

    2. Combine 48 µL of the sample, 3.5 µL of NEBNext FFPE DNA Repair Buffer, 3.5 µL of Ultra II End-Prep Reaction Buffer, 2 µL of NEBNext FFPE DNA Repair Mix, and 3 µL of Ultra II End-Prep Enzyme Mix in a new PCR tube. Mix by flicking the tube and spin down on a tabletop centrifuge.

    3. Incubate at 20°C for 5 min, then 65°C for 5 min, in a thermal cycler with the heated lid set to 75°C.

    4. Add 60 µL of AMPure XP beads to the sample. Mix by flicking the tube and spin down on a tabletop centrifuge.

    5. Incubate at RT for 5 min.

    6. Place the tube on a magnetic stand for 0.2 mL tubes until it becomes clear (it takes ~2 min) (Figure 2). Remove and discard the supernatant.

    7. Add 200 µL of 70% ethanol to the tube. After 30 s, remove and discard the supernatant.

    8. Repeat Step 7 once.

    9. Remove the tube from the magnetic stand and spin down on a tabletop centrifuge. Place the tube on the magnetic stand. Remove and discard the remaining supernatant completely.

    10. Air dry the pellet for 1 min.

    11. Remove the tube from the magnetic stand. Elute the DNA from the beads by adding 61 µL of NFW and incubating at 37°C in a thermal cycler with the heated lid set to 45°C for 10 min.

    12. Place the tube on the magnetic stand until it becomes clear (it takes ~2 min) (Figure 2). Transfer 61 µL of the supernatant to a new PCR tube. Use 1 µL of the sample for quantification with the Qubit dsDNA HS Assay kit.

    13. Add 25 µL of ligation buffer, 10 µL of NEBNext Quick T4 DNA Ligase, and 5 µL of Adapter Mix F to the sample. Mix by flicking the tube and spin down on a tabletop centrifuge.

    14. Incubate at RT for 10 min.

    15. Add 40 µL of AMPure XP beads to the sample. Mix by flicking the tube and spin down on a tabletop centrifuge.

    16. Incubate at RT for 5 min.

    17. Place the tube on a magnetic stand for 0.2 mL tubes until it becomes clear (it takes ~2 min) (Figure 2). Remove and discard the supernatant.

    18. Remove the tube from the magnetic stand. Wash the beads by adding 250 µL of long fragment buffer to the tube. After flicking the beads to resuspend, return the sample to the magnetic stand. Once the solution is clear, remove and discard the supernatant.

    19. Repeat Step 18 once.

    20. Remove the tube from the magnetic stand and spin down on a tabletop centrifuge. Place the tube on the magnetic stand. Remove and discard the remaining supernatant completely.

    21. Air dry the pellet for 30 s.

    22. Remove the tube from the magnetic stand. Elute the DNA from the beads by adding 25 µL of elution buffer from the Ligation Sequencing kit and incubating at 37°C in a thermal cycler with the heated lid set to 45°C for 10 min.

    23. Place the tube on a magnetic stand until it becomes clear (it takes ~2 min) (Figure 2). Transfer 25 µL of the supernatant to a new PCR tube. Use 1 µL of the sample for quantification with the Qubit dsDNA HS Assay kit. Estimate the molarity of the prepared library by multiplying its mass concentration by the ratio of molar concentration to mass concentration measured for the DNA before library preparation. Apply 5–50 fmol of the eluted library to the next step (Note 5). If the 24 µL of library contains more than 50 fmol of DNA, dilute 5–50 fmol of the library to 24 µL with elution buffer from the Ligation Sequencing kit.
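The molarity estimate and the dilution in the last step can be written out as a short calculation. This is a minimal sketch under one assumption: the DNA before library preparation was measured as both a mass concentration (ng/µL) and a molar concentration (nmol/L), and their ratio is carried over to the library. The function names and the example readings are hypothetical. Note that 1 nmol/L equals 1 fmol/µL.

```python
def library_molarity_fmol_per_ul(lib_ng_per_ul, pre_ng_per_ul, pre_nmol_per_l):
    """Estimate library molarity in fmol/µL (numerically equal to nmol/L).

    The molar-to-mass ratio measured before library preparation is assumed
    to hold for the adapter-ligated library as well.
    """
    if pre_ng_per_ul <= 0:
        raise ValueError("pre-library mass concentration must be positive")
    return lib_ng_per_ul * pre_nmol_per_l / pre_ng_per_ul


def loading_volumes(fmol_per_ul, target_fmol, final_ul=24.0):
    """Return (library µL, elution buffer µL) giving target_fmol in final_ul."""
    if not 5.0 <= target_fmol <= 50.0:
        raise ValueError("the protocol loads 5-50 fmol of library")
    lib_ul = target_fmol / fmol_per_ul
    if lib_ul > final_ul:
        raise ValueError("library is too dilute to reach the target amount")
    return lib_ul, final_ul - lib_ul


# Hypothetical readings: 20 ng/µL and 6 nmol/L before library preparation,
# 10 ng/µL by Qubit after library preparation.
molarity = library_molarity_fmol_per_ul(10.0, 20.0, 6.0)
lib_ul, eb_ul = loading_volumes(molarity, target_fmol=30.0)
print(f"library: {molarity:.1f} fmol/µL; load {lib_ul:.1f} µL library + {eb_ul:.1f} µL elution buffer")
```

With these example readings, the library is about 3 fmol/µL, so 30 fmol corresponds to 10 µL of library brought to 24 µL with 14 µL of elution buffer.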


  10. Priming and Loading the PromethION Flowcell

    1. Add 30 µL of Flush Tether (FLT) to one tube of Flush Buffer (FB). Mix by vortexing.

    2. Insert the flowcell into the PromethION sequencer. Remove air from the inlet port of the flowcell by pipetting, to avoid the introduction of air bubbles.

    3. Prime the flowcell with 500 µL of FB/FLT mix. After incubation for 5 min, re-prime with 500 µL of FB/FLT mix.

    4. Add 75 µL of Sequencing Buffer II and 51 µL of re-suspended Loading Beads II to the library.

    5. Mix by gently pipetting. Immediately load the 150 µL of library to the flowcell and run the program. A video for priming and loading the flow cell of PromethION is available on the community site of Oxford Nanopore Technologies (https://community.nanoporetech.com/protocols/genomic-dna-by-ligation-sqk-lsk110/v/gde_9108_v110_revl_10nov2020/priming-and-loading-the-flow-cell?devices=promethion).

Data analysis

Real-time basecalling by Guppy, the basecaller integrated into the MinKNOW software for the PromethION sequencer, generates two fastq files: one containing the 1d pass reads and one containing the 1d fail reads. A fastq file stores the base sequence of each read together with a quality value for each base. We recommend using the fastq file of 1d pass reads, which passed the base quality filter, for data analysis.

After base conversion by bisulfite or EMseq, a read from the original strand consists of A, G, T (original T and unmethylated C), and C (methylated C). A read from the complementary strand generated by PCR consists of A (complement of original T and unmethylated C), G (complement of methylated C), T, and C. These reads are therefore difficult to align directly to an unconverted reference sequence. To map nanoEM data to the reference genome, we adapted the three-letter alignment approach, which is also used by Bismark (Krueger and Andrews, 2011), to long-read alignment. In this approach, two versions of each read are prepared computationally, one with all Cs converted to Ts and one with all Gs converted to As, along with two versions of the reference genome converted in the same way. After the converted reads are aligned to the converted references, the combination with the best alignment score for each read indicates whether the read derives from the original or the complementary strand. The methylation status of each C is then determined by comparing the original read sequence with the original reference sequence.

A flow chart of the data analysis is shown in Figure 4. All scripts and their documentation are available in a GitHub repository: https://github.com/yos-sk/nanoEM. The software used in this protocol can be installed with the conda command of miniconda (https://docs.conda.io/en/latest/miniconda.html) or anaconda (https://www.anaconda.com/products/individual).
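The three-letter idea can be illustrated on a toy sequence. The sketch below is not taken from the nanoEM repository; the sequences, function names, and methylation pattern are invented for illustration. It shows that a base-converted read no longer matches the reference directly, matches it after both are C-to-T converted, and that the methylation calls are recovered from the unconverted sequences.

```python
def c_to_t(seq):
    """In-silico C-to-T conversion (three-letter alphabet A, G, T)."""
    return seq.upper().replace("C", "T")


def g_to_a(seq):
    """In-silico G-to-A conversion, used for complementary-strand reads."""
    return seq.upper().replace("G", "A")


def mismatches(a, b):
    """Number of differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


# Toy reference with cytosines at positions 1, 5, and 9 (0-based).
reference = "ACGTTCGGAC"
# Toy original-strand read after enzymatic conversion: the C at position 1
# was methylated (kept as C); the Cs at positions 5 and 9 were read as T.
read = "ACGTTTGGAT"

direct = mismatches(read, reference)
three_letter = mismatches(c_to_t(read), c_to_t(reference))

# Methylation call at each reference C, taken from the unconverted read:
# C in the read means methylated, T means unmethylated.
calls = {i: read[i] == "C" for i, base in enumerate(reference) if base == "C"}

print("mismatches without conversion:", direct)
print("mismatches after C-to-T conversion:", three_letter)
print("methylation calls:", calls)
```

The same logic applies to complementary-strand reads with `g_to_a`, where a G in the read marks a methylated position and an A marks an unmethylated one.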



Figure 4. Flow chart of data analysis.

  1. Convert bases of reference genome

    From a fasta file (ref.fa) of the reference genome (such as human genome hg38), generate a fasta file (output.fa) for a modified reference genome, representing the reference genome with the Cs converted to Ts, and with the Gs converted to As on the reverse strand.


    $ python src/convert_ref.py ref.fa > output.fa


  2. Convert bases of nanoEM data

    From the compressed fastq file of nanoEM reads (1d_pass.fq.gz), generate two modified fastq files: one with the Cs converted to Ts (1d_pass_CT.fq.gz), the other with Gs converted to As (1d_pass_GA.fq.gz).


    $ python src/convert_reads.py 1d_pass.fq.gz
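For readers who want to see what this conversion involves, a minimal sketch follows. It is an illustration of the idea, not the repository's `convert_reads.py`, and it assumes standard four-line fastq records. The function name `convert_fastq` and its arguments are hypothetical.

```python
import gzip


def convert_fastq(in_path, ct_path, ga_path):
    """Write C-to-T and G-to-A converted copies of a gzipped fastq file.

    Only the sequence line of each four-line record is changed; read names
    and quality strings are copied unchanged. Returns the number of reads.
    """
    n_reads = 0
    with gzip.open(in_path, "rt") as fin, \
            gzip.open(ct_path, "wt") as f_ct, \
            gzip.open(ga_path, "wt") as f_ga:
        for i, line in enumerate(fin):
            if i % 4 == 1:  # sequence line of the record
                seq = line.rstrip("\n").upper()
                f_ct.write(seq.replace("C", "T") + "\n")
                f_ga.write(seq.replace("G", "A") + "\n")
                n_reads += 1
            else:  # header, separator, or quality line
                f_ct.write(line)
                f_ga.write(line)
    return n_reads
```

A call such as `convert_fastq("1d_pass.fq.gz", "1d_pass_CT.fq.gz", "1d_pass_GA.fq.gz")` would produce the two files named in this step. The unconverted reads must be kept, because methylation is later called from the original sequences.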


  3. Map the converted nanoEM reads to the converted reference genome

    Map the converted nanoEM reads (1d_pass_CT.fq.gz and 1d_pass_GA.fq.gz) to the converted reference genome (output.fa) using minimap2 with the “map-ont” option. This generates two bam files (1.sorted.bam and 2.sorted.bam) and their index files (1.sorted.bam.bai and 2.sorted.bam.bai).


    $ minimap2 -t 8 --split-prefix temp_sam1 -ax map-ont output.fa 1d_pass_CT.fq.gz --eqx | samtools view -b | samtools sort -@ 8 -o 1.sorted.bam

    $ samtools index 1.sorted.bam

    $ minimap2 -t 8 --split-prefix temp_sam2 -ax map-ont output.fa 1d_pass_GA.fq.gz --eqx | samtools view -b | samtools sort -@ 8 -o 2.sorted.bam

    $ samtools index 2.sorted.bam


  4. Choose the best alignments

    From the alignment results (1.sorted.bam and 2.sorted.bam), select the most appropriate alignment combination by the alignment score. Then, two bam files (output_CT.sorted.bam and output_GA.sorted.bam) and two corresponding index files (output_CT.sorted.bam.bai and output_GA.sorted.bam.bai) are generated.


    $ python src/best_align.py --bam1 1.sorted.bam --bam2 2.sorted.bam --fastq nanoEM_read.fq.gz

    $ samtools view -b output_CT.sam | samtools sort -o output_CT.sorted.bam

    $ samtools view -b output_GA.sam | samtools sort -o output_GA.sorted.bam

    $ rm output_*.sam

    $ samtools index output_CT.sorted.bam

    $ samtools index output_GA.sorted.bam
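The selection logic of this step can be sketched independently of any BAM library. The example below assumes the best alignment score per read has already been collected from each bam file, for instance from the `AS` tag that minimap2 writes for each alignment. The function name, the input format, and the handling of ties are assumptions, and `src/best_align.py` may resolve these cases differently.

```python
def choose_best(ct_scores, ga_scores):
    """Assign each read to the conversion whose alignment scored higher.

    ct_scores, ga_scores: dicts mapping read name to the best alignment
    score found in the C-to-T and G-to-A bam file, respectively. A read
    missing from one dict was not aligned in that conversion.
    Returns a dict mapping read name to "CT", "GA", or "tie".
    """
    assignment = {}
    for name in set(ct_scores) | set(ga_scores):
        ct = ct_scores.get(name)
        ga = ga_scores.get(name)
        if ga is None or (ct is not None and ct > ga):
            assignment[name] = "CT"
        elif ct is None or ga > ct:
            assignment[name] = "GA"
        else:
            # Equal scores: the strand of origin is ambiguous (assumed policy).
            assignment[name] = "tie"
    return assignment


# Hypothetical scores for four reads.
ct_scores = {"read1": 8200, "read2": 1500, "read4": 3000}
ga_scores = {"read1": 2100, "read2": 7900, "read3": 5100, "read4": 3000}
result = choose_best(ct_scores, ga_scores)
for name in sorted(result):
    print(name, result[name])
```

Reads assigned to "CT" are treated as original-strand reads and those assigned to "GA" as complementary-strand reads. How ambiguous reads are handled is a design choice. Discarding them is the conservative option.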


  5. Call methylation

    Using the sambamba mpileup command, detect the methylation frequencies of the cytosines in the CpG sites of the reference genome (ref.fa). After processing by a python script (src/call_methylation.py), a tsv file of methylation frequency (frequency_methylation.tsv) is generated.


    $ sambamba mpileup output_CT.sorted.bam -L cpg_sites.bed -o pileup_CT.tsv -t 8 --samtools -f ref.fa

    $ sambamba mpileup output_GA.sorted.bam -L cpg_sites.bed -o pileup_GA.tsv -t 8 --samtools -f ref.fa

    $ python src/call_methylation.py pileup_CT.tsv pileup_GA.tsv > frequency_methylation.tsv
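The per-site calculation behind a methylation frequency can be sketched as follows. This is an illustrative function, not the repository's `call_methylation.py`. It assumes the read bases covering one reference cytosine have already been extracted from the pileup, and that bases other than the methylated or unmethylated signal are sequencing errors to be ignored. The function name and example bases are hypothetical.

```python
from collections import Counter


def methylation_frequency(bases, conversion):
    """Return (frequency, informative depth) for one cytosine site.

    bases: read bases piled up at the site, e.g. "CCTCT".
    conversion: "CT" for original-strand alignments, where C is methylated
    and T is unmethylated, or "GA" for complementary-strand alignments,
    where G is methylated and A is unmethylated.
    """
    if conversion == "CT":
        meth_base, unmeth_base = "C", "T"
    elif conversion == "GA":
        meth_base, unmeth_base = "G", "A"
    else:
        raise ValueError("conversion must be 'CT' or 'GA'")
    counts = Counter(b.upper() for b in bases)
    meth = counts[meth_base]
    unmeth = counts[unmeth_base]
    depth = meth + unmeth
    # No informative reads: frequency is undefined rather than zero.
    frequency = meth / depth if depth else None
    return frequency, depth


# Hypothetical pileup at one CpG cytosine: five C, two T, one erroneous A.
freq, depth = methylation_frequency("CCTCTCCA", "CT")
print(f"methylation frequency: {freq:.2f} from {depth} informative reads")
```

Reporting the informative depth alongside the frequency makes it possible to filter low-coverage sites downstream.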


  6. Visualize in bisulfite mode of IGV

    To visualize the reads in the bisulfite mode of IGV, correct the sequences of the G-to-A-converted reads (output_GA.sorted.bam) to those of the complementary strand, and merge the resulting file (output_GA_vis.sorted.bam) with the bam file of the C-to-T-converted reads (output_CT.sorted.bam). After sorting, the merged bam file (output_merge.sorted.bam) can be visualized in the bisulfite mode of IGV. The bisulfite mode option can be activated from the right-click pop-up menu. Select “Color alignments by”, “bisulfite mode”, then “CG”. Visualization of a typical nanoEM result is shown in Figure 5.


    $ python script/vis_GA_utilities.py -b output_GA.sorted.bam | samtools view -b | samtools sort -@ 4 -o output_GA_vis.sorted.bam

    $ samtools index output_GA_vis.sorted.bam

    $ samtools merge output_merge.bam output_CT.sorted.bam output_GA_vis.sorted.bam

    $ samtools sort -@ 4 -o output_merge.sorted.bam output_merge.bam

    $ samtools index output_merge.sorted.bam



Figure 5. Visualization of nanoEM reads.

Visualization of a representative nanoEM result in bisulfite mode of IGV, in the region surrounding the promoter of the gene PGR. Methylated and unmethylated CpGs are shown in red and blue, respectively. The annotations of CpG islands were obtained from the UCSC table browser (Karolchik et al., 2004).

Notes

  1. We have successfully used this protocol with mammalian cell lines and human clinical specimens of lung and breast.

  2. After filling the DNA chip with the pre-filtered Gel-Dye mix using the Chip Priming Station, an accessory of the 2100 Bioanalyzer Instrument, add 9 µL of the Gel-Dye mix and 5 µL of the Marker to the wells of the chip following the manufacturer’s instructions. Add 1 µL of the Ladder to the ladder well and 1 µL of each sample to the sample wells. After vortexing for 1 min, place the chip in the 2100 Bioanalyzer Instrument and start the measurement.

  3. In this step, DO NOT use Elution Buffer from the EMseq kit, as it is detrimental to the subsequent PCR reaction.

  4. We recommend using at least 200 ng of DNA for the subsequent library preparation of nanopore sequencing.

  5. When the amount of library loaded is too high or too low, the yield of the sequencing data will be reduced.

Recipes

  1. TET2 Reaction Buffer with supplement

    Add 100 μL of TET2 Reaction Buffer to a tube of TET2 Reaction Buffer Supplement and mix by vortexing. The TET2 Reaction Buffer with supplement can be stored at -20°C for 4 months.

  2. 70% and 80% (v/v) ethanol

    Mix ethanol and NFW. Prepare fresh at the time of use.

  3. Wash Buffer of ProNex Size-Selective DNA Purification System (NG2001)

    Add 75 mL of ethanol to a bottle of Wash Buffer.

Acknowledgments

This protocol is based on our previous publication (Sakamoto et al., 2021). We are supported by JSPS KAKENHI [JP21K15074, JP19K16108]; MEXT KAKENHI [JP16H06279 (PAGS), JP17H06306, JP20H05906]; JSPS Fujita Memorial Fund for Medical Research; National Cancer Center Research and Development Fund (29-A-6).

Competing interests

There are no conflicts of interest or competing interests.

References

  1. Akbari, V., Garant, J. M., O'Neill, K., Pandoh, P., Moore, R., Marra, M. A., Hirst, M. and Jones, S. J. M. (2021). Megabase-scale methylation phasing using nanopore long reads and NanoMethPhase. Genome Biol 22(1): 68.
  2. Goerner-Potvin, P. and Bourque, G. (2018). Computational tools to unmask transposable elements. Nat Rev Genet 19(11): 688-704.
  3. Greenberg, M. V. C. and Bourc'his, D. (2019). The diverse roles of DNA methylation in mammalian development and disease. Nat Rev Mol Cell Biol 20(10): 590-607.
  4. Karolchik, D., Hinrichs, A. S., Furey, T. S., Roskin, K. M., Sugnet, C. W., Haussler, D. and Kent, W. J. (2004). The UCSC Table Browser data retrieval tool. Nucleic Acids Res 32(Database issue): D493-496.
  5. Krueger, F. and Andrews, S. R. (2011). Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27: 1571-1572.
  6. Lasfargues, E. Y., Coutinho, W. G. and Redfield, E. S. (1978). Isolation of two human tumor epithelial cell lines from solid breast carcinomas. J Natl Cancer Inst 61(4): 967-978.
  7. Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34(18): 3094-3100.
  8. Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R. and 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16): 2078-2079.
  9. Lister, R., Pelizzola, M., Dowen, R. H., Hawkins, R. D., Hon, G., Tonti-Filippini, J., Nery, J. R., Lee, L., Ye, Z. and Ngo, Q. M. (2009). Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462: 315-322.
  10. Liu, Y., Cheng, J., Siejka-Zielinska, P., Weldon, C., Roberts, H., Lopopolo, M., Magri, A., D'Arienzo, V., Harris, J. M. and McKeating, J. A. (2020). Accurate targeted long-read DNA methylation and hydroxymethylation sequencing with TAPS. Genome Biol 21(1): 54.
  11. Olova, N., Krueger, F., Andrews, S., Oxley, D., Berrens, R. V., Branco, M. R. and Reik, W. (2018). Comparison of whole-genome bisulfite sequencing library preparation strategies identifies sources of biases affecting DNA methylation data. Genome Biol 19(1): 33.
  12. Rand, A. C., Jain, M., Eizenga, J. M., Musselman-Brown, A., Olsen, H. E., Akeson, M. and Paten, B. (2017). Mapping DNA methylation with high-throughput nanopore sequencing. Nat Methods 14: 411-413.
  13. Sakamoto, Y., Xu, L., Seki, M., Yokoyama, T. T., Kasahara, M., Kashima, Y., Ohashi, A., Shimada, Y., Motoi, N., Tsuchihara, K. and Kobayashi, S. S. (2020). Long-read sequencing for non-small-cell lung cancer genomes. Genome Res 30(9): 1243-1257.
  14. Sakamoto, Y., Zaha, S., Nagasawa, S., Miyake, S., Kojima, Y., Suzuki, A., Suzuki, Y. and Seki, M. (2021). Long-read whole-genome methylation patterning using enzymatic base conversion and nanopore sequencing. Nucleic Acids Res 49(14): e81.
  15. Simpson, J. T., Workman, R. E., Zuzarte, P. C., David, M., Dursi, L. J. and Timp, W. (2017). Detecting DNA cytosine methylation using nanopore sequencing. Nat Methods 14: 407-410.
  16. Sun, Z., Vaisvila, R., Hussong, L. M., Yan, B., Baum, C., Saleh, L., Samaranayake, M., Guan, S., Dai, N. and Correa, I. R. (2021). Nondestructive enzymatic deamination enables single-molecule long-read amplicon sequencing for the determination of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Genome Res 31: 291-300.
  17. Tarasov, A., Vilella, A. J., Cuppen, E., Nijman, I. J. and Prins, P. (2015). Sambamba: fast processing of NGS alignment formats. Bioinformatics 31(12): 2032-2034.
  18. Thorvaldsdóttir, H., Robinson, J. T. and Mesirov, J. P. (2013). Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14(2): 178-192.
  19. Vaisvila, R., Ponnaluri, V. K. C., Sun, Z., Langhorst, B. W., Saleh, L., Guan, S., Dai, N., Campbell, M. A., Sexton, B. S. and Marks, K. (2021). Enzymatic methyl sequencing detects DNA methylation at single-base resolution from picograms of DNA. Genome Res 31: 1280-1289.
  20. Yang, Y., Sebra, R., Pullman, B. S., Qiao, W., Peter, I., Desnick, R. J., Geyer, C. R., DeCoteau, J. F. and Scott, S. A. (2015). Quantitative and multiplexed DNA methylation analysis using long-read single-molecule real-time bisulfite sequencing (SMRT-BS). BMC Genomics 16: 350.

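Introduction

[Abstract] In recent years, DNA methylation research has been accelerated by the advent of nanopore sequencers. However, read length has been limited by the constraints of base conversion using the bisulfite method, making analysis of chromatin content difficult. The read length of the previous method combining bisulfite conversion and long-read sequencing was ~1.5 kb, even using targeted PCR. In this study, we improved read length (~5 kb) by converting unmethylated cytosines to uracils with APOBEC enzymes, which reduces DNA fragmentation. The converted DNA was then sequenced using a PromethION nanopore sequencer. We also developed a new analysis pipeline that accounts for base conversions, which are not present in conventional nanopore sequencing, as well as for errors produced by nanopore sequencing.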


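[Background] DNA methylation is an important mechanism for epigenetic regulation of gene expression (Greenberg and Bourc'his, 2019). It has a wide range of effects on genes via several biological processes. DNA methylation is usually detected and analyzed using bisulfite sequencing short reads. However, it is difficult to align these short reads (~150 bp) to some chromosomal regions, such as repetitive sequences and structural variants (Goerner-Potvin and Bourque, 2018). Similarly, short reads also constrain detection of chromosome-specific methylation patterns, such as imprinted regions, in polyploid organisms (Akbari et al., 2021). A comprehensive understanding of epigenetic regulation by DNA methylation will therefore require complementary methods.
The bisulfite method distinguishes between unmethylated and methylated cytosine (C vs. mC) by chemically converting unmethylated C to uracil (U), which becomes thymine (T) during subsequent amplification (Lister et al., 2009). However, since this reaction is carried out under chemically severe conditions, a large proportion of the DNA in the reaction is fragmented and degraded. The genomic regions that undergo this degradation show biased representation (Olova et al., 2018), further limiting the experimental conclusions this method can provide. The read length of the previous method combining bisulfite conversion and long-read sequencing was only ~1.5 kb, even using targeted PCR (Yang et al., 2015). Recently, enzymatic methyl sequencing (EMseq) was developed as an alternative to the bisulfite method of base conversion (Vaisvila et al., 2021). EMseq involves the oxidation of mC by ten-eleven translocation (TET) enzymes to protect them, followed by base conversion of unmethylated C to U by APOBEC enzymes. During amplification, U is converted to T, as in the bisulfite method. Because this reaction is performed under milder chemical conditions than the bisulfite method, longer DNA fragments are obtainable. In fact, a previous study showed that DNA fragments over 5 kb long can be obtained using EMseq and target-specific PCR, and that these fragments can be successfully sequenced on a long-read sequencer (Sun et al., 2021).
Nanopore sequencers read nucleic acid sequences by measuring changes in electric current as the nucleic acid passes through a nanopore. The maximum read length of nanopore sequencing exceeds 100 kb (Sakamoto et al., 2020). Base modifications can also be detected by recognizing the specific electrical patterns of modified bases (Rand et al., 2017; Simpson et al., 2017). However, although the base-calling accuracy of nanopore sequencers is currently as high as 90%, this is not sufficient to infer methylation patterns accurately (Sakamoto et al., 2020). In addition, this approach requires approximately 500 ng–1 µg of input DNA, which reduces its utility for scarce samples such as clinical specimens and biopsies. Although several methods combining base conversion and long-read sequencing have been developed, all have so far relied on gene-specific amplification (Yang et al., 2015; Liu et al., 2020; Sun et al., 2021). To date, neither a method for whole-genome methylation analysis using this approach nor a bioinformatics pipeline for processing the resulting sequence data has been developed.
Here, we report a whole-genome long-read methylation sequencing method that uses a relatively small amount of input DNA, in which DNA base-converted by APOBEC enzymes is sequenced on a nanopore sequencer (Figure 1) (Sakamoto et al., 2021). The method, which we named nanoEM, allows whole-genome long-read methylation analysis from 10–100 ng of DNA. In addition, we developed a data analysis pipeline for nanoEM reads by adapting the three-letter alignment approach to long-read alignment. nanoEM is a useful method for detecting the methylation status of structural variants (SVs), repetitive regions, and imprinted regions, which are difficult to analyze using short-read sequencing (Sakamoto et al., 2021).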


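Figure 1. Flowchart of the experimental procedure.

Keywords: DNA methylation, Epigenetics, Nanopore sequencing, Long-read sequencing, Next generation sequencing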



Materials and Reagents


1. Filter pipette tips, 10, 20, 200, and 1,000 µL [e.g., Pipette Tips RT UNV F (RAININ, catalog numbers: 30389172, 30389189, 30389186, and 30389165)]
2. 1.5 mL tubes [e.g., DNA LoBind Tube 1.5 mL (Eppendorf, catalog number: 0030108051)]
3. PCR tubes [e.g., TempAssure 0.2 mL PCR 8-Tube Strips, Att. Optical Caps (USA Scientific, catalog number: 1402-4700)]
4. MagAttract HMW DNA Kit (QIAGEN, catalog number: 67563)
5. g-TUBE (Covaris, catalog number: 520079)
6. Ethanol (e.g., FUJIFILM Wako Pure Chemical Corporation, catalog number: 057-00456)
7. Nuclease-free water (Thermo Fisher Scientific, catalog number: AM9930)
8. Formamide (FUJIFILM Wako Pure Chemical Corporation, catalog number: 064-00423)
9. NEBNext Enzymatic Methyl-seq Kit (New England Biolabs, catalog number: E7120S)
10. KOD One PCR Master Mix (TOYOBO, catalog number: KMM-101)
11. Ligation Sequencing Kit (Oxford Nanopore Technologies, catalog number: SQK-LSK110)
12. PromethION Flow Cell (Oxford Nanopore Technologies, catalog number: FLO-PRO002)
13. NEBNext Ultra II End Repair/dA-Tailing Module (New England Biolabs, catalog number: E7546)
14. NEBNext FFPE DNA Repair Mix (New England Biolabs, catalog number: M6630)
15. NEBNext Quick Ligation Module (New England Biolabs, catalog number: E6056)
16. Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, catalog number: Q32854)
17. Agilent DNA 12000 Kit (Agilent Technologies, catalog number: 5067-1508)
18. Agencourt AMPure XP (Beckman Coulter, catalog number: BC-A63880)
19. DNA Clean & Concentrator-5 (Zymo Research, catalog number: D4013)
20. ProNex Size-Selective DNA Purification System (Promega, catalog number: NG2001)
21. TET2 Reaction Buffer (see Recipes)
22. 70% and 80% (v/v) ethanol (see Recipes)
23. Wash Buffer of the ProNex Size-Selective DNA Purification System (NG2001) (see Recipes)


Equipment


1. PromethION sequencing device (Oxford Nanopore Technologies, catalog number: PRM48BasicSP)
2. 2100 Bioanalyzer Instrument (Agilent Technologies, catalog number: G2939BA)
3. Thermal cycler [e.g., T100 Thermal Cycler (Bio-Rad, catalog number: 1861096)]
4. Qubit 4 Fluorometer (Thermo Fisher Scientific, catalog number: Q33238)
5. Vortex mixer [e.g., Vortex-Genie 2 (Scientific Industries, catalog number: SI-0236)]
6. High-speed centrifuge (e.g., MDX-310 with rack for 2 mL × 24 tubes, TOMY SEIKO)
7. Benchtop centrifuge for 1.5 and 0.2 mL tubes [e.g., MyFuge Mini Centrifuge (Benchmark Scientific, catalog number: C1008-B)]
8. Magnetic stand for 1.5 and 2 mL tubes [e.g., DynaMag-2 (Thermo Fisher Scientific, catalog number: 12321D)]
9. Magnetic stand for 0.2 mL tubes [e.g., 10× Magnetic Separator (10× Genomics, catalog number: 120250)]
10. Pipettes for 10, 20, 200, and 1,000 µL tips
11. Racks for 0.2 mL PCR tubes and 1.5 mL tubes


Software


1. Python3 (version 3.8.6, https://www.python.org/downloads/)
2. pysam (version 0.17.0, https://github.com/pysam-developers/pysam)
3. minimap2 (version 2.22) (Li, 2018)
4. sambamba (version 0.7.1) (Tarasov et al., 2015)
5. samtools (version 1.9) (Li et al., 2009)
6. Integrative Genomics Viewer (IGV) (version 2.5.3) (Thorvaldsdóttir et al., 2013)


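Procedure


A. DNA extraction
Extract DNA from cultured cells (<2 × 10⁹ cells) and/or clinical specimens (<25 mg of tissue) using the MagAttract HMW DNA Kit according to the manufacturer's instructions, without modification (Note 1).


B. DNA fragmentation
To fragment genomic DNA, add 150 µL of DNA (<4 µg) diluted in nuclease-free water (NFW) to a g-TUBE and centrifuge twice at 4,700 × g for 1 min at room temperature (RT). Measure the concentration and length distribution of the fragmented DNA on a Bioanalyzer with the Agilent DNA 12000 Kit, following the manufacturer's protocol (Note 2). Use 10–50 ng of fragmented DNA for the next step, diluted with NFW to a volume of 50 µL.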


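C. End repair and adapter ligation
1. Set the programs for steps 3, 5, and 12 on the thermal cycler.
2. Combine 50 µL of fragmented DNA, 7 µL of NEBNext Ultra II End-Prep Reaction Buffer, and 3 µL of NEBNext Ultra II End-Prep Enzyme Mix in a PCR tube. Mix by pipetting.
3. Incubate at 20°C for 30 min and at 65°C for 30 min, then hold at 4°C in the thermal cycler with the heated lid set to 75°C.
4. Add 2.5 µL of NEBNext EMseq Adaptor, 1 µL of NEBNext Ligation Enhancer, and 30 µL of NEBNext Ultra II Ligation Master Mix to the sample. Mix by pipetting.
5. Incubate at 20°C for 15 min, then hold at 4°C in the thermal cycler with the heated lid off.
6. Add 110 µL of NEBNext Sample Purification Beads to the sample. Mix by pipetting. Incubate at RT for 5 min.
7. Place the tube on a magnetic stand for 0.2 mL tubes until the solution becomes clear (approximately 2 min) (Figure 2). Remove and discard the supernatant.
8. Add 200 µL of 80% ethanol to the tube. After 30 s, remove and discard the supernatant.
9. Repeat step 8 once.
10. Remove the tube from the magnetic stand and spin down in a benchtop centrifuge (approximately 2,000 × g at RT). Place the tube on the magnetic stand. Completely remove and discard the residual supernatant.
11. Air-dry the bead pellet for 1 min.
12. Remove the tube from the magnetic stand. Add 29 µL of Elution Buffer from the EMseq kit and incubate at 37°C for 10 min in the thermal cycler to elute the DNA from the beads.
13. Place the tube on the magnetic stand until the solution becomes clear (approximately 2 min) (Figure 2). Transfer 28 µL of the supernatant to a new PCR tube.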




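Figure 2. Separation of beads on the magnetic stand.
(A) Suspended beads before magnetic separation. (B) Insufficient separation of beads; the solution is slightly cloudy. (C) Sufficient separation of beads; the solution is clear.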


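D. Oxidation of 5mC and 5hmC
1. Set the programs for steps 5, 7, and 14 on the thermal cycler.
2. Add 10 µL of TET2 Reaction Buffer, 1 µL of Oxidation Supplement, 1 µL of DTT, 1 µL of Oxidation Enhancer, and 4 µL of TET2 from the EMseq kit to the sample. Mix by pipetting.
3. In a new 1.5 mL tube, dilute 1 µL of the 500 mM Fe(II) Solution from the EMseq kit in 1,249 µL of NFW.
4. Add 5 µL of the diluted Fe(II) solution to the sample. Mix by pipetting.
5. Incubate at 37°C for 1 h in the thermal cycler with the heated lid set to 45°C.
6. Add 1 µL of Stop Reagent to the sample. Mix by pipetting.
7. Incubate at 37°C for 30 min in the thermal cycler with the heated lid set to 45°C.
8. Add 90 µL of NEBNext Sample Purification Beads to the sample. Mix by pipetting. Incubate at RT for 5 min.
9. Place the tube on a magnetic stand for 0.2 mL tubes until the solution becomes clear (approximately 2 min) (Figure 2). Remove and discard the supernatant.
10. Add 200 µL of 80% ethanol to the tube. After 30 s, remove and discard the supernatant.
11. Repeat step 10 once.
12. Remove the tube from the magnetic stand and spin down in a benchtop centrifuge. Place the tube on the magnetic stand. Completely remove and discard the residual supernatant.
13. Air-dry the bead pellet for 1 min.
14. Remove the tube from the magnetic stand. Elute the target DNA from the beads by adding 17 µL of Elution Buffer from the EMseq kit and incubating at 37°C for 10 min in the thermal cycler.
15. Place the tube on the magnetic stand until the solution becomes clear (approximately 2 min) (Figure 2). Transfer 16 µL of the supernatant to a new PCR tube.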
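
The Fe(II) dilution in steps 3 and 4 can be checked with a few lines of arithmetic. This is only a sketch: the helper name `dilute` is illustrative, and the 50 µL reaction volume is an assumption obtained by summing the 28 µL eluate from section C with the volumes added in steps 2 and 4.

```python
def dilute(stock_conc, stock_vol_ul, diluent_vol_ul):
    """Return the concentration after mixing a stock with diluent.

    The result is in the same unit as stock_conc.
    """
    return stock_conc * stock_vol_ul / (stock_vol_ul + diluent_vol_ul)


# Step 3: 1 µL of 500 mM Fe(II) into 1,249 µL of NFW (a 1:1,250 dilution).
fe_diluted_mm = dilute(500.0, 1.0, 1249.0)

# Assumed reaction volume: 28 µL sample + 10 + 1 + 1 + 1 + 4 µL of reagents
# (step 2) + 5 µL of diluted Fe(II) (step 4).
reaction_volume_ul = 28 + 10 + 1 + 1 + 1 + 4 + 5

# Final Fe(II) concentration in the oxidation reaction, in µM.
fe_final_um = fe_diluted_mm * 1000 * 5 / reaction_volume_ul

print(f"Diluted Fe(II): {fe_diluted_mm:.2f} mM")
print(f"Reaction volume: {reaction_volume_ul} µL")
print(f"Fe(II) in reaction: {fe_final_um:.0f} µM")
```

Under these assumptions the working solution is 0.4 mM and the reaction contains about 40 µM Fe(II).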


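E. Denaturation of DNA
1. Set the program for step 3 on the thermal cycler.
2. Add 4 µL of formamide to the sample. Mix by pipetting.
3. Incubate at 85°C for 10 min in the thermal cycler with the heated lid set to 95°C, then immediately place on ice.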


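F. Deamination of cytosines
1. Set the programs for steps 3 and 10 on the thermal cycler.
2. Add 68 µL of NFW, 10 µL of APOBEC Reaction Buffer, 1 µL of BSA, and 1 µL of APOBEC from the EMseq kit to the sample. Mix by pipetting.
3. Incubate at 37°C for 3 h, then hold at 4°C in the thermal cycler with the heated lid set to 45°C.
4. Add 100 µL of NEBNext Sample Purification Beads from the EMseq kit to the sample. Mix by pipetting. Incubate at RT for 5 min.
5. Place the tube on a magnetic stand for 0.2 mL tubes until the solution becomes clear (approximately 2 min) (Figure 2). Remove and discard the supernatant.
6. Add 200 µL of 80% ethanol. After 30 s, remove and discard the supernatant.
7. Repeat step 6 once.
8. Remove the tube from the magnetic stand and spin down in a benchtop centrifuge. Place the tube on the magnetic stand. Completely remove and discard the residual supernatant.
9. Air-dry the bead pellet for 1 min.
10. Remove the tube from the magnetic stand. Add 41 µL of NFW and elute the target DNA from the beads at 37°C for 10 min in the thermal cycler with the heated lid set to 45°C (Note 3).
11. Place the tube on the magnetic stand until the solution becomes clear (approximately 2 min) (Figure 2). Transfer 20 µL of the supernatant to each of two PCR tubes.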


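G. PCR amplification
1. Set the PCR program for step 3 on the thermal cycler.
2. Add 5 µL of custom primer mix (10 µM each; sequences in Table 2) and 25 µL of KOD One PCR Master Mix to each tube. Mix by pipetting.
3. Amplify both tubes using the following PCR program (Table 1): 13–16 cycles of 94°C for 15 s, 57°C for 5 s, and 68°C for 15 min, followed by a hold at 4°C. The number of PCR cycles depends on the amount of input DNA (16 cycles for 10 ng of input DNA; 13 cycles for 50 ng) and on DNA quality.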


Table 1. PCR program
Temperature   Time     Cycles
94°C          15 s     13–16
57°C          5 s      13–16
68°C          15 min   13–16
4°C           Hold     1


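4. Combine the separately amplified samples into one tube. Purify the sample using a DNA Clean & Concentrator-5 column according to the manufacturer's instructions. Elute the DNA from the column by adding 52 µL of NFW pre-incubated at 70°C. Repeat the elution step by adding the 52 µL of flow-through back to the column. Assess the quality of the purified DNA using the Agilent DNA 12000 Kit (Figure 3A).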


Table 2. Custom primer sequences
Primer           Sequence
Forward primer   CAAGCAGAAGACGGCATACGAGATCGAGTAATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
Reverse primer   AATGATACGGCGACCACCGAGATCTACACTATAGCCTACACTCTTTCCCTACACGACGCTCTTCCGATCT


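H. Size selection
1. Set the program for step 8 on the thermal cycler.
2. Add 41–45 µL of ProNex Chemistry (0.82–0.9×) to 50 µL of DNA. Incubate at RT for 10 min.
3. Place the tube on a magnetic stand for 0.2 mL tubes until the solution becomes clear (approximately 2 min) (Figure 2). Remove and discard the supernatant.
4. Add 200 µL of Wash Buffer. After 30 s, remove and discard the supernatant.
5. Repeat step 4 once.
6. Remove the tube from the magnetic stand and spin down in a benchtop centrifuge. Place the tube on the magnetic stand. Completely remove and discard the residual supernatant.
7. Air-dry the bead pellet for 1 min.
8. Remove the tube from the magnetic stand. Elute the target DNA from the beads by adding 51 µL of NFW and incubating at 37°C for 10 min in the thermal cycler with the heated lid set to 45°C.
9. Place the tube on the magnetic stand until the solution becomes clear (approximately 2 min) (Figure 2). Transfer 50 µL of the supernatant to a new PCR tube. Assess the quality and quantity of the purified DNA using the Agilent DNA 12000 Kit (Note 2) and the Qubit dsDNA HS Assay Kit (Figure 3B) (Note 4).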
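
The bead volumes in step 2 follow directly from the bead-to-sample ratio. The sketch below, with an illustrative helper name, reproduces the 41–45 µL range for a 50 µL sample and can be reused if the sample volume differs.

```python
def pronex_volume_ul(sample_ul, ratio):
    """Volume of ProNex Chemistry to add for a given bead-to-sample ratio."""
    return sample_ul * ratio


sample_ul = 50.0
low_ul = pronex_volume_ul(sample_ul, 0.82)   # lower end of the range in step 2
high_ul = pronex_volume_ul(sample_ul, 0.9)   # upper end of the range in step 2

print(f"0.82x: {low_ul:.0f} µL, 0.9x: {high_ul:.0f} µL")
```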




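Figure 3. Amplicons of base-converted DNA.
(A and B) Size distribution of amplicons of base-converted DNA (A) before and (B) after size selection. DNA was analyzed using the Agilent DNA 12000 Kit. The reaction was performed with 50 ng of fragmented DNA from the breast cancer cell line BT-474 (Lasfargues et al., 1978). Amplification was carried out with 13 PCR cycles using KOD One PCR Master Mix and the primers listed in Table 2. Size selection of the amplified DNA was performed with a 0.82× volume of ProNex Chemistry.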


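I. Library preparation for nanopore sequencing
1. Set the programs for steps 3 and 22 on the thermal cycler.
2. Combine 48 µL of sample, 3.5 µL of NEBNext FFPE DNA Repair Buffer, 3.5 µL of Ultra II End-Prep Reaction Buffer, 2 µL of NEBNext FFPE DNA Repair Mix, and 3 µL of Ultra II End-Prep Enzyme Mix in a new PCR tube. Mix by flicking the tube and spin down in a benchtop centrifuge.
3. Incubate at 20°C for 5 min, then at 65°C for 5 min.
4. Add 60 µL of AMPure XP beads. Mix by flicking the tube and spin down in a benchtop centrifuge.
5. Incubate at RT for 5 min.
6. Place the tube on a magnetic stand for 0.2 mL tubes until the solution becomes clear (approximately 2 min) (Figure 2). Remove and discard the supernatant.
7. Add 200 µL of 70% ethanol to the tube. After 30 s, remove and discard the supernatant.
8. Repeat step 7 once.
9. Remove the tube from the magnetic stand and spin down in a benchtop centrifuge. Place the tube on the magnetic stand. Completely remove and discard the residual supernatant.
10. Air-dry the bead pellet for 1 min.
11. Remove the tube from the magnetic stand. Elute the DNA from the beads by adding 61 µL of NFW and incubating at 37°C for 10 min in the thermal cycler with the heated lid set to 45°C.
12. Place the tube on the magnetic stand until the solution becomes clear (approximately 2 min) (Figure 2). Transfer 61 µL of the supernatant to a new PCR tube. Use 1 µL of the sample for quantification with the Qubit dsDNA HS Assay Kit.
13. Add 25 µL of Ligation Buffer, 10 µL of NEBNext Quick T4 DNA Ligase, and 5 µL of Adapter Mix F. Mix by flicking the tube and spin down in a benchtop centrifuge.
14. Incubate at RT for 10 min.
15. Add 40 µL of AMPure XP beads to the sample. Mix by flicking the tube and spin down in a benchtop centrifuge.
16. Incubate at RT for 5 min.
17. Place the tube on a magnetic stand for 0.2 mL tubes until the solution becomes clear (approximately 2 min) (Figure 2). Remove and discard the supernatant.
18. Remove the tube from the magnetic stand. Wash the beads by adding 250 µL of Long Fragment Buffer to the tube. Flick the tube to resuspend the beads, then return the sample to the magnetic stand. Once the solution is clear, remove and discard the supernatant.
19. Repeat step 18 once.
20. Remove the tube from the magnetic stand and spin down in a benchtop centrifuge. Place the tube on the magnetic stand. Completely remove and discard the residual supernatant.
21. Air-dry the bead pellet for 30 s.
22. Remove the tube from the magnetic stand. Elute the DNA from the beads by adding 25 µL of Elution Buffer from the Ligation Sequencing Kit and incubating at 37°C for 10 min in the thermal cycler with the heated lid set to 45°C.
23. Place the tube on the magnetic stand until the solution becomes clear (approximately 2 min) (Figure 2). Transfer 25 µL of the supernatant to a new PCR tube. Use 1 µL of the sample for quantification with the Qubit dsDNA HS Assay Kit. Estimate the molar concentration of the prepared library by scaling its mass concentration with the molar and mass concentrations of the DNA measured before library preparation. Use 5–50 fmol of the eluted library for the next step (Note 5). If 24 µL of the library contains more than 50 fmol of DNA, dilute 5–50 fmol of the library to 24 µL with Elution Buffer from the Ligation Sequencing Kit.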
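
The molarity estimate in step 23 can be sketched as follows. The function names and all example concentrations are hypothetical. The sketch assumes that the pre-library measurement reports both ng/µL and nmol/L for the same sample, and that the fragment-size distribution is unchanged by library preparation.

```python
def library_fmol_per_ul(library_ng_per_ul, pre_ng_per_ul, pre_nmol_per_l):
    """Estimate library molarity from its mass concentration.

    The molar-to-mass ratio measured before library preparation is reused
    as the conversion factor. Note that 1 nmol/L equals 1 fmol/µL.
    """
    fmol_per_ng = pre_nmol_per_l / pre_ng_per_ul
    return library_ng_per_ul * fmol_per_ng


def volume_for_target_ul(target_fmol, conc_fmol_per_ul, max_volume_ul=24.0):
    """Library volume holding target_fmol, capped at the loading volume."""
    return min(max_volume_ul, target_fmol / conc_fmol_per_ul)


# Hypothetical measurements, for illustration only.
pre_ng_per_ul = 4.0        # mass concentration before library preparation
pre_nmol_per_l = 1.2       # molar concentration before library preparation
library_ng_per_ul = 10.0   # Qubit reading of the final library

conc = library_fmol_per_ul(library_ng_per_ul, pre_ng_per_ul, pre_nmol_per_l)
total_fmol_in_24_ul = conc * 24.0
volume_ul = volume_for_target_ul(50.0, conc)

print(f"Library: {conc:.2f} fmol/µL, {total_fmol_in_24_ul:.0f} fmol in 24 µL")
print(f"Use {volume_ul:.1f} µL of library and add Elution Buffer to 24 µL")
```

In this hypothetical case 24 µL would hold more than 50 fmol, so the library is diluted as described in step 23.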


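J. Priming and loading the PromethION flow cell
1. Add 30 µL of Flush Tether (FLT) to a tube of Flush Buffer (FB). Mix by vortexing.
2. Insert the flow cell into the PromethION sequencer. Remove air from the inlet port of the flow cell by pipetting, to avoid introducing bubbles.
3. Prime the flow cell with 500 µL of the FB/FLT mixture. After incubating for 5 min, prime again with 500 µL of the FB/FLT mixture.
4. Add 75 µL of Sequencing Buffer II and 51 µL of resuspended Loading Beads II to the library.
5. Mix by gentle pipetting. Immediately load 150 µL of the library onto the flow cell and start the run. A video on priming and loading the PromethION flow cell is available on the Oxford Nanopore Technologies community site ( https://community.nanoporetech.com/protocols/genomic-dna-by-ligation-sqk-lsk110/v/gde_9108_v110_revl_10nov2020/启动和加载流动池?devices=promethion ).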


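Data analysis


Real-time basecalling by Guppy, the basecaller integrated into the MinKNOW software of the PromethION sequencer, produces two fastq files containing either 1d pass reads or 1d fail reads. A fastq file contains the base sequence of each read and the quality of each base. For data analysis, we recommend using the fastq file of 1d pass reads, which have passed the base-quality filter. After base conversion by bisulfite or EMseq, DNA sequences consist of A, G, T (original T and unmethylated C), and C (methylated C) on the original strand, or of their complements on the complementary strand generated by PCR. It is therefore difficult to align these sequences to a normal reference sequence.
To map nanoEM data to the reference genome, we adapted the three-letter alignment approach, which is also used by Bismark (Krueger and Andrews, 2011), to long-read alignment. In the three-letter approach, two types of reads are prepared computationally, in which either all Cs are converted to Ts or all Gs are converted to As, together with two types of reference genome sequences converted in the same ways. After aligning the reads to the reference genomes, it is possible to determine whether each read originated from the original or the complementary strand, to select the best combination of converted read and converted reference for each read, and, by referring to the original sequences of the reads and the reference genome, to detect the methylation status of each C. A flowchart of the data analysis is shown in Figure 4. All information needed to run the analysis correctly, including bioinformatics scripts and explanations, is available in the GitHub repository at the following link: https://github.com/yos-sk/nanoEM. The software used in this protocol can be easily installed with the conda command of miniconda ( https://docs.conda.io/en/latest/miniconda.html ) or anaconda ( https://www.anaconda.com/products/individual ).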
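
The idea behind three-letter alignment can be illustrated with a toy example. This is a minimal sketch and not the pipeline's code: the sequences and helper names are made up, and the read is assumed to come from the original strand with no sequencing errors.

```python
def c_to_t(seq):
    """In silico C-to-T conversion (three-letter alphabet A, G, T)."""
    return seq.upper().replace("C", "T")


# Toy reference with cytosines at positions 1 and 5 (both CpG) and 9 (non-CpG).
reference = "ACGTTCGGACTA"

# Toy read after enzymatic conversion: the C at position 1 was methylated and
# stays C; the unmethylated Cs at positions 5 and 9 are read as T.
read = "ACGTTTGGATTA"

# The converted read no longer matches the reference directly...
direct_match = read == reference
# ...but read and reference become identical once both are C-to-T converted.
three_letter_match = c_to_t(read) == c_to_t(reference)

# After alignment, the ORIGINAL read and reference are compared at CpG sites:
# a retained C means methylated, a T means unmethylated.
cpg_calls = {
    i: read[i] == "C"
    for i in range(len(reference) - 1)
    if reference[i:i + 2] == "CG"
}

print(direct_match, three_letter_match, cpg_calls)
```

Reads from the complementary strand are handled in the same way with G-to-A conversion.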




Figure 4. Flowchart of the data analysis.


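1. Convert the bases of the reference genome
From the fasta file of a reference genome (ref.fa; e.g., human genome hg38), generate a fasta file of the modified reference genome (output.fa), which represents the reference genome with Cs converted to Ts and, for the reverse strand, Gs converted to As.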


$ python src/convert_ref.py ref.fa > output.fa
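
As a rough illustration of what such a conversion involves, the sketch below writes each contig twice, once C-to-T converted and once G-to-A converted. It is not the code of src/convert_ref.py: the `_CT` and `_GA` name suffixes and the function names are assumptions made for this example.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a fasta file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def convert_reference(handle):
    """Yield a C-to-T and a G-to-A converted copy of every contig."""
    for name, seq in read_fasta(handle):
        seq = seq.upper()
        yield f"{name}_CT", seq.replace("C", "T")  # hypothetical naming
        yield f"{name}_GA", seq.replace("G", "A")  # hypothetical naming


# Toy input standing in for ref.fa.
toy_fasta = io.StringIO(">chrToy\nACGTTCGG\nACTA\n")
records = list(convert_reference(toy_fasta))
for name, seq in records:
    print(f">{name}\n{seq}")
```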


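2. Convert the bases of the nanoEM data
From the compressed fastq file of nanoEM reads (1d_pass.fq.gz), generate two modified fastq files: one with Cs converted to Ts (1d_pass_CT.fq.gz) and the other with Gs converted to As (1d_pass_GA.fq.gz).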


$ python src/convert_reads.py 1d_pass.fq.gz
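
A read conversion of this kind might look like the sketch below. It is not the code of src/convert_reads.py: the function name and toy file names are illustrative, and it assumes standard four-line fastq records. Only the sequence line is changed, so read names and base qualities are preserved.

```python
import gzip
import os
import tempfile


def convert_fastq(in_path, out_path, old, new):
    """Copy a gzipped fastq file, replacing base `old` with `new` in sequences."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for i, line in enumerate(src):
            if i % 4 == 1:  # second line of each four-line record
                line = line.upper().replace(old, new)
            dst.write(line)


# Build a toy input file so that the example is self-contained.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "toy.fq.gz")
with gzip.open(in_path, "wt") as fh:
    fh.write("@read1\nACGTTTGG\n+\nIIIIIIII\n")

ct_path = os.path.join(workdir, "toy_CT.fq.gz")
ga_path = os.path.join(workdir, "toy_GA.fq.gz")
convert_fastq(in_path, ct_path, "C", "T")
convert_fastq(in_path, ga_path, "G", "A")

with gzip.open(ct_path, "rt") as fh:
    ct_lines = fh.read().splitlines()
with gzip.open(ga_path, "rt") as fh:
    ga_lines = fh.read().splitlines()

print(ct_lines[1], ga_lines[1])
```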


3. Map the nanoEM reads (1d_pass_CT.fq.gz and 1d_pass_GA.fq.gz) to the converted reference genome (output.fa)
Map the processed nanoEM reads to the processed reference genome using minimap2 with the "map-ont" option. Two bam files (1.sorted.bam and 2.sorted.bam) and their respective index files (1.sorted.bam.bai and 2.sorted.bam.bai) are generated.


$ minimap2 -t 8 --split-prefix temp_sam1 -ax map-ont output.fa 1d_pass_CT.fq.gz --eqx | samtools view -b | samtools sort -@ 8 -o 1.sorted.bam
$ samtools index 1.sorted.bam
$ minimap2 -t 8 --split-prefix temp_sam2 -ax map-ont output.fa 1d_pass_GA.fq.gz --eqx | samtools view -b | samtools sort -@ 8 -o 2.sorted.bam
$ samtools index 2.sorted.bam


4. Select the best alignments
From the alignment results (1.sorted.bam and 2.sorted.bam), select the most appropriate alignment combination for each read based on the alignment score. Two bam files (output_CT.sorted.bam and output_GA.sorted.bam) and their index files (output_CT.sorted.bam.bai and output_GA.sorted.bam.bai) are then generated.


$ python src/best_align.py --bam1 1.sorted.bam --bam2 2.sorted.bam --fastq nanoEM_read.fq.gz
$ samtools view -b output_CT.sam | samtools sort -o output_CT.sorted.bam
$ samtools view -b output_GA.sam | samtools sort -o output_GA.sorted.bam
$ rm output_*.sam
$ samtools index output_CT.sorted.bam
$ samtools index output_GA.sorted.bam
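
The selection logic can be sketched independently of the bam format. The example below is not the code of src/best_align.py: it uses plain tuples with made-up read names and scores. With real data, the score would typically be read from the AS tag of each alignment, which is an assumption here about how the score is stored.

```python
def pick_best(alignments):
    """Keep, for every read, the candidate with the highest alignment score.

    `alignments` is an iterable of (read_name, conversion, score) tuples.
    On ties, the first candidate seen is kept.
    """
    best = {}
    for read_name, conversion, score in alignments:
        if read_name not in best or score > best[read_name][1]:
            best[read_name] = (conversion, score)
    return best


# Hypothetical candidates: each read was aligned in both converted forms.
candidates = [
    ("read1", "CT", 4200),
    ("read1", "GA", 950),
    ("read2", "CT", 300),
    ("read2", "GA", 5100),
]

best = pick_best(candidates)
for read_name, (conversion, score) in sorted(best.items()):
    print(read_name, conversion, score)
```

Here read1 would be assigned to the C-to-T output and read2 to the G-to-A output.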


5. Call methylation
Using the sambamba mpileup command, detect the methylation frequency of cytosines at CpG sites of the reference genome (ref.fa). After processing with a python script (src/call_methylation.py), a tsv file of methylation frequencies (frequency_methylation.tsv) is generated.


$ sambamba mpileup output_CT.sorted.bam -L cpg_sites.bed -o pileup_CT.tsv -t 8 --samtools -f ref.fa
$ sambamba mpileup output_GA.sorted.bam -L cpg_sites.bed -o pileup_GA.tsv -t 8 --samtools -f ref.fa
$ python src/call_methylation.py pileup_CT.tsv pileup_GA.tsv > frequency_methylation.tsv
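
The frequency itself is a simple ratio, sketched below. This is not the code of src/call_methylation.py: the function names are illustrative, and the input is a plain string of observed bases rather than the raw mpileup encoding. It assumes that original-strand reads report C (methylated) or T (unmethylated) at a CpG cytosine, and that complementary-strand reads report G or A.

```python
def count_calls(bases, methylated_base, unmethylated_base):
    """Count methylated and unmethylated observations in a string of bases."""
    bases = bases.upper()
    return bases.count(methylated_base), bases.count(unmethylated_base)


def methylation_frequency(n_methylated, n_unmethylated):
    """Fraction of methylated observations, or None if the site has no coverage."""
    total = n_methylated + n_unmethylated
    return None if total == 0 else n_methylated / total


# Hypothetical bases observed at one CpG site.
ct_bases = "CCTCTCCC"  # reads from the C-to-T alignments
ga_bases = "GGAG"      # reads from the G-to-A alignments

meth_ct, unmeth_ct = count_calls(ct_bases, "C", "T")
meth_ga, unmeth_ga = count_calls(ga_bases, "G", "A")
frequency = methylation_frequency(meth_ct + meth_ga, unmeth_ct + unmeth_ga)

print(f"methylated={meth_ct + meth_ga}, "
      f"unmethylated={unmeth_ct + unmeth_ga}, frequency={frequency:.2f}")
```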


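6. Visualize in the bisulfite mode of IGV
For visualization in the bisulfite mode of IGV, the sequences of G-to-A-converted reads (output_GA.sorted.bam) are corrected to the sequences of the complementary strand, and the resulting file (output_GA_vis.sorted.bam) is merged with the bam file of C-to-T-converted reads (output_CT.sorted.bam). After sorting, the merged bam file (output_merge.sorted.bam) can be visualized in the bisulfite mode of IGV. The bisulfite mode option can be activated from the right-click pop-up menu: select "Color alignments by", then "bisulfite mode", and then "CG". A visualization of typical nanoEM results is shown in Figure 5.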


$ python scripts/vis_GA_utilities.py -b output_GA.sorted.bam | samtools view -b | samtools sort -@ 4 -o output_GA_vis.sorted.bam
$ samtools index output_GA_vis.sorted.bam
$ samtools merge output_merge.bam output_CT.sorted.bam output_GA_vis.sorted.bam
$ samtools sort -@ 4 -o output_merge.sorted.bam output_merge.bam
$ samtools index output_merge.sorted.bam




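Figure 5. Visualization of nanoEM reads.
Visualization of representative nanoEM results in the bisulfite mode of IGV for the region around the promoter of the PGR gene. Methylated and unmethylated CpGs are shown in red and blue, respectively. The annotation of CpG islands was obtained from the UCSC Table Browser (Karolchik et al., 2004).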


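Notes


1. We have successfully used this protocol for mammalian cell lines and for human clinical specimens of lung and breast.
2. After filling the DNA chip with pre-filtered gel-dye mix using the Chip Priming Station accessory of the 2100 Bioanalyzer Instrument, add 9 µL of gel-dye mix and 5 µL of marker to the wells of the chip according to the manufacturer's instructions. Add 1 µL of ladder and 1 µL of sample to the ladder well and the sample wells, respectively. After vortexing for 1 min, place the chip in the 2100 Bioanalyzer Instrument and start the measurement.
3. Do not use the Elution Buffer from the EMseq kit in this step, because it is unfavorable for the subsequent PCR reaction.
4. We recommend using at least 200 ng of DNA for the subsequent nanopore sequencing library preparation.
5. The yield of sequencing data decreases when the amount of library loaded is either too high or too low.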


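Recipes


1. TET2 Reaction Buffer (with supplement)
Add 100 µL of TET2 Reaction Buffer to one tube of TET2 Reaction Buffer Supplement and mix by vortexing. TET2 Reaction Buffer with supplement can be stored at -20°C for 4 months.
2. 70% and 80% (v/v) ethanol
Mix ethanol and NFW. Prepare these reagents fresh at the time of use.
3. Wash Buffer of the ProNex Size-Selective DNA Purification System (NG2001)
Add 75 mL of ethanol to one bottle of Wash Buffer.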



Copyright: © 2022 The Authors; exclusive licensee Bio-protocol LLC.
Citation: Zaha, S., Sakamoto, Y., Nagasawa, S., Sugano, S., Suzuki, A., Suzuki, Y. and Seki, M. (2022). Whole-genome Methylation Analysis of APOBEC Enzyme-converted DNA (~5 kb) by Nanopore Sequencing. Bio-protocol 12(5): e4345. DOI: 10.21769/BioProtoc.4345.