Determining the Relative Fitness Score of Mutant Viruses in a Population Using Illumina Paired-end Sequencing and Regression Analysis




PLOS Pathogens
Apr 2014



Recent advances in DNA sequencing make it possible to accurately quantify the copy number of individual variants in a large and diverse population, which allows the phenotypic effects of each genetic modification to be determined in parallel. This systematic profiling approach combines forward and reverse genetics, and we refer to it as quantitative high-resolution genetics (qHRG). This protocol describes how to determine the relative fitness score of each variant compared to wild-type (WT) virus based on its frequency as determined by Illumina sequencing. Random mutagenesis is used to randomize each codon position of the targeted region, thereby generating a comprehensive input mutant library with substitutions at each position of interest (Qi et al., 2014; Wu et al., 2014a; Wu et al., 2014b). After selection, each selected library is sequenced by Illumina paired-end sequencing and the frequency of each mutation is determined. Based on the change in frequency across passages, the relative fitness score of each mutant is calculated by regression analysis.

Materials and Reagents

  1. The Huh-7.5.1 cell line (kindly provided by Dr. Francis Chisari from the Scripps Research Institute, USA)
  2. Dulbecco's modified Eagle medium (DMEM) (Corning, Cellgro®, catalog number: 10-017-CV)
  3. Fetal bovine serum (FBS) (Omega Scientific, catalog number: FB-11)
  4. 100x non-essential amino acids solution (Life Technologies, catalog number: 11140050)
  5. 1 M HEPES (Life Technologies, catalog number: 15630080)
  6. 100x Penicillin-Streptomycin-Glutamine (Life Technologies, catalog number: 10378016)
  7. 10x trypsin supplemented with EDTA (Life Technologies, Gibco®, catalog number: 15400054)
  8. Plasmid carrying the HCV viral genome (pFNX-HCV), synthesized based on the chimeric sequence of the J6/JFH1 virus
    Note: In this protocol, we use the HCV NS5A mutant library as an example to describe the procedure for relative fitness determination (Qi et al., 2014). In this mutant virus library, each codon of interest was individually substituted with ‘NNK’, where N represents random incorporation of A/T/G/C and K represents random incorporation of T/G. The randomized codons therefore comprise 32 nucleotide combinations, which encode all 20 amino acids and one stop codon.
  9. 100% ethanol (Decon Labs, catalog number: 2701)
  10. QIAamp Viral RNA Mini Kit for viral RNA purification (QIAGEN, catalog number: 52906)
  11. Sterile, RNase-free pipet tips (with aerosol barriers for preventing cross-contamination) (OLYMPUS, catalog numbers: 24-401, 24-404, 24-412, 24-430)
  12. SuperScript™ III Reverse Transcriptase Kit (Life Technologies, Invitrogen™, catalog number: 18080-044)
  13. RNaseOUT Recombinant Ribonuclease Inhibitor (Life Technologies, Invitrogen™, catalog number: 10777-019)
  14. KOD Hot Start DNA Polymerase Kit (Novagen®, catalog number: 71086-4)
  15. PureLink® Quick PCR Purification Kit (Life Technologies, Invitrogen™, catalog number: K3100-02)
  16. T4 Polynucleotide Kinase (PNK) (New England Biolabs, catalog number: M0201S)
  17. NEB buffer 2 (New England Biolabs, catalog number: B7002S)
  18. dATP (100 mM) (Life Technologies, Invitrogen™, catalog number: 10216-018)
  19. Klenow Fragment (3’ to 5’ exo-) enzyme (New England Biolabs, catalog number: M0212S)
  20. T4 DNA Ligase Kit (Life Technologies, Invitrogen™, catalog number: 15224-017)
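
The NNK randomization described in the note to item 8 can be checked by enumeration. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `nnk_codons` are illustrative and do not come from the authors' scripts.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """Return every codon matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}
amino_acids = sorted(encoded - {"*"})          # '*' marks a stop codon
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), len(amino_acids), stop_codons)
```

Restricting the third base to G/T removes two of the three stop codons (TAA and TGA) while still leaving at least one codon for every amino acid.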


Equipment

  1. 15 cm cell culture dishes (Genesee Scientific, catalog number: 25-203)
  2. T-150 cell culture flasks (Genesee Scientific, catalog number: 25-211)
  3. 37 °C, 5% CO2 cell culture incubator
  4. 1.7 ml Microtubes (1.5 ml) (Genesee Scientific, catalog number: 22-282)
  5. Falcon 50 ml tubes (Corning, catalog number: 14-432-22)
  6. Falcon 15 ml tubes (Corning, catalog number: 05-527-90)
  7. Microcentrifuge (with rotor for 1.5 ml and 2 ml tubes) (Eppendorf, model: 5424)
  8. Centrifuge (with rotor for 15 ml and 50 ml Falcon tubes) (Thermo Fisher Scientific, Legend RT)
  9. NanoDrop ND-1000 UV Spectrophotometer (Thermo Fisher Scientific)
  10. Thermal cycler (Eppendorf, catalog number: 950030050)


Procedure

  A. Passage the mutant virus library (pool 1) in Huh-7.5.1 cells for selection
    1. Seed Huh-7.5.1 cells in T-150 cell culture flasks at 50% confluence (approximately 4 million cells in 24 ml of complete growth medium).
    2. Aspirate the growth medium from the flask using a Pasteur pipette and infect the cell monolayer with the mutant HCV library at a multiplicity of infection (MOI) of 0.1 [the virus library should be titrated in advance as described by Arumugaswami et al. (2008)].
    3. Incubate the cells in a 37 °C incubator for 6 h. Aspirate the old medium and add 24 ml of fresh complete growth medium (DMEM with 10% FBS, 1x NEAA and 1x Penicillin/Streptomycin/Glutamine).
    4. Incubate the virus-infected cells for 72 h at 37 °C, before the Huh-7.5.1 cells reach 100% confluence (approximately 8 million cells).
    5. Collect the supernatant in a 50 ml Falcon tube.
    6. Wash the cells with 1x PBS once.
    7. Trypsinize the cells with 2 ml of 1x trypsin for 1 min at RT and tap flask to completely loosen cells.
    8. Stop the trypsin digestion by adding 24 ml of complete growth medium (as described in step A3).
    9. Distribute cells to 3 new flasks at 8 ml/flask.
    10. Distribute 8 ml of collected supernatant from step A5 into each flask from step A9, and add 8 ml of fresh complete growth medium into each flask to reach 24 ml/flask.
    11. Incubate the virus infected cells for 72 h at 37 °C before they reach 100% confluence.
    12. Collect the supernatant (144 h post infection) and store as library pool 2.
    13. Determine the virus titer of pool 2.
    14. Repeat steps A1 to A13 to passage pool 2 and collect pool 3.
    15. Repeat steps A1 to A13 to passage pool 3 and collect pool 4.
    16. Repeat steps A1 to A13 to passage pool 4 and collect pool 5.
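
The inoculum needed for each passage follows from the seeding density and MOI given above. The sketch below is only an illustration: the titer is a hypothetical placeholder, since the actual value comes from titrating each pool, and `inoculum_volume_ml` is not a function from the authors' scripts.

```python
def inoculum_volume_ml(cells, moi, titer_per_ml):
    """Volume of virus stock (ml) that delivers `moi` infectious units per cell."""
    if titer_per_ml <= 0:
        raise ValueError("titer must be positive")
    infectious_units = cells * moi
    return infectious_units / titer_per_ml


cells_seeded = 4_000_000   # approximate cell number per T-150 flask (from the protocol)
moi = 0.1                  # multiplicity of infection (from the protocol)
assumed_titer = 1e5        # HYPOTHETICAL titer in infectious units per ml

volume = inoculum_volume_ml(cells_seeded, moi, assumed_titer)
print(f"Add {volume:.1f} ml of virus stock per flask")
```

With the protocol's numbers, each flask receives about 400,000 infectious units regardless of the titer; only the volume changes.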

  B. Determine the frequency of each mutant virus at each passage
    1. Extract HCV genomic RNA from each pool (pool 1 through pool 5) with the QIAamp Viral RNA Mini Kit (QIAGEN). All reagents used in this step are from this kit unless otherwise stated.
      1. Spin the supernatant of each virus pool at 1,500 x g for 10 min to remove possible contamination from cell-associated RNA.
      2. Take 1.4 ml of supernatant from each sample in a 15 ml Falcon tube.
      3. Lyse the virus with 5.6 ml of lysis buffer (AVL) containing 1 μg/ml of carrier RNA (5.6 μg of total carrier RNA per sample to avoid overload of the columns) by pulse-vortexing for 15 sec and incubate at room temperature for 10 min.
      4. Add 5.6 ml of ethanol (100%) to the sample, and mix by pulse-vortexing for 15 sec.
      5. Transfer 630 μl of the solution from step 4 to the QIAamp Mini column (in a 2 ml collection tube). Close the cap and centrifuge at 6,000 x g for 1 min and discard the filtrate collected in the collection tube.
      6. Repeat step 5 until all of the lysate from step 4 has been loaded onto the spin column.
      7. Add 500 μl of buffer AW1 onto the QIAamp Mini column, and centrifuge at 6,000 x g for 1 min.
      8. Place the QIAamp Mini column in a clean 2 ml collection tube and discard the filtrate.
      9. Add 500 μl of buffer AW2 and centrifuge at 20,000 x g for 1 min. Discard the filtrate collected in the collection tube.
      10. Centrifuge at full speed (20,000 x g) for 2 min to completely dry the column.
      11. Place the QIAamp Mini column in a clean 1.5 ml Eppendorf tube and add 60 μl of buffer AVE to the filter area of the column. Close the cap and incubate at room temperature for 1 min. Spin at full speed (20,000 x g) for 1 min to elute the RNA.
    2. Reverse transcription and PCR amplification of the targeted region for sequencing. We use the SuperScript™ III Reverse Transcriptase Kit from Life Technologies; all reagents are from this kit unless otherwise stated.
      1. Set up 20 μl reverse transcription reaction with 10 μl of RNA isolated from each pool (pool 1-5) and the input RNA library (pool 0) which was used to reconstitute the mutant virus library as mentioned by Qi et al. (2014). Add the following components to a nuclease-free Eppendorf tube:
        | Component | Volume |
        | --- | --- |
        | RNA isolated from each pool | 10 μl |
        | Random primer (100 ng/μl) | 1 μl |
        | dNTP (10 mM) | 1 μl |
        | H2O | 1 μl |
        | Total | 13 μl |
      2. Incubate the mixture at 65 °C for 5 min and incubate on ice for 1 min.
      3. Spin down the tube for 5 sec and add the following components:
        | Component | Volume |
        | --- | --- |
        | RNA mixture from step 2 | 13 μl |
        | 5x First-Strand Buffer | 4 μl |
        | 0.1 M DTT | 1 μl |
        | RNaseOUT RNase inhibitor | 1 μl |
        | SuperScript III RT | 1 μl |
        | Total | 20 μl |
      4. Incubate at 25 °C for 5 min and 50 °C for 60 min.
      5. Inactivate the reaction by heating at 70 °C for 15 min.
      6. Determine the virus genome copy number in each pool by Q-PCR using the following pair of HCV-specific primers (Arumugaswami et al., 2008):
        Primer_forward: AGA GCC ATA GTG GTC TGC G
        Primer_reverse: CTT TCG CAA CCC AAC GCT AC
      7. Amplify the targeted region by PCR using KOD DNA polymerase, with just enough cycles to approach but not reach saturation (judged from the Q-PCR result in the preceding step). For example, we would use 28 PCR amplification cycles at this step if the Q-PCR result showed that 30 cycles would saturate the reaction.
      8. Purify the PCR amplicon from each PCR reaction with PCR purification kit from Life Technologies and measure the concentration of each sample with NanoDrop ND-1000 Spectrophotometer.
    3. Construct sequencing samples for Illumina sequencing.
      1. Take 1 μg of each PCR amplicon product from each sample and set up the following reaction with T4 Polynucleotide Kinase (PNK) to add 5’-phosphate to amplicons to allow subsequent ligation.
        | Component | Volume |
        | --- | --- |
        | PCR amplicons | 5-17 μl (1 μg total) |
        | T4 PNK Reaction Buffer | 2 μl |
        | T4 PNK | 1 μl |
        | H2O | 0-12 μl |
        | Total | 20 μl |
      2. Incubate at 37 °C for 1 h, then purify the sample with PCR purification columns and elute in 40 μl.
      3. dA-tailing with Klenow Fragment (3'→5' exo-):
        | Component | Volume |
        | --- | --- |
        | PCR amplicons | 37 μl |
        | NEB buffer 2 (10x) | 5 μl |
        | dATP (1 mM) | 5 μl |
        | Klenow Fragment (3’ to 5’ exo-) | 3 μl |
        | Total | 50 μl |
      4. Incubate at 37 °C for 30 min and purify DNA samples with PCR purification columns in 35 μl volume.
      5. Ligate Illumina sequencing adaptors carrying a different barcode for each pool:
        | Component | Volume |
        | --- | --- |
        | PCR amplicons | 30 μl |
        | T4 DNA ligase reaction buffer (10x) | 5 μl |
        | Adaptor with barcodes (10 μM) | 5 μl |
        | T4 DNA ligase | 2 μl |
        | Sterile H2O | 8 μl |
        | Total | 50 μl |
        Adaptors were generated by annealing two oligos:
        5'-ACA CT CTT TCC CTA CAC GAC GCT CTT CCG ATC TNN NT-3'
        5'-/5Phos/NNN AGA TCG GAA GAG CGG TTC AGC AGG AAT GCC GAG-3'
        NNN marks the position of the multiplex ID used to distinguish different samples; each sample carries a different NNN sequence.
      6. Incubate at 25 °C (room temperature) for 1 h and purify with PCR purification columns in 30 μl volume.
      7. The adapter-ligated products were enriched by a final PCR using primers:
      8. Purify the DNA with PCR purification columns in 30 μl volume and measure concentrations with NanoDrop ND-1000 Spectrophotometer.
      9. Mix 500 ng of final product from each pool and submit for Illumina sequencing (HiSeq).
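
The PNK reaction in step 3 requires 1 μg of amplicon in a 20 μl reaction, so the amplicon and water volumes depend on the NanoDrop reading. A minimal sketch of that arithmetic follows; the function name and the example concentration are assumptions made for illustration, while the fixed volumes are taken from the reaction table.

```python
def pnk_reaction_volumes(conc_ng_per_ul, dna_ng=1000.0, total_ul=20.0,
                         buffer_ul=2.0, enzyme_ul=1.0):
    """Return amplicon and water volumes (μl) for a 20 μl T4 PNK reaction."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    amplicon_ul = dna_ng / conc_ng_per_ul
    water_ul = total_ul - buffer_ul - enzyme_ul - amplicon_ul
    if water_ul < 0:
        # More than 17 μl of amplicon would be needed: the sample is too dilute.
        raise ValueError("amplicon too dilute to fit 1 μg into the reaction")
    return {"amplicon_ul": amplicon_ul, "water_ul": water_ul}


# Example with a HYPOTHETICAL NanoDrop reading of 100 ng/μl.
volumes = pnk_reaction_volumes(100.0)
print(volumes)
```

The 5-17 μl range in the table corresponds to amplicon concentrations between roughly 59 and 200 ng/μl; a more dilute sample cannot supply 1 μg within the 20 μl reaction volume.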

  C. Determine the frequency of each mutant virus at each passage and calculate the relative fitness score of each mutant virus by regression analysis.
    1. Each paired-end sequence read in the HiSeq data file is mapped to the reference sequence once it passes quality control (cutoff 35). Each mismatch from the reference sequence is identified as a mutation, and the number of occurrences of each mutation is counted. The script ‘mapping.txt’ for mutation mapping is provided with this protocol.
    2. Calculate the frequency of a given variant, v, in pool #N ($f_{v,N}$) and the frequency of WT, wt, in pool #N ($f_{wt,N}$) as follows:

      $$f_{v,N} = \frac{\mathrm{Reads}_{v,N}}{\sum \mathrm{Reads}_{N}}$$
      (The frequency of the given variant in pool #N)

      $$f_{wt,N} = \frac{\mathrm{Reads}_{wt,N}}{\sum \mathrm{Reads}_{N}}$$
      (The frequency of the WT virus in pool #N)
      where $\mathrm{Reads}_{v,N}$ is the number of sequence reads for the variant (v) in pool #N, $\mathrm{Reads}_{wt,N}$ is the number of sequence reads for the WT in pool #N, and $\sum \mathrm{Reads}_{N}$ is the total number of reads in pool #N.
    3. Discard any frequency that is lower than 0.0005, since the mutation frequency of HCV is about $10^{-5}$ to $10^{-4}$ nucleotide substitutions per nucleotide per round of genome replication.
    4. Calculate the relative fitness score of each mutant virus. The relative fitness score of a given variant ($W_v$) is determined as the antilogarithm of the slope of the regression, using the following formula implemented in Python:

      $$\ln\frac{f_{v,N}}{f_{wt,N}} = \ln\frac{f_{v,0}}{f_{wt,0}} + N \ln W_v$$

      where $\ln\frac{f_{v,0}}{f_{wt,0}}$ is the logarithm of the relative frequency of a given variant (v) in the input RNA library, pool 0, which was used to reconstitute the mutant virus library. The script ‘fitness_reg.txt’ for fitness calculation is provided with this protocol.
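
The frequency calculation, the 0.0005 cutoff and the regression described in this section can be sketched as follows. This is not the authors' ‘fitness_reg.txt’ script: it is a minimal illustration that assumes an ordinary least-squares fit of ln(f_v/f_wt) against the pool number, and the read counts are synthetic.

```python
import math

MIN_FREQUENCY = 0.0005  # frequencies below this cutoff are discarded


def relative_fitness(variant_reads, wt_reads, total_reads):
    """Estimate W_v as exp(slope) of ln(f_v / f_wt) regressed on pool number.

    Each argument is a list indexed by pool number (pool 0 = input library).
    """
    xs, ys = [], []
    for pool, (v, wt, total) in enumerate(zip(variant_reads, wt_reads, total_reads)):
        f_v = v / total
        f_wt = wt / total
        if f_v < MIN_FREQUENCY or f_wt <= 0:
            continue  # skip pools where the variant is too rare to trust
        xs.append(pool)
        ys.append(math.log(f_v / f_wt))
    if len(xs) < 2:
        raise ValueError("need at least two pools above the frequency cutoff")
    # Ordinary least-squares slope, written out to avoid extra dependencies.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return math.exp(sxy / sxx)


# SYNTHETIC counts: the variant halves relative to WT at every passage.
total = [1_000_000] * 6
wt = [500_000] * 6
variant = [32_000, 16_000, 8_000, 4_000, 2_000, 1_000]

w_v = relative_fitness(variant, wt, total)
print(f"W_v = {w_v:.3f}")
```

In this synthetic series the variant-to-WT ratio halves at each passage, so the fitted slope is ln 0.5 and the relative fitness is 0.5. Using all pools in one regression, rather than a single pair of time points, averages out sampling noise in individual pools.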

Representative data

Figure 1. Procedure of mutant library construction and selection. A. Schematic showing the construction of the saturation mutant library in a sub-domain of NS5A of HCV. The area to be mutated was divided into 5 small regions, each composed of 17 or 18 amino acids. Each residue was replaced with one random codon (N1N2K: N1 and N2 code for A/T/G/C and K codes for T/G) and incorporated into the WT background of HCV. B. The resultant viral library was then selected in vitro by passaging in Huh-7.5.1 cells for 4 rounds.

Figure 2. An example of expected data: The fitness landscape of amino acids 18-103 in NS5A in virus replication. This is a heat map showing the relative fitness scores represented as selection coefficient (s) for each variant during viral replication in vitro. Color indicates the fitness of each mutant calculated as ‘s’ relative to WT (Materials and method). Red represents positive ‘s’ (i.e. increased fitness) and blue represents negative ‘s’. s = 0 means the same fitness as the WT virus. The secondary structure of the mutated region is annotated below the figure (open circles: solvent exposed residues; filled circles: buried residues; half-filled circles: partially buried residue). This figure was generated by MATLAB software.


Notes

  1. While passaging the mutant virus library in Huh-7.5.1 cells for in vitro selection, the library complexity should be estimated and maintained throughout the entire procedure. The complexity of the library can be estimated from the way the library was constructed. For example, in our recent study (Qi et al., 2014), we substituted each of the 86 positions in the NS5A region (from a.a. 18 to a.a. 103) with all 20 possible amino acids plus a stop codon. In this case, the library complexity can be calculated as: 86 x 20 (19 variants plus stop codon) + 1 (WT) = 1,721. In our experience, covering each variant at least 100x on average gives optimal and reproducible results.
  2. The library should be selected for multiple rounds so that the regression analysis gives much higher confidence in the calculated relative fitness scores.
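
The complexity and coverage figures in note 1 can be worked through numerically. The sketch below assumes that variants are equally represented and that coverage is counted as sampled units (infectious units or sequence reads) per variant; the note does not specify either point, and the function names are illustrative.

```python
def library_complexity(positions, variants_per_position=20, include_wt=True):
    """Number of distinct genotypes in a single-codon saturation library."""
    return positions * variants_per_position + (1 if include_wt else 0)


def mean_coverage(sample_size, complexity):
    """Average number of sampled units per variant, assuming equal representation."""
    return sample_size / complexity


complexity = library_complexity(86)      # 86 positions x 20 variants + WT
minimum_sample = 100 * complexity        # units needed for 100x average coverage

# Infectious units per flask in Procedure A: about 4 million cells at MOI 0.1.
infectious_units = 4_000_000 * 0.1
coverage = mean_coverage(infectious_units, complexity)

print(complexity, minimum_sample, round(coverage))
```

Under these assumptions, one flask infected as described carries roughly 230 infectious units per variant, which is above the 100x guideline.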


Acknowledgments

This work was supported by the following grants: National Natural Science Foundation of China (NSFC) 81172314, National Science Foundation EF-0928690 (JLS) and National Institutes of Health AI078133 (RS), Margaret E. Early Medical Research Trust, P30CA016042 (Jonsson Comprehensive Cancer Center) and P30AI028697 (UCLA AIDS Institute/CFAR). JLS is grateful for the support of the De Logi Chair in Biological Sciences and the RAPIDD program of the Science & Technology Directorate of the US Department of Homeland Security, and the Fogarty International Center, National Institutes of Health. C.A.O. was supported by the NCI Cancer Education Grant, R25 CA 098010.


References

  1. Qi, H., Olson, C. A., Wu, N. C., Ke, R., Loverdo, C., Chu, V., Truong, S., Remenyi, R., Chen, Z., Du, Y., Su, S. Y., Al-Mawsawi, L. Q., Wu, T. T., Chen, S. H., Lin, C. Y., Zhong, W., Lloyd-Smith, J. O. and Sun, R. (2014). A quantitative high-resolution genetic profile rapidly identifies sequence determinants of hepatitis C viral fitness and drug sensitivity. PLoS Pathog 10(4): e1004064.
  2. Wu, N. C., Young, A. P., Al-Mawsawi, L. Q., Olson, C. A., Feng, J., Qi, H., Luan, H. H., Li, X., Wu, T. T. and Sun, R. (2014). High-throughput identification of loss-of-function mutations for anti-interferon activity in the influenza A virus NS segment. J Virol 88(17): 10157-10164.
  3. Wu, N. C., Young, A. P., Al-Mawsawi, L. Q., Olson, C. A., Feng, J., Qi, H., Chen, S. H., Lu, I. H., Lin, C. Y., Chin, R. G., Luan, H. H., Nguyen, N., Nelson, S. F., Li, X., Wu, T. T. and Sun, R. (2014). High-throughput profiling of influenza A virus hemagglutinin gene at single-nucleotide resolution. Sci Rep 4: 4942.
  4. Arumugaswami, V., Remenyi, R., Kanagavel, V., Sue, E. Y., Ngoc Ho, T., Liu, C., Fontanes, V., Dasgupta, A. and Sun, R. (2008). High-resolution functional profiling of hepatitis C virus genome. PLoS Pathog 4(10): e1000182.


Copyright: © 2015 The Authors; exclusive licensee Bio-protocol LLC.
Citation: Qi, H., Olson, C. A., Wu, N. C., Du, Y. and Sun, R. (2015). Determining the Relative Fitness Score of Mutant Viruses in a Population Using Illumina Paired-end Sequencing and Regression Analysis. Bio-protocol 5(10): e1475. DOI: 10.21769/BioProtoc.1475.