ERCC spike-in normalized mRNA-seq for cultured cells (human)

Mark W. Zimmerman; Brian J. Abraham; A. Thomas Look

Last updated date: Jun 1, 2022

An abbreviated version of this protocol was published in Science Advances in Oct, 2021

Retinoic acid rewires the adrenergic core regulatory circuitry of childhood neuroblastoma

Mark W. Zimmerman, Dana-Farber Cancer Institute, Boston, MA

Brian J. Abraham, St. Jude Children’s Hospital, Memphis, TN

Required materials and kits for part 2:

Reagent	Manufacturer	Catalog number
ERCC spike-in mix 1	Thermo-Fisher	4456740
TRIzol	Invitrogen	15596018
RNeasy mini kit	QIAgen	74104
RNase-free DNase kit	QIAgen	79254
Chloroform	Thermo-Fisher	J67241.AP
Isopropanol	Thermo-Fisher	T036181000CS
Ethanol	Thermo-Fisher	T038181000

Part 1 – Cell preparation

1. Seed cells in 6-well plates in enough wells for each replicate plus at least one additional sample for optimization (e.g., triplicate plus one extra, n=4)

*It is essential that the same number of cells is seeded in each well

2. Grow cells under the required experimental conditions until wells are 70-90% confluent

3. At the experimental endpoint(s), collect the cells from each well (by scraping or using trypsin if necessary) into separate 1.5 mL tubes and count the total number of cells in each sample

*Cell count should be approximately the same across replicates, so it is usually okay to count one sample per experimental group as long as sample collection is uniform and consistent – record this number

4. Pellet the cells, remove the supernatant, and snap freeze in liquid nitrogen or dry ice with ethanol

5. Store samples at -80°C until ready to proceed

Part 2 – RNA extraction and ERCC spike-in

Note: This step is based on a hybrid of the TRIzol (Invitrogen, Cat. No. 15596026) and RNeasy (QIAgen, Cat. No. 74106) extraction methods, which, when combined, yield a large amount of high-quality RNA.

1. Beginning with only the extra samples collected for each experimental group (save the samples to be sequenced for step 20), resuspend the frozen cell pellets in 1 mL TRIzol reagent and incubate at room temperature for 5 min

2. Add 0.2 mL chloroform and mix for 15 seconds, incubate for 2 minutes at room temperature

3. Centrifuge the samples at 12,000 g for 10 minutes (at 4°C)

4. Transfer the aqueous (upper) phase to new tubes, precipitate the RNA by adding 0.5 mL of isopropyl alcohol (per 1 mL of TRIzol reagent), and vortex briefly

5. Incubate samples at room temperature for 10 minutes and centrifuge at 12,000 g for 10 minutes (at 4°C)

6. Remove the supernatant, wash the RNA pellet once with 1.0 mL 75% ethanol, and mix by pipetting up and down

7. Centrifuge at 12,000 g for 5 min at 4°C and remove as much of the supernatant as you can

*A small amount of residual ethanol is okay; no need to air dry the pellet

8. Dissolve the RNA pellet in 100 µL RNase-free water, add 350 µL buffer RLT, and mix by pipetting up and down

9. Add 250 µL ethanol (100%) to the diluted RNA and mix by pipetting up and down

10. Apply the sample to an RNeasy extraction column placed in a 2 mL collection tube, and centrifuge for 1 min at 12,000 g (at room temperature)

11. Prepare DNase (QIAgen, Cat. No. 79254) by adding 10 µL of DNase to 70 µL of RDD buffer (per sample) and keeping it on ice

*DNase is very sensitive to physical denaturation; therefore, mix gently by pipetting up and down

12. Add 350 µL of buffer RW1, centrifuge at 12,000 g for 15 sec (at room temperature), and discard the flow-through

13. Add 80 µL of DNase mix to the column (making sure all of the solution reaches the membrane) and incubate at room temperature for 15 min

14. Add 350 µL of buffer RW1 to each column (with the DNase solution still in it), centrifuge at 12,000 g for 15 sec (at room temperature), and discard the flow-through

15. Add 500 µL of buffer RPE to the column and centrifuge for 15 sec at 12,000 g (at room temperature) to wash the column

16. Add another 500 µL of buffer RPE to the column and centrifuge for 2 min at 12,000 g (at room temperature)

17. Transfer the column to a new 1.5 mL collection tube, add 30 µL RNase-free water, and centrifuge for 1 min at 12,000 g (at room temperature) to elute total RNA

18. Quantify the total amount of RNA in each sample using a NanoDrop, Qubit, or another instrument (e.g., 100 ng/µL x 30 µL = 3 µg total)

19. Based on the sample with the highest amount of RNA, calculate the volume of ERCC spike-in mix 1 (Invitrogen, Cat. No. 4456740) to add to each sample:

First, serial dilute the ERCC spike-in mix 1 (do this fresh every time):

Dilution	ERCC spike-in mix 1	Nuclease-free water
1:10	1 µL undiluted	9 µL
1:100	1 µL of 1:10	9 µL
1:1000	1 µL of 1:100	9 µL
1:5000	2 µL of 1:1000	8 µL

Next, determine the volume of ERCC to add to the sample with the highest amount of RNA, and adjust the volume added to every other sample based on cell count:

Total RNA	Volume of diluted ERCC spike-in mix 1
10 ng	1 µL (1:5000 dilution)
100 ng	2 µL (1:1000 dilution)
1000 ng	2 µL (1:100 dilution)
5000 ng	1 µL (1:10 dilution)

For example: DMSO control wells each contained 1.0x10⁶ cells at the time of collection and are expected to yield 1000 ng of total RNA (based on step 18), while the drug-treated wells each contained 0.8x10⁶ cells at the time of collection and are expected to yield 500 ng of total RNA (based on step 18). You would add 2 µL of 1:100 diluted ERCC to each DMSO control sample and 1.6 µL of 1:100 diluted ERCC to each drug-treated sample (2 µL x 0.8, to adjust for 20% fewer cells). If the cell count were the same across treatment groups (e.g., 1.0x10⁶ cells collected from both), you would add 2 µL of 1:100 diluted ERCC to every sample.

DO NOT adjust the volume of ERCC based on changes in RNA yield since this would just mask changes in the global RNA output.

20. Resuspend the experimental frozen cell pellets in 1 mL TRIzol reagent and incubate at room temperature for 5 min

21. Add the calculated volume of ERCC spike-in RNA to each sample by adding it directly to the cells resuspended in TRIzol solution

*It is critical to add the ERCC spike-in RNA early in the purification (i.e. directly to the TRIzol before precipitation) since RNA yield tends to become variable between samples following each subsequent step of the extraction procedure

22. Extract and quantify RNA from each sample by repeating steps 2-18, and store the spike-in normalized RNA samples at -80°C until ready to proceed

*The initial sample used for determining total RNA yield and optimizing ERCC concentration can either be discarded or saved for a later analysis
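The ERCC arithmetic in step 19 can be sketched in Python as follows. This is an illustrative sketch only and not part of the published protocol: the names DILUTION_STEPS, cumulative_dilutions, and ercc_volume_ul are hypothetical, and the numbers are taken from the dilution table and the DMSO/drug-treated example above.

```python
from fractions import Fraction

# Serial dilution scheme from step 19:
# (label, volume transferred in µL, volume of nuclease-free water in µL).
DILUTION_STEPS = [
    ("1:10", 1, 9),
    ("1:100", 1, 9),
    ("1:1000", 1, 9),
    ("1:5000", 2, 8),
]


def cumulative_dilutions(steps):
    """Return {label: fraction of undiluted ERCC mix} after each serial step."""
    fraction = Fraction(1)
    result = {}
    for label, transfer_ul, water_ul in steps:
        # Each step dilutes the previous tube by transfer / (transfer + water).
        fraction *= Fraction(transfer_ul, transfer_ul + water_ul)
        result[label] = fraction
    return result


def ercc_volume_ul(reference_volume_ul, reference_cells, sample_cells):
    """Scale the ERCC volume by cell count only (never by RNA yield).

    reference_volume_ul is the volume chosen for the sample with the highest
    RNA amount; reference_cells is the cell count of that sample.
    """
    if reference_cells <= 0:
        raise ValueError("reference_cells must be positive")
    return reference_volume_ul * sample_cells / reference_cells


if __name__ == "__main__":
    for label, fraction in cumulative_dilutions(DILUTION_STEPS).items():
        print(label, "->", fraction)
    # DMSO control: 1.0e6 cells, 2 µL of the 1:100 dilution.
    # Drug-treated: 0.8e6 cells, scaled by cell count.
    print(ercc_volume_ul(2.0, 1.0e6, 0.8e6), "µL of 1:100 dilution")
```

With the example values, the drug-treated samples receive 1.6 µL of the 1:100 dilution, matching the worked example in step 19. RNA yield is deliberately not an input to the volume calculation.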

Part 3 – Library preparation and sequencing

Note: For this step, samples can be processed using any standard RNA-sequencing method

Samples should be prepared using a commercially available library preparation kit according to the manufacturer’s protocol, typically starting with 500 ng of purified total RNA. The finished dsDNA libraries should be quantified by Qubit fluorometer (Thermo-Fisher), TapeStation 4200 (Agilent), and qPCR using the Kapa Biosystems library quantification kit (Roche, Cat. No. KK4824).

Indexed libraries are then pooled in equimolar ratios and sequenced on an Illumina NextSeq 550, NovaSeq, or similar instrument (single-end reads of at least 75 bp).
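Equimolar pooling can be worked out from the library concentrations. The following Python sketch is an illustration under stated assumptions, not the authors' procedure: it assumes an average mass of 660 g/mol per base pair of dsDNA and a mean fragment size read from the TapeStation trace, and the function names and example values are hypothetical. If the qPCR quantification already reports molarity, the conversion step is not needed.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration (ng/µL) to nM.

    Assumes an average of 660 g/mol per base pair of double-stranded DNA.
    """
    if mean_fragment_bp <= 0:
        raise ValueError("mean_fragment_bp must be positive")
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def equimolar_pool_volumes(molarities_nm, fmol_per_library):
    """Return {library: volume in µL} so each library contributes equal moles.

    1 nM equals 1 fmol/µL, so volume (µL) = fmol wanted / concentration (nM).
    """
    volumes = {}
    for name, molarity in molarities_nm.items():
        if molarity <= 0:
            raise ValueError(f"library {name!r} has non-positive molarity")
        volumes[name] = fmol_per_library / molarity
    return volumes


if __name__ == "__main__":
    # Hypothetical example values, not measurements from the protocol.
    molarities = {
        "DMSO_rep1": library_molarity_nm(3.3, 500),
        "drug_rep1": library_molarity_nm(1.65, 500),
    }
    for name, volume in equimolar_pool_volumes(molarities, 20.0).items():
        print(f"{name}: {volume:.2f} µL")
```

In this sketch a library at half the molarity contributes twice the volume, so every library adds the same molar amount to the pool.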

Part 4 – Computational Analysis

1. Build a reference genome sequence that contains the sequences of the spike-in probes as additional “chromosomes.” Acquire the FASTA of ERCC spike-in sequences, e.g. https://www-s.nist.gov/srmors/certificates/documents/SRM2374_Sequence_v1.FASTA. Process the chromosomal and spike-in FASTA files to create a reference sequence for alignment using a tool of your choosing, including Burrows-Wheeler-based tools. Because our preferred alignment strategy uses hisat2, we used hisat2-build.

2. Align FASTQ reads from experiments to the custom reference genome. Our preferred aligner is hisat2 with default parameters. Convert aligned reads file to a sorted, indexed BAM file using samtools.

3. Build a positional gene reference GTF that includes the positions of the ERCC spike-in probes in the custom reference genome. Begin by acquiring a standard reference GTF of known gene positions for the reference genome build you initially chose; we used RefSeq. To this GTF, add GTF-formatted positions of the ERCC probes. Ensure that the attributes field is formatted identically for the gene entries and the probe entries.

4. Quantify read coverage of genes and ERCC probes. We used htseq-count to quantify coverage of all genes, using -i gene_id and -m intersection-strict, the sorted BAM file of aligned positions, and the GTF of gene and probe positions in the reference. This will generate a file of per-gene read counts.

5. (Optional) Normalize read counts using one of many standard strategies, including transcripts-per-million (TPM). First generate a file of per-gene total exon sizes by collapsing all exons of each isoform of each gene into a single set of regions using bedtools merge, then count the number of base pairs in these collapsed exons. Use the standard TPM-normalization strategy:
normterm = sum of (readcount * readlength / exonlength) across all genes
TPM = (readcount * readlength / exonlength) * 1e6 / normterm

6. Normalize expression using the ERCC spike-in values. For each gene in each sample, we floor expression at 0.01 and add a pseudocount of 0.1. Create a table where each row is a gene, each column is a sample, and each cell is a raw or TPM-normalized count. Using the affy R package, run normalize.loess with the ERCC spike-in probe rows of the expression table as the subset.

7. (Optional) Confirm that the distribution of ERCC probe expression values approximately spans the distribution of gene expression values before and after normalization.
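The arithmetic in steps 5 and 6 can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names and the toy counts are hypothetical, every row passed in (genes and any probes) is included in the normalization sum, and applying the floor before the pseudocount is an assumption about the order implied by step 6. The loess normalization itself is left to normalize.loess in the affy R package and is not reproduced here.

```python
def tpm(read_counts, exon_lengths, read_length):
    """TPM as written in step 5.

    read_counts: {gene: reads}, exon_lengths: {gene: collapsed exon bp}.
    """
    # Per-gene coverage rate: readcount * readlength / exonlength.
    rates = {
        gene: read_counts[gene] * read_length / exon_lengths[gene]
        for gene in read_counts
    }
    normterm = sum(rates.values())
    if normterm == 0:
        raise ValueError("no reads were counted for any gene")
    return {gene: rate * 1e6 / normterm for gene, rate in rates.items()}


def floor_and_pseudocount(values, floor=0.01, pseudocount=0.1):
    """Floor each value at `floor`, then add `pseudocount` (assumed order)."""
    return {gene: max(value, floor) + pseudocount for gene, value in values.items()}


if __name__ == "__main__":
    # Toy values for illustration only.
    counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 0}
    lengths = {"GENE_A": 1000, "GENE_B": 3000, "GENE_C": 2000}
    tpm_values = tpm(counts, lengths, read_length=75)
    for gene, value in floor_and_pseudocount(tpm_values).items():
        print(gene, round(value, 2))
```

By construction the TPM values of one sample sum to 1e6, which is a quick sanity check before building the expression table for step 6.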

How to cite:

Readers should cite both the Bio-protocol preprint and the original research article where this protocol was used:

Zimmerman, M., Abraham, B. and Look, A. (2022). ERCC spike-in normalized mRNA-seq for cultured cells (human). Bio-protocol Preprint. bio-protocol.org/prep1692.
Zimmerman, M. W., Durbin, A. D., He, S., Oppel, F., Shi, H., Tao, T., Li, Z., Berezovskaya, A., Liu, Y., Zhang, J., Young, R. A., Abraham, B. J. and Look, A. T. (2021). Retinoic acid rewires the adrenergic core regulatory circuitry of childhood neuroblastoma. Science Advances 7(43). DOI: 10.1126/sciadv.abe0834