RRTS calculation

Jamie Wangen; Rachel Green

Improve Research Reproducibility A Bio-protocol resource

Home
Protocols

Preprint

RRTS calculation

JW Jamie Wangen

RG Rachel Green

Last updated date: Apr 9, 2021 Views: 864 Forks: 0

An abbreviated version of this protocol was published in eLife in Jan, 2020

Stop codon context influences genome-wide stimulation of termination codon readthrough by aminoglycosides

Download PDF

Ask a question

How to cite

Favorite

RRTS calculation (Ribosome ReadThrough Score)

Data analysis is performed in python 2.7

All code is available here: https://github.com/jrw24/G418_readthrough

For RRTS specific details, see the python script here: https://github.com/jrw24/G418_readthrough/blob/master/riboseq/riboseq_buildDenseTables_rpkm_utr3adj.py

Step-by-step description of RRTS calculation:

Prior to performing this analysis, footprint assignment must be completed meaning that each ribosome protected fragment (RPF) has been mapped to a transcript specific position as a function of the fragment length and counts have been normalized by sequencing depth.
1. Each transcript is saved as a list with each position representing a nucleotide
  1. The variable name is: exonsplicedcounts
2. For example, a 9 nucleotide transcript (x) with 3 ribosomes could look like:
  x = [ 0, 0, 0, 0, 2, 0, 0, 1, 0]
3. A key step in the protocol is trimming regions of each transcript. To trim the last 3 nucleotides of this transcript, you could use:
  x = x[:-3]
  Now this transcript has a length of 6 and 2 total counts
  [ 0, 0, 0, 0, 2, 0 ]
4. I set a dictionary with insets (nucleotides to be trimmed from each region)
  defaultInsets = { 'utr5Inset3' : 6, 'cdsInset5' : 18, 'cdsInset3' : 15, 'utr3Inset5' : 6 }
Iterate through all transcripts in the transcriptome annotation and perform the following steps:
Remove transcripts with 5’UTR length of zero
Remove transcripts with 3’UTR length of zero
Define the CDS start and end:

cdsstart = utr5len

cdsend = len(exonsplicedcounts) - utr3len
Verify transcripts have a CDS length greater than zero
Modify 5’UTR, 3’UTR, CDS and mRNA region lengths to exclude start codon and stop codon peaks from calculation.
1. utr5len = utr5len – 6
2. cdslen = cdslen – 18 – 15
3. utr3len = utr3len – 6
4. mrnalen = utr5len + cdslen +utr3len
calculate adjusted 3’UTR length (utr3LenAdj) based on the position of the first inframe stop codon
1. inframe stop codon positions are identified using this script:
  https://github.com/jrw24/G418_readthrough/blob/master/utils/stopcodon_finder.py
2. If there are no in-frame stop codons in the 3’UTR, set utr3LenAdj to utr3len
Filter zero length 5’UTRs and 3’UTRs after trimming
Sum RPF counts in each region, excluding counts in trimmed positions

utr5Counts = sum(exonsplicedcounts[:cdsstart-insets['utr5Inset3']])

utr3Counts = sum(exonsplicedcounts[cdsend+insets['utr3Inset5']:])

cdsCounts = sum(exonsplicedcounts[cdsstart+insets['cdsInset5']:cdsend-insets['cdsInset3']])

mrnaCounts = utr5Counts+cdsCounts+utr3Counts
Sum counts in adjusted 3’UTRs excluding the first 6 nucleotides after the normal stop codon.

utr3AdjCounts = sum( exonsplicedcounts[cdsend+insets['utr3Inset5']:cdsend+utr3LenAdj] )
If the adjusted 3’UTR length (utr3LenAdj-6) is less than 12, exclude this transcript
Calculate densities for each region by dividing counts over length
Convert densities to rpkm by multiplying by 1000 (reads per kilobase per million of mapped reads)
Exclude transcript with zero CDS densities
Exclude transcript if there are less then 128 counts in trimmed CDS region
Exclude transcripts with CDS densities less than 0.001
Calculate RRTS by dividing 3’UTR adjusted density by CDS density
1. RRTS = utr3AdjDensity_rpkm/cdsDensity_rpkm
After completing calculation for all transcripts, write results to a csv file.

How to cite：

Readers should cite both the Bio-protocol preprint and the original research article where this protocol was used:

Wangen, J and Green, R(2021). RRTS calculation. Bio-protocol Preprint. bio-protocol.org/prep1008.
Wangen, J. R. and Green, R.(2020). Stop codon context influences genome-wide stimulation of termination codon readthrough by aminoglycosides. eLife. DOI: 10.7554/eLife.52611