Similarly, but independently of codon bias, the juxtaposition of codons in ORFs is not random either [32]; this preference is typically referred to as codon pair bias. An algorithm has been developed to quantify codon pair bias [33]. For each of the 3721 possible codon pairs (61 × 61, excluding pairs that contain stop codons), the codon pair score (CPS) is defined as the natural log of the ratio of the observed to the expected number of occurrences of that codon pair over all coding regions. It can be calculated using the following equation [33]:

$$\mathrm{CPS} = \ln\left(\frac{N_{AB}}{\dfrac{N_{A} \times N_{B}}{N_{X} \times N_{Y}} \times N_{XY}}\right)$$

where the codon pair $AB$ encodes the amino acid pair $XY$, and $N$ denotes frequency (number of occurrences). The sign of the CPS for a given pair indicates whether the pair is over-represented (+) or under-represented (−). With the calculated CPSs, we can further calculate the codon pair bias (CPB) score as follows [33]:

$$\mathrm{CPB} = \sum_{i=1}^{k} \frac{\mathrm{CPS}_{i}}{k-1}$$

where $k$ indicates how many kinds of codon pairs there are. The CPB score has already been used for virus attenuation through deoptimization [33,34,35]. In theory, a lower CPB score is associated with less efficient translation of viral genes in the host, which attenuates viral replication [36]. The CPB scores were computed with the R package CPBias (version 1.0) (https://github.com/alex-sbu/CPBias/) (accessed on 5 January 2021).
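To make the CPS and CPB definitions described in this section concrete, the following Python sketch computes both from a set of in-frame coding sequences. It is only an illustration and does not reproduce the CPBias package: the function names (split_codons, codon_pair_scores, codon_pair_bias) are hypothetical, the standard genetic code is assumed, each ORF is read up to its first stop or unrecognized codon, codon pairs never observed in the reference set are skipped when scoring a sequence, and the CPB denominator k − 1 is taken from the definition in [33].

```python
import math
from collections import Counter

# Standard genetic code with bases ordered T, C, A, G ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def split_codons(orf):
    """Return the sense codons of an in-frame ORF, stopping at the first
    stop codon or unrecognized triplet (an assumption of this sketch)."""
    orf = orf.upper().replace("U", "T")
    codons = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if CODON_TABLE.get(codon, "*") == "*":
            break
        codons.append(codon)
    return codons


def codon_pair_scores(orfs):
    """CPS = ln( N_AB / ( N_A * N_B / (N_X * N_Y) * N_XY ) ) for every
    codon pair AB observed in the reference ORFs."""
    codon_n, aa_n = Counter(), Counter()
    pair_n, aa_pair_n = Counter(), Counter()
    for orf in orfs:
        codons = split_codons(orf)
        for c in codons:
            codon_n[c] += 1
            aa_n[CODON_TABLE[c]] += 1
        for a, b in zip(codons, codons[1:]):
            pair_n[(a, b)] += 1
            aa_pair_n[(CODON_TABLE[a], CODON_TABLE[b])] += 1

    scores = {}
    for (a, b), n_ab in pair_n.items():
        x, y = CODON_TABLE[a], CODON_TABLE[b]
        expected = codon_n[a] * codon_n[b] / (aa_n[x] * aa_n[y]) * aa_pair_n[(x, y)]
        scores[(a, b)] = math.log(n_ab / expected)
    return scores


def codon_pair_bias(orf, scores):
    """CPB = sum(CPS_i) / (k - 1) over the k scored codon pairs of one ORF."""
    codons = split_codons(orf)
    cps = [scores[p] for p in zip(codons, codons[1:]) if p in scores]
    if len(cps) < 2:
        raise ValueError("need at least two scored codon pairs")
    return sum(cps) / (len(cps) - 1)


# Toy usage: score a short sequence against a one-sequence reference set.
reference = ["GCTGCTGCCGCC"]
scores = codon_pair_scores(reference)
cpb = codon_pair_bias("GCTGCTGCCGCC", scores)
```

In practice the reference set would be the host's full set of coding sequences, so that a viral gene's CPB reflects how well its codon pairs match the host's preferences.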