Each homologous sequence was aligned to its neORF. For query neORFs that contained an intron, the intron was removed from both the neORF and its homologous sequence in the alignment. We then examined six features that could differ between a neORF and its aligned homologous sequence (Supplemental Fig. S26): (i) start codon, (ii) stop codon, (iii) frameshift mutation, (iv) anticipated (premature) stop codon, (v) different sequence hit size, and (vi) transcription event. When a homologous sequence did not align to the entire neORF, that is, when the match started later in the sequence and/or ended earlier, it was annotated as “Different sequence hit.” When the alignment covered the full length of the neORF, the homologous sequence was searched for the start codon, the stop codon, and frameshift mutations. The presence of an anticipated stop codon was also searched in the sequence.
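The sequence-level checks described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary keys, and the coordinate convention (0-based, end-exclusive positions of the match on the neORF) are hypothetical. It also assumes that a frameshift is any gap run whose length is not a multiple of three, and that an anticipated stop codon is an in-frame stop codon located before the last codon of the homologous sequence.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_frameshift(aln_a, aln_b):
    """Assumption: a gap run whose length is not a multiple of 3 shifts the frame."""
    for seq in (aln_a, aln_b):
        for gap in re.finditer(r"-+", seq):
            if len(gap.group()) % 3 != 0:
                return True
    return False


def has_anticipated_stop(homolog, frame_offset=0):
    """Look for an in-frame stop codon that does not end exactly at the sequence end."""
    for i in range(frame_offset, len(homolog) - 3, 3):
        if homolog[i:i + 3] in STOP_CODONS:
            return True
    return False


def annotate_homolog(neorf_aln, homolog_aln, neorf_len, hit_start, hit_end):
    """Annotate one homolog from its gapped alignment to the neORF (introns removed).

    hit_start and hit_end are hypothetical 0-based, end-exclusive coordinates
    of the match on the ungapped neORF.
    """
    neorf_aln, homolog_aln = neorf_aln.upper(), homolog_aln.upper()
    if len(neorf_aln) != len(homolog_aln):
        raise ValueError("aligned sequences must have the same length")

    homolog = homolog_aln.replace("-", "")
    full_length = hit_start == 0 and hit_end == neorf_len
    features = {
        "different_sequence_hit": not full_length,
        "start_codon": None,   # only evaluated for full-length hits
        "stop_codon": None,
        "frameshift": None,
        # Read the homolog in the neORF frame, starting at the first complete codon.
        "anticipated_stop": has_anticipated_stop(homolog, (-hit_start) % 3),
    }
    if full_length:
        features["start_codon"] = homolog_aln[:3] == "ATG"
        features["stop_codon"] = homolog_aln[-3:] in STOP_CODONS
        features["frameshift"] = has_frameshift(neorf_aln, homolog_aln)
    return features
```

For example, `annotate_homolog("ATGAAATTTTAA", "ATGA-ATTTTAA", 12, 0, 12)` reports a frameshift because the homolog carries a single-base gap, while a hit covering only positions 3 to 12 of the neORF is flagged as a different sequence hit and its start codon, stop codon, and frameshift fields are left unevaluated.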
Finally, the genomic positions of the homologous sequences were mapped to their respective reference genomes, their orientation in the genome was assessed, and the presence of a transcription event was annotated.
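The positional step can be sketched in the same spirit. The text does not specify how orientation or transcription events were determined, so this example rests on two labeled assumptions: hit coordinates are 1-based and inclusive, with reversed coordinates indicating the minus strand (as in BLAST tabular output), and a transcription event is any annotated transcript overlapping the hit on the same strand. The names `Interval`, `hit_interval`, and `has_transcription_event` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def hit_interval(chrom, sstart, send):
    """Convert 1-based inclusive hit coordinates to a stranded interval.

    Assumption: sstart > send means the hit lies on the minus strand.
    """
    strand = "+" if sstart <= send else "-"
    return Interval(chrom, min(sstart, send) - 1, max(sstart, send), strand)


def has_transcription_event(hit, transcripts):
    """Assumption: a same-strand overlap with any transcript counts as an event."""
    return any(
        t.chrom == hit.chrom
        and t.strand == hit.strand
        and t.start < hit.end
        and hit.start < t.end
        for t in transcripts
    )
```

A hit reported as `("chr2L", 200, 101)` would thus be placed on the minus strand at positions 100 to 200, and it would be annotated with a transcription event only if a minus-strand transcript on `chr2L` overlaps that interval.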