To study population variation within uORFs, we merged the genomic intervals of human uORFs and excluded the regions overlapping with CDSs or repeats. We then extracted the SNPs that overlap these uORF regions from the phase 3 data of the 1000 Genomes Project. SNPs in the CDS-overlapping portion of oORFs were excluded. We annotated the effect of each SNP on human uORFs (nonsynonymous or synonymous) using custom scripts and excluded ambiguous SNPs, that is, SNPs annotated as nonsynonymous in one uORF and synonymous in another. For comparison, we also extracted the SNPs in CDS regions and determined their effect on CDSs using SnpEff (ref. 154). The same analysis was performed for the uORFs of D. melanogaster using the freeze 2 data of the Drosophila Genetic Reference Panel (ref. 99).
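The custom annotation scripts are not reproduced here. As a minimal sketch of the classification and ambiguity filter described above, the following Python assumes that each uORF is supplied as its sense-strand sequence starting at the start codon, that each SNP is given as a 0-based offset within that sequence, and that the standard genetic code applies. The function names `classify_snp` and `unambiguous_effects` and the SNP identifiers are hypothetical. Changes that create or remove a stop codon are counted as nonsynonymous in this sketch, which may differ from the original scripts.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(uorf_seq, offset, alt):
    """Classify a SNP in a uORF (hypothetical helper).

    uorf_seq: sense-strand uORF sequence, in frame from the start codon.
    offset:   0-based position of the SNP within uorf_seq.
    alt:      alternative base on the sense strand.
    """
    pos_in_codon = offset % 3
    codon_start = offset - pos_in_codon
    ref_codon = uorf_seq[codon_start:codon_start + 3]
    alt_codon = ref_codon[:pos_in_codon] + alt + ref_codon[pos_in_codon + 1:]
    same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same_aa else "nonsynonymous"


def unambiguous_effects(annotations):
    """Keep SNPs whose effect agrees across every uORF they overlap.

    annotations: iterable of (snp_id, effect) pairs, one pair per
    SNP-uORF overlap. SNPs with conflicting effects are dropped.
    """
    effects = {}
    for snp_id, effect in annotations:
        effects.setdefault(snp_id, set()).add(effect)
    return {snp: next(iter(e)) for snp, e in effects.items() if len(e) == 1}


# Toy uORF: ATG GCT TAA (Met-Ala-stop).
toy_uorf = "ATGGCTTAA"
third_position = classify_snp(toy_uorf, 5, "C")   # GCT -> GCC
first_position = classify_snp(toy_uorf, 3, "A")   # GCT -> ACT
kept = unambiguous_effects([
    ("snpA", "synonymous"), ("snpA", "nonsynonymous"),  # conflicting
    ("snpB", "synonymous"), ("snpB", "synonymous"),     # consistent
])
```

In the toy example, the third-position change leaves alanine unchanged, the first-position change replaces alanine with threonine, and only the consistently annotated SNP survives the filter.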