To analyze the level of genome conservation [40, 73] for each AS core set and event type within Brassicaceae, we first generated chain genome alignments (liftOver files) between A. thaliana and four closely related species: Arabidopsis lyrata (v.1.0, estimated split 7.1 million years ago [MYA] [74]), Camelina sativa (Cs, 9.4 MYA), Arabis alpina (A_alpina_V4, 25.6 MYA), and Brassica rapa (Brapa_1.0, 25.6 MYA). For this purpose, we downloaded the genome sequences of these four species from Ensembl Plants v48 and generated the chain alignments following the UCSC pipeline, using blat with the -minIdentity=80 and -minScore=50 parameters (the full pipeline is available at https://github.com/vastdb-pastdb/pastdb). Next, we lifted the coordinates of all vast-tools events from A. thaliana, by AS type, and assessed their presence in each of the other four genomes (“genome conservation”). Following previous studies [25], for alternative exons, we further required that the lifted exonic sequence be flanked by at least one canonical 5′ (GT or GC) or 3′ (AG) splice site dinucleotide in the target species. For alternative 5′ and 3′ splice site choices, we required the canonical splice site dinucleotide to be conserved. Finally, for IR events, we required the lifted region to intersect with an annotated intron (from the Ensembl Plants v48 annotations) in the target species, as determined with bedtools intersect. These results are shown in Additional file 2: Figure S10.
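
The two post-liftOver filters described above can be illustrated with a minimal Python sketch. It is not the code from the pastdb repository: the function names, the in-memory sequence string, and the 0-based half-open coordinate convention are all assumptions made for illustration. The first function checks whether a lifted exon is flanked by at least one canonical splice site dinucleotide (AG upstream or GT/GC downstream, read on the exon's strand); the second mimics a plain overlap test of at least 1 bp against annotated introns.

```python
# Hypothetical helpers; names and coordinate conventions are assumptions.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_canonical_splice_site(chrom_seq, start, end, strand):
    """Return True if the lifted exon has a canonical 3' (AG) or 5' (GT/GC) site.

    chrom_seq: target-species chromosome sequence (plus strand).
    start, end: 0-based, half-open exon coordinates on the plus strand.
    strand: "+" or "-".
    """
    left = chrom_seq[max(start - 2, 0):start].upper()   # 2 nt before the exon
    right = chrom_seq[end:end + 2].upper()              # 2 nt after the exon
    if strand == "+":
        acceptor, donor = left, right
    else:
        # On the minus strand the flanks swap roles and are read reverse-complemented.
        acceptor, donor = revcomp(right), revcomp(left)
    return acceptor == "AG" or donor in ("GT", "GC")


def overlaps_annotated_intron(start, end, introns):
    """Return True if [start, end) overlaps any (intron_start, intron_end) by >= 1 bp."""
    return any(s < end and start < e for s, e in introns)


# Toy example: exon "ATGAAA" at positions 4..10, flanked by AG and GT.
toy = "CCAG" + "ATGAAA" + "GTCC"
exon_ok = has_canonical_splice_site(toy, 4, 10, "+")
```

Requiring only one of the two flanking dinucleotides, rather than both, follows the "at least one" criterion stated in the text.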

To assess the overlap among the AS core sets from A. thaliana and the three animal species, we used ExOrthist v0.0.1.beta (https://github.com/biocorecrg/ExOrthist). This software is designed to identify orthologous exons based on intron position-aware pairwise protein alignments (as previously described in [28, 32]). ExOrthist uses clusters of gene orthologs, which were obtained using Broccoli [75] with the set of translated reference transcripts (one per gene) for A. thaliana, C. elegans, D. melanogaster, and H. sapiens. Clusters of orthologous exons were then obtained for all annotated exons plus all vast-tools exons (added through the --extraexons argument) for each species, using the default settings for “long evolutionary distance” for all pairwise species comparisons (long_dist, minimum exonic sequence similarity [ex_sim] = 0.20, maximum ratio difference between exon lengths [ex_len] = 0.65 and overall protein sequence similarity [prot_sim] = 0.15; see https://github.com/biocorecrg/ExOrthist for further details). Finally, to generate the four-way Venn diagrams in Additional file 2: Figure S29, we assessed the overlap of each core set of AS events from each species with the others at two levels (code available at https://github.com/vastdb-pastdb/pastdb). First, we checked whether any orthologous gene (defined as being in the same gene cluster) in the target species also hosted an AS event (of any type) belonging to the same core set (i.e., abiotic stress, biotic stress or tissues). Second, we assessed whether the exact splice site(s) affected by the AS event were also affected by an AS event (of any type) of the same core set in the target species. In the case of 5′ UTR AS events, these were conservatively considered to fall in the same splice site if orthologous genes in the two species had a core set AS event in the 5′ UTR.
It should be kept in mind that these overlaps do not imply evolutionary conservation; rather, for such distantly related species, the overlaps are likely the result of convergent evolution.
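
The gene-level overlap (the first of the two levels above) can be sketched as follows. This is an illustrative simplification, not the code from the pastdb repository: it counts ortholog clusters rather than individual genes, and the function names, species labels, and gene and cluster identifiers are all hypothetical. Each species' core set is mapped to the set of gene clusters hosting at least one of its events, and each cluster is then assigned to the Venn region defined by the exact combination of species in which it appears.

```python
# Hypothetical sketch; identifiers and the cluster-level counting are assumptions.

def clusters_with_core_events(gene_to_cluster, core_genes):
    """Map genes hosting core-set AS events to their ortholog cluster IDs."""
    return {gene_to_cluster[g] for g in core_genes if g in gene_to_cluster}


def venn_regions(cluster_sets):
    """Count clusters per exclusive Venn region.

    cluster_sets: dict mapping species -> set of cluster IDs.
    Returns a dict mapping frozenset(species) -> number of clusters found
    in exactly that combination of species.
    """
    membership = {}
    for species, clusters in cluster_sets.items():
        for cluster in clusters:
            membership.setdefault(cluster, set()).add(species)
    counts = {}
    for species_group in membership.values():
        key = frozenset(species_group)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Toy data: three species, three gene clusters (all identifiers are made up).
gene_to_cluster = {"AT1": "GF1", "AT2": "GF2", "HS1": "GF1", "HS2": "GF3", "DM1": "GF1"}
core_genes = {"Ath": {"AT1", "AT2"}, "Hsa": {"HS1", "HS2"}, "Dme": {"DM1"}}

cluster_sets = {
    species: clusters_with_core_events(gene_to_cluster, genes)
    for species, genes in core_genes.items()
}
regions = venn_regions(cluster_sets)
```

The same region-counting logic would apply to the splice-site level if cluster IDs were replaced by identifiers of orthologous exons or splice sites.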
