Phylogenetic analysis of sleep-relevant genes
Junko Kusumi, Hiroyuki J. Kanaya, Taichi Q. Itoh
Procedures
・Search for orthologous genes
We searched OrthoDB release 10 (https://www.orthodb.org/) (Kriventseva et al., 2019) for ortholog sequences. OrthoDB is a comprehensive catalog of orthologs, including putative orthologs among 448 metazoan, 117 plant, 549 fungal, 148 protist, 5609 bacterial, and 404 archaeal genomes. We searched for the ortholog groups that include the sleep-relevant genes, using the genes of Drosophila melanogaster as queries (annotation keywords and FlyBase identifiers). Next, from each identified ortholog group, we retrieved the protein sequences of Homo sapiens, Mus musculus, D. melanogaster, Caenorhabditis elegans, and six cnidarian species (Hydra vulgaris, Nematostella vectensis, Exaiptasia pallida, Orbicella faveolata, Stylophora pistillata, and Acropora digitifera).
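The species selection step can be sketched in a few lines of Python. This is only an illustration: the record layout (species name, protein identifier, sequence) and the example identifiers are hypothetical and do not reflect the actual OrthoDB export format.

```python
# Target species named in this protocol (full binomial names).
TARGET_SPECIES = {
    "Homo sapiens",
    "Mus musculus",
    "Drosophila melanogaster",
    "Caenorhabditis elegans",
    "Hydra vulgaris",
    "Nematostella vectensis",
    "Exaiptasia pallida",
    "Orbicella faveolata",
    "Stylophora pistillata",
    "Acropora digitifera",
}


def select_target_proteins(group_members, targets=TARGET_SPECIES):
    """Keep only the members of one ortholog group that belong to a target species.

    group_members: iterable of (species_name, protein_id, sequence) tuples.
    This tuple layout is an assumption made for the sketch.
    """
    return [member for member in group_members if member[0] in targets]


# Made-up records standing in for one ortholog group.
example_group = [
    ("Drosophila melanogaster", "protA", "MKT"),
    ("Danio rerio", "protB", "MKS"),
    ("Hydra vulgaris", "protC", "MRT"),
]
selected = select_target_proteins(example_group)
print([protein_id for _, protein_id, _ in selected])
```

Matching on the full species name avoids ambiguity between abbreviated names that share a genus initial.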
・Alignment and construction of phylogenetic trees
For each ortholog group, the identified ortholog sequences were aligned using the MUSCLE program implemented in MEGA X (Kumar et al., 2018). Sequences that were too short to align, or that lacked complete expected conserved domains, were removed. A phylogenetic tree was then constructed for each ortholog group by the neighbor-joining method (Saitou & Nei, 1987) with complete deletion of gaps. We excluded sequences that showed excessively long branches, because such sequences probably represent diverged paralogs or contain sequencing errors. Finally, we considered the sequences that passed the above criteria to be orthologous genes.
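To make "complete deletion of gaps" concrete, the sketch below removes every alignment column in which at least one sequence has a gap, then computes a simple pairwise distance on the remaining sites. It is not MEGA X code. The p-distance is used here only as an assumed, illustrative distance measure, since the protocol does not state which substitution model was used, and the toy sequences are invented.

```python
def complete_deletion(alignment, gap_chars="-"):
    """Drop every alignment column in which any sequence has a gap.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    # Indices of columns that are gap-free in all sequences.
    keep = [i for i in range(length) if all(s[i] not in gap_chars for s in seqs)]
    return {name: "".join(s[i] for i in keep) for name, s in alignment.items()}


def p_distance(a, b):
    """Proportion of differing sites between two gap-free aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Invented toy alignment.
toy_alignment = {
    "seq1": "MKT-LLA",
    "seq2": "MKTALLA",
    "seq3": "M-TALIA",
}
trimmed = complete_deletion(toy_alignment)
print(trimmed)
print(p_distance(trimmed["seq1"], trimmed["seq3"]))
```

Because a single gapped sequence removes a column for all sequences, very short or fragmentary sequences shrink the usable alignment for the whole group, which is one reason to remove them before tree construction.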
・Estimation of pairwise percent identity between protein sequences
Pairwise percent identities between protein sequences were calculated using BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi) for the sequences of H. sapiens, M. musculus, D. melanogaster, and H. vulgaris.
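As an illustration of what the reported value means, percent identity over a pairwise alignment is the number of identical aligned positions divided by the alignment length (gapped columns included), multiplied by 100. The sketch below applies that definition to an already aligned pair; it does not perform the alignment itself, and the sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences.

    Identical, non-gap positions are counted and divided by the full
    alignment length, so gapped columns lower the score.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Invented aligned pair: 5 identical positions out of 7 columns.
identity = percent_identity("MKT-LLA", "MKTALIA")
print(round(identity, 1))
```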
・GO enrichment analysis
GO enrichment analysis for differentially expressed genes was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) version 6.8 (http://david.abcc.ncifcrf.gov/). Before input to DAVID, Hydra genes were converted to their corresponding orthologues in H. sapiens, M. musculus, or D. melanogaster. The conversion threshold in protein BLAST was set at sequence identity ≥40% and e-value < 0.001. All orthologues of the individual genes covered by the transcriptome analysis were used as the background in DAVID.
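The conversion threshold can be applied to BLAST tabular output (the default 12-column layout of `-outfmt 6`) as in the sketch below. Two points are assumptions rather than part of the protocol: that the BLAST results are available in tabular form, and that the hit with the highest bit score is kept when a Hydra gene has several hits passing the threshold. The gene and protein identifiers are invented.

```python
import csv
import io


def filter_hits(tabular_text, min_identity=40.0, max_evalue=1e-3):
    """Return the best passing hit per query from BLAST tabular output.

    Columns (default -outfmt 6): qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    A hit passes if identity >= min_identity and e-value < max_evalue.
    Keeping the highest bit score per query is an assumption of this sketch.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        identity, evalue, bitscore = float(row[2]), float(row[10]), float(row[11])
        if identity >= min_identity and evalue < max_evalue:
            if query not in best or bitscore > best[query][3]:
                best[query] = (subject, identity, evalue, bitscore)
    return best


# Invented example rows.
rows = [
    ["hydra_g1", "humanP1", "55.2", "300", "130", "4", "1", "300", "5", "304", "1e-50", "250"],
    ["hydra_g1", "humanP2", "42.0", "280", "160", "2", "1", "280", "9", "288", "1e-20", "120"],
    ["hydra_g2", "humanP3", "35.0", "200", "130", "0", "1", "200", "1", "200", "1e-10", "80"],
    ["hydra_g3", "humanP4", "60.0", "50", "20", "0", "1", "50", "1", "50", "0.01", "30"],
]
example_text = "\n".join("\t".join(r) for r in rows)
hits = filter_hits(example_text)
for query, (subject, identity, evalue, bitscore) in hits.items():
    print(query, subject, identity, evalue)
```

In this example, hydra_g2 fails the identity cutoff and hydra_g3 fails the e-value cutoff, so only hydra_g1 would be carried forward.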
References
Kriventseva, E. V., Kuznetsov, D., Tegenfeldt, F., Manni, M., Dias, R., Simão, F. A., & Zdobnov, E. M. (2019). OrthoDB v10: Sampling the diversity of animal, plant, fungal, protist, bacterial and viral genomes for evolutionary and functional annotations of orthologs. Nucleic Acids Research, 47(D1). https://doi.org/10.1093/nar/gky1053
Kumar, S., Stecher, G., Li, M., Knyaz, C., & Tamura, K. (2018). MEGA X: Molecular evolutionary genetics analysis across computing platforms. Molecular Biology and Evolution, 35(6). https://doi.org/10.1093/molbev/msy096
Saitou, N., & Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4(4). https://doi.org/10.1093/oxfordjournals.molbev.a040454