Phylogenetic tree construction

Xiaojun Zhang, Lina Sun, Jianbo Yuan, Yamin Sun, Yi Gao, Libin Zhang, Shihao Li, Hui Dai, Jean-François Hamel, Chengzhang Liu, Yang Yu, Shilin Liu, Wenchao Lin, Kaimin Guo, Songjun Jin, Peng Xu, Kenneth B. Storey, Pin Huan, Tao Zhang, Yi Zhou, Jiquan Zhang, Chenggang Lin, Xiaoni Li, Lili Xing, Da Huo, Mingzhe Sun, Lei Wang, Annie Mercier, Fuhua Li, Hongsheng Yang, Jianhai Xiang

Peptide sequences were clustered with the Markov clustering program orthoMCL [89]. These sequences were also searched against the nr database by an all-versus-all BLASTP with an E-value threshold of 1E-05 and then clustered by MCL with an inflation value of 1.5. A total of 43 single-copy orthologous genes were identified across the 17 genomes. Ortholog alignments were produced using MUSCLE (v3.6) and concatenated into a single multiple-sequence alignment by an in-house Perl script. A neighbor-joining phylogeny was reconstructed using MEGA (v5) [90].
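The in-house Perl script used for concatenation is not given here, so the following is only a minimal Python sketch of what such a step might look like. The function names `read_fasta` and `concatenate_alignments` are hypothetical, and the gap-filling of absent taxa is an assumption added for robustness; with strictly single-copy orthologs every taxon should be present in every alignment.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {taxon: aligned sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the taxon name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments end to end into one supermatrix.

    alignments: list of dicts {taxon: aligned sequence}, one dict per gene.
    taxa: ordered list of taxon names expected in the output.
    A taxon missing from a gene is padded with gaps (an assumption of this sketch).
    """
    supermatrix = {t: [] for t in taxa}
    for aln in alignments:
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError("aligned sequences must have equal length")
        width = lengths.pop()
        for t in taxa:
            supermatrix[t].append(aln.get(t, "-" * width))
    return {t: "".join(parts) for t, parts in supermatrix.items()}
```

In use, each MUSCLE output file would be read with `read_fasta` and the resulting dictionaries passed as a list to `concatenate_alignments`, together with the 17 taxon names.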

To examine the phylogeny of sea cucumbers, we used a maximum likelihood (ML) method for genome-wide phylogenetic analysis based on single-copy genes from the 10 deuterostome and 7 nondeuterostome genomes. Based on the gene clustering results from orthoMCL, 49,351 gene families were collected from the 17 species. Among them, 43 single-copy genes were used for phylogenetic tree construction.

To understand the relationship of A. japonicus to other echinoderms, we constructed another phylogenetic tree of 5 echinoderms based on orthologous genes. To extend the taxonomic sampling, gene families were surveyed in transcriptome datasets from F. serratissima, Patiria miniata, and A. filiformis (NCBI SRA database accession numbers SRR2454338, SRR573710, SRR573709, SRR573708, SRR573706, SRR573707, SRR573705, SRR573675, SRR1523743, SRR1533125, SRR794587, SRR794568, SRR789489, and SRR3097584). Transcriptome data for the 3 echinoderms were assembled into unigenes using Trinity and CAP3 with default parameters [91,92]. Single-copy gene families were extracted from the full gene sets of S. kowalevskii (outgroup), A. japonicus, and S. purpuratus. For each gene family, protein sequences from all represented sequenced genomes were searched using BLASTP (E-value cutoff 1.00E-10) against unigenes from each species. The unigene with the best score was translated as the longest open reading frame (ORF) in the frame detected by BLASTP. Gene clustering analysis was then performed on the unigenes. Finally, 2,066 orthologous genes were obtained for constructing a phylogenetic tree using the ML method.

For ML tree construction, sequence alignments were performed using MUSCLE (v3.6) [93]. The substitution models that best fit the observed alignment data were estimated using jModelTest 2 [94]. Using PhyML [95], we performed ML analysis with the substitution model WAG + gamma + Inv, and 1,000 bootstrap replicates were run to produce the branch support values.
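The translation of a best-scoring unigene as the longest ORF in the BLASTP-detected frame can be illustrated with a short Python sketch. This is not the authors' code. It assumes the standard genetic code, a frame written as 1, 2, 3 (forward strand) or -1, -2, -3 (reverse strand), and it takes "longest ORF" to mean the longest stop-free stretch in that frame without requiring a start codon, since assembled unigenes may be partial. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate_frame(seq, frame):
    """Translate seq in the given frame (1, 2, 3, -1, -2 or -3).

    Codons containing ambiguous bases are translated as 'X'.
    """
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of 1, 2, 3, -1, -2, -3")
    seq = seq.upper()
    if frame < 0:
        seq = reverse_complement(seq)
    offset = abs(frame) - 1
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def longest_orf(seq, frame):
    """Return the longest stop-free peptide in the given frame."""
    protein = translate_frame(seq, frame)
    return max(protein.split("*"), key=len)
```

For a unigene whose best hit was reported in frame -2, for example, `longest_orf(unigene_sequence, -2)` would return the peptide carried forward to the clustering step.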
