4.1. Prediction of TFPs from Starfishes

Alexander S. Paramonov
Mikhail A. Shulepko
Alexey M. Makhonin
Maxim L. Bychkov
Dmitrii S. Kulbatskii
Andrey M. Chernikov
Mikhail Yu. Myshkin
Sergey V. Shabelnikov
Zakhar O. Shenkarev
Mikhail P. Kirpichnikov
Ekaterina N. Lyukmanova

To find candidate TFPs in the starfish genomes, we first built a reference list of known, characterized TFPs. Sequences of proteins containing an LU-domain were extracted from the UniProt database (EMBL-EBI, Hinxton, UK) [66], and only sequences no longer than 190 residues were retained, to exclude proteins with more than one LU-domain. From the resulting list (153 proteins, Table S1), the LU-domain sequences were isolated and used for the search. The search was carried out against databases of starfish proteins built from the available genome assemblies (accession codes: GCA_902459465.3 for A. rubens and GCA_001949145.1 for A. planci). In total, 24,050 sequences were searched for A. rubens and 32,215 for A. planci. The search was performed with BLASTP ver. 2.13.0 (NCBI, Bethesda, MD, USA) [67], with an e-value < 10⁻⁵ as the threshold.
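The two filters described above (the 190-residue length cutoff and the e-value threshold) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the reference LU-domains are assumed to be the BLASTP queries, and the BLASTP results are assumed to be in the standard 12-column tabular format (`-outfmt 6`), which the text does not specify.

```python
import io

MAX_LEN = 190      # length cutoff stated in the text
EVALUE_MAX = 1e-5  # e-value threshold stated in the text


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def single_domain_candidates(records, max_len=MAX_LEN):
    """Keep only sequences no longer than max_len residues."""
    return [(h, s) for h, s in records if len(s) <= max_len]


def significant_hits(tabular_lines, evalue_max=EVALUE_MAX):
    """Return subject IDs that have at least one hit below the e-value threshold.

    Assumption: BLAST tabular output (-outfmt 6), where column 2 is the
    subject ID and column 11 is the e-value.
    """
    hits = set()
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if float(fields[10]) < evalue_max:
            hits.add(fields[1])
    return hits


if __name__ == "__main__":
    # Toy input only; real data would come from UniProt and the BLASTP run.
    demo = io.StringIO(">ref1\nLKCYTC\n>ref2\n" + "A" * 200 + "\n")
    print(single_domain_candidates(read_fasta(demo)))
```

Collecting subject IDs into a set reflects the fact that several reference domains can hit the same starfish protein; each such protein is then counted once.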

LU-domains were isolated from the proteins found and combined with the list of reference LU-domains. The resulting sequences were subjected to multiple sequence alignment (MSA) using ClustalW ver. 2.1 (EMBL-EBI, Hinxton, UK) [68]. A modified substitution matrix was used for the alignment: a matrix similar to BLOSUM62, in which a high score (99) was assigned to cysteine-cysteine pairs and a low score (−4) to substitutions of cysteine with any other amino acid residue. The guide tree obtained from the MSA was used to visualize the homology between the sequences, and the tree was drawn with the iTOL ver. 6.0 service (Biobyte Solutions, Heidelberg, Germany) [69]. To calculate pairwise similarity between sequences, the residues were divided into groups: hydrophobic (A, F, H, I, L, M, P, V, W), cysteine (C), polar (G, N, Q, S, T, Y), positively charged (K, R), and negatively charged (D, E). Similarity was calculated as the percentage of aligned positions at which the two residues belong to the same group.
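The cysteine re-weighting and the group-based similarity measure can be illustrated with a short Python sketch. It is a minimal sketch and not the authors' code: the function names are hypothetical, the substitution matrix is represented as a plain dictionary of residue pairs, and the handling of gaps is an assumption (columns where both sequences have a gap are skipped, and a residue aligned to a gap counts as a mismatch), since the text does not say how gaps were treated.

```python
# Residue groups exactly as listed in the text.
GROUPS = {
    "hydrophobic": "AFHILMPVW",
    "cysteine": "C",
    "polar": "GNQSTY",
    "positive": "KR",
    "negative": "DE",
}
RESIDUE_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def cysteine_weighted(matrix, cys_cys=99, cys_other=-4):
    """Return a copy of a substitution matrix with overridden cysteine scores.

    `matrix` maps (residue, residue) pairs to integer scores, e.g. BLOSUM62
    loaded into a dict. C-C pairs get `cys_cys`; any pair with exactly one
    cysteine gets `cys_other`. All other entries are left unchanged.
    """
    modified = dict(matrix)
    for a, b in matrix:
        if a == "C" and b == "C":
            modified[(a, b)] = cys_cys
        elif a == "C" or b == "C":
            modified[(a, b)] = cys_other
    return modified


def group_similarity(seq_a, seq_b, gap="-"):
    """Percentage of aligned positions whose residues fall in the same group.

    Assumption (not stated in the text): columns with a gap in both
    sequences are ignored; a residue opposite a gap is a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        group_a = RESIDUE_GROUP.get(a)
        if group_a is not None and group_a == RESIDUE_GROUP.get(b):
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # A/V, C/C, K/R and D/E share a group; the gap column is a mismatch.
    print(group_similarity("ACK-D", "VCRGE"))  # 4 of 5 columns match -> 80.0
```

Raising the cysteine-cysteine score far above every other entry pushes the aligner to keep the conserved cysteines in register, which is presumably why the matrix was modified; the grouped similarity then compares residue classes rather than exact residue identity.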

To predict the signal peptide and the glycosylphosphatidylinositol (GPI) anchoring site in the Lystar5 sequence, the SignalP ver. 6.0 (Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark) [70] and PredGPI ver. 1.0 (Biocomputing Group, Department of Biology, University of Bologna, Bologna, Italy) [71] web services were used.
