Protein structure prediction and classification

Feng Li; Jing-Jing Yang; Zong-Yi Sun; Lei Wang; Le-Yao Qi; Sina A; Yi-Qun Liu; Hong-Mei Zhang; Lei-Fan Dang; Shu-Jing Wang; Chun-Xiong Luo; Wei-Feng Nian; Seth O’Conner; Long-Zhen Ju; Wei-Peng Quan; Xiao-Kang Li; Chao Wang; De-Peng Wang; Han-Li You; Zhu-Kuan Cheng; Jia Yan; Fu-Chou Tang; De-Chang Yang; Chu-Wei Xia; Ge Gao; Yan Wang; Bao-Cai Zhang; Yi-Hua Zhou; Xing Guo; Sun-Huan Xiang; Huan Liu; Tian-Bo Peng; Xiao-Dong Su; Yong Chen; Qi Ouyang; Dong-Hui Wang; Da-Ming Zhang; Zhi-Hong Xu; Hong-Wei Hou; Shu-Nong Bai; Ling Li



This method is extracted from research article: PNAS Nexus, Apr 2023

Plant-on-chip: Core morphogenesis processes in the tiny plant Wolffia australiana

DOI: 10.1093/pnasnexus/pgad141


The predicted mRNAs were first translated into protein sequences. The MMseqs2 program suite (32) was used for the subsequent protein sequence analysis. Its cluster submodule was used to cluster the sequences and reduce redundancy in sequence space, with the e-value threshold set to 1e⁻⁵. The representative sequence of each cluster was then compared with sequences in structure databases, i.e. PDB (52, 53) and the AlphaFold Protein Structure Database (31), using the search submodule, again with an e-value threshold of 1e⁻⁵. As a result, 6,800 nonhomologous protein sequences remained for structure prediction.
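The filtering step can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: it assumes the search results were exported in BLAST-style 12-column tabular format (query ID in column 1, e-value in column 11), and it assumes that "nonhomologous" means a representative sequence with no hit in the structure databases at the stated threshold. The function names and the toy records are hypothetical.

```python
from typing import Iterable, List, Set

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def queries_with_hits(m8_lines: Iterable[str],
                      cutoff: float = E_VALUE_CUTOFF) -> Set[str]:
    """Collect query IDs with at least one hit at or below the cutoff.

    Assumption: BLAST-style tab-separated rows, where field 1 is the
    query ID and field 11 is the e-value.
    """
    matched: Set[str] = set()
    for line in m8_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        query_id, e_value = fields[0], float(fields[10])
        if e_value <= cutoff:
            matched.add(query_id)
    return matched


def without_structural_homologs(representatives: Iterable[str],
                                m8_lines: Iterable[str],
                                cutoff: float = E_VALUE_CUTOFF) -> List[str]:
    """Keep representatives that have no qualifying hit (assumed meaning
    of 'nonhomologous' here)."""
    matched = queries_with_hits(m8_lines, cutoff)
    return [rep for rep in representatives if rep not in matched]


# Toy data (hypothetical IDs): rep1 has a strong hit, rep3 only a weak one.
reps = ["rep1", "rep2", "rep3"]
m8 = [
    "rep1\t1abc_A\t45.0\t200\t110\t0\t1\t200\t1\t200\t1e-30\t120",
    "rep3\t2xyz_B\t25.0\t50\t37\t1\t1\t50\t1\t50\t0.5\t20",
]
remaining = without_structural_homologs(reps, m8)
print(remaining)  # → ['rep2', 'rep3']
```

If the search itself is already run with the 1e⁻⁵ threshold, the cutoff check above is redundant and only guards against results exported with a looser setting.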

A non-Docker version of AlphaFold2 (30) was deployed for speed and scalability. The input features required for prediction (i.e. multiple sequence alignments) were first generated on a distributed cluster of machines without GPUs. Structure prediction by the neural network and refinement by molecular dynamics were both run on machines with graphics cards, with one graphics card assigned to each task to speed up computation. In total, we obtained 6,798 predicted structures and their associated information; prediction failed for the remaining two sequences because of video memory limitations.
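The "one graphics card per task" arrangement can be illustrated with a small scheduling sketch. The source does not describe the scheduler that was used, so everything here is an assumption: the function names are hypothetical, tasks are grouped into waves so that no GPU is shared within a wave, and each task would be launched with the standard `CUDA_VISIBLE_DEVICES` environment variable restricting it to its assigned card. The actual prediction command is deliberately left out.

```python
from typing import Dict, List, Sequence, Tuple


def plan_waves(targets: Sequence[str],
               n_gpus: int) -> List[List[Tuple[str, int]]]:
    """Group targets into waves of at most n_gpus tasks.

    Within a wave every task gets a distinct GPU index, so each task
    has exactly one card to itself.
    """
    if n_gpus < 1:
        raise ValueError("at least one GPU is required")
    waves = []
    for start in range(0, len(targets), n_gpus):
        batch = targets[start:start + n_gpus]
        waves.append([(target, gpu) for gpu, target in enumerate(batch)])
    return waves


def env_for_gpu(gpu_index: int) -> Dict[str, str]:
    """Environment override that exposes a single GPU to a child process."""
    return {"CUDA_VISIBLE_DEVICES": str(gpu_index)}


# Toy example with hypothetical target names and two GPUs.
for wave in plan_waves(["prot_a", "prot_b", "prot_c"], n_gpus=2):
    for target, gpu in wave:
        print(target, env_for_gpu(gpu))
```

A fixed-wave plan is the simplest way to show the constraint; a work queue that hands the next target to whichever GPU finishes first would use the cards more evenly when prediction times vary.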

To identify the superfamily of each predicted structure, we used DaliLite.v5 (54), a standalone program for protein structural alignment using the Dali method, to compare the predicted structures with the representative structures of superfamilies provided by SCOPE (55, 56). The all-against-all structural comparisons were performed with default parameters. For each query, the hit with the highest Z-score was taken as the best hit, and the query protein was assigned to the same superfamily as its best hit.
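The best-hit rule can be written down compactly. The sketch below assumes the structural alignment results have already been parsed into (query, classification, Z-score) records; parsing the DaliLite output itself is not shown. It also assumes each representative structure carries a SCOP-style `sccs` string of the form class.fold.superfamily.family, so the superfamily is its first three fields. The `Hit` type, function names, and example records are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    query: str        # ID of the predicted structure
    sccs: str         # classification of the matched representative
    z_score: float    # structural similarity Z-score


def superfamily_of(sccs: str) -> str:
    """Reduce class.fold.superfamily.family to class.fold.superfamily."""
    return ".".join(sccs.split(".")[:3])


def assign_superfamilies(hits: Iterable[Hit]) -> Dict[str, str]:
    """For each query, keep the hit with the highest Z-score and
    return the superfamily of that best hit."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.z_score > current.z_score:
            best[hit.query] = hit
    return {query: superfamily_of(hit.sccs) for query, hit in best.items()}


# Toy records (hypothetical queries and scores).
hits = [
    Hit("q1", "c.37.1.8", 12.4),
    Hit("q1", "a.4.5.1", 3.1),
    Hit("q2", "b.1.1.1", 8.0),
]
print(assign_superfamilies(hits))  # → {'q1': 'c.37.1', 'q2': 'b.1.1'}
```

Ties are resolved in favor of the first record seen; the source does not say how ties or low-scoring best hits were handled, so a minimum Z-score filter would be an addition beyond the described method.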

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
