2.2 CNN-based supervised subtomogram classification

Min Xu, Xiaoqi Chai, Hariank Muthakana, Xiaodan Liang, Ge Yang, Tzviya Zeev-Ben-Mordehai, Eric P. Xing

When using a CNN for subtomogram classification, the input of the CNN is a 3D subtomogram f, a cubic 3D image defined as a function f: ℝ³ → ℝ. The output of the CNN is a vector o := (o1, …, oL), giving the probability that f belongs to each of the L classes defined in the training data. Each class corresponds to one macromolecular complex. Given o, the predicted class is argmax_i oi.
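The mapping from the network's raw outputs to a predicted class can be sketched in plain numpy. The logit values below are illustrative, not taken from the paper:

```python
import numpy as np

def softmax(logits):
    """Convert raw network outputs (logits) into class probabilities."""
    z = logits - np.max(logits)      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical logits for L = 4 macromolecular-complex classes.
logits = np.array([1.2, 0.3, 2.8, -0.5])
o = softmax(logits)                  # o = (o1, ..., oL), sums to 1
predicted_class = int(np.argmax(o))  # argmax_i oi

assert np.isclose(o.sum(), 1.0)
assert predicted_class == 2          # the class with the largest logit
```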

In this article, we propose two 3D CNN models, based on GoogLeNet and VGGNet respectively, for supervised subtomogram classification, and adapt them for structural feature extraction.

In this section, we propose a tailored 3D variant of the inception network (Szegedy et al., 2016a), denoted Inception3D. The inception network is a recent, successful CNN architecture that achieves competitive performance at relatively low computational cost (Szegedy et al., 2016a). The architecture of our model is shown in Figure 1a. It contains one inception module (Szegedy et al., 2016a), in which 1 × 1 × 1, 3 × 3 × 3, and 5 × 5 × 5 3D filters are combined with a 2 × 2 × 2 3D max pooling layer. The filters are applied in parallel and their outputs concatenated, so that features extracted at multiple scales by filters of different sizes are presented simultaneously to the following layer. The 1 × 1 × 1 filters before the 3 × 3 × 3 and 5 × 5 × 5 convolutions serve as dimension reduction. The inception module is followed by a 2 × 2 × 2 average pooling layer, then by a fully connected output layer whose number of units equals the number of structure classes. All hidden layers are equipped with rectified linear (ReLU) activations. The output is a fully connected layer followed by a softmax activation layer.
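The parallel-branch concatenation at the heart of the inception module can be illustrated with numpy shape arithmetic. The branch channel counts below are illustrative assumptions, not the exact configuration of Figure 1a:

```python
import numpy as np

# Illustrative channel counts for the four parallel branches.
D = H = W = 16  # spatial size of the input feature map
branch_channels = {"1x1x1": 16, "3x3x3": 32, "5x5x5": 8, "maxpool_proj": 8}

# With 'same' padding, every branch preserves the D x H x W spatial grid,
# so the branch outputs differ only in their channel dimension.
branches = [np.zeros((D, H, W, c)) for c in branch_channels.values()]

# The inception module concatenates the branch outputs along the channel
# axis, presenting multi-scale features to the next layer simultaneously.
merged = np.concatenate(branches, axis=-1)

assert merged.shape == (16, 16, 16, sum(branch_channels.values()))  # 64 channels
```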

Architectures of our CNN models. Both networks stack multiple layers; each box represents one layer, with its type and configuration listed inside. For example, ‘32-5 × 5 × 5-1 Conv’ denotes a 3D convolutional layer with 32 filters of size 5 × 5 × 5 and stride 1. ‘2 × 2 × 2-2 MaxPool’ denotes a 3D max pooling layer applying the max operation over 2 × 2 × 2 regions with stride 2. ‘FC-512’ and ‘FC-L’ denote fully connected linear layers with 512 and L neurons, respectively, where every neuron is connected to every output of the previous layer; L is the number of classes in the training dataset. ‘ReLU’ and ‘Softmax’ denote different types of activation layers.

In this section, we propose a tailored 3D variant of VGGNet (Simonyan and Zisserman, 2014), another CNN architecture that achieved top classification accuracy on popular image benchmark datasets. Our model is denoted Deep Small Receptive Field (DSRF3D). The architecture of our model is shown in Figure 1b. Compared with the Inception3D model, DSRF3D features deeper layers and very small 3D convolution filters of size 3 × 3 × 3. Stacking multiple small filters has the same effect as one large filter, with the advantages of fewer parameters to train and more non-linearity (Simonyan and Zisserman, 2014). The architecture consists of four 3 × 3 × 3 3D convolutional layers and two 2 × 2 × 2 3D max pooling layers, followed by two fully connected layers, then by a fully connected output layer whose number of units equals the number of structure classes. All hidden layers are equipped with ReLU activation layers. The output is a fully connected layer with a softmax activation layer.
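The equivalence between stacked small filters and one large filter comes down to simple arithmetic: two stacked 3 × 3 × 3 convolutions cover the same 5 × 5 × 5 receptive field as a single 5 × 5 × 5 convolution, but with fewer weights and an extra ReLU in between. A sketch of the comparison, using an illustrative channel count rather than the paper's exact configuration:

```python
# Why stacking small 3x3x3 filters is cheaper than one large filter.

C = 32  # assume C input and C output channels for each layer (illustrative)

def conv3d_params(k, c_in, c_out):
    """Number of weights in a k x k x k 3D convolution (biases omitted)."""
    return k ** 3 * c_in * c_out

def receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return layers * (kernel - 1) + 1

# Two stacked 3x3x3 layers see the same 5x5x5 region as one 5x5x5 layer...
assert receptive_field(3, 2) == receptive_field(5, 1) == 5

# ...but with far fewer parameters: 2 * 27 * C^2 vs 125 * C^2.
stacked = 2 * conv3d_params(3, C, C)  # 54 * C**2
single = conv3d_params(5, C, C)       # 125 * C**2
assert stacked < single
```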
