2.4 Material Attribute-Category Convolutional Neural Network Architecture

This protocol is extracted from the research article:

Learning Medical Materials From Radiography Images
**Front Artif Intell**, Jun 18, 2021; DOI: 10.3389/frai.2021.638299

Procedure

The *material attribute-category convolutional neural network* (MAC-CNN) is an end-to-end convolutional neural network that directly learns the *K* material categories while simultaneously learning the *M* material attributes embedded by $\mathbf{A}$. We improve on the MAC-CNN design in Schwartz and Nishino (2020) by updating the architecture to classify medical materials more robustly. Figure 3 shows the architecture of our MAC-CNN.

Figure 3. The MAC-CNN architecture. The network uses convolutional layers from ResNet34 (He et al., 2015) followed by sequential 512-node and 2048-node fully connected layers to predict the material category ${c}_{i}\in {\left[\mathrm{0,1}\right]}^{K}$. An auxiliary network of fully connected layers also predicts the material attribute probabilities $f\left({x}_{i}\right)$.

The MAC-CNN in Schwartz and Nishino (2020) used VGG-16 (Simonyan and Zisserman, 2014) as its backbone architecture. However, to maintain consistency with the D-CNN and use a more powerful architecture, we introduce an updated version of the MAC-CNN built on ResNet34 (He et al., 2015). ResNet remains reliable at greater depths because its residual connections mitigate the vanishing gradient problem. Consequently, compared to a deeper version of the VGG network, a deeper version of ResNet could give the MAC-CNN greater predictive power, which could be useful for complex medical material problems. As with any model with more parameters, this added capacity comes at the expense of training time.

The fully-connected layers in the ResNet network are replaced by two fully-connected layers trained from random initialization. These layers determine the *K* material category predictions as shown in Figure 3 and output a one-hot vector with the material category classification. If the D-CNN is effective at discerning expert categories and the $\mathbf{A}$ matrix encodes these categories well, then the MAC-CNN should be able to categorize expert, naïve, and null categories effectively.
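As a rough sketch of this classification head, the following NumPy code applies the 512-node and 2048-node fully connected layers from Figure 3 on top of a pooled 512-dimensional ResNet34 feature, then a final *K*-way output layer. The ReLU activations, the softmax, and the separate *K*-way output layer are assumptions for illustration; the article only specifies the two layer widths and the one-hot output.

```python
import numpy as np

def one_hot(idx, k):
    """Return a one-hot vector of length k with a 1 at position idx."""
    v = np.zeros(k)
    v[idx] = 1.0
    return v

def category_head(feat, W1, b1, W2, b2, W_out, b_out):
    """Two newly initialized fully connected layers (512 -> 2048) on top of
    the pooled ResNet34 feature, followed by an assumed K-way output layer.
    Returns class probabilities and the one-hot category prediction."""
    h1 = np.maximum(0.0, W1 @ feat + b1)   # 512-node layer, ReLU (assumed)
    h2 = np.maximum(0.0, W2 @ h1 + b2)     # 2048-node layer, ReLU (assumed)
    logits = W_out @ h2 + b_out            # K material-category logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                   # softmax over the K categories
    return probs, one_hot(int(np.argmax(probs)), len(probs))
```

With randomly initialized weights the head already produces a valid probability vector and a one-hot classification; training would fit the weights to the expert, naïve, and null categories.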

To predict the *M* material attributes, the backbone network is augmented with multiple auxiliary classifier networks. The responses from each block of the ResNet backbone, along with the initial pooling layer, are used as inputs to individual auxiliary classifier networks. An additional auxiliary classifier is used to combine each module’s prediction into a single *M*-dimensional prediction vector. The auxiliary network learns to give conditional probabilities that the patch fits each material attribute, allowing the MAC-CNN to retain features that are informative for predicting material attributes.
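The per-module auxiliary classifiers and the combining classifier can be sketched as follows. Each module's pooled response is mapped to *M* per-attribute probabilities, and the combining classifier concatenates these predictions and maps them to a single *M*-dimensional vector. The single-linear-layer form, the sigmoid activations, and the pooled feature dimensions are simplifying assumptions, not the article's exact layer configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def aux_predict(feat, W, b):
    """One auxiliary classifier: map a backbone module's pooled response to
    M per-attribute probabilities (simplified here to one linear layer)."""
    return sigmoid(W @ feat + b)

def combine_aux(per_module_preds, W_c, b_c):
    """The additional combining classifier: concatenate each module's
    M-dimensional prediction and map it to one M-dimensional vector."""
    stacked = np.concatenate(per_module_preds)
    return sigmoid(W_c @ stacked + b_c)
```

Because each term is a sigmoid output, the combined vector can be read as *M* independent conditional attribute probabilities rather than a distribution over attributes.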

The goal of the MAC-CNN is realized through training the network on image patches, like the D-CNN. However, the patches’ material categories are learned directly instead of through similarity decisions. The MAC-CNN also learns material attributes. Therefore, the weights from the D-CNN cannot be directly transferred to the MAC-CNN.

To predict the *M* discovered material attributes, the MAC-CNN uses a learned auxiliary classifier *f* with parameters $\mathbf{\Theta}$ that maps an image patch with *d* raw features to the *M* attribute probabilities. The model *f*’s mapping is given by $f\left({\mathbf{x}}_{i};\mathbf{\Theta}\right):{\mathrm{\mathbb{R}}}^{d}\to {\left[\mathrm{0,1}\right]}^{M}$. Each term in the output is a conditional probability that the patch exhibits that particular attribute.

Given a *D*-dimensional feature vector output from a hidden layer of the MAC-CNN, the *M*-dimensional material attribute prediction is computed by Eq 6. The network’s weights and biases $\mathbf{\Theta}=\left\{{\mathbf{W}}_{1},{\mathbf{W}}_{2},{\mathbf{b}}_{1},{\mathbf{b}}_{2}\right\}$ have dimensionality ${\mathbf{W}}_{1}\in {\mathrm{\mathbb{R}}}^{H\times D}$, ${\mathbf{W}}_{2}\in {\mathrm{\mathbb{R}}}^{M\times H}$, ${\mathbf{b}}_{1}\in {\mathrm{\mathbb{R}}}^{H}$, and ${\mathbf{b}}_{2}\in {\mathrm{\mathbb{R}}}^{M}$, where *H* is the dimensionality of the hidden layer.
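Eq 6 itself did not survive the extraction, but the stated weight dimensionalities determine a two-layer network. The sketch below implements that form with a hidden nonlinearity and a sigmoid output so that each of the *M* terms lies in $[0,1]$, as required of a conditional probability; the specific choice of ReLU and sigmoid is our assumption, not the article's.

```python
import numpy as np

def attribute_classifier(x, W1, b1, W2, b2):
    """Sketch of the Eq 6 mapping f(x; Theta): R^D -> [0,1]^M.
    Shapes follow the text: W1 is H x D, b1 is H, W2 is M x H, b2 is M,
    where H is the hidden-layer dimensionality. Activations are assumed."""
    h = np.maximum(0.0, W1 @ x + b1)              # hidden layer, H-dimensional
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))   # M conditional probabilities
```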

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

