2.5 Material Attribute-Category Convolutional Neural Network Training
This protocol is extracted from research article:
Learning Medical Materials From Radiography Images
Front Artif Intell, Jun 18, 2021; DOI: 10.3389/frai.2021.638299

The convolutional layers in the backbone network are pretrained on ImageNet (Deng et al., 2009) for robust feature extraction, while the fully connected layers and auxiliary network are initialized with random weights. Training optimizes these weights with respect to the target function, and it converges faster than it would if the entire network started from random weights. Fast training is important if the MAC-CNN is to be used in many different expert domains that have little correlation with one another.

As with the D-CNN, we reduce overfitting by saving the MAC-CNN model from the training epoch with the lowest validation-set loss, which is not necessarily the model from the final epoch. This allows the model to be trained for more epochs while mitigating potential overfitting later in the training process. To improve the MAC-CNN’s training convergence, we also use a learning rate scheduler that reduces the learning rate by a factor of 10 following epochs where the validation-set loss increases.
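The two rules above (keep the best-validation epoch, cut the learning rate tenfold after a validation-loss increase) can be sketched in plain Python. This is a minimal illustration, not the authors' training code: the function name `select_best_epoch`, the default initial learning rate, and the reading of "increases" as "higher than the previous epoch" are all assumptions made for the example.

```python
def select_best_epoch(val_losses, initial_lr=1e-3, factor=0.1):
    """Replay per-epoch validation losses and apply the two rules.

    val_losses stands in for the losses a real training loop would produce.
    Returns (best_epoch, best_loss, lr_history), where lr_history[e] is the
    learning rate that was in effect during epoch e.
    """
    best_epoch, best_loss = None, float("inf")
    previous_loss = None
    lr = initial_lr
    lr_history = []
    for epoch, loss in enumerate(val_losses):
        lr_history.append(lr)
        # Rule 1: remember the epoch with the lowest validation-set loss.
        # A real loop would save the model weights at this point.
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        # Rule 2 (assumed reading): if the loss rose relative to the previous
        # epoch, divide the learning rate by 10 for the following epochs.
        if previous_loss is not None and loss > previous_loss:
            lr *= factor
        previous_loss = loss
    return best_epoch, best_loss, lr_history
```

Because the best epoch is tracked separately from the last epoch, training can run past the point of lowest validation loss without the later, possibly overfitted weights being the ones kept.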

We train the network parameters Θ, dependent on the material attribute-category matrix A, to classify patches into K material categories and M material attributes simultaneously. The training set T is a set of N pairs of raw feature vectors and material category labels of the form T = {(x_i, y_i)}, i = 1, …, N, where x_i is the raw feature vector of image patch i and y_i is a one-hot encoded label vector over the K material categories. Equation 7 formalizes the definition of these training pairs.
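As a small illustration of the training pairs described above, the sketch below builds (feature vector, one-hot label) tuples from integer category indices. The helper names `one_hot` and `build_training_set` are hypothetical and chosen only for this example; the feature vectors are placeholders.

```python
def one_hot(category_index, num_categories):
    """Return a length-K vector with 1.0 at category_index and 0.0 elsewhere."""
    if not 0 <= category_index < num_categories:
        raise ValueError("category_index must lie in [0, num_categories)")
    return [1.0 if k == category_index else 0.0 for k in range(num_categories)]


def build_training_set(patches, category_indices, num_categories):
    """Pair each patch's raw feature vector with its one-hot category label."""
    if len(patches) != len(category_indices):
        raise ValueError("need exactly one category index per patch")
    return [(x, one_hot(c, num_categories))
            for x, c in zip(patches, category_indices)]
```

For example, with K = 3 categories, a patch labeled with category 2 is paired with the label vector [0.0, 0.0, 1.0].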

The loss function and minimization objective for the MAC-CNN are given in Eq. 8, which follows from the loss function used in Schwartz and Nishino (2020). The loss function combines the negative log-likelihood of the K material category predictions for each image patch x_i ∈ T with two weighted terms.

The γ1-weighted term represents the KL-divergence between the M material attribute predictions for x_i and a Beta distribution with a = b = 0.5. The Beta distribution is again chosen as the comparison distribution for reasons like those discussed in Section 2.2.
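The text does not state how this KL-divergence is estimated, so the following is only one possible sketch. It bins the attribute predictions into equal-width intervals on [0, 1] and compares the resulting histogram with the exact bin masses of Beta(0.5, 0.5), whose CDF is (2/π)·arcsin(√x). The histogram estimator, the direction KL(predictions ‖ Beta), the bin count, and the function names are all assumptions of this example.

```python
import math


def beta_half_cdf(x):
    """CDF of Beta(0.5, 0.5), also known as the arcsine distribution."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


def kl_to_beta_half(predictions, num_bins=10):
    """Histogram estimate of KL(empirical predictions || Beta(0.5, 0.5)).

    predictions: attribute probabilities, each assumed to lie in [0, 1].
    """
    counts = [0] * num_bins
    for p in predictions:
        # Clamp the index so that p == 1.0 falls into the last bin.
        counts[min(int(p * num_bins), num_bins - 1)] += 1
    total = len(predictions)
    kl = 0.0
    for b, count in enumerate(counts):
        if count == 0:
            continue  # an empty bin contributes 0 to the sum
        p_b = count / total
        # Exact probability mass that Beta(0.5, 0.5) assigns to this bin.
        q_b = beta_half_cdf((b + 1) / num_bins) - beta_half_cdf(b / num_bins)
        kl += p_b * math.log(p_b / q_b)
    return kl
```

Beta(0.5, 0.5) is U-shaped, with most of its density near 0 and 1, so under this sketch predictions clustered near 0.5 give a larger divergence than predictions that commit to values near 0 or 1.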

The γ2-weighted term constrains the loss to the material attributes encoded in the A matrix. The term represents the mean squared error between the rows of A, each of which represents one category’s probability distribution of attributes, and the material attribute predictions on the samples T_k belonging to the corresponding category k.
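A minimal sketch of this term follows, under the assumption that the attribute predictions of all samples in a category are averaged before being compared with that category's row of A; the text does not spell out the aggregation, and the function name `attribute_matrix_mse` is hypothetical. Categories with no samples in the batch are skipped, which is also a choice made for this example.

```python
def attribute_matrix_mse(A, attribute_predictions, category_indices):
    """MSE between rows of A and per-category mean attribute predictions.

    A: K x M nested list, row k = attribute distribution of category k.
    attribute_predictions: N x M nested list of predicted attribute values.
    category_indices: length-N list giving each sample's category index.
    """
    K, M = len(A), len(A[0])
    sums = [[0.0] * M for _ in range(K)]
    counts = [0] * K
    # Accumulate attribute predictions per category (the sets T_k).
    for preds, k in zip(attribute_predictions, category_indices):
        counts[k] += 1
        for m in range(M):
            sums[k][m] += preds[m]
    squared_error, num_terms = 0.0, 0
    for k in range(K):
        if counts[k] == 0:
            continue  # no samples of category k: nothing to compare
        for m in range(M):
            mean_pred = sums[k][m] / counts[k]
            squared_error += (A[k][m] - mean_pred) ** 2
            num_terms += 1
    return squared_error / num_terms if num_terms else 0.0
```

The term is zero when the mean attribute predictions of every category reproduce the corresponding row of A, and it grows as the predictions drift away from the attributes encoded in the matrix.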

The hyperparameters γ1 and γ2 weight their respective loss terms and are chosen at training time.
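To show how the three pieces fit together, the sketch below adds the negative log-likelihood of the category predictions to the two weighted terms. It is an illustration under stated assumptions, not the published Eq. 8: the function name `mac_cnn_loss` is hypothetical, the KL and MSE terms are passed in as precomputed numbers, and the negative log-likelihood is summed (not averaged) over patches.

```python
import math


def mac_cnn_loss(category_probs, one_hot_labels, kl_term, mse_term,
                 gamma1, gamma2):
    """Negative log-likelihood plus gamma-weighted KL and MSE terms.

    category_probs: N x K predicted category probabilities (rows sum to 1).
    one_hot_labels: N x K one-hot label vectors.
    kl_term, mse_term: precomputed values of the two auxiliary terms.
    """
    nll = 0.0
    for probs, label in zip(category_probs, one_hot_labels):
        true_index = label.index(1)  # position of the 1 in the one-hot label
        nll -= math.log(probs[true_index])
    return nll + gamma1 * kl_term + gamma2 * mse_term
```

Setting γ1 = γ2 = 0 reduces this sketch to ordinary category classification, while larger values give the attribute terms more influence on the trained weights.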
