2.5. Dispute Model Technique

Mitchell Sueker, Amirreza Daghighi, Alireza Akhbardeh, Nicholas MacKinnon, Gregory Bearman, Insuck Baek, Chansong Hwang, Jianwei Qin, Amanda M. Tabb, Jiahleen B. Roungchun, Rosalee S. Hellberg, Fartash Vasefi, Moon Kim, Kouhyar Tavakolian, Hossein Kashani Zadeh

Our model covers as many as 43 species (classes). This large number of classes, together with the similarity of spectra from groups of biologically close species, makes it difficult to achieve high accuracy with a single model. We therefore devised a novel machine learning technique that improves identification accuracy for all species by training specialized models, referred to here as dispute models, that differentiate between species with similar underlying biology. Each of these local models learns to classify the species of one group whose spectra are too close for a single model to separate. The dispute models are called and run by the global model described in the previous section, as shown in Figure 6 and Figure 7.

To determine how many dispute models were needed and which species each should contain, the test results of the global model were evaluated using the confusion matrix. If the accuracy of a species under the global model fell below a threshold, that species became a member class of a dispute model and is called a forming species. The dispute model was then trained on the forming species and the other species that the forming species was most often misclassified as.

No species can belong to more than one dispute model, because the dispute groups must remain mutually exclusive. Therefore, if one species is consistently misclassified as several other species, a decision has to be made as to which of the low-performing species it will form a dispute model with. The combination of classes within the dispute models that yields the largest net improvement in performance is chosen.
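The grouping step can be illustrated with a minimal sketch. It assumes a confusion matrix with true classes as rows and predicted classes as columns. The function names, the `min_share` cutoff for what counts as a frequent misclassification, and the greedy worst-first ordering are all illustrative assumptions. In particular, the greedy pass is a simplification that stands in for the search over class combinations with the largest net performance gain described above.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(confusion)]


def build_dispute_groups(confusion, threshold, min_share=0.1):
    """Group each low-accuracy (forming) species with the classes it is
    most often misclassified as, keeping the groups mutually exclusive.

    `min_share` is a hypothetical cutoff: a class joins the group only if
    it receives at least that fraction of the forming species' samples.
    """
    acc = per_class_accuracy(confusion)
    forming = [i for i, a in enumerate(acc) if a < threshold]
    # Greedy simplification: handle the worst-performing species first.
    forming.sort(key=lambda i: acc[i])

    assigned = set()  # species already placed in a dispute group
    groups = []
    for i in forming:
        if i in assigned:
            continue
        row = confusion[i]
        total = sum(row)
        partners = [
            j for j, count in enumerate(row)
            if j != i and j not in assigned
            and total and count / total >= min_share
        ]
        if not partners:
            continue
        group = sorted([i] + partners)
        assigned.update(group)
        groups.append(group)
    return groups
```

With a threshold of 0.8, a four-class matrix in which class 1 is often confused with class 0 and class 3 with class 2 would yield two exclusive groups, one per confused pair.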

Hybrid fusion model in which the global model was trained on all species and the dispute models for each spectroscopic mode determined predictions for spectrally similar fish groupings. A majority vote then produced the final prediction.

Schematic of (a) the global model and (b) hybrid (global integrated with dispute) model technique.

To train and evaluate the dispute models, the training data were relabeled to match each of the dispute models. The relabeled training data were fed into each dispute model, which was trained as a one-dimensional convolutional neural network (1D CNN). The 1D CNN consisted of four layers in total. The first (input) layer applied 64 filters of size five to the input data with a stride of four and used the scaled exponential linear unit (SELU) as its activation function.
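The text does not spell out how the relabeling was done. One plausible reading, sketched below, is that each dispute model sees only the samples of its own species, with the global class ids mapped to local ids starting at zero. The function name and the returned mapping are illustrative assumptions.

```python
def relabel_for_dispute_model(samples, labels, group):
    """Keep only samples whose species is in `group` and map the global
    class ids to local ids 0..len(group)-1 (an assumed relabeling scheme)."""
    local_id = {species: k for k, species in enumerate(sorted(group))}
    kept_samples, kept_labels = [], []
    for x, y in zip(samples, labels):
        if y in local_id:
            kept_samples.append(x)
            kept_labels.append(local_id[y])
    return kept_samples, kept_labels, local_id
```

Returning `local_id` makes it possible to translate a dispute model's local prediction back to the global class id at inference time.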

The following three layers were dense (fully connected) layers. The second and third layers served as hidden layers, while the fourth layer acted as the output layer. The second and third dense layers had 128 units each, and the fourth dense layer had as many units as there were classes in the given classification task, that is, the number of species within the specific dispute model.
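To make the layer sizes concrete, the sketch below traces output sizes and parameter counts through the four layers. It assumes a single-channel input spectrum, no padding in the convolution, and a flatten step between the convolutional layer and the first dense layer. None of these details are stated in the text, and the input length in the usage note is hypothetical.

```python
def conv1d_output_length(n_inputs, kernel_size=5, stride=4):
    """Output length of a 1D convolution without padding."""
    if n_inputs < kernel_size:
        raise ValueError("input is shorter than the kernel")
    return (n_inputs - kernel_size) // stride + 1


def layer_summary(n_inputs, n_classes, filters=64, kernel_size=5,
                  stride=4, units=128):
    """Return (name, output_size, parameter_count) for each layer,
    assuming a single-channel input and flattening before the dense layers."""
    conv_len = conv1d_output_length(n_inputs, kernel_size, stride)
    flat = conv_len * filters  # flattened size fed to the first dense layer
    return [
        ("conv1d", flat, filters * kernel_size + filters),
        ("dense_1", units, flat * units + units),
        ("dense_2", units, units * units + units),
        ("output", n_classes, units * n_classes + n_classes),
    ]
```

For a hypothetical spectrum of 125 points and a three-species dispute model, `layer_summary(125, 3)` reports a convolution output of 31 positions by 64 filters, so most of the parameters sit in the first dense layer.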

To extract features, the network used one-dimensional convolutional layers with 64 filters of size five in each hidden layer. The convolutional layers were followed by activation functions specific to each layer: the exponential linear unit (ELU) for the second layer, the Swish activation function for the third layer, and the softmax activation function for the output layer. Dropout regularization was applied after each dense layer, at a rate of 0.5. This helped to prevent overfitting by randomly setting a fraction of the inputs to 0 during training.
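For reference, the activation functions named in this section and the dropout step can be written out in a few lines. This is a sketch using the standard definitions, not the authors' code. The SELU constants are the usual published values, and the rescaling of surviving values in `dropout` (inverted dropout) is a common convention assumed here rather than something the text specifies.

```python
import math
import random

SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def elu(x, alpha=1.0):
    """Exponential linear unit."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x * sigmoid(x)


def softmax(logits):
    """Convert a list of scores into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(values, rate=0.5, training=True, rng=random):
    """Zero each value with probability `rate` during training and rescale
    the survivors by 1 / (1 - rate); return values unchanged at inference."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]
```

At a rate of 0.5, each surviving value is doubled, which keeps the expected sum of the layer's inputs the same during training and inference.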

During training, the network weights were optimized with the Adam optimizer, an adaptive-learning-rate optimization algorithm. Categorical cross-entropy, commonly used for multi-class classification problems, was used as the loss function, as it measures the dissimilarity between the predicted class probabilities and the true class labels.
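As a brief illustration of the loss, the sketch below computes categorical cross-entropy for a batch of one-hot targets and predicted probabilities. The function name and the small `eps` clamp, which avoids taking the logarithm of zero, are assumptions made for this example.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean over the batch of -sum_k t_k * log(p_k) for one-hot targets."""
    total = 0.0
    for target, probs in zip(y_true, y_pred):
        total -= sum(t * math.log(max(p, eps))
                     for t, p in zip(target, probs))
    return total / len(y_true)
```

The loss is zero when the true class receives probability one and grows as probability mass moves to the wrong classes.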

When given a measurement to classify, the global model first predicts a class. If the predicted class is not a forming species, the global model's prediction is taken as the hybrid model's prediction. However, if the global model's prediction is among the forming species, the prediction of the associated dispute model becomes the hybrid model's prediction (Figure 7). The dispute model technique proposed in this study is a supervised classification method and is thus distinct from clustering methods.
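The routing rule can be sketched as follows. The models are represented as plain callables that return a class label, and `dispute_model_for` is a hypothetical lookup from a class label to the dispute model responsible for it. The majority vote across spectroscopic modes is left out of this sketch.

```python
def hybrid_predict(sample, global_model, dispute_model_for):
    """Return the global prediction unless that class is routed to a
    dispute model, in which case the dispute model's prediction is used."""
    predicted = global_model(sample)
    specialist = dispute_model_for.get(predicted)
    if specialist is None:
        return predicted
    return specialist(sample)
```

Classes that do not appear in `dispute_model_for` pass through unchanged, so the hybrid model behaves exactly like the global model outside the dispute groups.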
