Automated systems for assisted diagnosis are imperfect. The output of a classification system is a probability, not an answer with irrefutable certainty. Diagnostic measures are therefore used to verify that the results are repeatable and to validate the ability of a system to identify the presence or absence of disease.
In particular, tenfold random cross-validation was used in these experiments. Of the available data, 70% were used for training and the remaining 30% to test the proposed model [33]. The folds were assembled at random using the stratified version of the shuffle-split methodology, which guarantees a proportional class distribution in each set [34]. Each classification approach was evaluated using logistic regression, support vector machine, decision tree, random forest, multilayer perceptron, and TabNet. To assess the performance of each model, diagnostic measures such as sensitivity, specificity, accuracy, and precision were used. Additionally, the area under the curve (AUC) of the receiver operating characteristic (ROC) was determined for each model [48,49].
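The stratified shuffle-split procedure described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation: the function name `stratified_shuffle_split`, the seed, and the toy labels are assumptions made for the example. In practice, scikit-learn's `StratifiedShuffleSplit` offers the same kind of split.

```python
import random
from collections import defaultdict


def stratified_shuffle_split(labels, n_splits=10, test_size=0.3, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for _ in range(n_splits):
        train_idx, test_idx = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Hold out roughly `test_size` of each class for testing.
            n_test = round(len(shuffled) * test_size)
            test_idx.extend(shuffled[:n_test])
            train_idx.extend(shuffled[n_test:])
        yield sorted(train_idx), sorted(test_idx)


# Toy labels (made-up data): 60 negatives and 40 positives.
labels = [0] * 60 + [1] * 40
splits = list(stratified_shuffle_split(labels))
train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))
```

Because each split is drawn independently, a sample may appear in the test set of several splits, which is what distinguishes this scheme from a classic k-fold partition.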
TP = true positive
TN = true negative
FP = false positive
FN = false negative
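With the counts defined above, the four measures can be computed under their standard definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = (TP + TN) / (TP + TN + FP + FN), and precision = TP / (TP + FP). The sketch below assumes these definitions; the function names and the example numbers are hypothetical and are not taken from the study. The AUC is computed here through its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.

```python
def diagnostic_measures(tp, tn, fp, fn):
    """Return sensitivity, specificity, accuracy, and precision from counts.

    Assumes every denominator is non-zero (no guard against empty classes).
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up counts and scores, for illustration only.
measures = diagnostic_measures(tp=40, tn=45, fp=5, fn=10)
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(measures, auc)
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch; library routines typically use a sort-based computation instead.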