To evaluate the trained models, standard performance measures for classification problems are used in this study, namely Accuracy, Sensitivity, Specificity, and F-Score, as shown in Eqs. (11)-(14):
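For reference, the standard definitions of these four measures are given here. That Eqs. (11)-(14) follow these definitions, and that they are numbered in the order the measures are listed, are both assumptions:

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \tag{11}\\
\text{Sensitivity} &= \frac{TP}{TP + FN} \tag{12}\\
\text{Specificity} &= \frac{TN}{TN + FP} \tag{13}\\
\text{F-Score}     &= \frac{2\,TP}{2\,TP + FP + FN} \tag{14}
\end{align}
```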

where TP and FP denote the numbers of instances assigned to the positive class correctly and incorrectly, respectively, and TN and FN denote the numbers of instances assigned to the negative class correctly and incorrectly, respectively.
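The four measures can be computed directly from these counts. The following is a minimal sketch that assumes the standard textbook definitions. The function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return Accuracy, Sensitivity, Specificity and F-Score from confusion counts.

    Assumes the standard definitions; zero denominators are not handled.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    # F-Score as the harmonic mean of precision and sensitivity (F1)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Hypothetical counts, for illustration only
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```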

Moreover, the area under the curve (AUC) of the receiver operating characteristic (ROC) curve is considered.
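The AUC equals the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses this pairwise formulation. It is an illustration only, and the function name, labels, and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels; scores: classifier outputs,
    where larger values mean "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores, for illustration only
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of instances, which is acceptable for small data sets. A rank-based computation would be preferable for large ones.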

To validate the results, the experiments are repeated 50 times, and each time the data are partitioned by fivefold cross-validation (C.V.).
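A repeated fivefold scheme of this kind can be sketched as follows. The source does not state how the folds are drawn, so the seeded shuffling and the absence of stratification are assumptions here. All function names are hypothetical.

```python
import random


def kfold_indices(n_samples, n_folds, seed):
    """Shuffle sample indices with the given seed and split them into n_folds test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding yields folds whose sizes differ by at most one
    return [indices[i::n_folds] for i in range(n_folds)]


def repeated_kfold(n_samples, n_folds=5, n_repeats=50):
    """Yield (repeat, fold, train_idx, test_idx) for every fold of every repetition."""
    for repeat in range(n_repeats):
        # A different seed per repetition gives a different partition each time
        folds = kfold_indices(n_samples, n_folds, seed=repeat)
        for fold, test_idx in enumerate(folds):
            test_set = set(test_idx)
            train_idx = [i for i in range(n_samples) if i not in test_set]
            yield repeat, fold, train_idx, test_idx


# Hypothetical data set size, for illustration only
splits = list(repeated_kfold(n_samples=23))
print(len(splits))  # number of train/test evaluations: 50 repetitions x 5 folds
```

Each of the resulting train/test pairs would be used to fit and score one model, and the performance measures would then be summarized over all evaluations.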

A method named A-Test was proposed in a previous study to calculate the structural risk of a classifier model, expressed as its instability on new test data [36]. A-Test calculates the misclassification error percentage Γζ,K for different values of K using balanced K-fold validation. In this study, the values of Γζ,K are reported for different classifiers and different feature sets. Γζ,K is calculated as in Eq. (15):

where Kmax cannot exceed the size of the minority class. To estimate the structural risk of a classifier, the average of the Γζ,K values is taken, as in Eq. (16):

where Γ̂ζ ranges from 0 to 100%; higher values indicate a higher classification risk, and lower values indicate a higher capacity and generalization ability of the model. Lower values of Γ̂ζ are therefore preferred.
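The averaging step can be illustrated with a short sketch. It covers only the final average and the constraint on Kmax. It does not reproduce the balanced K-fold procedure of [36], and the set of K values used (here 2 to 5) is an assumption. The function name and the Γζ,K values are hypothetical.

```python
def average_structural_risk(gamma_by_k, minority_class_size):
    """Average the misclassification error percentages over the tested K values.

    gamma_by_k: dict mapping K to the error percentage (0 to 100) obtained
    with K-fold validation. The largest K must not exceed the minority class size.
    """
    if not gamma_by_k:
        raise ValueError("at least one K value is required")
    k_max = max(gamma_by_k)
    if k_max > minority_class_size:
        raise ValueError("K_max cannot exceed the size of the minority class")
    for k, gamma in gamma_by_k.items():
        if not 0.0 <= gamma <= 100.0:
            raise ValueError(f"error percentage for K={k} is outside 0-100")
    return sum(gamma_by_k.values()) / len(gamma_by_k)


# Hypothetical error percentages for K = 2..5, for illustration only
gamma_by_k = {2: 30.0, 3: 25.0, 4: 20.0, 5: 25.0}
risk = average_structural_risk(gamma_by_k, minority_class_size=12)
print(f"average structural risk = {risk:.1f}%")
```

When two classifiers are compared under this measure, the one with the smaller average would be regarded as the more stable on unseen data.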
