The following steps were performed in a loop for each of the methods:
Data for each eye, with a corresponding label indicating whether or not the eye had subclinical KC, were imported into RStudio;
Each machine learning method was trained to differentiate subclinical KC eyes from control eyes using all 11 parameters;
To validate each model, 10-fold cross-validation was used on the full dataset: the data were split into 10 subsets (folds), each containing 10% of the data. On each of 10 iterations, a model was trained on nine folds (90% of the data) and tested on the remaining fold, so that every fold served once as held-out test data, and performance was averaged across the 10 folds. This is a standard evaluation paradigm for small datasets63,64. Figure 1 shows an example of 10-fold cross-validation, and a minimal R sketch of the procedure follows the figure legend below.
Figure 1. Ten-fold cross-validation for analysis of test data. Twenty rhombuses are randomly partitioned into 10 subsets, two rhombuses per subset. Of the 10 subsets, one is retained as validation data and the remaining nine are used to train the model; this process is repeated 10 times so that each subset is used once for validation. Cross-validation then combines the 10 performance measures into an average.
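The sketch below illustrates the cross-validation loop described above in base R. The data frame `eyes` (11 numeric parameters plus a two-level factor `label`) and the logistic-regression classifier are assumptions for illustration only; the study trained several machine learning methods, each of which would be substituted into the same loop.

```r
## Minimal 10-fold cross-validation sketch, assuming a data frame
## `eyes` with 11 numeric parameters and a binary factor `label`
## (subclinical KC vs. control). Logistic regression stands in for
## whichever machine learning method is being evaluated.
set.seed(42)                                        # reproducible fold assignment
k <- 10
folds <- sample(rep(1:k, length.out = nrow(eyes)))  # random partition into 10 folds

accuracy <- numeric(k)
for (i in 1:k) {
  train_set <- eyes[folds != i, ]   # nine folds (90% of the data) for training
  test_set  <- eyes[folds == i, ]   # one held-out fold for testing

  fit  <- glm(label ~ ., data = train_set, family = binomial)
  prob <- predict(fit, newdata = test_set, type = "response")
  pred <- ifelse(prob > 0.5, levels(eyes$label)[2], levels(eyes$label)[1])

  accuracy[i] <- mean(pred == test_set$label)       # per-fold performance
}

mean(accuracy)  # average performance across the 10 held-out folds
```

Because each eye appears in exactly one test fold, every observation contributes once to the averaged performance, which is why this design suits small datasets.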