All data are presented as frequency counts (percentages). R software (version 3.6.2) was used for the statistical analysis.

The least absolute shrinkage and selection operator (LASSO) method was used to select the optimal predictors from among the risk factors in patients with DN. This method is well suited to reducing high-dimensional data. Features with regression coefficients of zero were removed from the LASSO regression model, while those with nonzero coefficients were retained. The retained predictors were then entered into a multivariable logistic regression analysis, and those that were statistically significant were selected. Finally, the statistically significant predictors were used to construct a nomogram to predict the risk of DN in patients with T2DM.
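The way the LASSO penalty sets some coefficients exactly to zero can be illustrated with a minimal sketch. It is written in Python rather than R, uses a squared-error loss instead of the logistic loss that a binary DN outcome would call for, and runs on synthetic data, so it is not the analysis code of this study. The function names (`soft_threshold`, `centre`, `lasso_coordinate_descent`), the penalty value `lam=0.3`, and the simulated features are all assumptions made for illustration.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; anything within [-lam, lam] becomes exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def centre(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_coordinate_descent(columns, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    columns: list of p feature columns, each a list of n values.
    """
    n = len(y)
    cols = [centre(c) for c in columns]
    resid = centre(y)  # residual y - X b, starting from b = 0
    beta = [0.0] * len(cols)
    col_sq = [sum(v * v for v in c) / n for c in cols]
    for _ in range(n_iter):
        for j, c in enumerate(cols):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # Covariance of feature j with the residual that excludes feature j
            rho = sum(v * r for v, r in zip(c, resid)) / n + col_sq[j] * beta[j]
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, c)]
                beta[j] = new_b
    return beta


# Synthetic data (assumption): only features 0 and 1 influence the outcome.
random.seed(0)
n = 200
columns = [[random.gauss(0, 1) for _ in range(n)] for _ in range(5)]
y = [2.0 * columns[0][i] - 1.5 * columns[1][i] + random.gauss(0, 0.5)
     for i in range(n)]

beta = lasso_coordinate_descent(columns, y, lam=0.3)
selected = [j for j, b in enumerate(beta) if b != 0.0]  # retained predictors
print(selected, [round(b, 3) for b in beta])
```

In this sketch the informative features keep nonzero (but shrunken) coefficients, while weakly related features are typically set to exactly zero by the soft-thresholding step, which mirrors the retain-or-remove rule described above. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.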

To evaluate the discrimination performance of the DN incidence risk nomogram, the C-index was calculated. Internal validation was performed with 1000 bootstrap resamples drawn at random from the original sample to obtain a corrected C-index. The area under the receiver operating characteristic curve (AUC) was used to assess how well the DN risk nomogram discriminated true positives from false positives. Calibration curves were plotted to evaluate calibration. Decision curve analysis was used to assess the clinical usefulness of the DN risk nomogram by quantifying its net benefit at different threshold probabilities in patients with T2DM.
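As a rough illustration of two of these quantities, the sketch below computes the C-index (which equals the AUC for a binary outcome) as the proportion of concordant case/non-case pairs, and the net benefit used in decision curve analysis, taken here as TP/n - (FP/n) * pt/(1 - pt) for a threshold probability pt. It is a Python sketch on made-up labels and predicted probabilities, not the study's R code; the function names and the example values are assumptions, and the bootstrap correction and calibration curves are not reproduced.

```python
def c_index(y_true, y_prob):
    """Proportion of (case, non-case) pairs in which the case has the higher
    predicted probability; ties count as half. Equals the AUC for binary outcomes."""
    concordant = 0.0
    pairs = 0
    for yi, pi in zip(y_true, y_prob):
        if yi != 1:
            continue
        for yj, pj in zip(y_true, y_prob):
            if yj != 0:
                continue
            pairs += 1
            if pi > pj:
                concordant += 1.0
            elif pi == pj:
                concordant += 0.5
    return concordant / pairs


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Made-up outcomes (1 = DN, 0 = no DN) and predicted risks, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

print(c_index(y_true, y_prob))                   # 15 of 16 pairs concordant: 0.9375
print(net_benefit(y_true, y_prob, 0.5))          # 3/8 - (1/8) * 1 = 0.25
print(net_benefit(y_true, [1.0] * 8, 0.5))       # "treat all" reference: 0.0
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the model against the "treat all" and "treat none" (net benefit of zero) strategies.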

