Three-dimensional quantitative structure-activity relationship (3D-QSAR) models were developed using Open3DQSAR software [43]. Docking-based rather than multifit alignment-based 3D-QSAR models were built by enclosing the bioactive conformations of nineteen naphthyl-based compounds, including those of the pharmacophore development training set, in a grid box with a 1.00 Å step size and a 5.00 Å outgap. The bioactive conformations were extracted from the docking complexes. Steric (Lennard-Jones 6–12 potential) and electrostatic (Coulombic potential) molecular interaction fields (MIFs) within the grid box were generated using a carbon probe atom and a volumeless probe with a +1 charge, respectively. The binding energies of the nineteen compounds calculated by AutoDock were used as the dependent variables in the 3D-QSAR evaluation. A series of pre-treatment steps was applied to improve computational efficiency and to derive stronger field properties for partial least squares (PLS) analysis. The electrostatic potential energy values within the naphthamides were subjected to a steric cutoff of 10,000 kcal/mol, and the energy cutoff values for principal component (PC) extraction were set to 20.0, 30.0, and 40.0 kcal/mol. MIF energy values less than 0.05 kcal/mol were zeroed. The series of nineteen naphthamides was split into a training set of 16 compounds and a test set of 3 compounds. The test set molecules possessed diverse binding affinities, ranging from −12.05 to −14.33 kcal/mol. To give all field descriptors equal importance and to reduce the chance of biased PC extraction, MIF variables with standard deviations less than 0.10 and N-level variables were eliminated from the computation, and block unscaled weighting was applied to both fields by assigning collective block weighting coefficients.
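The pre-treatment steps described above can be illustrated with a minimal Python sketch. It is not the Open3DQSAR implementation: the function names are hypothetical, the cutoff is assumed to truncate energies symmetrically, the 0.05 kcal/mol threshold is assumed to apply to absolute values, the population standard deviation is assumed, and block unscaled weighting is read here as multiplying each field by one coefficient so that all blocks carry the same total variance.

```python
import math
from statistics import pstdev, pvariance


def pretreat_field(field, max_cutoff, zero_below, min_sd):
    """Pre-treat one MIF block (hypothetical helper, not an Open3DQSAR call).

    field: list of rows, one per molecule; each row holds the field energy
    (kcal/mol) at every grid point.
    Returns the treated rows and the indices of the grid points kept.
    """
    # Assumption: the cutoff truncates energies to [-max_cutoff, +max_cutoff].
    capped = [[max(-max_cutoff, min(max_cutoff, v)) for v in row] for row in field]
    # Assumption: energies with absolute value below the threshold are zeroed.
    zeroed = [[0.0 if abs(v) < zero_below else v for v in row] for row in capped]
    # Keep only grid points whose values vary enough across molecules.
    n_points = len(zeroed[0])
    keep = [j for j in range(n_points)
            if pstdev([row[j] for row in zeroed]) >= min_sd]
    return [[row[j] for j in keep] for row in zeroed], keep


def total_variance(block):
    """Sum of the per-grid-point variances of one field block."""
    return sum(pvariance([row[j] for row in block]) for j in range(len(block[0])))


def block_unscaled_weighting(blocks):
    """Scale each block by a single coefficient so every block has unit total
    variance (assumed reading; each block must have nonzero variance)."""
    scaled = []
    for block in blocks:
        weight = 1.0 / math.sqrt(total_variance(block))
        scaled.append([[weight * v for v in row] for row in block])
    return scaled


# Toy values for illustration only (not data from the study).
steric = [[50.0, 0.01, 1.00],
          [-3.0, 0.02, 1.05],
          [10.0, -0.03, 0.95]]
treated, kept = pretreat_field(steric, max_cutoff=30.0, zero_below=0.05, min_sd=0.10)
```

In this toy block only the first grid point survives: the second is zeroed everywhere and the third varies by less than 0.10 across the molecules, so both are dropped before weighting.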
To identify the optimal number of PCs for the PLS regression analyses, different cross-validation (CV) strategies, including leave-one-out, leave-two-out, and leave-multiple-out, were performed with variable grouping and selection parameters using the built-in Smart Region Definition and Fractional Factorial Design algorithms. QSAR model validation requires training set internal validation and randomized external validation [44]. The cross-validated correlation coefficient (q²) was computed for each CV strategy. After variable selection by CV, the final PLS regression models were built, and the following statistics were computed: the non-cross-validated correlation coefficient (r²), the non-cross-validated predicted correlation coefficient for the test set (r²pred), the F-test value, the standard deviation of error of calculation (SDEC), the standard deviation of error of prediction for the non-cross-validated training set (SDEP), and the standard deviation of error of prediction for the cross-validated test set (SDEPtest). The optimal number of components for scaffold optimization of agnuside was chosen based on the highest q² across all CV strategies together with the r² values. PLS field coefficients for the optimal number of PCs were exported as grid maps and visualized in PyMOL [45].
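As a rough illustration of how these statistics relate to one another, the sketch below computes r², SDEC, leave-one-out q², SDEP, and r²pred using their commonly used definitions (fitted residuals for r² and SDEC, cross-validated residuals for q² and SDEP). Whether the reported values follow exactly these definitions is an assumption. A one-variable least-squares line stands in for the PLS model purely to keep the example self-contained; all function names are hypothetical and the numbers are toy values, not data from the study.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = a + b*x (simple stand-in for a PLS model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def r2_and_sdec(x, y):
    """Non-cross-validated r2 and SDEC from the fitted residuals."""
    a, b = fit_line(x, y)
    n = len(y)
    mean_y = sum(y) / n
    ss = sum((yi - mean_y) ** 2 for yi in y)
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - rss / ss, math.sqrt(rss / n)


def loo_q2_and_sdep(x, y):
    """Leave-one-out q2 and SDEP from the cross-validated residuals (PRESS)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        # Refit without molecule i, then predict it.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    mean_y = sum(y) / n
    ss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss, math.sqrt(press / n)


def r2_pred(y_test, y_test_pred, mean_y_train):
    """Predictive r2 for an external test set, relative to the training mean."""
    press = sum((yt - yp) ** 2 for yt, yp in zip(y_test, y_test_pred))
    ss = sum((yt - mean_y_train) ** 2 for yt in y_test)
    return 1.0 - press / ss


# Toy descriptor/response pairs for illustration only.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [1.0, 2.1, 2.9, 4.2, 5.0]
r2, sdec = r2_and_sdec(x_train, y_train)
q2, sdep = loo_q2_and_sdep(x_train, y_train)
```

Because each left-out molecule is predicted by a model that never saw it, q² cannot exceed r² for this least-squares stand-in, which is why q² is the more conservative criterion when choosing the number of components.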