The molecular descriptors for all data sets were calculated using the ochem.eu chemoinformatics database (https://ochem.eu). The “Calculate Descriptors” bar was accessed under the “Models” tab, and the Excel files were uploaded directly to the server. The uploaded molecules were then pre-processed; during this procedure, the salts associated with each compound structure were removed. After this stage, the molecular descriptors were selected (unless otherwise stated, the default settings for each descriptor type were left unchanged). The descriptor types used were: ALogPS, GSFragment, QNPR, ISIDA fragments (fragment length set from 2 to 10), and E-state (all boxes checked apart from “Extended indices—experimental”). These types were chosen from the full set because they do not contain any 3D descriptors, which avoids the errors that occur during 3D descriptor calculation. Any compounds for which descriptor calculation still failed were deleted from the descriptor sets, and the remaining sets were saved as .csv files. This resulted in 1356 descriptors for each compound.
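The final clean-up step, deleting compounds whose descriptor calculation failed before saving the .csv file, can also be done programmatically. The following is a minimal sketch, not the authors' procedure. It assumes that a failed calculation appears in the exported file as an empty or non-numeric cell. The column names (`SMILES`, `desc_a`, `desc_b`), the toy data, and the helper functions are hypothetical and do not reflect the actual layout of an ochem.eu export.

```python
import csv
import io
import math


def is_valid_value(cell):
    """Return True if a descriptor cell parses as a finite number."""
    try:
        return math.isfinite(float(cell))
    except (TypeError, ValueError):
        return False


def drop_failed_compounds(rows, id_column="SMILES"):
    """Keep only rows in which every descriptor value is a finite number."""
    kept = []
    for row in rows:
        descriptor_cells = [v for k, v in row.items() if k != id_column]
        if all(is_valid_value(v) for v in descriptor_cells):
            kept.append(row)
    return kept


# Hypothetical toy export: three compounds, two descriptors.
# The second and third compounds each have one failed descriptor.
raw_csv = """SMILES,desc_a,desc_b
CCO,0.12,3.4
c1ccccc1,,1.0
CC(=O)O,0.50,error
"""

reader = csv.DictReader(io.StringIO(raw_csv))
rows = list(reader)
clean_rows = drop_failed_compounds(rows)

# Write the cleaned descriptor set back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(clean_rows)
cleaned_csv = out.getvalue()
```

In this sketch a compound is dropped if any single descriptor is missing, which mirrors the whole-compound deletion described above. For a real export, `io.StringIO` would be replaced by file handles opened with `newline=""`, and `id_column` would be set to whatever identifier column the file actually contains.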