Our method, RB-SAScore, is based on the SAScore [19], a widely accepted and well-performing synthetic accessibility metric [5, 31]. SAScore integrates local and global structural features of a molecule: local structure is represented by molecular fragments (fragmentScore) and global structure by structural complexity (complexityPenalty):

$$\mathrm{SAScore} = \mathrm{fragmentScore} - \mathrm{complexityPenalty} \tag{1}$$

The fragment score is derived from how frequently each molecular fragment, encoded as an Extended-Connectivity Fingerprint (ECFP) [32], occurs in a set of previously synthesized molecules. The rationale is that fragments appearing frequently across diverse molecules are likely to be easy to synthesize, whereas rare fragments are not. Fragment scores are computed by fragmenting 934,046 molecules from the PubChem database [28]; common fragments receive higher scores and rare ones receive negative scores. The scores of the fragments in a given molecule are then averaged to represent its overall local feature.
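The frequency-based scoring and averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 80% coverage reference, the log-ratio form, and the default of -4.0 for unseen fragments are assumptions modelled on the original SAScore approach, and fragments are represented by plain string identifiers rather than real ECFP bits.

```python
import math


def fragment_scores(counts, coverage=0.8):
    """Map each fragment id to log10(count / reference_count).

    reference_count is the count of the fragment at which the cumulative
    number of occurrences (most common fragment first) first reaches
    `coverage` of all occurrences. Fragments more common than the
    reference score above zero; rarer fragments score below zero.
    """
    total = sum(counts.values())
    running = 0
    reference = 1
    for count in sorted(counts.values(), reverse=True):
        running += count
        reference = count
        if running >= coverage * total:
            break
    return {frag: math.log10(c / reference) for frag, c in counts.items()}


def molecule_fragment_score(fragments, scores, unseen_score=-4.0):
    """Average the scores of the fragments found in one molecule.

    `fragments` lists one entry per fragment occurrence; fragments missing
    from `scores` get `unseen_score` (an assumed default).
    """
    if not fragments:
        return 0.0
    return sum(scores.get(f, unseen_score) for f in fragments) / len(fragments)


# Toy reference set: hypothetical fragment ids with occurrence counts.
toy_counts = {"frag_a": 50, "frag_b": 40, "frag_c": 10}
toy_scores = fragment_scores(toy_counts)
```

In this toy set, `frag_a` scores above zero, `frag_b` sits at the reference and scores zero, and `frag_c` scores below zero, so a molecule built mostly from `frag_c` would have a lower averaged fragment score.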

In contrast, global features such as the number of atoms and stereocenters in the molecule are captured by the complexity penalty term. Specifically, the complexity penalty comprises four features commonly considered in synthetic accessibility: size complexity (number of atoms), stereo complexity (number of stereocenters), ring complexity (number of bridgehead and spiro atoms), and macrocycle complexity (number of rings with size > 8). Mathematically, they are calculated as follows:
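One common form of these terms, assuming the definitions of the original SAScore publication [19], is sketched here. The symbol names and the base-10 logarithm are illustrative assumptions and may differ from the authors' notation.

```latex
\begin{align}
\mathrm{complexityPenalty} &= \mathrm{sizeComplexity} + \mathrm{stereoComplexity}
  + \mathrm{ringComplexity} + \mathrm{macrocycleComplexity} \\
\mathrm{sizeComplexity} &= n_{\mathrm{atoms}}^{1.005} - n_{\mathrm{atoms}} \\
\mathrm{stereoComplexity} &= \log_{10}\left(n_{\mathrm{stereocenters}} + 1\right) \\
\mathrm{ringComplexity} &= \log_{10}\left(n_{\mathrm{bridgeheads}} + 1\right)
  + \log_{10}\left(n_{\mathrm{spiro}} + 1\right) \\
\mathrm{macrocycleComplexity} &= \log_{10}\left(n_{\mathrm{macrocycles}} + 1\right)
\end{align}
```

Each term is zero when the corresponding feature is absent and grows slowly with its count, so no single feature dominates the penalty.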

Finally, the score calculated from Eq. 1 is multiplied by -1 and scaled to the range 1 to 10, so that molecules with a higher SAScore are predicted to be more difficult to synthesize and those with a lower SAScore are predicted to be easier to synthesize.
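The combination and rescaling step can be sketched as below. This is a simplified illustration under stated assumptions: the penalty terms follow the form of the original SAScore publication [19], the raw-score bounds `raw_min=-4.0` and `raw_max=2.5` are illustrative defaults, and the rescaling is a plain clipped linear map, which may differ from the exact transformation used by the authors. All function names are hypothetical.

```python
import math


def complexity_penalty(n_atoms, n_stereo, n_bridgehead, n_spiro, n_macrocycles):
    """Sum of size, stereo, ring, and macrocycle terms (assumed forms)."""
    size = n_atoms ** 1.005 - n_atoms
    stereo = math.log10(n_stereo + 1)
    ring = math.log10(n_bridgehead + 1) + math.log10(n_spiro + 1)
    macro = math.log10(n_macrocycles + 1)
    return size + stereo + ring + macro


def scale_to_sa(raw, raw_min=-4.0, raw_max=2.5):
    """Flip the sign of the raw score and map it linearly onto [1, 10].

    raw_max (most favourable raw score) maps to 1, raw_min maps to 10.
    Values outside the assumed bounds are clipped first.
    """
    clipped = min(max(raw, raw_min), raw_max)
    return 1.0 + 9.0 * (raw_max - clipped) / (raw_max - raw_min)


def sa_score(fragment_score, n_atoms, n_stereo, n_bridgehead, n_spiro, n_macrocycles):
    """Raw score = fragmentScore - complexityPenalty, then rescaled to 1..10."""
    raw = fragment_score - complexity_penalty(
        n_atoms, n_stereo, n_bridgehead, n_spiro, n_macrocycles
    )
    return scale_to_sa(raw)


# Two hypothetical molecules with the same fragment score but different complexity.
simple = sa_score(0.5, n_atoms=15, n_stereo=0, n_bridgehead=0, n_spiro=0, n_macrocycles=0)
complex_ = sa_score(0.5, n_atoms=45, n_stereo=6, n_bridgehead=2, n_spiro=1, n_macrocycles=1)
```

With identical fragment scores, the molecule with more atoms, stereocenters, and ring features receives the larger penalty and therefore the higher (harder to synthesize) scaled score.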

The distribution of structural complexity for the ES and HS molecules in the three test sets is depicted in Figure S1. Overall, the size penalty and stereo penalty of molecules in TS3 are higher than those in TS1 and TS2, indicating more complex molecular structures in TS3 compared to TS1 and TS2. Additionally, the penalty difference between ES molecules and HS molecules increases progressively from TS1 to TS3.
