Generation of topological matrices and molecular graph representations

Sudha cheranma devi Eswaran
Senthil Subramaniam
Udishnu Sanyal
Robert Rallo
Xiao Zhang

The structure generation mimics monomer sequencing in chain-growth polymerization by assigning probabilities to the presence of specific monomer pairs. The resulting molecular structures can be represented as graphs in which nodes represent monomer units and edges correspond to the covalent bonds formed during polymerization. For example, an outgoing edge of linkage type β-O-4 connects the β position of the monomer in the first node to the O atom attached to the 4th carbon of the ring of the monomer in the second node. Studies on lignin structure report that monomers are bidirectionally linked [5]. To account for link directionality, labels are stored as tuples of variables (e.g., (G, S), (S, G), (G, S), (S, G)) that represent parent-child relationships, with bonds represented as directed edges denoting the linkage direction. Monomer sequences and bond patterns generated in the previous step are processed to create linear and branched structures. Linear chains are created by adding monomers one by one to the polymer (i.e., endwise lignin growth). Branched chains are formed by coupling fragmented linear chains (i.e., two lignin oligomers) using the coupling patterns described in Table 3. Figures 4 and 5 illustrate the development of linear and branched structures.

Figure 4. Representation of linear chain structures. (a) Graph-based encoding of a linear chain, (b) 2D representation of a linear lignin chain structure formation.

Figure 5. Representation of branched chain structures. (a) Graph-based encoding of a branched structure encompassing two oligomers and a branching node, (b) 2D representation of the lignin branched structure.
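
The graph encoding of linear and branched chains described above can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the class `MonomerGraph`, the helpers `linear_chain` and `couple`, the example monomer sequences, and the use of a 5-5 linkage as the coupling bond are all assumptions made for the example (the actual coupling patterns are those listed in Table 3).

```python
from dataclasses import dataclass, field


@dataclass
class MonomerGraph:
    """Directed graph: nodes are monomer units, edges are covalent linkages."""
    nodes: list = field(default_factory=list)  # node index -> monomer type, e.g. "G"
    edges: list = field(default_factory=list)  # (parent index, child index, linkage)

    def add_monomer(self, kind):
        self.nodes.append(kind)
        return len(self.nodes) - 1

    def add_bond(self, parent, child, linkage):
        # The edge direction (parent -> child) records the linkage direction.
        self.edges.append((parent, child, linkage))

    def pair_labels(self):
        # Parent-child tuples such as ("G", "S"), one per directed edge.
        return [(self.nodes[p], self.nodes[c]) for p, c, _ in self.edges]


def linear_chain(sequence, linkages):
    """Endwise growth: append monomers one by one, bonding each to the previous."""
    if len(linkages) != len(sequence) - 1:
        raise ValueError("need exactly one linkage per consecutive monomer pair")
    graph = MonomerGraph()
    previous = None
    for i, kind in enumerate(sequence):
        node = graph.add_monomer(kind)
        if previous is not None:
            graph.add_bond(previous, node, linkages[i - 1])
        previous = node
    return graph


def couple(a, a_node, b, b_node, linkage):
    """Join two oligomers into one branched graph with an edge a_node -> b_node."""
    offset = len(a.nodes)  # node indices of b are shifted in the merged graph
    merged = MonomerGraph(nodes=a.nodes + b.nodes, edges=list(a.edges))
    merged.edges += [(p + offset, c + offset, l) for p, c, l in b.edges]
    merged.add_bond(a_node, b_node + offset, linkage)
    return merged


# Hypothetical example sequences; the linkage choices are illustrative only.
chain_a = linear_chain(["G", "G", "S"], ["β-O-4", "β-O-4"])
chain_b = linear_chain(["G", "S"], ["β-O-4"])
branched = couple(chain_a, 1, chain_b, 0, "5-5")  # node 1 of chain_a becomes the branching node
```

In this sketch the branching node is simply a node with more than one outgoing edge. Keeping the edge list as plain tuples makes it straightforward to convert the graph to a matrix form afterwards.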

Molecular graphs can be efficiently represented in tabular form as a pair of topological matrices [46] that define the relationships between monomer units. The topological matrices record the linkages (i.e., the adjacency matrix) and the bond types (i.e., the connectivity matrix). Together, these matrices identify the bond direction and the occupied bond position for each graph node (e.g., the β carbon as ‘B’, the 5th carbon of a G unit as ‘5’, the 4-O position in the ring as ‘4’). Figure 6 shows the directed graph and 2D molecular representations of a lignin polymer together with the structure encoding using topological matrices. Additional information describing the format of the topological matrices can be found in the Supporting Information (Figure F).

Figure 6. Adjacency and connectivity matrices. (a) Directed graph, (b) Adjacency matrix, (c) Connectivity matrix, (d) Molecular graph.
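
As a rough illustration of how a directed monomer graph might be turned into the two topological matrices, consider the sketch below. The exact matrix format is defined in the Supporting Information and is not reproduced here, so the layout is an assumption: the adjacency matrix holds a 1 at row i, column j for a directed bond from node i to node j, and the connectivity matrix stores the position code occupied on node i by its bond with node j. The `POSITION_CODES` table, the function name `topological_matrices`, and the three-unit example chain are likewise hypothetical.

```python
# Assumed mapping: linkage label -> (code on the parent node, code on the child node).
# The codes 'B', '5' and '4' follow the text; their pairing per linkage is an assumption.
POSITION_CODES = {
    "β-O-4": ("B", "4"),
    "5-5": ("5", "5"),
}


def topological_matrices(n_nodes, edges):
    """Build adjacency and connectivity matrices from (parent, child, linkage) edges."""
    adjacency = [[0] * n_nodes for _ in range(n_nodes)]
    connectivity = [[""] * n_nodes for _ in range(n_nodes)]  # "" means no bond
    for parent, child, linkage in edges:
        parent_code, child_code = POSITION_CODES[linkage]
        adjacency[parent][child] = 1                # direction: parent -> child
        connectivity[parent][child] = parent_code   # position used on the parent
        connectivity[child][parent] = child_code    # position used on the child
    return adjacency, connectivity


# Hypothetical linear trimer G -> S -> G joined by two β-O-4 linkages.
edges = [(0, 1, "β-O-4"), (1, 2, "β-O-4")]
adjacency, connectivity = topological_matrices(3, edges)

for row in adjacency:
    print(row)
for row in connectivity:
    print(row)
```

Under these assumptions the adjacency matrix is asymmetric, which preserves the bond direction, while each row of the connectivity matrix lists the positions already occupied on that monomer.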
