Predicting evolutionary domain from topology
This protocol is extracted from research article:
Universal scaling across biochemical networks on Earth
Sci Adv, Jan 16, 2019; DOI: 10.1126/sciadv.aau0149

To demonstrate that the topological features of genomes from different domains are distinct, we used multinomial regression. Specifically, we implemented models in which the domain of the network was the response class and a single topological feature, normalized by the size of the largest connected component (LCC) of the network, was the independent variable. We found that topological features of networks alone were often not predictive of the domain, but the ratio of a topological property to the size of the network provided a more accurate prediction. Prior to the regression, these normalized topological measures were scaled and centered (61). The regression was implemented in base R using the glm() function. To control for overfitting, the training data were composed of an equal number of samples from each domain: only 35 networks of each domain were sampled for training, and the model was tested on the remaining data. This process was repeated 100 times, and the average model error is reported in the text (Fig. 6E).
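
The resampling procedure described above can be sketched as follows. This is an illustrative Python sketch, not the authors' R code: it substitutes a small softmax (multinomial logistic) regression fitted by gradient descent for the glm() call, and every name in it (standardize, balanced_split, fit_multinomial, mean_test_error), the learning rate, the number of epochs, and the toy data (domain counts, LCC sizes, feature values) are assumptions made for illustration only. The 35 training networks per domain and the 100 repetitions follow the text.

```python
import math
import random
import statistics

DOMAINS = ("archaea", "bacteria", "eukarya")

def standardize(values):
    """Center values to mean 0 and scale them to unit standard deviation."""
    mean, sd = statistics.fmean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def balanced_split(labels, n_per_class, rng):
    """Draw n_per_class training indices per class; the rest form the test set."""
    train = []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        train.extend(rng.sample(members, n_per_class))
    chosen = set(train)
    test = [i for i in range(len(labels)) if i not in chosen]
    return train, test

def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_multinomial(x, y, classes, lr=1.0, epochs=50):
    """Single-feature softmax regression fitted by batch gradient descent."""
    k = len(classes)
    w, b = [0.0] * k, [0.0] * k
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, [0.0] * k
        for xi, yi in zip(x, y):
            probs = softmax([w[j] * xi + b[j] for j in range(k)])
            for j in range(k):
                err = probs[j] - (1.0 if yi == classes[j] else 0.0)
                grad_w[j] += err * xi
                grad_b[j] += err
        w = [w[j] - lr * grad_w[j] / len(x) for j in range(k)]
        b = [b[j] - lr * grad_b[j] / len(x) for j in range(k)]
    return w, b

def predict(x, w, b, classes):
    """Assign each sample to the class with the highest linear score."""
    k = range(len(classes))
    return [classes[max(k, key=lambda j: w[j] * xi + b[j])] for xi in x]

def mean_test_error(x, y, classes, n_per_class=35, repeats=100, seed=0):
    """Average misclassification rate on held-out data over repeated balanced splits."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repeats):
        train, test = balanced_split(y, n_per_class, rng)
        w, b = fit_multinomial([x[i] for i in train], [y[i] for i in train], classes)
        pred = predict([x[i] for i in test], w, b, classes)
        errors.append(sum(p != y[i] for p, i in zip(pred, test)) / len(test))
    return statistics.fmean(errors)

# Toy data only: the counts, LCC sizes, and feature values below are hypothetical.
toy_rng = random.Random(42)
counts = {"archaea": 50, "bacteria": 120, "eukarya": 60}
centers = {"archaea": 0.10, "bacteria": 0.15, "eukarya": 0.25}
labels, ratios = [], []
for domain in DOMAINS:
    for _ in range(counts[domain]):
        lcc_size = toy_rng.randint(300, 2000)
        feature = lcc_size * toy_rng.gauss(centers[domain], 0.03)
        labels.append(domain)
        ratios.append(feature / lcc_size)  # feature normalized by LCC size

x = standardize(ratios)  # scale and center before the regression
print(mean_test_error(x, labels, DOMAINS))
```

Because each training set holds the same number of networks from every domain, the fitted model cannot lower its error simply by favoring the most heavily sampled domain; the held-out networks then estimate how well the normalized feature separates the domains.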


