An updated biologically informed machine learning (BioMM) approach was used (eMethods in the Supplement).²⁵ BioMM is a 2-stage machine learning approach. In the first stage, it builds a separate machine learning model for the methylation sites mapping to each of the 2846 pathways, yielding 1 model per pathway. This procedure compresses data from individual methylation sites into a pathway-level feature. In the second stage, an algorithm integrates these pathway-level features into a systems-level classifier. BioMM was trained on the discovery methylation data, and the trained algorithm was then applied to all other data sets. In each data set, the output of BioMM was a score (PMS) that quantified the likelihood of a given participant being in the schizophrenia group. To assess predictive accuracy, we determined the area under the receiver operating characteristic curve (AUC) as well as Nagelkerke R².
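The 2-stage structure and the 2 accuracy measures described above can be illustrated with a minimal sketch. This is not the BioMM implementation: the learners used at each stage are not specified in this section, so a simple mean-difference linear scorer stands in for both. All names (`fit_scorer`, `fit_two_stage`, `pathway_A`, `pathway_B`) and the simulated data are hypothetical. For brevity, both stages are fitted on the same training rows; a real stacking pipeline would typically use resampled or cross-validated first-stage predictions.

```python
import math
import random

def fit_scorer(X, y):
    """Stand-in learner (an assumption, not BioMM's): project each sample onto
    the case-minus-control mean difference, centered at the class midpoint."""
    cols = range(len(X[0]))
    case = [x for x, t in zip(X, y) if t == 1]
    ctrl = [x for x, t in zip(X, y) if t == 0]
    mu1 = [sum(r[j] for r in case) / len(case) for j in cols]
    mu0 = [sum(r[j] for r in ctrl) / len(ctrl) for j in cols]
    w = [a - b for a, b in zip(mu1, mu0)]
    mid = [(a + b) / 2 for a, b in zip(mu1, mu0)]
    return lambda x: sum(wj * (xj - mj) for wj, xj, mj in zip(w, x, mid))

def fit_two_stage(X, y, pathways):
    """pathways maps a pathway name to the column indices of its sites."""
    # Stage 1: one model per pathway, trained only on that pathway's sites.
    stage1 = {}
    for name, idx in pathways.items():
        stage1[name] = (idx, fit_scorer([[row[j] for j in idx] for row in X], y))

    def compress(row):
        # Each pathway model turns its sites into one pathway-level feature.
        return [f([row[j] for j in idx]) for idx, f in stage1.values()]

    # Stage 2: integrate the pathway-level features into a single score.
    stage2 = fit_scorer([compress(row) for row in X], y)
    return lambda row: stage2(compress(row))

def auc(y, scores):
    """Probability that a random case outscores a random control (ties = 0.5)."""
    case = [s for s, t in zip(scores, y) if t == 1]
    ctrl = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in case for k in ctrl)
    return wins / (len(case) * len(ctrl))

def nagelkerke_r2(y, p):
    """Nagelkerke R^2 from fitted case probabilities p (each strictly between
    0 and 1), relative to an intercept-only null model."""
    n = len(y)
    ll1 = sum(math.log(pi if t == 1 else 1 - pi) for t, pi in zip(y, p))
    base = sum(y) / n
    ll0 = sum(math.log(base if t == 1 else 1 - base) for t in y)
    cox_snell = 1 - math.exp(2 * (ll0 - ll1) / n)
    return cox_snell / (1 - math.exp(2 * ll0 / n))

def simulate(n, rng):
    """Toy data: sites 0-2 are shifted upward in cases, sites 3-5 are noise."""
    X, y = [], []
    for i in range(n):
        t = i % 2
        signal = [rng.gauss(0.8 * t, 1.0) for _ in range(3)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
        X.append(signal + noise)
        y.append(t)
    return X, y

rng = random.Random(0)
pathways = {"pathway_A": [0, 1, 2], "pathway_B": [3, 4, 5]}
X_train, y_train = simulate(200, rng)   # plays the role of the discovery data
X_test, y_test = simulate(200, rng)     # plays the role of another data set

score = fit_two_stage(X_train, y_train, pathways)
test_scores = [score(row) for row in X_test]
test_auc = auc(y_test, test_scores)
print("AUC on held-out toy data:", round(test_auc, 3))
```

Here `nagelkerke_r2` takes fitted probabilities, for example from a logistic regression of diagnosis on the score; how the probabilities were obtained in the study is not stated in this section, so that step is left out of the sketch.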