SUPERLEARNER PREDICTION

S. Ariane Christie; Alan E Hubbard; Rachael A Callcut; Morad Hameed; Fanny Nadia Dissak-Delon; David Mekolo; Arabo Saidou; Alain Chichom Mefire; Pierre Nsongoo; Rochelle A. Dicker; Mitchell Jay Cohen; Catherine Juillard

This method is extracted from research article: J Trauma Acute Care Surg, Nov 2018

Machine Learning Without Borders? An Adaptable Tool to Optimize Mortality Prediction in Diverse Clinical Settings

DOI: 10.1097/TA.0000000000002044

SuperLearner (SL) is a previously validated ensemble machine learning algorithm that has been described in detail in prior publications (18). SL is available as a package for the R programming language (25). Both R and the SL package are open source and free of charge (https://cran.r-project.org/ and https://cran.r-project.org/web/packages/SuperLearner/index.html, respectively), which makes the method widely accessible to clinicians and researchers in both high-income and low- and middle-income country (LMIC) contexts.

Rather than pre-specifying a single statistical approach, SL evaluates multiple candidate algorithms, ranging from simple logistic regression to highly complex machine learning methods (e.g., neural nets), in order to optimally predict outcomes of interest from complex datasets. SL uses cross-validation to choose a weighted (convex) combination of these learners, that is, non-negative weights that sum to one, so as to optimize prediction on new data from the same data-generating distribution. Because the weights are chosen on held-out data, the embedded cross-validation guards against over-fitting (18).
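The weighting step described above can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the two toy base learners (`fit_mean`, `fit_stump`), the simulated data, the squared-error loss, and the grid search over a single weight. It is not the SuperLearner R package and not the learner library used in the study; it only shows how out-of-fold predictions can be used to pick a convex combination.

```python
import random

def fit_mean(xs, ys):
    """Toy learner 1 (hypothetical): ignore x, predict the training prevalence."""
    p = sum(ys) / len(ys)
    return lambda x: p

def fit_stump(xs, ys):
    """Toy learner 2 (hypothetical): prevalence below vs. at/above the training median."""
    cut = sorted(xs)[len(xs) // 2]
    lo = [y for x, y in zip(xs, ys) if x < cut]
    hi = [y for x, y in zip(xs, ys) if x >= cut]
    p_lo = sum(lo) / len(lo) if lo else 0.5
    p_hi = sum(hi) / len(hi) if hi else 0.5
    return lambda x: p_hi if x >= cut else p_lo

def out_of_fold_predictions(xs, ys, learners, n_folds=5):
    """For each observation, predict with models that never saw that observation."""
    n = len(xs)
    z = [[0.0] * len(learners) for _ in range(n)]
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        for k, fit in enumerate(learners):
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            for i in held_out:
                z[i][k] = model(xs[i])
    return z

def cv_loss(z, ys, a):
    """Cross-validated mean squared error of a*learner0 + (1-a)*learner1."""
    return sum((a * zi[0] + (1 - a) * zi[1] - y) ** 2 for zi, y in zip(z, ys)) / len(ys)

def convex_weights(z, ys, steps=100):
    """Grid search over a in [0, 1]; the weights (a, 1-a) are convex by construction."""
    best_a = min((s / steps for s in range(steps + 1)), key=lambda a: cv_loss(z, ys, a))
    return [best_a, 1 - best_a], cv_loss(z, ys, best_a)

# Simulated data (assumption): a binary outcome whose risk rises with x.
rng = random.Random(0)
xs = [rng.random() for _ in range(200)]
ys = [1 if rng.random() < 0.1 + 0.6 * x else 0 for x in xs]

learners = [fit_mean, fit_stump]
z = out_of_fold_predictions(xs, ys, learners)
weights, ensemble_loss = convex_weights(z, ys)
single_losses = [cv_loss(z, ys, 1.0), cv_loss(z, ys, 0.0)]

# Refit each learner on all data and combine with the cross-validated weights.
full_models = [fit(xs, ys) for fit in learners]

def super_learner(x):
    return sum(w * m(x) for w, m in zip(weights, full_models))

print("weights:", weights)
print("CV loss, ensemble vs. single learners:", ensemble_loss, single_losses)
```

Because the grid includes the weights (1, 0) and (0, 1), the cross-validated loss of the combination in this sketch can never be worse than that of either learner alone. With more than two learners, the grid search would have to be replaced by a constrained optimizer.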

In this study, SL was applied to all admission variables of the US, SA, and Cameroonian cohorts to generate setting-specific predictions of hospital mortality. The library of candidate algorithms included logistic and linear regression, generalized additive models with various levels of smoothing (26), random forest (27), lasso (28), and systems based on sieves of parametric models (e.g., polyclass). To report how well the resulting SL fit would perform on future data, we estimated the cross-validated area under the receiver operating characteristic (ROC) curve (cv-AUC). We also used cv-AUC as the objective function, so that the procedure optimizes prediction toward a more clinically relevant measure of performance (in this case, a function of sensitivity and specificity). SL predictions and cross-validated risk were used to evaluate the performance of each model (29).
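As a rough illustration of the cv-AUC metric mentioned above, the sketch below computes AUC as the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor (ties count one half), then averages it over validation folds. The helper names `auc` and `cv_auc`, the fold labels, and the eight example scores are hypothetical; the study's own computation in R is not reproduced here.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def cv_auc(scores, labels, fold_ids):
    """Mean of the per-fold AUCs, where scores are out-of-fold predictions."""
    per_fold = []
    for f in sorted(set(fold_ids)):
        idx = [i for i, g in enumerate(fold_ids) if g == f]
        per_fold.append(auc([scores[i] for i in idx], [labels[i] for i in idx]))
    return sum(per_fold) / len(per_fold)

# Made-up out-of-fold predicted risks for two validation folds of four patients.
scores   = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.60, 0.70]
labels   = [0,    0,    1,    1,    0,    1,    0,    1]
fold_ids = [0,    0,    0,    0,    1,    1,    1,    1]

fold0_auc = auc(scores[:4], labels[:4])    # 3 of 4 pairs ranked correctly
overall = cv_auc(scores, labels, fold_ids)
print(fold0_auc, overall)                  # prints 0.75 0.875
```

Using this quantity as the objective, rather than a likelihood or squared-error loss, means the ensemble weights are chosen for how well they rank patients by risk, which is the sense in which the metric is a function of sensitivity and specificity.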
