Least Absolute Shrinkage and Selection Operator Logistic Regression Model

Zhaohui Li
Yue Du
Youben Xiao
Liyong Yin

For an ordinary linear model:

$$Y = X\beta + \varepsilon$$

where $Y = (y_1, y_2, \ldots, y_n)^T$ is the response variable, $X = (X^{(1)}, X^{(2)}, \ldots, X^{(d)})$ is the covariate matrix, $\beta = (\beta_1, \beta_2, \ldots, \beta_d)^T$ is the regression coefficient vector, and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)^T$ is the random error term with $\varepsilon_i \sim N(0, \sigma^2)$. The minimizer of the penalized likelihood function is used as the regression coefficient estimate (Breiman, 1995; Tibshirani, 1996), which is calculated by

$$\hat{\beta} = \arg\min_{\beta} \left\{ \|Y - X\beta\|_2^2 + \lambda \|\beta\|_1 \right\}$$

where $\lambda$ is the weight coefficient of the LASSO penalty.
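As a minimal sketch of this penalized estimate, assuming Python's scikit-learn in place of the MATLAB glmnet toolbox used in the study (the synthetic data and penalty weight below are illustrative only):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic example: n = 50 samples, d = 10 covariates,
# only the first two true coefficients are non-zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
beta_true = np.array([3.0, -2.0] + [0.0] * 8)
y = X @ beta_true + rng.normal(scale=0.1, size=50)

# The weight lambda is called `alpha` in scikit-learn (its objective is
# scaled slightly differently than in the text, but the role is the same):
# a larger penalty weight shrinks more coefficients exactly to zero.
model = Lasso(alpha=0.5)
model.fit(X, y)

beta_hat = model.coef_  # sparse estimate: most entries are exactly zero
```

The L1 penalty is what gives LASSO its variable-selection behavior: with this penalty weight, the noise covariates are driven exactly to zero rather than merely shrunk.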

In this study, the input data were the PAC and AAC, including the coupling strengths within V1 and V4, respectively, and the coupling strengths between V1 and V4. The output is the grating orientation corresponding to the cross-frequency coupling strength. If a specific input is denoted by $I$, then the conditional probabilities of the corresponding outputs ($O_1$ and $O_2$) can be calculated by:

$$P(O_1 \mid I) = \sigma(w^T x), \qquad P(O_2 \mid I) = 1 - P(O_1 \mid I)$$

where $x$ is the vector of cross-frequency coupling strengths and $w$ is the weight vector. As shown in Figure 1, the inputs are first multiplied by their respective weights and then summed. Next, a non-linear logistic function is applied, i.e.:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
Figure 1. Schematic diagram of the least absolute shrinkage and selection operator (LASSO) logistic regression model.

Therefore, the conditional probabilities can be calculated, which represent the probability that an output corresponds to a given input. The weight vector yielding the larger probability is taken as the feature obtained by training on the current data set. In this study, the weight of the penalty function was set to 100, and the Elastic Net was used as the penalty function. Here, we have briefly introduced the main idea of the LASSO model. Training and prediction were implemented using the glmnet toolbox in MATLAB.
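The same kind of model can be sketched in Python with scikit-learn's elastic-net-penalized logistic regression (the study itself used glmnet in MATLAB; the synthetic data, `l1_ratio`, and `C` below are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the coupling-strength features: two classes
# (two grating orientations) separated in feature space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=-1.0, size=(40, 5)),
               rng.normal(loc=1.0, size=(40, 5))])
y = np.array([0] * 40 + [1] * 40)

# Elastic Net penalty (a mix of L1 and L2 terms); in scikit-learn the
# strength is given as C, the *inverse* of the penalty weight, so a large
# penalty weight corresponds to a small C.
clf = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, C=1.0, max_iter=5000)
clf.fit(X, y)

# P(O1 | I) and P(O2 | I) for each input, via the logistic function
probs = clf.predict_proba(X)
```

The predicted orientation for each trial is simply the output with the larger conditional probability, i.e. `clf.predict(X)`.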

In order to test the performance of the model, the cross-validation method was employed in this study. The prediction errors and their sum of squares were calculated (Hawkins et al., 2003; Braga-Neto and Dougherty, 2004; Vehtari et al., 2016). In each validation round, all samples were randomly divided into $M$ parts: $M - 2$ of them were used as the training set, and the remaining 2 as the testing set. The correct rate of each prediction is denoted by $R_i$, $i = 1, 2, \ldots, L$, where $L$ is the total number of training trials. The overall accuracy of the prediction is therefore calculated as:

$$R = \frac{1}{L} \sum_{i=1}^{L} R_i$$
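The splitting scheme above can be sketched as follows; a toy threshold classifier stands in for the LASSO model, and the function and variable names are illustrative, not from the study:

```python
import numpy as np

def cross_validate(X, y, M=5, seed=0):
    """M-fold scheme described above: in each round, 2 of the M parts
    form the testing set and the remaining M - 2 the training set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, M)
    rates = []
    for i in range(M):
        # Two parts for testing, the other M - 2 for training
        test_idx = np.concatenate([folds[i], folds[(i + 1) % M]])
        train_idx = np.setdiff1d(idx, test_idx)
        # Toy classifier: threshold at the mean of the training samples
        threshold = X[train_idx].mean()
        pred = (X[test_idx] > threshold).astype(int)
        rates.append(np.mean(pred == y[test_idx]))  # correct rate R_i
    # Overall accuracy R: the mean of the per-round correct rates
    return float(np.mean(rates))
```

Randomizing the partition in each round keeps the per-round correct rates $R_i$ from depending on any particular ordering of the trials.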