To measure the importance of the variables that make up the IPAQ, the Random Forest (RF) classification technique was used. RF is an ensemble method based on decision trees: each tree is induced from a random sample of instances and a random subset of variables and, for a classification problem, the model's prediction is determined by majority vote, that is, the class predicted most often by the trees in the ensemble. In addition to making predictions, an RF can rank the variables by predictive importance, and this ranking can be used to select input variables for other classification models. To measure the importance of a variable m, the RF sums the impurity (Gini index) over all nodes of a tree. The values of m are then randomly shuffled between the instances and the impurity sum is computed again. The importance of variable m is given by the average decrease in impurity across all trees [35].
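The ingredients described above (Gini impurity, majority vote, and the effect of shuffling a variable) can be illustrated with a minimal Python sketch. It is not the analysis code used in this study, which was run in MATLAB. As a simplifying assumption, it evaluates a single fixed split rather than inducing full trees, and the toy data and all function names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def majority_vote(tree_predictions):
    """Return the class predicted most often by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def split_impurity(values, labels, threshold):
    """Size-weighted Gini impurity of the two child nodes produced by
    the split `value <= threshold` (simplification: one split, no tree)."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical toy data: one informative variable and binary class labels.
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Impurity obtained when the variable is used as observed.
impurity_before = split_impurity(values, labels, threshold=4)

# Shuffle the variable's values between instances and evaluate again.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
shuffled = values[:]
rng.shuffle(shuffled)
impurity_after = split_impurity(shuffled, labels, threshold=4)

# Impurity reduction attributable to the variable for this single split.
importance = impurity_after - impurity_before
```

In a full forest, a quantity of this kind would be computed for every tree and averaged, giving one importance score per variable that can then be used to rank the variables.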
The RF model was fitted in MATLAB R2015b to obtain the importance of each variable.