The IEN was updated using Principal Component Analysis (PCA). PCA identifies the interdependence among a set of variables of interest and reduces dimensionality, yielding a smaller number of synthetic indices, or scores, one for each component extracted 15. The first component estimated for the IEN was used because it captured the largest share of the total variance of the dataset and was therefore the most suitable component for determining the household economic condition. This procedure was previously adopted by Barros & Victora 1.
The variables to include in the PCA were selected after obtaining two values: the Kaiser-Meyer-Olkin (KMO) statistic and the measure of sampling adequacy (MSA). Both values assess the suitability of a dataset for PCA. The KMO is a global measure, whereas the MSA is computed for each variable and helps to exclude variables that do not meet the criteria 16. The R programming language and its specific packages were used for all analyses (https://www.r-project.org). These indicators were estimated from the correlation matrix (svycor function, jtools package), which allows incorporation of the complex sampling design. For comparison, the estimation was also performed disregarding the complex sampling design (cor function of base R and KMO function of the psych package) (https://cran.r-project.org/package=psych). Then, the components were extracted, the proportion of explained variance was estimated, and the IEN scores were calculated.
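As an illustration of what the two adequacy measures capture, the following minimal sketch computes the overall KMO and the per-variable MSA directly from a correlation matrix using base R only. The helper name kmo_from_cor and the simulated items are hypothetical and are not part of the original analysis; the function accepts any correlation matrix, so a design-based matrix could in principle be supplied in place of the unweighted one used here.

```r
# Overall KMO and per-variable MSA from a correlation matrix (base R only).
# kmo_from_cor is a hypothetical helper written for this illustration.
kmo_from_cor <- function(R) {
  inv_R <- solve(R)
  # Partial correlations of each pair given all other variables
  d <- sqrt(diag(inv_R))
  P <- -inv_R / outer(d, d)
  # Only off-diagonal elements enter the statistics
  diag(R) <- 0
  diag(P) <- 0
  r2 <- R^2
  p2 <- P^2
  list(
    overall = sum(r2) / (sum(r2) + sum(p2)),
    msa     = colSums(r2) / (colSums(r2) + colSums(p2))
  )
}

# Simulated household items driven by one hypothetical latent factor
set.seed(1)
n <- 500
wealth <- rnorm(n)
items <- data.frame(
  tv       = wealth + rnorm(n),
  car      = wealth + rnorm(n),
  bedrooms = wealth + rnorm(n),
  noise    = rnorm(n)
)

res <- kmo_from_cor(cor(items))
res$overall  # global measure
res$msa      # one value per variable
```

Variables with a low MSA would be candidates for exclusion before the PCA is run.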
To estimate the IEN, it was necessary to recode the original variables collected in the field, transforming them into numerical variables and, when necessary, combining some response categories. For example, the variable number of bedrooms was coded from 1 to 4, corresponding to the answers 1, 2, 3, and 4 or more, respectively. In all cases, the resulting variables were ordinal.
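The bedroom example can be sketched in base R as follows. The raw answers are invented for illustration, and the variable names are hypothetical.

```r
# Hypothetical raw answers: number of bedrooms reported in the field
bedrooms_raw <- c(1, 2, 3, 4, 5, 7, 2)

# Top-code at 4 so that every answer of "4 or more" receives the code 4
bedrooms_cod <- pmin(bedrooms_raw, 4)

# Cross-tabulate raw answers against the recoded ordinal variable
table(raw = bedrooms_raw, coded = bedrooms_cod)
```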
The IEN was estimated with the household as the unit of analysis. However, in the ENANI-2019, the head of the household was not defined; instead, information was collected on the mother or caregiver of each child. Thus, in households with more than one mother or caregiver, the one with the highest education level was selected. As a result, only one education level per household was considered. This procedure was performed in 147 of the 12,524 households studied.
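A minimal sketch of this selection rule, assuming education is already coded as an ordinal number and using hypothetical column names (household_id, education):

```r
# Hypothetical caregiver-level records; education is an ordinal code
caregivers <- data.frame(
  household_id = c(1, 1, 2, 3, 3, 3),
  education    = c(2, 4, 3, 1, 5, 2)
)

# Keep a single value per household: the highest education level observed
household_edu <- aggregate(education ~ household_id,
                           data = caregivers, FUN = max)
household_edu
```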
The estimation of synthetic indices that incorporate the sampling design of the data for use in later analyses is relatively underexplored. However, studies that analyze data from surveys with a complex sampling design should incorporate that design to produce more robust point and variance estimates 17, 18. Thus, the IEN was estimated from the ENANI-2019 data incorporating the complex sampling design. For evaluation purposes, the IEN was also calculated without considering the complex sampling design. The svyprcomp function of the survey package was used to incorporate the complex sampling design, and the prcomp function of base R was used for the calculation without it.
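The two estimation routes can be compared with a sketch such as the one below. The data are simulated, and the design variables (stratum, psu, weight) and item names are hypothetical; the actual ENANI-2019 design specification and variable set are not reproduced here. Standardizing the variables (scale. = TRUE) is also an assumption of this sketch.

```r
library(survey)

# Simulated households: 4 strata, 10 PSUs per stratum, unequal weights
set.seed(42)
n <- 400
wealth <- rnorm(n)  # hypothetical latent economic condition
dat <- data.frame(
  stratum  = rep(1:4, each = 100),
  psu      = rep(1:40, each = 10),
  weight   = runif(n, 0.5, 3),
  tv       = wealth + rnorm(n),
  car      = wealth + rnorm(n),
  bedrooms = wealth + rnorm(n)
)

des <- svydesign(ids = ~psu, strata = ~stratum, weights = ~weight, data = dat)

# PCA incorporating the sampling weights of the design
pca_design <- svyprcomp(~tv + car + bedrooms, design = des,
                        center = TRUE, scale. = TRUE, scores = TRUE)

# PCA disregarding the sampling design
pca_naive <- prcomp(~tv + car + bedrooms, data = dat,
                    center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each component
explained <- function(p) p$sdev^2 / sum(p$sdev^2)
explained(pca_design)[1]
explained(pca_naive)[1]

# Scores on the first component play the role of the index
ien_design <- pca_design$x[, 1]
ien_naive  <- pca_naive$x[, 1]
```

The sign of a principal component is arbitrary, so the two score vectors may need to be oriented in the same direction before their distributions are compared.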
The effect of incorporating the complex sampling design in the IEN was evaluated in three ways: (i) using the coefficient of variation (CV) of the estimates, a measure of dispersion obtained as the ratio between the standard error and the estimated value of the indicator, multiplied by 100 to express it as a percentage, which allowed precision to be measured; (ii) using the proportion of total variance explained; and (iii) according to the score distribution.
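The CV definition above amounts to a one-line calculation. The helper name cv_percent and the numbers are hypothetical and serve only to show the arithmetic:

```r
# Coefficient of variation of an estimate, expressed as a percentage
cv_percent <- function(estimate, se) 100 * se / estimate

# Hypothetical estimate and standard error
cv_example <- cv_percent(estimate = 2.5, se = 0.1)
cv_example
```

A smaller CV indicates a more precise estimate, which is how the estimates with and without the sampling design can be compared.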
The IEN incorporating the complex sampling design was validated using two procedures. The first examined the association of the IEN with the total household income using Spearman’s correlation analysis (weightedCorr function of the wCorr package). Respondents reported total household income, calculated as the sum of the monetary income of all household members, including job, retirement, pensions, government benefits, savings accounts, rent, and other financial investments. The total household income was chosen over per capita income because the IEN is a measure that better discriminates the total income and is not adjusted for the number of people in the household 1. Total income was analyzed on a natural logarithmic scale, the same method used by Barros & Victora 1, which allows comparison between the studies.
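A minimal sketch of the weighted Spearman correlation with the wCorr package follows. The index scores, incomes, and sampling weights are simulated, and all object names are hypothetical.

```r
library(wCorr)

set.seed(7)
n <- 300
ien    <- rnorm(n)                                 # hypothetical index scores
income <- exp(7 + 0.6 * ien + rnorm(n, sd = 0.8))  # hypothetical total income
w      <- runif(n, 0.5, 3)                         # hypothetical sampling weights

# Weighted Spearman correlation between the index and log income
rho <- weightedCorr(x = ien, y = log(income),
                    method = "Spearman", weights = w)
rho
```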
The second procedure analyzed the mean IEN across household and child categorical variables that discriminate socioeconomic status and living conditions, using graphs with 95% confidence intervals (95%CI) and the packages survey, srvyr, and tidyverse 19. The following variables were used: (i) access to sewage system (public sewage or rainwater drainage network); (ii) presence of a beneficiary of the Brazilian Income Transfer Program in the household; (iii) household classification on the Brazilian Food Insecurity Scale (EBIA) as secure or with light/moderate/severe insecurity 20, 21; and (iv) the Z score of the height-for-age index, which was used to classify children into stunting (< -2) and adequate height (≥ -2) categories according to the World Health Organization (WHO) reference curve 22, 23. Notably, this last variable used the child as the unit of analysis. These indicators were selected because they are essential in studies on wealth inequality and have been used, or have been proposed for future use, to evaluate the performance of the IEN 1.
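Design-based group means with 95%CI of the kind described above can be obtained as in the sketch below. It calls the survey package directly rather than the srvyr and tidyverse interface, and the data, design variables, and the sewage indicator are simulated and hypothetical.

```r
library(survey)

# Simulated households with a hypothetical sewage-access indicator
set.seed(3)
n <- 400
dat <- data.frame(
  stratum = rep(1:4, each = 100),
  psu     = rep(1:40, each = 10),
  weight  = runif(n, 0.5, 3),
  sewage  = factor(sample(c("no", "yes"), n, replace = TRUE))
)
# Hypothetical index: higher on average where sewage access is present
dat$ien <- rnorm(n, mean = ifelse(dat$sewage == "yes", 0.5, -0.5))

des <- svydesign(ids = ~psu, strata = ~stratum, weights = ~weight, data = dat)

# Mean index by category, with design-based 95% confidence intervals
means <- svyby(~ien, ~sewage, des, svymean)
ci    <- confint(means, level = 0.95)
cbind(mean = coef(means), ci)
```

The resulting table holds one row per category and can be passed to a plotting function to draw the means with their interval bars.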
After validation, the average total household income was calculated for each IEN category and plotted with 95%CI. The IEN was categorized based on measures of position (thirds, quarters, fifths, and tenths), incorporating the complex sampling design (survey_quantile function, srvyr package). The final IEN score was calculated incorporating the complex sampling design, and its category (a measure of position: thirds, quarters, fifths, and tenths) was determined for each child, according to the location of their residence.
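The categorization step can be illustrated with a base R approximation that derives weighted cut-points from the weighted empirical distribution and assigns each household to a fifth. This is not the survey_quantile implementation: the helper weighted_cutpoints is hypothetical, it may use a different interpolation rule, and it produces no variance estimates. The scores and weights are simulated.

```r
# Weighted cut-points from the weighted empirical distribution
# (hypothetical helper; point estimates only)
weighted_cutpoints <- function(x, w, probs) {
  ord   <- order(x)
  x     <- x[ord]
  cum_w <- cumsum(w[ord]) / sum(w)
  # Smallest observed value whose cumulative weight share reaches each prob
  vapply(probs, function(p) x[which(cum_w >= p)[1]], numeric(1))
}

set.seed(9)
ien <- rnorm(1000)           # hypothetical index scores
w   <- runif(1000, 0.5, 3)   # hypothetical sampling weights

# Cut-points for fifths; thirds, quarters, and tenths follow the same pattern
cuts <- weighted_cutpoints(ien, w, probs = c(0.2, 0.4, 0.6, 0.8))
ien_fifth <- cut(ien, breaks = c(-Inf, cuts, Inf), labels = 1:5)

# Weighted share of households falling in each fifth
shares <- tapply(w, ien_fifth, sum) / sum(w)
shares
```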