Update and validation of IEN

Pedro Gomes Andrade; Raquel Schincaglia; Dayana Rodrigues Farias; Inês Rugani Ribeiro de Castro; Luiz Antonio dos Anjos; Elisa Maria de Aquino Lacerda; Cristiano Siqueira Boccolini; Nadya Helena Alves-Santos; Paula Normando; Maiara Brusco de Freitas; Neilane Bertoni; Gilberto Kac

This method is extracted from research article: Cad Saude Publica, Aug 2023

The National Wealth Score in the Brazilian National Survey on Child Nutrition (ENANI-2019)

DOI: 10.1590/0102-311XEN050822

The IEN was updated using principal component analysis (PCA). PCA identifies the interdependence among a set of variables of interest and reduces their dimensionality to a smaller number of synthetic indices, or scores, one for each component created ¹⁵. The first component was used for the IEN because it captures the highest percentage of the total variance of the dataset, which makes it the most suitable component for determining household economic condition. This procedure was previously adopted by Barros & Victora ¹.
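As an illustration of the idea, the sketch below extracts the first principal component from a correlation matrix by power iteration and computes its share of the total variance. It is a minimal, unweighted example in plain Python, not the authors' R code; the toy data and all function names are hypothetical.

```python
import math

# Hypothetical toy data: one row per household, columns are ordinal asset
# variables (for example bedrooms, cars, education level). Not ENANI-2019 data.
households = [
    [1, 0, 1],
    [2, 1, 1],
    [2, 1, 2],
    [3, 2, 2],
    [4, 2, 3],
    [4, 3, 4],
]

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(z):
    """Pearson correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)]
            for a in range(p)]

def first_component(corr, iterations=500):
    """Leading eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * corr[a][b] * v[b]
                     for a in range(p) for b in range(p))
    return v, eigenvalue

z = standardize(households)
corr = correlation_matrix(z)
loadings, eigenvalue = first_component(corr)
# The trace of a correlation matrix equals the number of variables.
explained_share = eigenvalue / len(corr)
# Household score = standardized variables weighted by the loadings.
scores = [sum(l * x for l, x in zip(loadings, row)) for row in z]
print(round(explained_share, 3), [round(s, 2) for s in scores])
```

Households with more assets receive higher scores on the first component, which is why that component can be read as a wealth index.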

The variables to include in the PCA were selected after obtaining two values: the Kaiser-Meyer-Olkin (KMO) statistic and the measure of sampling adequacy (MSA). Both assess the suitability of a dataset for PCA. The KMO is a global measure, whereas the MSA is computed for each variable and helps to exclude variables that do not meet the adequacy criteria ¹⁶. All analyses were performed in the R programming language with its specific packages (https://www.r-project.org). These indicators were estimated from the correlation matrix obtained with the svycor function of the jtools package, which allows the complex sampling design to be incorporated. For comparison, they were also estimated disregarding the complex sampling design, using the cor function of base R and the KMO function of the psych package (https://cran.r-project.org/package=psych). The components were then extracted, the proportion of explained variance was estimated, and the IEN scores were calculated.
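For readers unfamiliar with these two statistics, the following sketch computes the overall KMO and the per-variable MSA from a correlation matrix, using the standard definition based on partial (anti-image) correlations. It is a plain-Python illustration under that assumption, not the psych or jtools implementation; the function names and the example matrix are hypothetical.

```python
import math

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kmo(corr):
    """Return (overall KMO, list of per-variable MSA) for a correlation matrix."""
    p = len(corr)
    inv = invert(corr)
    # Partial correlation of i and j given all other variables.
    partial = [[-inv[i][j] / math.sqrt(inv[i][i] * inv[j][j]) for j in range(p)]
               for i in range(p)]
    r2 = [[corr[i][j] ** 2 if i != j else 0.0 for j in range(p)] for i in range(p)]
    q2 = [[partial[i][j] ** 2 if i != j else 0.0 for j in range(p)] for i in range(p)]
    msa = [sum(r2[i]) / (sum(r2[i]) + sum(q2[i])) for i in range(p)]
    total_r2 = sum(sum(row) for row in r2)
    total_q2 = sum(sum(row) for row in q2)
    return total_r2 / (total_r2 + total_q2), msa

# Hypothetical correlation matrix of three asset variables.
example = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
overall, per_variable = kmo(example)
print(round(overall, 3), [round(m, 3) for m in per_variable])
```

Values closer to 1 indicate that the correlations are largely shared among variables rather than confined to pairs, which is the situation PCA summarizes well.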

To estimate the IEN, the original variables collected in the field had to be recoded: they were transformed into numerical variables and, when necessary, some response categories were combined. For example, the variable number of bedrooms was coded from 1 to 4, corresponding to the answers 1, 2, 3, and 4 or more, respectively. In all cases, the resulting variables were ordinal.
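The bedroom example can be sketched as a small recoding function. The function name is hypothetical, and rejecting counts below 1 is an assumption, since the text does not say how such answers were handled.

```python
def recode_bedrooms(n_bedrooms):
    """Map a raw bedroom count to the ordinal codes 1..4 (4 means '4 or more')."""
    if n_bedrooms < 1:
        # Assumption: the source only describes answers of 1 or more.
        raise ValueError("bedroom count below 1 is not covered by the coding")
    return min(int(n_bedrooms), 4)

raw_answers = [1, 2, 3, 4, 6]
print([recode_bedrooms(n) for n in raw_answers])
```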

The IEN was estimated with the household as the unit of analysis. However, the ENANI-2019 did not define a head of household; instead, information was collected on the mother or caregiver of each child. Thus, in households with more than one mother or caregiver, the one with the highest education level was selected, so that only one education level per household was considered. This procedure was applied in 147 of the 12,524 households studied.
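A minimal sketch of this selection rule follows, assuming education is already coded as an ordinal number where larger means more schooling. The record layout and identifiers are hypothetical.

```python
def household_education(records):
    """Keep the highest education level observed in each household.

    records: iterable of (household_id, education_level) pairs, one per
    mother or caregiver.
    """
    highest = {}
    for household_id, level in records:
        if household_id not in highest or level > highest[household_id]:
            highest[household_id] = level
    return highest

# Hypothetical records: household "A" has two caregivers.
caregivers = [("A", 2), ("A", 5), ("B", 3)]
print(household_education(caregivers))
```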

The estimation of synthetic indices that incorporate the sampling design of the data for use in later analyses is relatively underexplored. However, studies that analyze data from surveys with a complex sampling design should incorporate that design to produce more robust point and variance estimates ¹⁷,¹⁸. Thus, the IEN was estimated from the ENANI-2019 data incorporating the complex sampling design. For evaluation purposes, the IEN was also calculated without considering the complex sampling design. The svyprcomp function of the survey package was used to incorporate the complex sampling design, and the prcomp function of base R was used for the calculation without it.
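To show how sampling weights enter the point estimates that feed a PCA, the sketch below computes a weighted Pearson correlation. This is a deliberate simplification: it covers only the weighting of point estimates and does not reproduce the design-based variance estimation (strata, clusters) that the survey package handles. All names and values are hypothetical.

```python
import math

def weighted_corr(x, y, w):
    """Pearson correlation of x and y with observation weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov_xy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov_xy / math.sqrt(var_x * var_y)

# Hypothetical asset variables and sampling weights for four households.
bedrooms = [1, 2, 3, 4]
cars = [2, 1, 4, 3]
weights = [1, 2, 1, 3]

print(weighted_corr(bedrooms, cars, [1, 1, 1, 1]))  # weights ignored
print(weighted_corr(bedrooms, cars, weights))       # weights applied
```

With integer weights, the weighted coefficient equals the unweighted one computed after repeating each household as many times as its weight, which is a convenient way to check the function.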

The effect of incorporating the complex sampling design in the IEN was evaluated in three ways: (i) using the coefficient of variation (CV) of the estimates, a measure of dispersion that indicates the heterogeneity of the data and allows precision to be measured, obtained as the ratio between the standard error and the estimated value of the indicator, multiplied by 100 to express it as a percentage; (ii) using the amount of total variation explained; and (iii) according to the score distribution.
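The CV defined above amounts to a one-line calculation. The sketch below is a direct transcription of that definition; the function name and the numbers are hypothetical.

```python
def coefficient_of_variation(standard_error, estimate):
    """CV in percent: standard error divided by the estimate, times 100."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    return 100.0 * standard_error / estimate

# Hypothetical estimate of 10.0 with a standard error of 0.5.
print(coefficient_of_variation(0.5, 10.0))
```

A smaller CV indicates a more precise estimate, so comparing the CV with and without the sampling design shows how the design affects precision.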

The IEN incorporating the complex sampling design was validated using two procedures. The first examined the association of the IEN with total household income using Spearman's correlation analysis (weightedCorr function of the wCorr package). Respondents reported total household income, calculated as the sum of the monetary income of all household members, including income from work, retirement, pensions, government benefits, savings accounts, rent, and other financial investments. Total household income was chosen over per capita income because the IEN better discriminates total income and is not adjusted for the number of people in the household ¹. Total income was analyzed on a natural logarithmic scale. This was the same method used by Barros & Victora ¹, which allows comparison between the studies.
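The sketch below computes an unweighted Spearman coefficient between an index and log income, using average ranks for ties. It does not implement the weighted ranking used by wCorr; the data and function names are hypothetical. Because the coefficient depends only on ranks, the example also shows that it takes the same value on the raw and the log income scale.

```python
import math

def midranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def pearson(x, y):
    """Unweighted Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the midranks."""
    return pearson(midranks(x), midranks(y))

# Hypothetical index scores and total household incomes.
ien = [-1.2, -0.5, -0.1, 0.3, 0.3, 1.4]
income = [600.0, 900.0, 800.0, 1500.0, 2500.0, 9000.0]
log_income = [math.log(v) for v in income]

rho_raw = spearman(ien, income)
rho_log = spearman(ien, log_income)
print(rho_raw, rho_log)
```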

The second procedure analyzed the mean IEN across categories of household and child variables that discriminate socioeconomic status and living conditions, using graphs with 95% confidence intervals (95%CI) and the packages survey, srvyr, and tidyverse ¹⁹. The following variables were used: (i) access to a sewage system (public sewage or rainwater drainage network); (ii) presence of a beneficiary of the Brazilian Income Transfer Program in the household; (iii) household classification on the Brazilian Food Insecurity Scale (EBIA) as secure or with light/moderate/severe insecurity ²⁰,²¹; and (iv) the Z score of the height-for-age index, which was used to classify children into stunting (< -2) and adequate height (≥ -2) categories according to the World Health Organization (WHO) reference curve ²²,²³. Notably, this last variable used the child as the unit of analysis. These indicators were selected because they are essential in studies on wealth inequality and have been used, or proposed for future use, to evaluate the performance of the IEN ¹.
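As a sketch of this comparison, the code below classifies children by the height-for-age cutoff given in the text and computes the weighted mean index in each category. It gives point estimates only; the 95%CI reported in the study require design-based standard errors, which are not reproduced here. The data and function names are hypothetical.

```python
def height_category(haz):
    """Classify a height-for-age Z score using the -2 cutoff from the text."""
    return "stunting" if haz < -2 else "adequate height"

def weighted_mean_by_group(rows):
    """rows: iterable of (group, value, weight); returns {group: weighted mean}."""
    sums = {}
    for group, value, weight in rows:
        numerator, denominator = sums.get(group, (0.0, 0.0))
        sums[group] = (numerator + weight * value, denominator + weight)
    return {g: num / den for g, (num, den) in sums.items()}

# Hypothetical children: (height-for-age Z score, IEN score, sampling weight).
children = [
    (-2.5, -1.0, 2.0),
    (-2.1, -0.4, 1.0),
    (0.3, 0.5, 1.0),
    (-2.0, 0.9, 3.0),
]
rows = [(height_category(haz), ien, w) for haz, ien, w in children]
means = weighted_mean_by_group(rows)
print(means)
```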

After validation, the average total household income was calculated for each IEN category and plotted with 95%CI. The IEN was categorized using measures of position (thirds, quarters, fifths, and tenths), incorporating the complex sampling design (survey_quantile function, srvyr package). The final IEN score was calculated incorporating the complex sampling design, and its category under each measure of position (thirds, quarters, fifths, and tenths) was determined for each child, according to the location of their residence.
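The categorization step can be sketched with a simple weighted quantile rule: each cut point is the smallest score whose cumulative share of the sampling weight reaches the target fraction. This rule is an assumption chosen for clarity; survey_quantile may use a different quantile definition and also returns design-based uncertainty. The function names and values are hypothetical.

```python
def weighted_cut_points(values, weights, n_groups):
    """Cut points that split the weighted distribution into n_groups parts."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cuts = []
    for k in range(1, n_groups):
        cumulative = 0.0
        for value, weight in pairs:
            cumulative += weight
            # Same test as cumulative / total >= k / n_groups, without division.
            if cumulative * n_groups >= k * total:
                cuts.append(value)
                break
    return cuts

def category(value, cuts):
    """1-based category: one plus the number of cut points the value exceeds."""
    return 1 + sum(value > c for c in cuts)

# Hypothetical scores with equal weights, split into fifths.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
equal_weights = [1.0] * 10
fifths = weighted_cut_points(scores, equal_weights, 5)
print(fifths, [category(s, fifths) for s in scores])

# Unequal weights move the cut points toward heavily weighted households.
print(weighted_cut_points([1, 2, 3, 4], [1, 1, 1, 7], 3))
```

Once each household has a category, the mean income per category can be obtained with the same kind of weighted group mean used for the other validation variables.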

This is an open-access article distributed under the terms of the Creative Commons Attribution License
