Data analysis

Gebreyesus Brhane Tesfahunegn
Elias T. Ayuk
S. G. K. Adiku

SPSS version 20 software was used to analyse the data collected from the two study villages. Specifically, descriptive statistics, non-parametric tests (e.g., the chi-square test), t-tests and econometric analyses were used, with statistical significance assessed at a probability level (P) ≤ 0.05. The chi-square test was used to assess whether the proportions of respondents who replied in favour of a given question differed significantly.

Binary logistic regression was used to analyse the effects of the explanatory variables on the dependent variable (farmers’ perception of soil erosion as a problem) at P ≤ 0.05. The logistic model was selected over alternatives such as probit because it is more descriptive in nature and more readily interpretable for data on the perception of erosion [38, 47]. The dependent variable is whether a farmer perceived soil erosion as a problem: it is coded 1 (yes) if the farmer perceived soil erosion as a problem and 0 otherwise, so that 1 indicates the presence and 0 the absence of the perception.

The independent variables included in this study were: age, gender, education, family size, literacy ratio (illiterate/literate), dependency ratio (dependents/productive), total number of livestock, farm size, land tenure, farmland ownership, farming experience, experience with soil management practices, off-farm activities, income from agriculture, access to extension services such as training, access to information (media, radio), access to credit, slope of farmland, and distance of arable land from home and from the main road. The hypothesised relationships of these variables with the dependent variable were determined based on the existing literature and the researchers’ judgment (Table 1).
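The chi-square comparison of proportions described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in the study: the counts are hypothetical, the function name `chi_square_statistic` is introduced here for the example, and no continuity correction is applied. The statistic is compared with 3.841, the chi-square critical value for one degree of freedom at P = 0.05.

```python
# Illustrative sketch only: the counts below are hypothetical, not study data.

def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Critical value of the chi-square distribution for df = 1 at P = 0.05.
CRITICAL_DF1_P05 = 3.841

# Rows: two groups of respondents; columns: replied "yes", replied "no".
observed = [[30, 10],
            [20, 20]]

stat = chi_square_statistic(observed)
significant = stat > CRITICAL_DF1_P05
print(f"chi-square = {stat:.3f}, significant at P <= 0.05: {significant}")
```

With these hypothetical counts, the proportions replying "yes" (75% versus 50%) differ significantly at the 0.05 level.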

Note: H0 indicates the hypothesised effect of an explanatory variable on the dependent variable, i.e., whether farmers perceived soil erosion as a problem (1 = yes, 0 = no); + is a positive and – is a negative effect. HH is household head.

Multicollinearity among the explanatory variables was tested separately for continuous and dummy/discrete variables before the binary logistic regression analysis was conducted. The Variance Inflation Factor (VIF) was used to detect multicollinearity among the continuous independent variables, whereas the contingency coefficient (CC) was used for the dummy or discrete variables. As a rule of thumb, if the VIF of a variable exceeds 10, there is a strong multicollinearity problem and the non-significant variables should be excluded from the analysis [55–57]. CC values vary between 0 and 1: zero indicates no association between variables, while values close to 1 indicate a high degree of association. The association is considered high when the CC exceeds 0.75 [55, 56]. In this study, the VIF and CC values among the independent variables were within the lower range of association (data not shown), which indicates that there was no serious multicollinearity problem among most of the explanatory variables. Strong multicollinearity was detected only between the variable pairs ‘education and literacy ratio’ and ‘total family size and dependency ratio’; literacy ratio and dependency ratio were therefore excluded from the analysis, as they did not significantly influence the dependent variable.
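The two diagnostics above can be illustrated with a short Python sketch. It is not the study's SPSS output: all values are hypothetical and the function names are introduced here for the example. The VIF is computed as 1 / (1 − R²), where R² comes from regressing one predictor on the others; with only two predictors, as assumed here, R² is simply the squared Pearson correlation. The CC is computed as Pearson's contingency coefficient, sqrt(χ² / (χ² + n)).

```python
import math

# Illustrative sketch only: all values below are hypothetical, not study data.

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_from_r_squared(r_squared):
    """VIF = 1 / (1 - R^2) for a predictor regressed on the other predictors."""
    return 1.0 / (1.0 - r_squared)

def contingency_coefficient(table):
    """Pearson's contingency coefficient for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )
    return math.sqrt(chi2 / (chi2 + n))

# Two continuous predictors (hypothetical): age and farming experience in years.
age = [25, 35, 45, 55, 65]
experience = [4, 8, 10, 8, 10]
vif = vif_from_r_squared(pearson_r(age, experience) ** 2)

# Two dummy predictors (hypothetical cross-tabulation of counts).
crosstab = [[30, 10],
            [20, 20]]
cc = contingency_coefficient(crosstab)

# Rule-of-thumb thresholds quoted in the text: VIF > 10 and CC > 0.75.
print(f"VIF = {vif:.2f} (problem: {vif > 10})")
print(f"CC  = {cc:.2f} (problem: {cc > 0.75})")
```

Both hypothetical values fall below the quoted thresholds, so neither pair would be flagged in this example.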
