We applied dEBM to cross-sectional data to estimate the most likely sequence in which regions become increasingly disconnected across the spectrum from no/mild disability to moderate/severe disability, or from CP to CI (see Fig. 1).27,28 The conversion of a metric (in this case regional structural disconnection) from healthy to pathological is called an event, defined for each subject n and region p. First, dEBM estimates the posterior probability of each region being within the healthy range (i.e. less disconnected) versus pathological (i.e. more disconnected) using a Gaussian mixture model (GMM) of the two subject groups’ distributions of regional structural disconnectivity. The GMM allows calculation of the likelihood that region p of subject n is pathological (i.e. that the structural disconnection has occurred in region p for subject n) or healthy (i.e. that the structural disconnection has not occurred in region p for subject n). Second, the sequence (S) of the events (i.e. regions becoming pathologically disconnected) is estimated by maximizing the following likelihood:
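In the event-based model literature, this likelihood is usually written in the form below. The notation is an assumption chosen to match the surrounding definitions rather than taken from the source: N is the number of subjects, R the number of regions, S(i) the region at position i of the sequence S, x_{n,S(i)} the structural disconnectivity of that region in subject n, and E_{S(i)} the corresponding event.

```latex
P(X \mid S) = \prod_{n=1}^{N} \left[ \sum_{k=0}^{R} P(k)
  \left( \prod_{i=1}^{k} P\left(x_{n,S(i)} \mid E_{S(i)}\right)
         \prod_{i=k+1}^{R} P\left(x_{n,S(i)} \mid \neg E_{S(i)}\right) \right) \right]
```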
Workflow. (A) The NeMo tool was used to predict regional structural disconnectivity metrics from the lesion masks of people with multiple sclerosis. (B) The dEBM algorithm was applied to the regional structural disconnectivity metrics to estimate the sequence of regional structural disconnections as disability/cognitive impairment occurs. First, a GMM was used to calculate the probability distributions of the regional structural disconnectivity metrics for two groups (for example no disability versus disability groups). Second, the likelihood defined in the text is maximized by finding the optimal sequence of regional structural disconnection ‘events’. Results are visualized via a matrix with rows indicating regional structural disconnectivity and columns indicating the position in the sequence of disability/cognitive impairment progression. The intensity of each matrix entry corresponds to the certainty of that region’s structural disconnection at that position in the sequence across 100 bootstraps.
where X is the data matrix containing the regional disconnectivity scores for all subjects and P(k) is the prior probability of being at stage k, which indicates that the first k events in the sequence have occurred but the remaining events have not yet occurred. The entire algorithm is repeated on 100 bootstrapped samples of the data to compute the uncertainty (i.e. the positional variance of the sequence) and the standard error of the event centres. dEBM results are visualized via a matrix with rows indicating regions and columns indicating position in the sequence of disability/cognitive impairment progression. The intensity of each matrix entry corresponds to the certainty of that region’s structural disconnection at that position in the sequence across 100 bootstraps (a darker colour indicates greater certainty). The code to perform the dEBM analysis is publicly available (https://github.com/EuroPOND/pyebm).
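As an illustration of how a likelihood of this kind can be evaluated and maximized, the following is a minimal Python sketch on toy data. It rests on several assumptions: the prior P(k) is uniform, the per-region likelihoods that would come from the GMM step are supplied as ready-made inputs, and the search over sequences is exhaustive, which is only feasible for a handful of regions. The function names are hypothetical and this is not the pyebm implementation.

```python
import itertools
import math


def sequence_log_likelihood(p_event, p_no_event, sequence):
    """Log-likelihood of the data given an event sequence.

    p_event[n][r]    : likelihood of subject n's value in region r if the event occurred
    p_no_event[n][r] : likelihood of the same value if the event has not occurred
    sequence         : tuple of region indices, earliest event first
    """
    n_regions = len(sequence)
    prior = 1.0 / (n_regions + 1)  # assumption: uniform P(k) over stages 0..R
    total = 0.0
    for subj_event, subj_no_event in zip(p_event, p_no_event):
        subject_likelihood = 0.0
        for k in range(n_regions + 1):
            # Stage k: the first k events in the sequence have occurred.
            stage_likelihood = prior
            for position, region in enumerate(sequence):
                if position < k:
                    stage_likelihood *= subj_event[region]
                else:
                    stage_likelihood *= subj_no_event[region]
            subject_likelihood += stage_likelihood
        total += math.log(subject_likelihood)
    return total


def best_sequence(p_event, p_no_event):
    """Exhaustive search over all orderings (toy-sized problems only)."""
    n_regions = len(p_event[0])
    return max(
        itertools.permutations(range(n_regions)),
        key=lambda seq: sequence_log_likelihood(p_event, p_no_event, seq),
    )


# Toy data: four subjects at stages 0..3 of the true order region 2 -> 0 -> 1.
p_event = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.9], [0.9, 0.1, 0.9], [0.9, 0.9, 0.9]]
p_no_event = [[1 - v for v in row] for row in p_event]
print(best_sequence(p_event, p_no_event))  # (2, 0, 1)
```

With 20 regions the number of orderings is far too large for exhaustive search, so a practical implementation needs a more efficient optimization strategy.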
In our study, we applied the dEBM algorithm to three different data sets (see Supplementary Fig. 1 for the flow diagram):
For each data set, we compared the regional structural disconnectivity metrics between disability or cognition groups (each domain was analysed separately) using Student’s t-test. First, we repeated Student’s t-test on 100 bootstrap samples of the original data set. Second, we ranked the regions within each bootstrap sample. Third, we averaged each region’s rank over the 100 bootstrap iterations. Finally, to reduce the dimensionality of the model, we used the 20 regions with the highest average rank as input to the dEBM algorithm. To control for confounding variables, age and sex were regressed out of each structural disconnectivity metric using a linear regression model before inclusion in the dEBM.
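The bootstrap ranking procedure can be sketched in Python as follows. Several details are assumptions, since the text does not specify them: subjects are resampled with replacement within each group, regions are ranked by the absolute value of the t statistic so that larger group differences receive higher ranks, and the function names are hypothetical. Covariate adjustment is not included in this sketch.

```python
import random
import statistics


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    if pooled == 0:
        return 0.0  # degenerate resample with no variance; treated as no difference
    return (statistics.mean(a) - statistics.mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def select_regions(group_a, group_b, n_keep=20, n_boot=100, seed=0):
    """Return the n_keep regions with the highest bootstrap-averaged rank.

    group_a, group_b: lists of subjects, each subject a list of regional values.
    """
    rng = random.Random(seed)
    n_regions = len(group_a[0])
    rank_sums = [0.0] * n_regions
    for _ in range(n_boot):
        # Assumption: resample subjects with replacement within each group.
        boot_a = rng.choices(group_a, k=len(group_a))
        boot_b = rng.choices(group_b, k=len(group_b))
        t_abs = [
            abs(student_t([s[r] for s in boot_a], [s[r] for s in boot_b]))
            for r in range(n_regions)
        ]
        # Rank 1 = smallest |t|; the highest rank = largest group difference.
        ascending = sorted(range(n_regions), key=lambda r: t_abs[r])
        for rank, region in enumerate(ascending, start=1):
            rank_sums[region] += rank
    mean_rank = [total / n_boot for total in rank_sums]
    return sorted(range(n_regions), key=lambda r: mean_rank[r], reverse=True)[:n_keep]
```

The default arguments mirror the settings described above (20 regions, 100 bootstrap samples).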
To test whether the estimated sequence of regional structural disconnectivity differs from a random ordering, in which each region has a probability of 1/20 of occupying each position, we compared the percentage of bootstraps (out of 100) in which a region was assigned its final position against 1/20 using a one-sample Student’s t-test.
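A small Python sketch of this comparison is given below, using a toy example with three regions and four bootstraps, so that the chance level is 1/3 rather than 1/20. The function names are hypothetical, and only the t statistic and degrees of freedom are computed.

```python
import math
import statistics


def position_agreement(bootstrap_sequences, final_sequence):
    """Fraction of bootstraps in which each region sits at its final position."""
    n_boot = len(bootstrap_sequences)
    return [
        sum(seq[pos] == region for seq in bootstrap_sequences) / n_boot
        for pos, region in enumerate(final_sequence)
    ]


def one_sample_t(values, popmean):
    """One-sample Student's t statistic and its degrees of freedom."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - popmean) / standard_error, n - 1


final = [2, 0, 1]
bootstraps = [[2, 0, 1], [2, 1, 0], [2, 0, 1], [0, 2, 1]]
agreement = position_agreement(bootstraps, final)  # [0.75, 0.5, 0.75]
t_stat, dof = one_sample_t(agreement, 1 / len(final))
print(agreement, t_stat, dof)
```

The p-value then follows from a t distribution with the returned degrees of freedom, for example via `scipy.stats.ttest_1samp`.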