Breakpoint analyses

Eladio J. Márquez
Cheng-han Chung
Radu Marches
Robert J. Rossi
Djamel Nehar-Belaid
Alper Eroglu
David J. Mellert
George A. Kuchel
Jacques Banchereau
Duygu Ucar

We investigated systemic chronological signatures in temporal peaks by testing, in each cluster, for the existence of “breakpoints,” i.e., short age intervals flanked by preceding and following intervals with significantly different accessibility. For each age t in the sampled age interval from tmin to tmax, we tested for a mean difference in accessibility between subjects with ages in the window of span w preceding t (t − w to t) and subjects with ages in the window of span w following it (t to t + w), where w is a variable window span parameter. We then plotted the observed p-values, expressed as −log10 P (loginvp), as a function of age t to identify maxima that suggested the presence of discrete breakpoints. These tests were carried out on normalized and model-adjusted accessibility data for the ATAC-seq peaks associated with each sex-specific and common cluster identified as trending by ARIMA, as described above. Since there were many more peaks than subjects for any given comparison, we used PCA to reduce the dimensionality of each cluster to n = 3 PCs and applied MANOVA to these three dimensions to compute a p-value at each tested value of t. For any given value of w, the offsets for tmin and tmax were adjusted to match the ages of the available samples in the study. For example, a window span of 5 years required tmin = 27 if the youngest available subject was 22 years old, and tmax = 83 if the oldest available subject was 90 years old. For a given value of w, results from tests contrasting younger vs. older intervals vary with sample size and imbalance, with statistical power increasing with the window span. To take advantage of this effect, we deployed a multi-scale algorithm in which we carried out tests using w values ranging from 10 to 20 years in order to identify breakpoints that were maximally supported under multiple window spans.
Because samples are sparse and variable, however, tests carried out under different values of w may be unevenly affected by edge effects and influential outliers. A comparison can appear strongly significant merely because of outliers or because a sampled interval partially overlaps a breakpoint, which limits the ability of the method to locate breakpoints precisely. To limit these effects and increase the robustness of the tests, we smoothed the loginvp distributions by fitting LOESS regressions to each comparison (i.e., each set of tests with the same w value) under a range of smoothing bandwidth parameters (i.e., bw = 0.25, 0.30, …, 0.70, 0.75). We combined the resulting 11 p-values at every sampled age using Fisher’s method, reapplied LOESS smoothing to the resulting distribution, and used numerical differentiation to determine whether each age was predicted to be a minimum or a maximum. We then marked every maximum as a significant breakpoint candidate if it satisfied both a parametric criterion, i.e., significance of the Fisher method-combined p-value (χ² test), and a heuristic criterion, namely that the distance between this local maximum and the nearest minimum equaled or exceeded 25% of the value of the global maximum. The procedure described above results in one smoothed loginvp distribution for each w value, each comprising a series of points including maxima and minima, such that slightly different maxima can be estimated for different w values. Finally, we used Gaussian mixture modeling on the distribution of these maxima, as implemented in the R package mclust, to group loginvp maxima obtained from different window spans into cohesive breakpoint intervals, whose medians and ranges we report herein for each cluster. Since breakpoints are calculated independently for each cluster, observed overlaps are likely the result of aging-related events with a genome-wide impact.
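The combination and screening steps can be illustrated with a short standard-library sketch. It is not the authors' code and rests on several assumptions: the LOESS smoothing steps are omitted, the names `fisher_combine`, `local_extrema`, and `candidate_breakpoints` are hypothetical, the significance threshold `alpha = 0.05` is not stated in the text, and the "distance" between a maximum and the nearest minimum is read here as a difference in loginvp, with "nearest" measured along the age axis.

```python
import math


def fisher_combine(pvalues):
    """Combine k p-values with Fisher's method: X = -2 * sum(ln p), df = 2k."""
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # X / 2
    # For even df = 2k the chi-square survival function has a closed form:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def local_extrema(y):
    """Indices of strict local maxima and minima from first differences."""
    maxima, minima = [], []
    for i in range(1, len(y) - 1):
        left, right = y[i] - y[i - 1], y[i + 1] - y[i]
        if left > 0 and right < 0:
            maxima.append(i)
        elif left < 0 and right > 0:
            minima.append(i)
    return maxima, minima


def candidate_breakpoints(ages, loginvp, alpha=0.05, min_rel_height=0.25):
    """Ages of maxima passing both the significance and the 25% height criterion."""
    maxima, minima = local_extrema(loginvp)
    global_max = max(loginvp)
    threshold = -math.log10(alpha)  # assumed significance cutoff
    selected = []
    for i in maxima:
        if minima:
            nearest = min(minima, key=lambda m: abs(m - i))
            trough = loginvp[nearest]
        else:
            trough = min(loginvp)  # assumption: fall back to the lowest point
        tall_enough = loginvp[i] - trough >= min_rel_height * global_max
        if tall_enough and loginvp[i] >= threshold:
            selected.append(ages[i])
    return selected


if __name__ == "__main__":
    ages = list(range(30, 75, 5))
    curve = [0.5, 1.0, 3.0, 1.2, 0.8, 1.0, 0.9, 2.5, 1.0]  # made-up loginvp values
    print(candidate_breakpoints(ages, curve))
```

In the full procedure, `fisher_combine` would be applied at each sampled age to the 11 p-values obtained under the different bandwidths, and the maxima retained for each w would then be grouped by the Gaussian mixture model.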
