Measures of uncertainty
This protocol is extracted from research article:
Deep neural network processing of DEER data
Sci Adv, Aug 24, 2018; DOI: 10.1126/sciadv.aat5218

When applied correctly, the standard Tikhonov regularized DEER data analysis (12–14) produces clear results and easily interpretable distance distributions. However, when applied naively to corrupted or featureless data sets, it can result in overinterpretation of the data (12, 36, 38). In particular, less experienced practitioners may have difficulty distinguishing genuine distance peaks from artifacts (62). Feedback from the EPR community has led to the concept of a validation tool that would be able to identify corrupted or featureless DEER traces. Such tools exist within the Tikhonov framework (12, 36), although they can be computationally demanding. A similar tool is therefore required for neural networks.

A {“good”, “bad”} classification network would be the obvious solution, but the amount of experimental DEER data in the world is rather small—polling the community for examples of bad DEER traces is unlikely to return a data set of sufficient size. We therefore decided to pursue another common alternative: to train an ensemble of neural networks using different synthetic databases and to use the variation in their outputs as a measure of uncertainty in the distance distribution (63). Such a measure is useful in any case, and a large variation would indicate uninterpretable input data.

To investigate the performance of this approach in estimating distance distribution uncertainties and detecting corrupted data, we trained 100 five-layer networks on different databases (generated as described in the "Training database generation" section) and evaluated their performance against a previously unseen database.

The results are shown in Fig. 7. The relative error metric is the 2-norm of the difference between the output and the true answer, divided by the 2-norm of the true answer. The "worst relative error" refers to the worst-case performance across the entire database. Performance metrics for all networks in the ensemble are plotted as red circles. The networks that scored better than the median on both characteristics are labeled "good" and additionally marked with a dot. The performance of the arithmetic mean of the outputs of the good networks is shown as a blue asterisk. The SD of the outputs across the good-network ensemble is a measure of uncertainty in the output (Fig. 8).
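The relative error metric and the median-based selection of "good" networks can be sketched as follows; the array shapes and function names are illustrative, not taken from the article's code:

```python
import numpy as np

def relative_error(output, truth):
    # 2-norm of (output - truth), divided by the 2-norm of truth
    return np.linalg.norm(output - truth) / np.linalg.norm(truth)

def ensemble_metrics(outputs, truths):
    """outputs: (n_networks, n_traces, n_points); truths: (n_traces, n_points)."""
    errs = np.array([[relative_error(o, t) for o, t in zip(net, truths)]
                     for net in outputs])
    mean_err = errs.mean(axis=1)     # mean relative error per network
    worst_err = errs.max(axis=1)     # worst-case relative error per network
    # "good" networks: better than the median on both characteristics
    good = (mean_err <= np.median(mean_err)) & (worst_err <= np.median(worst_err))
    return mean_err, worst_err, good
```

The averaged output of the networks flagged `good` would then be the blue-asterisk estimate in Fig. 7.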

Each of the networks was started from a different random initial guess and trained on a different randomly generated database. Red dots indicate the "good" networks that are better than the median on both the mean relative error and the worst relative error. The blue asterisk is the performance of the average output of the good networks.

Easy (left), tough (middle), and worst-case (right) agreement on the training set data. The variation in the outputs of different neural networks within the ensemble is a measure of the uncertainty in their output (63) when the training databases are comprehensive.

In practice, the mean output signal and the SD are computed for each point and plotted in the form of 95% confidence bounds, as shown in the figures presented in the next section. A more detailed investigation of the effect of the noise in the input data on the reconstruction quality and the confidence intervals is given in section S5.
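Pointwise, the mean and the 95% bounds across the good-network ensemble could be computed along these lines; the 1.96·SD normal approximation is an assumption on my part, since the article does not specify the exact form of the bounds:

```python
import numpy as np

def confidence_bounds(good_outputs):
    """good_outputs: (n_good_networks, n_points) distance distributions."""
    mean = good_outputs.mean(axis=0)
    sd = good_outputs.std(axis=0, ddof=1)   # sample SD across the ensemble
    # approximate 95% bounds, assuming near-normal scatter across networks
    return mean, mean - 1.96 * sd, mean + 1.96 * sd
```

A wide band, or bounds that disagree wildly between networks, is the ensemble's signal that the input trace is corrupted or featureless.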

An important practical test of correctness, intended to distinguish a neural network that merely fits a few Gaussians to the data set from a network that is a Fredholm solver, would be to present a DEER trace containing four distances to a network that was trained on a database with at most three distances. A network that has learned to be a Fredholm solver in the sense discussed in (51, 52, 54, 55, 57) should still return the right answer. As Fig. 9 illustrates, our networks pass that test.
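To make the test concrete: the DEER trace is the Fredholm image of the distance distribution under the dipolar kernel, so a four-peak test trace can be simulated and presented to the ensemble. A minimal sketch, assuming the standard background-free dipolar kernel with r in nanometers and t in microseconds; the peak positions and widths below are illustrative, not the ones used in Fig. 9:

```python
import numpy as np

def deer_kernel(t, r, n_u=201):
    # K(t, r) = ∫_0^1 cos((3u² − 1) ωt) du, with ω = 2π · 52.04 / r³ rad/µs (r in nm)
    u = np.linspace(0.0, 1.0, n_u)
    w = 2 * np.pi * 52.04 / r**3
    K = np.zeros((t.size, r.size))
    for ui in u:                          # simple quadrature over powder orientations
        K += np.cos((3 * ui**2 - 1) * np.outer(t, w))
    return K / n_u

r = np.linspace(1.5, 8.0, 256)            # distance axis, nm
t = np.linspace(0.0, 3.0, 200)            # time axis, µs
# four Gaussian peaks (illustrative positions and widths)
peaks = [(2.5, 0.10), (3.5, 0.15), (4.5, 0.15), (5.5, 0.20)]
p = sum(np.exp(-0.5 * ((r - m) / s) ** 2) for m, s in peaks)
p /= p.sum() * (r[1] - r[0])              # normalize to unit area
signal = deer_kernel(t, r) @ p * (r[1] - r[0])   # discretized Fredholm integral
```

Feeding `signal` to each network in the ensemble and inspecting whether all outputs recover four peaks reproduces the experiment behind Fig. 9.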

Presenting a data set with four distances to networks trained on the database with at most three distances yields the right answer with high confidence. All networks in the ensemble return four peaks.
