Exploration of parameter combinations

Xusi Han; Genki Terashi; Charles Christoffer; Siyang Chen; Daisuke Kihara

This method is extracted from research article: Nat Commun, Apr 2021

VESPER: global and local cryo-EM map alignment using local density vectors

DOI: 10.1038/s41467-021-22401-y

The voxel spacing and the angle spacing are the two parameters to set when using VESPER. As the default setting in this study, we used a voxel spacing of 7 Å and an angle spacing of 30° for aligning and searching density maps. This setting was chosen from several parameter combinations we examined because it provided a reasonable balance between map search accuracy and computational time. Table 4 and Supplementary Fig. 5 report the computational time and the global map retrieval accuracy for parameter combinations with a voxel spacing of 3, 7, or 10 Å and an angle spacing of 10°, 30°, 60°, or 90°. The results shown are averages over three query maps (EMD-3661, EMD-8724, and EMD-1203), each searched against all 410 maps in the global matching dataset.
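The parameter combinations described above can be enumerated with a short sketch. The spacing values, the default pair, and the query map IDs come from the text; the function and variable names are illustrative assumptions and are not part of VESPER.

```python
from itertools import product

# Values taken from the text. All names here are hypothetical, not VESPER options.
VOXEL_SPACINGS = (3, 7, 10)         # voxel spacing in angstroms
ANGLE_SPACINGS = (10, 30, 60, 90)   # angle spacing in degrees
DEFAULT = (7, 30)                   # default setting used in the study
QUERY_MAPS = ("EMD-3661", "EMD-8724", "EMD-1203")


def parameter_grid():
    """Return every (voxel spacing, angle spacing) pair that was examined."""
    return list(product(VOXEL_SPACINGS, ANGLE_SPACINGS))


def average_over_queries(per_query_values):
    """Average a per-query measurement (e.g. CPU hours) over the three query maps."""
    missing = [q for q in QUERY_MAPS if q not in per_query_values]
    if missing:
        raise ValueError(f"missing measurements for: {missing}")
    return sum(per_query_values[q] for q in QUERY_MAPS) / len(QUERY_MAPS)


if __name__ == "__main__":
    for voxel, angle in parameter_grid():
        tag = " (default)" if (voxel, angle) == DEFAULT else ""
        print(f"voxel spacing {voxel} A, angle spacing {angle} deg{tag}")
```

Each combination would be run once per query map, and the per-query measurements averaged as in `average_over_queries`.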

CPU hours for combinations of voxel and angle spacing settings.

For each combination of voxel spacing and angle spacing, the table reports the average CPU hours VESPER needed for a global map search with three query maps, EMD-3661, EMD-8724, and EMD-1203, against the global map dataset of 410 maps. The upper half of the table shows times for the DOT score, and the lower half shows times for CC. "Comp. Vector" in the DOT score category is the computational time needed to compute the vector representation of the 410 maps. "G. kernel" in the CC category is the time needed to apply the Gaussian kernel (Eq. (2)) to the 410 maps. The values shown are averages for processing the 410 maps and for the 410 comparisons. The times for preparing one file or comparing a pair of maps are shown in parentheses.

gmfit took 72.7 CPU hours to compute the Gaussian mixture models, and the search by gmfit took 0.09 CPU hours. Fitmap took 14.7 CPU hours. 3DZD took 0.2 CPU hours, almost all of which was spent on the 3DZD computation. Times were measured on an Intel Xeon E5 processor (2.60 GHz) with 128 GB memory on the Halstead cluster computer at Purdue.

The computational time increased by a factor of about 6 to over 30 when the voxel spacing was changed from 7 to 3 Å, whereas the decrease was relatively smaller when a larger spacing of 10 Å was used. Using a finer angle spacing of 10° instead of 30° also increased the computational cost by a factor of about 3–20. Comparing the DOT score and CC, CC took about half the time of the DOT score when an angle spacing of 10° was used, but the difference became smaller with less expensive settings.
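A back-of-envelope estimate helps to put these factors in context. The sketch below assumes, as a simplification not stated in the source, that the number of grid points in a fixed volume grows with the inverse cube of the voxel spacing and that the number of sampled rotations grows with the inverse cube of the angle spacing (rotation space being three-dimensional). Measured times will deviate from this because of fixed overheads and implementation details.

```python
def voxel_count_ratio(old_spacing, new_spacing):
    """Ratio of grid points when a fixed volume is resampled at new_spacing.

    Assumes the number of points scales as 1 / spacing**3.
    """
    if old_spacing <= 0 or new_spacing <= 0:
        raise ValueError("spacings must be positive")
    return (old_spacing / new_spacing) ** 3


def rotation_count_ratio(old_angle, new_angle):
    """Ratio of sampled rotations when the angle spacing changes.

    Assumes the number of rotations scales as 1 / angle**3 (a naive estimate).
    """
    if old_angle <= 0 or new_angle <= 0:
        raise ValueError("angle spacings must be positive")
    return (old_angle / new_angle) ** 3


if __name__ == "__main__":
    print("7 A -> 3 A grid points:", round(voxel_count_ratio(7, 3), 1))
    print("7 A -> 10 A grid points:", round(voxel_count_ratio(7, 10), 2))
    print("30 deg -> 10 deg rotations:", round(rotation_count_ratio(30, 10), 1))
```

Under these assumptions, refining the voxel spacing from 7 to 3 Å multiplies the number of grid points by roughly 13, which lies within the reported range of about 6 to over 30, while the naive rotation estimate of 27 for 30° to 10° is above the reported 3–20.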

Turning to the retrieval accuracy (Supplementary Fig. 5), angle spacings of 10° and 30° did not differ substantially, but coarser angle spacings, such as 60° or 90°, drastically reduced the accuracy. The voxel spacing of 7 Å was also practically convenient because it is the grid spacing of the density maps we used. To further speed up a search, in practice one could apply pre-filtering to reduce the number of maps in the database to search against. For example, maps whose volume differs significantly from that of the query may be removed. Some functional classes of maps, e.g. virus entries or ribosome entries, or maps in a certain resolution range could also be removed if the user is not interested in them.
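The pre-filtering idea can be sketched as follows. This is a minimal illustration and not part of VESPER: the `MapEntry` record, the `prefilter` function, the volume-ratio cutoff of 2.0, and the example entries are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapEntry:
    """Hypothetical record describing one database map."""
    map_id: str
    volume: float            # volume enclosed at the contour level (consistent units)
    resolution: float        # resolution in angstroms
    functional_class: str    # e.g. "virus", "ribosome", "other"


def prefilter(entries, query_volume, max_volume_ratio=2.0,
              excluded_classes=frozenset(), resolution_range=None):
    """Drop maps that are unlikely to be of interest before the full search.

    max_volume_ratio is an assumed cutoff; the source gives no specific value.
    """
    if query_volume <= 0:
        raise ValueError("query_volume must be positive")
    kept = []
    for entry in entries:
        if entry.volume <= 0:
            continue  # cannot compare volumes, so skip the entry
        ratio = max(entry.volume, query_volume) / min(entry.volume, query_volume)
        if ratio > max_volume_ratio:
            continue  # volume differs too much from the query
        if entry.functional_class in excluded_classes:
            continue  # user is not interested in this class
        if resolution_range is not None:
            low, high = resolution_range
            if not (low <= entry.resolution <= high):
                continue  # outside the requested resolution range
        kept.append(entry)
    return kept


if __name__ == "__main__":
    # Made-up entries for illustration only.
    database = [
        MapEntry("map-A", 100.0, 4.0, "other"),
        MapEntry("map-B", 900.0, 8.0, "virus"),
        MapEntry("map-C", 150.0, 12.0, "ribosome"),
    ]
    for entry in prefilter(database, query_volume=120.0,
                           excluded_classes={"virus", "ribosome"}):
        print(entry.map_id)
```

Only the maps that pass the filter would then be compared with the query, which reduces the number of pairwise alignments.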

In addition to the voxel and angle spacing, the contour level used for extracting input maps would affect the accuracy. In this work, we used the author-recommended contour level provided in EMDB for each map.
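To illustrate why the contour level matters, the sketch below counts the voxels retained at a given contour level and converts the count to a volume. It assumes that voxels with density at or above the contour level are kept and that the grid is cubic; the function names and the density values are made up for illustration and do not reflect how VESPER reads maps.

```python
def voxels_above_contour(densities, contour_level):
    """Count voxels whose density is at or above the contour level."""
    return sum(1 for value in densities if value >= contour_level)


def enclosed_volume(densities, contour_level, voxel_spacing):
    """Estimate the enclosed volume, assuming cubic voxels of side voxel_spacing."""
    if voxel_spacing <= 0:
        raise ValueError("voxel_spacing must be positive")
    return voxels_above_contour(densities, contour_level) * voxel_spacing ** 3


if __name__ == "__main__":
    # Made-up density values standing in for a flattened map grid.
    densities = [0.1, 0.5, 0.9, 1.2]
    for level in (0.2, 0.5, 1.0):
        kept = voxels_above_contour(densities, level)
        print(f"contour level {level}: {kept} voxels kept")
```

Raising the contour level keeps fewer voxels, so the part of the map that enters the alignment changes with this choice.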

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
