Some biological units in the PDB have considerable internal redundancy (e.g., viral capsids). We thus used the createPDS program from the MASTER package (58) to identify the minimally nonredundant subset of chains within each entry that preserved all unique quaternary interfaces. Briefly, the procedure begins by enumerating all chain “neighborhoods,” i.e., a central chain and all chains in contact with it. Two neighborhoods are considered redundant if: (i) the two central chains are at least 90% identical in sequence and superimpose to within 1.0 Å; (ii) every pair of corresponding contacting chains passes the same filter; and (iii) the entire neighborhoods superimpose to within 2.0-Å rmsd. A chain neighborhood is said to cover all chains and interchain interfaces contained within it and within all neighborhoods redundant to it. Thus, having identified all redundant chain neighborhoods, createPDS proceeds to solve the set cover problem with a greedy algorithm to arrive at a subset of chain neighborhoods that together cover all unique chains and interfaces in the entry. The union of these subsets is then output, significantly reducing structure size for cases with considerable internal redundancy. Any remaining redundancy is removed in subsequent steps described below.
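The greedy set cover step described above can be illustrated with a minimal Python sketch. This is not the createPDS implementation. The function name `greedy_set_cover`, the neighborhood labels, and the toy cover sets are all hypothetical, and each cover set is assumed to already include the chains and interfaces gained through redundant neighborhoods.

```python
from typing import Dict, List, Set


def greedy_set_cover(covers: Dict[str, Set[str]]) -> List[str]:
    """Greedily choose neighborhoods until every element is covered.

    `covers` maps a (hypothetical) neighborhood label to the set of chains
    and interfaces it covers, redundancy already taken into account.
    """
    # Everything that some neighborhood can cover must end up covered.
    uncovered: Set[str] = set().union(*covers.values()) if covers else set()
    chosen: List[str] = []
    while uncovered:
        # Take the neighborhood covering the most still-uncovered elements.
        # Iterating over sorted labels makes ties resolve alphabetically.
        best = max(sorted(covers), key=lambda name: len(covers[name] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


# Toy data (assumed, not taken from any PDB entry): four chains A-D and
# three interfaces, with one neighborhood centered on each chain.
example = {
    "nbhd_A": {"A", "B", "A-B"},
    "nbhd_B": {"A", "B", "C", "A-B", "B-C"},
    "nbhd_C": {"B", "C", "D", "B-C", "C-D"},
    "nbhd_D": {"C", "D", "C-D"},
}

selected = greedy_set_cover(example)
print(selected)  # ['nbhd_B', 'nbhd_C']
```

In this toy case two neighborhoods suffice to cover all seven elements. A greedy choice is not guaranteed to give the smallest possible cover in general, but it is a standard approximation for the set cover problem.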