Composition-transition-distribution (CTD) is a global sequence descriptor that is widely used to characterize protein and peptide sequences. It comprises three descriptor sets: composition (C), transition (T) and distribution (D) [60]. CTD captures how amino acids that share a given structural or physicochemical property are distributed along a protein sequence. For each property, the 20 natural amino acids are classified into three groups. The composition descriptors give the global percentage of residues in the sequence that belong to each group; the transition descriptors give the percent frequency with which a residue of one group is adjacent to a residue of a different group along the whole sequence; and the distribution descriptors describe where along the sequence the residues of each group occur. In CTD, the composition, transition and distribution descriptors are encoded as 21-, 21- and 105-dimensional feature vectors, respectively (147 features in total).
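The three descriptor sets can be illustrated for a single property with a minimal sketch. Several details here are assumptions rather than statements from the text: the three-group hydrophobicity split (polar RKEDQN, neutral GASTPHY, hydrophobic CLVIMFW) is a commonly used grouping; the distribution is taken at the first, 25%, 50%, 75% and last occurrence of each group, which is one common convention; values are returned as fractions rather than percentages; and all function names are hypothetical. Under these assumptions one property yields 3 + 3 + 15 = 21 values, so the 21-, 21- and 105-dimensional vectors quoted above would be consistent with seven properties.

```python
import math
from itertools import combinations

# ASSUMPTION: three-group split for one property (hydrophobicity).
# The text does not say which properties or groupings are used.
GROUPS = {"1": "RKEDQN", "2": "GASTPHY", "3": "CLVIMFW"}
AA_TO_GROUP = {aa: g for g, aas in GROUPS.items() for aa in aas}


def encode(seq):
    """Map each residue to its group label; raises KeyError on non-standard residues."""
    return "".join(AA_TO_GROUP[aa] for aa in seq.upper())


def composition(enc):
    """Fraction of residues in each of the three groups."""
    n = len(enc)
    return [enc.count(g) / n for g in "123"]


def transition(enc):
    """Fraction of adjacent residue pairs that switch between two given groups."""
    pairs = list(zip(enc, enc[1:]))
    total = len(pairs)
    out = []
    for a, b in combinations("123", 2):
        # Count a->b and b->a together, so order along the sequence is ignored.
        count = sum(1 for p in pairs if p in ((a, b), (b, a)))
        out.append(count / total if total else 0.0)
    return out


def distribution(enc):
    """Relative position of the first, 25%, 50%, 75% and last residue of each group."""
    n = len(enc)
    out = []
    for g in "123":
        pos = [i + 1 for i, ch in enumerate(enc) if ch == g]  # 1-based positions
        if not pos:
            out.extend([0.0] * 5)  # group absent from the sequence
            continue
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            # ASSUMPTION: round up to pick the occurrence; other tools may round down.
            k = max(1, math.ceil(frac * len(pos)))
            out.append(pos[k - 1] / n)
    return out


def ctd_single_property(seq):
    """Concatenate C (3), T (3) and D (15) for one property: 21 values."""
    enc = encode(seq)
    return composition(enc) + transition(enc) + distribution(enc)
```

For example, `ctd_single_property("RGCKAL")` first encodes the sequence as `123123` and then returns a 21-element list. Repeating the calculation with a different grouping for each property and concatenating the results would give the full feature vector.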