3.1. Feature Selection

With a set of fundamental frequency perturbation, amplitude variation, and nonlinear signal dynamics features at hand, we sought to select the most representative feature combination for further pattern analysis. In this work, we applied the Parzen-window method to establish the probability density function (PDF) of each feature for the IPD and CO subject groups, respectively.

The Parzen-window method is a nonparametric kernel-based PDF modeling technique that can be used to establish multimodal PDFs [19, 20]. It commonly estimates an unknown PDF by averaging the accumulated nonnegative kernel functions $\kappa(\cdot)$, the centers of which are located at the vocal pattern data points $x_i$, written as

$$\hat{f}(x) = \frac{1}{Nh}\sum_{i=1}^{N}\kappa\!\left(\frac{x - x_i}{h}\right),$$
where N is the number of data points and h represents the kernel bandwidth. In the present study, the Gaussian radial basis function was chosen as the kernel window function. According to Hollander et al. [12], the optimal kernel bandwidth of the Gaussian function is given by

$$h_{\mathrm{opt}} = 1.06\,\mathrm{SD}\cdot N^{-1/5},$$
where SD denotes the standard deviation of the data points.
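
As a concrete illustration, the following Python sketch (our own, not the authors' code; NumPy is assumed, and the function name parzen_pdf is hypothetical) estimates a feature PDF with Gaussian kernels and the bandwidth rule above:

```python
import numpy as np

def parzen_pdf(data, grid, bandwidth=None):
    """Parzen-window PDF estimate with a Gaussian kernel.

    `data` holds one vocal feature's samples for a subject group and
    `grid` is where the density is evaluated; the default bandwidth
    follows h_opt = 1.06 * SD * N^(-1/5).
    """
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = data.size
    if bandwidth is None:
        bandwidth = 1.06 * data.std(ddof=1) * n ** (-0.2)
    # One Gaussian kernel centered at each data point x_i, accumulated
    # and scaled by 1/(N*h) so the estimate integrates to one.
    u = (grid[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return kernels.sum(axis=1) / (n * bandwidth)
```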

Based on the estimated PDFs of each vocal feature for the IPD and CO groups, we analyzed and selected the feature combinations that may carry the most representative discriminant information for pattern classification. We first calculated the modified Kullback-Leibler divergence (MKLD) to compare the feature differences between the IPD and CO subject groups. The MKLD is a revision of the Kullback-Leibler divergence that makes a symmetry adjustment of this relative entropy measure between the probability distributions $\hat{f}_{\mathrm{IPD}}(x)$ and $\hat{f}_{\mathrm{CO}}(x)$ of the two subject groups [16], which can be written as

$$\mathrm{MKLD} = \int \hat{f}_{\mathrm{IPD}}(x)\log\frac{\hat{f}_{\mathrm{IPD}}(x)}{\hat{f}_{\mathrm{CO}}(x)}\,dx + \int \hat{f}_{\mathrm{CO}}(x)\log\frac{\hat{f}_{\mathrm{CO}}(x)}{\hat{f}_{\mathrm{IPD}}(x)}\,dx.$$
If the two probability distributions are similar or identical, the MKLD value is close to zero. On the other hand, the MKLD value becomes large if the two classes are discriminable on the basis of their probability distributions.
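
A minimal sketch of the MKLD computation, assuming both PDFs were evaluated with parzen_pdf on a shared, uniformly spaced grid; the eps floor is our stand-in for the minor numerical adjustment discussed below:

```python
def mkld(f_ipd, f_co, grid, eps=1e-12):
    """Modified KLD: sum of the two directed relative entropies.

    `f_ipd` and `f_co` are PDF values sampled on a shared uniform
    `grid`; `eps` floors the densities to avoid zero denominators.
    """
    p = np.clip(f_ipd, eps, None)
    q = np.clip(f_co, eps, None)
    dx = grid[1] - grid[0]  # uniform grid assumed
    kl_pq = np.sum(p * np.log(p / q)) * dx  # KLD(f_IPD || f_CO)
    kl_qp = np.sum(q * np.log(q / p)) * dx  # KLD(f_CO || f_IPD)
    return float(kl_pq + kl_qp)
```

Note that the eps floor itself illustrates the issue raised in the next paragraph: wherever one density is nearly zero, the clipped ratio dominates the integral and can bias the divergence estimate.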

The MKLD improves on the Kullback-Leibler divergence (KLD) in that it is a symmetric metric that sums the two directed relative entropy values between the feature probability distributions. However, as the MKLD definition shows, the relative entropy terms must avoid zero denominators through minor numerical adjustments, so computing the MKLD feature divergence can sometimes introduce systematic bias. To better represent the probability density differences of the distinct vocal features, in this work we propose an overlapping feature distribution measure to estimate the probabilistic confusion between two classes. The interclass probability risk (ICPR) is computed by integrating the overlapped PDFs as

$$\mathrm{ICPR} = \int \min\!\left[\hat{f}_{\mathrm{IPD}}(x),\,\hat{f}_{\mathrm{CO}}(x)\right]dx.$$
If the feature probability distributions of the two classes overlap entirely, the ICPR value equals 1. When the two classes are completely separable without any PDF overlap, the ICPR value becomes zero. In general, the smaller the ICPR value, the easier the classes are to separate with the given feature distributions. A major advantage of feature selection based on the ICPR measure is that it adapts to both unimodal and multimodal probability densities.
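
The ICPR then reduces to the area under the pointwise minimum of the two estimated densities; a sketch under the same uniform-grid assumption:

```python
def icpr(f_ipd, f_co, grid):
    """Interclass probability risk: overlap area of the two PDFs."""
    dx = grid[1] - grid[0]  # uniform grid assumed
    # The pointwise minimum picks the overlapped portion of the PDFs:
    # identical distributions give 1, disjoint supports give 0.
    return float(np.sum(np.minimum(f_ipd, f_co)) * dx)
```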

In order to determine the best feature combination for further pattern classification, the ICPR and MKLD measures were used as feature selection metrics in our experiment, respectively. If the probability densities of two classes overlap at the random guess level, that is, the overlapped area a equals the sum of the remaining nonoverlapped probability density areas of the two classes, then a = 2(1 - a) and the ICPR value is approximately 0.67. The optimal features selected by the ICPR method are MDVP:F0, Spread1, MDVP-LDA, Shimmer-LDA, and Nonlinear-LDA, each with an ICPR value lower than 0.6; these features could help the classifiers make decisions better than a random guess. Since the MKLD is a symmetric metric formed by the sum of a pair of KLDs, the probability densities of two classes that overlap at the random guess level would produce an MKLD value of 1. In our experiments, the best features selected by the MKLD method are MDVP:F0, MDVP:Flo, MDVP-LDA, Shimmer-LDA, and Nonlinear-LDA, each with an MKLD value larger than 1.
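
Putting the pieces together, a toy end-to-end check with synthetic stand-ins for one vocal feature (the data here are illustrative only; the thresholds are the ones stated above):

```python
rng = np.random.default_rng(0)
grid = np.linspace(-6.0, 6.0, 2001)

# Synthetic one-dimensional feature samples for the two groups.
x_ipd = rng.normal(1.0, 1.2, 120)
x_co = rng.normal(-0.5, 1.0, 120)

f_ipd = parzen_pdf(x_ipd, grid)
f_co = parzen_pdf(x_co, grid)

# Keep the feature if it beats the random-guess level on either metric.
print("ICPR:", icpr(f_ipd, f_co, grid), "(select if < 0.6)")
print("MKLD:", mkld(f_ipd, f_co, grid), "(select if > 1.0)")
```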
