2.2. Feature Extraction Using PPCA

After the preprocessing stage, the SELDI-TOF-MS data set was still high dimensional. Extracting features with dimension reduction techniques not only simplifies the structure of the prediction model but also speeds up training and testing. PCA is a commonly used dimension reduction technique based on the principle of minimizing the reconstruction error: it replaces the massive original data with a small number of principal components. However, PCA lacks a probabilistic model structure and does not account for higher-order statistics. PPCA, proposed by Tipping and Bishop [16], constrains the factor loading matrix of a latent variable model with a noise variance estimated from the principal components discarded by traditional PCA, and then obtains the optimal probability model by estimating the parameters with the expectation-maximization (EM) algorithm. Consequently, PPCA can find the directions of the principal components in high-dimensional data more effectively and can extract features more efficiently.

Suppose that the dimension of an observation data set {Sₙ, n = 1, 2, …, N} is d and the number of samples is N. For one sample, the latent variable model expresses the relationship between the observed data S and the latent variable X as
$$S = WX + \mu + \varepsilon, \tag{1}$$
where W is a d × q factor loading matrix, X is a q-dimensional latent variable, μ = (1/N)∑ₙ₌₁ᴺ Sₙ is the nonzero mean, and ε is the error term. Assuming X ~ N(0, I) and ε ~ N(0, σ²I), the probability distribution of S conditioned on X follows from (1) as
$$p(S \mid X) = N\!\left(S \mid WX + \mu,\ \sigma^{2}I\right). \tag{2}$$
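As a concrete illustration of the generative model in (1) and (2), the short NumPy sketch below draws synthetic observations from a PPCA model; the dimensions, noise level, and random loading matrix are placeholder assumptions for illustration only, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

d, q, N = 100, 5, 200          # hypothetical sizes: d observed features, q latents, N samples
sigma2 = 0.1                   # hypothetical noise variance sigma^2
W = rng.normal(size=(d, q))    # factor loading matrix (random stand-in; normally estimated by EM)
mu = rng.normal(size=d)        # nonzero mean vector

X = rng.normal(size=(N, q))                            # latent variables, X ~ N(0, I)
eps = rng.normal(scale=np.sqrt(sigma2), size=(N, d))   # noise, eps ~ N(0, sigma^2 I)
S = X @ W.T + mu + eps                                 # observations via eq. (1): S = W X + mu + eps
```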
If the prior probability model of X conforms to a Gaussian distribution,
$$p(X) = N(X \mid 0,\ I), \tag{3}$$
then the probability distribution of S can be expressed as
$$p(S) = N(S \mid \mu,\ C), \tag{4}$$
where C = WWᵀ + σ²I is a d × d matrix. By using Bayes' rule, we can derive the posterior probability distribution of X given S:
$$p(X \mid S) = N\!\left(X \mid M^{-1}W^{T}(S - \mu),\ \sigma^{2}M^{-1}\right), \tag{5}$$
where M = WᵀW + σ²I is a q × q matrix. Under this model, the log-likelihood function of S can be expressed as
$$L = -\frac{N}{2}\left\{ d\ln(2\pi) + \ln\lvert C\rvert + \operatorname{tr}\!\left(C^{-1}U\right) \right\}, \tag{6}$$
where U = (1/N)∑ₙ₌₁ᴺ (Sₙ − μ)(Sₙ − μ)ᵀ is the covariance matrix of the observations; the maximum likelihood estimates can then be obtained through the EM algorithm:
$$\tilde{W} = UW\left(\sigma^{2}I + M^{-1}W^{T}UW\right)^{-1}, \tag{7}$$

$$\tilde{\sigma}^{2} = \frac{1}{d}\operatorname{tr}\!\left(U - UWM^{-1}\tilde{W}^{T}\right), \tag{8}$$
where W is the old value of the parameter matrix and W̃ is the revised estimate calculated from (7). Substituting the parameters obtained from (7) and (8) into (1), we derive the latent variable X̃ₙ, which is the dimensionality-reduced form of the observation Sₙ:
$$\tilde{X}_{n} = \left(\tilde{W}^{T}\tilde{W} + \tilde{\sigma}^{2}I\right)^{-1}\tilde{W}^{T}\left(S_{n} - \mu\right), \tag{9}$$
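A minimal sketch of the projection in (9), assuming the loading matrix, noise variance, and mean have already been estimated; the parameter values below are random placeholders standing in for a fitted model.

```python
import numpy as np

def ppca_project(S, W, sigma2, mu):
    """Posterior-mean projection onto the latent space, eq. (9)."""
    q = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(q)                 # q x q matrix M = W^T W + sigma^2 I
    return np.linalg.solve(M, W.T @ (S - mu).T).T    # X_n = M^{-1} W^T (S_n - mu), one row per sample

# Placeholder parameters purely for illustration (in practice they come from the EM fit).
rng = np.random.default_rng(1)
d, q, N = 100, 5, 20
W, mu, sigma2 = rng.normal(size=(d, q)), rng.normal(size=d), 0.1
S = rng.normal(size=(N, d))
X_reduced = ppca_project(S, W, sigma2, mu)           # reduced features, shape (N, q)
```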
From (9), we can reconstruct the observation data S̃ₙ via X̃ₙ:
$$\tilde{S}_{n} = \tilde{W}\tilde{X}_{n} + \mu. \tag{10}$$
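To tie the steps together, the following NumPy sketch iterates the EM updates (7) and (8), evaluates the log-likelihood (6), and then applies the projection (9) and reconstruction (10). It is a minimal illustration under assumed settings (random initialization, a fixed number of iterations, synthetic data shapes), not the exact implementation used in this work.

```python
import numpy as np

def ppca_em(S, q, n_iter=200, seed=0):
    """Fit PPCA by EM; returns loading matrix W, noise variance sigma2, mean mu, and log-likelihood."""
    N, d = S.shape
    mu = S.mean(axis=0)                                # sample mean of the observations
    U = (S - mu).T @ (S - mu) / N                      # covariance matrix U of the observations
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(d, q))                        # random initialization (an assumption)
    sigma2 = 1.0
    for _ in range(n_iter):
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))  # M^{-1}, with M = W^T W + sigma^2 I
        # EM updates, eqs. (7) and (8)
        W_new = U @ W @ np.linalg.inv(sigma2 * np.eye(q) + M_inv @ W.T @ U @ W)
        sigma2 = np.trace(U - U @ W @ M_inv @ W_new.T) / d
        W = W_new
    C = W @ W.T + sigma2 * np.eye(d)                   # C = W W^T + sigma^2 I
    loglik = -N / 2 * (d * np.log(2 * np.pi)           # log-likelihood, eq. (6)
                       + np.linalg.slogdet(C)[1]
                       + np.trace(np.linalg.solve(C, U)))
    return W, sigma2, mu, loglik

# Synthetic stand-in for preprocessed spectra: N = 60 samples, d = 300 m/z features (hypothetical sizes).
rng = np.random.default_rng(2)
S = rng.normal(size=(60, 300))
W, sigma2, mu, loglik = ppca_em(S, q=10)

M = W.T @ W + sigma2 * np.eye(W.shape[1])
X_tilde = np.linalg.solve(M, W.T @ (S - mu).T).T       # eq. (9): reduced features, shape (60, 10)
S_tilde = X_tilde @ W.T + mu                           # eq. (10): reconstructed observations
```

In practice, the fixed iteration count could be replaced by a convergence criterion on the log-likelihood (6).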