After the preprocessing stage, the SELDI-TOF-MS data set remained high-dimensional. Extracting features with dimension reduction techniques not only simplifies the structure of the prediction model but also speeds up training and testing. PCA is a widely used dimension reduction technique based on the minimum-variance principle of reconstruction; it replaces the massive original data with a small number of principal components. However, PCA lacks a probabilistic model structure and does not exploit higher-order statistics. PPCA, proposed by Tipping and Bishop [16], constrains the factor loading matrix of a latent variable model with a noise variance estimated from the principal components discarded by traditional PCA and then obtains the optimal probability model through parameters estimated by the expectation-maximization (EM) algorithm. Consequently, PPCA can find the directions of the principal components in high-dimensional data more effectively and extract salient features more efficiently.
Suppose that the observation data set {S_n, n = 1, 2, …, N} has dimension d and contains N samples. For a single sample, the latent variable model expresses the relationship between the observation data S and the latent variable X as

S = WX + μ + ε, (1)
where W is a d × q factor loading matrix, X is a q-dimensional latent variable, μ = (1/N)∑_{n=1}^{N} S_n is the nonzero mean, and ε is an error term. Assuming X ~ N(0, I) and ε ~ N(0, σ²I), we can obtain the probability distribution of S conditioned on X from (1) as follows:

S | X ~ N(WX + μ, σ²I). (2)
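To make the generative model concrete, the following NumPy sketch draws synthetic observations according to (1) under the Gaussian assumptions above. The dimensions d, q, and N, the noise variance, and the randomly drawn W and μ are illustrative assumptions for this sketch, not values from the SELDI-TOF-MS data set.

```python
import numpy as np

# A minimal sketch of the PPCA generative model S = WX + mu + eps.
# All sizes and parameter values below are illustrative assumptions.
rng = np.random.default_rng(0)
d, q, N = 50, 5, 200                 # observed dim, latent dim, sample count

W = rng.normal(size=(d, q))          # factor loading matrix (d x q)
mu = rng.normal(size=d)              # nonzero mean
sigma2 = 0.1                         # isotropic noise variance sigma^2

X = rng.normal(size=(N, q))          # latent variables, X ~ N(0, I)
eps = rng.normal(scale=np.sqrt(sigma2), size=(N, d))  # eps ~ N(0, sigma^2 I)
S = X @ W.T + mu + eps               # observations stacked as rows of S
```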
If the prior probability model of X conforms to the Gaussian distribution

X ~ N(0, I), (3)
then the probability distribution of S can be expressed as

S ~ N(μ, C), (4)
where C = WW^T + σ²I is a d × d matrix. By using Bayes' rule, we can derive the posterior probability distribution of X given S:

X | S ~ N(M⁻¹W^T(S − μ), σ²M⁻¹), (5)
where M = W^TW + σ²I is a q × q matrix. Under this model, the log-likelihood function of S can be expressed as

L = −(N/2){d ln(2π) + ln|C| + tr(C⁻¹U)}, (6)
where U = (1/N)∑_{n=1}^{N}(S_n − μ)(S_n − μ)^T is the covariance matrix of the observations. We can then obtain the maximum likelihood estimates through the EM algorithm:

W̃ = UW(σ²I + M⁻¹W^T UW)⁻¹, (7)

σ̃² = (1/d) tr(U − UWM⁻¹W̃^T), (8)
where W is the old value of the parameter matrix and W̃ is the revised estimate calculated from (7); a NumPy sketch of this iteration follows below. Substituting the parameters obtained from (7) and (8) into the posterior (5), we derive the latent variable X_n, which is the dimensionality-reduced form of the observation S_n:

X_n = M⁻¹W̃^T(S_n − μ). (9)
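As a concrete illustration, the NumPy sketch below iterates the updates (7) and (8), with the log-likelihood (6) available to monitor convergence. The random initialization, fixed iteration count, and function names are assumptions made for this sketch, not the authors' implementation.

```python
import numpy as np

def ppca_em(S, q, n_iter=100, seed=0):
    """Fit PPCA by iterating the EM updates (7) and (8). A sketch:
    the random initialization and fixed iteration count are arbitrary."""
    N, d = S.shape
    rng = np.random.default_rng(seed)
    mu = S.mean(axis=0)
    Sc = S - mu
    U = (Sc.T @ Sc) / N                      # covariance of the observations
    W = rng.normal(size=(d, q))              # initial loading matrix
    sigma2 = 1.0                             # initial noise variance

    for _ in range(n_iter):
        M = W.T @ W + sigma2 * np.eye(q)     # q x q matrix from (5)
        Minv = np.linalg.inv(M)
        UW = U @ W
        # Eq. (7): revised loading matrix
        W_new = UW @ np.linalg.inv(sigma2 * np.eye(q) + Minv @ W.T @ UW)
        # Eq. (8): revised noise variance
        sigma2 = np.trace(U - UW @ Minv @ W_new.T) / d
        W = W_new
    return W, sigma2, mu

def ppca_loglik(S, W, sigma2, mu):
    """Eq. (6): log-likelihood of the observations under the current model."""
    N, d = S.shape
    Sc = S - mu
    U = (Sc.T @ Sc) / N
    C = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * N * (d * np.log(2 * np.pi) + logdet
                       + np.trace(np.linalg.solve(C, U)))
```

In practice, ppca_loglik can be evaluated after each update to verify that the likelihood increases monotonically, as EM guarantees.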
From (9), we can reconstruct the observation data via S̃_n:

S̃_n = W̃X_n + μ. (10)
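Continuing the sketch, the projection (9) and reconstruction (10) can be written as follows, reusing the synthetic S and the illustrative ppca_em routine from the sketches above.

```python
# Feature extraction (9) and reconstruction (10); `ppca_em` and `S` come
# from the illustrative sketches above.
W, sigma2, mu = ppca_em(S, q=5)

M = W.T @ W + sigma2 * np.eye(W.shape[1])
Minv = np.linalg.inv(M)                  # M is symmetric, so Minv is too
X_hat = (S - mu) @ W @ Minv              # Eq. (9): one latent row per sample
S_hat = X_hat @ W.T + mu                 # Eq. (10): reconstructed observations

print("reduced shape:", X_hat.shape)                    # (N, q)
print("mean squared error:", np.mean((S - S_hat) ** 2))
```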