Stepwise iterative maximum likelihood method

AS Alok Sharma
DS Daichi Shigemizu
KB Keith A. Boroevich
YL Yosvany López
YK Yoichiro Kamatani
MK Michiaki Kubo
TT Tatsuhiko Tsunoda
request Request a Protocol
ask Ask a question
Favorite

In this section, we describe our proposed method. This method seeks the most optimal partitions in an iterative way. We begin with an initial partition of data and shift a sample from one partition to another partition, and test if such a shift improves the overall log-likelihood. A simple illustration of SIML is given in Fig. 1.

An illustration of stepwise iterative maximum likelihood method using a c = 2 cluster case. In this illustration, two clusters An external file that holds a picture, illustration, etc.
Object name is 12859_2016_1184_Figc_HTML.gif and An external file that holds a picture, illustration, etc.
Object name is 12859_2016_1184_Figd_HTML.gif are given with likelihood functions L1 and L2, respectively. The center of clusters are depicted by μ 1 and μ 2 (shown as ‘+’ inside two clusters). Initial total likelihood is Lold which is the sum of two likelihood functions (L1 + L2). A sample x ∈  An external file that holds a picture, illustration, etc.
Object name is 12859_2016_1184_Fige_HTML.gif is checked for grouping. It is advantageous to shift sample x to cluster An external file that holds a picture, illustration, etc.
Object name is 12859_2016_1184_Figf_HTML.gif only if the new likelihood (Lnew = L 1* + L 2*) is higher than the old likelihood; i.e., L new > L old

If we define class-based log-likelihood of two clusters χi and χj as

and

then we would be interested in knowing how the class-based log-likelihood functions (referred as log-likelihood function hereafter) change if a sample is shifted from χi to χj. In order to know this, let us define mean and covariance of χi and χj as μi and μj, and, as Σi and Σj, respectively. The following equations describe mean and covariance:

and

ni and nj are number of samples in χi and χj, respectively. If the component density is normal and let P(ωi) = ni/n (where n is the total number of samples) then Eqs. 9 and 10 can be written as

where tr() is a trace function. Since trΣi1xχixμixμiT=trniId×d=nid we can write Li as

Similarly, we can write Lj as

and the total log-likelihood for c clusters can be written as

where Lk is from Eq. 15 or 16.

If a sample x^χi is shifted to χj, then the mean and covariance will change as follows (from Eqs. 11, 12, 13 and 14):

where μi*, μj*, Σi* and Σj* are means and covariances after the shift.

In order to find the change in log-likelihood functions Li and Lj, we first introduce the following Lemma.

Lemma 1 If a sample x^χi is shifted to cluster χj and the changed covariance of χj is defined as Σj*=njnj+1Σj+njnj+12x^μjx^μjT then the determinant of Σj* can be given as |Σj*|=njnj+1d|Σj|1+1nj+1x^μjTΣj1x^μj.

Proof By taking determinant of Σj*, we get

since for m × m square matrices |AB| = |A||B| and for a scalar c, |cA| = cm|A|. We can write Eq. L1 as

From Sylvester’s determinant theorem, rectangular matrices A ∈ ℝm × n and B ∈ ℝn × m in |Im × m + AB| is equal to |In × n + BA|. Therefore, we can write

since |c| = c.

Substituting right hand side of Eq. L3 in Eq. L2 proves the Lemma.

As similar to Lemma 1, the determinant of the change in covariance of χi can be written as

We can now observe the change in Lj (Eq. 16) due to the shift of a sample x^ from χi to χj as

From Lemma 1 and Eq. 16, we can rewrite Eq. 23 after doing algebraic manipulation as

where ΔLj is given by

and constant C is given by

In a similar manner, change in Li can be obtained by using Eqs. 15 and 22 as

where ΔLi is given by

and C is same as of Eq. 26.

By adding Eqs. 24 and 27, we can get the change in total log-likelihood (Ltot*) since there is no change in any other clusters apart from χi to χj; i.e., from Eqs. 17, 24 and 27 we have

where ΔLtot = ΔLj − ΔLi. Therefore, the shift of a sample x^ is advantageous if ΔLtot > 0. This will give the following algorithm (Table 1):

Stepwise iterative maximum likelihood method procedure

The following sections discuss the characteristic of the SIML method.

Do you have any questions about this protocol?

Post your question to gather feedback from the community. We will also invite the authors of this article to respond.

0/150

tip Tips for asking effective questions

+ Description

Write a detailed description. Include all information that will help others answer your question including experimental processes, conditions, and relevant images.

post Post a Question
0 Q&A