In this section, we describe our proposed method. This method seeks the most optimal partitions in an iterative way. We begin with an initial partition of data and shift a sample from one partition to another partition, and test if such a shift improves the overall log-likelihood. A simple illustration of SIML is given in Fig. 1.
An illustration of stepwise iterative maximum likelihood method using a c = 2 cluster case. In this illustration, two clusters and
are given with likelihood functions L1 and L2, respectively. The center of clusters are depicted by μ
1 and μ
2 (shown as ‘+’ inside two clusters). Initial total likelihood is Lold which is the sum of two likelihood functions (L1 + L2). A sample x ∈
is checked for grouping. It is advantageous to shift sample x to cluster
only if the new likelihood (Lnew = L
1* + L
2*) is higher than the old likelihood; i.e., L
new > L
old
If we define class-based log-likelihood of two clusters χi and χj as
and
then we would be interested in knowing how the class-based log-likelihood functions (referred as log-likelihood function hereafter) change if a sample is shifted from χi to χj. In order to know this, let us define mean and covariance of χi and χj as μi and μj, and, as Σi and Σj, respectively. The following equations describe mean and covariance:
and
ni and nj are number of samples in χi and χj, respectively. If the component density is normal and let P(ωi) = ni/n (where n is the total number of samples) then Eqs. 9 and 10 can be written as
where tr() is a trace function. Since we can write Li as
Similarly, we can write Lj as
and the total log-likelihood for c clusters can be written as
where Lk is from Eq. 15 or 16.
If a sample is shifted to χj, then the mean and covariance will change as follows (from Eqs. 11, 12, 13 and 14):
where μi*, μj*, Σi* and Σj* are means and covariances after the shift.
In order to find the change in log-likelihood functions Li and Lj, we first introduce the following Lemma.
Lemma 1 If a sample is shifted to cluster χj and the changed covariance of χj is defined as then the determinant of Σj* can be given as .
Proof By taking determinant of Σj*, we get
since for m × m square matrices |AB| = |A||B| and for a scalar c, |cA| = cm|A|. We can write Eq. L1 as
From Sylvester’s determinant theorem, rectangular matrices A ∈ ℝm × n and B ∈ ℝn × m in |Im × m + AB| is equal to |In × n + BA|. Therefore, we can write
since |c| = c.
Substituting right hand side of Eq. L3 in Eq. L2 proves the Lemma.
As similar to Lemma 1, the determinant of the change in covariance of χi can be written as
We can now observe the change in Lj (Eq. 16) due to the shift of a sample from χi to χj as
From Lemma 1 and Eq. 16, we can rewrite Eq. 23 after doing algebraic manipulation as
where ΔLj is given by
and constant C is given by
In a similar manner, change in Li can be obtained by using Eqs. 15 and 22 as
where ΔLi is given by
and C is same as of Eq. 26.
By adding Eqs. 24 and 27, we can get the change in total log-likelihood (Ltot*) since there is no change in any other clusters apart from χi to χj; i.e., from Eqs. 17, 24 and 27 we have
where ΔLtot = ΔLj − ΔLi. Therefore, the shift of a sample is advantageous if ΔLtot > 0. This will give the following algorithm (Table 1):
Stepwise iterative maximum likelihood method procedure
The following sections discuss the characteristic of the SIML method.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.
Tips for asking effective questions
+ Description
Write a detailed description. Include all information that will help others answer your question including experimental processes, conditions, and relevant images.