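For each gene, it is assumed that the density f(x) of the expression value can be described by a normal-mixture model with two components, i.e.,
$$f(x) = p\,\phi(x; \mu_A, \sigma) + (1 - p)\,\phi(x; \mu_B, \sigma),$$
where φ(·; μ, σ) denotes the normal density with mean μ and standard deviation σ, μA and μB denote the means in the two subgroups, σ is the standard deviation common to both subgroups, and p is the proportion of samples in one group (Wang et al., 2009). The BI is defined as
$$\mathrm{BI} = \sqrt{p(1 - p)}\,\frac{|\mu_A - \mu_B|}{\sigma}.$$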
The parameters of the mixture model, and from them the BI, were estimated with the expectation-maximization (EM) algorithm using the R package mixtools (Benaglia et al., 2009). The EM algorithm was run from ten different starting values, generated from a grid of 10 values for the fraction parameter p, evenly spaced between 0 and 1; for more details, see Karlis and Xekalaki (2003). Genes with high BI were selected for analysis.
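The estimation procedure can be illustrated with a minimal sketch in Python. It is not the mixtools implementation used in the study: it is a hand-written EM for a two-component normal mixture with a shared standard deviation, following the BI definition of Wang et al. (2009). Several details are assumptions made only for illustration: the function names (`fit_em`, `bimodality_index`), the interior grid of starting fractions (1/11, ..., 10/11, chosen so that the degenerate values p = 0 and p = 1 are excluded), the initialisation of the means by splitting the sorted data at the starting fraction, the convergence settings, and the simulated data.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(xs, p, mu_a, mu_b, sigma):
    total = 0.0
    for x in xs:
        dens = p * normal_pdf(x, mu_a, sigma) + (1.0 - p) * normal_pdf(x, mu_b, sigma)
        total += math.log(max(dens, 1e-300))  # floor guards against underflow
    return total


def fit_em(xs, p0, max_iter=200, tol=1e-6):
    """EM for p*N(mu_a, sigma^2) + (1-p)*N(mu_b, sigma^2) with a shared sigma."""
    n = len(xs)
    # Assumed initialisation: split the sorted values at the starting fraction p0.
    ordered = sorted(xs)
    k = min(max(int(round(p0 * n)), 1), n - 1)
    p = p0
    mu_a = statistics.fmean(ordered[:k])
    mu_b = statistics.fmean(ordered[k:])
    sigma = max(statistics.pstdev(xs), 1e-12)
    ll = log_likelihood(xs, p, mu_a, mu_b, sigma)
    for _ in range(max_iter):
        # E-step: posterior probability that each value belongs to component A.
        resp = []
        for x in xs:
            a = p * normal_pdf(x, mu_a, sigma)
            b = (1.0 - p) * normal_pdf(x, mu_b, sigma)
            resp.append(a / (a + b) if a + b > 0.0 else 0.5)
        # M-step: update the fraction, both means, and the shared sigma.
        w = min(max(sum(resp), 1e-12), n - 1e-12)
        p = w / n
        mu_a = sum(r * x for r, x in zip(resp, xs)) / w
        mu_b = sum((1.0 - r) * x for r, x in zip(resp, xs)) / (n - w)
        var = sum(
            r * (x - mu_a) ** 2 + (1.0 - r) * (x - mu_b) ** 2
            for r, x in zip(resp, xs)
        ) / n
        sigma = max(math.sqrt(var), 1e-12)
        new_ll = log_likelihood(xs, p, mu_a, mu_b, sigma)
        converged = abs(new_ll - ll) < tol
        ll = new_ll
        if converged:
            break
    return {"p": p, "mu_a": mu_a, "mu_b": mu_b, "sigma": sigma, "loglik": ll}


def bimodality_index(xs, n_starts=10):
    """Run EM from several starting fractions and compute BI from the best fit."""
    # Assumed grid: n_starts interior points, evenly spaced, excluding 0 and 1.
    starts = [(i + 1) / (n_starts + 1) for i in range(n_starts)]
    best = max((fit_em(xs, p0) for p0 in starts), key=lambda f: f["loglik"])
    delta = abs(best["mu_a"] - best["mu_b"]) / best["sigma"]
    return math.sqrt(best["p"] * (1.0 - best["p"])) * delta, best


# Simulated expression values (illustrative only, not data from the study).
rng = random.Random(0)
bimodal = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(4.0, 1.0) for _ in range(100)]
unimodal = [rng.gauss(0.0, 1.0) for _ in range(200)]

bi_bimodal, fit = bimodality_index(bimodal)
bi_unimodal, _ = bimodality_index(unimodal)
print(f"BI (two subgroups): {bi_bimodal:.2f}, BI (one group): {bi_unimodal:.2f}")
```

Keeping the fit with the highest log-likelihood across the starting values reduces the risk that EM settles in a poor local optimum, which is the usual motivation for multiple starts. In this simulation the gene with two well-separated subgroups should receive a clearly higher BI than the gene drawn from a single normal distribution.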