2.2. MLE via stochastic EM

Firouzeh Noghrehchi; Jakub Stoklosa; Spiridon Penev; David I. Warton


This method is extracted from research article: Stat Med, Feb 2021

Selecting the model for multiple imputation of missing data: Just use an IC!

DOI: 10.1002/sim.8915


The EM algorithm [9] was designed to find maximum likelihood estimates (MLEs) of the parameters of a parametric model iteratively when the observed data are incomplete. The procedure makes use of Fisher's identity, by which maximization of the unknown observed‐data log‐likelihood is replaced with maximization of the conditional expectation of an associated complete‐data log‐likelihood:

$$\nabla_{θ} \log p (y, r | θ) = E \left[ \nabla_{θ} \log p (y, z, r | θ) \,\middle|\, y, r, θ \right].$$
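
Fisher's identity can be checked in a few lines. The sketch below assumes that $y$ denotes the observed data, $r$ the missingness indicators, and $z$ the missing data (matching the imputation model $p (z | y, r, θ)$ used in this section), and that differentiation and integration can be interchanged:

$$
\nabla_{θ} \log p (y, r | θ)
= \frac{\int \nabla_{θ}\, p (y, z, r | θ)\, dz}{p (y, r | θ)}
= \int \nabla_{θ} \log p (y, z, r | θ)\, \frac{p (y, z, r | θ)}{p (y, r | θ)}\, dz
= E \left[ \nabla_{θ} \log p (y, z, r | θ) \,\middle|\, y, r, θ \right].
$$

Under these assumptions, the gradient of the expected complete‐data log‐likelihood, evaluated at the parameter value used to take the expectation, equals the observed‐data score, so a fixed point of the EM iteration is a stationary point of the observed‐data log‐likelihood.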

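An EM iteration $θ^{(t)} \to θ^{(t + 1)}$ consists of two steps. The E‐step computes the conditional expectation of the complete‐data log‐likelihood given the observed data (with respect to the imputation model at the current parameter estimate),

$$Q (θ | θ^{(t)}) = E \left[ \log p (y, z, r | θ) \,\middle|\, y, r, θ^{(t)} \right] = \int \log p (y, z, r | θ)\, p (z | y, r, θ^{(t)})\, dz.$$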

The M‐step updates the parameter estimates by maximizing the expectation computed in the E‐step,

$$θ^{(t + 1)} = \arg\max_{θ}\, Q (θ | θ^{(t)}).$$
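
As an illustration of the E‐step and M‐step, here is a minimal sketch for a toy model that is not taken from the article: a univariate normal sample $N(μ, σ^2)$ in which some values are missing completely at random, so the imputation model is simply $N(μ^{(t)}, σ^{2(t)})$. The function name `em_normal_missing` and its arguments are hypothetical.

```python
import numpy as np


def em_normal_missing(y_obs, n_missing, n_iter=200, mu0=0.0, sigma2_0=1.0):
    """EM for N(mu, sigma2) when n_missing values are missing completely at random.

    Toy model assumed for illustration; not the model used in the article.
    """
    y_obs = np.asarray(y_obs, dtype=float)
    n = y_obs.size + n_missing              # size of the complete sample
    s1, s2 = y_obs.sum(), np.sum(y_obs ** 2)  # observed sufficient statistics
    mu, sigma2 = mu0, sigma2_0
    for _ in range(n_iter):
        # E-step: expected sufficient statistics of the missing values
        # under the imputation model N(mu, sigma2) at the current estimate.
        ez = n_missing * mu
        ez2 = n_missing * (mu ** 2 + sigma2)
        # M-step: complete-data normal MLE with the expected statistics plugged in.
        mu = (s1 + ez) / n
        sigma2 = (s2 + ez2) / n - mu ** 2
    return mu, sigma2


if __name__ == "__main__":
    print(em_normal_missing([1.0, 2.0, 3.0, 4.0, 5.0], n_missing=3))
```

With observed values 1 to 5 and three missing values, the iterates approach a mean of 3.0 and a variance of 2.0, which are the mean and divide‐by‐$n$ variance of the observed values. In this toy setting that is the observed‐data MLE, so the example only shows the mechanics of the two steps.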

In situations where $Q (θ | θ^{(t)})$ is either analytically intractable [36] or computationally intensive [37], its analytical computation can be replaced by a suitable approximation, commonly obtained via simulation methods such as MCMC. In this paradigm, stochastic versions of the EM algorithm were designed to compute $Q (θ | θ^{(t)})$ numerically by Monte Carlo approximation. The E‐step of the EM algorithm then reduces to computing the imputation model, $p (z | y, r, θ^{(t)})$, and simulating the missing data $z^{(t)}$. In other words, the E‐step turns into an imputation step (I‐step) where

$$z^{(t, k)} \sim p (z | y, r, θ^{(t)}), \quad k = 1, \dots, K,$$

to approximate $Q (θ | θ^{(t)})$ as a Monte Carlo average,

$$Q (θ | θ^{(t)}) \approx \frac{1}{K} \sum_{k = 1}^{K} \log p (y, z^{(t, k)}, r | θ).$$
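
The I‐step and the Monte Carlo average can be sketched for a toy model that is assumed here purely for illustration: a univariate normal sample with values missing completely at random, for which the imputation model is the current normal fit. The function name `mcem_normal_missing` and the parameter `n_draws` (the number of imputations per iteration) are hypothetical.

```python
import numpy as np


def mcem_normal_missing(y_obs, n_missing, n_draws=200, n_iter=50, seed=0):
    """Monte Carlo EM for N(mu, sigma2) with n_missing values missing at random.

    Toy model assumed for illustration; not the model used in the article.
    """
    rng = np.random.default_rng(seed)
    y_obs = np.asarray(y_obs, dtype=float)
    n = y_obs.size + n_missing
    s1, s2 = y_obs.sum(), np.sum(y_obs ** 2)
    mu, sigma2 = 0.0, 1.0
    for _ in range(n_iter):
        # I-step: draw n_draws imputations of the missing values
        # from the imputation model at the current estimate.
        z = rng.normal(mu, np.sqrt(sigma2), size=(n_draws, n_missing))
        # Monte Carlo averages replace the exact conditional expectations.
        ez = z.sum(axis=1).mean()
        ez2 = (z ** 2).sum(axis=1).mean()
        # M-step: maximize the averaged complete-data log-likelihood,
        # which for the normal model has a closed form.
        mu = (s1 + ez) / n
        sigma2 = (s2 + ez2) / n - mu ** 2
    return mu, sigma2


if __name__ == "__main__":
    y = np.random.default_rng(1).normal(2.0, 1.5, size=140)
    print(mcem_normal_missing(y, n_missing=60))
```

Because the expectation is simulated, the returned estimates fluctuate around the observed‐data MLE from one run to the next; increasing `n_draws` reduces that fluctuation at a higher computational cost.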

A special case of these stochastic EM algorithms is the stochastic EM (StEM) [11, 12, 13], summarized in Box 1 (right). StEM does not attempt to approximate $Q (θ | θ^{(t)})$; instead, it imputes the missing data only once in each I‐step and iterates until the sequence of estimates reaches its stationary distribution, whose mean is close to the MLE [33]. The random sequence $\{θ^{(t)}\}$ generated by StEM does not converge pointwise to the MLE but, under mild conditions, does converge in distribution [32]. After the algorithm converges, multiple imputations are generated by running $M$ extra iterations to sample from the stationary distribution, and the sample mean of the resulting estimates gives an approximate MLE under the observed‐data likelihood.
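
A minimal sketch of the StEM loop follows, again for an assumed toy model (a univariate normal sample with values missing completely at random) rather than the models of the article. The names `stem_normal_missing`, `burn_in`, and `n_extra` are hypothetical; `n_extra` plays the role of the $M$ extra iterations, and the fixed burn‐in length is a simplification standing in for a proper convergence check.

```python
import numpy as np


def stem_normal_missing(y_obs, n_missing, burn_in=200, n_extra=1000, seed=0):
    """Stochastic EM for N(mu, sigma2) with n_missing values missing at random.

    Toy model assumed for illustration; not the model used in the article.
    Returns the averaged estimate and the post-burn-in parameter draws.
    """
    rng = np.random.default_rng(seed)
    y_obs = np.asarray(y_obs, dtype=float)
    mu, sigma2 = 0.0, 1.0
    draws = []
    for t in range(burn_in + n_extra):
        # I-step: impute the missing values once from the current fit.
        z = rng.normal(mu, np.sqrt(sigma2), size=n_missing)
        # M-step: complete-data MLE computed on the single completed data set.
        completed = np.concatenate([y_obs, z])
        mu, sigma2 = completed.mean(), completed.var()
        if t >= burn_in:
            draws.append((mu, sigma2))
    draws = np.array(draws)
    # The mean over the extra iterations approximates the MLE.
    return draws.mean(axis=0), draws


if __name__ == "__main__":
    y = np.random.default_rng(1).normal(2.0, 1.5, size=140)
    estimate, draws = stem_normal_missing(y, n_missing=60)
    print(estimate, draws.std(axis=0))
```

The individual draws keep fluctuating after burn‐in, which reflects convergence in distribution rather than pointwise convergence; only their average is used as the point estimate.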

The StEM estimator has been shown to be an asymptotically normal, unbiased, and consistent estimator of $θ$ for models from the exponential family [32]. Asymptotic properties of the StEM estimator are given in Wang and Robins [15] and Nielsen [38].

This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
