Let $I_N = \{I_n \mid n = 1, \dots, N\}$ denote a group of grayscale images, where $I_n$ is the $n$th image in the group. The proposed method applies to both 2D and 3D images, but throughout the rest of the paper we assume that each $I_n$ is a 3D image representing one phase of a 4D-CT dataset. The objective of GroupRegNet is to find a set of dense transformations that map the same anatomical locations between any two images in the group.
The optimization problem to be solved by GroupRegNet is formulated as:

$$\hat{T}_N = \arg\min_{T_N} \; L_{simi}(I_N \circ T_N, \bar{I}) + \lambda_0 L_{smo}(T_N) + \lambda_1 L_{cyc}(T_N)$$

where $L_{simi}$, $L_{smo}$, and $L_{cyc}$ are the similarity, smoothness, and cyclic regularization losses; $T_N = \{T_n \mid n = 1, \dots, N\}$ is a set of transformations that map anatomical locations in the template to the corresponding locations in the input images; $I_n \circ T_n$ and $I_N \circ T_N$ represent the warped $n$th input image and all warped input images, respectively; $\bar{I} = \frac{1}{N} \sum_{n=1}^{N} I_n \circ T_n$ is the implicit template obtained by averaging the warped input images (Vandemeulebroucke et al 2011); and $\lambda_0$ and $\lambda_1$ are the weights for the smoothness and cyclic regularization, respectively. The cyclic regularization term is included only when the relative motion in the image group is periodic or symmetric. The objective of the iterative optimization is therefore to find the optimal transformations that align every image in the group to a template image while keeping the deformation fields smooth and cyclically consistent. The inverse transformation $T_n^{-1}$, which maps anatomical locations in the $n$th input image to the implicit template, is determined with a fixed-point method (Chen et al 2008). The transformation that maps locations in the $n$th image to the $m$th image can then be calculated by composing the deformation fields: $T_{n \to m}(x) = T_m(T_n^{-1}(x))$.
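The fixed-point inversion and the composition of deformation fields described above can be illustrated in one dimension. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: displacements are in voxel units on a regular grid, the inverse displacement v is obtained with the iteration v ← −u(x + v), sampling uses linear interpolation, and the function names `invert_displacement_1d` and `compose_1d` are hypothetical.

```python
import numpy as np

def invert_displacement_1d(u, n_iter=50):
    """Approximate the inverse displacement v of u by iterating v <- -u(x + v)."""
    x = np.arange(u.size, dtype=float)
    v = np.zeros(u.size, dtype=float)
    for _ in range(n_iter):
        # Sample u at the displaced positions x + v (linear interpolation,
        # values outside the grid are clamped to the end points).
        v = -np.interp(x + v, x, u)
    return v

def compose_1d(first, second):
    """Displacement of the map x -> second(first(x)), with T(x) = x + u(x)."""
    x = np.arange(first.size, dtype=float)
    return first + np.interp(x + first, x, second)

# Toy template-to-image displacements for two images n and m (assumed data).
x = np.arange(64, dtype=float)
u_n = 2.0 * np.sin(2 * np.pi * x / 64)
u_m = -1.0 * np.sin(2 * np.pi * x / 64)

v_n = invert_displacement_1d(u_n)   # image n -> template
u_n_to_m = compose_1d(v_n, u_m)     # image n -> template -> image m
u_n_to_n = compose_1d(v_n, u_n)     # sanity check: should be close to zero
```

The iteration converges here because the toy displacement varies slowly (its gradient magnitude is well below 1); strongly folding fields are outside the scope of this sketch.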
Figure 1 illustrates the components and data flow of GroupRegNet. Compared with the common structure of the learning-based method VoxelMorph (Balakrishnan et al 2019), GroupRegNet uses similar components, including a CNN (explained in the following subsections), a spatial transformer (implemented as 3D linear interpolation), a similarity loss, a cyclic loss, and a smoothness loss. The input images are processed by the CNN to directly estimate the displacement fields. Existing methods in the literature explicitly select a reference image and a moving image to form a pair and then warp the moving image to the reference image. By contrast, in GroupRegNet the input images in the group are first stacked along the channel dimension before being fed into the neural network, and the computed transformations warp each input image into the common space of the template image. Note that the CNN outputs the displacement field $u_n$ rather than the transformation $T_n$; the two are related through $T_n(x) = x + u_n(x)$. The components in this flowchart are elaborated in the next subsections.
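The data flow described above (stack the group along the channel axis, turn a displacement into a transformation, warp each image by linear interpolation, and average the warped images into a template) can be sketched as follows. This is an illustrative NumPy sketch rather than the GroupRegNet code: the CNN is replaced by a hand-made displacement array, the displacement layout `(N, 3, D, H, W)` in voxel units, the batch size of 1, the border clamping, and the helper name `warp_linear` are all assumptions.

```python
import itertools
import numpy as np

def warp_linear(image, disp):
    """Warp a (D, H, W) image with a (3, D, H, W) displacement via trilinear interpolation."""
    shape = np.array(image.shape).reshape(3, 1, 1, 1)
    grid = np.stack(np.meshgrid(*[np.arange(s) for s in image.shape],
                                indexing="ij")).astype(float)
    coords = grid + disp                        # transformation = identity + displacement
    coords = np.clip(coords, 0, shape - 1)      # clamp sample points to the image border
    lo = np.floor(coords).astype(int)
    hi = np.minimum(lo + 1, shape - 1)
    w = coords - lo                             # fractional offsets along each axis
    out = np.zeros(image.shape, dtype=float)
    for corner in itertools.product((0, 1), repeat=3):
        idx = [hi[a] if c else lo[a] for a, c in enumerate(corner)]
        weight = np.prod([w[a] if c else 1.0 - w[a]
                          for a, c in enumerate(corner)], axis=0)
        out += weight * image[idx[0], idx[1], idx[2]]
    return out

rng = np.random.default_rng(0)
N, D, H, W = 4, 6, 8, 8
images = rng.random((N, D, H, W))

# Group stacked along the channel axis, with an assumed batch size of 1.
net_input = images[np.newaxis]                  # shape (1, N, D, H, W)

# Stand-in for the CNN output: shift every image by one voxel along W.
disp = np.zeros((N, 3, D, H, W))
disp[:, 2] = 1.0

warped = np.stack([warp_linear(images[n], disp[n]) for n in range(N)])
template = warped.mean(axis=0)                  # implicit template: mean of warped images
```

In a training setting the warping step would need to be differentiable, which is why frameworks typically provide it as a built-in sampling operation; the loop over the eight cube corners here only makes the interpolation weights explicit.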
Figure 1. Flowchart of GroupRegNet. The expression (n, D, H, W) denotes the number of images in the group followed by the spatial dimensions of each image.