To take full advantage of the proposed gDTH feature, we have incorporated a similarity metric that measures the geometrical similarity of OARs with respect to multiple PTVs. The first term of the metric is the Frobenius norm of the difference between the gDTHs of two cases. An identical distance distribution alone does not guarantee an identical dose distribution, because the prescription doses to the primary and boost PTVs may vary from patient to patient. In our institution, for instance, there are several commonly used primary/boost dose prescriptions for head and neck (HN) treatment, such as 44 Gy / 70 Gy and 50 Gy / 60 Gy. To account for such variation, we introduced a second term to represent the dose ratio similarity and defined the similarity metric as:
where gDTH_target and gDTH_ref denote the gDTH of the target plan and that of the plan being referenced from the database, d denotes the prescription dose, and λ is a balancing factor empirically tuned so that the first term and the second term have matching mean values over the training dataset. In our experiments, λ was set to the ratio of the two squared differences averaged over the training set.
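The two-term metric and the choice of λ can be illustrated with a minimal sketch. The exact form used here is an assumption inferred from the description above, not the authors' equation: S = ||gDTH_target − gDTH_ref||_F² + λ (r_target − r_ref)², where r is taken to be the boost-to-primary prescription dose ratio. The function names, the dictionary keys (`gdth`, `d_primary`, `d_boost`), and the averaging over all pairs of training cases are likewise hypothetical.

```python
from itertools import combinations


def frobenius_sq(a, b):
    """Squared Frobenius norm of the difference of two equal-shape 2D histograms."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def dose_ratio_sq(case_a, case_b):
    """Squared difference of the boost/primary prescription dose ratios (assumed form)."""
    ratio_a = case_a["d_boost"] / case_a["d_primary"]
    ratio_b = case_b["d_boost"] / case_b["d_primary"]
    return (ratio_a - ratio_b) ** 2


def calibrate_lambda(training_cases):
    """Ratio of the mean geometric term to the mean dose term over all training pairs."""
    pairs = list(combinations(training_cases, 2))
    mean_geom = sum(frobenius_sq(a["gdth"], b["gdth"]) for a, b in pairs) / len(pairs)
    mean_dose = sum(dose_ratio_sq(a, b) for a, b in pairs) / len(pairs)
    if mean_dose == 0:
        raise ValueError("all training cases share the same dose ratio")
    return mean_geom / mean_dose


def similarity(target, ref, lam):
    """Assumed metric: geometric term plus lambda-weighted dose-ratio term (lower = more similar)."""
    return frobenius_sq(target["gdth"], ref["gdth"]) + lam * dose_ratio_sq(target, ref)
```

With λ calibrated this way, the two terms contribute equally on average over the training set, so neither the geometry nor the prescription dominates the neighbor ranking.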
With this similarity metric, a k-nearest neighbors (kNN) search selects the subset of training cases that most resemble the validation case. We reference the k nearest neighbors because (1) kNN is a robust non-parametric regression method when k is properly selected, and (2) kNN referencing yields a set of similar plans that can be modeled with reduced model complexity. The selected subset is subsequently used to build a DVH prediction model. t-distributed stochastic neighbor embedding (t-SNE)27 is used to visualize this high-dimensional feature space and to justify the similarity metric against the distribution of cases in the feature space. t-SNE converts high-dimensional Euclidean distances into conditional probabilities and maps high-dimensional data to a low-dimensional space while preserving the local structure of the dataset. A visualization of the proposed feature map of a dataset is shown in Fig. 3. Figure 3a shows a two-dimensional t-SNE map of the left parotid gDTHs of the 120-case training dataset in this study (the red and blue dots). Figure 3b shows a randomly picked validation case, used to demonstrate the effectiveness of the proposed feature at differentiating cases with different OAR-PTV shape distributions. The blue dots on the map (Fig. 3a) are the cases selected by the similarity metric to build the model that predicts the parotid DVH of the validation case (Fig. 3b), while the red dots are the cases excluded from the modeling. Fig. 3c–f further show the PTVs and left parotid anatomies of the selected (3f) and unselected (3c–e) cases, and their respective locations on the 2D t-SNE map are indicated by the arrows. As shown, 3b and 3f are determined by Eq. 3 to be similar in features, even though their PTVs (especially the boost PTV) vary significantly in size and location.
Previously, the modeling of head and neck treatment plans required manual data stratification. For instance, ipsilateral and contralateral parotids had to be modeled separately28, and treatment plans had to be categorized by sub-site before model training. The proposed gDTH feature effectively separates cases with different geometries in a non-linear fashion, so it is no longer necessary to stratify the data.
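The kNN selection step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: `select_k_nearest` is a hypothetical helper, `distance` stands for any callable implementing the similarity metric (smaller values meaning more similar), and the choice of k is left to the caller.

```python
import heapq


def select_k_nearest(target, references, k, distance):
    """Return the indices of the k reference cases closest to `target`.

    Indices are ordered from most to least similar; ties in distance are
    broken by the lower index.
    """
    if not 1 <= k <= len(references):
        raise ValueError("k must be between 1 and the number of reference cases")
    scored = ((distance(target, ref), i) for i, ref in enumerate(references))
    return [i for _, i in heapq.nsmallest(k, scored)]
```

The returned indices identify the training cases that would be passed on to the DVH prediction model for the case in question, while the remaining cases are excluded from that model.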
Figure 3. A t-SNE visualization example: (a) a two-dimensional t-SNE map of the left parotid gDTHs; (b) the randomly selected validation case; (c)–(f) four example cases located in different regions of the feature map. In (a), blue dots mark the cases selected by the similarity metric for modeling the validation case in (b), and red dots denote the rest of the dataset. Panels (c), (d), and (e) show three unselected cases, with arrows indicating their locations on the t-SNE map, while (f) is one of the selected cases, even though its PTVs (especially the boost PTV) differ significantly from those in (b) in size and location. Both axes of the t-SNE map are dimensionless and in arbitrary units.