AI model: PANDA

Kai Cao, Yingda Xia, Jiawen Yao, Xu Han, Lukas Lambert, Tingting Zhang, Wei Tang, Gang Jin, Hui Jiang, Xu Fang, Isabella Nogues, Xuezhou Li, Wenchao Guo, Yu Wang, Wei Fang, Mingyan Qiu, Yang Hou, Tomas Kovarnik, Michal Vocka, Yimei Lu, Yingli Chen, Xin Chen, Zaiyi Liu, Jian Zhou, Chuanmiao Xie, Rong Zhang, Hong Lu, Gregory D. Hager, Alan L. Yuille, Le Lu, Chengwei Shao, Yu Shi, Qi Zhang, Tingbo Liang, Ling Zhang & Jianping Lu

PANDA consists of three stages (Extended Data Fig. 1) and was trained by supervised machine learning. Given an input non-contrast CT scan, we first localize the pancreas, then detect possible lesions (PDAC or non-PDAC), and finally classify the subtype of any detected lesion. The output of PANDA consists of two components: the segmentation mask of the pancreas and the potential lesion, and the classification of the potential lesion with the associated probability of each class.
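As a minimal sketch, the control flow of the cascade can be summarized as follows, with the three trained networks abstracted as callables. All names, the `crop` helper and the 0.5 operating threshold are hypothetical illustration choices, not values from the paper; the real pipeline is detailed in the following paragraphs.

```python
def run_panda(ct, locate, detect, classify, crop, threshold=0.5):
    """Illustrative control flow of the three-stage PANDA cascade.

    `locate`, `detect` and `classify` stand in for the trained
    Stage 1-3 networks, `crop` for bounding-box extraction;
    `threshold` is an assumed operating point.
    """
    pancreas_mask = locate(ct)                 # Stage 1: pancreas mask
    roi = crop(ct, pancreas_mask)              # restrict to pancreatic region

    lesion_mask, p_abnormal = detect(roi)      # Stage 2: lesion present?
    if p_abnormal < threshold:
        return {"mask": pancreas_mask, "diagnosis": "normal"}

    subtype_probs = classify(roi)              # Stage 3: 8-way differential
    return {"mask": lesion_mask, "diagnosis": subtype_probs}
```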

The aim of the first stage (Stage 1) is to localize the pancreas. Because a pancreatic lesion usually occupies only a small region of the CT scan, localizing the pancreas accelerates lesion finding and prunes out unrelated information, enabling specialized training on the pancreatic region. In this stage we train an nnU-Net23 to segment the whole pancreas (the union mask of healthy pancreas tissue and any potential lesions) from the input non-contrast CT scan. Specifically, the three-dimensional (3D) low-resolution nnU-Net, which trains a U-Net on downsampled images, is used as the architecture because of its efficiency at inference. Model training is supervised by the voxel-wise annotated masks of the pancreas and lesion. More details on the training and inference for PANDA Stage 1 are given in Supplementary Methods 1.2.1.
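A concrete version of the crop step that follows Stage 1 might look like the sketch below; the 8-voxel safety margin is an assumption, not a value reported in the paper.

```python
import numpy as np

def crop_to_pancreas(ct: np.ndarray, mask: np.ndarray, margin: int = 8):
    """Crop the CT volume to the padded bounding box of the Stage 1
    pancreas mask. `margin` (in voxels) is an assumed safety padding."""
    coords = np.argwhere(mask > 0)
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    box = tuple(slice(int(l), int(h)) for l, h in zip(lo, hi))
    return ct[box]
```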

The aim of the second stage (Stage 2) is to detect the lesion (PDAC or non-PDAC). We trained a joint segmentation and classification network to simultaneously segment the pancreas and potential lesion and classify the patient-level abnormality label, that is, abnormal or normal. The benefit of the classification branch is that it enforces global-level supervision and produces a patient-level probability score, which is absent in semantic segmentation models. Similar designs have been used in previous studies, such as for cancer detection47,48 and outcome prediction49. The network architecture is shown in Extended Data Fig. 1b. This is a joint segmentation and classification network with a full-resolution nnU-Net23 backbone (left part of Extended Data Fig. 1b). We extract five levels of deep network features, apply global max-pooling, and concatenate the features before carrying out the final classification. We output both the segmentation mask of the potential lesion and pancreas and the probabilities of abnormal or normal for enhanced interpretability. This network was supervised by a combination of segmentation loss and classification loss:
$$\mathcal{L} = \mathcal{L}_{\mathrm{seg}} + \alpha\,\mathcal{L}_{\mathrm{cls}},$$
where the segmentation loss $\mathcal{L}_{\mathrm{seg}}$ was an even mixture of Dice loss and voxel-wise cross-entropy loss, and the classification loss $\mathcal{L}_{\mathrm{cls}}$ was the cross-entropy loss. $\alpha$ was set to 0.3 to balance the contributions of the two loss functions. More details on the training and inference of PANDA Stage 2 are given in Supplementary Methods 1.2.2.
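A PyTorch sketch of the Stage 2 classification branch and its training loss follows. The toy strided-convolution encoder standing in for the full-resolution nnU-Net, the channel sizes and the particular soft-Dice variant are assumptions; the five-level global max-pooling, the feature concatenation, the even Dice/cross-entropy mixture and $\alpha = 0.3$ follow the text above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSegCls(nn.Module):
    """Joint segmentation/classification head (illustrative). Five
    placeholder encoder stages stand in for the nnU-Net levels."""

    def __init__(self, feat_channels=(32, 64, 128, 256, 320), n_cls=2):
        super().__init__()
        chans = (1,) + feat_channels
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.Conv3d(chans[i], chans[i + 1], 3, stride=2, padding=1),
                nn.InstanceNorm3d(chans[i + 1]),
                nn.LeakyReLU(inplace=True),
            )
            for i in range(len(feat_channels))
        )
        # 3 segmentation classes: background / pancreas / lesion.
        self.seg_head = nn.Conv3d(feat_channels[0], 3, kernel_size=1)
        self.cls_head = nn.Linear(sum(feat_channels), n_cls)

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        # Globally max-pool each of the five feature levels, then
        # concatenate before the final patient-level classification.
        pooled = torch.cat([f.amax(dim=(2, 3, 4)) for f in feats], dim=1)
        # The real model decodes back to full resolution; here the
        # coarse map from the first level is returned for brevity.
        return self.seg_head(feats[0]), self.cls_head(pooled)

def soft_dice_loss(logits, target, eps=1e-5):
    """One standard soft-Dice formulation (an assumption)."""
    probs = torch.softmax(logits, dim=1)
    onehot = F.one_hot(target, probs.shape[1]).movedim(-1, 1).float()
    dims = tuple(range(2, probs.ndim))
    inter = (probs * onehot).sum(dims)
    denom = probs.sum(dims) + onehot.sum(dims)
    return 1 - ((2 * inter + eps) / (denom + eps)).mean()

def stage2_loss(seg_logits, seg_target, cls_logits, cls_target, alpha=0.3):
    # L_seg: even mixture of Dice and voxel-wise cross-entropy.
    l_seg = 0.5 * soft_dice_loss(seg_logits, seg_target) \
          + 0.5 * F.cross_entropy(seg_logits, seg_target)
    # L = L_seg + alpha * L_cls, with alpha = 0.3 as in the text.
    return l_seg + alpha * F.cross_entropy(cls_logits, cls_target)
```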

The aim of the third-stage network (Stage 3) is the differential diagnosis of the pancreatic lesion type, formulated as classification into eight sub-classes, that is, PDAC, PNET, SPT, IPMN, MCN, chronic pancreatitis, SCN and ‘other’. Because of the subtle texture changes in pancreatic diseases, especially on non-contrast CT scans, we incorporate a separate memory-path network that interacts with the UNet path to enhance the ability to model global contextual information, which is usually associated with the diagnosis of pancreatic lesions by radiologists. As shown in Extended Data Fig. 1c, we use a dual-path memory transformer network. This design is inspired by MaX-DeepLab25. The architecture of the UNet branch is the same as that of Stage 2, implemented as a full-resolution nnU-Net. The UNet branch takes as input the 3D pancreas bounding box, cropped to a fixed input size of (160, 256, 40). The memory branch starts with learnable memories designed to store both positional and texture-related prototypes of the eight types of pancreatic lesion, initialized as 200 tokens with 320 channels. The memory path iteratively interacts with multi-level UNet features (plus a shared learnable positional embedding across layers) via cross-attention and self-attention layers. Through this process the memory vectors are automatically updated to encode both the texture-related information from the UNet features and the positional information from the learnable positional embedding (for example, the relative position of the pancreatic lesion inside the pancreas), resulting in distinguishable descriptors for each type of pancreatic lesion.
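One interaction step of the memory path might be sketched as below. The 200-token, 320-channel memory follows the text; the head count, the pre/post-norm layout and the single-block structure are assumptions about an architecture whose exact form is given in the supplementary methods.

```python
import torch
import torch.nn as nn

class MemoryPathBlock(nn.Module):
    """One memory-path interaction step (illustrative): learnable
    memory tokens cross-attend to flattened UNet feature tokens,
    then self-attend among themselves."""

    def __init__(self, dim=320, n_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, memory, feat_tokens, pos_embed):
        # Cross-attention: memory queries attend to UNet feature tokens
        # carrying the shared learnable positional embedding.
        kv = feat_tokens + pos_embed
        memory = self.norm1(memory + self.cross_attn(memory, kv, kv)[0])
        # Self-attention among the memory tokens themselves.
        memory = self.norm2(memory + self.self_attn(memory, memory, memory)[0])
        return memory

# Usage sketch: 200 memory tokens of 320 channels, iteratively updated
# against each UNet level's features projected to 320 channels.
mem = torch.zeros(1, 200, 320)      # learnable parameter in the real model
feats = torch.randn(1, 1024, 320)   # flattened UNet features (assumed shape)
pos = torch.zeros(1, 1024, 320)     # shared learnable positional embedding
mem = MemoryPathBlock()(mem, feats, pos)
```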

The mechanism of the cross-attention and self-attention used in the model is formally described in Supplementary Methods 1.2.3, together with more details on model instantiation, training and inference of PANDA Stage 3.

Additionally, we trained an IPMN subtype classifier in a cascaded fashion following PANDA Stage 3, with the aim of binary classification between main or mixed-duct IPMN and branch-duct IPMN (Supplementary Methods 1.2.3).

One major difference between chest CT and abdominal CT is that the pancreatic and lesion regions are sometimes only partially scanned in chest CT, depending on the scanning range of the protocol and the anatomy of the patient. This difference could induce domain-shift issues for machine learning models if our AI model were trained only on abdominal CT scans. To address this issue we propose a data augmentation method that randomly (with a given probability) cuts off the pancreas region in the axial plane to simulate the imaging scenario in which the pancreas is not fully scanned in chest CT. This augmentation is applied during the training of Stages 2 and 3. This simple simulation of chest CT effectively helps our model generalize to chest non-contrast CT without adding any chest CT data to the training set, while maintaining high performance on abdominal non-contrast CT.
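A minimal version of this augmentation is sketched below. The probability `p = 0.2` and the choice of cut point are assumptions; the paper states only that the cutoff is applied with some probability. Axis 0 is assumed to be the axial (z) axis.

```python
import numpy as np

def random_axial_cutoff(volume: np.ndarray, seg: np.ndarray,
                        p: float = 0.2, rng=None):
    """With probability `p`, truncate the volume at a random axial
    slice inside the pancreas extent, simulating a chest CT in which
    the pancreas is only partially scanned."""
    rng = rng or np.random.default_rng()
    if rng.random() >= p:
        return volume, seg
    z = np.where(seg.any(axis=(1, 2)))[0]    # axial slices containing pancreas
    if len(z) < 2:
        return volume, seg
    cut = rng.integers(z[0] + 1, z[-1] + 1)  # keep at least one pancreas slice
    return volume[:cut], seg[:cut]
```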

In the real-world clinical evaluation, PANDA was deployed at SIPD by integrating it into the clinical infrastructure and workflow (Supplementary Fig. 9). The deployment facilitates large-scale retrospective real-world studies in the hospital environment by securing data privacy, efficiently using computational resources, and accelerating large-scale data inference and clinical evaluation. Specifically, we deployed PANDA on a local server located in the hospital (Supplementary Methods 1.2.4), which enables radiologists to visualize each case using our user-friendly DAMO Intelligent Medical Imaging user interface (IMI UI; Supplementary Fig. 9), easily review all results, and access the necessary information from their daily work environment. After RW1 we collected additional non-contrast CT data, comprising false-positive and false-negative cases and cases of acute pancreatitis, from the internal, external and RW1 cohorts; in the field of machine learning this is known as hard-example mining and incremental learning. The evolved model was named PANDA Plus and was tested on RW2. The collection and annotation of these new training data and the fine-tuning schedule are described in Supplementary Methods 1.2.5.
