Procedure

In addition to the main outcome measures such as recall, the following methods were used to evaluate model performance on the ampulla detection task.

Centroid distance is a relative coordinate error between centroids of the GT bbox and the estimated bbox. Its mathematical expression is as follows:

where $(x_g, y_g)$ and $(x_e, y_e)$ denote the centroid coordinates of the GT bbox and of the estimated bbox, and $W$ and $H$ denote the width and height of the image, respectively.
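As an illustration, the centroid distance could be computed as in the following minimal sketch. It assumes that the relative error is the Euclidean norm of the centroid offset after dividing the horizontal component by $W$ and the vertical component by $H$, and that a bbox is stored as `(x_min, y_min, x_max, y_max)` in pixels. Both the choice of norm and the box format are assumptions, and the function names are hypothetical.

```python
import math


def centroid(bbox):
    """Return the centroid of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0


def centroid_distance(gt_bbox, est_bbox, width, height):
    """Relative centroid error between a GT box and an estimated box.

    Assumption: offsets are normalized by the image width and height and
    combined with the Euclidean norm.
    """
    xg, yg = centroid(gt_bbox)
    xe, ye = centroid(est_bbox)
    dx = (xg - xe) / width   # horizontal offset relative to image width
    dy = (yg - ye) / height  # vertical offset relative to image height
    return math.hypot(dx, dy)
```

Under these assumptions the value is 0 when the two centroids coincide and grows with the offset, independently of the image resolution.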

A success plot shows the success rate for decreasing mean intersection-over-union (mIoU) thresholds or for increasing centroid distance thresholds. The prediction for an image is counted as a success if the IoU between the model prediction and the GT label is greater than the threshold or, when the centroid distance is used, if the distance is lower than the threshold.
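The success rates behind such a plot could be computed as in the following sketch. It assumes boxes in `(x_min, y_min, x_max, y_max)` format and strict inequalities at the threshold. The function names and the `higher_is_better` flag are hypothetical and only serve to cover both the IoU case and the centroid distance case.

```python
import numpy as np


def iou(box_a, box_b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def success_rates(scores, thresholds, higher_is_better=True):
    """Fraction of images counted as a success at each threshold.

    scores: one value per image (IoU or centroid distance).
    higher_is_better: True for IoU (success if score > threshold),
                      False for centroid distance (success if score < threshold).
    """
    scores = np.asarray(scores, dtype=float)[:, None]          # shape (N, 1)
    thresholds = np.asarray(thresholds, dtype=float)[None, :]  # shape (1, T)
    hits = scores > thresholds if higher_is_better else scores < thresholds
    return hits.mean(axis=0)  # one success rate per threshold
```

Plotting the returned rates against the thresholds gives one curve per metric.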

Human performance was compared with that of our model. We randomly sampled 30 images from the test set of the first fold using the Python NumPy library, and an expert endoscopist performed ampulla detection on the sampled images. The performance of our model was measured on the same images.
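A sampling step of this kind could look like the sketch below. The text does not state which NumPy routine or seed was used, so the use of `numpy.random.default_rng`, the seed value, and the helper name `sample_indices` are assumptions made for illustration only.

```python
import numpy as np


def sample_indices(n_images, n_samples=30, seed=0):
    """Draw n_samples distinct image indices from range(n_images).

    The seed is a hypothetical value; fixing it makes the draw reproducible.
    """
    rng = np.random.default_rng(seed)
    # replace=False ensures that no image is selected twice
    return np.sort(rng.choice(n_images, size=n_samples, replace=False))
```

Evaluating both the endoscopist and the model on the same index set keeps the comparison paired.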

