Rare object detection is complicated here by the strong visual similarity between the object of interest (i.e., the SCN egg) and soil debris and other spurious background elements. To tackle this class of problems, we improve the architecture’s discriminative ability by introducing a selectivity function. Selectivity is an intermediate process between pixel-wise and superpixel segmentation that learns to label as positive only the bounding boxes containing the object of interest. It enables the model to learn to discard undesirable units based on the bounding boxes.
The function propagates only the network activations/units that appear in patches containing fully visible SCN eggs, and rejects or masks activations/units that occur in patches where the SCN eggs are only partially present or absent. Thus, apart from regular debris objects (which are visually quite different from eggs), the masked units also include debris that is visually similar to parts of eggs. Qualitatively, selectivity is a dynamic labeling function: the same bounding box can be labeled as an SCN egg at one point and masked as debris at another. A qualitative description of the function is given in Fig. 7, and a video-based illustration is provided in the attached Supplementary Video.
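The labeling rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fully_contains`, `selectivity_label`, `selective_target`), the end-exclusive box convention `(row0, col0, row1, col1)`, the patch size, and the choice to express masking as zeroing the patch's pixel-wise target are all assumptions made for illustration.

```python
import numpy as np

def fully_contains(patch_box, egg_box):
    """Return True if egg_box lies entirely inside patch_box.

    Boxes are (row0, col0, row1, col1), end-exclusive (assumed convention).
    """
    pr0, pc0, pr1, pc1 = patch_box
    er0, ec0, er1, ec1 = egg_box
    return pr0 <= er0 and pc0 <= ec0 and er1 <= pr1 and ec1 <= pc1

def selectivity_label(patch_box, egg_boxes):
    """1 if the patch fully contains at least one egg box, else 0."""
    return int(any(fully_contains(patch_box, b) for b in egg_boxes))

def selective_target(pixel_mask, patch_box, egg_boxes):
    """Patch target: the pixel-wise mask if the patch is selected, else zeros.

    Simplification: a selected patch keeps its whole pixel mask, including
    any partially visible neighbouring eggs.
    """
    r0, c0, r1, c1 = patch_box
    patch_mask = pixel_mask[r0:r1, c0:c1]
    return patch_mask * selectivity_label(patch_box, egg_boxes)

# Toy image: one egg occupying rows 4..11 and columns 10..19.
egg_boxes = [(4, 10, 12, 20)]
pixel_mask = np.zeros((16, 32), dtype=np.uint8)
pixel_mask[4:12, 10:20] = 1

# Slide a 16x16 patch horizontally across the egg.
patch_boxes = [(0, c, 16, c + 16) for c in (0, 4, 8, 12)]
labels = [selectivity_label(p, egg_boxes) for p in patch_boxes]
targets = [selective_target(pixel_mask, p, egg_boxes) for p in patch_boxes]

print(labels)  # [0, 1, 1, 0]
```

In this toy sweep the first and last patches each contain part of the egg, so a pixel-wise target would still mark those pixels as positive, whereas the selective target is all zeros there. The same egg region is therefore labeled positive in some patches and masked in others, which is the dynamic behaviour the text describes.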
Patch-wise frames selected from a video of the progression of an example SCN egg. Selectivity is seen to be superior to pixel-wise semantic segmentation for resolving the similarity between SCN eggs and debris.
Although selectivity is only a dynamic labeling technique, it is the major ingredient that enabled the autoencoder network to be trained for the task. In Fig. 7, the SCN egg inputs in both (I) and (II) resemble debris closely enough that a vanilla autoencoder network trained for pixel-wise semantic segmentation would confuse SCN eggs with debris, increasing the likelihood of false alarms; selectivity curtails such false positives. We observe that the pose-centering ability of the network is due to the selectivity criterion. Selectivity also improves the autoencoder’s ability to learn SCN egg features, leading to better accuracy.