Each experiment consisted of 150 trials. In 50 of the trials, the predictions of the model (or the robot) matched the ground-truth labels of the target images; in the remaining 100, they did not. We selected the target images and the classification categories based on the model's confusion matrix, with the aim of covering a wide range of model behavior. First, we calculated the ResNet-50-PLDA model's confusion matrix on ImageNet 1K, which contains 1000 categories. Then, we randomly selected 25 categories from each of the following four subsets: the 100 categories on which the model was most accurate, the 100 categories most confusable with these most accurate categories, the 100 categories on which the model was least accurate, and the 100 categories most confusable with these least accurate categories. This yielded 100 categories in total. We recorded the model's predicted labels for all training images in these 100 categories and marked every image whose model prediction was also among these 100 categories.
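The sketch below illustrates this category-selection step, assuming the confusion matrix is available as a 1000 × 1000 NumPy array of counts (rows: ground-truth categories, columns: model predictions). The aggregation rule used to rank "most confusable" categories and all function and variable names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the category-selection step (assumed data layout:
# `conf` is a 1000x1000 count matrix, rows = ground truth, cols = prediction).
import numpy as np

rng = np.random.default_rng(0)

def select_categories(conf, n_extreme=100, n_sample=25):
    """Pick 25 easy, 25 hard, and 25 + 25 most-confusable categories."""
    acc = np.diag(conf) / conf.sum(axis=1)        # per-category accuracy
    easy_pool = np.argsort(acc)[-n_extreme:]      # 100 most accurate
    hard_pool = np.argsort(acc)[:n_extreme]       # 100 least accurate

    def most_confusable(pool):
        # Assumption: rank categories by off-diagonal confusions summed
        # over the pool, then keep the top-100 non-members.
        off_diag = conf.copy()
        np.fill_diagonal(off_diag, 0)
        confusion_with_pool = off_diag[pool].sum(axis=0)
        confusion_with_pool[pool] = -1            # exclude pool members
        return np.argsort(confusion_with_pool)[-n_extreme:]

    subsets = [easy_pool, most_confusable(easy_pool),
               hard_pool, most_confusable(hard_pool)]
    # 25 random categories from each of the four 100-category subsets
    return np.concatenate([rng.choice(s, n_sample, replace=False)
                           for s in subsets])

# selected = select_categories(conf)   # 100 category indices
```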
From this subset, in which both the image and the top model prediction belonged to our 100 categories, we randomly sampled 50 images for which the model prediction matched the ground-truth label and 100 images for which it did not. For the 50 trials with correctly classified target images, the two classification options participants could choose from were the correct model-predicted category and one of the two categories most confusable with it (out of our 100 selected categories); which of the two most confusable categories was presented was chosen at random for each trial. For the 100 trials with incorrectly classified target images, the two classification options were simply the ground-truth category and the incorrect model prediction. This procedure resulted in a total of 83 unique categories used in the experiment (Supplementary Table T1). This number is smaller than 100 because not all confusable categories are unique and not all categories survived the random sampling. Figure 7 depicts the trial-generation process, and the pairs of categories used in the experiments are listed in Supplementary Table T2.
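For concreteness, the following sketch shows how the 150 trials and their two classification options could be assembled, assuming a list `records` of per-image dictionaries holding the model prediction, the ground-truth label, and the two categories most confusable with the prediction (restricted to the 100 selected categories). The data structure and key names are hypothetical.

```python
# Illustrative sketch of the trial-sampling step (not the original code).
import random

random.seed(0)

correct = [r for r in records if r["pred"] == r["truth"]]
incorrect = [r for r in records if r["pred"] != r["truth"]]

trials = []
for r in random.sample(correct, 50):
    # Correct trials: options are the (correct) prediction and one of the
    # two categories most confusable with it, chosen at random.
    foil = random.choice(r["top2_confusable"])
    trials.append({"image": r["image"], "options": (r["pred"], foil)})

for r in random.sample(incorrect, 100):
    # Incorrect trials: options are the ground truth and the wrong prediction.
    trials.append({"image": r["image"], "options": (r["truth"], r["pred"])})
```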
Figure 7. Flowchart of trial generation. (A) Selection of examples and saliency maps with Bayesian teaching. The inputs to Bayesian teaching are the model to be explained, data sets from two categories, and a target image that belongs to one of the two categories. The green box depicts the inner workings of Bayesian teaching. Random image pairs are selected from each of the input categories. Along with the target image, two sets of image pairs, one from each category, are selected at random to form a trial. The explainee model, which is set to have the same architecture as the input model, takes in a large number of random trials to produce the simulated explainee fidelity (unnormalized teaching probabilities according to Eq. (3)). Here, a trial with high fidelity (probability) is selected, exemplifying the trial-generation process in the "helpful" condition. Saliency maps are generated for the target image and the four selected examples using Eq. (6). The final output is a set of ten images: the target image, the two examples selected from each of the two input categories, and the saliency maps of these five images. (B) Trial-generation steps peripheral to Bayesian teaching. Our model to be explained is a ResNet-50 trained on ImageNet 1K. A confusion matrix on the 1000 ImageNet categories was computed using the model. From the confusion matrix, we sampled 25 categories on which the model had high accuracy (the "Easy" categories), 25 categories on which the model had low accuracy (the "Hard" categories), and the categories most confusable with these 50 categories. To generate a trial, we select at random two categories from the 100 candidates described above, as well as a target image that belongs to one of the two selected categories. The model, the target image, and the data associated with the two categories are fed into Bayesian teaching to produce a trial. See Methods for full details. This figure was created using Adobe Illustrator CS6 (v. 16.0.0)43.
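The selection loop in panel (A) can be summarized structurally as below. Because Eqs. (3) and (6) are not reproduced in this excerpt, the fidelity computation is represented by an assumed helper function `explainee_fidelity`; the candidate count, function names, and return format are placeholders rather than the authors' implementation.

```python
# Structural sketch of the Bayesian-teaching selection loop in panel (A).
import random

def generate_trial(target, images_a, images_b, n_candidates=1000):
    """Pick the candidate example set with the highest simulated fidelity."""
    best, best_fidelity = None, float("-inf")
    for _ in range(n_candidates):
        pair_a = random.sample(images_a, 2)   # two examples from category A
        pair_b = random.sample(images_b, 2)   # two examples from category B
        # `explainee_fidelity` stands in for the unnormalized teaching
        # probability of Eq. (3), computed by the explainee model.
        fidelity = explainee_fidelity(target, pair_a, pair_b)
        if fidelity > best_fidelity:          # keep the most helpful trial
            best, best_fidelity = (pair_a, pair_b), fidelity
    # Saliency maps (Eq. (6)) would then be generated for the target and the
    # four selected example images; that step is omitted here.
    return best
```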