Monkey behavioral training and testing utilized standard operant conditioning, head stabilization, and infrared video eye tracking. Custom software (https://mworks.github.io/) was used to present stimuli on an LCD monitor with an 85-Hz refresh rate.
The monkeys performed an invariant delayed-match-to-sample (IDMS) task (Fig. 2). As an overview, the task required the monkeys to make a saccade when a target object appeared within a sequence of distractor images (Fig. 2A). Objects were presented at differing positions, sizes, and background contexts as described above and shown in Fig. 2C. Stimuli consisted of a fixed set of 20 images that included 4 objects, each presented at 5 different identity-preserving transformations (Fig. 2B). Each short block (~3 min) was run with a fixed target object before another target was pseudorandomly selected. Our design included two types of trials: cue trials and test trials (Fig. 2A). Only test trials were analyzed for this report.
A trial began when the monkey fixated on a red dot (0.15°) in the center of a gray screen, within a square window of ±1.5°. Fixation was followed by a 250-ms delay before a stimulus appeared. Cue trials, which indicated the current target object, were presented at the beginning of each short block or after three consecutive error trials. To minimize confusion, cue trials were designed to be distinct from test trials and began with the presentation of an image of each object that was distinct from the images used on test trials (a large version of the object presented at the center of gaze on a gray background; Fig. 2A). Test trials began with a distractor image, and neural responses to the first distractor were discarded to minimize nonstationarities such as stimulus onset effects. During the IDMS task, all images were presented at the center of gaze, in a circular aperture that blended into a gray background (Fig. 2C).
In each block, 5 images were presented as target matches and the other 15 as distractors. Distractor images were drawn randomly without replacement until each distractor was presented once on a correct trial, and the images were then rerandomized. Within each block, four repeated presentations of each of the 20 images were collected, and a new target object was then pseudorandomly selected. Each block lasted ~3 min. Following the presentation of all 4 objects as targets, the targets were rerandomized. At least 10 repeats of each condition were collected on correct trials. When more than 10 repeats were collected, the first 10 were used for analysis.
On most test trials, a target match followed the presentation of a random number of 1–6 distractors (probability of a target match at each position: 0.278, 0.151, 0.142, 0.128, 0.115, and 0.099; Fig. 2A). On a small fraction of trials (0.086), seven distractors were shown, and the monkeys were rewarded for fixating through all distractors. This translated into a conditional probability function: if the monkey had not observed a target match by position N–1 in the trial, the probability that the target match would appear at position N was, for positions 1–6, 0.278, 0.209, 0.249, 0.299, 0.383, and 0.536. Each image was presented for 400 ms (or until the monkeys’ eyes left the fixation window) and was immediately followed by the presentation of the next stimulus. Monkeys were rewarded for making a saccade to a response target within a window of 75–600 ms after the target match onset. In monkey 1, the response target was positioned 10° above fixation; in monkey 2, it was 10° below fixation. If the monkeys had not yet moved their eyes by 400 ms after target onset, a distractor stimulus was immediately presented. A trial was classified as a “false alarm” if the eyes left the fixation window via the top (monkey 1) or bottom (monkey 2) outside the allowable correct response period and traveled >0.5°. In contrast, all other instances in which the eyes left the fixation window during the presentation of distractors were characterized as fixation breaks. A trial was classified as a “miss” when the monkey continued fixating beyond 600 ms following the onset of the target match. Overall, monkeys performed this task with high accuracy. Disregarding fixation breaks (monkey 1: 11% of trials; monkey 2: 8% of trials), percent correct on the remaining trials was as follows: monkey 1: 98% correct, ~1% false alarms, and ~1% misses; monkey 2: 94% correct, 2% false alarms, and 4% misses.
Behavioral performance was comparable for the sessions corresponding to recordings from the two areas (V4 percent correct overall = 96.5%; IT percent correct overall = 91.4%).
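The conditional probabilities quoted for target-match positions 1–6 can be recovered from the per-position probabilities and the 0.086 fraction of seven-distractor trials. The following is a minimal sketch, not the authors' code: the function and variable names are hypothetical, and it assumes that each conditional probability is the probability at position N divided by the summed probability of all outcomes not yet excluded (the listed values sum to 0.999 because of rounding).

```python
# Probability that the target match appears at positions 1-6, and the
# fraction of trials with seven distractors and no target (values from the text).
p_target = [0.278, 0.151, 0.142, 0.128, 0.115, 0.099]
p_catch = 0.086


def conditional_target_probability(p_target, p_catch):
    """Return P(target at position N | no target before position N) for each N.

    Hypothetical helper: divides each position's probability by the total
    probability of the outcomes that are still possible at that position.
    """
    outcomes = p_target + [p_catch]
    hazard = []
    for n, p in enumerate(p_target):
        remaining = sum(outcomes[n:])  # probability mass not yet ruled out
        hazard.append(p / remaining)
    return hazard


hazard = conditional_target_probability(p_target, p_catch)
print([round(h, 3) for h in hazard])
```

Under this assumption, the computed values agree with the reported ones to within rounding of the published probabilities.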
V4 receptive fields at and near the center of gaze are small: on average they have radii of 0.56° at the fovea, extending to radii of 1.4° at an eccentricity of 2.5° (Desimone and Schein 1987; Gattass et al. 1988). We thus took considerable care to ensure that the images were placed in approximately the same region of these receptive fields across repeated trials. In the second monkey, adequate fixational control could not be achieved through training. We thus applied a procedure in which we shifted each image at stimulus onset 25% toward the center of gaze (e.g., if the eyes were displaced 0.5° to the left, the image was repositioned 0.125° to the left and thus 0.375° from the center of gaze). Image position then remained fixed until the onset of the next stimulus. The value of 25% was determined during training as an amount that enabled us to achieve better consistency of placement of the images on the receptive fields, while maintaining high behavioral performance.
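The arithmetic of the 25% repositioning can be illustrated with a short sketch. This is an assumption-laden illustration rather than the authors' implementation: the function name, the one-dimensional treatment, and the sign convention (negative values are to the left of the fixation point) are all hypothetical.

```python
def reposition_image(eye_offset_deg, shift_fraction=0.25):
    """Shift the image a fraction of the way from the fixation point toward gaze.

    eye_offset_deg: eye position at stimulus onset relative to the fixation
        point, in degrees (negative = left; hypothetical convention).
    Returns (image_position, residual): the image position relative to the
    fixation point and its remaining distance from the center of gaze.
    """
    image_position = shift_fraction * eye_offset_deg
    residual = abs(eye_offset_deg - image_position)
    return image_position, residual


# Example from the text: eyes displaced 0.5 deg to the left.
image_position, residual = reposition_image(-0.5)
print(image_position, residual)  # -0.125 0.375
```

For two-dimensional eye positions, the same scaling would presumably be applied to the horizontal and vertical offsets independently.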