Despite mental imagery being a difficult and abstract concept to target, several self-report questionnaire instruments have demonstrated successful and reliable measurement of various aspects of visual representations. Of particular note are the widely used Vividness of Visual Mental Imagery Questionnaire (VVIQ; Marks, 1973) and its more recent revisions, the Vividness of Visual Mental Imagery Questionnaire-2 (VVIQ-2; Marks, 1995) and the Vividness of Visual Mental Imagery Questionnaire-Revised Version (VVIQ-RV; Marks, 1995; Campos, 2011). In each of these surveys, participants are prompted to visualize specific scenes, such as a sunset, and report on the clarity and detail of the generated images using Likert scale responses. The variants of the VVIQ differ in whether they require the participant to visualize with eyes open or closed. Critical statistical testing of the original VVIQ and both of its variants indicates high internal validity for measuring the mental imagery construct (Campos, 2011). In addition, the Plymouth Sensory Imagery Questionnaire (Psi-Q) is a unique assessment able to provide highly reliable measures of individual tendency to experience vivid imagery across multiple modalities (Andrade et al., 2014). The demonstrated internal validity of assessment items which require participants to generate detailed scenes indicates that individuals are able to perceive multiple and specific visual components in mental representations, and that these components can be reliably captured by straightforward survey items.

At least one questionnaire instrument has attempted to target specific shape information represented within visual mental images. The Mental Imagery Scale (MIS; D’Ercole et al., 2010) was designed to exploit the relationship between verbal descriptions and mental images in order to directly translate structural features present within mental representations into precise verbal descriptions. As the creators note, this type of scale is advantageous for highly visual and communicative fields such as architecture and art didactics. To test the MIS, participants were given a verbal description of a piece of artwork and asked to answer questions aligned with one of six factors describing aspects of mental images and the process of image formation: Image Formation Speed, Stability, Dimensions, Level of Detail, Distance, and Perspective (D’Ercole et al., 2010). The results of the study showed that participant responses supported the proposed six factor model, suggesting that mental imagery is influenced by inherent spatial properties. As it relates to the study of diagnostic features, this instrument demonstrates that reliable and detailed assessment of visual mental imagery is achievable through verbal descriptions alone. If this specificity were to be increased to the level of independent, discrete object components, it is possible that the MIS or similar instruments could target and identify discrete, classificatory visual features through self-report.

The Object-Spatial Imagers Questionnaire (OSIQ; Blajenkova et al., 2006) approaches the level of specificity necessary to identify distinguishing features by assessing object imagery preferences at the level of the individual. However, the purpose of the OSIQ is to reveal individual tendencies for representing images in a holistic, picture-like fashion or spatially, through a compilation of individual parts; the questionnaire does not include explicit appraisal of shape. Tests of the OSIQ demonstrate varying levels of preference for holistic and part-based representation across individuals. These results hold important implications for any study investigating mental imagery, because individual preference for holistic representations could lead to increased Type II error rates when attempting to access part-based visual information. In comparison to the OSIQ, the VVIQ has not been shown to characterize these spatial preferences (Blajenkova et al., 2006), which may be a result of the VVIQ’s focus on contextual visual scene imagery, as opposed to independent objects. Regardless, it would be prudent for future studies to consider the possibility of individual differences in representational style when choosing a questionnaire measure, as well as when analyzing and interpreting study findings.

There are several advantages and drawbacks to employing questionnaires in the study of mental imagery. On the one hand, surveys allow a large amount of detailed data to be collected in a relatively short amount of time, much more so than physiological or biological measures; the questionnaires described above contain 32 items on average. All items consist of a simple Likert scale rating ranging from 5 to 7 steps. In addition, these measures require very little in the way of technical skills or eligibility criteria, making the instruments accessible to a broad and representative population. The reliability of self-report responses of this type is also supported by behavioral results indicating that individuals tend to have reliable and accurate metacognition of their own imaginative experiences (Pearson et al., 2011). However, several complications arise when an individual is asked to verbally describe or physically recreate visual content. For example, perceptual biases and lack of artistic ability may distort participants’ drawings of images, and verbal descriptions may be misinterpreted or incomplete. Indeed, studies of drawings by non-artists have shown that drawing errors are positively correlated with perceptual biases encoded during initial image observation (Ostrofsky et al., 2015). Most importantly, the very nature of questionnaires makes the probing of specific distinguishing features difficult to accomplish without introducing artificial bias. Furthermore, even when bias is minimized, responses are likely to capture only those spatially discrete shapes which lend themselves to canonical lexical labeling.

Despite these shortcomings, the high level of proficiency with which written questionnaires have been shown to access the mental imagery construct warrants their consideration as reflectors of distinguishing features in mental imagery. In order to best take advantage of the benefits provided by their time-efficient and portable format, questionnaires assessing the specific shape structure of visualized images may best be applied to a large group of respondents. Using an extensive population reduces the influence of individual biases and representational preferences on responses. Any significant patterns observed within and across responses could then be identified and targeted for further, more in-depth, analyses. In the meantime, indicators of individual preferences such as the Psi-Q and OSIQ should be considered for use as covariates when measuring partial object information in mental imagery, regardless of the primary methodology employed. Even perceptual biases revealed through drawings may be insightful for inferring the visual aspects which receive the most attention during encoding, thus suggesting features of greater relative cognitive import. If diagnostic features are highly informative for the identity of a given object, patterns among the features or shape aspects reported by a large and varied group hold potential for identifying naturally occurring diagnostic object features. Although the distinguishing features captured by questionnaires would most likely be limited to spatially discrete, nameable object components, these data could then be used to guide further empirical research to evaluate the quality, reliability, and validity of these components as perceptual diagnostic features.

Gestural motor movements have also been explored as an indicator of mental object representation content. Following an established link between functional motor actions and tool use, one such study investigated whether an individual could acquire functional object representations merely by imagining the use of novel objects and visualizing the appropriate corresponding hand gestures (Paulus et al., 2012). Participants were shown pictures of four artificially designed objects with unique functional ends that required distinctive hand grips in order to be brought toward the ear or the nose. Prior to training, participants were instructed on the proper action associated with each object and told to imagine a salient effect resulting from that action (e.g., smelling an odor or hearing a sound). Each participant was trained on two of the four novel objects over three training blocks interspersed between three alternating test blocks. Training blocks consisted of a stimulus image displayed on a screen, followed by the presentation of a photograph in which an actor depicted the object at its correct final action location. Object representations were assessed in subsequent test trials, during which participants were asked to indicate with a button-press whether an object shown in an action demonstration matched the object image that was displayed immediately prior. The results of the study revealed slower response times to images in which a trained object was depicted at an incorrect end location as opposed to a correct one. However, this response time did not vary as a function of whether the object in the action demonstration was held using a correct or incorrect grip (Paulus et al., 2012). The sensitivity to action-related end location suggested by response time patterns indicates that participants successfully acquired object representations which included information regarding typical end goal location. The authors of the study propose that proper grip was not encoded in object representations as strongly as motor action because participants were instructed only to visualize a salient effect resulting from grip manipulation and never received physical, concrete experience in this aspect. However, the researchers note that this effect may also be related to the novelty of the objects included in their study, and they predict that grip may be more relevant and revealing of object representations when associated with stimuli with which participants have had previous experience.

The findings of Paulus et al. (2012) illustrate the importance of the goal end of an object as a key feature of functional object representations. Because motor planning requires an understanding of the object to be interacted with, which in some cases is completely determined by a unique functional end, it is highly likely that motor imagery is related to the type of visual mental imagery performed during object recognition. The interaction between visual cognition and efficient motor planning has been observed in both adults (Janczyk and Kunde, 2012) and infants (Barrett et al., 2008). Although motor planning is thought to be analytical in comparison to object perception, which is argued to generally rely on combined features (Janczyk and Kunde, 2012), this may favor motor planning as a more accessible pathway by which to identify individual features important for visually driven behavior. The study by Paulus et al. (2012) adds further support to the relative diagnosticity (in this case, diagnostic for classifying the appropriate grasp or movement) of particular object features over others and also suggests a potential avenue for identifying integral object components through associated motor behaviors. Previous research suggests that goal ends of objects are likely to carry categorical information related to their uses and the means, or action behaviors, by which those uses are efficiently achieved (e.g., Creem and Proffitt, 2001). Numerous studies of motor imagery explored through near-infrared technology further illuminate these findings; these are discussed in Section “Neural Activity.”

The implicit connection between gestural actions and the cognitive understanding of objects holds intriguing potential for the study of distinguishing features, but it is subject to significant weaknesses as well. Similar to questionnaires, motor behavior tasks provide a non-invasive, inexpensive method by which to assess distinguishing object parts that inform natural interactive behaviors. However, such testing is considerably more time-consuming than survey administration, and the resulting data require complex scoring and careful interpretation. In order to avoid confounds of novelty and inexperience, investigations of motor behavior regarding distinguishing features may best be applied to ecologically valid objects with which participants have had previous physical interactions. Categorical classification as implicated by specific gestures may allow for efficient object decoding based upon observation alone (Rosenbaum et al., 1992). However, this type of gestural relationship is acutely limited to manipulable objects and, what is more, manipulable objects that are associated with a clearly recognizable, stereotypical gesture. Nevertheless, implicit assessment of object features or categories through functional motor movements may illuminate the spatial locations and qualities of features that are typically targeted in motor movements. Based on the established functional connection between motor actions, such as grip, and the end location of an object (Rosenbaum et al., 1992), motor behavior therefore holds the potential to indicate essential structural features in tools and other manipulable objects. This method may be combined with data collected from other techniques used to assess diagnostic object features, such as questionnaires or neurophysiological measures, in order to form a more complete understanding of an object mental representation and its cognitively informative distinguishing features.

Eye movements associated with imaginary visual tasks are similar to those observed during perceptual tasks. Spontaneous eye movements during visualization of a scene reflect directional patterns comparable to those associated with perceptual viewing (Laeng and Teodorescu, 2002). Participants report experiencing increased difficulty in producing visual mental images when instructed to restrict their eye movements while doing so. When visualizing under this constraint, participants’ descriptions of the imaginary scene tend to become less detailed and limited to rudimentary features (Laeng and Teodorescu, 2002). The enhanced difficulty with which detailed visual mental imagery is produced when eye movements are restricted signifies an automatic, perhaps interdependent relationship between eye movements and the processing of visual imaginary scenes.

The prediction of an association between mental imagery content and concurrent oculomotor movements is by no means a novel one, and it has received empirical support dating back several decades (Brandt and Stark, 1997; Spivey and Geng, 2001; Laeng and Teodorescu, 2002; Johansson et al., 2006; Holm and Mäntylä, 2007; Ryan et al., 2007; Hannula and Ranganath, 2009; Williams and Woodman, 2010; Johansson and Johansson, 2014; Martarelli et al., 2016). In a direct comparison between visual inspection and mental visualization, repetitive sequences of fixation across diagrammatic checkerboard stimuli were recorded and analyzed in relation to the scanpaths observed during mental imagery of the same stimuli (Brandt and Stark, 1997). Participants were first familiarized with a checkerboard stimulus for 20 s and subsequently prompted to visualize the pattern on an empty grid for 10 s, followed by a second viewing period of 10 s. The protocol was repeated three times; stimuli were rotated by 90° in each subsequent trial, and eye movements were recorded using a video-based eye monitoring apparatus. String editing analysis of observed scanpaths across the two conditions revealed a high degree of similarity in saccadic patterns, suggesting that eye movements may play a role in organizing the visual content of a mental representation in the absence of physical stimuli. Although indications of grid size and location remained relatively consistent, scanpaths observed during imagery trials were found to be about 20% smaller than those observed during viewing trials, indicating an analogous but not identical relationship between saccades and the representations they reflect (Brandt and Stark, 1997), perhaps stemming from disparities between the representations and their physical counterparts. Nevertheless, the parallels observed in oculomotor patterns in this experiment lend strong support to the employment of eye movement behavior as an index of object features.
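
The string editing comparison described above can be illustrated with a minimal sketch. It assumes each scanpath has already been coded as a string in which every character names the region fixated (the region letters and the function names `levenshtein` and `scanpath_similarity` are hypothetical, chosen for illustration). It uses a generic edit distance normalized by the longer string, which is one common way to score scanpath similarity and is not necessarily the exact procedure of Brandt and Stark (1997).

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete char_a
                curr[j - 1] + 1,                  # insert char_b
                prev[j - 1] + substitution_cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def scanpath_similarity(viewing: str, imagery: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical fixation sequences.

    Assumption: each character is a label for one fixated region,
    e.g. a grid cell of the stimulus.
    """
    longest = max(len(viewing), len(imagery))
    if longest == 0:
        return 1.0  # two empty scanpaths are treated as identical
    return 1.0 - levenshtein(viewing, imagery) / longest
```

Under these assumptions, a viewing scanpath coded as `"ABCD"` and an imagery scanpath coded as `"ABED"` differ by one substitution out of four fixations. Note that a purely sequence-based score of this kind ignores the spatial extent of the saccades, so it would not capture the roughly 20% reduction in scanpath size reported for imagery trials.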

Although the precise nature of the relationship between saccades and object perception is still debated, there is some evidence that saccades index attention to specific object features during visual search. Eye tracking data suggest that saccadic patterns are influenced by peripheral object information acquired during visual search, thus reflecting attention to particular visual features based on available object information (Herwig and Schneider, 2014). Early fixations are also drawn by objects that retain intact low-level visual properties but are altered to exhibit object-intrinsic anomalies, such as unnatural rotation or color distribution, implicating the influence of peripheral object analysis on saccadic eye movement (Becker et al., 2007). These findings lend credence to the possibility that saccades index relevant, object-specific features based upon the observer’s pre-saccadic processing of the image.

There are several limitations that must be considered when applying eye movement tracking to the study of object feature detection, both in perceptual and imaginary tasks. The first of these is the potential confound of covert attention, during which an observer allocates increased cognitive attentional resources to a particular location in the visual field without executing a saccadic eye movement (McCarley et al., 2002). The ability to shift attention without any change in overt behavior reduces the reliability of eye movements as a direct indicator of active cognitive processing. Studies revealing poor memory performance despite accurate saccades to the location of previously displayed stimuli suggest that object properties are not necessarily coded in conjunction with spatial location (Richardson and Spivey, 2000; Johansson and Johansson, 2014). Similarly, tests involving eye movement manipulations during mental imagery have shown greater adverse effects on spatial aspects of mental imagery than on visual details (de Vito et al., 2014). Limited spatial sensitivity and precision, both in eye tracking equipment and in the human fovea, contribute to these issues.

Nonetheless, the relationship between oculomotor movements and spatial locations may be used to the advantage of object feature research. If discrete object features were equated to independent, distinct spatial locations, similar to the design employed by Brandt and Stark (1997), this connection could provide an opportunity to index individual feature attention through eye tracking. By equating discrete visual components with unique locations outside of the foveal visual field, participants are more likely to execute oculomotor movements in order to fixate individual visual features, thereby increasing the spatial resolution with which specific distinct features may be identified. The order, frequency, or duration of fixations on particular units may suggest a feature that is more salient than others, and may then be tested for efficiency in categorization to determine diagnosticity. This type of investigation could be applied to visual object search and subsequently compared to an analogous mental imagery condition. Several issues remain to be resolved before attempting such an experiment with real-world object stimuli, including the decision of the appropriate size at which object parts should be delineated, thus manipulating the amount of overall object information each unit comprises. In addition, changing the size of an object can alter one’s perception of it (Sterzer and Rees, 2006), and modifying the spatial configuration of an image may have deleterious effects on its holistic properties, thus influencing the manner in which it is processed (e.g., Martelli et al., 2005). Because the goal of object recognition research is to access the natural perception of stimuli and to identify the properties that facilitate this perception, it is important to minimize the amount of bias introduced by experimental manipulation. These concerns must be carefully addressed if confident inferences are to be made from the association between object features and spatial locations, but the benefits for understanding attention to classificatory visual components could be substantial.
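
As a rough illustration of how the order, frequency, and duration of fixations might be tallied per feature, the sketch below assumes that each fixation has already been assigned to a named object region. The `Fixation` type, the region names, and the `summarize_fixations` function are hypothetical and are not drawn from any of the studies cited here.

```python
from typing import Dict, Iterable, NamedTuple


class Fixation(NamedTuple):
    region: str         # hypothetical label of the object part fixated
    duration_ms: float  # fixation duration in milliseconds


def summarize_fixations(fixations: Iterable[Fixation]) -> Dict[str, dict]:
    """Tally, per region, the rank of its first fixation, the fixation count,
    and the total dwell time."""
    summary: Dict[str, dict] = {}
    for fixation in fixations:
        if fixation.region not in summary:
            # first_rank records the order in which regions were first visited.
            summary[fixation.region] = {
                "first_rank": len(summary) + 1,
                "count": 0,
                "total_ms": 0.0,
            }
        entry = summary[fixation.region]
        entry["count"] += 1
        entry["total_ms"] += fixation.duration_ms
    return summary


# Illustrative, made-up sequence of fixations on a tool-like object.
example = [
    Fixation("handle", 200.0),
    Fixation("blade", 150.0),
    Fixation("handle", 100.0),
]
per_region = summarize_fixations(example)
```

A summary of this kind only ranks regions by how they were fixated; whether an early or frequently fixated region is actually diagnostic would still need to be tested separately, for instance through categorization performance.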
