Data analysis

SM Sarah L. Master
SL Shanshan Li
CC Clayton E. Curtis
ask Ask a question
Favorite

To confirm the success of our staircase procedure, we computed the mean response accuracy across “easy” and “hard” trials for all participants. We also computed the mean response times across task conditions. We ran planned t-tests to compare the mean accuracy and response times (RTs) across trial types.

To assess the time course of effort-related pupil dilation during the WM delay period, we ran a sliding window procedure comparing mean pupil size across hard and easy trials on the group level. Each window was 50 milliseconds in length (25 samples per condition). We ran a t-test over each window comparing the distributions of mean pupil dilations for hard and easy trials, with data aggregated across all 12 participants. As we tested 306 separate time windows (assessing both the stimulus presentation and delay periods), we corrected for multiple comparisons by assessing significance at a Bonferroni-corrected level of 0.00016. We then plotted the significance of timepoints during the delay period by overlaying stars at the center of each significant window (Figure 1D).

Within-subject, we assessed the significance of the effect of the difficulty manipulation on pupil dilation by assessing the confidence intervals around the predictor betas, and whether they included 0. Most participants’ pupils were significantly more dilated during the delay period of hard trials compared to easy trials (within-subject trial-by-trial regression analyses yielded significant difficulty terms in 9/12 participants). The difficulty manipulation significantly increased pupil dilation in the stimulus presentation period for 7/12 participants, and over the course of the entire trial in 7/12 participants. All results from the fMRI data were the same when we included only the 9 participants with significant pupil differences.

We averaged time series from each voxel across all retinotopy runs (9–12 per participant) in volume space and fit a pRF model for each voxel using a GPU-accelerated extension of vistasoft (github.com/clayspacelab/vistasoft). We fit a compressive spatial summation isotropic Gaussian model as implemented in mrVista (see34 for a detailed description of the model). We created a high-resolution stimulus mask (270 × 270 pixels) to ensure similar predicted response within each bar size across all visual field positions (to mitigate the effects of aliasing with a lower-resolution stimulus mask grid), and began with an initial high-density grid search, followed by subsequent nonlinear optimization. Note that, in these analyses, because we conducted a grid search on all voxels independently, there was no spatial smoothing of parameter estimates across neighboring voxels prior to the nonlinear parameter search. For all analyses described below, we used best-fit pRF parameters from this nonlinear optimization step.

After estimating pRF parameters for every voxel in the brain, ROIs were delineated by projecting pRF best-fit polar angle and eccentricity parameters with variance explained ≥10% onto each participant’s inflated surfaces via AFNI and SUMA. ROIs were drawn on the surface based on established criteria for polar angle reversals and foveal representations34,7274. Finally, ROIs were projected back into volume space to select voxels for analysis. In this report we considered data from V1, V2, V3, V3AB, intraparietal sulcus (IPS) area IPS0, IPS1, IPS2, IPS3, inferior precentral sulcus (iPCS), and superior PCS (sPCS). For simplicity, we combined data across ROIs based on shared foveal representations, resulting in larger ROIs for analysis: V1-V3, V3AB, IPS0-IPS1, IPS2-IPS3, iPCS, and sPCS. Only voxels with ≥10% variance explained in pRF model fits were included in subsequent fMRI analyses.

In addition to using these pRF parameters to draw functional ROIs in surface space, we used them to select specific voxels for analysis trial-by-trial. The pRF model provided an estimate of the location and size of each voxel’s receptive field (RF). Using the PRF-fit parameters for each voxel, we drew simple binary receptive fields containing the fit center of the RF and 15 degrees on either side. If the stimulus location fell within that voxel-specific box, then that voxel’s activity was labeled “in RF” on that trial. Voxels with receptive fields that contained the point 180 degrees away from the stimulus (or points ≤15 degrees away) were labeled “out RF”. We performed univariate analyses of delay period BOLD responses comparing across in RF and out RF voxels (“RF type” analyses).

To visualize the effect of task difficulty on average BOLD activity, we computed percent signal change of the BOLD signal, using the BOLD activity on the first 3 TRs of each trial as the denominator. Using the beginning of the trial as the baseline also linearly detrended the BOLD data. We computed percent signal change separately for each voxel within each ROI, then averaged across voxels, trials, and participants to produce average time courses within our conditions of interest. We then plotted the group mean time courses for each ROI. Standard error of the mean (SEM) was calculated across participants’ average time courses by taking the standard deviation across participants, then dividing by the square root of the sample size (N=12). SEM was then used to draw error bars. To assess differences between conditions, voxel types and ROIs, we performed a 3-way repeated measures ANOVA on late delay period activity (seconds 6–12 of the delay). We confirmed the significance of the F-values obtained by this ANOVA with permutation testing. We constructed null distributions of F values by shuffling the condition, ROI, and RF type labels 1000 different times, assessing the F-statistic from that shuffled dataset each time. We then compared the original F-statistic to the null distribution of F-statistics obtained by shuffling, and obtained a p-value from the number of F-statistics which were greater than or equal to the original F. Interactions between variables were investigated using post-hoc t-tests within ROIs, comparing across condition means (RF type and trial difficulty).

We trained and subsequently tested the Bayesian decoder (TAKFAP; see: Bayesian Decoding with TAKFAP) using the mean percent signal change of each voxel in each ROI from seconds 6 to 12 of the delay period, yielding one number per voxel per trial.

We estimated the magnitude of BOLD responses during the WM delay period (the delay period magnitude) with a general linear model (GLM) which included separate regressors for each trial. We created the trial-wise model in AFNI using 3dDeconvolve. For 10 participants, this resulted in 240 delay period regressors. Two participants completed only 228 and 216 trials, resulting in fewer regressors. Each regressor modeled the activity evoked during the delay of one single trial using a boxcar function, time-locked to delay period onset and convolved with an assumed hemodynamic response function (Gamma function) using the ‘GAM(p,q,d)’ function in AFNI. The parameters of the Gamma function were set to the default values (p=8.6, q=0.547), except for the duration (d) parameter, which was set to match the length of the delay period (12 seconds).

The GLM also included two regressors to account for average cue and response period activity and six motion regressors obtained using afni_proc.py during the motion correction step of preprocessing. The ordinary least squares solution to this trial-wise GLM yielded a unique activation estimate (β value) for each delay period.

We obtained trial-by-trial estimates of WM representational accuracy and uncertainty using a Bayesian decoder31,32. To implement this decoder, we used the publicly available TAFKAP toolbox (https://github.com/jeheelab/TAFKAP). The TAFKAP method provided trial-by-trial probability distributions over possible memorized stimulus values based on the patterns of BOLD activation on those trials. Specifically, in training the algorithm learned a generative model relating trial-by-trial voxel-wise activation patterns to the true location of the target stimulus on those trials. Then it inverted the learned associations between patterns of activation and stimulus locations to provide the posterior probability of the target stimulus’s location, centered on the mean (maximum likelihood) value:

Where b is the multivariate voxel response to each stimulus s, and θj is the set of free parameters which were trained on one set of training data j. To determine the likelihood of voxel patterns b given the stimulus on screen s, the decoder assumed that each voxel had a tuning curve which could be approximated as a weighted sum of eight basis functions which evenly tile location space. The basis functions were half-wave rectified sinusoidal functions raised to the power of 8 and centered on evenly spaced means φK:

The response of each voxel bi given stimulus s was then modeled as a weighted combination of these basis functions’ responses to s

The weighting matrix W determined the degree that the tuning and noise function from each basis function contributed to the response of each voxel. The noise term of each basis function ηk was applied to the same extent as the activation function of that f(s)k, capturing that similarly-tuned voxels exhibit similar noise. υi was the noise specific to voxel i. TAFKAP assumes that both sources of noise are normally distributed with mean 0. Fitting these individual noise terms for each voxel may not be completely possible, especially if the number of voxels to fit is larger than the number of trials in the dataset. Thus TAFKAP shrinks the sample covariance matrix to a simpler target covariance matrix, to a degree determined by free parameter λ. More details of this fitting procedure are available in32.

Our input to the model was the z-scored mean percent signal change during the late delay period (6–12 seconds of delay) for voxels within each ROI. Late delay period activity was z-scored within-run. We ran the model separately for each of our ROIs. We specified a flat prior p(s) over all possible target stimulus locations, as the polar angles of the target stimuli were drawn from a uniform distribution. Each set of training data was bootstrap resampled to improve the fidelity of the recovery process. For each trial in the test data, the decoding results from each resampling were averaged to obtain one decoded posterior probability distribution:

Then, a trial-specific estimate of the remembered stimulus location was obtained by taking the circular mean of the posterior inferred from the mean voxel-wise activity on that trial. The uncertainty around the estimated location was obtained by taking the circular standard deviation of the posterior. The TAFKAP algorithm has been used to relate representational accuracy and uncertainty to behavioral variability and confidence reports during perception and WM31,35. Thus there is evidence that TAFKAP reliably measures the quality of WM representations.

To test whether representational quality changed according to task demands, we performed a 2-way repeated measures ANOVA on WM decoding accuracy and uncertainty with task condition and ROI as the main factors. If we obtained a main effect of trial type or ROI, we ran permutation testing. The true F-scores for each main effect were compared to a null distribution of F-scores obtained by shuffling trial type and ROI labels 1000 times each. P-values were determined by computing the number of F-scores greater than or equal to the true F-score. In addition, we ran t-tests to assess condition differences in representational quality within each ROI.

In addition to understanding differences in condition effects in our ROIs, we aimed to quantify which ROIs contain decodable stimulus information during the delay period. To assess this, we ran circular correlations of decoded stimulus positions versus true stimulus positions trial-by-trial. We then compared the group mean circular correlation values against 0 using a t-test (Supplementary Figure 1).

Here and in other work42,43 we observed a tradeoff across ROIs between decoding accuracy and the magnitude of delay period activity. For example, sPCS demonstrated low decoding accuracy but high activation above baseline during the WM delay. To shed light on the distinct roles of and possible interactions between neural regions, we performed a trial-wise correlation analysis between decoding in visual cortex and amplitude of delay period activity in all voxels in the brain. Our goal was to test the hypothesis that the quality of WM representations in visual cortex were predicted by the magnitude of delay period activity in association cortex. To do so, we correlated delay period betas (magnitudes) with decoded error and uncertainty from visual cortex, specifically V1-V2-V3. With our Bayesian decoder, we obtained trial-wise probability distributions of the location of the memoranda, as a measure of the quality of WM representations. The memory error and uncertainty were derived from the mean and standard deviation of the posterior, respectively (see Bayesian decoding with TAFKAP). Note that the predicted relationship between delay period activity and decoding was likely present during both easy and hard trials. Thus, to maximize our power to detect the predicted correlation, we computed correlations across all trials. Future studies with more power from more trials might be able to test how task difficulty might mediate these correlations.

In order to interpret the correlation magnitudes, we obtained within-subject z-scores by applying an inverse hyperbolic tangent transform to the correlation coefficients, then used these for group-level testing and visualization. To reduce noise on the z-score maps, we applied a 5-mm full-width at half-maximum (FWHM) Gaussian kernel for spatial smoothing using AFNI’s 3dmerge function. To make map-based inferences at the group level, the z-transformed correlation maps of individual participants were spatially normalized from their native space into the normalized Montreal Neurological Institute (MNI 152) space. We first obtained each subject’s high-resolution anatomical image (0.8 mm isotropic voxel), aligning its orientation and voxel size with the standard MNI 152 template (MNI152_T1_2009c) using 3dresample. We then registered the anatomical images to the MNI template using the automatic Talairach (affine) transformation function (@auto_tlrc). With the output registration parameters obtained from registration, we then use 3dAllineate to transform individual z-maps into standard MNI coordinates.

To assess the significance of our effects on a group level, we performed a t-test using AFNI’s 3dttest++ function to identify voxels for which the mean of transformed and smoothed correlation coefficients across participants was significantly greater than zero. Once we obtained an estimate of which voxels showed significant effects, we mitigated the increased false alarm rate associated with running multiple comparisons by running a cluster permutation test to identify significant clusters, not voxels. We conducted a cluster-based simulation to estimate the probability of false positive clusters using 3dClustSim in AFNI. First, we calculated the smoothness of noise from our subject-level maps, using the 3dFWHMx function on each individual’s residual time series (errts files from AFNI 3dDeconvolve), and obtained subject level smoothness in three dimensions (x, y, z). Next, we determined the cluster size threshold using 3dClustSim (input averaged smoothness across participants, see Supplementary Table 1 for simulation results) and displayed the group-level results using the correct thresholding (Figure 5A).

After obtaining significant clusters at the group level, we wanted to verify that they were overlapping with our functionally-defined regions of interest. Because each subject’s ROIs were drawn in native, not template space, we first extracted each subject’s ROIs (see Retinotopic mapping and ROI definition) and transformed them into MNI template space. We then created an overlay in template space of all individuals’ ROI masks, to create a group-level functional ROI map. Each voxel in the overlay had a value from 0 to 12, indicating how many participants’ ROIs contained that voxel (Figure 5B). This map allowed for visual inspection of our significant clusters and the location of the original functional ROIs, to confirm their co-location.

Do you have any questions about this protocol?

Post your question to gather feedback from the community. We will also invite the authors of this article to respond.

post Post a Question
0 Q&A