The subjects’ dependencies on the direction of the previous stimulus and/or previous response in estimating the direction of the current stimulus were quantified by fitting the first derivative of a Gaussian (DoG) curve(s) to their response errors [28]. The DoG curve is given by

$$y = x\,a\,w\,c\,e^{-(wx)^{2}},$$

where $y$ is the response error, $a$ is the amplitude of the curve peaks, $w$ determines the width of the curve, and $c$ is a constant, $\sqrt{2}/e^{-0.5}$. The input to the function, $x$, was either the relative direction of the previous stimulus (i.e., $\text{Stimulus}_{t-1} - \text{Stimulus}_{t}$, with index $t$ for trial; Stimulus model) or the relative direction of the previous response (i.e., $\text{Response}_{t-1} - \text{Stimulus}_{t}$; Response model). To effectively capture the characteristic pattern in the estimation data (Fig. 2), we fitted a linear sum of two independent DoG curves, one receiving as input the relative direction of the previous stimulus, and the other receiving the relative direction of the previous response (Stimulus & Response model). To explore the possibility that the centers of biases are not exactly located on the previous stimulus or previous response, we fitted another linear sum of two independent DoG curves, one with a positive amplitude (i.e., $a_{\text{attraction}} > 0$), and the other with a negative amplitude (i.e., $a_{\text{repulsion}} < 0$; Attraction & Repulsion model). Crucially, the inputs to the curves, $x_{\text{attraction}}$ and $x_{\text{repulsion}}$, were each set to be a combination of the previous stimulus and previous response, relative to the current stimulus: i.e., $\beta_{\text{attraction}}\,\text{Response}_{t-1} + (1 - \beta_{\text{attraction}})\,\text{Stimulus}_{t-1} - \text{Stimulus}_{t}$ and $\beta_{\text{repulsion}}\,\text{Response}_{t-1} + (1 - \beta_{\text{repulsion}})\,\text{Stimulus}_{t-1} - \text{Stimulus}_{t}$, respectively. Here, $\beta_{\text{attraction}}$ and $\beta_{\text{repulsion}}$ are free parameters that determine the center of the corresponding bias curves. For example, for the attraction curve, if $\beta_{\text{attraction}}$ is one, $x_{\text{attraction}}$ reduces to $\text{Response}_{t-1} - \text{Stimulus}_{t}$, and the current response is attracted toward the previous response. If $\beta_{\text{attraction}}$ is zero, $x_{\text{attraction}}$ reduces to $\text{Stimulus}_{t-1} - \text{Stimulus}_{t}$, and the current response is attracted toward the previous stimulus.
Similarly, for the repulsion curve, if $\beta_{\text{repulsion}}$ is one, the current response is repelled away from the previous response, and if $\beta_{\text{repulsion}}$ is zero, the current response is repelled away from the previous stimulus. More generally, the beta parameters can be interpreted as the center of the corresponding bias curves in the stimulus space normalized such that the previous stimulus is zero and the previous response is one.
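The curve and the model inputs described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: it assumes the commonly used DoG form $y = x\,a\,w\,c\,e^{-(wx)^2}$ with $c = \sqrt{2}/e^{-0.5}$, it does not wrap direction differences onto the circle, and the function names (`dog`, `width_from_peak`, `bias_input`, `attraction_repulsion`) and all parameter values are hypothetical.

```python
import math

# Constant c = sqrt(2) / exp(-0.5); it scales the curve so its peaks equal a.
C = math.sqrt(2.0) / math.exp(-0.5)


def dog(x, a, w):
    """First derivative of a Gaussian: y = x * a * w * c * exp(-(w * x)**2)."""
    return x * a * w * C * math.exp(-((w * x) ** 2))


def width_from_peak(peak_deg):
    """Width w that places the curve's extrema at x = +/- peak_deg."""
    return 1.0 / (math.sqrt(2.0) * peak_deg)


def bias_input(prev_stim, prev_resp, cur_stim, beta):
    """beta * Response(t-1) + (1 - beta) * Stimulus(t-1) - Stimulus(t)."""
    return beta * prev_resp + (1.0 - beta) * prev_stim - cur_stim


def attraction_repulsion(prev_stim, prev_resp, cur_stim, p):
    """Predicted response error: sum of an attractive and a repulsive DoG."""
    x_att = bias_input(prev_stim, prev_resp, cur_stim, p["beta_att"])
    x_rep = bias_input(prev_stim, prev_resp, cur_stim, p["beta_rep"])
    return dog(x_att, p["a_att"], p["w_att"]) + dog(x_rep, p["a_rep"], p["w_rep"])


# Hypothetical parameter values, chosen only for illustration.
params = {
    "a_att": 5.0, "w_att": width_from_peak(30.0), "beta_att": 1.0,
    "a_rep": -5.0, "w_rep": width_from_peak(60.0), "beta_rep": 0.0,
}
predicted_error = attraction_repulsion(prev_stim=20.0, prev_resp=25.0,
                                       cur_stim=0.0, p=params)
```

With `beta_att = 1.0` the attraction curve is centered on the previous response, and with `beta_rep = 0.0` the repulsion curve is centered on the previous stimulus, matching the two limiting cases described in the text.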
All parameters were estimated using a hierarchical Bayesian approach that uses the aggregated information from the entire population sample to inform and constrain the parameter estimates for each individual [111]. Specifically, we assumed a hierarchical prior on parameters, in which the parameters for each individual were drawn from independent Gaussian or von Mises distributions characterizing the population distributions of the model parameters. At the trial level, the response error was modeled following a von Mises distribution
with indices $i$ for individual subject and $t$ for trial (specifically for the description of the hierarchical Bayesian model, we use parentheses for indices that specify the hierarchical level). The mean of the von Mises distribution, $\mu(i, t)$, was determined by a DoG curve or a linear sum of two DoG curves depending on the model (see above). At the individual level, the model parameters were constrained by population-level parameters. Specifically, all parameters at the individual level were parameterized using Gaussian or von Mises distributions.
where the quantity $1/(\sqrt{2}\,w(i))$ directly represents the peak location of the DoG curve. The noise standard deviation $\sigma(i)$ was further restricted to positive values. Specifically for the Attraction & Repulsion model, the amplitude parameters $a_{\text{attraction}}$ and $a_{\text{repulsion}}$ were further restricted to positive and negative values, respectively. At the population level, priors on the mean of the population distributions were set to broad distributions with ranges large enough to cover all practically plausible values.
Note that the priors on $\mu_{\beta}$ favored neither the previous stimulus (i.e., $\beta = 0$) nor the previous response (i.e., $\beta = 1$) as the center of the bias (i.e., neutral priors). Again, specifically for the Attraction & Repulsion model, the population means of the amplitude parameters, $\mu_{a_{\text{attraction}}}$ and $\mu_{a_{\text{repulsion}}}$, were restricted to positive and negative values, respectively.
Priors on the standard deviation of the population distributions were set to gamma distributions
with their shape parameter $s$ and rate parameter $r$ set so that their mode and standard deviation would be approximately half and twice the standard deviation of the individual-level parameters, respectively, making the hyper-priors vague on the scale of the data [111]. Based on earlier reports, we assumed that the standard deviation of the amplitude, peak location, and noise standard deviation across individual subjects would be 1°, 5°, and 2°, respectively, and parametrized the hyper-priors accordingly. For the beta parameters in the Attraction & Repulsion model, we naively assumed that the standard deviation across individual subjects would be 0.15.
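As a worked illustration of how such hyper-priors can be parametrized, the sketch below converts a desired mode and standard deviation into a gamma shape and rate. It relies only on the standard gamma identities mode $= (s - 1)/r$ and standard deviation $= \sqrt{s}/r$ (valid for $s > 1$); solving them gives $r = (m + \sqrt{m^2 + 4\,sd^2})/(2\,sd^2)$ and $s = 1 + m r$. The function name `gamma_shape_rate` and the dictionary keys are hypothetical, and the exact values the authors used are not stated in the text.

```python
import math


def gamma_shape_rate(mode, sd):
    """Shape s and rate r of a gamma distribution with the given mode and SD."""
    rate = (mode + math.sqrt(mode ** 2 + 4.0 * sd ** 2)) / (2.0 * sd ** 2)
    shape = 1.0 + mode * rate
    return shape, rate


# Assumed across-subject standard deviations quoted in the text.
assumed_sd = {"amplitude": 1.0, "peak_location": 5.0, "noise_sd": 2.0, "beta": 0.15}

hyper_priors = {}
for name, sd_individual in assumed_sd.items():
    # Mode at half, standard deviation at twice, the assumed individual-level SD.
    hyper_priors[name] = gamma_shape_rate(0.5 * sd_individual, 2.0 * sd_individual)
```

Each entry of `hyper_priors` is a `(shape, rate)` pair; substituting it back into the two identities recovers the requested mode and standard deviation.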
We used a Markov chain Monte Carlo (MCMC) technique, specifically a Metropolis-Hastings algorithm, to compute the posterior probability density of the parameters. Initial values of the individual-level parameters were set to amplitude = 0°, peak location = 30°, and noise standard deviation = 10°. For the Attraction & Repulsion model, in which the amplitude parameters $a_{\text{attraction}}$ and $a_{\text{repulsion}}$ were constrained to be positive and negative, respectively, we set them to +5° and −5°, considering the fitting results of the Stimulus & Response model. Initial values of the beta parameters $\beta_{\text{attraction}}$ and $\beta_{\text{repulsion}}$ were set to 0.5, favoring neither the previous stimulus (i.e., $\beta = 0$) nor the previous response (i.e., $\beta = 1$) as the center of the bias. Initial values of the population means were set to be the same as those of the individual-level parameters, and initial values of the population standard deviation of amplitude, peak location, noise standard deviation, and beta parameters were set to 1°, 5°, 2°, and 0.15, respectively. We ran four independent chains in parallel, each with an independent random number generator, and discarded the first one million iterations of each chain as a burn-in period, thereby minimizing the influence of the initial values of the model parameters. After the burn-in period, the subsequent one million samples from each chain were used to estimate the posterior probability density function. We further thinned the samples by keeping every 1000th sample in each chain, resulting in a final set of 4000 samples for each parameter and reducing autocorrelations in the samples to near zero. Convergence of the chains was confirmed by visual inspection of trace plots and Gelman–Rubin tests [112]. All parameters in the models had $\hat{R}$ < 1.1, suggesting that all chains successfully converged to the target posterior distribution.
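The sampling workflow described above (independent chains, burn-in, thinning, and a Gelman–Rubin check) can be sketched on a toy problem. The example below is not the authors' sampler: it uses a random-walk Metropolis step on a one-dimensional standard normal target as a stand-in for the model posterior, and the iteration counts are scaled down by several orders of magnitude. All names (`log_target`, `metropolis_chain`, `gelman_rubin`) and tuning values are assumptions made for illustration.

```python
import math
import random
import statistics


def log_target(theta):
    """Toy log-density (standard normal) standing in for the model posterior."""
    return -0.5 * theta * theta


def metropolis_chain(n_iter, init, step, seed):
    """Random-walk Metropolis sampler with its own random number generator."""
    rng = random.Random(seed)
    theta, logp = init, log_target(init)
    samples = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if math.log(1.0 - rng.random()) < logp_new - logp:
            theta, logp = proposal, logp_new
        samples.append(theta)
    return samples


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Scaled-down settings (the text uses 1,000,000 burn-in, 1,000,000 kept, thinning 1000).
N_BURN, N_KEEP, THIN = 2000, 20000, 20

chains = []
for seed in range(4):  # four independent chains
    draws = metropolis_chain(N_BURN + N_KEEP, init=5.0, step=2.4, seed=seed)
    chains.append(draws[N_BURN::THIN])  # drop burn-in, then keep every THIN-th draw

r_hat = gelman_rubin(chains)
n_total = sum(len(c) for c in chains)
```

Here each chain contributes 1000 thinned samples, so `n_total` is 4000, mirroring the final sample count reported in the text. An `r_hat` below 1.1 is the convergence criterion the text applies.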
For the statistical significance of the model parameters, we report the mode and 95% credible interval of the posterior distribution of the population-level mean parameters, along with the results of a classical one-sample t test on the individual-level parameter estimates after testing for normality using a Kolmogorov–Smirnov test. Similarly, to statistically compare the model parameters, we report the mode and 95% credible interval of the posterior distribution of the difference between population-level mean parameters, along with the results of a paired-sample t test on the individual-level parameter estimates.
Model predictions shown in Fig. 3A–C are the point estimates of the response error made with the population-level parameter estimates for each model. We used the Akaike information criterion (AIC) for model comparison. To report the AIC, we computed the AIC for each model and each individual using the individual-level parameter estimates and then averaged the AIC difference from the best-fitting model (i.e., the one with the lowest mean AIC value) across subjects. The confidence interval for the mean AIC difference was computed by bootstrapping using the bias-corrected and accelerated (BCa) percentile method (10,000 bootstrap samples).
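The model comparison procedure can be illustrated with a short, self-contained sketch. It assumes the usual definition AIC $= 2k - 2\ln L$ (lower values indicate a better fit) and implements a textbook BCa bootstrap interval with the standard library only. The per-subject log-likelihoods, the parameter counts, and the names `model_A`, `model_B`, `aic`, and `bca_interval` are hypothetical and are not taken from the study.

```python
import random
from statistics import NormalDist, fmean


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_lik


def bca_interval(data, n_boot=10000, alpha=0.05, seed=0):
    """Bias-corrected and accelerated bootstrap interval for the mean."""
    rng = random.Random(seed)
    n = len(data)
    observed = fmean(data)
    boots = sorted(fmean(rng.choices(data, k=n)) for _ in range(n_boot))
    norm = NormalDist()
    # Bias correction: share of bootstrap means below the observed mean.
    prop = sum(b < observed for b in boots) / n_boot
    prop = min(max(prop, 1.0 / n_boot), 1.0 - 1.0 / n_boot)  # keep inv_cdf finite
    z0 = norm.inv_cdf(prop)
    # Acceleration from jackknife (leave-one-out) means.
    jack = [fmean(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(q):
        z = norm.inv_cdf(q)
        return norm.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    def pick(q):
        return boots[min(n_boot - 1, max(0, int(q * n_boot)))]

    return pick(adjusted_quantile(alpha / 2)), pick(adjusted_quantile(1 - alpha / 2))


# Hypothetical per-subject log-likelihoods and parameter counts.
log_lik = {
    "model_A": [-510.2, -498.7, -505.1, -520.4, -501.9, -515.3],
    "model_B": [-500.1, -495.0, -499.8, -509.9, -500.2, -503.6],
}
n_params = {"model_A": 3, "model_B": 5}

aic_values = {m: [aic(ll, n_params[m]) for ll in lls] for m, lls in log_lik.items()}
mean_aic = {m: fmean(v) for m, v in aic_values.items()}
best = min(mean_aic, key=mean_aic.get)  # lowest mean AIC = best-fitting model

# Per-subject AIC difference of model_A from the best model, then its mean and CI.
diffs = [a - b for a, b in zip(aic_values["model_A"], aic_values[best])]
mean_diff = fmean(diffs)
ci_low, ci_high = bca_interval(diffs)
```

In practice a library routine may be preferable to a hand-written interval; SciPy's `scipy.stats.bootstrap`, for example, provides a BCa option.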