Sequential Reaching Task for the Study of Motor Skills in Monkeys.

The ability to perform a sequence of movements is a key component of motor skills, such as typing or playing a musical instrument. How the brain binds elementary movements together into meaningful actions has been a topic of much interest. Here, we describe two sequential reaching tasks that we use to investigate the neural substrate of skilled sequential movements in monkeys after long-term practice. The movement elements performed in these tasks are essentially identical, but are generated in two different contexts. In one task, monkeys perform reaching movements that are instructed by visual cues. In the other, the monkeys perform reaching movements that are generated from memory after extended practice. With this behavioral paradigm, we can dissociate the neural processes related to the acquisition and retention of motor skills from those related to movement execution.

How the brain acquires the associations between elements and orchestrates the production of the resulting sequence is a central question in neuroscience. The neural basis of sequential movements has been studied using various behavioral tasks in human and non-human primate subjects. For example, non-human primates were trained to perform a sequence of arm movements by manipulating a handle in a predetermined order (e.g., push-pull-turn; Tanji and Shima, 1994) or a sequence of button presses on a board (Nakamura et al., 1998). In humans, the Serial Reaction Time (SRT) task has been widely used to examine procedural learning (Nissen and Bullemer, 1987; Robertson, 2007). In many sequential tasks, however, the subject is required to pause between sequential elements, which may break up the natural fluidity of skilled sequential movements (Nissen and Bullemer, 1987; Tanji and Shima, 1994; Nakamura et al., 1998). In addition, sequential tasks are often designed to be mastered within a relatively short period of training compared to the normally prolonged and staged development of motor skills (Tanji and Shima, 1994; Karni et al., 1995; Nakamura et al., 1998; Dayan and Cohen, 2011; Wymbs and Grafton, 2013). We have devised a variant of the SRT task to overcome these impediments. In our behavioral paradigm, subjects perform the task continuously at their own pace. The design of our behavioral paradigm enables us to detect slow improvement of motor skills as a change in performance parameters. The learning curve of monkeys, even for short sequences, extends over substantial amounts of time with continued practice (up to 2 years; Matsuzaka et al., 2007). Furthermore, in our task, subjects execute essentially identical movements in two different contexts: memory-guided and visually-guided.
While both contexts engage processes related to the production of motor output, only the former engages processes related to motor learning and memory. Thus, our behavioral paradigm allows us to disentangle the substrates for the learning of sequential movements from those for motor execution.
Our paradigm has been instrumental in revealing neural properties associated with the acquisition and maintenance of skilled sequential movements in motor areas of monkeys (Picard and Strick, 2003; Matsuzaka et al., 2007; Picard et al., 2013; Ohbayashi et al., 2016; Ohbayashi, 2020).
The monkey is required to touch the target to make a correct response. The monkey receives liquid as reward. We have used touch-sensitive monitors based on infrared and surface acoustic wave technology.
While any touch-screen technology will be adequate for monkey and human performance, the choice of technology may be a consideration for certain applications.
3. Reward delivery system (bottle, tubing, solenoid and straw)
Our system is gravity-based and controlled by a solenoid (Parker Hannifin Co., NJ) positioned between the liquid bottle and the tube that the monkey drinks from (Figure 1). The solenoid is normally closed and is opened by a DC pulse sent from the task controller. In our experience, the opening of the valve makes an audible click.
4. Speaker to generate a tone for feedback of the task performance (i.e., correct or error)

Program to control the behavioral task and record task performance
We use the TEMPO Experiment Control System from Reflective Computing Inc. (http://www.reflectivecomputing.com/). Any programmable software that can generate video displays and current pulses and acquire equipment-generated analog or digital signals would be appropriate.
3. The monkey's non-working arm is restrained to the arm holder, which is part of the primate chair, using Velcro straps or an adjustable cuff (e.g., Christ Chair system).
4. We train the monkeys to perform the tasks using standard reinforcement-based conditioning procedures.

B. General Task Description
The monkeys are trained to perform the Random and the Repeating tasks in alternating blocks. In both tasks, the monkeys are required to make reaching movements with their right arm to targets displayed on a touch-sensitive monitor. However, the spatial visual cues are presented in a different order and with different timing in the two tasks (for details see Procedure C). There is no visual feature to distinguish the Random and Repeating tasks. Here, we describe general features that are common to both tasks.
2. When the task starts, a trial is initiated and one of the targets is filled with yellow color (Figure 2).
3. To make a correct response, the monkey is required to contact the yellow target within 800 ms after its coloring.
5. Immediately after the animal's contact on the monitor, the yellow fill disappears, and a new trial starts.
6. The task controller generates a feedback sound for each response (correct: 1 kHz tone, 100 ms duration; error: 50 Hz tone, 100-200 ms duration). In the case of errors or no response, the trial is immediately repeated.
7. The task controller generates a reward after a number of correct responses. Initially, the monkey receives a reward for every correct response. When the monkey understands the rules of the task, the reward frequency is gradually decreased to every 4-5 correct responses. The rate of reward is adjusted for each animal and, in the Repeating task, according to sequence length, so that the rewards are not associated with particular sequence elements (e.g., every 4th correct response for a 3-element sequence). A current pulse generated by the task controller opens the solenoid in the reward delivery system for a period of time. This duration varies depending on the setup of the reward delivery system (i.e., height of the water bottle, tubing size).
8. Once the training session is initiated, the monkeys typically perform the task, touching one target after another without interruption, until satiety.
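The reward rule in step 7 can be illustrated with a short sketch. The protocol only gives the example of rewarding every 4th correct response for a 3-element sequence; requiring the reward interval and the sequence length to share no common factor is our assumption about one way to keep rewards from landing on the same sequence elements. The function names and the candidate intervals (4, 5) are hypothetical and are not part of the TEMPO task program.

```python
from math import gcd


def choose_reward_interval(sequence_length, candidates=(4, 5)):
    """Pick a reward interval (in correct responses) for the Repeating task.

    Assumption: an interval that is coprime with the sequence length makes
    the reward cycle through every sequence element instead of a subset.
    """
    for interval in candidates:
        if gcd(interval, sequence_length) == 1:
            return interval
    raise ValueError("no candidate interval is coprime with the sequence length")


def rewarded_positions(sequence_length, interval, n_correct):
    """Return the 0-based sequence positions that receive a reward.

    Assumes an error-free run of n_correct responses, with a reward
    delivered on every `interval`-th correct response.
    """
    return [(k - 1) % sequence_length
            for k in range(interval, n_correct + 1, interval)]


if __name__ == "__main__":
    interval = choose_reward_interval(3)
    print(interval, rewarded_positions(3, interval, 12))
```

With a 3-element sequence and a reward on every 4th correct response, the rewarded position shifts by one element each time, whereas a reward on every 3rd response would always fall on the same element.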
C. The Random and the Repeating Tasks
1. In the Random task, new targets are presented according to a pseudo-random sequence 100 ms after a correct response (Figures 2A and 3A). With this short delay, the monkeys have no time to try to guess the location of the next target. Therefore, in the Random task, the monkey performs visually-guided reaching.
(Figures 2C-2D). The number of movement elements in a sequence can be modified according to the experiment objective. We trained monkeys successfully on sequences of 3-12 elements in length. In the Repeating task, new targets are presented 400 ms after contact of the correct target (Figures 2B and 3B). The monkeys are allowed to touch the new target during the 400 ms delay, before the presentation of the visual cue (yellow fill). When this happens, the current trial ends without coloring the target, the trial is recorded as correct and the task advances to the next trial/target. The delay promotes the performance of predictive responses in which the animal anticipates the next target in a sequence (Figures 2B right panel, 3C), but does not inhibit the flow of movements in a learned sequence. With practice, the monkey performs internally-guided reaching.
3. Each task is performed continuously in alternating blocks of 200-500 trials until the monkey stops working, for a total usually on the order of 5,000 (Cebus) or 12,000 (Macaque) trials (Figure 3D). Monkeys are supplemented with water if they do not obtain enough during training.
D. Training Schedule
1. We first train the monkey on the Random task. Monkeys become proficient in the performance of the Random task after about 50 days of practice. The monkeys make correct responses in > 80% of trials within 10 training sessions. In the Random task, performance parameters (e.g., RT) stop improving after approximately 50 training sessions.
2. We then introduce the 1st sequence of the Repeating task. From then on, the monkey practices the sequence of the Repeating task and the Random task on every training session in alternating blocks of trials (Figure 3D). There is no interruption or visual distinction between blocks of Repeating and Random tasks.
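As an illustration of how the two tasks differ only in target order and cue delay, the following minimal sketch generates the target stream for one block. It is not the TEMPO task program. The number of targets (5), the example sequence (5, 3, 1), and the rule that a pseudo-random target never repeats the previous one are assumptions made for the example; the 100 ms and 400 ms delays come from Procedure C. The sketch also assumes every response is correct, so the repetition of error trials is not modeled.

```python
import random

# Delay between a correct contact and the next cue (yellow fill), per Procedure C.
CUE_DELAY_MS = {"random": 100, "repeating": 400}


def target_stream(task, n_trials, n_targets=5, sequence=(5, 3, 1), rng=None):
    """Yield (target, cue_delay_ms) for each trial of one block.

    n_targets, sequence, and the no-immediate-repeat rule in the Random
    task are hypothetical choices, not values taken from the protocol.
    """
    rng = rng or random.Random()
    previous = None
    for trial in range(n_trials):
        if task == "repeating":
            # Fixed order, cycled without interruption.
            target = sequence[trial % len(sequence)]
        else:
            # Pseudo-random order; avoid presenting the same target twice in a row.
            options = [t for t in range(1, n_targets + 1) if t != previous]
            target = rng.choice(options)
        previous = target
        yield target, CUE_DELAY_MS[task]


if __name__ == "__main__":
    print(list(target_stream("repeating", 6)))
    print(list(target_stream("random", 6, rng=random.Random(0))))
```

Because both tasks draw from the same set of targets and use the same cue, a block of either task looks identical on the screen; only the order and the delay differ.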

Data analysis
For every trial, we record various task parameters and measures of performance. Recorded performance measures are: correct response, wrong hit error, no hit error or corrective response (correct responses that followed an error). From the times of touch screen hits, we derive the Movement Time (MT), Target Hold Time (THT), and Response Time (RT) associated with each response. We define MT as the interval between the release of contact from one target and the touch of the next target. We define THT as the interval between the contact of a target and its release (in the next trial). We define RT during the Random task as the time between the presentation of a new target and contact of that target. RT during the Repeating task is defined as the time between two target touches minus the delay time, 400 ms. This results in a negative RT when the monkey contacts the next target in the sequence before its coloring (Figure 3C). RT less than 150 ms is considered to be predictive (Ohbayashi et al., 2016; Ohbayashi, 2020). This threshold represents a conservative cut-off for predictive responses, as it is too fast for a reaction time to the visual cue.
We exclude the following trials from analysis: 1) corrective responses, because in this case the target is predictable as the error trial is repeated; 2) no-hit error responses, because it is impossible to determine what caused the monkey not to respond in the allowed time frame (e.g., low motivation, distraction, or genuine hesitation).
We perform a movement-based analysis of performance and experimental data. Data obtained for movements of the Repeating task are best compared with data for the corresponding movements performed during the Random task (e.g., data for movement 5 to 3 during the Repeating task vs. data for movement 5 to 3 during the Random task). Because movements of the Repeating sequence are a subset of the movements of the Random task, more Random trials may be necessary to obtain a sufficient sample for analysis.
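The timing measures defined above can be computed from contact and release times as in the following minimal sketch. The `Touch` record, its field names, and the outcome labels are hypothetical and do not reflect the format of our data files. For the Random task, the sketch assumes that the cue appears exactly 100 ms after the previous contact, so that RT can be written as the inter-touch interval minus the delay in both tasks.

```python
from dataclasses import dataclass

CUE_DELAY_MS = {"random": 100.0, "repeating": 400.0}  # delays from Procedure C
PREDICTIVE_RT_MS = 150.0                              # cut-off for predictive responses
EXCLUDED_OUTCOMES = {"corrective", "no_hit_error"}    # trials left out of the analysis


@dataclass
class Touch:
    """One screen touch (hypothetical record layout); times are in ms."""
    contact_ms: float
    release_ms: float
    outcome: str = "correct"


def measures(prev, cur, task):
    """Derive MT, THT, and RT for `cur`, given the preceding touch `prev`."""
    # RT: time between two target touches minus the cue delay (negative when
    # the target is contacted before it is colored).
    rt = (cur.contact_ms - prev.contact_ms) - CUE_DELAY_MS[task]
    return {
        "MT": cur.contact_ms - prev.release_ms,   # release of one target to touch of the next
        "THT": cur.release_ms - cur.contact_ms,   # contact of a target to its release
        "RT": rt,
        "predictive": rt < PREDICTIVE_RT_MS,
        "include": cur.outcome not in EXCLUDED_OUTCOMES,
    }


if __name__ == "__main__":
    prev = Touch(contact_ms=0.0, release_ms=120.0)
    early = Touch(contact_ms=350.0, release_ms=480.0)
    print(measures(prev, early, "repeating"))
```

In this example the second target is contacted 350 ms after the first, i.e., 50 ms before the cue would have appeared in the Repeating task, which yields a negative RT and a response classified as predictive.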