Assessment of network performance

Cristiano Cuppini, Mauro Ursino, Elisa Magosso, Lars A. Ross, John J. Foxe, Sophie Molholm

We performed several simulations to test network behavior before, during, and at the end of the training process, modeling performance for unisensory (auditory-alone and visual-alone) and multisensory inputs (congruent and incongruent audiovisual representations).

The network was stimulated with inputs simulating an auditory-alone, a visual-alone, or a congruent visual-auditory (multisensory) speech event. As described above, the speech event was assumed to be correctly recognized if the barycenter of the evoked activity above the detection threshold (30% of the maximum activity) in the pSTG/S area matched the element coding for the speech event presented as input to the model. We used different levels of auditory input amplitude, ranging from an ineffective level (which minimally activated the auditory speech region and generated 0% correct identifications in the model; efficacy level 1) to a maximum level (which saturated the auditory evoked activity and generated more than 80% correct phoneme identifications in the adult; efficacy level 7). The use of different auditory efficacy levels allowed us to mimic speech recognition at different auditory signal-to-noise ratios, as in previous work (Ross et al., 2011; Foxe et al., 2015). In contrast, the efficacy of the visual stimulus was held constant: we chose a visual level at which, in the adult configuration, the model showed a poor ability to detect speech based on visual information alone. Critically, this mimicked what we see in our experimental work (Foxe et al., 2015).
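The barycenter-based recognition rule described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the array size, the example activity values, and the `decode_phoneme` name are assumptions, and activity is normalised so that 1.0 is the saturation level.

```python
import numpy as np

def decode_phoneme(activity, threshold_frac=0.3):
    """Decode the perceived speech unit from pSTG/S activity.

    A speech event is detected only if some neuron exceeds
    `threshold_frac` of the maximum activity (normalised here so that
    1.0 is saturation); the percept is the barycenter (centre of mass)
    of the supra-threshold activity, rounded to the nearest phoneme
    index.  None means no detection."""
    activity = np.asarray(activity)
    mask = activity >= threshold_frac
    if not mask.any():
        return None                      # nothing crossed the threshold
    idx = np.nonzero(mask)[0]
    barycenter = np.sum(idx * activity[idx]) / np.sum(activity[idx])
    return int(round(barycenter))

# Example: supra-threshold activity centred on neuron 12.
act = np.zeros(40)
act[10:15] = [0.2, 0.6, 1.0, 0.6, 0.2]
# decode_phoneme(act) -> 12
```

Recognition is then scored by comparing the decoded index against the index of the phoneme actually presented as input.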

Since the presence of noise introduced variability in the network's outcome, for every level of efficacy of auditory input, unisensory and multisensory speech-detection abilities were evaluated for 100 speech events. To evaluate the acquisition of speech perception under unisensory and multisensory conditions, we computed the mean responses across all levels of input efficacy at different epochs of training.
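The trial-averaging procedure can be sketched as below. The `simulate_trial` dynamics (a Gaussian activity bump scaled by the efficacy level plus additive noise) are a hypothetical stand-in for the real network, and all parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PHONEMES = 40
N_TRIALS = 100                  # speech events per efficacy level
EFFICACY_LEVELS = range(1, 8)   # 1 (ineffective) .. 7 (saturating)

def simulate_trial(target, efficacy, noise_sd=0.08):
    """Hypothetical stand-in for one network simulation: a Gaussian
    activity bump centred on the target phoneme, scaled by the auditory
    efficacy level and corrupted by additive noise."""
    pos = np.arange(N_PHONEMES)
    bump = (efficacy / 7.0) * np.exp(-0.5 * ((pos - target) / 1.5) ** 2)
    activity = np.clip(bump + rng.normal(0.0, noise_sd, N_PHONEMES), 0.0, 1.0)
    # Recognition: barycenter of supra-threshold activity (30% of maximum).
    mask = activity >= 0.3
    if not mask.any():
        return False                     # no detection
    idx = np.nonzero(mask)[0]
    percept = int(round(np.sum(idx * activity[idx]) / np.sum(activity[idx])))
    return percept == target

# Mean recognition accuracy over 100 noisy trials at each efficacy level.
accuracy = {lvl: np.mean([simulate_trial(20, lvl) for _ in range(N_TRIALS)])
            for lvl in EFFICACY_LEVELS}
```

Because noise makes each trial stochastic, only the average over the 100 events per efficacy level is meaningful; in this toy version accuracy rises from near zero at level 1 toward ceiling at level 7.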

The network was stimulated with an auditory or a congruent visual-auditory (multisensory) speech representation, and we evaluated the time necessary for the pSTG/S neuron coding for the specified speech unit to reach the detection threshold (30% of its maximum activity). The configuration of the inputs and the simulations were the same as in the previous task. The mean response (in terms of recognition time) over the 100 outcomes at each level of efficacy was computed separately for the unisensory and multisensory conditions. Moreover, we evaluated these data both at an early stage of maturation (after 2,500 training epochs) and in the mature configuration.
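The recognition-time measure can be illustrated with a short sketch. The exponential activity traces are toy stand-ins for the network dynamics, and the time step and time constants are assumptions, not values from the model.

```python
import numpy as np

def recognition_time(trace, max_activity=1.0, threshold_frac=0.3, dt=1.0):
    """Return the first time at which the pSTG/S neuron coding the
    presented speech unit crosses `threshold_frac` (30%) of its maximum
    activity, or None if it never does; `dt` is the simulation step."""
    threshold = threshold_frac * max_activity
    above = np.nonzero(np.asarray(trace) >= threshold)[0]
    return above[0] * dt if above.size else None

# Toy activity traces (hypothetical time constants, arbitrary units):
# the multisensory input drives the coding neuron toward saturation faster.
t = np.arange(0.0, 200.0, 1.0)
unisensory = 1.0 - np.exp(-t / 60.0)     # auditory-alone, slower rise
multisensory = 1.0 - np.exp(-t / 30.0)   # congruent audiovisual, faster rise
```

Averaging this quantity over the 100 noisy outcomes per efficacy level gives the mean recognition time compared between conditions.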

We assessed whether the network was able to reproduce the McGurk effect, whereby conflicting auditory and visual speech inputs lead to an illusory speech percept that does not correspond to the percept evoked by the same auditory input presented in isolation. In this case, the network was stimulated with mismatched visual-auditory speech inputs, with the visual representation shifted by 4 positions (i.e., outside the receptive field of the veridical corresponding speech unit) with respect to the auditory one. During these simulations: (i) we verified whether the activity in the multisensory area exceeded the detection threshold; (ii) in case of detection, phoneme recognition was assessed by computing the barycenter of the supra-threshold activity in the multisensory region and selecting the closest phoneme. We assumed that the McGurk effect occurred when the detected phoneme differed from that used in the auditory input. Each phoneme was stimulated 20 times by its auditory representation at each level of efficacy, coupled with the visual representation of a phoneme 4 positions away, and the network response was averaged over all phoneme representations and all levels of auditory input efficacy. We also assessed whether the network was sensitive to the McGurk effect at different training epochs.
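The McGurk simulation logic can be sketched as follows. The additive fusion of two Gaussian bumps is a toy stand-in for the network's multisensory integration, and all amplitudes, widths, and the `mcgurk_percept` name are assumptions.

```python
import numpy as np

N_PHONEMES = 40
SHIFT = 4      # offset of the visual representation vs. the auditory one

def gaussian_bump(center, amp, sigma=1.5):
    """Gaussian activity profile over the phoneme axis."""
    pos = np.arange(N_PHONEMES)
    return amp * np.exp(-0.5 * ((pos - center) / sigma) ** 2)

def mcgurk_percept(aud_phoneme, aud_amp, vis_amp=0.5, threshold=0.3):
    """Toy fusion: multisensory activity is the sum of the auditory bump
    and a visual bump shifted by SHIFT positions.  The percept is the
    barycenter of the supra-threshold activity, rounded to the nearest
    phoneme; None means nothing crossed the detection threshold."""
    activity = (gaussian_bump(aud_phoneme, aud_amp)
                + gaussian_bump(aud_phoneme + SHIFT, vis_amp))
    mask = activity >= threshold
    if not mask.any():
        return None
    idx = np.nonzero(mask)[0]
    return int(round(np.sum(idx * activity[idx]) / np.sum(activity[idx])))

# A McGurk-like outcome: the fused percept is pulled away from the
# auditory phoneme (20) toward the mismatched visual phoneme (24).
percept = mcgurk_percept(aud_phoneme=20, aud_amp=0.6)
```

In this sketch the effect is scored exactly as in the text: a detection whose decoded phoneme differs from the auditory input counts as a McGurk percept.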
