A simple neural-network architecture comprising a single hidden layer with two nodes was shown by de Taillez et al.23 to yield the best performance among a group of more complicated networks they considered. Our adaptation of that network, shown in Fig. 3, includes batch normalization32 before the inputs to each layer, and a hard hyperbolic tangent (rather than a linear function) as the output layer's activation to enforce our prior expectation that the audio envelope is bounded.
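For concreteness, a minimal PyTorch sketch of this architecture is given below. The class name, input dimensionality, and the absence of a hidden-layer activation (which is not specified in the text above) are illustrative assumptions, not details of the published implementation.

```python
import torch
import torch.nn as nn

class StimulusReconstructionNet(nn.Module):
    """Hypothetical sketch of the single-hidden-layer, two-node network."""

    def __init__(self, in_features: int, hidden_nodes: int = 2):
        super().__init__()
        self.bn1 = nn.BatchNorm1d(in_features)           # batch normalization before FC1
        self.fc1 = nn.Linear(in_features, hidden_nodes)  # hidden layer (FC1) with two nodes
        self.bn2 = nn.BatchNorm1d(hidden_nodes)          # batch normalization before FC2
        self.fc2 = nn.Linear(hidden_nodes, 1)            # output layer (FC2): one envelope sample
        self.out_act = nn.Hardtanh()                     # hard tanh bounds the output to [-1, 1]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # No hidden-layer activation is applied here, since none is
        # specified in the description above.
        h = self.fc1(self.bn1(x))
        return self.out_act(self.fc2(self.bn2(h)))
```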
The neural network architecture for stimulus reconstruction, based on the design in de Taillez et al.23. There is one hidden layer with two nodes (FC1) to enforce significant compression of the EEG data before it is transformed into a predicted audio stimulus (see Fig. 1 for the system architecture). BN = batch normalization, FC = fully connected.
The network was trained with the Adam optimizer using a mini-batch size of 1024 samples, no weight decay, a learning rate of 10⁻³, and 2400 iterations. Following de Taillez et al.23, we also employed a correlation-based loss function rather than a mean-squared-error loss function, exploiting the prior knowledge that we will ultimately evaluate the reconstructed waveform and AAD performance with a correlation metric.
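The sketch below pairs a negative-Pearson-correlation loss with the Adam settings quoted above, using the StimulusReconstructionNet class sketched earlier. The loss formulation, function name, and the random stand-in data are illustrative assumptions, not the authors' code.

```python
import torch

def neg_pearson_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Negative Pearson correlation between predicted and true envelopes."""
    pred = pred - pred.mean()
    target = target - target.mean()
    return -(pred * target).sum() / (pred.norm() * target.norm() + 1e-8)

# Illustrative training loop with random stand-in data; real EEG features
# and audio envelopes would replace x_all / y_all.
torch.manual_seed(0)
n_features = 64  # hypothetical: EEG channels x time lags
model = StimulusReconstructionNet(n_features)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)
x_all, y_all = torch.randn(50_000, n_features), torch.randn(50_000)
model.train()
for step in range(2400):                             # 2400 iterations
    idx = torch.randint(0, x_all.shape[0], (1024,))  # mini-batch of 1024 samples
    loss = neg_pearson_loss(model(x_all[idx]).squeeze(-1), y_all[idx])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Minimizing the negative correlation (rather than mean-squared error) directly targets the metric later used for attention decoding, which is the rationale given above for the choice of loss.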