We use residual connections at each message passing time step t so that every level contributes to the output representation ŷ. To this end, we define Rt as the readout at time step t and R as the overall readout function. At each time step, a neural network transforms the node hidden states to the desired output dimension, followed by simple add pooling to guarantee invariance to node ordering. Finally, a LogSoftmax layer yields the readout for that time step:
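A plausible form of this per-step readout, stated here as an assumption rather than the exact parameterization used (the per-step transformation network is denoted f_t and the graph's node set V), is

R_t = \mathrm{LogSoftmax}\Big( \sum_{v \in V} f_t\big( h_v^{(t)} \big) \Big),

where h_v^{(t)} is the hidden state of node v after message passing step t.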
The final output (ŷ) is then a simple linear combination of the readouts Rt:
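Assuming the combination runs over all T message passing steps with scalar weights w_t (a plain sum corresponds to w_t = 1), this reads as

\hat{y} = \sum_{t=1}^{T} w_t \, R_t.

For concreteness, the following PyTorch sketch illustrates how the per-step readouts and their combination could be implemented under these assumptions (one linear transform per step, a single unbatched graph, and an unweighted sum of the readouts); the class and variable names are hypothetical and not taken from the original implementation.

import torch
import torch.nn as nn

class JumpingReadout(nn.Module):
    # Sketch of the readout described above: per-step projection, add pooling
    # over nodes, LogSoftmax to get R_t, then a sum of the R_t over time steps.
    def __init__(self, hidden_dim, num_classes, num_steps):
        super().__init__()
        # Assumption: one linear transform per message passing step (no sharing).
        self.lins = nn.ModuleList(
            [nn.Linear(hidden_dim, num_classes) for _ in range(num_steps)]
        )
        self.log_softmax = nn.LogSoftmax(dim=-1)

    def step_readout(self, t, h):
        # h: [num_nodes, hidden_dim] node hidden states after step t.
        pooled = self.lins[t](h).sum(dim=0)   # add pooling over nodes -> [num_classes]
        return self.log_softmax(pooled)       # R_t

    def forward(self, hidden_states):
        # hidden_states: list [h^(1), ..., h^(T)] of per-step node states.
        readouts = [self.step_readout(t, h) for t, h in enumerate(hidden_states)]
        return torch.stack(readouts, dim=0).sum(dim=0)   # y_hat = sum_t R_t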