The temporal convolutional network (TCN) is a temporal model derived from convolutional neural networks (CNNs). Its architecture consists primarily of residual blocks and dilated causal convolutions.
TCN has proven effective in short-term power load forecasting. It captures dependencies in time series data, enabling a deeper understanding of power load fluctuation patterns. Furthermore, the parallel processing capability of TCN allows large datasets to be processed efficiently, while its residual-block architecture mitigates gradient vanishing and enhances training stability. This is particularly important when handling complex time series data such as power load.
Temporal convolutional networks utilize dilated causal convolution (DCC) to expand the receptive field, as depicted in Fig 2. DCC samples the input data at intervals, where d denotes the size of the interval. This allows a larger receptive field to be achieved with fewer convolutional layers. The expression for dilated convolution is given by Eq (9):

$$F(t) = \sum_{i=0}^{k-1} h(i)\, X_{t - d \cdot i} \quad (9)$$

where X represents the input data sequence and F denotes the output; F(t) is the convolution result for the t-th element of the input $(X_0, \ldots, X_t)$; h(i) is the i-th element of the convolution kernel; k is the convolution kernel size; and d is the dilation factor.
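As a minimal sketch of Eq (9), the snippet below implements a dilated causal convolution in PyTorch by left-padding the input so that F(t) depends only on X_0, ..., X_t. The class name CausalConv1d, the channel sizes, and the padding scheme are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """Illustrative 1D dilated causal convolution (Eq 9).

    Left-pads the input by (k - 1) * d so that the output at time t
    only depends on inputs X_0, ..., X_t (no leakage from the future).
    """
    def __init__(self, in_channels, out_channels, kernel_size, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # causal left padding
        self.conv = nn.Conv1d(in_channels, out_channels,
                              kernel_size, dilation=dilation)

    def forward(self, x):
        # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))  # pad on the left only
        return self.conv(x)

# Example: the receptive field grows with dilation d while kernel size k stays small
x = torch.randn(8, 1, 96)                        # e.g. 96 time steps of load data
layer = CausalConv1d(1, 16, kernel_size=3, dilation=2)
print(layer(x).shape)                            # torch.Size([8, 16, 96])
```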
The residual block (RB) is a proven technique for overcoming the challenges of training deep networks. The architecture of the RB is illustrated in Fig 3. In TCN, the input to the residual block is denoted by X and the output by o, as shown in Eq (10):

$$o = \mathrm{Activation}\big(X + F(X)\big) \quad (10)$$
where Activation is the activation function, which in this research is set to ReLU.
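The skip connection of Eq (10) can be sketched as follows; here F stands for the residual branch (the two convolutional layers described next), and the 1x1 convolution used to match channel counts is a common convention assumed for illustration rather than something stated in the text.

```python
import torch.nn as nn

class ResidualWrapper(nn.Module):
    """Illustrative skip connection of Eq (10): o = ReLU(X + F(X))."""
    def __init__(self, branch, in_channels, out_channels):
        super().__init__()
        self.branch = branch  # F: the residual branch, e.g. two dilated causal convolutions
        # 1x1 convolution so that X and F(X) have matching channel counts
        self.downsample = (nn.Conv1d(in_channels, out_channels, 1)
                           if in_channels != out_channels else nn.Identity())
        self.activation = nn.ReLU()

    def forward(self, x):
        return self.activation(self.downsample(x) + self.branch(x))
```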
The TCN is constructed by stacking multiple residual blocks, each containing two dilated causal convolution layers. The weight normalization layer (Weight Norm) standardizes the weights and normalizes the inputs to the hidden layers. The ReLU activation function introduces nonlinearity into the network. Dropout regularization prevents overfitting, while residual connections map the inputs directly to the outputs and mitigate the degradation caused by adding more layers.
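Below is a minimal sketch of one such residual block and the stacked TCN, assuming a PyTorch implementation with exponentially increasing dilation factors (d = 1, 2, 4, ...); the layer widths, dropout rate, and dilation schedule are illustrative choices, not values taken from this work.

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm

class TCNResidualBlock(nn.Module):
    """Two weight-normalized dilated causal convolutions with ReLU,
    dropout, and a residual connection, as described above."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation, dropout=0.2):
        super().__init__()
        pad = (kernel_size - 1) * dilation           # causal left padding
        self.net = nn.Sequential(
            nn.ConstantPad1d((pad, 0), 0.0),
            weight_norm(nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.ConstantPad1d((pad, 0), 0.0),
            weight_norm(nn.Conv1d(out_ch, out_ch, kernel_size, dilation=dilation)),
            nn.ReLU(),
            nn.Dropout(dropout),
        )
        # 1x1 convolution so the residual addition has matching channel counts
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.net(x) + self.downsample(x))

class TCN(nn.Module):
    """Stack of residual blocks with exponentially growing dilation."""
    def __init__(self, in_ch, channels=(16, 16, 16), kernel_size=3, dropout=0.2):
        super().__init__()
        layers = []
        for i, out_ch in enumerate(channels):
            layers.append(TCNResidualBlock(in_ch if i == 0 else channels[i - 1],
                                           out_ch, kernel_size,
                                           dilation=2 ** i, dropout=dropout))
        self.network = nn.Sequential(*layers)

    def forward(self, x):        # x: (batch, features, time)
        return self.network(x)

# Example: 96 time steps of single-channel load data
y = TCN(in_ch=1)(torch.randn(8, 1, 96))
print(y.shape)                   # torch.Size([8, 16, 96])
```

Doubling the dilation factor at each block lets the receptive field grow exponentially with depth, which is why a shallow stack can still cover long load histories.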