The autoencoder, based on a deep neural network, is considered one of the most robust unsupervised learning models of the last few decades. As an unsupervised model, it can extract effective and discriminative features from a huge unlabeled data set, which makes it widely applicable to feature extraction and dimensionality reduction [36]. Basically, an autoencoder consists of a fully connected three-layer neural network, where the encoder comprises the input and hidden layers and the decoder comprises the hidden and output layers. The encoder transforms the higher-dimensional input data into a lower-dimensional feature vector, after which the decoder maps the data back to the input dimension. One of the main strengths of a deep neural network is its ability to model complex nonlinear relationships in the input data, which also helps the autoencoder reconstruct its input effectively at the decoder output. Accordingly, the reconstruction error decreases progressively over the training period, and significant features are captured in the hidden layer. Finally, the hidden-layer output reflects the feature-extraction performance of the designed autoencoder. Figure 6 represents the configuration of the basic autoencoder.
Figure 6. Basic architecture of the autoencoder.
For the $n$-dimensional input data samples $x \in \mathbb{R}^{n}$, the output/activation $h$ of the hidden layer with dimension $m$ ($m < n$) can be calculated as Equation (10):

$$h = f(Wx + b) \tag{10}$$
Here, $W$, $b$, and $f(\cdot)$ represent the weight matrix connecting the input and hidden layers, the bias vector, and the activation function, respectively.
After the decoding process, the reconstructed signal $\hat{x}$ at the output layer can be expressed as Equation (11):

$$\hat{x} = f(W'h + b') \tag{11}$$
Here, $W'$ and $b'$ represent the weight matrix and the bias vector of the output layer, respectively. The activation function $f(\cdot)$ used for both the encoder and decoder parts is generally set as a sigmoid function, or any other activation function depending on the data type. The training process begins with some initial values of the weights and biases.
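As a concrete illustration of Equations (10) and (11), the following minimal NumPy sketch performs a single encode–decode pass; the dimensions, the random initialization, and the sigmoid activation are illustrative assumptions, not the settings used in this work.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation, used here purely for illustration.
    return 1.0 / (1.0 + np.exp(-z))

n, m = 8, 3                                # illustrative dimensions, m < n
rng = np.random.default_rng(0)

W   = rng.standard_normal((m, n)) * 0.1    # encoder weight matrix W, Eq. (10)
b   = np.zeros(m)                          # encoder bias vector b
W_p = rng.standard_normal((n, m)) * 0.1    # decoder weight matrix W', Eq. (11)
b_p = np.zeros(n)                          # decoder bias vector b'

x     = rng.standard_normal(n)             # one n-dimensional input sample
h     = sigmoid(W @ x + b)                 # hidden activation, Eq. (10)
x_hat = sigmoid(W_p @ h + b_p)             # reconstructed signal, Eq. (11)
```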
During the training process, the parameters are adjusted to minimize the reconstruction error between the original input data and the reconstructed output. The reconstruction error is quantified by the mean squared error (MSE), as given in Equation (12), which is applied in our analysis:

$$L_{\mathrm{MSE}}(x, \hat{x}) = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \hat{x}_i \right)^2 \tag{12}$$
In other cases, where the input values lie between 0 and 1, the binary cross-entropy loss can be calculated as the reconstruction error with Equation (13):

$$L_{\mathrm{BCE}}(x, \hat{x}) = -\sum_{i=1}^{n} \left[ x_i \log \hat{x}_i + (1 - x_i) \log \left( 1 - \hat{x}_i \right) \right] \tag{13}$$
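As a hedged sketch, the two reconstruction losses of Equations (12) and (13) can be written in NumPy as below; the small epsilon guarding the logarithms is an implementation detail added here, not part of the original formulation.

```python
import numpy as np

def mse_loss(x, x_hat):
    # Mean squared error, Eq. (12).
    return np.mean((x - x_hat) ** 2)

def bce_loss(x, x_hat, eps=1e-12):
    # Binary cross-entropy, Eq. (13); assumes x lies in [0, 1].
    x_hat = np.clip(x_hat, eps, 1.0 - eps)  # guard against log(0)
    return -np.sum(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat))
```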
In this analysis, a deep autoencoder (DAE), i.e., one that uses more than one hidden layer, is applied to find an approximation of the normal state of the bearing; its architecture is provided in Table 3.
Table 3. The structure of the designed deep autoencoder (DAE).
The scaled exponential linear unit (SELU) is applied as the activation function for both the hidden and output layers of the DAE in this analysis. The recorded current signal has both positive and negative values; since SELU is a non-saturating activation function, it is a good choice for this type of signal, and it also mitigates the vanishing-gradient problem that occurs in deep network architectures. Additionally, the self-normalizing property of SELU speeds up training by helping the deep neural network converge quickly. The SELU is defined as Equation (14):

$$\mathrm{SELU}(x) = \lambda \begin{cases} x, & x > 0 \\ \alpha \left( e^{x} - 1 \right), & x \le 0 \end{cases} \tag{14}$$
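For reference, Equation (14) translates directly into NumPy as follows, using the coefficient values quoted below from [59].

```python
import numpy as np

LAMBDA = 1.0507  # scale coefficient (lambda) in Eq. (14)
ALPHA  = 1.6733  # alpha coefficient in Eq. (14)

def selu(x):
    # SELU, Eq. (14): lambda * x for x > 0,
    # lambda * alpha * (exp(x) - 1) for x <= 0.
    return LAMBDA * np.where(x > 0, x, ALPHA * np.expm1(x))
```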
Here, the coefficient values for $\lambda$ and $\alpha$ are set to approximately 1.0507 and 1.6733, respectively, according to [59]. In this work, the optimizer used for updating the weights is adaptive moment estimation (Adam). This optimization technique has become very well known for its ability to keep exponentially decaying averages of both past gradients and past squared gradients of the loss function [60].
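Tying these pieces together, a DAE of the kind described here might be sketched in Keras as follows. The layer widths and input length are placeholders, since Table 3 is not reproduced above; only the fully connected structure, the SELU activation, the MSE loss of Equation (12), and the Adam optimizer are taken from the text.

```python
from tensorflow.keras import layers, models

INPUT_DIM = 400  # placeholder: samples per current-signal segment

# Layer widths are illustrative; the actual sizes are given in Table 3.
autoencoder = models.Sequential([
    layers.Dense(128, activation="selu", kernel_initializer="lecun_normal",
                 input_shape=(INPUT_DIM,)),
    layers.Dense(32, activation="selu", kernel_initializer="lecun_normal"),
    layers.Dense(128, activation="selu", kernel_initializer="lecun_normal"),
    layers.Dense(INPUT_DIM, activation="selu",
                 kernel_initializer="lecun_normal"),
])

# MSE reconstruction loss, Eq. (12), with the Adam optimizer [60].
autoencoder.compile(optimizer="adam", loss="mse")

# The DAE is trained on normal-state data only, with the input serving
# as its own target:
# autoencoder.fit(x_train, x_train, epochs=50, batch_size=64)
```

Because the network learns an approximation of the normal state only, a degraded signal tends to produce a larger reconstruction error, which is the usual motivation for training on healthy data in autoencoder-based fault detection.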