Deep Learning allows computational models composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have already dramatically improved the state of the art in speech recognition, visual object recognition, and object detection, as well as many other domains such as drug discovery and genomics. A Deep Learning model can discover complicated structure in datasets by using the back-propagation algorithm and, according to the discovered structure, adjust its internal parameters; the internal parameters of each layer are computed from the output of the previous layer^47. The number of layers required varies with the complexity of the dataset. We think that although the relation between clinical side effects and the adaptability of drugs may not be strong, there may be a deeper relation between clinical expressiveness and final efficacy; that is, clinical expressiveness will affect final efficacy. This relation suits the main idea of Deep Learning, which is to discover deeper relations within the data through a multi-layer network and the back-propagation algorithm. Therefore, we establish a Deep Learning model to verify this idea. Based on the number of samples and the dimensionality of the drug datasets processed in this paper, we propose a four-layer Deep Learning model to deal with these datasets. The proposed Deep Learning model is shown in Fig. 5.
In the four-layer Deep Learning model constructed in this paper, $x_k$ represents the data of each input node and $S_j$ represents the data of each output node. $u$ is the weight between the input layer and the first hidden layer, $v$ is the weight between the first hidden layer and the second hidden layer, and $w$ is the weight between the second hidden layer and the output layer.
The numbers of nodes in the input layer and the output layer of the Deep Learning network are fixed by the dimensionality of the input data and of the output, respectively. The number of nodes in each hidden layer is calculated using the following empirical equation:

$$h = \sqrt{m + n} + \alpha \qquad (1)$$

where $h$ is the number of hidden-layer nodes, $m$ is the number of input-layer nodes, $n$ is the number of output-layer nodes, and $\alpha$ is an adjustment constant, generally between 1 and 10.
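As a minimal sketch of this heuristic in Python (the function name hidden_layer_size and the choice $\alpha = 5$ are our own illustrative assumptions, not part of the original method):

```python
import math

def hidden_layer_size(m: int, n: int, alpha: int = 5) -> int:
    """Empirical hidden-layer size h = sqrt(m + n) + alpha,
    with alpha an adjustment constant between 1 and 10 (equation (1))."""
    if not 1 <= alpha <= 10:
        raise ValueError("alpha should lie between 1 and 10")
    return round(math.sqrt(m + n)) + alpha

# Example (assumed dimensions): 200 input features, 10 output nodes
print(hidden_layer_size(200, 10))  # -> 19
```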
Let $w_{ij}$ denote the weight between node $i$ and node $j$, $\theta_j$ the threshold of node $j$, and $y_j$ the output value of node $j$. The output value of each node in the current layer is determined by the output values of all nodes in the previous layer; the weighted inputs and the threshold of each node are combined and passed through an activation function. The equations are as follows:

$$net_j = \sum_i w_{ij}\, y_i - \theta_j \qquad (2)$$

$$y_j = f(net_j) \qquad (3)$$
where $f$ is the activation function, represented here by the sigmoid function:

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (4)$$
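The forward pass of equations (2)-(4) can be sketched in a few lines of NumPy. The function names sigmoid and forward, the weight names u, v, w (matching the notation above), and the toy dimensions are our illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    """Activation function f(x) = 1 / (1 + exp(-x)), equation (4)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, u, v, w, theta1, theta2, theta3):
    """Forward pass through the four-layer network of Fig. 5.
    u, v, w are the weight matrices between consecutive layers;
    theta* are the node thresholds, subtracted as in equation (2)."""
    y1 = sigmoid(x @ u - theta1)   # input layer -> first hidden layer
    y2 = sigmoid(y1 @ v - theta2)  # first hidden -> second hidden layer
    s = sigmoid(y2 @ w - theta3)   # second hidden -> output layer
    return y1, y2, s

# Toy setup (assumed): 4 inputs, two hidden layers of 3 nodes, 2 outputs
rng = np.random.default_rng(0)
x = rng.random(4)
u = rng.standard_normal((4, 3))
v = rng.standard_normal((3, 3))
w = rng.standard_normal((3, 2))
theta1, theta2, theta3 = np.zeros(3), np.zeros(3), np.zeros(2)
_, _, s = forward(x, u, v, w, theta1, theta2, theta3)
print(s)
```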
The computation proceeds from top to bottom and then from left to right, and this order must be followed strictly to complete the entire forward process.
After finishing the forward process, we construct the reverse transfer process. The key step in the reverse transfer process is the adjustment of the weights and thresholds between each pair of adjacent layers. The specific adjustment steps are as follows (a consolidated code sketch is given after Step 5):
Step 1. Assume that the target outputs of the output layer are $T_j$ and the actual outputs are $S_j$; the error function is as follows:

$$E = \frac{1}{2}\sum_j (T_j - S_j)^2 \qquad (5)$$
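A minimal sketch of equation (5); the function name error and the toy target/output values are assumptions for illustration:

```python
import numpy as np

def error(target, output):
    """Squared-error function E = (1/2) * sum_j (T_j - S_j)^2, equation (5)."""
    return 0.5 * np.sum((target - output) ** 2)

# Toy values (assumed): targets T_j and network outputs S_j
print(error(np.array([1.0, 0.0]), np.array([0.8, 0.3])))  # 0.5*(0.04+0.09) = 0.065
```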
Step 2. According to the gradient descent method, the weights and thresholds are modified iteratively in order to minimize the error function; the correction of the weight vector is taken along the negative gradient of $E$ at the current position. For the output node $j$:

$$\Delta w_{ij} = -\eta\, \frac{\partial E}{\partial w_{ij}} \qquad (6)$$

where $\eta$ is the learning rate.
Step 3. To calculate the corrections to the weights and thresholds between the second hidden layer and the output layer, we first differentiate the activation function of equation (4), obtaining $f'(x) = f(x)(1 - f(x))$; then, applying the chain rule through equations (7) and (8), we obtain the error term $\delta_j$ of output node $j$; finally, $\Delta w_{ij}$ and $\Delta \theta_j$ are calculated by equations (9) and (10):

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial S_j}\,\frac{\partial S_j}{\partial net_j}\,\frac{\partial net_j}{\partial w_{ij}} \qquad (7)$$

$$\delta_j = (T_j - S_j)\, S_j\, (1 - S_j) \qquad (8)$$

$$\Delta w_{ij} = \eta\, \delta_j\, y_i \qquad (9)$$

$$\Delta \theta_j = -\eta\, \delta_j \qquad (10)$$
Step 4. Calculate the error terms for the two hidden layers. In equations (11) and (12), we suppose that $v_{mn}$ is the weight between node $m$ of the first hidden layer and node $n$ of the second hidden layer, and $u_{km}$ is the weight between node $k$ of the input layer and node $m$ of the first hidden layer:

$$\Delta v_{mn} = -\eta\, \frac{\partial E}{\partial v_{mn}} \qquad (11)$$

$$\Delta u_{km} = -\eta\, \frac{\partial E}{\partial u_{km}} \qquad (12)$$

The error terms $\delta_n$ and $\delta_m$ of the hidden nodes are calculated by equations (13) and (14):

$$\delta_n = y_n\, (1 - y_n) \sum_j \delta_j\, w_{nj} \qquad (13)$$

$$\delta_m = y_m\, (1 - y_m) \sum_n \delta_n\, v_{mn} \qquad (14)$$
Step 5. According to the gradient descent method and the formulas above, equations (15) and (16) adjust the weights and thresholds between the second hidden layer and the output layer, equations (17) and (18) adjust those between the two hidden layers, and equations (19) and (20) adjust those between the input layer and the first hidden layer:

$$w_{ij}(t+1) = w_{ij}(t) + \eta\, \delta_j\, y_i \qquad (15)$$

$$\theta_j(t+1) = \theta_j(t) - \eta\, \delta_j \qquad (16)$$

$$v_{mn}(t+1) = v_{mn}(t) + \eta\, \delta_n\, y_m \qquad (17)$$

$$\theta_n(t+1) = \theta_n(t) - \eta\, \delta_n \qquad (18)$$

$$u_{km}(t+1) = u_{km}(t) + \eta\, \delta_m\, x_k \qquad (19)$$

$$\theta_m(t+1) = \theta_m(t) - \eta\, \delta_m \qquad (20)$$
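The following sketch consolidates Steps 1-5 into a single reverse-transfer update, following the reconstructed equations (8)-(20). The function name backward and the learning rate $\eta = 0.1$ are illustrative assumptions; y1, y2, s are the layer outputs from the forward sketch given after equation (4):

```python
import numpy as np

def backward(x, y1, y2, s, t, u, v, w, theta1, theta2, theta3, eta=0.1):
    """One reverse-transfer (back-propagation) step, Steps 1-5.
    Error terms use the sigmoid derivative f'(x) = f(x)(1 - f(x));
    thresholds move opposite to the weights because they are
    subtracted in the forward pass."""
    delta3 = (t - s) * s * (1 - s)            # output-layer error term, eq. (8)
    delta2 = y2 * (1 - y2) * (delta3 @ w.T)   # second hidden layer, eq. (13)
    delta1 = y1 * (1 - y1) * (delta2 @ v.T)   # first hidden layer, eq. (14)

    w += eta * np.outer(y2, delta3)           # eq. (15)
    theta3 -= eta * delta3                    # eq. (16)
    v += eta * np.outer(y1, delta2)           # eq. (17)
    theta2 -= eta * delta2                    # eq. (18)
    u += eta * np.outer(x, delta1)            # eq. (19)
    theta1 -= eta * delta1                    # eq. (20)
    return u, v, w, theta1, theta2, theta3
```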
This completes the reverse transfer process of the Deep Learning method proposed in this paper. To complete the learning process of the entire network, the weights and thresholds must be adjusted continuously; an error threshold or a maximal number of cycles can be set as the stopping criterion to terminate the learning process.
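Putting the pieces together, a minimal training loop with the two stopping criteria mentioned above might look as follows; it reuses the forward, error, and backward functions from the sketches above, and the data, target, and criterion values are assumed toy choices:

```python
import numpy as np

# Toy data and target (assumed), matching the dimensions used earlier
rng = np.random.default_rng(0)
x, t = rng.random(4), np.array([1.0, 0.0])
u = rng.standard_normal((4, 3))
v = rng.standard_normal((3, 3))
w = rng.standard_normal((3, 2))
theta1, theta2, theta3 = np.zeros(3), np.zeros(3), np.zeros(2)

max_cycles, error_threshold = 10_000, 1e-4   # stop criteria (assumed values)
for cycle in range(max_cycles):
    y1, y2, s = forward(x, u, v, w, theta1, theta2, theta3)
    if error(t, s) < error_threshold:
        break                                # error threshold reached
    u, v, w, theta1, theta2, theta3 = backward(
        x, y1, y2, s, t, u, v, w, theta1, theta2, theta3)
```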