-
The backpropagation algorithm (BP algorithm) consists of two phases, stimulus propagation and weight update, which are repeated iteratively until the network's response to the input falls within a predetermined target range.
-
The backpropagation algorithm, abbreviated as the BP algorithm, is a learning algorithm suited to multi-layer neural networks and is based on gradient descent.
The input-output relationship of a BP network is essentially a mapping: a BP neural network with n inputs and m outputs computes a continuous mapping from n-dimensional Euclidean space to a finite domain in m-dimensional Euclidean space, and this mapping is highly nonlinear.
Its information-processing ability derives from the repeated composition of simple nonlinear functions, which gives it a strong capacity for approximating functions. This is the basis for the BP algorithm's application.
Motivation for the backpropagation algorithm: backpropagation is designed to reduce the number of common subexpressions that must be recomputed, without regard to the storage overhead of caching them. Backpropagation thereby avoids the exponential blow-up that comes from repeatedly expanding duplicated subexpressions. Other approaches may eliminate even more subexpressions by simplifying the computational graph, or may save memory by recomputing rather than storing those subexpressions.
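To illustrate this point (a toy example, not from the article): in a chain z_i = z_{i-1} * z_{i-1}, expanding the chain rule naively re-derives the duplicated subexpression dz_{i-1}/dx in both branches, so the number of evaluations grows exponentially with depth, while caching each node's derivative once, as backpropagation does, keeps the cost linear. For clarity every local partial derivative below is set to 1.0, so only the call pattern matters; all names are hypothetical.

```python
def grad_naive(i, counter):
    """Expand the chain rule recursively with no caching: the duplicated
    subexpression dz_{i-1}/dx is re-derived in both branches."""
    counter[0] += 1
    if i == 0:
        return 1.0
    # z_i depends on z_{i-1} twice, so two recursive expansions:
    return 1.0 * grad_naive(i - 1, counter) + 1.0 * grad_naive(i - 1, counter)

def grad_cached(i, counter, cache=None):
    """Backpropagation-style: store each node's derivative the first time
    it is computed and reuse it afterwards."""
    if cache is None:
        cache = {}
    if i in cache:
        return cache[i]
    counter[0] += 1
    result = 1.0 if i == 0 else 2.0 * grad_cached(i - 1, counter, cache)
    cache[i] = result
    return result

naive_calls, cached_calls = [0], [0]
g1 = grad_naive(12, naive_calls)     # 2**13 - 1 = 8191 recursive calls
g2 = grad_cached(12, cached_calls)   # 13 calls: one per node
```

Both functions return the same derivative; only the amount of recomputation differs.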
-
The backpropagation algorithm (its process and the derivation of its formulas) is as follows:
The backpropagation algorithm is a learning algorithm for multi-layer neural networks based on gradient descent. The input-output relationship of the network is essentially a mapping: an n-input, m-output BP neural network computes a continuous, highly nonlinear mapping from n-dimensional Euclidean space to a finite domain in m-dimensional Euclidean space.
The algorithm consists mainly of two steps (stimulus propagation and weight update) that are repeated iteratively until the network's response to the input falls within a predetermined target range.
The information-processing ability of the network derives from the repeated composition of simple nonlinear functions, which gives it a strong capacity for approximating functions; this is the basis for the BP algorithm's application. Backpropagation is designed to reduce the number of recomputed common subexpressions, without regard to the storage overhead of caching them.
Backpropagation thereby avoids the exponential blow-up caused by duplicated subexpressions.
-
For example, the image below shows a three-layer neural network with one hidden layer. In the cartoon, the little girl is a hidden-layer node, the little yellow cap is an output-layer node, and Doraemon represents the error.
The hidden-layer node receives the input signal from the left and produces the output through the output-layer node, while the error guides the parameters to adjust in a better direction. Since the error can be fed back directly to the little yellow cap, the parameter matrix directly connected to the output layer can be optimized directly from that error (solid line). The parameter matrix on the left, directly connected to the little girl, cannot be optimized directly, because it receives no direct feedback from the error (dotted brown line). Thanks to the backpropagation algorithm, however, the error signal can be transmitted back to the hidden node, producing an indirect error, so the left weight matrix connected to the hidden node can be updated from this indirect error; after several rounds of iteration the error is reduced to a minimum.
In other words, the output node receives a direct error, while the hidden node receives an indirect error.
The whole process will be demonstrated with an example.
Suppose we have the weighted network in the figure below. The first layer is the input layer, containing two neurons i1, i2 and a bias term b1; the second layer is the hidden layer, containing two neurons h1, h2 and a bias term b2; the third layer is the output layer with neurons o1, o2. The wi on each edge is the weight of the connection between layers, and the activation function defaults to the sigmoid function.
Through forward propagation we obtain the output value, which is still far from the actual (target) value; we then backpropagate the error, update the weights, and recompute the output.
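For concreteness, a minimal sketch of this forward pass in code. The numeric values below (inputs, targets, and initial weights) are assumed for illustration, since the article's own figures and numbers are not reproduced here; the layout assumes w1..w4 feed the hidden layer and w5..w8 feed the output layer.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Assumed example values (hypothetical, not the article's own figures):
i1, i2 = 0.05, 0.10                        # inputs
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30    # input -> hidden (w1, w2 feed h1; w3, w4 feed h2)
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55    # hidden -> output (w5, w6 feed o1; w7, w8 feed o2)
b1, b2 = 0.35, 0.60                        # bias for hidden layer, output layer
t1, t2 = 0.01, 0.99                        # target outputs

# Forward propagation: net input, then sigmoid activation, layer by layer.
out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)

# Squared-error loss, summed over the two outputs.
E_total = 0.5 * (t1 - out_o1) ** 2 + 0.5 * (t2 - out_o2) ** 2
```

With these assumed values the outputs land around 0.75 and 0.77, far from the targets 0.01 and 0.99, which is why backpropagation is needed next.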
3. Weight updates for the input layer → hidden layer:
In the calculation above of the partial derivative of the total error with respect to w5, the chain ran out(o1) → net(o1) → w5. When updating the weights into the hidden layer, however, the chain is out(h1) → net(h1) → w1, and out(h1) receives error from both e(o1) and e(o2), so both contributions must be computed here.
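This two-path accumulation can be sketched as follows, again with assumed example values (hypothetical, not the article's own); the assumed layout has h1 feeding o1 via w5 and feeding o2 via w7, so the gradient for w1 sums the error signals from both outputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Assumed example values (hypothetical; w5 is h1->o1, w7 is h1->o2).
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60
t1, t2 = 0.01, 0.99

# Forward pass, saved because the gradients below reuse these outputs.
out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)

# Output-layer weight: a single chain out(o1) -> net(o1) -> w5.
delta_o1 = (out_o1 - t1) * out_o1 * (1 - out_o1)   # dE/dnet(o1)
dE_dw5 = delta_o1 * out_h1

# Hidden-layer weight: out(h1) receives error from BOTH o1 and o2,
# so the two contributions are summed before descending to w1.
delta_o2 = (out_o2 - t2) * out_o2 * (1 - out_o2)   # dE/dnet(o2)
dE_dout_h1 = delta_o1 * w5 + delta_o2 * w7
dE_dw1 = dE_dout_h1 * out_h1 * (1 - out_h1) * i1
```

Each weight is then updated as w ← w − η · ∂E/∂w for some learning rate η.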
From this worked demonstration, the general procedure of the BP algorithm can be stated:
1. Forward propagation, FP (computing the loss).
In this step, we compute the network's final output from the input sample, the given initial weights w, and the bias values b, together with the loss between that output and the actual value. (Note: if the loss is not within the given range, we proceed to the backpropagation step; otherwise we stop updating w and b.)
2. Backpropagation, BP (propagating the error back).
The output error is transmitted back toward the input layer, layer by layer through the hidden layers, and is distributed to all the units of each layer; this yields an error signal for each layer's units, which serves as the basis for correcting that unit's weights. (The main steps are: updating the parameters w from the hidden layer to the output layer, then updating the parameters w from the input layer to the hidden layer.)
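The FP/BP loop described above can be sketched as follows. The tiny one-input, one-hidden-unit network, the single training sample, the learning rate, and the stopping tolerance are all hypothetical choices made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical tiny network: one input -> one hidden unit -> one output,
# trained on a single sample, purely to illustrate the FP/BP loop.
x, target = 1.0, 0.8
w_ih, w_ho = 0.5, 0.5          # weights
b_h, b_o = 0.0, 0.0            # biases
lr = 0.5                       # learning rate
tolerance = 1e-4               # stop when the loss is within this range

losses = []
for step in range(10000):
    # 1. Forward propagation (FP): compute the output and the loss.
    h = sigmoid(w_ih * x + b_h)
    o = sigmoid(w_ho * h + b_o)
    loss = 0.5 * (target - o) ** 2
    losses.append(loss)
    if loss < tolerance:        # response within the target range: stop
        break
    # 2. Backpropagation (BP): push the error back and update w and b.
    delta_o = (o - target) * o * (1 - o)      # error signal at the output
    delta_h = delta_o * w_ho * h * (1 - h)    # error signal at the hidden unit
    w_ho -= lr * delta_o * h
    b_o  -= lr * delta_o
    w_ih -= lr * delta_h * x
    b_h  -= lr * delta_h
```

Each iteration performs one FP pass and one BP pass; the loop ends when the loss falls within the tolerance.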
In closing: what matters most is understanding the calculations and formulas!
-
Backpropagation is a very simple algorithm that anyone who has studied calculus can easily understand. This article aims to avoid the redundancy and complexity that make other treatments uncomfortable to read, and to describe the derivation and solution of the backpropagation algorithm concisely and clearly.
The main points of backpropagation come down to just three formulas, summarized here.
Known quantities and derivation: (given as equation images in the original)
Review of the total differential: (given as an equation image in the original)
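Since the article's own equation images are not reproduced, here are the standard statements of the three key backpropagation formulas (in the common notation where δ is the layer's error signal, ⊙ is the elementwise product, and C is the cost); these are the conventional forms, which may differ cosmetically from the article's:

```latex
% Error at the output layer L:
\delta^{L} = \nabla_{a} C \odot \sigma'(z^{L})
% Error recursion, from layer l+1 back to layer l:
\delta^{l} = \left( (W^{l+1})^{\top} \delta^{l+1} \right) \odot \sigma'(z^{l})
% Gradients with respect to the parameters:
\frac{\partial C}{\partial W^{l}} = \delta^{l} \, (a^{l-1})^{\top},
\qquad
\frac{\partial C}{\partial b^{l}} = \delta^{l}
```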
Derivation: the essence of backpropagation is the chain rule plus dynamic programming.
In the full computational graph, suppose each edge represents the derivative of the upper-layer node with respect to the lower-layer node. The traditional way to compute the derivative of the cost function with respect to a parameter is, by the chain rule, to enumerate every path from the output to that parameter, multiply the derivatives along each path, and sum the results. Clearly the computational cost grows very quickly as the network gets deeper.
In the backpropagation algorithm, a forward pass first computes and stores the output of every layer; then, using the chain rule, a back-to-front recursion is derived so that every edge in the graph needs to be evaluated only once to obtain the derivative of any parameter.
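A minimal sketch of this idea, assuming a small tape-style reverse-mode autodiff class (all names here are hypothetical): the forward pass builds the graph and stores each node's value and local derivatives, and a single back-to-front sweep then uses every edge's local derivative exactly once.

```python
import math

class Var:
    """A graph node: stores its value and (parent, local derivative) edges."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # edges saved during the forward pass
        self.grad = 0.0

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

def sigmoid(v):
    s = 1.0 / (1.0 + math.exp(-v.value))
    return Var(s, [(v, s * (1.0 - s))])

def backward(out):
    """One back-to-front sweep: each edge's local derivative is used once."""
    order, seen = [], set()
    def dfs(v):                      # topological order of the graph
        if id(v) in seen:
            return
        seen.add(id(v))
        for p, _ in v.parents:
            dfs(p)
        order.append(v)
    dfs(out)
    out.grad = 1.0
    for v in reversed(order):        # back to front
        for p, local in v.parents:
            p.grad += v.grad * local # chain rule, accumulated per parent
```

For example, for y = x·x + x at x = 2, a single `backward(y)` sweep yields dy/dx = 2x + 1 = 5 even though x appears on three edges.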