What is Backpropagation?
Backpropagation is a supervised learning algorithm for training multi-layer perceptrons (artificial neural networks). It refers to the method of calculating the gradient of the neural network's parameters. In short, the method traverses the network in reverse order, from the output layer to the input layer, applying the chain rule from calculus. It looks for the minimum value of the error function in weight space using a technique called the delta rule, or gradient descent. The weights that minimize the error function are then considered to be a solution to the learning problem.
Why do we need Backpropagation?
When training a neural network, we initially assign random values to the weights and biases. These random weights are unlikely to fit the model well, so the output of our model may be very different from the actual output, resulting in high error values. Therefore, we need to tune those weights so as to reduce the error; that is, we need to guide the model to change the weights in the direction that makes the error minimum.
Working of Backpropagation
Let us consider the neural network below:
Values:
- x1= 0.05, x2= 0.10
- b1= 0.35, b2= 0.6
- w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30
- w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55
The above neural network contains the following:
- One Input Layer
- Two Input Neurons
- One Hidden Layer
- Two Hidden Neurons
- One Output Layer
- Two Output Neurons
- Two Biases
Following are the steps for the weight update using Backpropagation:
Note: We will be using the sigmoid activation function, $\sigma(x) = \frac{1}{1 + e^{-x}}$.
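Since every neuron below applies this activation, here is a minimal Python sketch of the sigmoid and its derivative (the derivative will be needed in Step 2; the helper names are our own):

```python
import math

def sigmoid(x):
    # Sigmoid activation: squashes any real input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(out):
    # Derivative of the sigmoid, expressed via its own output:
    # if out = sigmoid(x), then d(out)/dx = out * (1 - out)
    return out * (1.0 - out)
```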
- Step 1: Forward Propagation
Net Input for h1:

$$net_{h1} = w_1 x_1 + w_2 x_2 + b_1 = 0.15 \times 0.05 + 0.20 \times 0.10 + 0.35 = 0.3775$$

Output of h1:

$$out_{h1} = \sigma(net_{h1}) = \frac{1}{1 + e^{-0.3775}} = 0.5933$$

Similarly, Output of h2:

$$out_{h2} = \sigma(w_3 x_1 + w_4 x_2 + b_1) = \sigma(0.3925) = 0.5969$$

Repeat the process for the output layer neurons, using the outputs from the hidden layer as inputs for the output neurons.

Net Input for y1:

$$net_{y1} = w_5 \, out_{h1} + w_6 \, out_{h2} + b_2 = 0.40 \times 0.5933 + 0.45 \times 0.5969 + 0.6 = 1.1059$$

Output of y1:

$$out_{y1} = \sigma(net_{y1}) = \frac{1}{1 + e^{-1.1059}} = 0.7514$$

Similarly, Output of y2:

$$out_{y2} = \sigma(w_7 \, out_{h1} + w_8 \, out_{h2} + b_2) = \sigma(1.2249) = 0.7729$$
Calculating Total Error, where $t_1$ and $t_2$ are the target outputs:

$$E_{total} = E_{y1} + E_{y2} = \frac{1}{2}(t_1 - out_{y1})^2 + \frac{1}{2}(t_2 - out_{y2})^2$$
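Here is a minimal Python sketch of this forward pass, reusing the sigmoid helper defined above. The target values t1 and t2 are not given in the example, so the ones used below are hypothetical placeholders:

```python
# Inputs, biases, and weights from the example above
x1, x2 = 0.05, 0.10
b1, b2 = 0.35, 0.6
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55

t1, t2 = 0.01, 0.99  # hypothetical targets; not specified in the example

# Hidden layer
net_h1 = w1 * x1 + w2 * x2 + b1           # 0.3775
out_h1 = sigmoid(net_h1)                  # 0.5933
net_h2 = w3 * x1 + w4 * x2 + b1           # 0.3925
out_h2 = sigmoid(net_h2)                  # 0.5969

# Output layer, fed by the hidden-layer outputs
net_y1 = w5 * out_h1 + w6 * out_h2 + b2   # 1.1059
out_y1 = sigmoid(net_y1)                  # 0.7514
net_y2 = w7 * out_h1 + w8 * out_h2 + b2   # 1.2249
out_y2 = sigmoid(net_y2)                  # 0.7729

# Total squared error
E_total = 0.5 * (t1 - out_y1) ** 2 + 0.5 * (t2 - out_y2) ** 2
```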
- Step 2: Backward Propagation
Now, we will reduce the error by updating the weights and biases using back-propagation. To update the weights, let us consider w5, for which we will calculate the rate of change of error w.r.t. the change in weight. By the chain rule:

$$\frac{\partial E_{total}}{\partial w_5} = \frac{\partial E_{total}}{\partial out_{y1}} \times \frac{\partial out_{y1}}{\partial net_{y1}} \times \frac{\partial net_{y1}}{\partial w_5}$$

Since we are propagating backwards, the first thing we need to do is calculate the change in total error w.r.t. the outputs y1 and y2. For w5, only the y1 term matters, because $E_{y2}$ does not depend on $out_{y1}$:

$$\frac{\partial E_{total}}{\partial out_{y1}} = -(t_1 - out_{y1})$$

Now, we will propagate further backwards and calculate the change in the output y1 w.r.t. its net input, which is just the sigmoid derivative:

$$\frac{\partial out_{y1}}{\partial net_{y1}} = out_{y1}(1 - out_{y1})$$

Finally, we will see how much the net input of y1 changes w.r.t. a change in w5:

$$\frac{\partial net_{y1}}{\partial w_5} = out_{h1}$$
- Step 3: Putting all the values together and calculating the updated weight value.
- Now, putting all the values together:

$$\frac{\partial E_{total}}{\partial w_5} = -(t_1 - out_{y1}) \times out_{y1}(1 - out_{y1}) \times out_{h1}$$

Now, updating w5, where $\eta$ is the learning rate:

$$w_5^{new} = w_5 - \eta \times \frac{\partial E_{total}}{\partial w_5}$$
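As a quick illustration, here is a short Python sketch of this update for w5, continuing from the forward-pass sketch above; the learning rate eta is an assumption, since the example does not specify one:

```python
eta = 0.5  # assumed learning rate; the example does not specify one

# Chain-rule factors for dE_total/dw5
dE_dout_y1   = -(t1 - out_y1)          # how the error changes with out_y1
dout_dnet_y1 = out_y1 * (1 - out_y1)   # sigmoid derivative at y1
dnet_dw5     = out_h1                  # how net_y1 changes with w5

dE_dw5 = dE_dout_y1 * dout_dnet_y1 * dnet_dw5
w5 = w5 - eta * dE_dw5                 # gradient-descent step on w5
```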
Similarly, we can calculate the updates for the other output-layer weights (w6, w7, and w8) as well.
Now, at the hidden layer, updating w1, w2, w3, and w4:
Using the above values we can calculate $\frac{\partial E_{y1}}{\partial out_{h1}}$, and similarly $\frac{\partial E_{y2}}{\partial out_{h1}}$, which in turn can be used to calculate the value of $\frac{\partial E_{total}}{\partial out_{h1}}$. Similarly, calculate the value of $\frac{\partial out_{h1}}{\partial net_{h1}}$ and $\frac{\partial net_{h1}}{\partial w_1}$ to get the change of error w.r.t. the change in weight w1:

$$\frac{\partial E_{total}}{\partial w_1} = \left(\frac{\partial E_{y1}}{\partial out_{h1}} + \frac{\partial E_{y2}}{\partial out_{h1}}\right) \times out_{h1}(1 - out_{h1}) \times x_1$$

We repeat this process for all the remaining weights; a full sketch appears after this list.
- After that, we will again propagate forward and calculate the output, and then again calculate the error.
- If the error is minimum, we will stop right there; else we will again propagate backwards and update the weight values.
- This process will keep on repeating until the error becomes minimum.
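Putting the whole procedure together, here is a minimal, self-contained Python sketch of this training loop for the network above. The targets, learning rate, and stopping threshold are assumptions, as noted earlier, and the biases are kept fixed for simplicity, matching the walkthrough:

```python
from math import exp

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

# Values from the example; targets and learning rate are assumed
x1, x2 = 0.05, 0.10
b1, b2 = 0.35, 0.6
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
t1, t2 = 0.01, 0.99   # hypothetical targets
eta = 0.5             # assumed learning rate

for epoch in range(10000):
    # Step 1: forward propagation
    out_h1 = sigmoid(w1 * x1 + w2 * x2 + b1)
    out_h2 = sigmoid(w3 * x1 + w4 * x2 + b1)
    out_y1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
    out_y2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)
    E_total = 0.5 * (t1 - out_y1) ** 2 + 0.5 * (t2 - out_y2) ** 2
    if E_total < 1e-4:   # stop once the error is small enough (arbitrary threshold)
        break

    # Step 2: backward propagation
    # "Delta" of each output neuron: dE/dnet = -(target - out) * out * (1 - out)
    d_y1 = -(t1 - out_y1) * out_y1 * (1 - out_y1)
    d_y2 = -(t2 - out_y2) * out_y2 * (1 - out_y2)

    # Gradients for the output-layer weights
    g5, g6 = d_y1 * out_h1, d_y1 * out_h2
    g7, g8 = d_y2 * out_h1, d_y2 * out_h2

    # "Delta" of each hidden neuron: sum the error flowing back from y1 and y2
    d_h1 = (d_y1 * w5 + d_y2 * w7) * out_h1 * (1 - out_h1)
    d_h2 = (d_y1 * w6 + d_y2 * w8) * out_h2 * (1 - out_h2)

    # Gradients for the hidden-layer weights
    g1, g2 = d_h1 * x1, d_h1 * x2
    g3, g4 = d_h2 * x1, d_h2 * x2

    # Step 3: gradient-descent updates
    w1, w2, w3, w4 = w1 - eta * g1, w2 - eta * g2, w3 - eta * g3, w4 - eta * g4
    w5, w6, w7, w8 = w5 - eta * g5, w6 - eta * g6, w7 - eta * g7, w8 - eta * g8

print(epoch, E_total)  # iterations used and the final error
```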