
Back Propagation

What is Backpropagation?

Backpropagation is a supervised learning algorithm for training Multi-layer Perceptrons (Artificial Neural Networks). It refers to the method of calculating the gradient of the neural network's parameters: the method traverses the network in reverse order, from the output layer to the input layer, applying the chain rule from calculus. It searches for the minimum of the error function in weight space using a technique called the delta rule, or gradient descent. The weights that minimize the error function are then considered a solution to the learning problem.

Why do we need Backpropagation?

When training a neural network, we initially assign random values to the weights and biases. These random weights are unlikely to fit the model well, so the output of our model may be very different from the actual output, resulting in high error values. We therefore need to tune those weights to reduce the error, i.e., guide the model to change the weights in a way that makes the error as small as possible.

Figure: Weights vs. Error

Working of Backpropagation

Let us consider the neural network below:

Figure: the example neural network

Values:

  • x1= 0.05, x2= 0.10
  • b1= 0.35, b2= 0.6
  • w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30
  • w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55
  • Target outputs: T1 = 0.01, T2 = 0.99

The above neural network contains the following:

  • One Input Layer
    • Two Input Neurons
  • One Hidden Layer
    • Two Hidden Neurons
  • One Output Layer
    • Two Output Neurons
  • Two Biases

Following are the steps for the weight update using Backpropagation:

Note: We will be using the Sigmoid Activation Function:

$$\text{Sigmoid}(x) = \frac{1}{1 + e^{-x}}$$
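
As a quick reference, here is a minimal Python sketch of the sigmoid and the form of its derivative we will use during backpropagation; the function names are my own, not from the original post:

```python
import math

def sigmoid(x):
    """Sigmoid activation: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative_from_output(out):
    """Derivative of the sigmoid expressed via its own output:
    sigma'(x) = sigma(x) * (1 - sigma(x))."""
    return out * (1.0 - out)
```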

  • Step 1: Forward Propagation
    • Net Input for h1:
      $h_1 = x_1 w_1 + x_2 w_2 + b_1$
      $h_1 = 0.05 \times 0.15 + 0.10 \times 0.20 + 0.35$
      $h_1 = 0.3775$

    • Output of h1:
      $\text{out}_{h1} = \frac{1}{1 + e^{-h_1}}$
      $\text{out}_{h1} = \frac{1}{1 + e^{-0.3775}}$
      $\text{out}_{h1} = 0.5932$

    • Similarly, Output of h2:
      $\text{out}_{h2} = 0.5968$

    Repeat the process for the output layer neurons, using the output from the hidden layer as input for the output neurons.

    • Net Input for y1:
      $y_1 = \text{out}_{h1} w_5 + \text{out}_{h2} w_6 + b_2$
      $y_1 = 0.5932 \times 0.40 + 0.5968 \times 0.45 + 0.6$
      $y_1 = 1.1059$

    • Output of y1:
      $\text{out}_{y1} = \frac{1}{1 + e^{-y_1}}$
      $\text{out}_{y1} = \frac{1}{1 + e^{-1.1059}}$
      $\text{out}_{y1} = 0.7513$

    • Similarly, Output of y2:
      $\text{out}_{y2} = 0.7729$

    Calculating Total Error (a short Python sketch after this step reproduces these numbers):

    • $E_{total} = E_1 + E_2$
    • $E_{total} = \sum \frac{1}{2}(\text{target} - \text{output})^2$
      $E_{total} = \frac{1}{2}(T_1 - \text{out}_{y1})^2 + \frac{1}{2}(T_2 - \text{out}_{y2})^2$
      $E_{total} = \frac{1}{2}(0.01 - 0.7513)^2 + \frac{1}{2}(0.99 - 0.7729)^2$
      $E_{total} = 0.29837$
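
To make the arithmetic easy to check, here is a minimal Python sketch of the forward pass above. The variable names mirror the figure; this is my own illustration under the example's values, not code from the original post:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Inputs, biases, weights, and targets from the example
x1, x2 = 0.05, 0.10
b1, b2 = 0.35, 0.60
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
T1, T2 = 0.01, 0.99

# Hidden layer: net inputs, then sigmoid outputs
h1 = x1 * w1 + x2 * w2 + b1                 # 0.3775
h2 = x1 * w3 + x2 * w4 + b1
out_h1, out_h2 = sigmoid(h1), sigmoid(h2)   # ≈ 0.5932, 0.5968

# Output layer: net inputs, then sigmoid outputs
y1 = out_h1 * w5 + out_h2 * w6 + b2         # ≈ 1.1059
y2 = out_h1 * w7 + out_h2 * w8 + b2
out_y1, out_y2 = sigmoid(y1), sigmoid(y2)   # ≈ 0.7513, 0.7729

# Total squared error over both output neurons
E_total = 0.5 * (T1 - out_y1) ** 2 + 0.5 * (T2 - out_y2) ** 2
print(round(E_total, 5))                    # ≈ 0.29837
```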
  • Step 2: Backward Propagation
    • Now, we will reduce the error by updating the values of the weights and biases using back-propagation. To update the weights, let us consider w5, for which we will calculate the rate of change of error w.r.t. the change in weight w5:

      $\text{Error at } w_5 = \frac{\partial E_{total}}{\partial w_5}$

      Now, by the chain rule,

      $\frac{\partial E_{total}}{\partial w_5} = \frac{\partial E_{total}}{\partial \text{out}_{y1}} \cdot \frac{\partial \text{out}_{y1}}{\partial y_1} \cdot \frac{\partial y_1}{\partial w_5}$


      Since we are propagating backwards, the first thing we need to do is calculate the change in the total error w.r.t. the outputs y1 and y2:

      $E_{total} = \frac{1}{2}(T_1 - \text{out}_{y1})^2 + \frac{1}{2}(T_2 - \text{out}_{y2})^2$
      $\frac{\partial E_{total}}{\partial \text{out}_{y1}} = \frac{1}{2} \cdot 2 \cdot (T_1 - \text{out}_{y1})^{2-1} \cdot (0 - 1) + 0$
      $\frac{\partial E_{total}}{\partial \text{out}_{y1}} = -(T_1 - \text{out}_{y1})$
      $\frac{\partial E_{total}}{\partial \text{out}_{y1}} = -T_1 + \text{out}_{y1} = -0.01 + 0.7513 = 0.7413$

      Now, we will propagate further backwards and calculate the change in the output y1 w.r.t. its net input:

      $\frac{\partial \text{out}_{y1}}{\partial y_1} = \text{out}_{y1}(1 - \text{out}_{y1})$
      $\frac{\partial \text{out}_{y1}}{\partial y_1} = 0.7513 \times (1 - 0.7513) = 0.1868$

      Now, we will see how much y1 changes w.r.t. a change in w5:

      $\frac{\partial y_1}{\partial w_5} = 1 \cdot \text{out}_{h1} \cdot w_5^{1-1} + 0 + 0$
      $\frac{\partial y_1}{\partial w_5} = \text{out}_{h1}$
      $\frac{\partial y_1}{\partial w_5} = 0.5932$
  • Step 3: Putting all the values together and calculating the updated weight value.
    • Now, putting all the values together:
    $\frac{\partial E_{total}}{\partial w_5} = \frac{\partial E_{total}}{\partial \text{out}_{y1}} \cdot \frac{\partial \text{out}_{y1}}{\partial y_1} \cdot \frac{\partial y_1}{\partial w_5} = 0.7413 \times 0.1868 \times 0.5932 = 0.0821$
    • Now, updating w5 (with learning rate $\eta = 0.5$):

      $w_5^{new} = w_5 - \eta \cdot \frac{\partial E_{total}}{\partial w_5}$
      $w_5^{new} = 0.4 - 0.5 \times 0.0821$
      $w_5^{new} = 0.3589$
    • Similarly, we can calculate the other output-layer weight values as well (the sketch following this list of updates verifies them):

      $w_6^{new} = 0.4087$

      $w_7^{new} = 0.5113$

      $w_8^{new} = 0.5614$
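
Continuing the forward-pass sketch from Step 1 (it assumes out_h1, out_h2, out_y1, out_y2, the targets, and the weights are already defined as above), the output-layer updates can be verified like this:

```python
eta = 0.5  # learning rate used in the example

# "Delta" terms for the output neurons:
# dE_total/dout_y * dout_y/dy, with the sigmoid derivative
# written in terms of the neuron's output
delta_y1 = -(T1 - out_y1) * out_y1 * (1 - out_y1)  # ≈  0.1385
delta_y2 = -(T2 - out_y2) * out_y2 * (1 - out_y2)  # ≈ -0.0381

# Gradient of the error w.r.t. each output-layer weight
# is delta times the hidden output feeding that weight
w5_new = w5 - eta * delta_y1 * out_h1  # ≈ 0.3589
w6_new = w6 - eta * delta_y1 * out_h2  # ≈ 0.4087
w7_new = w7 - eta * delta_y2 * out_h1  # ≈ 0.5113
w8_new = w8 - eta * delta_y2 * out_h2  # ≈ 0.5614
```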

    • Now, at the hidden layer, updating w1, w2, w3, and w4:

      $\frac{\partial E_{total}}{\partial w_1} = \frac{\partial E_{total}}{\partial \text{out}_{h1}} \cdot \frac{\partial \text{out}_{h1}}{\partial h_1} \cdot \frac{\partial h_1}{\partial w_1}$

      where $\frac{\partial E_{total}}{\partial \text{out}_{h1}} = \frac{\partial E_1}{\partial \text{out}_{h1}} + \frac{\partial E_2}{\partial \text{out}_{h1}}$

      and $\frac{\partial E_1}{\partial \text{out}_{h1}} = \frac{\partial E_1}{\partial y_1} \cdot \frac{\partial y_1}{\partial \text{out}_{h1}}$

      with

      $\frac{\partial E_1}{\partial y_1} = \frac{\partial E_1}{\partial \text{out}_{y1}} \cdot \frac{\partial \text{out}_{y1}}{\partial y_1} = 0.7413 \times 0.1868 = 0.1385$

      $\frac{\partial y_1}{\partial \text{out}_{h1}} = w_5 = 0.40$

      so $\frac{\partial E_1}{\partial \text{out}_{h1}} = 0.1385 \times 0.40 = 0.0554$

      Using the above values, we can calculate $\frac{\partial E_1}{\partial \text{out}_{h1}}$ and similarly $\frac{\partial E_2}{\partial \text{out}_{h1}}$, which in turn give us $\frac{\partial E_{total}}{\partial \text{out}_{h1}}$. Similarly, calculate the values of $\frac{\partial \text{out}_{h1}}{\partial h_1}$ and $\frac{\partial h_1}{\partial w_1}$ to get the change of error w.r.t. the change in weight w1. We repeat this process for all the remaining weights.

    • After that, we will again propagate forward, calculate the output, and calculate the error once more.
    • If the error is at a minimum, we stop right there; otherwise, we again propagate backwards and update the weight values.
    • This process keeps repeating until the error becomes minimal. A compact sketch of one full training loop follows below.
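
To tie the whole procedure together, here is a self-contained Python sketch of the training loop over this exact network. The loop structure and the stopping threshold are my own illustration of the steps above, not code from the original post:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Example values from the post
x1, x2 = 0.05, 0.10
b1, b2 = 0.35, 0.60
w = {1: 0.15, 2: 0.20, 3: 0.25, 4: 0.30,
     5: 0.40, 6: 0.45, 7: 0.50, 8: 0.55}
T1, T2 = 0.01, 0.99
eta = 0.5

for step in range(10000):
    # Forward propagation
    out_h1 = sigmoid(x1 * w[1] + x2 * w[2] + b1)
    out_h2 = sigmoid(x1 * w[3] + x2 * w[4] + b1)
    out_y1 = sigmoid(out_h1 * w[5] + out_h2 * w[6] + b2)
    out_y2 = sigmoid(out_h1 * w[7] + out_h2 * w[8] + b2)
    E_total = 0.5 * (T1 - out_y1) ** 2 + 0.5 * (T2 - out_y2) ** 2
    if E_total < 1e-5:   # assumed "minimum error" threshold
        break

    # Backward propagation: output-layer deltas
    d_y1 = -(T1 - out_y1) * out_y1 * (1 - out_y1)
    d_y2 = -(T2 - out_y2) * out_y2 * (1 - out_y2)

    # Hidden-layer deltas: dE_total/dout_h sums contributions from
    # both output neurons, then multiplies by the sigmoid derivative
    d_h1 = (d_y1 * w[5] + d_y2 * w[7]) * out_h1 * (1 - out_h1)
    d_h2 = (d_y1 * w[6] + d_y2 * w[8]) * out_h2 * (1 - out_h2)

    # Gradient descent updates (all gradients use the old weights)
    w[5] -= eta * d_y1 * out_h1
    w[6] -= eta * d_y1 * out_h2
    w[7] -= eta * d_y2 * out_h1
    w[8] -= eta * d_y2 * out_h2
    w[1] -= eta * d_h1 * x1
    w[2] -= eta * d_h1 * x2
    w[3] -= eta * d_h2 * x1
    w[4] -= eta * d_h2 * x2

print(step, E_total)  # the error shrinks toward the threshold over iterations
```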
This post is licensed under CC BY 4.0 by the author.