Today, I want to discuss a very important algorithm in machine learning (you could call it the heart of deep learning): the ANN, or Artificial Neural Network.
Where did the concept come from?
On the left, let me show you a very simple picture of a human neuron. Dendrites receive electrical signals, which travel along the axon to the axon terminals.
From there, the signals are passed on to the dendrites of the next neuron.
On the right, you can see a simple picture of an ANN. The input nodes are similar to dendrites, and the axon is nothing but a function (in the language of ANNs, an activation function). The outputs are then sent to the next nodes (that is, the next layer), and so on.
If we break the concept of an ANN down further, each node is nothing but a simple regression (linear or logistic).
For inputs x1, x2, …, xn, the simple regression function looks like
Z = wx + b
where x is the vector of inputs (x1, x2, …, xn),
w is the vector of weights (w1, w2, …, wn), the parameters the model learns, and b is the bias.
Do you remember the simple regression formula?
Y = MX+c
This is the same function with different notation.
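Here is a minimal sketch of that transfer function in Python with NumPy. The input, weight, and bias values are made up purely for illustration.

```python
import numpy as np

# Hypothetical example values; any numbers would do.
x = np.array([1.0, 2.0, 3.0])    # inputs x1, x2, x3
w = np.array([0.5, -1.0, 2.0])   # weights w1, w2, w3
b = 0.5                          # bias

# Transfer function: weighted sum of the inputs plus the bias.
z = np.dot(w, x) + b
print(z)  # 5.0
```

With a single input this reduces to Y = MX + c, where w plays the role of M and b the role of c.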
This is fine for linear regression, but for a neural network we have to add more layers with more nodes. Do you think the problems we want to solve are linear? No!
They are non-linear, and stacking layers of the transfer function Z alone would still give a linear function. That is why we apply an activation function on top of Z. There are various activation functions, such as Sigmoid, ReLU, Tanh, Leaky ReLU, etc.
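To see why the activation function matters, here is a small sketch showing that two layers without any activation collapse into a single linear layer. The layer sizes, the random values, and the seed are assumptions chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, only for repeatability

# Hypothetical sizes: 3 inputs, 4 hidden nodes, 1 output, 5 samples.
X = rng.normal(size=(3, 5))
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=(4, 1))
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=(1, 1))

# Two layers with NO activation function in between.
two_layers = W2 @ (W1 @ X + b1) + b2

# The same result from ONE linear layer with merged parameters.
W = W2 @ W1
b = W2 @ b1 + b2
one_layer = W @ X + b

print(np.allclose(two_layers, one_layer))  # True
```

So without a non-linear activation, adding layers gives the network no extra expressive power.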
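For binary classification, the output layer usually uses the sigmoid function. For multi-class classification the usual output activation is softmax, and a regression output is typically left linear. In the hidden layers, other activation functions such as ReLU are preferred.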
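The activation functions named above can be sketched in a few lines of NumPy. This is a minimal illustration, not a library implementation. The slope alpha = 0.01 for Leaky ReLU is an assumed, commonly used default.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes any real number into the range (-1, 1).
    return np.tanh(z)

def relu(z):
    # Keeps positive values, replaces negative values with 0.
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Like ReLU, but negative values keep a small slope alpha (assumed 0.01).
    return np.where(z > 0, z, alpha * z)

z = np.array([-2.0, 0.0, 2.0])   # hypothetical Z values
print(sigmoid(z))
print(tanh(z))
print(relu(z))
print(leaky_relu(z))
```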
So the flowchart would be:
Z1 = W1.X + b1
A1 = g(Z1)
Z2 = W2.A1 + b2
A2 = g(Z2)
where g is the activation function.
For a 2-layer ANN, the final predicted value is yhat = A2.
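The whole forward pass of a 2-layer ANN can be sketched as below, with an activation applied to each Z to produce A1 and A2. The layer sizes, the random weights, and the choice of ReLU for the hidden layer and sigmoid for the output are all assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def forward(X, W1, b1, W2, b2):
    Z1 = W1 @ X + b1     # transfer function of layer 1
    A1 = relu(Z1)        # assumed hidden activation
    Z2 = W2 @ A1 + b2    # transfer function of layer 2
    A2 = sigmoid(Z2)     # assumed output activation (binary classification)
    return A2

rng = np.random.default_rng(0)   # fixed seed, only for repeatability

# Hypothetical sizes: 3 inputs, 4 hidden nodes, 1 output, 5 samples.
X = rng.normal(size=(3, 5))
W1 = rng.normal(size=(4, 3)) * 0.1
b1 = np.zeros((4, 1))
W2 = rng.normal(size=(1, 4)) * 0.1
b2 = np.zeros((1, 1))

yhat = forward(X, W1, b1, W2, b2)
print(yhat.shape)  # (1, 5)
```

Each column of X is one sample, so yhat holds one prediction per sample.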
And the loss function is built on the error (Y - A2), that is, how far the prediction A2 is from the true value Y.
Please note: if this is a binary classification problem, then the last activation function g(Z2) should be a sigmoid function.
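As a rough sketch of how the error (Y - A2) turns into a single loss number, the example below computes two common losses. Mean squared error and binary cross-entropy are my choices for illustration, not something fixed by this post. The labels and predictions are made-up values.

```python
import numpy as np

Y  = np.array([[1.0, 0.0, 1.0, 0.0]])   # hypothetical true labels
A2 = np.array([[0.9, 0.2, 0.6, 0.4]])   # hypothetical sigmoid outputs

# Raw error per sample.
error = Y - A2

# Mean squared error: average of the squared errors.
mse = np.mean(error ** 2)

# Binary cross-entropy, a common loss when A2 comes from a sigmoid.
# Clipping avoids log(0); eps is an assumed small constant.
eps = 1e-12
A2_safe = np.clip(A2, eps, 1 - eps)
bce = -np.mean(Y * np.log(A2_safe) + (1 - Y) * np.log(1 - A2_safe))

print(mse)
print(bce)
```

Both numbers shrink toward 0 as the predictions A2 move closer to Y.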
In my next post, I shall explain the activation functions in detail. Stay tuned.