I have to train a neural network that approximates the function mapping three inputs onto a single output, with sigmoid activation in the hidden layer and tanh in the output layer.
The data consists of 8 rows of input-output pairs ((X, Y, Z), SUM), where X, Y and Z are the inputs and SUM is the output.
The values of X, Y and Z each fall in a different, arbitrary range. I am stuck deciding between normalization and/or standardization. I have gone through some resources, but the answers I found were all given in the context of clustering and image classification.
Which should I choose? And whichever I pick, should the scaling be computed globally over all the data (X, Y, Z, SUM) or separately for each variable? Also, if I standardize, I'll have to de-standardize the network's output at the end. Isn't that unusual?
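
To make the question concrete, here is a minimal sketch of what I mean by scaling each variable separately and then de-standardizing the output. The data values are made up, the assumption that SUM equals X + Y + Z is only for illustration, and the helper names (`fit_scaler`, `standardize`, `destandardize`) are hypothetical, not from any library.

```python
from statistics import mean, pstdev

# Hypothetical data: 8 rows of (X, Y, Z), each variable in a different range.
inputs = [
    (2.0, 150.0, 0.5), (4.0, 300.0, 0.1), (6.0, 450.0, 0.9), (8.0, 600.0, 0.3),
    (1.0, 750.0, 0.7), (3.0, 900.0, 0.2), (5.0, 200.0, 0.8), (7.0, 500.0, 0.4),
]
# Assumption for illustration only: the target is the plain sum of the inputs.
targets = [x + y + z for x, y, z in inputs]


def fit_scaler(values):
    """Return the (mean, population standard deviation) of one variable."""
    return mean(values), pstdev(values)


def standardize(values, mu, sigma):
    """Map values to zero mean and unit variance."""
    return [(v - mu) / sigma for v in values]


def destandardize(values, mu, sigma):
    """Invert standardize(): map scaled values back to the original units."""
    return [v * sigma + mu for v in values]


# Per-variable scaling: one (mean, std) pair for each of X, Y and Z.
columns = list(zip(*inputs))
scalers = [fit_scaler(col) for col in columns]
scaled_columns = [standardize(col, mu, s) for col, (mu, s) in zip(columns, scalers)]
scaled_inputs = list(zip(*scaled_columns))  # back to 8 rows of 3 values

# The target gets its own scaler, kept around for the inverse step.
target_mu, target_sigma = fit_scaler(targets)
scaled_targets = standardize(targets, target_mu, target_sigma)

# After training, predictions would be in scaled units; here the scaled
# targets stand in for predictions to show that the round trip is exact.
recovered = destandardize(scaled_targets, target_mu, target_sigma)
```

One thing I noticed while writing this: a tanh output can only produce values in (-1, 1), and standardized targets can fall outside that range, so scaling SUM into that interval may be the better fit for the output side.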