
I have a little question.

What would be the best way to train a neural network with big numbers (>1), for example:

double[][] input = {{10, 100, 1000}};
double[][] desiredOutput = {{5000}};

(not really any sense behind this, just e.g.)

Because normal neurons can only output values between -1 and 1, the net won't be able to output 5000. Would it make sense to divide the values at the beginning and multiply them again at the end?

double[][] input = {{10, 100, 1000}};   --> {{0.001, 0.01, 0.1}}  (divide by 10'000)
double[][] desiredOutput = {{5000}};    --> {{0.5}}               (divide by 10'000; multiply the network's output by 10'000 at the end)
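
In code, the idea would look roughly like this (net.train and net.predict are just placeholders for whatever network API is actually used):

double SCALE = 10_000.0;

double[] input = {10, 100, 1000};
double desired = 5_000.0;

// Scale everything down into the range the neurons can handle.
double[] scaledInput = new double[input.length];
for (int i = 0; i < input.length; i++) {
    scaledInput[i] = input[i] / SCALE;   // {0.001, 0.01, 0.1}
}
double scaledDesired = desired / SCALE;  // 0.5

// net.train(scaledInput, scaledDesired);

// After training, scale the prediction back up:
// double prediction = net.predict(scaledInput) * SCALE;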

Is there a better or more usual way?

3 Answers


If the numbers are positive and differ by multiple orders of magnitude, as in your example, logarithmic scaling probably makes sense. Otherwise the output computed by the neural network will be dominated by the large inputs, while modifications of the smaller inputs will have little effect.

This is probably not what you want, because for most applications relative changes are what matters. If you change an input from 1 to 2 (a 100% increase), you probably expect a larger effect on the output than when changing 1000 to 1001 (0.1%), although the absolute difference is the same.

This can be avoided by logarithmic scaling.

Example: To transform the range from 1 to 10000 to a range from 0 to 1, you can use this formula (log10 maps [1, 10000] onto [0, 4], so dividing by 4 gives [0, 1]):

transformedInput = Math.log10(input) / 4.0;

To transform the output back to the original range, use exponentiation:

output = Math.pow(10.0, 4.0 * output);
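
Putting both formulas together, a small self-contained sketch (class and method names are mine, just for illustration):

public class LogScaling {

    // Map [1, 10000] -> [0, 1]: log10 gives [0, 4], dividing by 4 gives [0, 1].
    static double toLogScale(double input) {
        return Math.log10(input) / 4.0;
    }

    // Inverse: map a network output in [0, 1] back to [1, 10000].
    static double fromLogScale(double output) {
        return Math.pow(10.0, 4.0 * output);
    }

    public static void main(String[] args) {
        for (double x : new double[] {10, 100, 1000}) {  // the values from the question
            double t = toLogScale(x);
            System.out.printf("%.0f -> %.3f -> %.0f%n", x, t, fromLogScale(t));
        }
        // Prints: 10 -> 0.250 -> 10, 100 -> 0.500 -> 100, 1000 -> 0.750 -> 1000
    }
}

Note how equal relative changes in the input (10 -> 100 -> 1000) become equal absolute steps (0.25) in the transformed range.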
Frank Puffer

It sounds like you want to use the network for regression. In that case, it makes sense to use a linear activation function for your output layer. The reason is that sigmoidal functions can't output values outside their specified range, as you mention. It will probably also help to center and normalize your inputs.
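
For illustration, centering/normalizing the inputs and keeping the output linear could look like this (these helpers are hypothetical, not part of any specific library):

// z-score standardization: center each feature at 0 with unit variance.
static double[] standardize(double[] x) {
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= x.length;
    double var = 0.0;
    for (double v : x) var += (v - mean) * (v - mean);
    double std = Math.sqrt(var / x.length);
    double[] z = new double[x.length];
    for (int i = 0; i < x.length; i++) {
        z[i] = (x[i] - mean) / std;
    }
    return z;
}

// Output activation for regression: the identity function, so the net
// can produce any real value (e.g. 5000), not just values in [-1, 1].
static double linearActivation(double weightedSum) {
    return weightedSum;
}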

user20160

What you are asking about is called normalization, and yes, data should be normalized to the range [0, 1] or [-1, 1] before feeding it to your network.

The results you get from the network will also be scaled to the same range, but this does not mean that the outputs should be rescaled with the same coefficients that were used to normalize the inputs.

Output values may have a very different meaning than the inputs, and generally there is no reason to scale them the same way as the inputs. See also Why do we have to normalize the input for an artificial neural network?
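
A sketch of what separate scales for inputs and outputs means in practice (min-max normalization; the ranges below are just the ones from the question):

// Maps [min, max] -> [0, 1]
static double normalize(double v, double min, double max) {
    return (v - min) / (max - min);
}

// Maps [0, 1] -> [min, max]
static double denormalize(double v, double min, double max) {
    return v * (max - min) + min;
}

// Inputs use their own observed range...
double in = normalize(100, 10, 1000);            // ~0.09
// ...while the target uses its own, e.g. [0, 5000]:
double target = normalize(5000, 0, 5000);        // 1.0
// After prediction, invert with the output's scale only:
double prediction = denormalize(0.9, 0, 5000);   // 4500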

Artur Opalinski