I am trying to build a back-propagation neural network, based on the tutorials I found here: MSDN article by James McCaffrey. He gives many examples, but all of his networks are built to solve the same problem, so they look like 4:7:3 >> 4 input - 7 hidden - 3 output.
His output is always binary, 0 or 1: exactly one output gets a 1, to classify an iris flower into one of the three categories.
I would like to solve a different problem with a neural network, and that would require two networks: one that needs an output between 0 and 255, and another between 0 and 2*Pi (a full turn of a circle). Essentially I think I need an output that ranges from 0.0 to 1.0, or from -1 to 1, and anything in between, so that I can scale it to 0..255 or 0..2*Pi.
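To make that concrete, the kind of scaling I have in mind is roughly this (just a sketch; ScaleOutput is my own helper name, not something from his code):

// hypothetical helper: map a raw activation in [0, 1] onto a target range
private static double ScaleOutput(double y01, double min, double max)
{
    return min + y01 * (max - min);
}

// e.g. double angle = ScaleOutput(yValues[0], 0.0, 2.0 * Math.PI); // 0..2*Pi
//      double level = ScaleOutput(yValues[0], 0.0, 255.0);         // 0..255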
I think his network behaves the way it does because of his ComputeOutputs, which I show below:
private double[] ComputeOutputs(double[] xValues)
{
    if (xValues.Length != numInput)
        throw new Exception("Bad xValues array length");

    double[] hSums = new double[numHidden]; // hidden nodes sums scratch array
    double[] oSums = new double[numOutput]; // output nodes sums

    for (int i = 0; i < xValues.Length; ++i) // copy x-values to inputs
        this.inputs[i] = xValues[i];

    for (int j = 0; j < numHidden; ++j)   // compute i-h sum of weights * inputs
        for (int i = 0; i < numInput; ++i)
            hSums[j] += this.inputs[i] * this.ihWeights[i][j]; // note +=

    for (int i = 0; i < numHidden; ++i)   // add biases to input-to-hidden sums
        hSums[i] += this.hBiases[i];

    for (int i = 0; i < numHidden; ++i)   // apply activation
        this.hOutputs[i] = HyperTanFunction(hSums[i]); // hard-coded

    for (int j = 0; j < numOutput; ++j)   // compute h-o sum of weights * hOutputs
        for (int i = 0; i < numHidden; ++i)
            oSums[j] += hOutputs[i] * hoWeights[i][j];

    for (int i = 0; i < numOutput; ++i)   // add biases to hidden-to-output sums
        oSums[i] += oBiases[i];

    double[] softOut = Softmax(oSums); // softmax activation does all outputs at once for efficiency
    Array.Copy(softOut, outputs, softOut.Length);

    double[] retResult = new double[numOutput]; // could define a GetOutputs method instead
    Array.Copy(this.outputs, retResult, retResult.Length);
    return retResult;
}
The network uses the following HyperTan function:
private static double HyperTanFunction(double x)
{
    if (x < -20.0) return -1.0; // approximation is correct to 30 decimals
    else if (x > 20.0) return 1.0;
    else return Math.Tanh(x);
}
The function above uses Softmax() for the output layer, and I think that is critical to the problem here: I believe it is what makes his output essentially binary. It looks like this:
private static double[] Softmax(double[] oSums)
{
    // determine max output sum
    // does all output nodes at once so scale doesn't have to be re-computed each time
    double max = oSums[0];
    for (int i = 0; i < oSums.Length; ++i)
        if (oSums[i] > max) max = oSums[i];

    // determine scaling factor -- sum of exp(each val - max)
    double scale = 0.0;
    for (int i = 0; i < oSums.Length; ++i)
        scale += Math.Exp(oSums[i] - max);

    double[] result = new double[oSums.Length];
    for (int i = 0; i < oSums.Length; ++i)
        result[i] = Math.Exp(oSums[i] - max) / scale;

    return result; // now scaled so that xi sum to 1.0
}
How can I rewrite Softmax so that the network is able to give non-binary answers?
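The rough idea I had (untested, and Sigmoid is my own name, not part of his code) is to squash each output node independently with a logistic sigmoid instead of Softmax, though I am not sure what else (the training / error calculation) would have to change:

// hypothetical replacement: squash each output node independently into (0, 1)
private static double Sigmoid(double x)
{
    if (x < -45.0) return 0.0;      // guard against overflow in Math.Exp
    else if (x > 45.0) return 1.0;
    else return 1.0 / (1.0 + Math.Exp(-x));
}

// then in ComputeOutputs, instead of:
//     double[] softOut = Softmax(oSums);
//     Array.Copy(softOut, outputs, softOut.Length);
// something like:
//     for (int i = 0; i < numOutput; ++i)
//         this.outputs[i] = Sigmoid(oSums[i]);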
Note that the full code of the network is here, if you would like to try it out.
Also, to test the network the following Accuracy function is used; maybe the binary behaviour emerges from it:
public double Accuracy(double[][] testData)
{
    // percentage correct using winner-takes-all
    int numCorrect = 0;
    int numWrong = 0;
    double[] xValues = new double[numInput];  // inputs
    double[] tValues = new double[numOutput]; // targets
    double[] yValues;                         // computed Y

    for (int i = 0; i < testData.Length; ++i)
    {
        Array.Copy(testData[i], xValues, numInput); // parse test data into x-values and t-values
        Array.Copy(testData[i], numInput, tValues, 0, numOutput);
        yValues = this.ComputeOutputs(xValues);
        int maxIndex = MaxIndex(yValues); // which cell in yValues has largest value?
        int tMaxIndex = MaxIndex(tValues);
        if (maxIndex == tMaxIndex)
            ++numCorrect;
        else
            ++numWrong;
    }
    return (numCorrect * 1.0) / (double)testData.Length;
}
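If the outputs become continuous, I suppose the winner-takes-all check above no longer makes sense either, and I would measure something like a mean squared error instead (again just my own rough sketch; MeanSquaredError is not part of his code):

// hypothetical alternative to Accuracy for continuous targets:
// average squared difference between computed and target outputs
public double MeanSquaredError(double[][] testData)
{
    double sumSquaredError = 0.0;
    double[] xValues = new double[numInput];  // inputs
    double[] tValues = new double[numOutput]; // targets

    for (int i = 0; i < testData.Length; ++i)
    {
        Array.Copy(testData[i], xValues, numInput);
        Array.Copy(testData[i], numInput, tValues, 0, numOutput);
        double[] yValues = this.ComputeOutputs(xValues);
        for (int j = 0; j < numOutput; ++j)
        {
            double err = tValues[j] - yValues[j];
            sumSquaredError += err * err;
        }
    }
    return sumSquaredError / (testData.Length * numOutput);
}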