
I am trying to make a back-propagation neural network, based on the tutorials I found here: MSDN article by James McCaffrey. He gives many examples, but all his networks are built around the same kind of problem, so they look like 4:7:3 >> 4 input - 7 hidden - 3 output.

His output is always binary, 0 or 1; exactly one output gets a 1, to classify an iris flower into one of the three categories.

I would like to solve another problem with a neural network, and that would require two networks: one needs an output between 0..255, and the other an output between 0 and 2 times Pi (a full turn of a circle). Essentially, I think I need an output that ranges from 0.0 to 1.0, or from -1 to 1, with anything in between, so that I can scale it to become 0..255 or 0..2Pi.
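The scaling itself would just be multiplication and shifting; something like this sketch (the names are my own, not from the tutorial code):

```csharp
using System;

class ScaleDemo
{
    // Map a sigmoid-style output in [0, 1] to [0, max], e.g. max = 255.
    static double Scale01(double y, double max)
    {
        return y * max;
    }

    // Map a tanh-style output in [-1, 1] to [0, max], e.g. max = 2 * Math.PI.
    static double ScaleTanh(double y, double max)
    {
        return (y + 1.0) / 2.0 * max;
    }

    static void Main()
    {
        Console.WriteLine(Scale01(0.5, 255.0));         // half of the byte range
        Console.WriteLine(ScaleTanh(0.0, 2 * Math.PI)); // half a turn
    }
}
```

So the real question is how to get the network to emit a continuous value in the first place.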

I think his network behaves the way it does because of his ComputeOutputs method, which I show below:

    private double[] ComputeOutputs(double[] xValues)
    {
        if (xValues.Length != numInput)
            throw new Exception("Bad xValues array length");

        double[] hSums = new double[numHidden]; // hidden node sums scratch array
        double[] oSums = new double[numOutput]; // output node sums

        for (int i = 0; i < xValues.Length; ++i) // copy x-values to inputs
            this.inputs[i] = xValues[i];

        for (int j = 0; j < numHidden; ++j)  // compute i-h sum of weights * inputs
            for (int i = 0; i < numInput; ++i)
                hSums[j] += this.inputs[i] * this.ihWeights[i][j]; // note +=

        for (int i = 0; i < numHidden; ++i)  // add biases to input-to-hidden sums
            hSums[i] += this.hBiases[i];

        for (int i = 0; i < numHidden; ++i)  // apply activation
            this.hOutputs[i] = HyperTanFunction(hSums[i]); // hard-coded

        for (int j = 0; j < numOutput; ++j)  // compute h-o sum of weights * hOutputs
            for (int i = 0; i < numHidden; ++i)
                oSums[j] += hOutputs[i] * hoWeights[i][j];

        for (int i = 0; i < numOutput; ++i)  // add biases to hidden-to-output sums
            oSums[i] += oBiases[i];

        double[] softOut = Softmax(oSums); // softmax activation does all outputs at once for efficiency
        Array.Copy(softOut, outputs, softOut.Length);

        double[] retResult = new double[numOutput]; // could define a GetOutputs method instead
        Array.Copy(this.outputs, retResult, retResult.Length);
        return retResult;
    }
The network uses the following HyperTanFunction:

    private static double HyperTanFunction(double x)
    {
        if (x < -20.0) return -1.0; // approximation is correct to 30 decimals
        else if (x > 20.0) return 1.0;
        else return Math.Tanh(x);
    }

In the code above, the output layer makes use of Softmax(), and I think that is critical to the problem here, in that I think it is what makes his output binary. It looks like this:

    private static double[] Softmax(double[] oSums)
    {
        // determine max output sum
        // does all output nodes at once so scale doesn't have to be re-computed each time
        double max = oSums[0];
        for (int i = 0; i < oSums.Length; ++i)
            if (oSums[i] > max) max = oSums[i];

        // determine scaling factor -- sum of exp(each val - max)
        double scale = 0.0;
        for (int i = 0; i < oSums.Length; ++i)
            scale += Math.Exp(oSums[i] - max);

        double[] result = new double[oSums.Length];
        for (int i = 0; i < oSums.Length; ++i)
            result[i] = Math.Exp(oSums[i] - max) / scale;

        return result; // now scaled so that the xi sum to 1.0
    }

How can I rewrite Softmax so that the network is able to give non-binary answers?

Notice the full code of the network is here, if you would like to try it out.

Also, to test the network, the following accuracy function is used; maybe the binary behaviour emerges from it:

    public double Accuracy(double[][] testData)
    {
        // percentage correct using winner-takes-all
        int numCorrect = 0;
        int numWrong = 0;
        double[] xValues = new double[numInput]; // inputs
        double[] tValues = new double[numOutput]; // targets
        double[] yValues; // computed Y

        for (int i = 0; i < testData.Length; ++i)
        {
            Array.Copy(testData[i], xValues, numInput); // parse test data into x-values and t-values
            Array.Copy(testData[i], numInput, tValues, 0, numOutput);
            yValues = this.ComputeOutputs(xValues);
            int maxIndex = MaxIndex(yValues); // which cell in yValues has largest value?
            int tMaxIndex = MaxIndex(tValues);
            if (maxIndex == tMaxIndex)
                ++numCorrect;
            else
                ++numWrong;
        }
        return (numCorrect * 1.0) / testData.Length;
    }
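To illustrate where the winner-takes-all step throws away the continuous values (MaxIndex below is my own re-creation for this sketch, since that helper isn't shown above):

```csharp
using System;

class WinnerTakesAllDemo
{
    // Index of the largest value -- this is where continuous outputs
    // collapse into a single class choice.
    static int MaxIndex(double[] v)
    {
        int best = 0;
        for (int i = 1; i < v.Length; ++i)
            if (v[i] > v[best]) best = i;
        return best;
    }

    static void Main()
    {
        double[] yValues = { 0.28, 0.29, 0.43 }; // continuous softmax-style outputs
        Console.WriteLine(MaxIndex(yValues));    // prints 2: only the winner matters
    }
}
```

So even if the network emits fractional values, the scoring only ever compares them.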
Uwe Keim
Peter
  • This is way too broad and high-level for SO. You need to ask this in a Computer Science forum. – Enigmativity Jun 02 '17 at 08:02
  • I think you need to use a Kohonen network for your problem http://home.agh.edu.pl/~vlsi/AI/koho_t/index_eng.html – Luke Jun 02 '17 at 08:17
  • My Neural Network knowledge is a bit rusty, but I just tried your code (as best as I could work out) and I am not getting binary values out: `Results: 0.28 0.29 0.43`. I don't think the problem lies with the Softmax function. – Ben Jun 02 '17 at 08:21
  • @Enigmativity I disagree - SO has a neural-network tag with 3.5k followers. This question has research and shows what the OP has tried already. The question is not too broad - it is a "what have I done wrong" question which is fine for SO. – Ben Jun 02 '17 at 08:26
  • Softmax is a common way of converting the output to a probability, the result will always be an array of probabilities (between 0 and 1) that sums to 1. In a classification problem, it represents the probability that the image is in that class – c2huc2hu Jun 15 '17 at 20:38

2 Answers


Just in case someone gets into the same situation: if you need some example code, this kind of network is called a neural network regression (an NNR).

Here is a link to sample code in C#, and here is a good article about it. Notice the author writes more articles there; you won't find everything, but there's a lot. Despite following this man for a while, I missed this specific article, as I didn't know what these networks were called when I asked the question here on Stack Overflow.

ROMANIA_engineer
Peter

I'm a bit rusty at neural networks, but I think that if you want a range of values from your output, then you need to make sure the activation functions on your output layer are linear (or something that has a similar effect).

Try adding this method:

    private static double[] Linear(double[] oSums)
    {
        // note: requires "using System.Linq;" for the Sum() extension method
        double sum = oSums.Sum(d => Math.Abs(d));

        double[] result = new double[oSums.Length];
        for (int i = 0; i < oSums.Length; ++i)
            result[i] = Math.Abs(oSums[i]) / sum;

        // scaled so that the xi sum to 1.0
        return result;
    }

And then in the ComputeOutputs method you need to use this new activation function for the output layer (rather than Softmax):

    ...
    //double[] softOut = Softmax(oSums); // all outputs at once for efficiency
    double[] softOut = Linear(oSums); // all outputs at once for efficiency
    Array.Copy(softOut, outputs, softOut.Length);
    ...

This should now output linear values.
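If you don't need the outputs to sum to 1 (for regression targets like 0..255 you probably don't), an even simpler option might be a pure identity activation. This sketch is a guess on my part, not from the linked code, and the back-propagation code would need the matching derivative (which for the identity function is just 1):

```csharp
using System;

class IdentityDemo
{
    // Identity (pure linear) activation: pass the weighted sums straight through.
    // Unlike the normalized Linear() above, this does not force outputs to sum to 1.
    static double[] Identity(double[] oSums)
    {
        double[] result = new double[oSums.Length];
        Array.Copy(oSums, result, oSums.Length);
        return result;
    }

    static void Main()
    {
        double[] outputs = Identity(new double[] { 1.5, -0.25, 3.0 });
        Console.WriteLine(string.Join(" ", outputs)); // values pass through unchanged
    }
}
```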

Ben
  • I updated the question to also show the HyperTan function. I read your code to try to understand why that network (see download URL in the question) only outputs binary values. Could it be caused by the line `int maxIndex = MaxIndex(yValues); // which cell in yValues has largest value?` in the accuracy function, then? (You can download the code, but I will also provide that function in the question as well.) – Peter Jun 02 '17 at 09:43
  • @user3800527 looking into this further - I think that standard neural networks are designed to output binary values. If you haven't seen it already then the answers to [this post](https://stackoverflow.com/q/1523420/340045) are quite interesting. I think the answer comes down to changing the activation functions on the nodes in the output layer. – Ben Jun 02 '17 at 11:30
  • @user3800527 check out [this post also](https://stats.stackexchange.com/q/218542). It seems that while SoftMax might be a good choice for classification, a linear function would be better for you (but only for the output layer). – Ben Jun 02 '17 at 11:33
  • @user3800527 I have edited my answer to show you how I added a linear activation function to your code. – Ben Jun 02 '17 at 12:31
  • I can't wait to test it out, but I need some free time at home to do so. From how I understand it, you do understand my problem, so lots of thanks already; I hope to code it within a few days (but family takes time too). – Peter Jun 02 '17 at 14:32
  • Sorry it took a while to test. I did test it and it doesn't seem to work, though I'm not really sure about it. I made a test data set of RGB colors to let the neural net convert them into HSL, with layers 3:11:3, and the scoring didn't go beyond 60%. Or is that too complex a task, maybe? – Peter Jun 08 '17 at 21:29
  • @user3800527 my linear function above was a bit of a guess. I am not sure if I should have used the Math.Abs() calls but I was trying to get the output to sum to 1. I am also not sure if the teaching mechanism (back-propagation) works the same for nodes with a linear activation function. You might want to check these things. I only wanted to point you in the right direction (e.g. using a linear activation function) in order to get a variable output (rather than a binary output). – Ben Jun 09 '17 at 07:50
  • Ben, I've dived a bit deeper too; it might be that such networks can have only a single output neuron. As for your other comment, I don't know. I think I need to come up with a different test/training set for it. But I still need a linear answer instead of a binary one. – Peter Jun 09 '17 at 09:20
  • Well, I found an answer: using another neural network. I found a regression example and it worked. Also, in general it seems they usually have only 1 output, so my HSL test wasn't good either. – Peter Jun 12 '17 at 09:05
  • I also have a task to understand this code myself, so good timing. As far as I can tell, Softmax alone is not enough, because the rest of the code makes assumptions based on the fact that Softmax returns a sum of no more than 1. Other areas do 1 - (), for example, from this assumption. – cineam mispelt Jun 14 '17 at 08:22
  • @cineammispelt I just posted the answer; I should have done that earlier, but I was quite busy at work. Notice also that NNRs usually have 1 output. – Peter Jun 16 '17 at 10:12