
I need to write an Encog program in Java, based on the XOR example, that takes strings (words with definitions) as inputs, but BasicMLDataSet only accepts doubles. Here is the sample code I am using:

/**
 * The input necessary for XOR.
 */
public static double XOR_INPUT[][] = { { 0.0, 0.0 }, { 1.0, 0.0 },
        { 0.0, 1.0 }, { 1.0, 1.0 } };

/**
 * The ideal data necessary for XOR.
 */
public static double XOR_IDEAL[][] = { { 0.0 }, { 1.0 }, { 1.0 }, { 0.0 } };

And here is the class that receives XOR_INPUT and XOR_IDEAL:

MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);

The code is from the Encog XOR example.

Is there any way to train with strings, or to convert them to doubles somehow and then convert the results back to strings before writing them to the console?

Tomislav Brabec

2 Answers


I found a workaround for this. Since I can only provide double values between 0 and 1 as inputs, and I haven't found any Encog function that normalizes strings to doubles directly, I wrote my own. I take the ASCII value of every letter in the word and compute 90/asciiValue to get a value between 0 and 1. Keep in mind that this only works for lowercase letters (ASCII 97 to 122), because codes of 90 or below give values of 1 or more. The function can easily be extended to support uppercase letters as well. Here is the function:

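    // Converts every letter in the string to its ASCII code and normalizes it (90 / asciiValue).
    // najveci ("largest") is presumably the length of the longest word; shorter words are padded with 0.0.
    // Requires: import java.nio.charset.StandardCharsets;
    public static double[] toAscii(String s, int najveci) {
        double[] ascii = new double[najveci];
        // StandardCharsets.US_ASCII never throws, so no try/catch is needed.
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        // Stop at najveci so a longer word cannot overflow the array.
        for (int i = 0; i < Math.min(bytes.length, najveci); i++) {
            ascii[i] = 90.0 / bytes[i];
        }
        return ascii;
    }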

For the ideal output of each word I use a similar approach: I normalize each letter in the word, but then take the average of those values. Later, I denormalize the values to get strings back and check how well the model trained.
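As a rough illustration of the per-letter round trip described above, here is a minimal sketch in plain Java. It assumes lowercase ASCII words and uses 0.0 as padding; the class and method names (`LetterCodec`, `encode`, `decode`) are made up for this example and are not part of Encog. It does not cover the averaged ideal output, since an average of several letters cannot be turned back into the individual letters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are illustrative and not part of Encog.
class LetterCodec {
    // Scaling constant taken from the answer: value = 90 / asciiCode.
    static final double SCALE = 90.0;

    // Encode each lowercase ASCII letter as 90 / code; unused slots stay 0.0.
    static double[] encode(String word, int length) {
        byte[] bytes = word.getBytes(StandardCharsets.US_ASCII);
        double[] out = new double[length];
        for (int i = 0; i < Math.min(bytes.length, length); i++) {
            out[i] = SCALE / bytes[i];
        }
        return out;
    }

    // Invert the encoding: code = round(90 / value).
    // Assumption: the first non-positive value marks the start of padding.
    static String decode(double[] values) {
        StringBuilder sb = new StringBuilder();
        for (double v : values) {
            if (v <= 0.0) {
                break;
            }
            sb.append((char) Math.round(SCALE / v));
        }
        return sb.toString();
    }
}
```

For example, `LetterCodec.decode(LetterCodec.encode("cat", 5))` gives back `"cat"`. Real network outputs will be noisy, so the rounding step only recovers the right letter when the output is close enough to the encoded value.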

You can view full code here.

Tomislav Brabec

You can use Encog's EncogAnalyst and AnalystWizard to normalize your data. This posting by @JeffHeaton (the author of Encog) shows an example using .csv files.

These classes can normalize both numeric and "nominal" data (e.g. the strings you want to use.) You will likely want to use the "Equilateral" normalization for these strings, as this will avoid some training issues with Neural Networks.
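To give an idea of what equilateral encoding does, here is a hand-rolled sketch in plain Java. It is not Encog's own implementation; the class and method names (`EquilateralSketch`, `build`, `decode`) are made up for illustration. Each of n words gets a vector of n-1 doubles in the range -1 to 1, and all vectors are the same distance from each other, so no word looks "closer" to another. Decoding picks the word whose vector is nearest to the network output.

```java
// Hand-rolled sketch of equilateral encoding; names are illustrative, not Encog's API.
class EquilateralSketch {
    // Build n vectors of n-1 dimensions that are all equally far apart.
    static double[][] build(int n) {
        if (n < 2) {
            throw new IllegalArgumentException("need at least 2 classes");
        }
        double[][] m = new double[n][n - 1];
        m[0][0] = -1.0;
        m[1][0] = 1.0;
        for (int k = 2; k < n; k++) {
            // Shrink the vectors built so far to make room for a new dimension.
            double f = Math.sqrt(k * k - 1.0) / k;
            for (int i = 0; i < k; i++) {
                for (int j = 0; j < k - 1; j++) {
                    m[i][j] *= f;
                }
            }
            // Existing vectors get -1/k in the new dimension, the new vector gets 1.
            for (int i = 0; i < k; i++) {
                m[i][k - 1] = -1.0 / k;
            }
            m[k][k - 1] = 1.0;
        }
        return m;
    }

    // Euclidean distance between two vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sum);
    }

    // Return the index of the code vector closest to the given output.
    static int decode(double[][] codes, double[] output) {
        int best = 0;
        for (int i = 1; i < codes.length; i++) {
            if (distance(codes[i], output) < distance(codes[best], output)) {
                best = i;
            }
        }
        return best;
    }
}
```

With a word list such as `String[] words = {"cat", "dog", "bird", "fish"}`, the row `build(4)[i]` would be the input or ideal vector for `words[i]`, and `words[decode(codes, output)]` turns a network output back into a string.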

You might also want to check out this tutorial on Encog on PluralSight which has an entire section on Normalization.

Here is an example from the Encog documentation that shows how to normalize a field using code (without a .csv file):

var fuelStats = new NormalizedField(NormalizationAction.Normalize, "fuel", 200, 0, -0.9, 0.9);

For the above example the range is normalized to -0.9 to 0.9. This is very similar to normalizing between -1 and 1, but less extreme. This can produce better results at times. It is also known that the acceptable range for fuel is between 0 and 200. Now that the field object has been created, it is easy to normalize the values. Here the value 100 is normalized into the variable n.

double n = fuelStats.Normalize(100);

To denormalize n back to the original fuel value, use the following code:

double f = fuelStats.Denormalize(n);
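The arithmetic behind this kind of range normalization is a simple linear mapping, which can be sketched in plain Java without Encog. This is only an illustration of the formula, not the library's class; `RangeNormalizer` and its parameter names are made up for the example.

```java
// Plain-Java sketch of linear range normalization; not Encog's NormalizedField.
class RangeNormalizer {
    private final double actualLow;
    private final double actualHigh;
    private final double normalizedLow;
    private final double normalizedHigh;

    RangeNormalizer(double actualLow, double actualHigh,
                    double normalizedLow, double normalizedHigh) {
        this.actualLow = actualLow;
        this.actualHigh = actualHigh;
        this.normalizedLow = normalizedLow;
        this.normalizedHigh = normalizedHigh;
    }

    // Map a value from [actualLow, actualHigh] to [normalizedLow, normalizedHigh].
    double normalize(double x) {
        return (x - actualLow) / (actualHigh - actualLow)
                * (normalizedHigh - normalizedLow) + normalizedLow;
    }

    // Map a normalized value back to the original range.
    double denormalize(double n) {
        return (n - normalizedLow) / (normalizedHigh - normalizedLow)
                * (actualHigh - actualLow) + actualLow;
    }
}
```

With the fuel range from the quoted example, `new RangeNormalizer(0, 200, -0.9, 0.9).normalize(100)` lands in the middle of the target range, and `denormalize` maps that value back to 100.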

Greg Thatcher