I have a dataset stored as an ArrayList of ArrayLists of doubles in Java. I want to normalize the data before clustering it; the normalization function I am trying to follow is described in the linked post, "normalization for data cluster".

input = {{-1.3, 2.4, 5.3, 2.1, 0.7}, {6.4, -3.3, 1.9, 4.1, 0.3}}

Below is the code I've tried so far, but I'm not sure it is the right normalization method:

public void getMinMax(){
    // One minimum and one maximum per inner list (row).
    min = new double[input.size()];
    max = new double[input.size()];
    for(int i=0; i < input.size(); i++){
        // Start from the extremes so all-positive or all-negative rows are handled.
        min[i] = Double.POSITIVE_INFINITY;
        max[i] = Double.NEGATIVE_INFINITY;
        for(int j = 0; j < input.get(i).size(); j++){
            double value = input.get(i).get(j);
            if(value > max[i]){
                max[i] = value;
            }
            if(value < min[i]){
                min[i] = value;
            }
        }
    }
}

private void normalizeMaxMin() {
    for(int i=0; i < input.size(); i++){
        for(int j = 0; j < input.get(i).size(); j++){
            input.get(i).set(j, (input.get(i).get(j) - min[i]) / (max[i] - min[i]));
        }
    }
} 
Xavi Jimenez
Idham Choudry
  • I've edited my question – Idham Choudry Apr 19 '16 at 06:16
  • You need to define "normalize the data for data clustering". We can't guess what that means. And anyway, if you're not sure, why don't you test your code and check it does what you intend it to do? – JB Nizet Apr 19 '16 at 06:19
  • @JBNizet I've edited my question – Idham Choudry Apr 19 '16 at 06:22
  • You should rephrase this so that it ends with the question you'd like to get answered. Now it's unclear what you're asking. – iwein Apr 19 '16 at 06:27
  • I agree with @MichaelDibbets: The code is ugly. Consider a method `static void normalize(List<Double> list) { ... }` that normalizes a *single* list, and call this method with each `input.get(i)`. It will be more readable, more flexible and more efficient. Apart from that, there is no "right" normalization. The most common ones in Data Mining are "Min-Max" and "Z-Score" (also see http://math.stackexchange.com/questions/362918/value-range-of-normalization-methods-min-max-z-score-decimal-scaling ). Yours is Min-Max, and it looks correct at first glance, but I didn't test/verify it. – Marco13 Apr 19 '16 at 08:59
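
A minimal sketch of the single-list approach suggested in the last comment, covering both Min-Max and Z-Score scaling. The class name `ListNormalizer` and the method names `minMax` and `zScore` are hypothetical, and two choices here are assumptions rather than anything stated in the thread: a list whose values are all equal is mapped to 0.0, and the Z-Score uses the population standard deviation (dividing by n).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; names are illustrative only.
final class ListNormalizer {

    // Min-max scaling of one list, in place: x becomes (x - min) / (max - min).
    // Assumption: a list whose values are all equal is mapped to 0.0 to avoid 0/0.
    static void minMax(List<Double> list) {
        if (list.isEmpty()) {
            return;
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double x : list) {
            min = Math.min(min, x);
            max = Math.max(max, x);
        }
        double range = max - min;
        for (int j = 0; j < list.size(); j++) {
            list.set(j, range == 0.0 ? 0.0 : (list.get(j) - min) / range);
        }
    }

    // Z-score scaling of one list, in place: x becomes (x - mean) / stdDev.
    // Assumption: population standard deviation (divide by n, not n - 1).
    static void zScore(List<Double> list) {
        int n = list.size();
        if (n == 0) {
            return;
        }
        double sum = 0.0;
        for (double x : list) {
            sum += x;
        }
        double mean = sum / n;
        double squaredDiffs = 0.0;
        for (double x : list) {
            squaredDiffs += (x - mean) * (x - mean);
        }
        double stdDev = Math.sqrt(squaredDiffs / n);
        for (int j = 0; j < n; j++) {
            list.set(j, stdDev == 0.0 ? 0.0 : (list.get(j) - mean) / stdDev);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data, normalized row by row.
        List<List<Double>> input = new ArrayList<>();
        input.add(new ArrayList<>(Arrays.asList(-1.0, 0.0, 3.0)));
        input.add(new ArrayList<>(Arrays.asList(2.0, 4.0, 6.0)));
        for (List<Double> row : input) {
            minMax(row);
        }
        System.out.println(input); // prints [[0.0, 0.25, 1.0], [0.0, 0.5, 1.0]]
    }
}
```

Calling the helper once per `input.get(i)` keeps the per-row minimum and maximum local to the method, so no separate `min` and `max` arrays are needed.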

1 Answer

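This is my method. It uses one global minimum and maximum for the whole array and makes two passes over the data, so it is O(n^2) for an n×n array, but it works nevertheless.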

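public static double[][] normalize(double[][] arr){
    // Start from the extremes. Double.MIN_VALUE is the smallest positive double,
    // so it must not be used as the initial maximum.
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;

    // First pass: find the global minimum and maximum.
    for (int i = 0; i < arr.length; i++) {
        for (int j = 0; j < arr[i].length; j++) {
            max = Math.max(arr[i][j], max);
            min = Math.min(arr[i][j], min);
        }
    }

    // Second pass: scale every value into [0, 1], in place.
    for (int i = 0; i < arr.length; i++) {
        for (int j = 0; j < arr[i].length; j++) {
            arr[i][j] = (arr[i][j] - min) / (max - min);
        }
    }

    return arr;
}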
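
Note that the method above overwrites `arr` in place, and that when every value is equal, `max - min` is 0 and the division yields NaN. A possible variant that returns a new array and guards that case is sketched below. The class name `GlobalMinMax` and the method name `normalizedCopy` are hypothetical, and mapping constant data to 0.0 is an assumption, not something the answer specifies.

```java
import java.util.Arrays;

// Hypothetical class name, for illustration only.
final class GlobalMinMax {

    // Returns a new array scaled with one global min and max; the input is not modified.
    // Assumption: if every value is equal (max == min), all outputs are set to 0.0.
    static double[][] normalizedCopy(double[][] arr) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] row : arr) {
            for (double x : row) {
                min = Math.min(min, x);
                max = Math.max(max, x);
            }
        }
        double range = max - min;
        double[][] out = new double[arr.length][];
        for (int i = 0; i < arr.length; i++) {
            out[i] = new double[arr[i].length];
            for (int j = 0; j < arr[i].length; j++) {
                out[i][j] = range == 0.0 ? 0.0 : (arr[i][j] - min) / range;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data: global min is -2.0, global max is 6.0.
        double[][] data = {{-2.0, 0.0}, {2.0, 6.0}};
        System.out.println(Arrays.deepToString(normalizedCopy(data)));
        // prints [[0.0, 0.25], [0.5, 1.0]]
    }
}
```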
Gregory