
I'm implementing a version of the Fuzzy C-Means algorithm in Java, and I'm trying to optimize it by computing everything that only needs to be computed once exactly once.

This is an iterative algorithm, and the matrix being updated is U, the pixels x clusters membership matrix (the values in each row must sum to 1.0). This is the update rule I want to optimize:

$$u_{ij} = \left[\,\sum_{k=1}^{c}\left(\frac{\lVert x_i - v_j\rVert}{\lVert x_i - v_k\rVert}\right)^{\frac{2}{m-1}}\right]^{-1}$$

where $x_i$ is a row of the matrix X (pixels x features), $v_j$ and $v_k$ are rows of the matrix V (clusters x features), m is a fuzziness parameter ranging from 1.1 to infinity, and c is the number of clusters. The distance used is the Euclidean norm; since the matrix D below stores the squared distances, the exponent 2/(m - 1) on the distances becomes 1/(m - 1) on the entries of D.

If I had to implement this formula in a banal way I'd do:

    for (int i = 0; i < X.length; i++)
    {
        for (int j = 0; j < V.length; j++)
        {
            double num = D[i][j];               // squared distance ||x_i - v_j||^2
            double sumTerms = 0;
            for (int k = 0; k < V.length; k++)
            {
                double thisDistance = D[i][k];  // squared distance ||x_i - v_k||^2
                sumTerms += Math.pow(num / thisDistance, 1.0 / (m - 1.0));
            }
            U[i][j] = (float) (1f / sumTerms);
        }
    }

In this way some optimization is already done: I precomputed all the possible squared distances between X and V and stored them in a matrix D. But that is not enough, since I'm cycling through the elements of V twice, resulting in two nested loops. Looking at the formula, the numerator of the fraction is independent of the sum, so I can compute the numerator and denominator independently, and the denominator only needs to be computed once per pixel.
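Explicitly, writing $d_{ij} = D[i][j]$ for the squared distance and $e = \frac{1}{m-1}$, the factoring is pure algebra:

$$\frac{1}{\sum_{k=1}^{c}\left(\frac{d_{ij}}{d_{ik}}\right)^{e}} \;=\; \frac{1}{d_{ij}^{\,e}\cdot\sum_{k=1}^{c} d_{ik}^{\,-e}}$$

So I came to a solution like this: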

    int nClusters = V.length;
    double exp = 1.0 / (m - 1.0);
    for (int i = 0; i < X.length; i++)
    {
        // the denominator is the same for every j, so read it once per pixel
        double denominator = D[i][nClusters];
        for (int j = 0; j < nClusters; j++)
        {
            double numerator = Math.pow(D[i][j], exp);
            U[i][j] = (float) (1f / (numerator * denominator));
        }
    }

Where I precomputed the denominator into an additional column of the matrix D while I was computing the distances:

    for (int i = 0; i < X.length; i++)
    {
        for (int j = 0; j < V.length; j++)
        {
            double sum = 0;
            for (int k = 0; k < nDims; k++)
            {
                final double d = X[i][k] - V[j][k];
                sum += d * d;              // squared Euclidean distance
            }

            D[i][j] = sum;
            // accumulate the denominator in the extra last column
            D[i][nClusters] += Math.pow(1 / D[i][j], exp);
        }
    }

By doing so I encounter numerical differences between the 'banal' computation and the second one, and they lead to different values in U (not in the first iterations, but soon enough). My guess is that raising very small numbers to high powers (the elements of U range from 0.0 to 1.0, and exp, for m = 1.1, is 10) yields values too small to be represented accurately, whereas dividing the numerator by the denominator and THEN exponentiating the ratio seems better numerically. The problem is that it involves many more operations.
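For reference, this is a variant I sketched that keeps the single pass but works with comparable magnitudes (same `D`, `U`, `nClusters` and `exp` as above): it does one `Math.pow` per entry and normalizes each row directly, which also enforces the row-sum-1 constraint up to rounding. I have not benchmarked or fully verified it:

    double[] w = new double[nClusters];          // w[k] = D[i][k]^(-exp)
    for (int i = 0; i < X.length; i++)
    {
        double rowSum = 0;
        for (int k = 0; k < nClusters; k++)
        {
            w[k] = Math.pow(D[i][k], -exp);      // one pow per matrix entry
            rowSum += w[k];
        }
        for (int j = 0; j < nClusters; j++)
        {
            U[i][j] = (float) (w[j] / rowSum);   // each row sums to 1 by construction
        }
    }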

UPDATE

Some values I get at ITERATION 0:

This is the first row of the matrix D not optimized:

384.6632 44482.727 17379.088 1245.4205

This is the first row of the matrix D the optimized way (note that the last value is the precomputed denominator):

384.6657 44482.7215 17379.0847 1245.4225 1.4098E-26

This is the first row of the U not optimized:

0.99999213 2.3382613E-21 2.8218658E-17 7.900302E-6

This is the first row of the U optimized:

0.9999921 2.338395E-21 2.822035E-17 7.900674E-6

ITERATION 1:

This is the first row of the matrix D not optimized:

414.3861 44469.39 17300.092 1197.7633

This is the first row of the matrix D the optimized way (note that the last value is the precomputed denominator):

414.3880 44469.38 17300.090 1197.7657 2.0796E-26

This is the first row of the U not optimized:

0.99997544 4.9366603E-21 6.216704E-17 2.4565863E-5

This is the first row of the U optimized:

0.3220644 1.5900239E-21 2.0023086E-17 7.912171E-6

The last set of values shows that they are very different due to a propagated error (I still hope I'm making some mistake somewhere), and even the constraint that the values in a row must sum to 1.0 is violated.

Am I doing something wrong? Is there a way to keep the code both optimized and numerically stable? Any suggestions or criticism will be appreciated.

  • Quick comment before looking at this, your formula has xj-vi and xj-vk, but your banal algorithm uses D[i][j] and D[i][k] instead. Am I missing something or should the second be D[j][k]? – aepryus Jan 09 '11 at 14:03
  • You are correct, they are different but it does not influence the result. I will use a different formula to be more precise, thank you for pointing it out – rano Jan 09 '11 at 14:10
  • Not trying to be annoying, just trying to make sure I understand, along the lines of the previous comment: the above formula is squaring the sums; the banal code does not seem to be doing so; the optimized code seems to square the denominator, not the numerator. The optimized code is using V[j][k] instead of V[i][k]. I'm guessing c in your formula represents the size of V; it might also be good to range from k=0 instead of k=1, just for clarity. What is B and B.length, and why are you using +=? – aepryus Jan 09 '11 at 14:51
  • Numerical differences? do you have example values? Reordering floating point operations almost always causes "small" differences. – josefx Jan 09 '11 at 15:59
  • I'm seeing the same things that aepryus is seeing, but ultimately your question is about the math, not about the algorithm. My first thought is that you're running up against the ISO Decimal Math problem. Check it with java.util.BigDecimal -- it'll kill your performance but you should see numeric stability -- and if it does you're going to have to start weighing tradeoffs. – lscoughlin Jan 09 '11 at 16:06
  • @aepryus the banal code already uses the `D` matrix, which contains the squared distances. In the optimized part the denominator is precomputed into D's additional last column. You are right about k starting from 1 in the formula and about c being the number of clusters; I'm correcting it – rano Jan 09 '11 at 16:15
  • @josefx: I will post some values in the question – rano Jan 09 '11 at 16:16
  • @lscoughlin I need performance, or I would have left the inner cycle (I feel I will in the end : D) – rano Jan 09 '11 at 16:17
  • Note that you cannot treat floating-point operations as if you were dealing with normal real numbers. Especially in extreme situations you get huge differences (which is why math libraries often have more than one way of calculating a function and choose the appropriate one based on the input values). If you naïvely assume that basic mathematical properties hold, then yes, you're quite likely to get different results. – Joey Jan 09 '11 at 16:44
  • @Joey you're quite right; I'm trying to figure out an alternative way to reach an optimization, or the correct way (assuming I'm doing something wrong). For an image of 512x512 pixels and 12 clusters, X has 262144 rows and V has 12; for each of the 262144x12 elements I do not want to cycle 12 times. – rano Jan 09 '11 at 17:05
  • It turned out the error was mine, so the optimization can actually be applied using floating-point arithmetic. I thank everyone for the time spent ^^ – rano Jan 10 '11 at 11:00

1 Answer


This problem has nothing to do with floating point stability.

You are getting wrong denominator values on the second and subsequent iterations because you forget to clear the accumulator cell before summing into it again.

The right denominator for iteration 1 is 6.697905e-27, which is almost exactly 2.0796E-26 - 1.4098E-26: the stale sum from iteration 0 is still sitting in the cell.
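A minimal sketch of the fix, reusing the names from the question (with `nClusters = V.length` as the index of the extra column): reset the accumulator cell before summing each row's distances.

    for (int i = 0; i < X.length; i++)
    {
        D[i][nClusters] = 0;    // clear the stale denominator from the previous iteration
        for (int j = 0; j < nClusters; j++)
        {
            double sum = 0;
            for (int k = 0; k < nDims; k++)
            {
                final double d = X[i][k] - V[j][k];
                sum += d * d;
            }
            D[i][j] = sum;
            D[i][nClusters] += Math.pow(1 / D[i][j], exp);
        }
    }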

  • I feel so dumb now. I should have arrived at that conclusion myself. Thank you for spending your time on this – rano Jan 10 '11 at 00:02