10

I have to store the product of several probabilty values that are really low (for example, 1E-80). Using the primitive java double would result in zero because of the underflow. I don't want the value to go to zero because later on there will be a larger number (for example, 1E100) that will bring the values within the range that the double can handle.

So, I created a different class (MyDouble) myself that works on saving the base part and the exponent parts. When doing calculations, for example multiplication, I multiply the base parts, and add the exponents.

The program is fast with the primitive double type. However, when I use my own class (MyDouble) the program is really slow. I think this is because of the new objects that I have to create each time to create simple operations and the garbage collector has to do a lot of work when the objects are no longer needed.

My question is, is there a better way you think I can solve this problem? If not, is there a way so that I can speedup the program with my own class (MyDouble)?

[Note: taking the log and later taking the exponent does not solve my problem]

MyDouble class:

public class MyDouble {
    public MyDouble(double base, int power){
    this.base = base;
    this.power = power;
    }

    public static MyDouble multiply(double... values) {
    MyDouble returnMyDouble = new MyDouble(0);
    double prodBase = 1;
    int prodPower = 0;
    for( double val : values) {
            MyDouble ad = new MyDouble(val);
            prodBase *= ad.base;
            prodPower += ad.power;
        }   
        String newBaseString = "" + prodBase;
        String[] splitted = newBaseString.split("E");   
        double newBase = 0; int newPower = 0;
        if(splitted.length == 2) {
            newBase = Double.parseDouble(splitted[0]);
            newPower = Integer.parseInt(splitted[1]);
        } else {
            newBase = Double.parseDouble(splitted[0]);
            newPower = 0;
        }
        returnMyDouble.base = newBase;
        returnMyDouble.power = newPower + prodPower;        
        return returnMyDouble;
    }
}
zachjs
  • 1,738
  • 1
  • 11
  • 22
Rex Roy
  • 125
  • 1
  • 6
  • 5
    Why aren't you using [BigDecimal](http://docs.oracle.com/javase/1.5.0/docs/api/java/math/BigDecimal.html)? – DaoWen Oct 10 '12 at 04:27
  • see http://stackoverflow.com/questions/277309/java-floating-point-high-precision-library – Alexei Kaigorodov Oct 10 '12 at 04:52
  • Correct me if I'm wrong, but wouldn't it have the same problem? double e1 = 1E-309; double e2 = 1E-300; BigDecimal bd = new BigDecimal(e1 * e2); bd = bd.multiply(new BigDecimal(1E300)); System.out.println(bd.doubleValue()); //gives 0 – Rex Roy Oct 10 '12 at 05:07
  • 1
    @Rex Roy - You need to use the `add()` and `multiply()` methods on `BigDecimal`, not just wrap your double-arithmetic in a `BigDecimal` constructor. – DaoWen Oct 10 '12 at 14:50
  • You mentioned that taking the logs and summing them doesn't work for you, can you explain why? – Mathias Oct 14 '12 at 20:29

6 Answers6

5

The way this is solved is to work in log space---it trivialises the problem. When you say it doesn't work, can you give specific details of why? Probability underflow is a common issue in probabilistic models, and I don't think I've ever known it solved any other way.

Recall that log(a*b) is just log(a) + log(b). Similarly log(a/b) is log(a) - log(b). I assume since you're working with probabilities its multiplication and division that are causing the underflow issues; the drawback of log space is that you need to use special routines to calculate log(a+b), which I can direct you to if this is your issue.

So the simple answer is, work in log space, and re-exponentiate at the end to get a human-readable number.

Ben Allison
  • 7,244
  • 1
  • 15
  • 24
  • Well I was thinking I couldn't use log because my terms also involved sums of the exponents. I had the use the logsumexp trick. I finally ended up using everything in the log space because my own implementation, although worked, was really slow. – Rex Roy Oct 30 '12 at 22:12
  • Yeah, most things can be made to work in log space, but for certain operations (sum being the obvious problem) it's non-trivial. – Ben Allison Oct 31 '12 at 14:26
2

You trying to parse strings each time you doing multiply. Why don't you calculate all values into some structure like real and exponential part as pre-calculation step and then create algorithms for multiplication, adding, subdivision, power and other.

Also you could add flag for big/small numbers. I think you will not use both 1e100 and 1e-100 in one calculation (so you could simplify some calculations) and you could improve calculation time for different pairs (large, large), (small, small), (large, small).

2

You can use

BigDecimal bd = BigDecimal.ONE.scaleByPowerOfTen(-309)
        .multiply(BigDecimal.ONE.scaleByPowerOfTen(-300))
        .multiply(BigDecimal.ONE.scaleByPowerOfTen(300));
System.out.println(bd);

prints

1E-309

Or if you use a log10 scale

double d = -309 + -300 + 300;
System.out.println("1E"+d);

prints

1E-309.0
Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
1

I'm sure this will be a good deal slower than a double, but probably a large contributing factor would be the String manipulation. Could you get rid of that and calculate the power through arithmetic instead? Even recursive or iterative arithmetic might be faster than converting to String to grab bits of the number.

Ricky Clarkson
  • 2,909
  • 1
  • 19
  • 21
1

In a performance heavy application, you want to find a way to store basic information in primitives. In this case, perhaps you can split the bytes of a long or other variable in so that a fixed portion is the base.

Then, you can create custom methods the multiply long or Long as if they were a double. You grab the bits representing the base and exp, and truncate accordingly.

In some sense, you're re-inventing the wheel here, since you want byte code that efficiently performs the operation you're looking for.

edit:

If you want to stick with two variables, you can modify your code to simply take an array, which will be much lighter than objects. Additionally, you need to remove calls to any string parsing functions. Those are extremely slow.

Arcymag
  • 1,037
  • 1
  • 8
  • 18
1

Slowness might be because of the intermediate string objects which are created in split and string concats.

Try this:

/**
 * value = base * 10 ^ power.
 */

public class MyDouble {

    // Threshold values to determine whether given double is too small or not. 
private static final double SMALL_EPSILON = 1e-8;
private static final double SMALL_EPSILON_MULTIPLIER = 1e8;
private static final int    SMALL_EPSILON_POWER = 8;

private double myBase;
private int    myPower;

public MyDouble(double base, int power){
    myBase  = base;
    myPower = power;
}

public MyDouble(double base) 
{
    myBase  = base;
    myPower = 0;
    adjustPower();
}

/**
 * If base value is too small, increase the base by multiplying with some number and 
 * decrease the power accordingly. 
 * <p> E.g 0.000 000 000 001 * 10^1  => 0.0001 * 10^8  
 */
private void adjustPower()
{
    // Increase the base & decrease the power 
    // if given double value is less than threshold.
    if (myBase < SMALL_EPSILON) {
        myBase = myBase * SMALL_EPSILON_MULTIPLIER;
        myPower -= SMALL_EPSILON_POWER;
    }
}

/**
 * This method multiplies given double and updates this object.
 */
public void multiply(MyDouble d)
{
    myBase  *= d.myBase;
    myPower += d.myPower;
    adjustPower();
}

/**
 * This method multiplies given primitive double value with this object and update the 
 * base and power.
 */
public void multiply(double d)
{
    multiply(new MyDouble(d));
}

@Override
public String toString()
{
    return "Base:" + myBase + ", Power=" + myPower;
}

/**
 * This method multiplies given double values and returns MyDouble object.
 * It make sure that too small double values do not zero out the multiplication result. 
 */
public static MyDouble multiply(double...values) 
{
    MyDouble result = new MyDouble(1);
    for (int i=0; i<values.length; i++) {
        result.multiply(values[i]);
    }
    return result;
}

public static void main(String[] args) {
    MyDouble r = MyDouble.multiply(1e-80, 1e100);
    System.out.println(r);
}

}

If this is still slow for your purpose, you can modify multiply() method to directly operate on primitive double instead of creating a MyDouble object.

Htaras
  • 849
  • 1
  • 7
  • 17