I want to calculate the floating-point rounding error of a series of additions, multiplications, and divisions using the Math.ulp(double) method in Java. According to the wiki page on Unit in the Last place (ULP), it seems that the error from one floating-point calculation, say 2+3 or 2*3 would be 0.5*ulp(2+3) or 0.5*ulp(2*3), respectively, where 2*3 and 2+3 are the floating-point calculations. However, adding up these errors does not account for the actual error I get in the end product. Saying the maximum error, for example, of 2+3*4 = 0.5*ulp(2+[3*4]) + 0.5*ulp(3*4) does not seem to account for the actual error I get. Thus, I am confused, and perhaps I am misunderstanding Math.ulp(double) or maybe I need to use some kind of relative error. I don't know. Could anyone explain this to me and maybe give a few examples of addition, multiplication, and division with floating-point and exact numbers? It would be much appreciated.
I'm attempting to calculate the reduced row echelon form of a matrix for a Matrix class, and I need to know if, after a few calculations, certain items in the two dimensional array I'm using for the calculation are equal to 0. If a row is all zeroes, I exit the code. If it has a nonzero number in it, I divide that number by itself and then perform Gaussian elimination. The problem is that after performing a series of operations, floating-point error can creep in and calculations that should result in a zero end up as a nonzero number which then messes up my matrix calculation. Thus, I am trying to change the condition under which Gaussian elimination occurs from zero to less than a calculated error bound, and I am calculating the error bound for every item in the matrix based on the calculations done to that item, added together in a new error array. Here is my code:
/**
* Finds the reduced row echelon form of the matrix using partial pivoting
* @return rref: The reduced row echelon form of the matrix
*/
public Matrix rref()
{
//ref()
Matrix ref = copy();
int iPivot = 0, jPivot = 0, greatestPivotRow;
double[][] errorArray = new double[height][width];
while(iPivot < height && jPivot < width)
{
do
{
//Finds row with greatest absolute-value-of-a-number at the horizontal value of the pivot position
greatestPivotRow = iPivot;
for(int n = iPivot; n < height; n++)
{
if(Math.abs(ref.getVal(n, jPivot)) > Math.abs(ref.getVal(greatestPivotRow, jPivot)))
greatestPivotRow = n;
}
//Swaps row at pivot with that row if that number is not 0 (Or less than the floating-point error)
//If the largest number is 0, all numbers below in the column are 0, so jPivot increments and row swapper is repeated
if(Math.abs(ref.getVal(greatestPivotRow, jPivot)) > errorArray[greatestPivotRow][jPivot])
ref = ref.swapRows(iPivot, greatestPivotRow);
else
jPivot++;
}
while(jPivot < width && Math.abs(ref.getVal(greatestPivotRow, jPivot)) <= errorArray[greatestPivotRow][jPivot]);
if(jPivot < width)
{
//Pivot value becomes 1
double rowMultiplier1 = 1/ref.getVal(iPivot,jPivot);
for(int j = jPivot; j < width; j++)
{
ref.matrixArray[iPivot][j] = ref.getVal(iPivot,j) * rowMultiplier1;
errorArray[iPivot][j] += 0.5 * (Math.ulp(ref.matrixArray[iPivot][j]) + Math.ulp(rowMultiplier1));
}
//1st value in nth row becomes 0
for(int iTarget = iPivot + 1; iTarget < height; iTarget++)
{
double rowMultiplier0 = -ref.getVal(iTarget, jPivot)/ref.getVal(iPivot, jPivot);
for(int j = jPivot; j < width; j++)
{
errorArray[iTarget][j] += 0.5 * (Math.ulp(ref.getVal(iPivot, j) * rowMultiplier0) + Math.ulp(ref.getVal(iTarget, j)
+ ref.getVal(iPivot, j)*rowMultiplier0) + Math.ulp(rowMultiplier0));
ref.matrixArray[iTarget][j] = ref.getVal(iTarget, j)
+ ref.getVal(iPivot, j)*rowMultiplier0;
}
}
}
//Shifts pivot down 1 and to the right 1
iPivot++;
jPivot++;
}
//rref
Matrix rref = ref.copy();
iPivot = 1;
jPivot = 1;
//Moves pivot along the diagonal
while(iPivot < height && jPivot < width)
{
//Moves horizontal position of pivot to first nonzero number in the row (the 1)
int m = jPivot;
while(m < width && Math.abs(rref.getVal(iPivot, m)) < errorArray[iPivot][m])
m++;
if(m != width)
{
jPivot = m;
//1st value in rows above pivot become 0
for(int iTarget = 0; iTarget < iPivot; iTarget++)
{
double rowMultiplier = -rref.getVal(iTarget, jPivot)/rref.getVal(iPivot, jPivot);
for(int j = jPivot; j < width; j++)
{
errorArray[iTarget][j] += 0.5 * (Math.ulp(rref.getVal(iTarget, j) * rowMultiplier) + Math.ulp(rref.getVal(iTarget, j)
+ rref.getVal(iPivot, j)*rowMultiplier) + Math.ulp(rowMultiplier));
rref.matrixArray[iTarget][j] = rref.getVal(iTarget, j)
+ rref.getVal(iPivot, j)*rowMultiplier;
}
}
}
iPivot++;
jPivot++;
}
//Get rid of floating-point errors in integers
for(int i = 0; i < height; i++)
{
for(int j =0; j < width; j++)
{
if(Math.abs(rref.getVal(i, j) - (int)(rref.getVal(i, j) + 0.5)) <= errorArray[i][j])
rref.matrixArray[i][j] = (int)(rref.getVal(i, j) + 0.5);
}
}
return rref;
}
The last part of the code, converting floating-point numbers less than the calculated error away from an integer value to that integer value is mostly to tell me if my error formula has worked, since some of the matrices I am calculating end up with, instead of integers, 5.000000000000004s and the like. Therefore, I know if I have a number very close to an integer but not the integer, I also know that my error bounds are not large enough, and apparently they are not, so I think I am doing something incorrectly.
My input matrix was one with the instance variable
double[][] matrixArray = {{1,-2,0,0,3}, {2,-5,-3,-2,6}, {0,5,15,10,0}, {2,6,18,8,6}};
And my result was the array
[[1.0, 0.0, 0.0, -2.0000000000000013, 3.0], [0.0, 1.0, 0.0, -1.0000000000000004, 0.0], [0.0, 0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]]
Though my error calculations fixed the problem with zeroes being turned into ones and then used in Gaussian elimination, I still have numbers that are not integers, so I know my error bounds are inaccurate. It may have worked in this case, but might not in the next without the correct error bounds.