Non linear lut log precision

Question

Currently I calculate log as following:

#define MAXLOG 1001
double myloglut[MAXLOG];
void MyLogCreate()
{
    int i;
    double exp, expinc;
    expinc = (2.0 - 0.1) / MAXLOG;
    for (i = 0, exp = 0.1; i <= MAXLOG; ++i, exp += expinc)
            myloglut[i] = log(exp);
    myloglut[478] = 0; // this one need to be precise
}

double MyLog(double v)
{
    int idx = (int)((MAXLOG*(v - 0.1)) / (2.0 - 0.1));
    return myloglut[idx];
}

As you can see I'm interested only in range 0.1 - 2.0. However, I need more precision around 0. How can I achieve that non linear calculation? Also is there any way to use some interpolation in this function for better precision?

remember that a double's precision is only accurate up to 17 places. How much precision are you looking for around zero? — Syntactic Fructose, Dec 06 '12 at 09:19
Even less in my case, because double on my MCU is only 4 byte. However, If there will be some generic precision control routine I will make use of it. — Pablo, Dec 06 '12 at 09:23
I will check that, but I don't think they have this particular solution for non linear lut log. — Pablo, Dec 06 '12 at 09:31
@Need4Sleep: are you *completely* mad? This guy is attempting to approximate `log` with a step function that takes on 1000 values, and you're telling him to use a (great, wonderful, amazing) *arbitrary-precision* library? I cannot imagine more complete and total overkill. — Stephen Canon, Dec 07 '12 at 17:16

7heo.tk · Accepted Answer · 2012-12-07T22:25:57.017

Last version

#include <stdio.h>                  // for input/output.
#include <math.h>                   // for mathmatic functions (log, pow, etc.)

// Values
#define MAXELM 1000                 // Array size
#define MINVAL 0.1                  // Minimum x value
#define MAXVAL 1.9                  // Maximum x value
#define EXPVAR 1.4                  // Exponent which makes the variation non linear. If set to 1, the variation will be linear.
#define ACRTPT (MINVAL + MAXVAL)/2  // Accurate point. This value is used to know where to compute with maximum accuracy. Can be set to a fixed value.
// Behavior
#define STRICT 0                    // if TRUE: Return -1 instead of the floored (or closest if out of range) offset when (x) hasn't been calculated for this value.
#define PNTALL 0                    // if TRUE: Print all the calculated values.
#define ASKFOR 1                    // if TRUE: Ask for a x value then print the calculated ln value for it.

// Global vars
double results[MAXELM];             // Array to store computed values.

// Func: offset to var conversion
double getvar(int offset)
{
    double x = (double)MINVAL + ((double)MAXVAL - (double)MINVAL) * (double)offset / (double)MAXELM;

    if(x >= (double)ACRTPT)
        x = pow(x - (double)ACRTPT, (double)EXPVAR) + (double)ACRTPT;
    else
        x = -pow((double)ACRTPT - x, (double)EXPVAR) + (double)ACRTPT;
    // This ^ is the equation used when NONLIN = 1; to have a non linear repartition. Feel free to change it. The inverse equation is in `int getoffset(double)`.
    return x;
}

// Func: var to offset conversion
int getoffset(double var)
{
    double x = var;

    if(x >= (double)ACRTPT)
        x = pow(x - (double)ACRTPT, 1.0/(double)EXPVAR) + (double)ACRTPT;
    else
        x = -pow((double)ACRTPT - x, 1.0/(double)EXPVAR) + (double)ACRTPT;
    // This ^ is the equation used when NONLIN = 1; to calculate offset with a non linear repartition. Feel free to change it (but it must be the inverse of the one in
    // `double getvar(int)` for this to work.). These equations are tied, so you cannot modify one without modifying the other. They are here because
    // `pow(negative, non-integer)` always returns `-nan` instead of the correct value. This 'trick' uses the fact that (-x)^(1/3) == -(x^(1/3)) to cicumvent the
    // limitation.

    int offset = (x - (double)MINVAL) * (double)MAXELM / ((double)MAXVAL - (double)MINVAL);
#if STRICT
    if(getvar(offset) != var)
        return -1;
    return (offset < 0)?-1:(offset > (MAXELM - 1))?-1:offset;
#else
    return (offset < 0)?0:(offset > (MAXELM - 1))?MAXELM - 1:offset;
#endif
}

// Func: main.
int main(int argc, char* argv[])
{
    int offset;
    for(offset = 0; offset < MAXELM; offset++)
        results[offset] = log(getvar(offset));

#if PNTALL
    for(offset = 0; offset < MAXELM; offset++)
    {
        printf("[log(%lf) = %lf] ", getvar(offset), results[offset]);
        if(!((offset + 1) % 6))
            printf("\n");
    }
    printf("\n");
#endif

#if ASKFOR
    double x;
    printf("log(x) for x = ");
    scanf("%lf", &x);
    if((offset = getoffset(x)) < 0)
        printf("ERROR: Value for x = %lf hasn't been calculated\n", x);
    else
        printf("results[%d]: log(%lf) = %lf\n", offset, getvar(offset), results[offset]);
#endif

    return 0;
}

Last versions's characteristics:

Uses a fixed size array.
Computes ONLY the stored values (won't calculate multiple values for one array cell).
Uses functions to get offset from value and value from offset, so you don't have to store the values of which the log has been calculated.

Advantages over the last version:

Does not use cbrt, uses pow instead.
Allows to specify the growth of calculus variable at compilation time. (So the values are more or less grouped around the accurate point (ACRTPT))

Third version

#include <stdio.h>                  // for input/output.
#include <math.h>                   // for mathmatic functions (log, pow, etc.)

// Values
#define MAXELM 1000                 // Array size
#define MINVAL 0.1                  // Minimum x value
#define MAXVAL 1.9                  // Maximum x value
#define ACRTPT (MINVAL + MAXVAL)/2  // Accurate point. This value is used to know where to compute with maximum accuracy. Can be set to a fixed value.
// Behavior
#define NONLIN 1                    // if TRUE: Calculate log values with a quadratic distribution instead of linear distribution.
#define STRICT 1                    // if TRUE: Return -1 instead of the floored (or closest if out of range) offset when (x) hasn't been calculated for this value.
#define PNTALL 0                    // if TRUE: Print all the calculated values.
#define ASKFOR 1                    // if TRUE: Ask for a x value then print the calculated ln value for it.

// Global vars
double results[MAXELM];             // Array to store computed values.

// Func: offset to var conversion
double getvar(int offset)
{
    double x = (double)MINVAL + ((double)MAXVAL - (double)MINVAL) * (double)offset / (double)MAXELM;
#if NONLIN
    x = pow((x - ACRTPT), 3) + ACRTPT;
    // This ^ is the equation used when NONLIN = 1; to have a non linear repartition. Feel free to change it. The inverse equation is in `int getoffset(double)`.
#endif
    return x;
}

// Func: var to offset conversion
int getoffset(double var)
{
#if NONLIN
    int offset = ((
        cbrt(var - ACRTPT) + ACRTPT
    // This ^ is the equation used when NONLIN = 1; to calculate offset with a non linear repartition. Feel free to change it (but it must be the inverse of the one in
    // `double getvar(int)` for this to work.)
                    ) - (double)MINVAL) * (double)MAXELM / ((double)MAXVAL - (double)MINVAL);
#else
    int offset = (var - (double)MINVAL) * (double)MAXELM / ((double)MAXVAL - (double)MINVAL);
#endif
#if STRICT
    if(getvar(offset) != var)
        return -1;
    return (offset < 0)?-1:(offset > (MAXELM - 1))?-1:offset;
#else
    return (offset < 0)?0:(offset > (MAXELM - 1))?MAXELM - 1:offset;
#endif
}

// Func: main.
int main(int argc, char* argv[])
{
    int offset;
    for(offset = 0; offset < MAXELM; offset++)
        results[offset] = log(getvar(offset));

#if PNTALL
    for(offset = 0; offset < MAXELM; offset++)
    {
        printf("[log(%lf) = %lf] ", getvar(offset), results[offset]);
        if(!((offset + 1) % 6))
            printf("\n");
    }
    printf("\n");
#endif

#if ASKFOR
    double x;
    printf("log(x) for x = ");
    scanf("%lf", &x);
    if((offset = getoffset(x)) < 0)
        printf("ERROR: Value for x = %lf hasn't been calculated\n", x);
    else
        printf("results[%d]: log(%lf) = %lf\n", offset, getvar(offset), results[offset]);
#endif

    return 0;
}

This version is cleaner and easier to maintain than the previous ones. If you need anything else, please leave a comment.

You can configure its behavior using the macros at the top of the file.

Caracteristics:

Uses a fixed size array.
Computes ONLY the stored values (won't calculate multiple values for one array cell).
Uses functions to get offset from value and value from offset, so you don't have to store the values of which the log has been calculated.

Second version

Well, here is my second solution. See below it for the original comment.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define MIN_INC 0.001    // This is the minimum increment. If its set to 0, when tmp will be equal to avg, it will never leave this state, since INC_MUL * (tmp - avg)^2 will be 0.
#define INC_MUL 0.2      // This is a number which influences the precision you will get. The smaller it is, the more precise you will be, and the greater will be your result array cardinality.

typedef struct {
    double offset;
    double value;    // value = log(offset). Since the results are not linarly widespread, this is pretty important.
} logCalc;

// Here, we need to use a pointer on a logCalc pointer, since we want to actually SET the address of the logCalc pointer, not the address of one of its copies.
int MyLogCreate(logCalc** arr, double min, double max)
{
    if((*arr) != NULL)
        return 0;
    unsigned int i = 0;
    double tmp, avg = (max + min) / 2.0;
    for( ; min < avg; min += (INC_MUL * ((avg - min) * (avg - min)) + MIN_INC))
    {
        (*arr) = (logCalc*)realloc((*arr), sizeof(logCalc) * (i + 1));
        (*arr)[i].offset  = min;
        (*arr)[i++].value = log(min);
    }
    for(tmp = avg ; tmp < max; tmp += (INC_MUL * ((tmp - avg) * (tmp - avg)) + MIN_INC))
    {
        (*arr) = (logCalc*)realloc((*arr), sizeof(logCalc) * (i + 1));
        (*arr)[i].offset  = tmp;
        (*arr)[i++].value = log(tmp);
    }
    return i;
}

int main(int argc, char** argv)
{
    logCalc *myloglut = NULL;
    unsigned int i,
        t = MyLogCreate(&myloglut, .1, 1.9);
    for(i = 0; i < (t-1); i++)
    {
        printf("[log(%lf) = %lf], ", myloglut[i].offset, myloglut[i].value);
        if(!((i+1)%6))         // Change 6 to what's best for your terminal $COLUMNS
            printf("\n");
    }
    printf("\n");
    free(myloglut);
    return 0;
}

Original comment

The linearity of your calculation comes from the fact that you're using a linear increment. On each iteration of your for loop, you increment exp by (2.0 - 0.1) / MAXLOG.

To get more precise values around 0, you will need:

To define a larger range - a larger array - (to be able to store more values around 0)
To use a non-linear increment. This increment will probably be dependent on i (or on exp, depending on how you do it), so you know precisely the "offset" of the number you are trying to calculate (and the amount you need to increment exp with). Of course, you will calculate more results around 0.

Here is my current implementation of it:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define CALCULATE_UNTIL 2.0
#define PRECISE_UNTIL   1.0

typedef struct {
    double offset;
    double value;
} logCalc;

logCalc *myloglut = NULL;

int MyLogCreate()
{
    double exp = 0.1;
    int i;
    for (i = 0; exp <= CALCULATE_UNTIL; exp += (exp < PRECISE_UNTIL)?0.0001898:0.001898)
    {
        myloglut = realloc(myloglut, sizeof(logCalc) * (i + 1));
        myloglut[i].offset = exp;
        myloglut[i++].value = (i == 4780)?0:log(exp);
    }
    return i; // So you know how big the array is. Don't forget to free(myloglut); at the end of your code.
}

int main(int argc, char** argv)
{
    int i,
    t = MyLogCreate();
    for(i = 0; i < t; i++)
    {
        printf("[log(%lf) = %lf], ", myloglut[i].offset, myloglut[i].value);
        if(!(i%6))    // For formatting purposes.
            printf("\n");
    }
    printf("\n");
    free(myloglut);
    return 0;
}

I've created a new type in order to store the value of exp too, which could be useful for knowing what value the result is the log of.

Update: I'm not sure about what you want to do. Do you want to be precise around log(x) = 0 or around x = 0? On the first case, I might have to re-write the code again for it to work as you want. Also, do you want the results to be more precise at it gets close to 0, or do you want the results to be more precise in a given range (as it is now)?

Thanks, I already know that ) but what I want is how to achieve that. More practically, rather than theory. — Pablo, Dec 06 '12 at 09:30
I don't think zeroing log(1) is a good choise, as that would make lut [478] an outlier compared to lut [478 +- n] — Aki Suihkonen, Dec 06 '12 at 09:48
@7heo.tk: I want the result to be precise when `x` is close to `1`, hence the `log(x)` will be close to `0`. — Pablo, Dec 06 '12 at 12:18
The solution I've added on the top of the post enforces precision on the average of the interval you give it. If you want to have bigger ranges (like from 0.1 to 4) but still center on 1, you will have to hardcode avg (or whatever you call it if you change it) to 1 (or find a way to calculate it from your new values); and change the increment algorithm. Speaking of which, it's currently a quadratic function, based on y = MUL * (x - center)² + INC. (and center is 1). I hope it helps. — 7heo.tk, Dec 06 '12 at 12:38
It's not quite what I want. I would expect some function like y=k*x^3 which would define the velocity(curve) of calculated log values while approaching to `log(1)`. The same curve formula would be used when picking up the values. In other words while approaching to `log(1)` the number of samples are increasing. Also I would prefer to stick with static array with well known size. In your solution precision changes rapidly and number of sample density is fixed. — Pablo, Dec 06 '12 at 16:48
Either I don't understand you, or I haven't been clear enough about what I did... From what I understand, you want to "increase the number of samples when getting close to log(1)", and the listing at the very top of my post does exactly that. Also, having fix-sized arrays with variable precision and/or range is not possible. You can have fix-sized arrays with range depending on the precision (or the other way around), but that's all. Plus, since array indexes are linear, you will have to store the value of x for each result, or it won't mean much. — 7heo.tk, Dec 06 '12 at 20:33
Sorry, somehow I overlooked the new source code. However, to find log(x), how can I effectively locate offset with the x I want? Could you please update with the function to locate interested offset? Thx — Pablo, Dec 06 '12 at 23:09
I'll try to do a fixed-size version of this tomorrow. I didn't understand that you were using a MCU. I will also try to do a version that can use the array offset to deduce x from calculus (through a formula, I've some ideas about it, but I'm not 100% sure how I will implement it); and vice versa (so you can directly access log(x) from the value of x, and know without checking the array if the value will exist or not). I'm off for today! — 7heo.tk, Dec 06 '12 at 23:15
Thanks, I will upvote your answer since it's almost close to solution. Good night! — Pablo, Dec 06 '12 at 23:43
I'm currently experiencing difficulties with `pow` (to calculate the cubic square for the inverse equation) because that *"The pow functions compute x raised to the power y. A domain error occurs if x is ﬁnite and negative and y is ﬁnite and not an integer value."*. So `pow(-0.4, 1.0/3.0)` returns `-nan` instead of its correct value. Aside from that, everything works great. Source: http://stackoverflow.com/questions/4269069/finding-cube-root-in-c#answer-4269119 — 7heo.tk, Dec 07 '12 at 16:19
Okay, I didn't know of the cbrt function, which is not documented here: http://cplusplus.com/reference/cmath. I now use the cbrt function and everything works. I'm gonna post the solution asap. Source: http://stackoverflow.com/questions/8493005/c-finding-cube-root-of-a-negative-number-with-pow-function#answer-8493038 — 7heo.tk, Dec 07 '12 at 17:04
If possible avoid cbrt. I am trying to get away from standard libraries. Afterall I will have to convert function to asm, so it will be better not to use `cbrt`. — Pablo, Dec 07 '12 at 17:06
Well, it's not possible to use pow, so maybe the only outcome here would be to code a custom cbrt function. I'll add the cbrt function and the pow function to the file asap. — 7heo.tk, Dec 07 '12 at 17:08
Well, I'm gonna implement `cbrt` from the information available here: http://metamerist.com/cbrt/cbrt.htm I think. If you add the sqrt assembly code to your post, maybe (I say maybe, it's been a long time since I've done some assembly) I will directly add implement it in asm. Nevertheless the C compiler will generate bytecode, which can be disassembled (I take you do that already). — 7heo.tk, Dec 07 '12 at 17:18
I'm using sqrt from here, however, it's fixed point. Not sure if that may help. http://www.mikrocontroller.net/articles/AVR_Arithmetik#avr-gcc_Implementierung_.2832_Bit.29 it's in German but I guess you will understand. No need to use asm ) — Pablo, Dec 07 '12 at 17:22
Please bear in mind that the reason I'm not using stock `log` function is performance. — Pablo, Dec 07 '12 at 17:49
Well, instead of re-implementing cbrt, which can lead to poor performances (_"With either method a poor initial approximation of can give very poor algorithm performance"_, source: http://en.wikipedia.org/wiki/Cube_root#Numerical_methods), I think I'm gonna use split the method in two parts, using pow, one for when x is greater than ACRTPT (`x - ACRTPT ≥ 0`), the other for when x smaller than ACRTPT (`x - ACRTPT < 0`). — 7heo.tk, Dec 07 '12 at 23:46

score 0 · Answer 2 · answered Dec 06 '12 at 09:43

Shift your function "origin" from zero or 0.1 to 1.0.

 for (i=-478;i<523;i++) {
     double j = 1.0 + (double)i / 523.0;
     myloglut[i+478] = log(j);
 }

This function selects exactly two points: 1.0 and 2.0 as 1.0 + (523.0 / 523.0) == 2.0.

The first value is then:

myloglut[0] = log(0.0860420650095602);

A more "natural" size would be 973, which would make the divisor 512 (exactly) and the first entry would be for 51/512 = ~0.099609375.

tmyklebu · Answer 3 · 2012-12-07T18:07:40.370

How accurate and how fast does this need to be? You can do something pretty good with a piecewise Chebyshev approximation. You control the accuracy using the order of the Chebyshev approximation (higher order = slower but more accurate). I would also suggest doing argument reduction by decomposing your double into its mantissa (between 1 and 2) and its exponent (a power of two whose logarithm is simply the exponent times log(2), which you can precompute.)

I don't think you can come up with anything very accurate on [0.1, 2] without either doing more arithmetic whenever you want a log or using a huge table and incurring all the caching and unpredictable memory access issues that entails. But do consider doing a piecewise Chebyshev approximation if you have enough time. (If you want me to show you code that uses a Chebyshev approximation, let me know with a comment and I'll update this post.)

EDIT: Code for logarithm using Chebyshev approximation. Accurate within 1e-5.

double mylog(double x) {
  static double logtwo = log(2);
  static double tbls[4][5] = {{0,0,0,0,0},
    {-.9525934480e-2,-.87402539075,-1.135790603,1.1519051721,-1.7063112037},
    {.53892330786e-3,-1.0117355213,-.4085197450,-.6242237228,0},
    {.60393290013e-6,-1.0001523639,-.4940510719,-.4058961978,0}};
  if (x>1) return -mylog(1/x);
  int expo,e2;
  x = 1-frexp(x, &expo);
  double y = frexp(x, &e2);
  if (e2 < -3) e2 = -3;
  double *tbl = tbls[-e2];
  return expo*logtwo + tbl[0]+x*(tbl[1]+x*(tbl[2]+x*(tbl[3]+x*tbl[4])));
}

I calculated the Chebyshev approximations using Maple and expanded them as traditional polynomials for speed.

If you want good accuracy very close to 1, you can change if (e2 < -3) e2 = -3 and add the row {0,-1,-.5,-1/3.,-.25} to the end of tbls corresponding to Taylor's approximation. If you want it faster, compute a better cubic approximation to log(x) between 1/2 and 3/4 and store it in the first row of tbls.

Since precision is varying, I could say on the edges 10E-4 is enough and around center(x=1) 10E-9. Actually it would be also best to play with precision to see the performance and find golden middle point. Please post the Chebyshev approximation sample code. — Pablo, Dec 07 '12 at 00:26
I don't think you can beat hardware logarithm if you need that much precision. — tmyklebu, Dec 07 '12 at 16:59

Non linear lut log precision

3 Answers3

Last version

Third version

Second version

Original comment