
This may seem like an unusual question, but I am looking for a function that transforms a double (the C type) into a long (the C type). It is not necessary to preserve the information in the double. The most important thing is:

double a,b;

long c,d;

c = f(a);

d = f(b);

The following must hold:

if (a < b) then c < d, for all doubles a and b (with c = f(a) and d = f(b))

Thank you to all of you.

Stefan Falk
user2965069
  • maybe what you're looking for is [`floor()`](http://linux.die.net/man/3/floor) – Sourav Ghosh Dec 08 '14 at 20:57
  • 2
    Assuming a long is 4 bytes and a double is 8 bytes, you can't. The double fundamentally stores more information, making it impossible to map all states to a unique state (which is essentially what you're asking by preserving order). If they are however of the same number of bytes - just cast them bitwise, like so: _c = *((long*)(&a))_ – Invalid Dec 08 '14 at 20:58
  • 1
    Using `floor` satisfies the condition `if (a < b) then (c <= d)`, close but not quite the same. I'm fairly certain that there is no function that satisfies your original condition, though. – Kenogu Labz Dec 08 '14 at 21:00
  • 1
    rare condition. did you examine whether your claim is feasible or not? maybe starting from following question should be a good choice: what's `f(0.99)`, `f(1.0)`, `f(1.1)` according to your condition? – Jason Hu Dec 08 '14 at 21:03
  • 2
    @Invalid Casting `double` to `long` is just incorrect. – AlexD Dec 08 '14 at 21:04
  • What is behind your question? – Weather Vane Dec 08 '14 at 21:04
  • 1
    @AlexD Heh now I think about it for more than 3 seconds, you're totally right, it doesn't make a lot of sense XD.. Regardless, the rest of the comment should hold. – Invalid Dec 08 '14 at 21:05
  • @Invalid Yep, according to the standard: _"When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined."_ – AlexD Dec 08 '14 at 21:07
  • @AlexD that's when doing normal casting though, I didn't cast like normal, I reinterpreted by simply reinterpreting the pointed to data as a different type - regardless, it still doesn't necessarily work depending on the implementation of both types. – Invalid Dec 08 '14 at 21:08
  • Just gave an informal proof. @Invalid, your observation that long -> double loses information is correct under the pigeonhole principle. Such a mapping cannot exist. – Kenogu Labz Dec 08 '14 at 21:17
  • 1
    If `sizeof(double)==sizeof(some_int_type)`, `double` is a floating-point format like [binary64](http://en.wikipedia.org/wiki/Double-precision_floating-point_format), the endianness of the two types matches, and a number of other conditions hold, then `some_int_type f(double x) { union { double d; some_int_type i; } u = { x }; return u.i; }` works if `some_int_type` is sign-magnitude. Otherwise more work is needed to cope with 2's complement. – chux - Reinstate Monica Dec 08 '14 at 21:18
  • @Invalid *If they are however of the same number of bytes* do you mean where `float` is size 4 and `long` is size 4, or `double` and `long long` are size 8? Within the range of the integral types they store more information than the real types. – Weather Vane Dec 08 '14 at 21:21
  • 2
    according to sizeof, applied to a double and a long in my computer, both are 8 bytes long. – user2965069 Dec 08 '14 at 21:28
  • `long` is mostly 8 bytes long nowadays except stupid eccentric Windows. – The Paramagnetic Croissant Dec 08 '14 at 21:32
  • @TheParamagneticCroissant so I suppose your system defines `short` as size 4? – Weather Vane Dec 08 '14 at 21:41
  • @WeatherVane nope, `short` is usually 2 bytes. `int` is what's 4 bytes long for me (and it often is on other platforms as well). – The Paramagnetic Croissant Dec 08 '14 at 21:43
  • 1
    Also, if you simply reinterpret the bytes of the `double` as an appropriately-sized signed integer in 2's complement form, and the `double` doesn't store anything tricky (NaN, denormals, infinity, negative zero, etc.), then for doubles `d1` and `d2` and their respective integer representations `i1` and `i2`, `d1 < d2` implies `i1 < i2`. – The Paramagnetic Croissant Dec 08 '14 at 21:43
  • @TheParamagneticCroissant, sorry, irony is usually misinterpreted in these contexts and I can inform you that my `int` can be 64 bits, if I let it. I thought the point of `byte`, `short` and `long` was to make such declarations of known size (1, 2, 4), as opposed to `int`. – Weather Vane Dec 08 '14 at 21:56
  • @WeatherVane nope, unfortunately :/ `short` is just as arbitrarily-sized in C as `int`, `long`, `long long` and `char` (modulo the `sizeof(char) == 1` constraint, but that still doesn't imply that `char` be 8 bits long). If you want exact-width integers, you should use `(u)int8_t`, `(u)int16_t`, etc. – The Paramagnetic Croissant Dec 08 '14 at 22:02
  • @TheParamagneticCroissant Yet you said above that `short` is 2, after I said it could be 4? You are getting into knots. – Weather Vane Dec 08 '14 at 22:09
  • @WeatherVane No, I'm not. it's *usually* 2 bytes long, but it *need not.* The only thing the C standard defines is that `short` needs to be able to represent integers between -32767 and +32767 *at the very least*. However, most practical modern C implementations provide an exactly 16-bit and 2-byte wide `short`. There's no contradiction in that whatsoever. – The Paramagnetic Croissant Dec 08 '14 at 22:13
  • @The Paramagnetic Croissant "doubles d1 and d2 and their respective integer representations i1 and i2, d1 < d2 implies i1 < i2" is not so for negative doubles if `i1` is 2's complement. `double` is laid out more like sign-magnitude. – chux - Reinstate Monica Dec 08 '14 at 22:39
  • @chux correct. Too bad I can't edit my comment now. – The Paramagnetic Croissant Dec 08 '14 at 22:46
  • @WeatherVane however that's irrelevant as he wants a function from the real type to the integral type ;) – Invalid Dec 08 '14 at 22:52

3 Answers


Your requirement is feasible if the following two conditions hold:

  1. The compiler defines sizeof(double) the same as sizeof(long)
  2. The hardware uses IEEE 754 double-precision binary floating-point format

While the 2nd condition holds on virtually every widely used platform, the 1st condition does not (for example, long is 4 bytes on 64-bit Windows while double is 8 bytes).
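Both conditions can be checked from code before relying on them. The following is only a sketch: the function names are hypothetical, and the second check merely compares the `<float.h>` parameters with those of binary64, which suggests IEEE 754 but does not prove full conformance.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* Condition 1: a double and a long occupy the same number of bytes. */
int same_size_as_long(void)
{
    return sizeof(double) == sizeof(long);
}

/* Condition 2 (approximation): double has the binary64 parameters,
   i.e. radix 2, a 53-bit significand and 64 bits of storage. */
int looks_like_binary64(void)
{
    return FLT_RADIX == 2
        && DBL_MANT_DIG == 53
        && DBL_MAX_EXP == 1024
        && DBL_MIN_EXP == -1021
        && sizeof(double) * CHAR_BIT == 64;
}

/* Stops a debug build early if the bit-reinterpretation approach
   cannot work on this platform. */
void check_platform(void)
{
    assert(same_size_as_long());
    assert(looks_like_binary64());
}
```

Calling `check_platform()` once at start-up is one option; a compile-time `_Static_assert` on the same expressions would catch the problem even earlier.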

If both conditions do hold on your platform, then you can implement the function as follows:

long f(double x)
{
    if (x > 0)
        return double_to_long(x);
    if (x < 0)
        return -double_to_long(-x);
    return 0;
}

You have several different ways to implement the conversion function:

#include <string.h>

long double_to_long(double x)
{
    long y;
    memcpy(&y, &x, sizeof(x));  /* copies the object representation of x into y */
    return y;
}

long double_to_long(double x)
{
    long y;
    y = *(long*)&x;
    return y;
}

long double_to_long(double x)
{
    union
    {
        double x;
        long   y;
    }
    u;
    u.x = x;
    return u.y;
}

Please note that the second option (the pointer cast) is not recommended, because it violates the strict-aliasing rule.
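One way to gain confidence in such an implementation is to test the required property on a sample of values. The sketch below is an illustration, not the code from this answer: `ordered_key` and `order_preserved` are hypothetical names, and `int64_t` replaces `long` so that the size condition does not depend on the platform. It still assumes a binary64 double.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same idea as f() above, written with int64_t instead of long. */
int64_t ordered_key(double x)
{
    int64_t bits;
    assert(sizeof x == sizeof bits);
    if (x > 0) {
        memcpy(&bits, &x, sizeof bits);
        return bits;
    }
    if (x < 0) {
        x = -x;                          /* work on the magnitude */
        memcpy(&bits, &x, sizeof bits);
        return -bits;                    /* larger magnitude => smaller key */
    }
    return 0;                            /* +0.0, -0.0 and NaN */
}

/* Returns 1 if every neighbouring pair with values[i-1] < values[i]
   also has strictly increasing keys, 0 otherwise. */
int order_preserved(const double *values, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (values[i - 1] < values[i] &&
            !(ordered_key(values[i - 1]) < ordered_key(values[i])))
            return 0;
    }
    return 1;
}
```

Passing a sorted array that mixes large and small magnitudes, both signs and zero exercises every branch; a check like this only samples the domain and does not replace the argument about the bit layout.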

barak manos
  • 1
    It's 1. sizeof(double) <= sizeof(long) and 2. floating point format is normalized. A union may also be used to avoid copying. – hdante Dec 08 '14 at 21:56
  • @hdante: Good point on the `<=`, I just "instinctively" assumed that `sizeof(double) >= sizeof(long)`. But in the case of `<=`, we have to set `y=0`, and we cannot use `f` safely on big-endian. In short, since I used "if" (and not "if and only if"), I'm going to keep the answer as is. It may be feasible also in the rare case of `<=` as you mentioned, but a **portable** implementation would be slightly more difficult. – barak manos Dec 08 '14 at 21:58
  • The union trick **DOES NOT** break strict aliasing in C99 and later. So, it's strongly preferred to the (currently-included) pointer aliasing hack. (but even if you use pointer casting, you should be returning `y` and not `x` – in fact, you don't even need the variables.) – The Paramagnetic Croissant Dec 08 '14 at 22:02
  • @TheParamagneticCroissant: Nice one, but since both you and the other commentator have suggested that, I wouldn't feel fair "stealing" your idea... Thanks :) – barak manos Dec 08 '14 at 22:04
  • 2
    The method reverses the order of negative numbers assuming `long` is 2's complement. The layout of binary64 is more like sign-magnitude. Need to factor that in. Also corner case of +0 and -0 will not compare equally. – chux - Reinstate Monica Dec 08 '14 at 22:34
  • @chux: Thanks very much for pointing that out! Can you please confirm the revised implementation? – barak manos Dec 08 '14 at 22:54
  • Edit looks good to me, 1 exception (-0.0). If you replace `if (x >= 0)` with `if (!signbit(x))` then your solution will work. Alternately, add `x *= 1.0;` to the top of code. I used `y= LONG_MAX - y`; in my answer. Many possibilities. – chux - Reinstate Monica Dec 08 '14 at 23:05
  • Suggest forming a _best_ answer using `union`, your `x = -x;` approach and something to cope with `-0.0`. Otherwise we have the best pieces split amongst answers. Else should we form a community wiki answer? – chux - Reinstate Monica Dec 08 '14 at 23:58
  • @chux: I accept your suggestion. Can you please validate the revised answer? Thanks again :) – barak manos Dec 09 '14 at 07:02
  • 1
    Looks good. The cast approach may have an aliasing issue (as you noted) and a pedantic solution would use `if (x > 0) return double_to_long(x);` to cope with `-0.0`. All-in-all, close enough for government work. – chux - Reinstate Monica Dec 09 '14 at 15:07

There are four basic transformations from floating-point to integer values:

floor - Rounds towards negative infinity, i.e. to the largest integer not greater than the value.
ceil[ing] - Rounds towards positive infinity, i.e. to the smallest integer not less than the value.
trunc[ate] - Rounds towards zero, i.e. strips the fractional part and leaves the integer part.
round - Rounds to the nearest integer.

None of these transformations will give the behaviour you specify, but floor satisfies the slightly weaker condition (a < b) implies (c <= d).

If a double value uses more space to represent than a long, then there is no mapping that can meet your initial constraint, thanks to the pigeonhole principle. Basically, since the double type can then represent many more distinct values than the long type, there is no way to preserve the strict order of the < relation, as multiple double values would be forced to map to the same long value.
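The weaker guarantee is easy to see with two values that share an integer part. The sketch below uses a hypothetical helper, `floor_to_long`, written with a cast plus a correction so that it does not depend on linking the math library. For values that fit in a long it should agree with `(long)floor(x)`, and the range assertion assumes a 64-bit long.

```c
#include <assert.h>
#include <limits.h>

/* Floor for values that fit in a long. Converting an out-of-range
   value (or NaN) would be undefined, hence the range check. */
long floor_to_long(double x)
{
    assert(x >= (double)LONG_MIN && x < -(double)LONG_MIN);
    long t = (long)x;        /* the cast truncates toward zero */
    if ((double)t > x)       /* negative non-integers were rounded up */
        t -= 1;              /* step down to the floor */
    return t;
}
```

With this helper, 1.2 and 1.7 both map to 1, so `a < b` yields only `c <= d`, while 1.2 and 2.1 still map to distinct, ordered results.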

Kenogu Labz

Use frexp() to get you mostly there. It splits the number into exponent and significand (fraction).

Assume long is at least the same size as double; otherwise this is pointless (pigeonhole principle).

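#include <assert.h>
#include <math.h>
long f(double x) {
  assert(sizeof(long) >= sizeof(double));
  #define EXPOWIDTH 11
  #define FRACWIDTH 52
  // frexp(0) reports exponent 0, which would place zero above small positive values
  if (x == 0) return 0;
  int ipart;
  // fabs(x) == fraction * 2^ipart, with fraction in [0.5, 1)
  double fraction = frexp(fabs(x), &ipart);

  long lg = ipart;
  lg += (1L << EXPOWIDTH)/2;  // bias the exponent
  if (lg < 0) lg = 0;  // clamp very small exponents
  if (lg >= (1L << EXPOWIDTH)) lg = (1L << EXPOWIDTH) - 1;  // clamp very large exponents
  lg <<= FRACWIDTH;

  lg += (long) (fraction * (1L << FRACWIDTH));
  if (x < 0) {
    lg = -lg;
  }
  return lg;
}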


Notes:

The proper value for EXPOWIDTH depends on DBL_MAX_EXP and DBL_MIN_EXP and the particulars of the double type.

This solution maps some distinct double values near the extremes of the double range to the same long value. I will look and test more later.


Otherwise as commented above: overlay the two types.

As long is often 2's complement and double is laid out in a sign-magnitude fashion, extra work is needed when the double is negative. Also watch out for -0.0.

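#include <assert.h>
#include <limits.h>

long f(double x) {
  assert(sizeof x == sizeof (long));
  union {
    double d;
    long lg;
  } u = { x + 0.0 };  // adding 0.0 turns -0.0 into +0.0 under the default rounding mode
  // If 2's complement - which is the common situation
  if (u.lg < 0) {
    // sign bit set: map magnitude m to -m, which cannot overflow
    u.lg = LONG_MIN - u.lg;
  }
  return u.lg;
}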
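Comparing the raw bit patterns directly shows why the negative branch is needed. The helper `raw_bits` below is a hypothetical name used only for illustration. It assumes a binary64 double and uses `int64_t` rather than `long` so that the sizes match regardless of platform.

```c
#include <stdint.h>
#include <string.h>

/* The raw bit pattern of a double, read as a signed 64-bit integer. */
int64_t raw_bits(double x)
{
    int64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}
```

For positive values the patterns compare in the same order as the doubles, so `raw_bits(1.0) < raw_bits(2.0)`. For negative values the order is reversed: `raw_bits(-2.0) > raw_bits(-1.0)`, because a larger magnitude gives a larger pattern below the sign bit. `raw_bits(-0.0)` also differs from `raw_bits(0.0)`, which is why the two zeros need special care.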
chux - Reinstate Monica