Is there a function to retrieve the number of available distinct values within a range?

Question

I'm using double precision floating point variables within an application I'm making.

I normalize some ranges of values, for example (I have many ranges) mapping -48.0 to 48.0 onto 0.0 to 1.0, using this simple function:

double ToNormalizedParam(double nonNormalizedValue, double min, double max, double shape) {
    return pow((nonNormalizedValue - min) / (max - min), 1.0 / shape);
}

I'd like to know how the number of available distinct values differs when mapping from one range to another.

Is there a ready-to-go function in C++? I've looked at numeric_limits, but I can't find anything useful.

You should be aware that, in c++, a range is usually taken to mean a pair of iterators that denote the beginning and end of a set of elements in a container. While the question isn't technically wrong, the terminology may lead to some confusion. — François Andrieux, Nov 08 '17 at 14:48
Off the top of my head, no. Rolling your own doesn't sound fun either, but is plausible I suppose — Passer By, Nov 08 '17 at 14:50
Not sure what you are asking. You have a range of values and you want to count distinct values? Is that correct? — Ron, Nov 08 '17 at 14:55
@Ron question is about a range of floating point values, in the sense of an interval [a,b], not a range of elements in a container or iterators — 463035818_is_not_an_ai, Nov 08 '17 at 14:57
@Ron check it out https://en.wikipedia.org/wiki/Unit_in_the_last_place — markzzz, Nov 08 '17 at 15:00
@Ron no, if you think there is infinite number of floating points in [a,b] you need to either review floating points or infinity ;) — 463035818_is_not_an_ai, Nov 08 '17 at 15:01
@Ron i am not familiar with Computer Science SE, but as the question is asking for a concrete solution in c++, i dont see why it would not fit here — 463035818_is_not_an_ai, Nov 08 '17 at 15:03
If you are interested in preserving as many available and distinct values as possible, you may consider using `long double`s for the intermediate calculations. As it is now, you are already restricting to [0,1] and then applying the `pow` function, losing some other distinct values. — Bob__, Nov 08 '17 at 15:57

chux - Reinstate Monica · Accepted Answer · 2017-11-09T02:59:51.767

4

Is there a ready-to-go function in C++?

Perhaps. If not, it is easy enough to form a function to assign a sequence number to each double value.

Assuming the FP and integer types match in endianness and size, and a typical FP layout like IEEE 754 binary64, the function below is valid from -INF to INF.

// Return a sequence number for each `double` value.
// Numerically sequential `double` values will have successive (+1) sequence numbers.
uint64_t double_sequence(double x) {
  uint64_t u64;
  memcpy(&u64, &x, sizeof u64);
  if (u64 & 0x8000000000000000) {
    u64 ^= 0x8000000000000000;
    return 0x8000000000000000 - u64;
  }
  return u64 + 0x8000000000000000;
}

Is there a function to retrieve the number of available distinct values within a range?

Simply subtract the sequence numbers, then add 1 for a closed range or subtract 1 for an open range.

double_sequence(1.0)  - double_sequence(0.0)   + 1 --> 0x3ff0000000000001
double_sequence(48.0) - double_sequence(-48.0) + 1 --> 0x8090000000000001

Notes:
Keep in mind that FP are logarithmically distributed overall and linear within powers of 2.
For about half of all FP, |x| < 1.0.
There are as many FP numbers from 0.5 to 1.0 as from 16.0 to 32.0.
There are over twice as many doubles in the [-48.0 ... 48.0] range as in the [0.0 ... 1.0] range, primarily due to negative values.

edited Nov 09 '17 at 02:59

answered Nov 08 '17 at 15:53

chux - Reinstate Monica


I don't get `There are as many FP numbers 0.5 to 1.0 as between 16.0 to 32.0.`. If between `0.5` and `1.0` there are (let say) `X` possible values, between `0.0` and `40.0` there are those `X` plus the ones between `0.0-0.5` and `1.0-40.0`. Way big. How can they be "the same" considering different ranges? :O – markzzz Nov 09 '17 at 11:42
@markzzz Does [distributed linearly but logarithmically in linear groups](https://stackoverflow.com/a/43284459/2410359) help? – chux - Reinstate Monica Nov 09 '17 at 11:56
@markzzz X values between 1.0 to 2.0, X values between 2.0 to 4.0, X values between 4.0 to 8.0, etc. X values between 0.25 to 0.125, X values between 0.125 to 0.0625, X values between 0.0625 to 0.03125, etc – chux - Reinstate Monica Nov 09 '17 at 11:59
no you misunderstand me :) http://coliru.stacked-crooked.com/a/5731f7dd8f486e68 : the more the range is "huge", the more available values you can use. 10 and 91 setting 1 magnitude up. X change for every range. – markzzz Nov 09 '17 at 12:11
@markzzz Between 0.5 to 1.0, your program will report X different values, maybe about 4.5e15 (if you let it run long enough). Between 16.0 to 32.0, it will report the same. Try with `float`, X should be about 8,300,000. – chux - Reinstate Monica Nov 09 '17 at 12:19
@markzzz Your code compares 2 ranges, the second is a [superset](https://en.wikipedia.org/wiki/Subset#Definitions) of the first, so a larger number is expected. My "as many FP numbers 0.5 to 1.0 as between 16.0 to 32.0" compares 2 separate [disjoint sets](https://en.wikipedia.org/wiki/Disjoint_sets). Yes, [0 40.0] has more members than sets [0.5 1.0] and [16.0 32.0] combined. Yet that is not the point. Set [0.5 1.0] and set [16.0 32.0] each have the same member count. – chux - Reinstate Monica Nov 09 '17 at 13:01
Man, am I on drugs? I'm talking about how many different values there are (i.e. that I can use) between two different sets. You also said this: "Yes, [0 40.0] has more **members** than sets [0.5 1.0] and [16.0 32.0] combined." – markzzz Nov 09 '17 at 13:31
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/158607/discussion-between-markzzz-and-chux). – markzzz Nov 09 '17 at 13:32

Passer By · Answer 2 · 2017-11-10T08:54:53.587

1

Given a positive IEEE 754 double precision floating point number with exponent field e and mantissa field m, both interpreted as integers, the number of distinct values (not counting denormalized values) less than it but greater than zero is exactly m + (e - 1) * 2^52.

The fields could be extracted like so:

#include<iostream>
#include<tuple>
#include<cstdint>
#include<cstring>

using std::uint64_t;

std::tuple<uint64_t, uint64_t, uint64_t> explode(double d)
{
    static_assert(sizeof(double) == 8);
    uint64_t u;
    std::memcpy(&u, &d, sizeof(d));
    return { (u & 0x8000000000000000) >> 63,
             (u & 0x7FF0000000000000) >> 52,
              u & 0x000FFFFFFFFFFFFF };
}

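// Count of normalized doubles strictly between 0 and d.
// Assumes d is a positive, normalized value (e >= 1); for zero or
// denormalized input, e - 1 would wrap around.
uint64_t distinct(double d)
{
    auto [_, e, m] = explode(d);
    return m + ((e - 1) << 52);
}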

int main()
{
    std::cout << "[-48, 48]: " << 2 * distinct(48) << "\n[0, 1]: " << distinct(1) << '\n';
}


edited Nov 10 '17 at 08:54

answered Nov 08 '17 at 15:01

Passer By



I'm not sure this is what the OP's *really* looking for, because the map given by ToNormalizedParam (and its formal inverse) is neither injective nor surjective and depends on rounding mode (and IEEE 754 conformance)... – Massimiliano Janes Nov 08 '17 at 15:11
@MassimilianoJanes I made the only conclusion I can, we'll have to wait for OP to clarify I suppose. – Passer By Nov 08 '17 at 15:14
@MassimilianoJanes how ToNormalizedParam rounds doesn't matter here. Different values will collide on different ones, yes, but the number of values between 0 and 1, regardless of the rounding, is always the same on a fixed machine – markzzz Nov 08 '17 at 15:36
i.e. I'm not asking which numbers will be used after the normalization, but how many possible values I can use – markzzz Nov 08 '17 at 15:36
