2

Is there a way to convert a std::bitset<64> to a double without using any external library (Boost, etc.)? I am using a bitset to represent a genome in a genetic algorithm and I need a way to convert a set of bits to a double.

CagedMantis
  • 983
  • 1
  • 7
  • 8
  • 1
    You want the bits to represent a 64-bit integer, that you then convert to a double? (as opposed to treating the bits as a binary representation of a double-precision floating point number). – Vaughn Cato Apr 27 '12 at 13:07
  • 3
    using `*reinterpret_cast<double*>(&bitset<64>::to_ullong())`, not ok? – user2k5 Apr 27 '12 at 13:14
  • @user2k5: That dereferences an object through an incompatible pointer type... I'm not sure what the rules are in C++, but that action is undefined behaviour in C. – dreamlax Apr 27 '12 at 13:21
  • it seems fine. http://ideone.com/ua8zv. Downside is C++11 needed. – user2k5 Apr 27 '12 at 13:25
  • @user2k5 Dreamlax is right, this is UB in C and C++ alike. However, the route via `char[64]` works. – Konrad Rudolph Apr 27 '12 at 13:31
  • I am not dereferencing an object, I am dereferencing the unsigned long long from to_ullong()... – user2k5 Apr 27 '12 at 13:33
  • @user2k5: the pointers are not only useless, they introduce a violation of strict aliasing. – Matthieu M. Apr 27 '12 at 13:33
  • i agree it violates strict aliasing. So I was asking if it is ok... – user2k5 Apr 27 '12 at 13:35
  • @CagedMantis: it would be much easier if you could use a `uint64_t` directly instead of a `double`, is it not possible ? – Matthieu M. Apr 27 '12 at 15:56
  • @MatthieuM.: I do need a double instead of a uint64_t. I won't have a need to convert the double back to a bitset though. – CagedMantis Apr 27 '12 at 20:33

3 Answers

5

The C++11 road:

#include <bitset>
#include <cstdint>

union Converter { uint64_t i; double d; };

double convert(std::bitset<64> const& bs) {
    Converter c;
    c.i = bs.to_ullong();
    return c.d;
}

EDIT: As noted in the comments, we can instead alias through `char*` and copy, which is merely unspecified rather than undefined.

#include <bitset>
#include <cstdint>
#include <cstring>

double convert(std::bitset<64> const& bs) {
    static_assert(sizeof(uint64_t) == sizeof(double), "Cannot use this!");

    uint64_t const u = bs.to_ullong();
    double d;

    // Aliasing through `char*` is explicitly allowed by the Standard (and only through `char*`)
    char const* cu = reinterpret_cast<char const*>(&u);
    char* cd = reinterpret_cast<char*>(&d);

    // Copy the bitwise representation from u to d
    std::memcpy(cd, cu, sizeof(u));

    return d;
}

C++11 is still required for `to_ullong`.

Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
  • isn't this UB? Unions can only contain one member at a time. When you activate `d`, as a consequence `i` will be invalid? see http://stackoverflow.com/q/10271929/1025391 – moooeeeep Apr 27 '12 at 13:42
  • 4
    Also see http://stackoverflow.com/questions/6136010/is-using-an-union-in-place-of-a-cast-well-defined. Yes this is UB, but yes this works on most platforms. Do you want to depend on it? that is a question no one else can answer. – PlasmaHH Apr 27 '12 at 13:44
  • That looks nice. Very slick indeed. –  Apr 27 '12 at 14:09
  • 1
    @moooeeeep: yes it is technically UB. Any *brutal* conversion is. Still, this kind of conversion has been used for a very long time. For example, the [Fast Inverse Square Root](http://en.wikipedia.org/wiki/Fast_inverse_square_root) actually requires it (or its brother which breaks strict aliasing). And as such, compiler writers actually take care that it works properly; since the Standard says anything can happen (UB), the compilers are still conforming :D – Matthieu M. Apr 27 '12 at 14:14
  • @MatthieuM. No, the route via `char*` aliasing is unspecified, not UB. Important difference. We had a discussion about this in the chat a while ago. – Konrad Rudolph Apr 27 '12 at 15:23
  • @KonradRudolph: I'd like to see the chat transcript if you still have it, because it seems fishy to me. The spirit of strict aliasing relaxation for `char*` was to allow byte-wise inspection of a value; it still seems to me that as long as a `uint64_t*` and a `double*` points to the same memory location, then strict aliasing is violated. – Matthieu M. Apr 27 '12 at 15:56
  • @MatthieuM. True. You need to alias via `char*` and then *copy*. That’s what I meant. – Konrad Rudolph Apr 27 '12 at 16:22
  • @KonradRudolph: Ah right! I had not thought about copying. I tweaked the edit to go this road and I feel much better about it now, thanks! – Matthieu M. Apr 28 '12 at 10:26
2

Most people are trying to provide answers that let you treat the bit-vector as though it directly contained an encoded int or double.

I would advise you to avoid that approach entirely. While it does "work" for some definition of working, it introduces Hamming cliffs all over the place. You usually want your encoding arranged so that if two decoded values are near one another, their encoded values are near one another as well. It also forces you to use 64 bits of precision.

I would manage the conversion manually. Say you have three variables to encode: x, y, and z. Your domain expertise might tell you, for example, that -5 <= x < 5, 0 <= y < 100, and 0 <= z < 1, and that you need 8 bits of precision for x, 12 bits for y, and 10 bits for z. This gives you a total search space of only 30 bits. You can take a 30-bit string and treat the first 8 bits as encoding x, the next 12 as y, and the last 10 as z. You are also free to Gray-code each one to remove the Hamming cliffs.

I've personally done the following in the past:

inline void binary_encoding::encode(const vector<double>& params)
{
    unsigned int start=0;

    for(unsigned int param=0; param<params.size(); ++param) {
        // m_bpp[i] = number of bits in encoding of parameter i
        unsigned int num_bits = m_bpp[param];

        // map the double onto the appropriate integer range
        // m_range[i] is a pair of (min, max) values for ith parameter
        pair<double,double> prange=m_range[param];
        double range=prange.second-prange.first;
        double max_bit_val=pow(2.0,static_cast<double>(num_bits))-1;
        int int_val=static_cast<int>((params[param]-prange.first)*max_bit_val/range+0.5);

        // convert the integer to binary
        vector<int> result(m_bpp[param]);
        for(unsigned int b=0; b<num_bits; ++b) {
            result[b]=int_val%2;
            int_val/=2;
        }

        if(m_gray) {
            for(unsigned int b=0; b<num_bits-1; ++b) {
                result[b]=!(result[b]==result[b+1]);
            }
        }

        // insert the bits into the correct spot in the encoding
        copy(result.begin(),result.end(),m_genotype.begin()+start);
        start+=num_bits;
    }
}

inline void binary_encoding::decode()
{
    unsigned int start = 0;

    // for each parameter
    for(unsigned int param=0; param<m_bpp.size(); param++) {
        unsigned int num_bits = m_bpp[param];
        unsigned int intval = 0;
        if(m_gray) {
            // convert from gray to binary
            vector<int> binary(num_bits);
            binary[num_bits-1] = m_genotype[start+num_bits-1];
            intval = binary[num_bits-1];
            for(int i=num_bits-2; i>=0; i--) {
                binary[i] = !(binary[i+1] == m_genotype[start+i]);
                intval += intval + binary[i];
            }
        }
        else {
            // convert from binary encoding to integer
            for(int i=num_bits-1; i>=0; i--) {
                intval += intval + m_genotype[start+i];
            }
        }

        // convert from integer to double in the appropriate range
        pair<double,double> prange = m_range[param];
        double range = prange.second - prange.first;
        double m = range / (pow(2.0,double(num_bits)) - 1.0);

        // m_phenotype is a vector<double> containing all the decoded parameters
        m_phenotype[param] = m * double(intval) + prange.first;

        start += num_bits;
    }
}

Note that, for reasons that probably don't matter to you, I wasn't using bit vectors -- just ordinary `vector<int>` to encode things. And of course, there's a bunch of stuff tied into this code that isn't shown here, but you should get the basic idea.

One other note, if you're doing GPU calculations or if you have a particular problem such that 64 bits are the appropriate size anyway, it may be worth the extra overhead to stuff everything into native words. Otherwise, I would guess that the overhead you add to the search process will probably overwhelm whatever benefits you get by faster encoding and decoding.

deong
  • 3,820
  • 21
  • 18
1

Edit: I've decided I was being a bit silly with this. While you do end up with a double, it assumes that the bitset holds an integer, which is a big assumption to make. You will end up with a predictable and repeatable value per bitset, but I still don't think this is what the author intended.

Well if you iterate over the bit values and do

output_double += pow( 2, 64-(bit_position+1) ) * bit_value;

That would work, as long as bit 0 is treated as the most significant bit (big-endian bit order).

Dennis
  • 3,683
  • 1
  • 21
  • 43
  • I’m guessing that `^` here should be the power operator? that doesn’t work in C++ (but you can use bit shifting instead) and the algorithm is also wrong. – Konrad Rudolph Apr 27 '12 at 13:13
  • Trying to figure out what is wrong with the algo. I've changed the ^... sorry started in pseudo code and forgot to edit before posting. EDIT: Can't spot any problem with the algorithm. – Dennis Apr 27 '12 at 13:18
  • 1
    `pow` is way too slow since it operates on floating point. your algorithm converts integers, not `double` in IEEE format. – Konrad Rudolph Apr 27 '12 at 13:27
  • I'm not sure I understand your objection Konrad. All `pow` variations return double or float. You can divide by a precision factor to obtain a non-natural number if that is what you mean, but from the information provided it was not clear that that would not do. Obviously you could also use a bit shifting version of this same algorithm if necessary but premature-optimisation is the root of evil and all that. – Dennis Apr 27 '12 at 13:34
  • 1
    Using bitshifting here instead of floating-point `pow` isn’t premature, it’s an algorithmic improvement. Regarding the other objection, your code simply doesn’t work with the IEEE floating point representation. You are assuming that the input bitset represents an integer but the OP clearly means that the input represents the bits in a 64 bit IEEE floating point. In order to convert this to a `double` without aliasing (pointer cast) you need to read and interpret the mantissa and exponent manually. Your code doesn’t do this. – Konrad Rudolph Apr 27 '12 at 15:22
  • I see your point now, and I agree that if the bit set is a bit for bit representation of a double as stored according to the IEEE standard then this will not translate it at all. But I honestly don't think that that is clear from the question at all. The bitset in fact represents a genome, and all the author wants is to convert it to a double. This algorithm will do that. It may not be the double he wanted or expected but it will do what he asked in the question. – Dennis Apr 27 '12 at 17:12