I've recently been implementing a specialized parser for a slightly modified Abstract Syntax Notation. The specification says that integers are encoded as an array of octets which are to be interpreted as a binary two's-complement integer.

So, at first I thought the best way to unserialize this into an actual C++ int would be to simply start with a value of 0, and then OR each octet with the value like:

uint64_t value = 0;
int shift = 0;
std::vector<uint8_t> octets = { /* some values */ };

for (auto it = octets.rbegin(); it != octets.rend(); ++shift, ++it)
{
  value |= uint64_t(*it) << (shift * 8);
}

This would leave me with a bit pattern stored in value, which I could then interpret as a signed (two's-complement) integer by casting it:

int64_t signed_value = static_cast<int64_t>(value);

But it occurred to me that this is really relying on implementation-defined behavior. C++ doesn't guarantee that signed integers are represented as two's complement. So, to get the actual value of the encoded integer as a C++ int64_t, I'd need to actually calculate the summation of 2^N for each Nth bit in the bit pattern, taking into account the sign bit. This seems kind of silly when I know that casting should just work most of the time.

Is there a better solution here that would be both portable and efficient?

Channel72
  • According to http://en.cppreference.com/w/cpp/types/integer in `c++11` you *are* guaranteed 2s complement for the signed size-specific integer typedefs. – BoBTFish Jul 12 '13 at 18:28
  • @BoBTFish, that's the greatest news I've heard all day... if it's true. But the c++11 draft standard says: "Types bool, char, char16_t, char32_t, wchar_t, and the signed and unsigned integer types are collectively called integral types. A synonym for integral type is integer type. The representations of integral types shall define values by use of a pure binary numeration system. [ Example: this International Standard permits 2’s complement, 1’s complement and signed magnitude representations for integral types. —end example ]" – Channel72 Jul 12 '13 at 18:40
  • Yes, but I did have some vague memory of things changing for `c++11`, so that's why I went to look. I'll see what I can dig up in The Standard. The thing is, those typedefs aren't required to exist anyway. – BoBTFish Jul 12 '13 at 18:42
  • I found http://stackoverflow.com/a/5254075/1171191 but that seems to be related to `c`. Nothing similar I can find in The `c++` Standard. – BoBTFish Jul 12 '13 at 18:59
  • AHA! Section 18.4.1 Header `<cstdint>` synopsis, paragraph 2: "The header defines all functions, types, and macros the same as 7.18 in the C standard." Not honestly sure if that includes the requirements on types, or just that the actual names of the types in the typedefs have to be the same as a `c` implementation on the same platform. (Edit: That's in N3337, first draft released *after* the actual 2011 Standard.) – BoBTFish Jul 12 '13 at 19:07
  • Well, any sane hardware developer will use 2s complement representation, because it allows the processor to completely forget about signs while doing arithmetic; the only points where signs crop up are when loading values from memory/opcodes and when comparing integers. So I think it is safe to assume 2s complement hardware... – cmaster - reinstate monica Jul 12 '13 at 19:11

1 Answer

If your solution works, I think you can use a bit of metaprogramming to test whether your platform is one's complement or two's complement.

struct is_ones_complement {
    // In one's complement, -1 is ...1110, so its lowest bit is clear.
    static const bool value = ((1 & -1) == 0);
};
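
As an aside, the check above only separates one's complement from everything else; a sign-magnitude platform would also report `false`. A stricter compile-time test could look like the sketch below. The trait name `is_twos_complement` is hypothetical, and `static_assert` assumes C++11 or later.

```cpp
// Hypothetical trait, not part of the original answer.
struct is_twos_complement {
    // -1 & 3 inspects the two lowest bits of -1:
    //   two's complement: ...1111 & 0011 == 3
    //   one's complement: ...1110 & 0011 == 2
    //   sign-magnitude:   1...0001 & 0011 == 1
    static const bool value = ((-1 & 3) == 3);
};

// Refuse to compile on anything that is not two's complement.
static_assert(is_twos_complement::value,
              "this parser assumes two's-complement signed integers");
```

With this in place, a build on an unexpected platform fails loudly instead of silently decoding wrong values.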

And then, you can write an inlinable conversion function:

// Returns int64_t, since the goal is the decoded signed value.
template<bool is_ones_complement>
int64_t convert_impl(const std::vector<uint8_t>& vec);

template<>
int64_t convert_impl<true>(const std::vector<uint8_t>& vec) {
    // Your specialization for 1's-complement platforms
}

template<>
int64_t convert_impl<false>(const std::vector<uint8_t>& vec) {
    // Your specialization for 2's-complement platforms
}

inline int64_t convert(const std::vector<uint8_t>& vec) {
    return convert_impl<is_ones_complement::value>(vec);
}

Untested, but it should work.
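
For what it's worth, the decoding itself could be written so that it needs no platform-specific specialization at all. The following is only a sketch under assumptions that are not in the question: the helper name `decode_twos_complement` is hypothetical, the octets are taken most-significant first (as in the question's reverse-iterator loop), and between 1 and 8 octets are supplied. It sign-extends from the first octet, then converts to `int64_t` with plain arithmetic instead of an unsigned-to-signed cast of an out-of-range value.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: decode big-endian two's-complement octets into int64_t.
inline std::int64_t decode_twos_complement(const std::vector<std::uint8_t>& octets) {
    if (octets.empty() || octets.size() > 8) {
        throw std::invalid_argument("expected 1 to 8 octets");
    }

    // Sign-extend: start from all ones if the top bit of the first octet is set.
    std::uint64_t bits = (octets.front() & 0x80u) ? ~std::uint64_t(0) : 0;
    for (std::size_t i = 0; i < octets.size(); ++i) {
        bits = (bits << 8) | octets[i];
    }

    // Non-negative values fit directly, so this cast is well defined.
    const std::uint64_t max_positive =
        static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
    if (bits <= max_positive) {
        return static_cast<std::int64_t>(bits);
    }

    // bits encodes the negative value v = bits - 2^64, and ~bits == -v - 1,
    // which always fits in int64_t. Recover v arithmetically.
    return -static_cast<std::int64_t>(~bits) - 1;
}
```

Unlike the loop in the question, this also handles short encodings: a single octet `0xFF` decodes to -1 rather than 255, because the value is sign-extended before conversion.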

Laurent LA RIZZA