2

I have a data stream that I get from a device. The chunks are 8 bytes, which I read as uint64_t words. A few of the higher bits (not all of the upper 4 bytes) are flags that define the type of data. Some, but not all, of the chunks have the lower 4 bytes representing a float in binary representation.

How do I correctly extract that part into a float variable?

Let word be {4-byte flags, 4-byte float [LSB]}.

This "seems" to work:

float extracted = *reinterpret_cast<float *>(&word);

Yet, the compiler (GCC 10) with '-Wall' warns about type-punning

warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]

for optimization levels >= -O2.

I suspect I'm doing all sorts of evil here and don't feel comfortable with the warning. What's the correct way of doing this?

Thanks for your help!

JaMiT
millow
  • If you want the lower bytes you'll need to treat that as `float[2]` won't you? – tadman May 15 '20 at 19:38
  • "A uint64_t word is encoded such that the lower 4 bytes are a float in binary representation" - what makes you think that? – Jesper Juhl May 15 '20 at 19:40
  • Sorry if the question was misleading. This is a data stream that I get from a device. The chunks are 8 bytes; a few of the higher bits (not all 4 bytes) are flags that define the type of data. "Some", not all, of them have the lower 4 bytes encoded in the way I described in the question. – millow May 15 '20 at 19:54
  • No need to use a `uint64_t`; you can use a struct with the proper members and so get rid of the aliasing problem. – Werner Henze May 15 '20 at 19:58

3 Answers

6

Type punning through a cast pointer may "work" in your compiler, but the C++ standard's strict-aliasing rules make it undefined behavior. memcpy() (or equivalent) is really the only option the standard supports, e.g.:

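#include <cstdint>
#include <cstring>

// Casting a pointer-to-type to a pointer-to-char to access the
// type's raw bytes IS allowed by the C++ standard.
// Caveat: this takes the float from the second 4 bytes in memory.
// On a little-endian host the lower 4 bytes of word come first,
// so check the offsets against your platform's byte order.
uint64_t word = ...;
uint32_t flags;
float extracted;
std::memcpy(&flags, &word, sizeof(flags));
std::memcpy(&extracted, reinterpret_cast<char*>(&word)+sizeof(flags), sizeof(extracted));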

Or:

#pragma pack(push, 1) // or equivalent
struct flags_and_float
{
    uint32_t flags;
    float value;
};
#pragma pack(pop) // or equivalent

uint64_t word = ...;
flags_and_float ff;
std::memcpy(&ff, &word, sizeof(ff));
float extracted = ff.value;
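
Because the question describes the float as the lower 4 bytes of the value rather than as a position in memory, a byte-order-independent variant is to split the word arithmetically and copy the bits afterwards. This is only a minimal sketch: `extract_float` and `extract_flags` are hypothetical helper names, and it assumes `float` is a 32-bit IEEE 754 type.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: float is 32 bits wide, so its bits fit a uint32_t exactly.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Hypothetical helper: returns the float stored in the low 32 bits of word.
inline float extract_float(std::uint64_t word)
{
    // Masking operates on the value, so host byte order does not matter.
    const std::uint32_t bits = static_cast<std::uint32_t>(word & 0xFFFFFFFFu);
    float value;
    std::memcpy(&value, &bits, sizeof(value)); // well-defined copy of the bits
    return value;
}

// Hypothetical helper: returns the upper 32 bits, which contain the flags.
inline std::uint32_t extract_flags(std::uint64_t word)
{
    return static_cast<std::uint32_t>(word >> 32);
}
```

Compilers typically reduce the `memcpy` of four bytes to a single register move, so this should cost no more than the cast.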
Remy Lebeau
  • Thanks to you and the many other insightful posts. Why do I need a reinterpret_cast in std::memcpy? Isn't that implicitly converted to void* anyway? `std::memcpy(&extracted, reinterpret_cast<char*>(&word)+sizeof(flags), sizeof(extracted));` – millow May 15 '20 at 21:44
  • "*Why do I need a reinterpret_cast*" - for the **pointer arithmetic** that is being used *before* `memcpy()` is called. The starting memory address of `word` is cast to `char*` **and then incremented by 4 bytes**; that new address (the address of the `float`) is what gets passed to `memcpy()`. Notice that the 1st `memcpy()` does not use `reinterpret_cast`, because it uses the starting memory address of `word` (the address of the `flags`) as-is. – Remy Lebeau May 15 '20 at 22:18
2

To comply with the strict-aliasing rules, it would help to read your data stream as unsigned char (or std::byte if you can use C++17) and parse it in groups of 8. The rules have an exception for byte types (char, unsigned char, and std::byte): a pointer to one of them may access the storage of any object, so the bytes can be copied into a variable of the intended type without an aliasing violation. So treat the data as a sequence of bytes until you know how you want to interpret it.
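
A minimal sketch of that approach follows. The question does not state the byte layout on the wire, so this assumes that bytes 0..3 of each chunk hold the float and bytes 4..7 hold the flags, both in host byte order (which matches the question's cast working on a little-endian machine). `Chunk` and `parse_chunk` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical decoded form of one 8-byte chunk.
struct Chunk
{
    std::uint32_t flags;
    float value;
};

// Assumed layout: bytes 0..3 = float, bytes 4..7 = flags, host byte order.
// Adjust the offsets if your device orders the fields differently.
inline Chunk parse_chunk(const unsigned char* bytes)
{
    Chunk c;
    std::memcpy(&c.value, bytes, sizeof(c.value));
    std::memcpy(&c.flags, bytes + sizeof(c.value), sizeof(c.flags));
    return c;
}
```

The caller would inspect `flags` first and use `value` only for chunk types that actually carry a float.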

JaMiT
-2

If you want to get rid of the warning, cast the address to void* before casting to float*.

float extracted = *static_cast<float*>(static_cast<void*>(&word));

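Note that this only silences the warning: the read still accesses a uint64_t object through a float lvalue, which the strict-aliasing rules make undefined behavior even when the referenced bytes really do hold a float.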

SoronelHaetir