12

I am using HIDAPI to send some data to a USB device. This data can only be sent as a byte array, and I need to send some float numbers inside it. I know floats are 4 bytes, so I thought this might work:

float f = 0.6;
char data[4];

data[0] = (int) f >> 24;
data[1] = (int) f >> 16;
data[2] = (int) f >> 8;
data[3] = (int) f;

And later all I had to do is:

g = (float)((data[0] << 24) | (data[1] << 16) | (data[2] << 8) | (data[3]) );

But testing shows that lines like data[0] = (int) f >> 24; always yield 0. What is wrong with my code, and how can I do this correctly (i.e. split a float's internal data into 4 char bytes and rebuild the same float later)?


EDIT:

I was able to accomplish this with the following code:

float f = 0.1;
unsigned char *pc;
pc = (unsigned char*)&f;

// 0.6 in float
pc[0] = 0x9A;
pc[1] = 0x99;
pc[2] = 0x19;
pc[3] = 0x3F;

std::cout << f << std::endl; // will print 0.6

and

*(unsigned int*)&f = (0x3F << 24) | (0x19 << 16) | (0x99 << 8) | (0x9A << 0);

I know memcpy() is a "cleaner" way of doing it, but this way I think the performance is somewhat better.
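
On the performance point, a minimal sketch may help. It assumes `float` is a 32-bit IEEE 754 binary32 value (true on most current platforms, but not guaranteed by C) and a C11 compiler for `static_assert`; the helper names `float_to_bits` and `bits_to_float` are made up for illustration. Optimizing compilers commonly reduce a fixed-size `memcpy` like this to a single register move, so the cast through `unsigned int*` is unlikely to be faster, and `memcpy` sidesteps the aliasing problem that cast has.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide (IEEE 754 binary32 on most platforms). */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Return the object representation of f as a 32-bit integer. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Rebuild a float from the 32-bit pattern produced by float_to_bits. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With binary32 floats, `float_to_bits(0.6f)` yields `0x3F19999A`, the same four bytes (`9A 99 19 3F` in little-endian order) that the edit above writes by hand.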

JeJo
Michel Feinstein
  • 1
    The reason `(int)f >> 24` returns `0` is that `f` cast to `int` is equal to `0` in the first place: the cast truncates the float toward zero. It's undefined behavior, but to do it that hacky way you would need something like `*(int*)&f >> 24`. – Andrey Mishchenko Jan 08 '14 at 20:46

6 Answers

25

You can do it like this:

char data[sizeof(float)];


float f = 0.6f;

memcpy(data, &f, sizeof f);    // send data


float g;

memcpy(&g, data, sizeof g);    // receive data

In order for this to work, both machines need to use the same floating point representations.
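
One way to catch a mismatch early is to check the representation at compile time. The sketch below assumes a C11 compiler (for `static_assert`) and assumes the wire format is meant to be IEEE 754 binary32; `host_is_little_endian` is a hypothetical helper. The checks cover only the machine they are compiled for, so the other end needs equivalent checks or documentation.

```c
#include <assert.h>   /* static_assert (C11) */
#include <float.h>    /* FLT_RADIX, FLT_MANT_DIG, FLT_MAX_EXP */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>
#include <string.h>

/* Assumption: the protocol expects 4-byte IEEE 754 binary32 floats. */
static_assert(CHAR_BIT == 8, "bytes must be 8 bits wide");
static_assert(sizeof(float) == 4, "float must occupy 4 bytes");
static_assert(FLT_RADIX == 2 && FLT_MANT_DIG == 24 && FLT_MAX_EXP == 128,
              "float does not look like IEEE 754 binary32");

/* Returns 1 if the least significant byte of a 32-bit integer is stored
   first, 0 otherwise. Float byte order usually follows integer byte order,
   but the C standard does not guarantee that either. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

If any assertion fails, the build stops instead of silently sending bytes the receiver would misread.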


As was rightly pointed out in the comments, you don't necessarily need to do the extra memcpy; instead, you can treat f directly as an array of characters (of any signedness). You still have to do memcpy on the receiving side, though, since you may not treat an arbitrary array of characters as a float! Example:

unsigned char const * const p = (unsigned char const *)&f;
for (size_t i = 0; i != sizeof f; ++i)
{
    printf("Byte %zu is %02X\n", i, p[i]);
    send_over_network(p[i]);
}
Kerrek SB
  • Silly question: why the 'f' after 0.6? I have seen it before, just never saw the reason... – Michel Feinstein Jan 08 '14 at 20:49
  • I like this answer, but I got curious: aren't there any other ways of doing byte-level access on a float? – Michel Feinstein Jan 08 '14 at 20:54
  • 1
    @mFeinstein `0.6f` is a constant of type `float`, `0.6` is a constant of type `double`. A constant of type `double` would be converted to `float` automatically anyway (but naive compilers might generate worse code for `f = 0.6`, and some platforms might round differently). There are other ways of doing byte-level access, but `memcpy` is the best way here. – Gilles 'SO- stop being evil' Jan 08 '14 at 21:19
  • @Gilles: Yes, you can treat `f` directly as an array of chars: `char const * data = (char const *)&f;`, now use `data[i]` for a range of `i`. – Kerrek SB Jan 08 '14 at 21:20
  • There are places where failing to distinguish between float, double, and int constants may get you in trouble, so it isn't a bad idea to make a habit of always specifying the `f` or `d` suffix. Having said that, I must admit I tend to do so only when I either need to make sure it's the right type or need to make sure the reader understands what type it is. – keshlam Jan 08 '14 at 21:21
  • @mFeinstein: That's a `float` literal. I thought that's what's required, so that's what I'll supply. – Kerrek SB Jan 08 '14 at 21:23
  • @mFeinstein : Check my answer to see how to access individual bytes in a float. – pablo1977 Jan 08 '14 at 22:31
9

Standard C guarantees that an object of any type can be accessed as an array of bytes. A straightforward way to do this is, of course, to use a union:

 #include <stdio.h> 

 int main(void)
 {
    float x = 0x1.0p-3; /* 2^(-3) in hexa */

    union float_bytes {
       float val;
       unsigned char bytes[sizeof(float)];
    } data;

    data.val = x;
    for (size_t i = 0; i < sizeof(float); i++) 
          printf("Byte %zu: %.2x\n", i, data.bytes[i]);

    data.val *= 2;   /* Doing something with the float value */
    x = data.val;    /* Retrieving the float value           */
    printf("%.4f\n", data.val);

    getchar();
 }

As you can see, it is not necessary at all to use memcpy or pointers...

The union approach is easy to understand, standard and fast.

EDIT.

I will explain why this approach is valid in C (C99).

  • [5.2.4.2.1(1)] A byte has CHAR_BIT bits (an integer constant >= 8; in almost all cases it is 8).
  • [6.2.6.1(3)] The unsigned char type uses all its bits to represent the value of the object, which is a nonnegative integer, in a pure binary representation. This means that there are no padding bits or bits used for any other purpose. (The same is not guaranteed for the signed char or char types.)
  • [6.2.6.1(2)] Every non-bitfield type is represented in memory as a contiguous sequence of bytes.
  • [6.2.6.1(4)] (Cited) "Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); [...]"
  • [6.7.2.1(14)] A pointer to a union object, suitably converted, points to each of its members. (Thus, there are no padding bytes at the beginning of a union.)
  • [6.5(7)] The content of an object can be accessed by a character type:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.

More information:

A discussion in google groups
Type-punning

EDIT 2

Another detail of the standard C99:

  • [6.5.2.3(3) footnote 82] Type-punning is allowed:

If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

pablo1977
  • 3
    I'd like to note that while this may be valid in C (C99, I think, but not C89?), this would be undefined behaviour in C++. Just in case any C++ users walk by and see this. – Kerrek SB Jan 08 '14 at 22:42
  • @KerrekSB : Yes, my code is standard in C99. I am not pretty sure if the union technique is standard in C89. (The `int` inside the `for` loop is valid only in C99). However, the question has the tag `c` but not `c++`. – pablo1977 Jan 08 '14 at 22:53
  • Yes, of course, it was just a remark. Sometimes people think they can apply things from one language to the other, so I just wanted to be on the safe side. C is a lot more relaxed about accessing memory than C++ is; I don't have the standard reference that says this is valid, but I trust you're right. – Kerrek SB Jan 08 '14 at 23:03
  • @KerrekSB : Note that I am aliasing the `float` object with `unsigned char[]`. This is very specific, and I think it is standard in C++, too. One gets undefined behaviour by using a type other than `unsigned char`. [see this discussion](https://groups.google.com/forum/#!topic/comp.lang.c++.moderated/eMqQL2vds0c) – pablo1977 Jan 09 '14 at 02:50
  • 1
    @KerrekSB I walked into the type-punning through unions mine field in [this answer](http://stackoverflow.com/a/20956250/1708801) and I linked some of the better discussions on this. Pascal Cuoq's interpretation and the DR he links supports that it has been legal since C89. The C++ case is not clear at all and I would lean on the side of it being undefined but it may not be. – Shafik Yaghmour Jan 09 '14 at 04:13
  • You might want to add a link to the draft C99 standard, you can find a ton of standards drafts [here](http://stackoverflow.com/questions/81656/where-do-i-find-the-current-c-or-c-standard-documents). – Shafik Yaghmour Jan 09 '14 at 04:31
  • Thats an interesting hack...but I am worried about moving the code to C++ in the future... – Michel Feinstein Jan 09 '14 at 07:47
  • @ShafikYaghmour: In C++ you have the rule about only being allowed to access the active union member, though, which I think is not terribly unclear. On the other hand, the paragraph you linked in your answer [refers to something else](http://stackoverflow.com/questions/20275322/how-to-access-an-objects-storage-through-an-aggregate) I believe, not to whether it's OK to access different union members at once. – Kerrek SB Jan 09 '14 at 08:28
  • 1
    @KerrekSB I was referring to [Purpose of Unions in C and C++](http://stackoverflow.com/questions/2310483/purpose-of-unions-in-c-and-c) and also I linked this one [Accessing inactive union member - undefined?](http://stackoverflow.com/questions/11373203/accessing-inactive-union-member-undefined/11996970#11996970). – Shafik Yaghmour Jan 09 '14 at 10:25
  • @ShafikYaghmour: Thanks! The [defect report](http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm) and subsequent fix really clarifies that this has always been intended to be permitted in C. Very good to know. – Kerrek SB Jan 09 '14 at 10:45
  • In C89 this is explicitly implementation-defined (C90 §6.3.2.3 note 33). It seems that all versions of GCC (as a C compiler) implement C11-compliant behavior on this respect. – Gilles 'SO- stop being evil' Jan 09 '14 at 15:45
  • I've added more details to my answer. There is an explicit mention of type-punning in C99 (footnote 82). – pablo1977 Jan 09 '14 at 16:39
  • @mFeinstein: About my answer, I think that the `union` + `unsigned char []` technique is useful to "read" the bytes of an object, but the attempt to "write back" those bytes to the array in order to "recover" the original object is probably not legal (it would be undefined behaviour). I am not an expert in C++, but I think that the "spirit" of C++ is to keep itself a very high-level language, so you should avoid "low-level hacks". You probably need to do more research on the `float` representation in C. – pablo1977 Jan 09 '14 at 16:55
  • @pablo1977 I remember in school doing some memory dumps with ints and floats and changing the values to see the behavior, but I guess its a long time so I forgot the way it was done lol...not I think it was a unsigned char thing just like you guys showed – Michel Feinstein Jan 09 '14 at 19:23
  • @mFeinstein: In practice it's almost certain that type-punning doesn't cause any problems. But you cannot be sure about anything. Can you be sure that your program will always run on a system with no padding bits (for integer types), two's complement and little-endianness? If your answer is "no" or "don't know", then you have to use `unsigned char`, because it is the only type in C whose precise binary representation is guaranteed by the standard. Now, if you write a member of a `union`, can you be sure what happens if you read another member of it? (Almost) only arrays of character types are guaranteed to work. – pablo1977 Jan 09 '14 at 19:53
  • @mFeinstein: (and) Can you be sure that float will always be 4 bytes, little-endian and in the binary32 format detailed by the IEEE 754 standard? If your answer is "no" or "don't know", then stick to the standard and, better, investigate how the float types are represented on the two devices that you are connecting. This depends on several factors. Read the compiler docs for more information, or check the values of the constants given in `<float.h>`. Also, it would be a good idea to study floating point formats. [IEEE 754](http://en.wikipedia.org/wiki/IEEE_floating_point) – pablo1977 Jan 09 '14 at 20:10
  • @pablo1977 I believe it complies with the standards, but in any case, I am making some test routines to see if the values are being sent as expected – Michel Feinstein Jan 09 '14 at 22:01
1

The C language guarantees that any value of any type¹ can be accessed as an array of bytes. The type of bytes is unsigned char. Here's a low-level way of copying a float to an array of bytes. sizeof(f) is the number of bytes used to store the value of the variable f; you can also use sizeof(float) (you can either pass sizeof a variable or more complex expression, or its type).

float f = 0.6;
unsigned char data[sizeof(float)];
size_t i;
for (i = 0; i < sizeof(float); i++) {
    data[i] = ((unsigned char*)&f)[i];  /* byte i of f's object representation */
}

The functions memcpy or memmove do exactly that (or an optimized version thereof).

float f = 0.6;
unsigned char data[sizeof(float)];
memcpy(data, &f, sizeof(f));  /* memcpy takes the address of f, not its value */

You don't even need to make this copy, though. You can directly pass a pointer to the float to your write-to-USB function, and tell it how many bytes to copy (sizeof(f)). You'll need an explicit cast if the function takes a pointer argument other than void*.

int write_to_usb(unsigned char *ptr, size_t size);
int result = write_to_usb((unsigned char*)&f, sizeof(f));

Note that this will work only if the device uses the same representation of floating point numbers, which is common but not universal. Most machines use the IEEE floating point formats, but you may need to switch endianness.
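
When byte order may differ between the two ends, one option is to fix the order on the wire instead of copying the float's bytes as they sit in memory. The following is a sketch under the assumption that both sides use 32-bit IEEE 754 floats and that float and integer byte order agree on each machine; the function names and the choice of big-endian order are illustrative, not part of HIDAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Write f to out[0..3], most significant byte first. */
void float_to_be_bytes(float f, unsigned char out[4])
{
    uint32_t bits;
    assert(out != NULL);
    memcpy(&bits, &f, sizeof bits);          /* copy the bits, not the value */
    out[0] = (unsigned char)(bits >> 24);
    out[1] = (unsigned char)(bits >> 16);
    out[2] = (unsigned char)(bits >> 8);
    out[3] = (unsigned char)bits;
}

/* Rebuild the float from four bytes in the same order. */
float float_from_be_bytes(const unsigned char in[4])
{
    uint32_t bits;
    float f;
    assert(in != NULL);
    bits = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

This is the shift-based layout from the question, applied to the float's bit pattern rather than to its converted value, so the receiver gets the same bytes regardless of the sender's endianness.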


As for what is wrong with your attempt: the >> operator operates only on integers. In the expression (int) f >> 24, the cast binds more tightly than the shift, so f is first converted to an int and the result is then shifted; writing f >> 24 without the cast would not compile at all, because both operands of >> must have integer type. Converting a floating point value to an integer discards the fractional part (truncation toward 0). 0.6 converted to an int is 0, so data[0] and all the other elements are 0.
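
To see the effect in isolation, here is a small sketch; `shift_converted_value` is a hypothetical helper that mirrors the question's expression, and it is only meant for small non-negative inputs.

```c
#include <assert.h>
#include <limits.h>

/* Mirrors (int) f >> shift from the question: the cast converts the VALUE
   of f to int, discarding the fraction, and only then is the shift applied. */
int shift_converted_value(float f, int shift)
{
    int truncated;
    /* Keep the demonstration within well-defined territory. */
    assert(f >= 0.0f && shift >= 0 && shift < (int)(sizeof(int) * CHAR_BIT));
    truncated = (int)f;       /* 0.6f becomes 0, 300.9f becomes 300 */
    return truncated >> shift;
}
```

`shift_converted_value(0.6f, 24)` is 0 because the int being shifted is already 0; even `shift_converted_value(300.9f, 8)` only yields 1 (300 divided by 256), which has nothing to do with the bytes stored in the float.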

You need to act on the bytes of the float object, not on its value.

¹ Excluding functions which can't really be manipulated in C, but including function pointers which functions decay to automatically.

Gilles 'SO- stop being evil'
0

Assuming that both devices have the same notion of how floats are represented, why not just use memcpy? For example:

unsigned char payload[4];
memcpy(payload, &f, 4);
Ed Heal
  • Because the bytes will be read back in a microcontroller and I wasn't sure there was a memcpy in the microcontroller library... but now I see there is – Michel Feinstein Jan 08 '14 at 21:08
0

The safest way to do this, if you control both sides, is to send some sort of standardized representation. This isn't the most efficient, but it isn't too bad for small numbers.

hostPort writes char * "34.56\0" byte by byte
client reads char * "34.56\0" 

The client then converts it to a float with the library function atof or atof_l.

Of course that isn't the most optimized, but it sure will be easy to debug.
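
A rough sketch of the text-based approach follows. It uses `snprintf` and `strtof` (instead of `atof`, so that parse errors can be detected) with nine significant digits, which is enough to round-trip an IEEE 754 binary32 value. The function names are made up here, and both ends are assumed to run in the default "C" locale so the decimal separator is a period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Format f as text. Returns the string length, or -1 if buf is too small. */
int float_to_text(float f, char *buf, size_t size)
{
    int n;
    assert(buf != NULL);
    n = snprintf(buf, size, "%.9g", f);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Parse text produced by float_to_text. Returns 0 on success, -1 on error. */
int text_to_float(const char *text, float *out)
{
    char *end;
    float value;
    assert(text != NULL && out != NULL);
    value = strtof(text, &end);
    if (end == text || *end != '\0')
        return -1;            /* nothing parsed, or trailing garbage */
    *out = value;
    return 0;
}
```

The sender would transmit the buffer filled by `float_to_text`, terminator included, and the receiver would pass the received string to `text_to_float`. A 16-byte buffer is enough for any float formatted this way.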

If you wanted to get more optimized and creative: the first byte is the length, then the exponent, then each byte represents 2 decimal places. So

34.56 becomes char array[] = {4,-2,34,56}; something like that would be portable. I would just try not to pass binary float representations around, because it can get messy fast.

Grady Player
  • This would be too cumbersome for my needs, since I have a microcontroller receiving the data and its performance is not the best around – Michel Feinstein Jan 08 '14 at 20:51
-2

It might be safer to union the float and char array. Put in the float member, pull out the 4 (or whatever the length is) bytes.

Phil Perry
  • Actually, no, abusing a union this way is not safe. If you write to a member of a union, you're not allowed to read back from another member (it's undefined behavior). Compilers take advantage of this restriction to optimize. – Gilles 'SO- stop being evil' Jan 08 '14 at 21:17
  • Bull. That's the whole purpose of a **union**... to put _in_ data in one format and pull _out_ the unaltered bits as another format. – Phil Perry Jan 09 '14 at 14:28
  • 1
    No, this is not the purpose of a union. The purpose of a union is to use the same slice of memory to store different data at different times. In C89, the behavior is implementation-defined. In C++, GCC does take advantage of this to optimize; I thought some versions also did this in C, but after looking it up I was wrong: this is safe with GCC. [C11 has also changed to make this defined](http://stackoverflow.com/questions/11373203/accessing-inactive-union-member-undefined/11996970#11996970), so it's probably safe in practice on current compilers, even if they aren't fully C11 compliant. – Gilles 'SO- stop being evil' Jan 09 '14 at 15:42
  • Wrong. The _primary_ use of a **union** is to allow the bits and bytes in a given chunk of data to be accessed in different ways -- as a float and as an array of chars (bytes), for example. It _can_ be used to save some space by using the memory for different purposes at different _times_, but that's a secondary usage. – Phil Perry Jan 09 '14 at 17:05
  • 1
    Again, no, you have it exactly backwards. Type punning in unions was explicitly not standard (though widely supported) in earlier versions of the standard. The primary purpose of a union is to store different, unrelated objects in the same space in memory (often, but not necessarily, with an enum or integer object nearby indicating which field of the union is currently valid). – Gilles 'SO- stop being evil' Jan 09 '14 at 17:45
  • @PhilPerry: Don't insist on that idea. Gilles is right. What happens if you do type-punning with a `signed long` that has padding bits, for example? Only `unsigned char` is guaranteed to be compatible with the intent of "retrieving" the bits of an object. But "compatible" doesn't mean "accessible". To ensure that the bytes are "accessible" through another member, we need another claim of the standard. I think one can "read" the bytes, but perhaps "writing back" bytes is UB. Check my answer for more details. – pablo1977 Jan 09 '14 at 18:39
  • 1
    @Gilles The footnote in the answer http://stackoverflow.com/questions/11373203/accessing-inactive-union-member-undefined/11996970#11996970 that you link to was already there in C99TC3. If you believe in standard committee infallibility, this means that it was in the C99 standard all along, although not expressed explicitly. Under this interpretation, none of C89, C99 and C11 make it undefined to use a union for type-punning. – Pascal Cuoq Jan 09 '14 at 23:17