1

I need to convert an integer value into a char array at the bit level. Let's say an int has 4 bytes, and I need to split it into 4 one-byte chunks stored in a char array.

Example:

int a = 22445;
// in binary this is 00000000 00000000 01010111 10101101
...
//and the result I expect
char b[4];
b[0] = 0; //first chunk
b[1] = 0; //second chunk
b[2] = 87; //third chunk - in binary 01010111
b[3] = 173; //fourth chunk - 10101101

I need this conversion to be really fast, if possible without any loops (perhaps some tricks with bit operations). The goal is thousands of such conversions per second.

Peter Krejci

6 Answers

3
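int a = 22445;
// unsigned char, so that 173 is not read back as a negative value
unsigned char *b = (unsigned char *)&a;
// The values below assume a 4-byte int on a big-endian machine;
// on a little-endian machine b[0] is 173, b[1] is 87, and b[2], b[3] are 0.
unsigned char b2 = *(b+2); // = 87
unsigned char b3 = *(b+3); // = 173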
Matt
3

I'm not sure if I recommend this, but you can #include <stddef.h> and <sys/types.h> and write:

*(u32_t *)b = htonl((u32_t)a);

(The htonl is to ensure that the integer is in big-endian order before you store it.)
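A variant of the same idea can be sketched with memcpy instead of the pointer cast, which sidesteps any alignment concern about the char array. This is only a sketch: it assumes a POSIX system where htonl is declared in <arpa/inet.h>, it uses the standard uint32_t from <stdint.h>, and the function name int_to_be_bytes is made up for illustration.

```c
#include <arpa/inet.h>  /* htonl (POSIX; assumed available) */
#include <stdint.h>     /* uint32_t */
#include <string.h>     /* memcpy */

/* Hypothetical helper: store a into out[0..3], most significant byte first. */
void int_to_be_bytes(int a, unsigned char out[4])
{
    /* htonl converts from host byte order to big-endian (network) order. */
    uint32_t be = htonl((uint32_t)a);

    /* memcpy copies the four bytes without requiring out to be aligned. */
    memcpy(out, &be, sizeof be);
}
```

Calling int_to_be_bytes(22445, out) should leave 0, 0, 87, 173 in out, whatever the host byte order.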

ruakh
  • I think you are missing a dereference on the left (but IIRC that syntax - pointer cast + dereference as lvalue - may not be standard; you can always declare a separated pointer variable). – Matteo Italia Dec 19 '11 at 22:46
  • @MatteoItalia: Re: missing a dereference: whoops, ha, you're right, fixed now, thanks! Re: pointer cast + dereference as lvalue: Really? A pointer cast in itself is not an lvalue, but I thought that a dereferencing of it *is*. (See e.g. http://stackoverflow.com/questions/7446489/casting-a-pointer-does-not-produce-an-lvalue-why, where several people say that it is, and no one seems to contradict them.) – ruakh Dec 19 '11 at 22:56
  • @ruakh: uh, you are right, I remembered incorrectly the issue discussed in that question. +1, then! :) – Matteo Italia Dec 19 '11 at 22:59
  • I hope this solution is portable, it works well. On Linux you have to use `` and type `uint32_t`. But what about larger data types like `long long`? – Peter Krejci Dec 19 '11 at 23:59
  • @PeterKrejci: [here](http://stackoverflow.com/questions/105252/how-do-i-convert-between-big-endian-and-little-endian-values-in-c) you can find many solutions. – Matteo Italia Dec 20 '11 at 18:41
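For the question about wider types, one possible approach is to drop htonl and use shifts, which do not depend on the host byte order. The sketch below assumes a 64-bit value passed as uint64_t (a long long can be cast to it); the name u64_to_be_bytes is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the 8 bytes of v into out, most significant first. */
void u64_to_be_bytes(uint64_t v, unsigned char out[8])
{
    for (size_t i = 0; i < 8; ++i) {
        /* Shift by 56, 48, ..., 0 so that out[0] receives the top byte. */
        out[i] = (unsigned char)(v >> (56 - 8 * i));
    }
}
```

The loop has a fixed trip count, so a compiler is free to unroll it; writing the eight assignments out by hand is equivalent.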
3

Depending on how you want negative numbers represented, you can simply convert to unsigned and then use masks and shifts:

unsigned char b[4];
unsigned ua = a;

b[0] = (ua >> 24) & 0xff;
b[1] = (ua >> 16) & 0xff;
b[2] = (ua >> 8) & 0xff;
b[3] = ua & 0xff;

(Due to the C rules for converting negative numbers to unsigned, this will produce the two's complement representation for negative numbers, which is almost certainly what you want.)
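The shift-and-mask approach can be wrapped in a pair of functions, one to split and one to rebuild the value. This is a minimal sketch assuming a 32-bit value carried in uint32_t; the names u32_to_be_bytes and be_bytes_to_u32 are invented for the example.

```c
#include <stdint.h>

/* Hypothetical helper: split v into four bytes, most significant first. */
void u32_to_be_bytes(uint32_t v, unsigned char b[4])
{
    b[0] = (unsigned char)((v >> 24) & 0xff);
    b[1] = (unsigned char)((v >> 16) & 0xff);
    b[2] = (unsigned char)((v >> 8) & 0xff);
    b[3] = (unsigned char)(v & 0xff);
}

/* Hypothetical inverse: rebuild the value from b[0] (most significant) to b[3]. */
uint32_t be_bytes_to_u32(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |
            (uint32_t)b[3];
}
```

For a negative input such as -2, the conversion to uint32_t yields 0xFFFFFFFE, so the bytes come out as 0xFF, 0xFF, 0xFF, 0xFE.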

caf
1

To access the binary representation of any type, you can cast a pointer to a char-pointer:

T x;  // anything at all!

// In C++
unsigned char const * const p = reinterpret_cast<unsigned char const *>(&x);

/* In C */
unsigned char const * const p = (unsigned char const *)(&x);

// Example usage:
for (std::size_t i = 0; i != sizeof(T); ++i)
    std::printf("Byte %zu is 0x%02X.\n", i, p[i]);  // pass the index as well as the byte

That is, you can treat p as the pointer to the first element of an array unsigned char[sizeof(T)]. (In your case, T = int.)

I used unsigned char here so that you don't get any sign extension problems when printing the binary value (e.g. through printf in my example). If you want to write the data to a file, you'd use char instead.
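In plain C, the same inspection can be sketched as a small function that formats an object's bytes as hex text. The name bytes_to_hex and the output format are assumptions made for this example; the bytes appear in memory order, so the result for an int depends on the host byte order.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format the n bytes of obj as "XX XX ..." into out.
   out must have room for at least 3 * n characters (and at least 1). */
void bytes_to_hex(const void *obj, size_t n, char *out)
{
    const unsigned char *p = (const unsigned char *)obj;

    out[0] = '\0';
    for (size_t i = 0; i < n; ++i) {
        if (i != 0)
            *out++ = ' ';  /* separator between bytes */
        /* Reading through unsigned char avoids sign extension for bytes >= 0x80. */
        out += sprintf(out, "%02X", (unsigned)p[i]);
    }
}
```

Passing &x and sizeof x for any object x gives its byte representation as stored in memory.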

Kerrek SB
  • Watch out for little-endian vs. big-endian: https://stackoverflow.com/questions/1001307/detecting-endianness-programmatically-in-a-c-program#:~:text=If%20the%20first%20byte%20of,the%20system%20is%20Big%2DEndian. – Gelldur Feb 15 '23 at 11:11
0

You have already accepted an answer, but I will still give mine, which might suit you better (or the same...). This is what I tested with:

int a[3] = {22445, 13, 1208132};

for (int i = 0; i < 3; i++)
{
    unsigned char * c = (unsigned char *)&a[i];
    cout << (unsigned int)c[0] << endl;
    cout << (unsigned int)c[1] << endl;

    cout << (unsigned int)c[2] << endl;
    cout << (unsigned int)c[3] << endl;
    cout << "---" << endl;
}

...and it works for me. Now I know you requested a char array, but this is equivalent. You also requested that c[0] == 0, c[1] == 0, c[2] == 87, c[3] == 173 for the first case; here the order is reversed, as it will be on any little-endian machine.

Basically, you use the SAME value, you only access it differently.

Why haven't I used htonl(), you might ask?

Well since performance is an issue, I think you're better off not using it because it seems like a waste of (precious?) cycles to call a function which ensures that bytes will be in some order, when they could have been in that order already on some systems, and when you could have modified your code to use a different order if that was not the case.

So instead, you could have checked the order before, and then used different loops (more code, but improved performance) based on what the result of the test was.
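The check-then-branch idea might look like the sketch below. Both function names are hypothetical, and the else branch simply assumes big-endian (mixed-endian hosts are not handled).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
int is_little_endian(void)
{
    const unsigned int probe = 1u;
    /* On a little-endian host the least significant byte is stored first. */
    return *(const unsigned char *)&probe == 1;
}

/* Hypothetical helper: copy the bytes of v into out, most significant first. */
void uint_to_msb_first(unsigned int v, unsigned char out[sizeof(unsigned int)])
{
    const unsigned char *p = (const unsigned char *)&v;
    const size_t n = sizeof v;

    if (is_little_endian()) {
        /* Memory order is least significant first, so reverse it. */
        for (size_t i = 0; i < n; ++i)
            out[i] = p[n - 1 - i];
    } else {
        /* Assumed big-endian: memory order is already most significant first. */
        memcpy(out, p, n);
    }
}
```

In a hot loop the result of is_little_endian() would be computed once up front and the matching branch chosen outside the loop.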

Also, if you don't know if your system uses a 2 or 4 byte int, you could check that before, and again use different loops based on the result.

Point is: you will have more code, but you will not waste cycles in a critical area, which is inside the loop.

If you still have performance issues, you could unroll the loop (duplicate code inside the loop, and reduce loop counts) as this will also save you a couple of cycles.

Note that using c[0], c[1] etc.. is equivalent to *(c), *(c+1) as far as C++ is concerned.

neeKo
-2
typedef unsigned char byte; // 'byte' is not a built-in type in C or C++

typedef union{
  byte intAsBytes[4];
  int int32;
}U_INTtoBYTE;
Martin James
  • In C++ this is undefined behaviour. – Kerrek SB Dec 19 '11 at 22:40
  • I expected the 'not portable - endianness' to come up. If you have a way of detecting endianness, then sure, use it! You can read the bytes any which way you want. – Martin James Dec 19 '11 at 22:42
  • Not at all. From all we know, it's actually possible that the OP *wants* to investigate how types are implemented on his platform. What's UB in C++ is to access inactive union members. – Kerrek SB Dec 19 '11 at 22:44
  • How is this different from the first answer? – jman Dec 19 '11 at 22:45
  • @MartinJames: "seeming to work fine" is a perfectly legal manifestation of Undefined Behaviour! :-) Your bible wasn't touched... – Kerrek SB Dec 19 '11 at 22:47