You have already accepted an answer, but I will still give mine, which might suit you better (or just as well...). This is what I tested with:
#include <iostream>
using namespace std;

int main()
{
    int a[3] = {22445, 13, 1208132};
    for (int i = 0; i < 3; i++)
    {
        // view the int's storage as raw bytes
        unsigned char *c = (unsigned char *)&a[i];
        cout << (unsigned int)c[0] << endl;
        cout << (unsigned int)c[1] << endl;
        cout << (unsigned int)c[2] << endl;
        cout << (unsigned int)c[3] << endl;
        cout << "---" << endl;
    }
    return 0;
}
...and it works for me. Now, I know you asked for a char array, but this is equivalent. You also asked that c[0] == 0, c[1] == 0, c[2] == 87 and c[3] == 173 for the first value; here the bytes come out in the reverse order, because my machine stores the least significant byte first (little-endian).
Basically, you use the SAME value; you only access it differently.
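To make that concrete, here is a small illustration of my own (not code from your question) that rebuilds the original value from its bytes; it assumes a 4-byte int and the little-endian layout of my machine:

#include <iostream>
using namespace std;

int main()
{
    int v = 22445;                          // 0x000057AD
    unsigned char *c = (unsigned char *)&v;
    // reassemble the value from its four bytes, least significant first
    unsigned int rebuilt = (unsigned int)c[0]
                         | ((unsigned int)c[1] << 8)
                         | ((unsigned int)c[2] << 16)
                         | ((unsigned int)c[3] << 24);
    cout << rebuilt << endl;                // prints 22445 again
    return 0;
}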
Why haven't I used htonl(), you might ask?
Well, since performance is an issue, I think you are better off without it: calling a function to guarantee a particular byte order wastes (precious?) cycles when the bytes may already be in that order on some systems, and when you could have written your code to handle a different order where they are not.
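For comparison, the htonl() route would look roughly like this (assuming the POSIX header <arpa/inet.h>; on Windows it lives in winsock2.h). The per-element function call inside the loop is exactly the cost I am suggesting you avoid:

#include <iostream>
#include <cstdint>
#include <arpa/inet.h>   // htonl()
using namespace std;

int main()
{
    int a[3] = {22445, 13, 1208132};
    for (int i = 0; i < 3; i++)
    {
        uint32_t be = htonl((uint32_t)a[i]);        // force big-endian byte order
        unsigned char *c = (unsigned char *)&be;
        for (int j = 0; j < 4; j++)                 // now always 0 0 87 173 for 22445
            cout << (unsigned int)c[j] << endl;
        cout << "---" << endl;
    }
    return 0;
}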
So instead, you could check the byte order once up front and then run different loops depending on the result (more code, but better performance).
Likewise, if you don't know whether your system uses a 2-byte or a 4-byte int, you can test that beforehand too and again pick the loop accordingly; see the sketch below.
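Here is a sketch of that idea (my own code, assuming int is either 4 or 2 bytes wide, with an endianness probe of my own devising). Both tests run once, so the hot loops themselves stay branch-free:

#include <iostream>
#include <cstring>
using namespace std;

// One-time probe: is the least significant byte stored first?
static bool is_little_endian()
{
    int probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   // look at the lowest-addressed byte
    return first == 1;
}

int main()
{
    int a[3] = {22445, 13, 1208132};

    // Both checks happen once, before any looping.
    if (sizeof(int) == 4 && is_little_endian())
    {
        // Bytes are stored low-to-high, so walk them backwards
        // to get the 0, 0, 87, 173 order you asked for.
        for (int i = 0; i < 3; i++)
        {
            unsigned char *c = (unsigned char *)&a[i];
            for (int j = 3; j >= 0; j--)
                cout << (unsigned int)c[j] << endl;
            cout << "---" << endl;
        }
    }
    else if (sizeof(int) == 4)
    {
        // Big-endian 4-byte int: the bytes already come out in that order.
        for (int i = 0; i < 3; i++)
        {
            unsigned char *c = (unsigned char *)&a[i];
            for (int j = 0; j < 4; j++)
                cout << (unsigned int)c[j] << endl;
            cout << "---" << endl;
        }
    }
    // ...and further branches for a 2-byte int, following the same pattern.
    return 0;
}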
The point is: you will have more code, but you will not waste cycles in the critical area, which is inside the loop.
If you still have performance issues, you can unroll the loop (duplicate the work inside the body and reduce the number of iterations), which will save you a couple more cycles.
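A minimal sketch of what I mean by unrolling (again my own illustration, not code from your question): handle two array elements per iteration, so the loop counter and branch run half as often. It assumes the element count is even; otherwise you would handle the leftover element separately.

#include <iostream>
using namespace std;

int main()
{
    int a[4] = {22445, 13, 1208132, 7};   // even count, so no tail to clean up
    for (int i = 0; i < 4; i += 2)        // half the iterations of the plain loop
    {
        unsigned char *c0 = (unsigned char *)&a[i];
        unsigned char *c1 = (unsigned char *)&a[i + 1];
        for (int j = 0; j < 4; j++)
            cout << (unsigned int)c0[j] << endl;
        cout << "---" << endl;
        for (int j = 0; j < 4; j++)
            cout << (unsigned int)c1[j] << endl;
        cout << "---" << endl;
    }
    return 0;
}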
Note that using c[0], c[1], etc. is equivalent to *(c), *(c + 1) as far as C++ is concerned.