Converting 32 bit number to four 8bit numbers

Question

I am trying to convert the input from a device (always an integer between 1 and 600000) to four 8-bit integers.

For example,

If the input is 32700, I want 188 127 00 00.

I achieved this by using:

32700 % 256 
32700 / 256

The above works till 32700. From 32800 onward, I start getting incorrect conversions.

I am totally new to this and would like some help to understand how this can be done properly.

Are you sure it really is _from 32800 onward_ or is it actually from 32768? — PiCTo, Jun 27 '20 at 07:28
You are right!! I guess you see the issue.....tell me tell me... — Aman Kejriwal, Jun 27 '20 at 07:29
Is that `%` a typo? I'd expect `MOD 256` to get you the 188 and `/256` to get you the 127. The confusing part is that `%` is the C operator for taking modulo, which surely you mean by `MOD`. I.e. to me the two lines you show are semantically identical. Please show the result you get for the two lines. Which one gets you the desired 188, which the 127? — Yunnosch, Jun 27 '20 at 07:38
Thank you for pointing that out. I have edited the question. — Aman Kejriwal, Jun 27 '20 at 07:40
Also, I realised that 32767 is the upper limit for int. Changing to long to see if things get better. — Aman Kejriwal, Jun 27 '20 at 07:40
`600000) to a 32 bit signed integer.` I do not understand. A number `32700` by itself is already a 32-bit signed integer. `input is 32700, I want 188 127 00 00` that is converting a signed 32-bit integer into four 8-bit integers. `I achieved this by using:` I do not understand, that's just two numbers. `I start getting incorrect conversions` If you are having problems with some actual code, instead of explaining, please _show the code_. Please post an [MCVE]. Code speaks 1000 words. Your description may be completely unrelated to the problem you are facing. — KamilCuk, Jun 27 '20 at 07:44
Yes you are correct. I want to convert the integer to four 8-bit integers — Aman Kejriwal, Jun 27 '20 at 07:45
`I start getting incorrect conversions` What architecture, compiler, compiler version and compiler options are you using? How are you checking that? On what operating system/environment are you executing your program? How are you "observing" that an incorrect conversion has been made? With a terminal output, something prints on your printer, you read a value with your debugger? [How do we ask a good question on SO](https://stackoverflow.com/help/how-to-ask). I think related: [converting an int to 4 byte](https://stackoverflow.com/questions/3784263/converting-an-int-into-a-4-byte-char-array-c) — KamilCuk, Jun 27 '20 at 07:48
You are right. I wonder if this is a commonly used operation in this field? I am just trying to figure out how to write a small C program to achieve this. — Aman Kejriwal, Jun 27 '20 at 08:01
An answer below talks about casting it using the int32_t type. Trying to find the header file for that — Aman Kejriwal, Jun 27 '20 at 08:02
@AmanKejriwal ``32700 % 256`` explains how you get 188, and similarly ``32700 / 256`` gives 127. But how do you get the last two zeros? Are they always zero? — m0hithreddy, Jun 27 '20 at 08:11

PiCTo · Answer 1 · 2020-06-27T09:56:56.487

Major edit following clarifications:

Given that someone has already mentioned the shift-and-mask approach (which is undeniably the right one), I'll give another approach, which, to be pedantic, is not portable, machine-dependent, and possibly exhibits undefined behavior. It is nevertheless a good learning exercise, IMO.

For various reasons, your computer represents integers as groups of 8-bit values (called bytes); note that, although extremely common, this is not always the case (see CHAR_BIT). For this reason, values that are represented using more than 8 bits use multiple bytes (hence those using a number of bits which is a multiple of 8). For a 32-bit value, you use 4 bytes and, in memory, those bytes always follow each other.

We call a pointer a value containing the address in memory of another value. In that context, a byte is defined as the smallest (in terms of bit count) value that can be referred to by a pointer. For example, your 32-bit value, covering 4 bytes, will have 4 "addressable" cells (one per byte) and its address is defined as the first of those addresses:

|==================|
| MEMORY | ADDRESS |
|========|=========|
|  ...   |   x-1   | <== Pointer to byte before
|--------|---------|
| BYTE 0 |    x    | <== Pointer to first byte (also pointer to 32-bit value)
|--------|---------|
| BYTE 1 |   x+1   | <== Pointer to second byte
|--------|---------|
| BYTE 2 |   x+2   | <== Pointer to third byte
|--------|---------|
| BYTE 3 |   x+3   | <== Pointer to fourth byte
|--------|---------|
|  ...   |   x+4   | <== Pointer to byte after
|==================|

So what you want to do (split the 32-bit word into 8-bit words) has already been done by your computer, as it is imposed onto it by its processor and/or memory architecture. To reap the benefits of this almost-coincidence, we are going to find where your 32-bit value is stored and read its memory byte-by-byte (instead of 32 bits at a time).

As all serious SO answers seem to do so, let me cite the Standard (ISO/IEC 9899:2018, 6.2.5-20) to define the last thing I need (emphasis mine):

Any number of derived types can be constructed from the object and function types, as follows:

An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. [...] Array types are characterized by their element type and by the number of elements in the array. [...]

[...]

So, as elements in an array are defined to be contiguous, a 32-bit value in memory, on a machine with 8-bit bytes, really is nothing more, in its machine representation, than an array of 4 bytes!

Given a 32-bit signed value:

int32_t value;

its address is given by &value. Meanwhile, an array of 4 8-bit bytes may be represented by:

uint8_t arr[4];

notice that I use the unsigned variant because those bytes don't really represent a number per se so interpreting them as "signed" would not make sense. Now, a pointer-to-array-of-4-uint8_t is defined as:

uint8_t (*ptr)[4];

and if I assign the address of our 32-bit value to such a pointer, I will be able to index each byte individually, which means that I will be reading the byte directly, avoiding any pesky shifting-and-masking operations!

uint8_t (*bytes)[4] = (void *) &value;

I need to cast the pointer ("(void *)") because ~~I can't bear that whining compiler~~ &value's type is "pointer-to-int32_t" while I'm assigning it to a "pointer-to-array-of-4-uint8_t" and this type-mismatch is caught by the compiler and pedantically warned against by the Standard; this is a first warning that what we're doing is not ideal!

Finally, we can access each byte individually by reading it directly from memory through indexing: (*bytes)[n] reads the n-th byte of value!

To put it all together, given a send_can(uint8_t) function:

for (size_t i = 0; i < sizeof(*bytes); i++)
    send_can((*bytes)[i]);

and, for testing purposes, we define:

void send_can(uint8_t b)
{
    printf("%hhu\n", b);
}

which prints, on my machine, when value is 32700:

Lastly, this shows yet another reason why this method is platform-dependent: the order in which the bytes of the 32-bit word are stored isn't always what you would expect from a theoretical discussion of binary representation, i.e.:

byte 0 contains bits 31-24
byte 1 contains bits 23-16
byte 2 contains bits 15-8
byte 3 contains bits 7-0

Actually, AFAIK, the C language permits any of the 24 possibilities for ordering those 4 bytes (this is called endianness). Meanwhile, shifting and masking will always get you the n-th "logical" byte.
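
To make the difference between the two views concrete, here is a minimal sketch. It copies the in-memory representation with memcpy (which sidesteps the pointer-cast concerns raised above) and compares it with the bytes obtained by shifting. The helper names memory_bytes, logical_bytes and is_little_endian are made up for this illustration, and uint32_t is assumed to be available.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory representation of value into out[0..3].
 * The order of the bytes depends on the machine's endianness. */
void memory_bytes(uint32_t value, uint8_t out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Extract the "logical" bytes: out[0] holds bits 7-0, out[3] holds
 * bits 31-24. The result is the same on every machine. */
void logical_bytes(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)((value >> (8 * i)) & 0xFFu);
}

/* Returns 1 when both views agree, i.e. on a little-endian machine. */
int is_little_endian(void)
{
    uint8_t mem[4], log[4];
    memory_bytes(0x01020304u, mem);
    logical_bytes(0x01020304u, log);
    return memcmp(mem, log, sizeof mem) == 0;
}
```

For 32700, logical_bytes always yields 188, 127, 0, 0, while memory_bytes yields the same four values in whatever order the machine stores them.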

I get the value as a number from a linear sensor based on the device position. — Aman Kejriwal, Jun 27 '20 at 07:43
My task is to take this number and convert it into a 32-bit signed integer representation to be sent over to a display unit using a CAN message — Aman Kejriwal, Jun 27 '20 at 07:44
so you have to split a 32-bit number in 8-bit "packets"? I'm not familiar with the CAN protocol... That would explain the modulus though. Be careful about endianness! — PiCTo, Jun 27 '20 at 07:46
issue is that this is not my primary field either and I am just doing it for my kid's RC plane controller :D — Aman Kejriwal, Jun 27 '20 at 07:47

cup · Answer 2 · 2020-06-27T12:36:20.437

It really depends on how your architecture stores an int. For example

8 or 16 bit system short=16, int=16, long=32
32 bit system, short=16, int=32, long=32
64 bit system, short=16, int=32, long=64

This is not a hard and fast rule - you need to check your architecture first. There is also a long long but some compilers do not recognize it and the size varies according to architecture.

Some compilers have uint8_t etc defined so you can actually specify how many bits your number is instead of worrying about ints and longs.
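
One way to check your own target rather than rely on the table above is to ask the compiler directly. This is only a sketch: the function names are invented for the example, and it assumes <stdint.h> provides uint32_t on your toolchain.

```c
#include <limits.h>
#include <stdint.h>

/* 1 if a plain int can hold every value up to 600000 on this target,
 * 0 otherwise (for example when int is 16 bits wide). */
int int_holds_600000(void)
{
    return INT_MAX >= 600000L;
}

/* The C standard guarantees LONG_MAX is at least 2147483647,
 * so this returns 1 on any conforming compiler. */
int long_holds_600000(void)
{
    return LONG_MAX >= 600000L;
}

/* Width of uint32_t in bits; exact-width types have no padding bits. */
int uint32_bits(void)
{
    return (int)(sizeof(uint32_t) * CHAR_BIT);
}
```

If int_holds_600000() returns 0, a plain int is too small for the sensor range and a wider type such as long or uint32_t is needed.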

That said, you want to convert a number into four 8-bit ints. You could have something like

unsigned long x = 600000UL;  // you need UL to indicate it is unsigned long
unsigned int b1 = (unsigned int)(x & 0xff);          // bits 7-0
unsigned int b2 = (unsigned int)((x >> 8) & 0xff);   // bits 15-8
unsigned int b3 = (unsigned int)((x >> 16) & 0xff);  // bits 23-16
unsigned int b4 = (unsigned int)((x >> 24) & 0xff);  // bits 31-24

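Shifts were traditionally faster than multiplication, division or mod, although a modern compiler will usually generate the same code for an unsigned operand either way. Which byte goes where depends on the endianness you wish to achieve: you could reverse the assignments, using the formula for b4 with b1, and so on.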
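
As a sketch of the two byte orders, assuming uint32_t and uint8_t are available, the functions below write the same four bytes either least significant first or most significant first. The names split_le and split_be are hypothetical.

```c
#include <stdint.h>

/* Least significant byte first (little-endian order). */
void split_le(uint32_t v, uint8_t out[4])
{
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)((v >> 8) & 0xFFu);
    out[2] = (uint8_t)((v >> 16) & 0xFFu);
    out[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Most significant byte first (big-endian order). */
void split_be(uint32_t v, uint8_t out[4])
{
    out[0] = (uint8_t)((v >> 24) & 0xFFu);
    out[1] = (uint8_t)((v >> 16) & 0xFFu);
    out[2] = (uint8_t)((v >> 8) & 0xFFu);
    out[3] = (uint8_t)(v & 0xFFu);
}
```

For 600000 (0x000927C0), split_le gives 192, 39, 9, 0 and split_be gives 0, 9, 39, 192. Which one you need depends on what the receiving side expects.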

the "faster" comment is not accurate; the compiler should choose the best option , and right-shift is defined as division by two anyway — M.M, Jun 27 '20 at 09:04
I haven't examined the code generated by compilers lately. The ones I did look at went through the whole modulo/division process instead of shifting but that was over 15 years ago. Code generators may have moved on since then. — cup, Jun 27 '20 at 09:38
@cup why do you have to make explicit that the 60.000 is a signed long when it’s being stored at an unsigned long variable? — abetancort, Jun 27 '20 at 10:58

Yunnosch · Answer 3 · 2020-06-27T08:12:04.803

You could do some bit masking.

600000 is 0x927C0

600000 / (256 * 256) gets you the 9, no masking yet.
(600000 & (255 * 256)) >> 8 gets you the 0x27 == 39. This uses an 8-bit-shifted mask of 8 set bits (255 * 256) and a right shift by 8 bits, the >> 8, which could also be written as another / 256.
600000 % 256 gets you the 0xC0 == 192 as you did it. Masking would be 600000 & 255.
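
The equivalence between the arithmetic and the bitwise forms above can be sketched as two small functions that should return the same byte for any unsigned 32-bit value. The function names are made up for the example, and n is assumed to be between 0 and 3.

```c
#include <stdint.h>

/* Byte n (0 = least significant) using only division and modulo. */
uint8_t byte_by_division(uint32_t v, unsigned n)
{
    while (n-- > 0)
        v /= 256u;              /* dropping one byte, same as >> 8 */
    return (uint8_t)(v % 256u); /* keeping the low byte, same as & 255 */
}

/* The same byte using shift and mask. */
uint8_t byte_by_mask(uint32_t v, unsigned n)
{
    return (uint8_t)((v >> (8u * n)) & 255u);
}
```

For 600000, both return 192 for n = 0, 39 for n = 1, 9 for n = 2 and 0 for n = 3.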

edited Jun 27 '20 at 08:12

answered Jun 27 '20 at 07:48


Aman Kejriwal · Accepted Answer · 2020-07-30T06:35:31.577

I ended up doing this:

unsigned char bytes[4];
unsigned long n;

n = (unsigned long) sensore1 * 100;

bytes[0] = n & 0xFF;     
bytes[1] = (n >> 8) & 0xFF;
bytes[2] = (n >> 16) & 0xFF;
bytes[3] = (n >> 24) & 0xFF;
CAN_WRITE(0x7FD, 8, 01, sizeof(n), bytes[0], bytes[1], bytes[2], bytes[3], 07, 255);
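
A quick way to sanity-check this byte order is to rebuild the number from the four bytes, as the receiving side would presumably have to. The sketch below assumes the same least-significant-byte-first layout as the code above. The name join_le is hypothetical and not part of any CAN library.

```c
#include <stdint.h>

/* Rebuild a 32-bit value from four bytes stored least significant first.
 * Each byte is widened to uint32_t before shifting so no bits are lost. */
uint32_t join_le(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Feeding it the bytes 192, 39, 9, 0 gives back 600000.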

edited Jul 30 '20 at 06:35

answered Jun 28 '20 at 12:11


Use `uint32_t` instead of `unsigned long` – Antti Haapala -- Слава Україні Jul 30 '20 at 07:08

score 0 · Answer 5 · answered Nov 07 '21 at 09:38

I have been in a similar situation while packing and unpacking huge custom packets of data to be transmitted/received. I suggest you try the approach below:

typedef union 
{
   uint32_t u4_input;
   uint8_t  u1_byte_arr[4];
}UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;

UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;
un_t_mode_reg.u4_input = input;/*your 32 bit input*/
// 1st byte = un_t_mode_reg.u1_byte_arr[0];
// 2nd byte = un_t_mode_reg.u1_byte_arr[1];
// 3rd byte = un_t_mode_reg.u1_byte_arr[2];
// 4th byte = un_t_mode_reg.u1_byte_arr[3];
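
One caveat worth illustrating: the union exposes the bytes in memory order, so which array index holds the low byte depends on the machine's endianness. Below is a minimal sketch with shorter, made-up names (word_bytes, stored_byte). Reading a union member other than the one last written is accepted in C, but not in C++.

```c
#include <stdint.h>

typedef union {
    uint32_t word;      /* the 32-bit input */
    uint8_t  bytes[4];  /* the same storage viewed as four bytes */
} word_bytes;

/* Returns the byte stored at memory index i (0..3) of v.
 * On a little-endian machine index 0 is the least significant byte;
 * on a big-endian machine it is the most significant one. */
uint8_t stored_byte(uint32_t v, unsigned i)
{
    word_bytes u;
    u.word = v;
    return u.bytes[i];
}
```

For 32700, a little-endian machine returns 188, 127, 0, 0 for indices 0 to 3 and a big-endian machine returns 0, 0, 127, 188.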

score -1 · Answer 6 · answered Jun 27 '20 at 07:51

The largest positive value you can store in a 16-bit signed int is 32767. If you force a number bigger than that, you'll get a negative number as a result, hence unexpected values returned by % and /.

Use either unsigned 16-bit int for a range up to 65535 or a 32-bit integer type.
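
The effect can be reproduced on a desktop compiler by forcing the value through int16_t. This is a sketch with invented function names. Converting an out-of-range value to a signed type is implementation-defined, and the comments assume the usual two's-complement wrap-around.

```c
#include <stdint.h>

/* % and / applied after the value has been squeezed into 16 signed bits.
 * On typical targets 32800 wraps to -32736, so both results go negative. */
int low_part_int16(long v)
{
    int16_t n = (int16_t)v;   /* implementation-defined when v > 32767 */
    return n % 256;
}

int high_part_int16(long v)
{
    int16_t n = (int16_t)v;
    return n / 256;
}

/* The same arithmetic on a 32-bit unsigned value behaves as expected. */
unsigned low_part_u32(uint32_t v)
{
    return (unsigned)(v % 256u);
}

unsigned high_part_u32(uint32_t v)
{
    return (unsigned)((v / 256u) % 256u);
}
```

With 32700 both variants give 188 and 127. With 32800 the 32-bit version gives 32 and 128, whereas the 16-bit version typically gives -224 and -127.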
