1

I have a small problem in my embedded C software. I would like to convert a char pointer to an integer pointer, but I am running into some issues.

I fill a char array and would like to read that buffer into an integer variable. Can a cast concatenate four chars into one integer?

Example:

char tab[4] = {0x01,0x02,0x03,0x04};

and I would like to get an integer containing the value 0x01020304. I tried the following, but I don't get the expected value:

val_int =*((int*)tab);

Could you give me some advice? Do I have to convert each element individually?

Brian Tompsett - 汤莱恩
Pierrot

5 Answers

3

You cannot write code like this, because it violates the "strict aliasing" rule. Simply put, that rule lets the compiler assume that your array of characters is never accessed through a pointer to int. Because of this, the compiler might optimize away large portions of your code.

For example, it might decide that the whole char array is never used by your program and remove it entirely. The behavior of your code is therefore undefined.

That being said, even if your code worked as you intended, it would still be endianness-dependent. If portability/endianness is not a concern, you could use a union, which makes the code safe against pointer aliasing bugs:

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

typedef union
{
  uint32_t val32;
  uint8_t  val8 [sizeof(uint32_t)];
} val_t;


int main (void)
{
  val_t v = {.val8 = {0x01,0x02,0x03,0x04} };

  printf("%.8" PRIx32, v.val32);

  return 0;
}
Community
Lundin
  • Well, char is actually an exception: "a character type may alias any other type." – Alex Skalozub Feb 15 '16 at 10:37
  • @AlexSkalozub Only if the cast was _from_ int _to_ char pointer. Not the other way around. `(char*)&my_int` does not violate aliasing, but `(int*)char_array` does. – Lundin Feb 15 '16 at 10:38
2

To avoid depending on the endianness of your platform:

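// Each char is converted to uint8_t and widened to uint32_t before shifting,
// so a signed char cannot sign-extend and no shift is done on a signed int.
const uint32_t val_int = ((uint32_t)(uint8_t)tab[0] << 24) | ((uint32_t)(uint8_t)tab[1] << 16) | ((uint32_t)(uint8_t)tab[2] << 8) | (uint32_t)(uint8_t)tab[3];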
dmi
  • `const uint32_t val_int = (tab[0] << CHAR_BIT * 3) | (tab[1] << CHAR_BIT * 2) | (tab[2] << CHAR_BIT) | tab[3];` may be better. Embedded devices can be strange. – nalzok Feb 15 '16 at 11:12
1

You can use a C union. A union indicates that the same data can be accessed through different types:

#include <stdio.h>
#include <stdlib.h>

union data{
    int i;
    char arr[4];
};

int main()
{
    union data d;
    d.arr[0] = 0x01;
    d.arr[1] = 0x02;
    d.arr[2] = 0x03;
    d.arr[3] = 0x04;

    printf("the value: %#010x\n", (unsigned)d.i); //prints 0x04030201 on my little endian computer

    return 0;
}

Also note that endianness may cause the bytes of your result to appear in a different order than you expected.

nalzok
antonpuz
  • It is very questionable practice to use signed types for the union. Not only is `int` signed, `char` also has implementation-defined signedness. – Lundin Feb 15 '16 at 09:55
  • @Lundin interesting point, for the sake of study, what do you mean by char implementation-defined signedness? and how can it have any effect if I state the hex value to be stored in the char? – antonpuz Feb 15 '16 at 10:00
  • `char` is a dysfunctional type that doesn't follow the same rules as the other standard integer data types. It may or may not be signed, it is up to the compiler to decide. Therefore, you should never use `char` for anything but text strings. If you need a 1 byte data type, use `uint8_t`. – Lundin Feb 15 '16 at 10:02
  • Suppose for example that `char` is signed on your system and you try to store the values `0x81, 0x82, 0x83, 0x84` inside the array. There would then be an implementation-defined conversion from unsigned to signed. The system might not support it. So your code above relies on 3 different forms of implementation-defined behavior: the signedness of char, the manner of unsigned to signed integer conversion and the sign format of `int` (likely 2's complement, but still). And then of course endianness on top of that. – Lundin Feb 15 '16 at 10:08
  • union is not guaranteed to work as you expect. So you should say: _union can probably be abused to do that if you are lucky_ instead of _union can be used to do that_ – Ctx Feb 15 '16 at 10:25
0

On a little-endian machine, the cast can produce the expected value, but you have to reverse the order of the initial char array to get it. Here's an example on x86:

char tab[] = {0x04,0x03,0x02,0x01};
unsigned int *p_int = ( unsigned int * )tab;
printf( "val = 0X%X \n", *p_int );
artm
0

There is another problem on embedded targets, besides those already mentioned, that can arise from code like this.

On some platforms, read and write instructions must be aligned to the size of the data accessed: an 8-bit access has no alignment requirement, a 16-bit access must be on a 2-byte boundary, and a 32-bit access must be on a 4-byte boundary.

A byte array is not guaranteed to start on a 4-byte boundary (its elements are only bytes), but when you cast it to int* and dereference it, a 32-bit read instruction is used.

As a result, you may get a crash that appears random, depending on where the array happens to be placed in memory.

Alex Skalozub