7

I'm currently using this code to move a pointer by 1 byte, but something about it feels unclear to me.

int* a = (int*)malloc(sizeof(int));
void* b = ((char*)a)+1;   

char is 1 byte, but it isn't defined for the purpose of byte operations. I believe there's another way to do this kind of byte operation. What's the correct way to do byte operations?

PS. I modified the sample code to be valid. It now compiles as C++ with Clang.

eonil
  • Are you using C or C++? The code you showed isn't valid for either. And what do you mean by "`char` is not defined for byte operation"? – Ben Voigt Feb 06 '11 at 15:45
  • 5
    There is no correct way to do this; what you're doing causes undefined behaviour. – Jonathan Grynspan Feb 06 '11 at 16:01
  • I don't know why this got downvoted. Even if the purpose seems fishy, the question is legit. – Alexandre C. Feb 06 '11 at 16:24
  • @Ben Thanks for the correction. It was my mistake. I modified my question :) – eonil Feb 06 '11 at 17:05
  • 1
    @Alexandre: Because there's only one valid `int*` pointer into a buffer of length `sizeof (int)`. Downvote revoked now that the question is creating a `void*`. – Ben Voigt Feb 06 '11 at 17:13
  • 1
    There's a lot of standards-waving going on here! OP did not state why he's trying to do this, so giving the benefit of the doubt, he's in the same situation as everyone who writes C has been in-- you need to parse some network data, or some tricky file or whatever, and you need to index into an array of bytes. I'm kind of surprised by people claiming so loudly this is conceptually impossible and should be avoided at all costs. It may be conceptually wonky in C, but let's be honest, the code the OP suggests is perfectly functional on most common platforms, and will likely get the job done. – Ben Zotto Feb 06 '11 at 23:11
  • @quixoto I agree with you. My situation is exactly what you described: I need to handle the bytes themselves, and I'm looking for the correct way to handle bytes in the language. But the other comments are right too. They let me know about many things to watch out for :) – eonil Feb 07 '11 at 02:57
  • 1
    That's the unfortunate drawback of the C specification: it has no "byte" type. It has char. The standard uint8_t was only introduced in C99 (with stdint.h), so a lot of compilers like Turbo C don't come with it. To be extra annoying, in C++ char and signed char are distinct types. Try it for yourself: make two functions foo(char *) and foo(signed char *) and C++ won't complain about ambiguity, but you can very easily call the wrong function later on. Imagine if C had u8, u16, u32, u64 (relative to 16 bit architecture) predefined; we'd not have a million type synonyms today! – Dmytro Jan 19 '18 at 21:04

6 Answers

5

I think you are confused:

char is 1 byte, but it isn't defined for the purpose of byte operations. I believe there's another way to do this kind of byte operation. What's the correct way to do byte operations?

What exactly are you expecting byte to mean, if not the exact same thing that char means?

In C and in C++, chars are bytes. By definition. What is not the case is that bytes are necessarily octets. A byte contains at least 8 bits. There is no guarantee that a given platform even makes it possible to reference a chunk of memory that is exactly 8 bits.
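
A minimal sketch of what "a char is a byte, but a byte need not be an octet" looks like in code. `CHAR_BIT` (from `<limits.h>`) and `sizeof` are standard; the helper name `bits_in` is made up for illustration, and `_Static_assert` assumes a C11 compiler.

```c
#include <limits.h>
#include <stddef.h>

/* sizeof counts in units of char, so this holds on every conforming implementation. */
_Static_assert(sizeof(char) == 1, "a char is one byte by definition");

/* The standard requires at least 8 bits per byte, but allows more. */
_Static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

/* Hypothetical helper: width in bits of an object that occupies n_bytes bytes. */
size_t bits_in(size_t n_bytes)
{
    return n_bytes * CHAR_BIT;
}
```

On a platform where `CHAR_BIT` is 8 and `int` occupies 4 bytes, `bits_in(sizeof(int))` would be 32, but neither of those numbers is guaranteed by the language.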

Karl Knechtel
4

In C99, you have the stdint.h header, which declares the int8_t and uint8_t types. Where they exist, they are guaranteed to be exactly 8 bits wide (and are generally just typedefs for signed char and unsigned char); an implementation with no 8-bit type does not provide them. Beyond this, there is no real language-level support for octets in C or C++. In fact, the standard defines a byte as the size of a char, so sizeof, for example, counts in units of char (and not octets). There is also the CHAR_BIT macro, which tells you the number of bits in a byte; on some platforms char was 9 bits, for example. Of course, I'm assuming that by byte you mean octet.
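
Since uint8_t is only provided where the platform has an exact 8-bit type, code that depends on octets can check for it at compile time. A rough sketch, assuming C99 or later; the `octet` typedef and the `low_octet` helper are illustrative names, not anything standard.

```c
#include <stdint.h>

/* UINT8_MAX is defined only when the implementation provides uint8_t. */
#ifndef UINT8_MAX
#error "no exact 8-bit type on this platform, so uint8_t is unavailable"
#endif

/* Illustrative alias: an octet is exactly 8 bits, unlike a byte in general. */
typedef uint8_t octet;

/* Keep only the low 8 bits of a wider value. */
octet low_octet(unsigned long value)
{
    return (octet)(value & 0xFFu);
}
```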

Logan Capaldo
  • Thanks. It seems the concept of `byte` is completely abstracted in C/C++, right? And if so, `char` should be the minimum unit of data. Is it? – eonil Feb 06 '11 at 17:12
  • 1
    So char is the minimum unit of data type, except for bit fields. However, since you can't name the type of a bit field (and consequently can't have a pointer to it, or take its address) or perform pointer arithmetic in units of less than char, for the purposes of pointers, char is the minimum, yes. – Logan Capaldo Feb 06 '11 at 17:34
  • 1
    It's `CHAR_BIT`, not `CHAR_BITS`. – caf Feb 06 '11 at 22:30
2
((char*&)a)++;

Or:

a = (int*)((char*)a+1);

I hope you know exactly what you're doing. For one thing, you end up with, by definition, an unaligned int pointer. Depending on the architecture and OS, this might cause trouble.
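
If the goal is to read an int that starts at an arbitrary byte offset inside a larger buffer, one way to sidestep the alignment problem is to copy the bytes out with `memcpy` instead of dereferencing the shifted pointer. This is only a sketch: `read_int_at` is a hypothetical helper, and the value it returns still depends on the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read an int that starts at buf[offset], aligned or not. */
int read_int_at(const unsigned char *buf, size_t len, size_t offset)
{
    int value;

    /* The whole int must lie inside the buffer. */
    assert(offset <= len && sizeof value <= len - offset);

    /* memcpy copies byte by byte, so no unaligned int access takes place. */
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

The buffer has to be at least `offset + sizeof(int)` bytes long; the single `int` allocated in the question is too small for an offset of 1.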

Seva Alekseyev
  • 4
    Not only an unaligned pointer, but one that is invalid because it extends beyond the bounds of the allocated buffer. – Ben Voigt Feb 06 '11 at 15:56
2

((char*)a)++

This is one of those evil Microsoft extensions. A pointer casting expression is an rvalue, but according to the C++ language rules, the increment operator only works on lvalues. g++ refuses to compile this.

fredoverflow
1

You should not do this. Many architectures have data alignment requirements. For example, dereferencing a pointer that is not aligned to a word boundary on a SPARC machine will crash the program with a Bus Error (SIGBUS).


The portable way to split your int into bytes is by using bitwise operations (assuming 8-bit bytes):

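#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t b3 = 0x12, b2 = 0x34, b1 = 0x56, b0 = 0x78;
    uint32_t a;

    /* Cast before shifting: a uint8_t promotes to int, and shifting a set bit into the sign bit is undefined. */
    a = ((uint32_t)b3 << 24) | ((uint32_t)b2 << 16) | ((uint32_t)b1 << 8) | b0;

    printf("%08" PRIX32 "\r\n", a);

    a = 0x89ABCDEF;

    /* Extract each byte, most significant first. */
    b3 = (a >> 24) & 0xFF;
    b2 = (a >> 16) & 0xFF;
    b1 = (a >> 8) & 0xFF;
    b0 = a & 0xFF;

    printf("%02X%02X%02X%02X\r\n", b3, b2, b1, b0);
    return 0;
}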

The same can be non-portably achieved with type punning tricks through unions, such as:

typedef union {
    uint32_t val;
    uint8_t  bytes[4];
} DWORD_A;

typedef union {
     uint32_t val;
     struct {
         unsigned b0:8;
         unsigned b1:8;
         unsigned b2:8;
         unsigned b3:8;
     };
} DWORD_B;

However, this technique relies on implementation-defined behaviour in C (and reading a union member other than the one last written is undefined behaviour in C++), and thus is not recommended:

  • Byte order depends on host system's endianness.
  • Packing of the bit fields is not portable.
  • Added complexity/overhead due to code generated by the compiler to prevent misaligned access.
  • Alignment issues, on implementations that don't prevent them.
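
Where the in-memory byte layout itself is what you want to look at, an alternative to the union that is well defined in both C and C++ is to view the object through an `unsigned char` pointer. A sketch under that assumption; `copy_bytes` is an illustrative name, and the order of the bytes you get still depends on the host's endianness.

```c
#include <stddef.h>

/* Illustrative helper: copy the object representation of *obj into out[0..n-1]. */
void copy_bytes(const void *obj, size_t n, unsigned char *out)
{
    /* Any object may be examined through a pointer to unsigned char. */
    const unsigned char *p = (const unsigned char *)obj;

    for (size_t i = 0; i < n; i++) {
        out[i] = p[i]; /* p + i advances by exactly one byte */
    }
}
```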
makes
  • Thanks. I didn't know about alignment in detail. If so, it might be impossible to make byte-manipulation code portable. It seems to require some kind of abstraction. – eonil Feb 07 '11 at 02:40
  • If you have a chunk of data you need to access byte-by-byte, you can declare it as a `char` array in the first place. Also, the bit shifting method I showed is portable. See http://c-faq.com/strangeprob/ptralign.html for more information. – makes Feb 07 '11 at 03:10
0

Please use void*:

int g = 10;
int *a = &g;
printf("a : %p\n", (void*)a);   /* %p expects a void* argument */
printf("a : %p\n", (void*)++a);
printf("a : %p\n", (void*)((char*)a+1));

a : 0xbfae35dc
a : 0xbfae35e0
a : 0xbfae35e1

Jeremy