Incrementing pointer to pointer by one byte

Question

#include <stdio.h>

int main(){
 int a = 5;
 int *p = &a;
 int **pp = &p;

 char **cp = (char **)pp;  
 cp++;                    // This still moves 8 bytes
 return 0;
}

Since the size of a pointer is 64 bits on 64 bit machines, doing a pp++ will always move 8 bytes. Is there a way to make it move only 1 byte?

What evidence do you have that `pp++` "will always move 8 bytes"? — Scott Hunter, Mar 10 '20 at 13:00
You are actually incrementing a `char *`, which is larger than 1 byte. And you cannot increment something by a fraction of its size. — ryyker, Mar 10 '20 at 13:00
You can do this `cp = ((uint64_t)cp) + 1` if you want to do it at all cost assuming the pointer size 8 byte in your machine. — Eraklon, Mar 10 '20 at 13:07
@Eraklon But you can't really do anything with that value. It no longer points to a pointer of any type, and dereferencing it is therefore undefined behavior. (And pedantically, `uintptr_t` should be used instead of `uint64_t` if you're going to break the rules that way.) — Andrew Henle, Mar 10 '20 at 13:11
@AndrewHenle Well obviously it is pointless to do this, but I am just stating how it could be done. The `uintptr_t` is a good suggestion though, thanks! — Eraklon, Mar 10 '20 at 13:13
It is not true that “the size of a pointer is 64 bits on 64 bit machines.” The size of a pointer is determined by the C implementation, not by the machine it executes on. I have worked with a C implementation that targeted 64-bit machines but used 32-bit pointers. — Eric Postpischil, Mar 10 '20 at 13:53
The phrasing “doing a `pp++` will always move 8 bytes” is not good. “Moving eight bytes” generally means copying the values of eight bytes from one location to another. A better phrasing would be “executing `pp++` adjusts `pp` to point to an address eight bytes greater.” — Eric Postpischil, Mar 10 '20 at 13:55

chux - Reinstate Monica · Answer 1 · 2020-03-10T18:20:19.887

Is there a way to make it move only 1 byte?

Maybe.

All object pointers can be converted to void * and, since char * has the same representation as void *, to char * as well. Applying ++ to a char * advances it by 1 byte.

#include <stdio.h>

int main() {
 int a = 5;
 int *p = &a;
 int **pp = &p;
 char **cp = (char **)pp;  

 char *character_pointer = (char *) cp;
 character_pointer++; // Increment by 1

Now comes the tricky part: can that incremented pointer be converted back to a char **? C allows that unless

If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. C17dr § 6.3.2.3 7

 cp = (char **) character_pointer;
 return 0;
}

Reading *cp can readily cause undefined behavior, as cp is not guaranteed to point to a valid char *. OP's goal is unclear at this point.
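The alignment condition quoted above can be checked before converting back. What follows is only a minimal sketch: the helper names `is_aligned_for_char_ptr` and `to_char_pp_if_aligned` are made up for illustration, and the check assumes the implementation-defined pointer-to-`uintptr_t` conversion yields a flat address (true on common platforms, not promised by the standard).

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reports whether the address held in p is a multiple
 * of the alignment required for a char * object. Assumes the
 * implementation-defined pointer-to-integer mapping is a flat address. */
bool is_aligned_for_char_ptr(const void *p)
{
    return (uintptr_t)p % alignof(char *) == 0;
}

/* Hypothetical helper: converts back to char ** only when the address is
 * suitably aligned; otherwise returns NULL instead of performing a
 * conversion whose behavior is undefined. */
char **to_char_pp_if_aligned(char *bytes)
{
    assert(bytes != NULL);
    return is_aligned_for_char_ptr(bytes) ? (char **)(void *)bytes : NULL;
}
```

Even when the conversion is allowed, the resulting char ** still has to point at an actual char * object before it may be dereferenced; the check only covers the alignment rule.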

hyde · Answer 2 · 2020-03-10T14:21:06.497

C is not assembly. What you are trying to do is undefined behavior: the compiler might not do what you ask, and the program might do anything, including possibly what you think it should do if C were just "assembly" with different syntax.

That being said, you can do this:

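 // Needs <stdint.h> for uintptr_t and <string.h> for memcpy.
 // Assumes sizeof(uintptr_t) == sizeof(int **), which is typical but not guaranteed.
 int a = 5;
 int *p = &a;
 int **pp = &p;
 uintptr_t temp;
 memcpy(&temp, &pp, sizeof temp);  // copy the pointer's bytes into an integer
 temp++;                           // add 1 to the integer, not to the pointer
 memcpy(&pp, &temp, sizeof temp);  // copy the bytes back: pp is now an invalid pointer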

The above code is likely to do what you want, even though that last memcpy already triggers undefined behavior, because it copies an invalid value into a pointer (that alone is enough for it to be UB). Actually using pp, which now holds an invalid value, further increases the chance of messing things up.

To understand why having any UB is indeed UB: the compiler is free to decide that the effect of code which can be proven to have UB is nothing, or that the code is never reached. So if that last memcpy is inside an if, and the compiler can prove UB occurs when the condition is true, it may just assume the condition is never true and optimize the whole if away. The compiler presumes that the C programmer wrote the condition so that it never results in UB, so this optimization can already be made at compile time.

Yeah, it is a bit crazy. C is not just assembly with different syntax!
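If all that is needed is the numeric address one byte further along, the write back into the pointer can be skipped entirely. The sketch below is one hedged way to do that; the function name `address_plus_one_byte` is hypothetical, `uintptr_t` is an optional type in the standard, and the pointer-to-integer mapping is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the address one byte past pp as an integer.
 * The result is never converted back to int **, so no misaligned or
 * otherwise invalid pointer value is ever created. */
uintptr_t address_plus_one_byte(int **pp)
{
    assert(pp != NULL);
    uintptr_t address = (uintptr_t)(void *)pp;  /* implementation-defined mapping */
    return address + 1;                         /* plain unsigned integer arithmetic */
}
```

The returned value is fine for printing or comparing, but it is just a number: nothing in C promises that it can be turned back into a usable int **.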

ryyker · Answer 3 · 2020-03-10T20:10:42.473

Incrementing pointer to pointer by one byte

If you find an implementation where a pointer to pointer variable occupies only 8 bits (i.e. one that uses 1-byte addressing, which is very unlikely), then it will be doable, and only then would it be safe to do so. Otherwise it would not be considered a practical or safe thing to do.

For an implementation that uses 64-bit addressing, 64 bits are needed to represent each natural pointer location. Note, however, that _[t]he smallest incremental change is [available as a by-product of] the alignment needs of the referenced type. For performance, this often matches the width of the ref type, yet systems can allow less._ (per @Chux in comments). De-referencing these locations, though, could, and likely would, lead to undefined behavior.

And in this statement

 char **cp = (char **)pp; //where pp is defined as int **

the cast, although it lets the code compile without complaint, simply masks a problem. With the exception of void *, a pointer variable is declared with the same base type as the object it is to point to, because sizeof can differ from type to type, so a pointer designed to point to a particular type can represent that type's locations accurately.

It is also important to note the following:

            sizeof(char **) == sizeof(char *) == sizeof(char ***)  !=  sizeof(char)
     32bit  4 bytes            4 bytes           4 bytes               1 byte
     64bit  8 bytes            8 bytes           8 bytes               1 byte

            sizeof(int **)  == sizeof(int *)  == sizeof(int ***)   vs. sizeof(int)
     32bit  4 bytes            4 bytes           4 bytes               4 bytes (typically)
     64bit  8 bytes            8 bytes           8 bytes               4 bytes (typically)

So, unlike the type of a pointer, its size has little to do with its ability to point to a location containing an object that is smaller, or even larger, than the pointer used to point to it.
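The sizes in the table above can be measured on whatever implementation is at hand. This is only a sketch; `struct pointer_sizes` and `measure_pointer_sizes` are names invented for the example. Only `sizeof(char) == 1` is guaranteed by the C standard; the equal sizes of the three pointer types are typical rather than required.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of the sizes compared in the table above. */
struct pointer_sizes {
    size_t char_p;    /* sizeof(char *)   */
    size_t char_pp;   /* sizeof(char **)  */
    size_t char_ppp;  /* sizeof(char ***) */
    size_t char_obj;  /* sizeof(char)     */
};

/* Hypothetical helper: collects the sizes for the current implementation. */
struct pointer_sizes measure_pointer_sizes(void)
{
    struct pointer_sizes s = {
        sizeof(char *), sizeof(char **), sizeof(char ***), sizeof(char)
    };
    assert(s.char_obj == 1);  /* the only value here fixed by the standard */
    return s;
}
```

Printing the four fields with `%zu` shows which row of the table a given compiler and target match.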

The purpose of a pointer (e.g. char *) is to store the address of an object of the same base type, in this case char. If targeting 32-bit addressing, the size of the pointer means it can distinguish 4,294,967,296 different locations (or, with 64 bits, 18,446,744,073,709,551,616 locations), and because in this case it is designed to point to char, consecutive addresses differ by one byte.

But this really has nothing to do with your observation that incrementing a pointer to pointer to char advances it by 8 bytes, and not 1 byte. It simply has to do with the fact that the char * objects it points to require 8 bytes of space under 64-bit addressing, so the successive printf statements below will show a difference of 8 bytes between the 1st and 2nd calls:

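 char **cp = (char **)pp; 
 size_t size = sizeof(cp);  // size of the pointer itself, 8 with 64-bit addressing
 printf("address held in cp before increment: %p\n", (void *)cp);  // %p expects void *
 cp++;                    // Advances by sizeof(char *), 8 bytes here
 printf("address held in cp after increment: %p\n", (void *)cp);
 return 0;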
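The same stride can be measured without relying on printed addresses. The sketch below uses two hypothetical helpers, `char_pp_stride_in_bytes` and `char_p_stride_in_bytes`; each increments a pointer inside a small array, so both the old and the new pointer values stay valid, and returns the distance in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte distance covered by one ++ on a char **.
 * The two-element array keeps the incremented pointer valid. */
ptrdiff_t char_pp_stride_in_bytes(void)
{
    char *slots[2] = { NULL, NULL };
    char **cp = slots;
    char *before = (char *)cp;   /* byte address of slots[0] */
    cp++;                        /* now points at slots[1] */
    assert(cp == &slots[1]);
    char *after = (char *)cp;    /* byte address of slots[1] */
    return after - before;       /* both point into the same array object */
}

/* Hypothetical helper: byte distance covered by one ++ on a char *. */
ptrdiff_t char_p_stride_in_bytes(void)
{
    char bytes[2] = { 0, 0 };
    char *p = bytes;
    char *before = p;
    p++;                         /* now points at bytes[1] */
    return p - before;
}
```

The first helper returns sizeof(char *) and the second returns 1, which is the difference the question observes: ++ always steps by the size of the pointed-to type.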

"For an implementation that uses 64 bit addressing, 64 bits are needed to represent each pointer location, therefore the smallest incremental change in location will be 8 bytes." Not quite. The smallest incremental change is a requirement of the alignment needs of the referenced type. For performance, this often matches the width of the ref type, yet systems can allow less. — chux - Reinstate Monica, Mar 10 '20 at 15:27
Of course dereferencing an invalid pointer is UB. My comment was on the valid _locations_ of a pointer, not their value - pointers are not specified by C to exist on N-byte boundaries (in a system with N-byte pointers). — chux - Reinstate Monica, Mar 10 '20 at 15:36
No, I think you are on the right track, just some of the rationale is off a bit. It is curious as to _why_ OP wants to do this. Sounds like [XY](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) — chux - Reinstate Monica, Mar 10 '20 at 15:44
Actually there are implementations where a pointer has a size of 1 byte. Among them are C compilers for the 8051. — the busybee, Mar 10 '20 at 19:27
@thebusybee - Per the [conversation here](https://stackoverflow.com/a/40978974/645128), I am no longer certain that your claim is really watertight. Can you substantiate it with an actual example of a single byte pointer for a C compiler for the 8051? ( I still appreciate your comment. ) — ryyker, Mar 10 '20 at 20:25
I am thinking of pointers to variables in the 8051's DATA and PDATA memory sections. Those sections are addressed by 8 bits only, so their pointers are only 1 byte wide. (Side note: PDATA is a paged view into the 64KiB XDATA section, which normally needs 16-bit addresses.) Well, you could say, this is non-conformance, because one seems to have to use a special keyword like `__data`. But you can call the compiler with an option to place all variables in DATA or PDATA by default. The source is clean, then, like `char *p;`. — the busybee, Mar 10 '20 at 20:55
Actually, Keil C51 supports several pointer types, ranging from 1 to 3 bytes width. The "bigger" derivatives even may use 4 bytes, but I'm talking about the generic 8-bit processor of the 8051. — the busybee, Mar 10 '20 at 20:57
